04-18-2024 05:03 AM
Hi Team,
Can anyone explain how to do period-based Incremental load based on the Ship_Date column(Date format. not timestamp)?
In our current ETl tool, we delete the last 3 months(Ship_Date) records from the Data Warehouse table and load the last 3 months' data from the source table. because there will be changes in the last month's data. So we refresh the data daily for the last 3 months based on Ship_Date.
How to achieve this in Incorta load?
Thanks
Solved! Go to Solution.
04-19-2024 04:23 AM
I have similar use case and I would appreciate an answer from anyone already familiar with Incorta.
04-22-2024 07:06 AM
@Maran_Rengasamy - If I understand, it's not a date format? Could you give a few ship_date sample values? Are you working to keep a full history of records in Incorta or only the last three months?
04-24-2024 04:12 AM
Hi Joe,
Thanks for your response. The Ship_Date is in Date format(yyyy/mm/dd). Source data is in Oracle DB and the target table is in SQL Server. Below are some sample dates.
I'm trying to keep the history data(for example 2017/01/01 to today) and try to refresh only the last 3 months' data daily.
2024/04/23 |
2024/04/18 |
2024/04/19 |
2024/03/25 |
2024/03/21 |
2024/03/06 |
2024/03/23 |
2024/03/15 |
2024/02/22 |
04-26-2024 05:21 AM - edited 04-26-2024 05:23 AM
Hi Maran,
[edit] - it's not as complex as my post probably reads 😉 Let me know if you want to jump on a call and walk through it.
As you know Incorta doesn't do deletes, but will update if you do an incremental load and have key columns defined and constraints enforced - so you could set up your incremental query with something like ship_date >= add_days(current_day, -30 ) and let 'er rip.
Alternatively - what I did when faced with a similar situation - was create a couple of tables and an MV:
Table 1: Historic data - more or less static and does not need to be re-run ( in my case a full refresh was > 24 hours so I needed to *not* reload static data)
Table 2: In flux data - in your case the last three months which can be pulled in a full load
MV: Union of the two. Since the two sources are resident in Incorta this loads really quickly.
I kept Table 1 in a dedicated schema so it was never loaded unless I wanted it to be. Table 2 and the MV were in my "main" schema because I wanted Incorta to enforce running Table 2 prior to the MV.