Hi Joseph - you can do this with a SQL hashbytes function. It allows you to pick whatever fields you want and creates a hash key out of them. Then just make the HashKey the primary Key on the incorta table when it loads and the upsert should work. Here's an example with some made up data....
This will create a field called HashKey (or whatever you name it) and any time either myFirstField, mySecondField or myThirdField changes Incorta will insert a new row. If myFourthField changes, Incorta will update your current row that has the identical values for the first three fields.
Basically it creates a composite primary key. So if you put ALL your fields in the CONCAT list you'll never update anything, you'll always insert a new row unless you get a completely identical row.
Hope this helps,
SELECT myFirstField, mySecondField, myThirdField, myFourthField, CONVERT( VARCHAR(32), HashBytes( 'MD5', CONCAT( myFirstField, mySecondField, myThirdField ) ), 2 ) as HashKey
If I understand correctly, it sounds like you want to pull new rows without having a timestamp or anything that helps identify them as new. If you can't put something in the query to only pull new rows (and it sounds like you can't if there isn't a time stamp) then you have to pull everything and let Incorta decide what's new based on what has a different 'Key' than already exists on the table.
That's my understanding of how incremental loading works though - maybe some other replies will help clarify.
Hi joseph. delpercio -- please reference this other Q&A thread on the community that Mike and I discussed while back https://community.incorta.com/t/x17nm3/incremental-load.
Essentitally, I think you're going to have to have a new column (e.g. md5col) on your source system table which stores off the current state of the md5checksum. Once you have that created then your incremental SELECT statement in Incorta can look something like this:
SELECT * FROM <table> WHERE md5col != md5(col1 || col2 || col3)
Dan Brock I will have to review this a bit more then, can I not do that type of incremental type in the incremental query? I made a recordid in a source table that I could control to test it. However, you can't make it a key in Incorta unless it is a int or string, so I made it an integer. I can change, but that still doesn't satisfy the timestamp requirement the incremental loader is looking for.
I hope I am not deviating from the main question.
You can do incremental without a timestamp.
Ex: Normal Query: select record_id, record_type from table_a Update Query: select record_id, record_type from table_a
and select "record_id" as a key column in Incorta UI.
This will capture delta i.e. if record_id does not present, it will insert the record, if record_id is present, it will update the record.
But the catch is table size. If your source table size is too large then for every incremental load, Incorta scans the entire table and it will take time.
That is where we are using the timestamp to restrict the number of records Incorta scans.
Ex: Normal Query: select record_id, record_type from table_a Update Query: select record_id, record_type from table_a where last_modified > ?
Now, Incorta is restricted to scan only a certain set of records and do Upsert operation.