<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How does Incorta treat duplicate rows for load and update? in Administrative Discussions</title>
    <link>https://community.incorta.com/t5/administrative-discussions/how-does-incorta-treat-duplicate-rows-for-load-and-update/m-p/4673#M188</link>
    <description>&lt;P&gt;&lt;SPAN&gt;In compaction, the most recent record gets selected.&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;For the first question, the last one in wins.&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;For the second, the last of the duplicates in the incremental load is picked to update the existing record.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;For the order of operations:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Increments are scanned in order, and duplicates are marked. Once the scan is finished, we remove the duplicates by rewriting files if needed.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 12 Jul 2023 18:58:16 GMT</pubDate>
    <dc:creator>amit_kothari</dc:creator>
    <dc:date>2023-07-12T18:58:16Z</dc:date>
    <item>
      <title>How does Incorta treat duplicate rows for load and update?</title>
      <link>https://community.incorta.com/t5/administrative-discussions/how-does-incorta-treat-duplicate-rows-for-load-and-update/m-p/4663#M184</link>
      <description>&lt;P&gt;If I have a table with keys defined and primary key enforcement turned on, how does Incorta ingest and process duplicate rows? First in wins? Last one in wins? Other?&lt;/P&gt;&lt;P&gt;If duplicates are found during an incremental update, does the existing Incorta record get updated in the same fashion (i.e., if there are two duplicates in the incremental load, does the existing record get updated with the first "new" record in or the last)?&lt;/P&gt;&lt;P&gt;I think I understand that incremental loads create their own Parquet file. Is the order of operations: 1) the full load creates a temp file, goes through compaction and dedup, and then creates the final file; 2) the incremental load creates a temp file, goes through compaction and dedup, and then creates its final file; and then 3) the original and subsequent incremental files are read into memory with further dedup in chronological/file order?&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="RADSr_0-1688853743880.jpeg" style="width: 400px;"&gt;&lt;img src="https://community.incorta.com/t5/image/serverpage/image-id/2309iD05095B6E2918A18/image-size/medium?v=v2&amp;amp;px=400" role="button" title="RADSr_0-1688853743880.jpeg" alt="RADSr_0-1688853743880.jpeg" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 08 Jul 2023 22:03:14 GMT</pubDate>
      <guid>https://community.incorta.com/t5/administrative-discussions/how-does-incorta-treat-duplicate-rows-for-load-and-update/m-p/4663#M184</guid>
      <dc:creator>RADSr</dc:creator>
      <dc:date>2023-07-08T22:03:14Z</dc:date>
    </item>
    <item>
      <title>Re: How does Incorta treat duplicate rows for load and update?</title>
      <link>https://community.incorta.com/t5/administrative-discussions/how-does-incorta-treat-duplicate-rows-for-load-and-update/m-p/4673#M188</link>
      <description>&lt;P&gt;&lt;SPAN&gt;In compaction, the most recent record gets selected.&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;For the first question, the last one in wins.&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;For the second, the last of the duplicates in the incremental load is picked to update the existing record.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
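&lt;P&gt;&lt;SPAN&gt;As a rough illustration of the "last one wins" rule, here is a minimal Python sketch. It is not Incorta's actual implementation: the function name dedupe_last_wins, the sample rows, and the in-memory dictionary are all hypothetical and only model the behavior described in this reply.&lt;/SPAN&gt;&lt;/P&gt;

```python
def dedupe_last_wins(batches, key_columns):
    """Return one row per key, keeping the most recently scanned row.

    Hypothetical model only: batches is a list of row lists in load order
    (full load first, then each incremental load), and each row is a dict.
    """
    latest = {}
    for batch in batches:
        for row in batch:
            key = tuple(row[column] for column in key_columns)
            # A later row with the same key replaces the earlier one.
            latest[key] = row
    return list(latest.values())


# Assumed sample data: the increment carries two duplicates for id 1.
full_load = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
increment = [{"id": 1, "status": "open"}, {"id": 1, "status": "closed"}]

rows = dedupe_last_wins([full_load, increment], ["id"])
print(rows)
```

&lt;P&gt;&lt;SPAN&gt;In this sketch the row for id 1 ends up with status "closed", the last duplicate in the increment, while id 2 keeps its full-load row.&lt;/SPAN&gt;&lt;/P&gt;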
&lt;P&gt;&lt;SPAN&gt;For the order of operations:&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;Increments are scanned in order, and duplicates are marked. Once the scan is finished, we remove the duplicates by rewriting files if needed.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jul 2023 18:58:16 GMT</pubDate>
      <guid>https://community.incorta.com/t5/administrative-discussions/how-does-incorta-treat-duplicate-rows-for-load-and-update/m-p/4673#M188</guid>
      <dc:creator>amit_kothari</dc:creator>
      <dc:date>2023-07-12T18:58:16Z</dc:date>
    </item>
  </channel>
</rss>

