<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Incremental loads - when and how much does it negatively impact runtime queries? in Administrative Discussions</title>
    <link>https://community.incorta.com/t5/administrative-discussions/incremental-loads-when-and-how-much-does-it-negatively-impact/m-p/3864#M106</link>
    <description>&lt;P&gt;Before anyone types "it depends" know that I'm a consultant and claim that answer for myself&amp;nbsp; &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am interested to know if there are any measurements which can help me determine a strategy for doing incremental builds as my routine, and only doing full loads as runtime performance dictates.&amp;nbsp; I'm predominantly on Incorta cloud so kind of don't care if I clutter up a directory w/ 1 or 1000 files, but do care if I watch the "wheel spin" when I open dashboards.&lt;/P&gt;&lt;P&gt;I don't know enough to know if this should be based on number of parquet files, size of the files, a combination of both, or some other factor(s).&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hoping the fine folks in product development, support, or services have done some benchmarking and can provide some guidance.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 07 Mar 2023 22:33:17 GMT</pubDate>
    <dc:creator>RADSr</dc:creator>
    <dc:date>2023-03-07T22:33:17Z</dc:date>
    <item>
      <title>Incremental loads - when and how much does it negatively impact runtime queries?</title>
      <link>https://community.incorta.com/t5/administrative-discussions/incremental-loads-when-and-how-much-does-it-negatively-impact/m-p/3864#M106</link>
      <description>&lt;P&gt;Before anyone types "it depends" know that I'm a consultant and claim that answer for myself&amp;nbsp; &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am interested to know if there are any measurements which can help me determine a strategy for doing incremental builds as my routine, and only doing full loads as runtime performance dictates.&amp;nbsp; I'm predominantly on Incorta cloud so kind of don't care if I clutter up a directory w/ 1 or 1000 files, but do care if I watch the "wheel spin" when I open dashboards.&lt;/P&gt;&lt;P&gt;I don't know enough to know if this should be based on number of parquet files, size of the files, a combination of both, or some other factor(s).&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hoping the fine folks in product development, support, or services have done some benchmarking and can provide some guidance.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 07 Mar 2023 22:33:17 GMT</pubDate>
      <guid>https://community.incorta.com/t5/administrative-discussions/incremental-loads-when-and-how-much-does-it-negatively-impact/m-p/3864#M106</guid>
      <dc:creator>RADSr</dc:creator>
      <dc:date>2023-03-07T22:33:17Z</dc:date>
    </item>
    <item>
      <title>Re: Incremental loads - when and how much does it negatively impact runtime queries?</title>
      <link>https://community.incorta.com/t5/administrative-discussions/incremental-loads-when-and-how-much-does-it-negatively-impact/m-p/5456#M265</link>
      <description>&lt;P&gt;The incremental logic is executed by the source database. If you just add a simple WHERE clause such as "LAST_UPDATE_DATE &amp;gt; ?", it may be fine. If the source extraction query involves multiple tables and you connect multiple filters with OR, like&lt;/P&gt;
&lt;LI-CODE lang="java"&gt;WHERE TB1.LAST_UPDATE_DATE &amp;gt; ?
OR TB2.LAST_UPDATE_DATE &amp;gt; ?
OR TB3.LAST_UPDATE_DATE &amp;gt; ?&lt;/LI-CODE&gt;
&lt;P&gt;The database may not perform well on such a query, which negatively impacts the load.&lt;/P&gt;
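&lt;P&gt;One commonly used rewrite for this pattern is to give each table its own filtered branch and combine the branches with UNION, so that each branch has a single simple predicate. This is a general SQL technique, not something specific to Incorta, and whether it helps depends on the source database and its indexes. The sketch below only checks that both shapes return the same keys; it runs against an in-memory SQLite database, and the table layout, the ID join key, and the sample dates are all hypothetical.&lt;/P&gt;

```python
import sqlite3

# Hypothetical three-table source; names mirror TB1/TB2/TB3 in the filter above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TB1 (ID INTEGER PRIMARY KEY, LAST_UPDATE_DATE TEXT);
CREATE TABLE TB2 (ID INTEGER PRIMARY KEY, LAST_UPDATE_DATE TEXT);
CREATE TABLE TB3 (ID INTEGER PRIMARY KEY, LAST_UPDATE_DATE TEXT);
INSERT INTO TB1 VALUES (1, '2024-01-01'), (2, '2024-01-20'), (3, '2024-01-01');
INSERT INTO TB2 VALUES (1, '2024-01-01'), (2, '2024-01-01'), (3, '2024-01-25');
INSERT INTO TB3 VALUES (1, '2024-01-01'), (2, '2024-01-22'), (3, '2024-01-01');
""")

JOIN = "FROM TB1 JOIN TB2 ON TB2.ID = TB1.ID JOIN TB3 ON TB3.ID = TB1.ID"

# Original shape: one statement with the three filters connected by OR.
or_query = f"""SELECT TB1.ID {JOIN}
WHERE TB1.LAST_UPDATE_DATE > ?
   OR TB2.LAST_UPDATE_DATE > ?
   OR TB3.LAST_UPDATE_DATE > ?"""

# Rewrite: one branch per table, combined with UNION.
# UNION removes duplicates, so a row changed in two tables is returned once.
union_query = " UNION ".join(
    f"SELECT TB1.ID {JOIN} WHERE {table}.LAST_UPDATE_DATE > ?"
    for table in ("TB1", "TB2", "TB3")
)

last_load = "2024-01-15"  # stands in for the last successful load time
params = (last_load,) * 3
or_ids = sorted(row[0] for row in conn.execute(or_query, params))
union_ids = sorted(row[0] for row in conn.execute(union_query, params))
print(or_ids, union_ids)
```

&lt;P&gt;Both queries pick up the same changed rows here. Compare the execution plans on the real source before adopting either form.&lt;/P&gt;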
&lt;P&gt;When an incremental refresh would extract a large volume of data that needs to be merged, Incorta may work better with a full refresh.&lt;/P&gt;
&lt;P&gt;The incremental logic may also fragment the table. When the refresh is scheduled at a very high frequency, it produces many small files. This has a negative impact on downstream processes, so a parquet merge tool may have to be used.&lt;/P&gt;
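&lt;P&gt;To put a number on fragmentation, one option is to track the count and size distribution of a table's parquet files over time. The following is a minimal sketch under stated assumptions: the function name, the 32 MB "small file" cutoff, and the 50% share limit are placeholders for illustration, not Incorta benchmarks or recommendations, and the file sizes would have to come from wherever the parquet files can be listed.&lt;/P&gt;

```python
from statistics import median


def fragmentation_report(file_sizes_mb, small_file_mb=32, max_small_share=0.5):
    """Summarize parquet file sizes for one table and flag heavy fragmentation.

    small_file_mb and max_small_share are placeholder thresholds (assumptions).
    """
    # Files smaller than the cutoff count as "small".
    small = [size for size in file_sizes_mb if small_file_mb > size]
    share = len(small) / len(file_sizes_mb) if file_sizes_mb else 0.0
    return {
        "files": len(file_sizes_mb),
        "median_mb": median(file_sizes_mb) if file_sizes_mb else 0,
        "small_share": round(share, 2),
        "consider_compaction": share > max_small_share,
    }


# One large file from a full load, followed by nine small incremental files.
report = fragmentation_report([900, 4, 3, 5, 2, 6, 4, 3, 5, 4])
print(report)
```

&lt;P&gt;Logging this report next to dashboard response times after each load would show whether the small-file share actually tracks slower queries in a given environment, and the thresholds can then be tuned from that data.&lt;/P&gt;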
</description>
      <pubDate>Wed, 31 Jan 2024 19:34:50 GMT</pubDate>
      <guid>https://community.incorta.com/t5/administrative-discussions/incremental-loads-when-and-how-much-does-it-negatively-impact/m-p/5456#M265</guid>
      <dc:creator>dylanwan</dc:creator>
      <dc:date>2024-01-31T19:34:50Z</dc:date>
    </item>
  </channel>
</rss>

