<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Performance Analysis of Postgres connection in Data &amp; Schema Discussions</title>
    <link>https://community.incorta.com/t5/data-schema-discussions/performance-analysis-of-postgress-connection/m-p/2846#M187</link>
    <description>&lt;P&gt;We call this approach "Incorta over Incorta".&amp;nbsp; It is a SQL table that uses the PostgreSQL driver against the Incorta SQL interface.&amp;nbsp; In general, it is not recommended unless you need it to support multiple sources.&lt;/P&gt;
&lt;P&gt;Since it runs against the Incorta SQL interface, which is hosted by the Incorta Analytics service, only data that has already been refreshed and made available by the Analytics service is visible.&amp;nbsp; Data that has been extracted but not yet loaded will not be available; you will see the data from the last completed refresh, not from a refresh that is still in progress.&lt;BR /&gt;&lt;BR /&gt;It can perform very well if the data fits in memory and the SQL query does not cause the SQL interface to fall back to Spark.&amp;nbsp; However, it competes for the same resources that serve users running Incorta dashboards against the Analytics service.&lt;/P&gt;
&lt;P&gt;If the query does fall back to Spark, it is handled by the SQLApp job running in your Spark cluster.&amp;nbsp; It may still perform well, since Spark processes the query in parallel.&lt;/P&gt;
&lt;P&gt;Whether a query falls back to Spark is not really a runtime decision; it is determined by the following criteria:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Whether the query involves any table that is not loaded into memory, such as tables with the optimized flag set to false.&lt;/LI&gt;
&lt;LI&gt;Whether the Incorta query engine supports the query syntax.&amp;nbsp; For example, joining tables through joins that are not defined in Incorta forces a fallback.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Using an MV is recommended, but merging multiple data sets involves duplicating the data.&amp;nbsp; You will have to extract the data into a SQL table and then use it as the source for the MV.&amp;nbsp; Both the MV and the source extract can support incremental loading, so duplicating the data may not be too costly.&amp;nbsp; If the source table is not used in any analytic query, we suggest making it non-optimized to save memory and data refresh time.&lt;/P&gt;
&lt;P&gt;If you are not using Incorta over Incorta and no third-party tool queries Incorta, the SQL interface can be disabled.&amp;nbsp; Once the SQL interface is disabled, SQLApp can be disabled as well, so it will not compete with Incorta MVs for the limited resources of the Spark cluster.&lt;/P&gt;</description>
    <pubDate>Mon, 03 Oct 2022 23:07:14 GMT</pubDate>
    <dc:creator>dylanwan</dc:creator>
    <dc:date>2022-10-03T23:07:14Z</dc:date>
    <item>
      <title>Performance Analysis of Postgres connection</title>
      <link>https://community.incorta.com/t5/data-schema-discussions/performance-analysis-of-postgress-connection/m-p/2828#M184</link>
      <description>&lt;P&gt;Hi Community,&lt;/P&gt;&lt;P&gt;Basically, I am creating a multi-source table using a Postgres connection, which utilizes tables that already exist in other Incorta schemas, similar to writing an MV with Union All.&amp;nbsp;&lt;/P&gt;&lt;P&gt;I just wanted to understand the performance impact of this approach versus the standard MV creation using Spark SQL or PySpark. Is there any documentation that I could use?&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Srini&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 30 Sep 2022 18:07:21 GMT</pubDate>
      <guid>https://community.incorta.com/t5/data-schema-discussions/performance-analysis-of-postgress-connection/m-p/2828#M184</guid>
      <dc:creator>srini_ch</dc:creator>
      <dc:date>2022-09-30T18:07:21Z</dc:date>
    </item>
    <item>
      <title>Re: Performance Analysis of Postgres connection</title>
      <link>https://community.incorta.com/t5/data-schema-discussions/performance-analysis-of-postgress-connection/m-p/2846#M187</link>
      <description>&lt;P&gt;We call this approach "Incorta over Incorta".&amp;nbsp; It is a SQL table that uses the PostgreSQL driver against the Incorta SQL interface.&amp;nbsp; In general, it is not recommended unless you need it to support multiple sources.&lt;/P&gt;
&lt;P&gt;Since it runs against the Incorta SQL interface, which is hosted by the Incorta Analytics service, only data that has already been refreshed and made available by the Analytics service is visible.&amp;nbsp; Data that has been extracted but not yet loaded will not be available; you will see the data from the last completed refresh, not from a refresh that is still in progress.&lt;BR /&gt;&lt;BR /&gt;It can perform very well if the data fits in memory and the SQL query does not cause the SQL interface to fall back to Spark.&amp;nbsp; However, it competes for the same resources that serve users running Incorta dashboards against the Analytics service.&lt;/P&gt;
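&lt;P&gt;As an illustration only, any client that speaks the PostgreSQL wire protocol can query the SQL interface.&amp;nbsp; The host, port, credentials, and table names below are placeholders, not verified defaults; check your cluster configuration:&lt;/P&gt;
&lt;PRE&gt;# Sketch: querying the Incorta SQL interface with psycopg2.
# All connection details here are assumptions for illustration.
import psycopg2

conn = psycopg2.connect(host="incorta-host", port=5436,
                        dbname="tenant_name",
                        user="incorta_user", password="...")
cur = conn.cursor()
cur.execute('SELECT COUNT(*) FROM SchemaA.TableA')
print(cur.fetchone())&lt;/PRE&gt;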
&lt;P&gt;If the query does fall back to Spark, it is handled by the SQLApp job running in your Spark cluster.&amp;nbsp; It may still perform well, since Spark processes the query in parallel.&lt;/P&gt;
&lt;P&gt;Whether a query falls back to Spark is not really a runtime decision; it is determined by the following criteria:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Whether the query involves any table that is not loaded into memory, such as tables with the optimized flag set to false.&lt;/LI&gt;
&lt;LI&gt;Whether the Incorta query engine supports the query syntax.&amp;nbsp; For example, joining tables through joins that are not defined in Incorta forces a fallback.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Using an MV is recommended, but merging multiple data sets involves duplicating the data.&amp;nbsp; You will have to extract the data into a SQL table and then use it as the source for the MV.&amp;nbsp; Both the MV and the source extract can support incremental loading, so duplicating the data may not be too costly.&amp;nbsp; If the source table is not used in any analytic query, we suggest making it non-optimized to save memory and data refresh time.&lt;/P&gt;
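&lt;P&gt;As a minimal PySpark sketch of the MV approach (the schema and table names are placeholders; read() and save() are the helpers available inside an Incorta MV script):&lt;/P&gt;
&lt;PRE&gt;# Merge two placeholder source tables with a union.
df_a = read("SchemaA.Orders")
df_b = read("SchemaB.Orders")

# unionByName matches columns by name rather than position,
# which is safer when the two sources were defined separately.
save(df_a.unionByName(df_b))&lt;/PRE&gt;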
&lt;P&gt;If you are not using Incorta over Incorta and no third-party tool queries Incorta, the SQL interface can be disabled.&amp;nbsp; Once the SQL interface is disabled, SQLApp can be disabled as well, so it will not compete with Incorta MVs for the limited resources of the Spark cluster.&lt;/P&gt;</description>
      <pubDate>Mon, 03 Oct 2022 23:07:14 GMT</pubDate>
      <guid>https://community.incorta.com/t5/data-schema-discussions/performance-analysis-of-postgress-connection/m-p/2846#M187</guid>
      <dc:creator>dylanwan</dc:creator>
      <dc:date>2022-10-03T23:07:14Z</dc:date>
    </item>
  </channel>
</rss>

