<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Plan for Spark Version 3.2 in Data &amp; Schema Discussions</title>
    <link>https://community.incorta.com/t5/data-schema-discussions/plan-for-spark-version-3-2/m-p/2381#M137</link>
    <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I was wondering whether there is a roadmap for upgrading Incorta's Spark to version 3.2 or 3.3?&lt;/P&gt;&lt;P&gt;The reason I ask is that Spark 3.2 introduced the Pandas API on Spark (&lt;A href="https://spark.apache.org/docs/3.2.1/api/python/getting_started/quickstart_ps.html" target="_blank"&gt;https://spark.apache.org/docs/latest/api/python/user_guide/pandas_on_spark/index.html&lt;/A&gt;).&lt;/P&gt;&lt;P&gt;As we know, one limitation of using the standard Pandas library in Spark is that it does not scale with data volume, because all processing happens on a single machine. The Pandas API on Spark overcomes this limitation.&lt;/P&gt;&lt;P&gt;Another advantage (for me personally) is that the Pandas API on Spark uses the Plotly backend, which lets us create interactive charts. That is extremely useful during the Exploratory Data Analysis (EDA) and Model Evaluation stages.&lt;/P&gt;&lt;P&gt;Currently, when we use matplotlib.pyplot or Seaborn for EDA, we can only generate static charts, and we also need to be extremely careful with sampling when working with large datasets.&lt;/P&gt;&lt;P&gt;Any feedback would be much appreciated!&lt;/P&gt;&lt;P&gt;Sam&lt;/P&gt;</description>
    <pubDate>Tue, 09 Aug 2022 17:29:20 GMT</pubDate>
    <dc:creator>Stracey</dc:creator>
    <dc:date>2022-08-09T17:29:20Z</dc:date>
    <item>
      <title>Plan for Spark Version 3.2</title>
      <link>https://community.incorta.com/t5/data-schema-discussions/plan-for-spark-version-3-2/m-p/2381#M137</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I was wondering whether there is a roadmap for upgrading Incorta's Spark to version 3.2 or 3.3?&lt;/P&gt;&lt;P&gt;The reason I ask is that Spark 3.2 introduced the Pandas API on Spark (&lt;A href="https://spark.apache.org/docs/3.2.1/api/python/getting_started/quickstart_ps.html" target="_blank"&gt;https://spark.apache.org/docs/latest/api/python/user_guide/pandas_on_spark/index.html&lt;/A&gt;).&lt;/P&gt;&lt;P&gt;As we know, one limitation of using the standard Pandas library in Spark is that it does not scale with data volume, because all processing happens on a single machine. The Pandas API on Spark overcomes this limitation.&lt;/P&gt;&lt;P&gt;Another advantage (for me personally) is that the Pandas API on Spark uses the Plotly backend, which lets us create interactive charts. That is extremely useful during the Exploratory Data Analysis (EDA) and Model Evaluation stages.&lt;/P&gt;&lt;P&gt;Currently, when we use matplotlib.pyplot or Seaborn for EDA, we can only generate static charts, and we also need to be extremely careful with sampling when working with large datasets.&lt;/P&gt;&lt;P&gt;Any feedback would be much appreciated!&lt;/P&gt;&lt;P&gt;Sam&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2022 17:29:20 GMT</pubDate>
      <guid>https://community.incorta.com/t5/data-schema-discussions/plan-for-spark-version-3-2/m-p/2381#M137</guid>
      <dc:creator>Stracey</dc:creator>
      <dc:date>2022-08-09T17:29:20Z</dc:date>
    </item>
    <item>
      <title>Re: Plan for Spark Version 3.2</title>
      <link>https://community.incorta.com/t5/data-schema-discussions/plan-for-spark-version-3-2/m-p/2382#M138</link>
      <description>&lt;P&gt;Hi Sam. Yes, Spark 3.2 is on the Incorta roadmap for Q4 of this year and is currently in development. Spark 3.3 will follow later, once it supports other components leveraged by our platform.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2022 20:19:08 GMT</pubDate>
      <guid>https://community.incorta.com/t5/data-schema-discussions/plan-for-spark-version-3-2/m-p/2382#M138</guid>
      <dc:creator>DustinB</dc:creator>
      <dc:date>2022-08-09T20:19:08Z</dc:date>
    </item>
  </channel>
</rss>

