Connect Jupyter Notebook to Incorta's Spark session?
Is there a way to connect Incorta to an IDE or Jupyter Notebook to debug the code and be able to see the results after every few lines of code?
I am trying to create materialized views using PySpark. My current process looks like this:
1. Write the code in the Materialized View script box
2. Save the results in a data frame
3. Click save.
If it saves, the code is fine; otherwise I have to hunt for whatever mistake I made. Sometimes the errors Incorta returns are helpful, but most of the time they are just long logs. Also, even if my script is fine, I have to load the schema and create a dashboard to see whether everything loaded the way I wanted.
Does anyone know a better alternative?
PS - I am working on confidential data with limited permissions to share so I cannot use any tools where the data has to go through any cloud service.
Hi Amar Meena -- we know your pain. As of Incorta v4.3, we're looking into options to provide a friendlier IDE/Notebook experience when developing MVs in Incorta.
For the time being, you can wire your notebook to the standalone Spark instance bundled with Incorta (via the Spark Master URL), although we don't have documented steps for this at the moment. Keep in mind that if you do, you will lose access to some of Incorta's non-Spark MV helper functions, such as "read" and "save". If you go this route, you will be responsible for reading the parquet data into your notebook script yourself and saving the results back to parquet.