MV on a remote (Azure datalake) parquet location
I am trying to read parquet files directly from a remote Azure location. I don't want to load the data into Incorta, because the files are written to frequently and I want to avoid data-inconsistency issues.
First, I created a data source to connect to the Azure data lake. So far, no issues. I then created a schema using the Azure connection, and it immediately recognized the schema from the parquet files as well, so I can safely assume there are no access or file-corruption issues.
But when I create an MV so that I can use the data in my dashboard, it fails. The MV script is below:
df_MPL = read("ADLSGen2_Testing_Without_loading_into_incorta.memBalRemote")
df_MPL.createOrReplaceTempView("membalv")
dfmembal = spark.sql("SELECT * FROM membalv")
save(dfmembal)
INC_005005001:Failed to load data from [spark://s123vmeinc2.vpc.company.net:7077] with properties [[error, list index out of range ('IndexError', ':', IndexError('list index out of range',)) ]]
Can you please let me know if I am missing anything? Is this (creating an MV to read remote files) even the right approach for reading files from a remote location?
Thanks a whole lot..