dylanwan
Employee

Symptoms

Your materialized view (MV) ran successfully in the Incorta Notebook, but when you tried to validate and save the MV, it failed.

Diagnosis

Incorta samples the data when you save or validate an MV. If the sampled dataframe does not contain enough data, the MV may fail. Unfortunately, the error message you receive in that case does not make it clear that a lack of sampled data is the cause.

Incorta does not sample the data when you run the logic from the Incorta Notebook. That's why the MV does not fail there.

We cannot change the default to no sampling because, with a huge data set, validating and saving the MV without sampling would take a large amount of resources and time.
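The failure mode described above can be illustrated with a small toy model in plain Python. This is only a sketch: it is not Incorta's sampling algorithm, and it assumes that "sampling" means keeping the first N rows. The table, the `active` flag, and the functions `load_rows`, `mv_logic`, and `run` are all hypothetical names made up for the illustration.

```python
# Toy model only: NOT Incorta's actual sampling algorithm.
# Assumption: "sampling" here simply keeps the first `sample_size` rows.

def load_rows(total=1000):
    """Hypothetical source table: ids 0..total-1; only the last 10% are active."""
    first_active = total - total // 10
    return [{"id": i, "active": i >= first_active} for i in range(total)]

def mv_logic(rows):
    """Hypothetical MV logic: filter to active rows, then aggregate."""
    active = [r for r in rows if r["active"]]
    if not active:
        # Stands in for the unclear error seen on validate/save.
        raise ValueError("no rows to process")
    return {"active_count": len(active), "max_id": max(r["id"] for r in active)}

def run(sample, sample_size=100):
    """Run the MV logic on all rows, or on a sample of them."""
    rows = load_rows()
    if sample:
        rows = rows[:sample_size]  # none of these rows are active
    return mv_logic(rows)

# Notebook-style run: all rows are available, so the logic succeeds.
full_result = run(sample=False)

# Validate/save-style run: the sample contains no matching rows, so it fails.
try:
    sampled_result = run(sample=True)
except ValueError as exc:
    sampled_result = f"failed: {exc}"

print(full_result)
print(sampled_result)
```

In this sketch the same logic succeeds on the full data and fails on the sample, purely because the rows it depends on never made it into the sample.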

Solution

When you run into this issue, add the property spark.dataframe.sampling.enabled to the Incorta MV and set it to false. With sampling disabled, the MV brings back all the data, which prevents the failure caused by the sample not containing enough rows. Then try to validate and save the MV again. It should succeed!
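As a sketch, the setting amounts to a single key/value pair on the MV. Only the property name and value come from the text above; the `key = value` notation is used here just for readability, and where exactly the property is entered in the MV definition is an assumption that may vary by Incorta version.

```
spark.dataframe.sampling.enabled = false
```

Because disabling sampling makes validation work on the full data set, it is probably best applied only to the MVs that actually hit this issue.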

Last update: 10-07-2022 11:59 AM