Groupby Pivot in Materialized View

mkrieger
Ranger

Hello, 

I am trying to create a materialized view with PySpark in which I reshape a dataframe using groupby and pivot, turning the values of an existing column into many new columns.

Like so:

 

from pyspark.sql.functions import first
result = df.groupby("ItemId").pivot("Question").agg(first("Answer", ignorenulls=True))

 

My code validates successfully. However, when I attempt to load the materialized view, I get the following error message:
INC_03020207: The result definition mismatches the current table definition. Columns [INCORTA LISTS THE COLUMN NAME VALUES I AM ATTEMPTING TO ADD HERE] are missing(-)/extra(+). Please re-run table discovery
 
How can I overwrite or update the table definition so that my transformation loads?
1 REPLY

rsather
Rocketeer

Hello,

I have run into this problem when new values are added to the source data. The key for me was to make Incorta treat the script as changed so that it rediscovers the new "columns". Adding a # on an empty line is enough of a difference to pull in the new fields. The next time it happens, I take the # back out of the script, which counts as the next "change". I keep going back and forth each time the underlying data changes.

Ryan
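
For anyone wondering why the mismatch shows up only when the source data changes: a pivot without an explicit value list creates one output column per distinct value of the pivot column, so the result schema depends on the data. Below is a minimal plain-Python sketch of that behaviour. The helper names (`pivot_first`, `diff_columns`) and the sample rows are hypothetical, it ignores null pivot values for simplicity, and it is not a description of how Incorta actually performs its check.

```python
def pivot_first(rows, key, pivot_col, value_col, columns=None):
    """Roughly emulate groupby(key).pivot(pivot_col).agg(first(value_col, ignorenulls=True))."""
    # Without an explicit list, the output columns are derived from the data itself.
    if columns is None:
        columns = sorted({r[pivot_col] for r in rows if r[pivot_col] is not None})
    result = {}
    for r in rows:
        out = result.setdefault(r[key], {c: None for c in columns})
        c = r[pivot_col]
        # Keep the first non-null value seen for each (key, column) pair.
        if c in out and out[c] is None and r[value_col] is not None:
            out[c] = r[value_col]
    return columns, result


def diff_columns(saved, current):
    """Return (missing, extra) columns relative to a saved definition."""
    missing = [c for c in saved if c not in current]
    extra = [c for c in current if c not in saved]
    return missing, extra


# Hypothetical sample data: a new Question value ("Weight") arrives on day 2.
rows_day1 = [
    {"ItemId": 1, "Question": "Color", "Answer": "red"},
    {"ItemId": 1, "Question": "Size", "Answer": None},
    {"ItemId": 1, "Question": "Size", "Answer": "L"},
]
rows_day2 = rows_day1 + [{"ItemId": 2, "Question": "Weight", "Answer": "3kg"}]

saved_cols, _ = pivot_first(rows_day1, "ItemId", "Question", "Answer")
new_cols, pivoted = pivot_first(rows_day2, "ItemId", "Question", "Answer")
missing, extra = diff_columns(saved_cols, new_cols)

# Passing a fixed column list keeps the output shape stable regardless of the data.
fixed_cols, fixed = pivot_first(rows_day2, "ItemId", "Question", "Answer", columns=saved_cols)
```

The `columns` argument mirrors the optional second argument of PySpark's `pivot`, for example `pivot("Question", ["Color", "Size"])`, which limits the output to the listed values. Whether pinning the list is acceptable depends on whether new Question values are supposed to become new columns. If they are, the table definition still has to be rediscovered, as Ryan describes.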