Given a data set as below:
the data source is attached: Group_Concat_Test.csv
It has a column named
value that contains a single data per row, And it is required to concatenate these rows grouping by the key column and it should be shown as comma-separated data values.
It should be like that:
Please help me to achieve it 🤷♀️ 😀.
I am certain the ordering of the values can be done, but I've spent longer than I had anticipated just getting this simple select to work:
SELECT "Key", array_join(collect_set("Value"), ',') as VLU
group by "Key"
order by key ;
I imported the text file ( note: I don't recommend using keywords as column names 😉 ) ran the table, then created an MV using Spark SQL
( I did it in the notebook view so I could test it easily as I went NOTE TO PRODUCT TEAM!
and the results are