I am trying to do a seemingly simple thing. I thought I could use a result-set approach but it did not work.
So, here is the task: group averages of numbers based on a category. An aggregate insight does that well. The problem is I want to put in a formula to calculate the difference between the averages.
The result is a transposed set with a difference column:
Any and all suggestions are appreciated.
Solved! Go to Solution.
Here's how this can be solved using an MV with Spark SQL:
I'm using an 'AdventureWorks' database, if your not familiar its simply a sample database often used for testing for a retailer that has tables such as OrderDetail, OrderHeader, Products, ProductSubcategory, ProductCategory etc...
In the first screenshot you can see I have a query that joins a few tables to create a result set in the same format you have with Categories & AvgLineTotal. This will become an inner query, but I'm applying the avg here to improve query performance(reduce as early as possible)
In the screen shot below you can see the result set of the query:
Next, lets use a Pivot function in an outer query, youll notice an avg is applied in the pivot function as its required to apply an aggregate function with pivot, however this does not effect the result in this case. Also to note, most modern versions of sql support this now, so we are going to use this to swap the categories into fields as shown below:
After running this we can now see the result set getting closer to our desired outcome:
And finally, we can perform any function/mathematical operations against those category fields, in our case I chose 'Bikes-Clothing AS Bike_Clothing_Diff' in the outer query as shown below and we can see our difference:
SELECT *, Bikes-Clothing AS Bike_Clothing_Diff FROM( SELECT pc.categoryname AS Category, avg(sod.linetotal) AS AvgLineTotal FROM OnlineStore2.salesorderdetail sod Left Join OnlineStore2.product p ON sod.productid = p.productid Left Join OnlineStore2.productsubcategory ps ON p.productsubcategoryid = ps.productsubcategoryid Left Join OnlineStore2.productcategory pc ON ps.productcategoryid = pc.productcategoryid GROUP BY pc.categoryname ) a PIVOT(avg(AvgLineTotal) FOR Category IN('Bikes','Clothing','Accessories','Components'))
Hopefully this gives you an additional option, please let us know if you have any further questions or if you need any further detail anywhere in this process.
Zach, Solutions Engineer @ Incorta
Here is an example I have extended upon the previous. In the first image you can see that within the pivot function after the 'PIVOT(avg(AvgLineTotal' you can then simply add a comma and add the next aggregate 'SUM(OrderQTYTotal)' as shown in the first image below.
Doing this will create a field for each combination of aggregates and categories as shown in the next image.
SELECT *, Bikes_AvgLineTotal-Clothing_AvgLineTotal AS Bike_Clothing_AvgLineTotal_Diff, Bikes_OrderQTYTotal-Clothing_OrderQTYTotal AS Bike_Clothing_OrderQTYTotal_Diff FROM( SELECT pc.categoryname AS Category, avg(sod.linetotal) AS AvgLineTotal, sum(OrderQTY) AS OrderQTYTotal FROM OnlineStore2.salesorderdetail sod Left Join OnlineStore2.product p ON sod.productid = p.productid Left Join OnlineStore2.productsubcategory ps ON p.productsubcategoryid = ps.productsubcategoryid Left Join OnlineStore2.productcategory pc ON ps.productcategoryid = pc.productcategoryid GROUP BY pc.categoryname ) a PIVOT(avg(AvgLineTotal) AS AvgLineTotal, SUM(OrderQTYTotal) AS OrderQTYTotal FOR Category IN('Bikes','Clothing','Accessories','Components'))
Appreciate the follow up, always happy to help so let us know if you have any more questions!
Zach, Solutions Engineer @ Incorta