Solved: Re: What is a simple Excel reference is hard in In... - Page 2

marcpaige · ‎10-26-2022

I am trying to do a seemingly simple thing. I thought I could use a result-set approach but it did not work.

So, here is the task: group averages of numbers based on a category. An aggregate insight does that well. The problem is I want to put in a formula to calculate the difference between the averages.

Data:

Category	Average A
Cat1	10
Cat2	5

The result is a transposed set with a difference column:

Metric	Cat1	Cat2	Difference
Average A	10	5	5

Any and all suggestions are appreciated.

marcpaige · ‎10-28-2022

To round out the solution: I also had to use an Incorta Analyzer table to calculate the averages and have access to them from the KPI insight. The MV wouldn't allow me to do the calculations in SQL.

ZachBreimayer · ‎10-28-2022

Here's how this can be solved using an MV with Spark SQL:

I'm using an 'AdventureWorks' database, if your not familiar its simply a sample database often used for testing for a retailer that has tables such as OrderDetail, OrderHeader, Products, ProductSubcategory, ProductCategory etc...

In the first screenshot you can see I have a query that joins a few tables to create a result set in the same format you have with Categories & AvgLineTotal. This will become an inner query, but I'm applying the avg here to improve query performance(reduce as early as possible)

In the screen shot below you can see the result set of the query:

Next, lets use a Pivot function in an outer query, youll notice an avg is applied in the pivot function as its required to apply an aggregate function with pivot, however this does not effect the result in this case. Also to note, most modern versions of sql support this now, so we are going to use this to swap the categories into fields as shown below:

After running this we can now see the result set getting closer to our desired outcome:

And finally, we can perform any function/mathematical operations against those category fields, in our case I chose 'Bikes-Clothing AS Bike_Clothing_Diff' in the outer query as shown below and we can see our difference:

SELECT *, Bikes-Clothing AS Bike_Clothing_Diff FROM(
SELECT pc.categoryname AS Category, avg(sod.linetotal) AS AvgLineTotal
FROM OnlineStore2.salesorderdetail sod
     Left Join OnlineStore2.product p ON sod.productid = p.productid
     Left Join OnlineStore2.productsubcategory ps ON p.productsubcategoryid = ps.productsubcategoryid
     Left Join OnlineStore2.productcategory pc ON ps.productcategoryid = pc.productcategoryid
GROUP BY pc.categoryname
) a
       PIVOT(avg(AvgLineTotal)
       FOR Category IN('Bikes','Clothing','Accessories','Components'))

Hopefully this gives you an additional option, please let us know if you have any further questions or if you need any further detail anywhere in this process.

Regards,

Zach, Solutions Engineer @ Incorta

marcpaige · ‎10-28-2022

Thank you, Zach. This is very helpful.

marcpaige · ‎11-04-2022

I have a follow-up for you Zach. @ZachBreimayer: How would the SQL change for multiple aggregate columns?

ZachBreimayer · ‎11-04-2022

Hi Marc,

Here is an example I have extended upon the previous. In the first image you can see that within the pivot function after the 'PIVOT(avg(AvgLineTotal' you can then simply add a comma and add the next aggregate 'SUM(OrderQTYTotal)' as shown in the first image below.

Doing this will create a field for each combination of aggregates and categories as shown in the next image.

SELECT *, Bikes_AvgLineTotal-Clothing_AvgLineTotal AS Bike_Clothing_AvgLineTotal_Diff, Bikes_OrderQTYTotal-Clothing_OrderQTYTotal AS Bike_Clothing_OrderQTYTotal_Diff FROM(
SELECT pc.categoryname AS Category, avg(sod.linetotal) AS AvgLineTotal, sum(OrderQTY) AS OrderQTYTotal
FROM OnlineStore2.salesorderdetail sod
     Left Join OnlineStore2.product p ON sod.productid = p.productid
     Left Join OnlineStore2.productsubcategory ps ON p.productsubcategoryid = ps.productsubcategoryid
     Left Join OnlineStore2.productcategory pc ON ps.productcategoryid = pc.productcategoryid
GROUP BY pc.categoryname
) a
       PIVOT(avg(AvgLineTotal) AS AvgLineTotal, SUM(OrderQTYTotal) AS OrderQTYTotal
       FOR Category IN('Bikes','Clothing','Accessories','Components'))

Appreciate the follow up, always happy to help so let us know if you have any more questions!

Zach, Solutions Engineer @ Incorta

What is a simple Excel reference is hard in Incorta?