cancel
Showing results for 
Search instead for 
Did you mean: 
amit_kothari
Employee
Employee

Simply put, a Z-Score is the number of standard deviations from the mean a data point is.  In other words, Z-Score measures a particular performance against the entire field of values, accounting for both the average result and the full range of performances from top to bottom.  In sports, this measure can be particularly useful for comparing athlete performances when different venues are considered.  For example, shooting a 62 for a professional golfer is great, but how great is it really if the golf course is super easy or other golfers are also scoring really low on that particular day.  This article from Grantland does a great job of describing this measure.

Formulaically, a z-score is calculated as:

z = (x - µ)/s

where:

   µ is the mean of the population, and

   s is the standard deviation of the population

I recently had to write a SQL statement against the Incorta SQL interface to produce this calculation.  The trick with SQL is that you have to do a "first pass" to calculate the standard deviation.  As seen below, I do this in a separate SELECT and CROSS JOIN it to my main query.  I figured I would share it here in case it benefits others down the road as this is a fairly common metric.

The SQL statement below calculates the Z-Score for each AccountNumber based on how much a particular account purchased (AcctSalesTotal) as compared to the rest of the accounts in June, 2014.  The formula for the Z-Score can be seen at the top of the statement, (t.AcctSalesTotal - tt.AvgAll) / tt.sdAll As zScore.  The mean (AvgAll) and the standard deviation (sdAll) are calculated in the SELECT below and CROSS JOINED into our main SELECT.

SELECT t.AccountNumber,
(t.AcctSalesTotal - tt.AvgAll) / tt.sdAll As zScore
FROM
        (SELECT soh.AccountNumber, SUM(sod.UnitPrice) AS AcctSalesTotal
        FROM AW.salesorderdetail sod, AW.salesorderheader soh
        WHERE soh.SalesOrderID = sod.SalesOrderID
        AND date_part('month', soh.OrderDate) = 6
        AND date_part('year', soh.OrderDate) = 2014
        GROUP BY soh.AccountNumber) as t
CROSS JOIN
        (
        SELECT avg(tt.AcctSalesTotal) as AvgAll, stddev(tt.AcctSalesTotal) as sdAll
        FROM
                (SELECT soh.AccountNumber, SUM(sod.UnitPrice) AS AcctSalesTotal
                FROM AW.salesorderdetail sod, AW.salesorderheader soh
                WHERE soh.SalesOrderID = sod.SalesOrderID
                AND date_part('month', soh.OrderDate) = 6
                AND date_part('year', soh.OrderDate) = 2014
                GROUP BY soh.AccountNumber) as tt
        ) as tt;

Happy building!

Best Practices Index
Best Practices

Just here to browse knowledge? This might help!

Contributors
Version history
Last update:
‎04-27-2022 05:30 PM
Updated by: