on 12-04-2024 10:48 AM
This update allows users to enable the Advanced SQL Interface without automatically activating Null Handling, preserving the behavior of existing dashboards and join calculations.
Understanding how these features interact will help you make informed decisions about your configuration.
In earlier 2024.7.x releases, the Loader Service always treated records with null values individually. During join calculations, a null value was not considered equal to another null, zero, or empty value. The Loader Service also accounted for null values when evaluating join filters.
Starting with the 2024.7.3 release, the Loader Service will respect the Null Handling flag when performing joins and evaluating join filters:
In earlier 2024.7.x releases, enabling the Advanced SQL Interface automatically enables Null Handling.
Starting with the 2024.7.3 release, the Advanced SQL Interface can operate independently of Null Handling, offering users greater flexibility and control.
As mentioned, the Advanced SQL Interface determines the optimal execution path for queries by leveraging the Engine or Spark to optimize performance. While Spark inherently respects null values, the Engine requires Null Handling to be enabled; otherwise, nulls are interpreted as zeros or empty values, potentially leading to inconsistencies.
Enable Null Handling alongside the Advanced SQL Interface to ensure consistent query results.
Given the dataset:
ProductName | Category | Price |
Laptop |
Electronics |
1000 |
Mouse |
Accessories |
25 |
Keyboard | Accessories |
45 |
Monitor |
Electronics |
200 |
Tablet | Electronics |
NULL |
Printer |
Electronics |
150 |
Smartphone |
Electronics |
800 |
Webcam |
Accessories |
NULL |
Speaker |
Audio | 120 |
MS-Office |
Software |
100 |
MS-Office |
Software |
50 |
MS-Office |
Software |
NULL |
SELECT ProductName, avg(Price) as avgPrice, percentile(Price, 0.3) as percentilePrice
FROM productsalesschema.productdata
WHERE ProductName IN ('Webcam', 'Tablet', 'MS-Office')
GROUP BY ProductName
ORDER BY ProductName
ProductName | Avg price | percentilePrice |
MS-Office |
75.00 |
65.00 |
Tablet |
NULL |
NULL |
Webcam |
NULL |
NULL |
Create a new listing table insight in any Dashboard using the Spark SQL View mentioned above. Then, drag and drop ProductName and avgPrice into the insight.
Result: Querying only ProductName and avgPrice will be processed by the Engine, which ignores null values because Null Handling is disabled.
ProductName | avgPrice |
MS-Office |
50.00 |
Tablet |
0.00 |
Webcam |
0.00 |
ProductName | avgPrice | percentilePrice |
MS-Office |
75.00 |
65.00 |
Tablet |
NULL |
NULL |
Webcam |
NULL |
NULL |
Let’s try running the following Spark SQL queries directly from DbVisualizer:
The query below retrieves the Product Name, Average Price, and Percentile of the Price. It is processed by Spark, which ensures that null values are properly accounted for when calculating both the average and the percentile.
SELECT ProductName, avg(Price) as avgPrice,
percentile(Price, 0.3) as percentilePrice
FROM productsalesschema.productdata
WHERE ProductName IN ('Webcam', 'Tablet', 'MS-Office')
GROUP BY ProductName
ORDER BY ProductName
Result:
ProductName | avgPrice | percentilePrice |
MS-Office |
75.00 |
65.00 |
Tablet |
NULL |
NULL |
Webcam |
NULL |
NULL |
The query below retrieves the Product Name and Average Price. The Engine processes it, and since the Null Handling feature is disabled, null values will be treated as zeros.
SELECT ProductName, avg(Price) as avgPrice
FROM productsalesschema.productdata
WHERE ProductName IN ('Webcam', 'Tablet', 'MS-Office')
GROUP BY ProductName
ORDER BY ProductName
Result:
ProductName | avgPrice |
MS-Office |
50.00 |
Tablet |
0.00 |
Webcam |
0.00 |
This update provides greater flexibility by allowing users to enable the Advanced SQL Interface independently of Null Handling. However, users must review their configurations to ensure query results remain consistent and accurate.