cancel
Showing results for 
Search instead for 
Did you mean: 
emanmoussa
Employee
Employee

Introduction

This update allows users to enable the Advanced SQL Interface without automatically activating Null Handling, preserving the behavior of existing dashboards and join calculations.

What you need to know before reading this article

  • Null Handling Overview: Null Handling differentiates between nulls, zeros, empty strings, and empty dates.
  • Advanced SQL Interface Overview:
    • This feature connects Incorta to external BI tools (e.g., Tableau, Power BI) and optimizes query execution using Spark and the Incorta Engine.
    • The Advanced SQL Interface determines the optimal execution path for queries, leveraging the Engine or Spark to enhance performance.

Understanding how these features interact will help you make informed decisions about your configuration.

Null Handling and Advanced SQL Interface Updates

Null Handling: Behavior Changes

In earlier 2024.7.x releases, the Loader Service always treated records with null values individually. During join calculations, a null value was not considered equal to another null, zero, or empty value. The Loader Service also accounted for null values when evaluating join filters.

Starting with the 2024.7.3 release, the Loader Service will respect the Null Handling flag when performing joins and evaluating join filters:

  • If Null Handling is enabled, a null value will not be treated as equal to another null, zero, or empty value during join calculations. This ensures that nulls are preserved as distinct values during the join process.
  • If Null Handling is disabled:
    • Nulls in numeric columns will be treated as zeros.
    • Nulls in string/text columns will be treated as empty strings.
    • Nulls in date columns will be treated as empty dates.

Key Consideration:

  • After enabling or disabling this setting, perform a load from staging or wait for the next load job to process the changes. All services must be restarted.

Advanced SQL Interface: Behavior Changes

In earlier 2024.7.x releases,  enabling the Advanced SQL Interface automatically enables Null Handling.

Starting with the 2024.7.3 release, the Advanced SQL Interface can operate independently of Null Handling, offering users greater flexibility and control.

As mentioned, the Advanced SQL Interface determines the optimal execution path for queries by leveraging the Engine or Spark to optimize performance. While Spark inherently respects null values, the Engine requires Null Handling to be enabled; otherwise, nulls are interpreted as zeros or empty values, potentially leading to inconsistencies.

Recommendation:

Enable Null Handling alongside the Advanced SQL Interface to ensure consistent query results. 

Use Cases of Inconsistency when Advanced SQLi is enabled, and Null Handling is disabled

Given the dataset:

ProductName Category  Price
Laptop
Electronics
1000
Mouse
Accessories
25
Keyboard Accessories
45

Monitor

Electronics
200
Tablet Electronics
NULL
Printer
Electronics
150
Smartphone
Electronics
800
Webcam
Accessories
NULL
Speaker
Audio 120
MS-Office
Software
100
MS-Office
Software
50
MS-Office
Software
NULL
Example1: Spark SQL Views & Insights in Incorta:
  1. Edit or Create a business schema
  2. Create a Spark SQL View with the query below and click on execute

 

 

 

SELECT ProductName, avg(Price) as avgPrice, percentile(Price, 0.3) as percentilePrice 
FROM productsalesschema.productdata 
WHERE ProductName IN ('Webcam', 'Tablet', 'MS-Office') 
GROUP BY ProductName 
ORDER BY ProductName

 

 

 

  1. Result:  Since the query involves the percentile function, Spark processes it, and the result will include null where appropriate: 
    ProductName Avg price  percentilePrice
    MS-Office
    75.00
    65.00
    Tablet
    NULL
    NULL
    Webcam
    NULL
    NULL
     emanmoussa_1-1732562439008.png
  2. Create a new listing table insight in any Dashboard using the Spark SQL View mentioned above. Then, drag and drop ProductName and avgPrice into the insight.

  3. Result: Querying only ProductName and avgPrice will be processed by the Engine, which ignores null values because Null Handling is disabled. 

    ProductName  avgPrice
    MS-Office
    50.00
    Tablet
    0.00
    Webcam
    0.00

     

    emanmoussa_2-1732562499775.png
  4. Drag and drop percentilePrice to the insight and observe the result. 
  5. Result: Adding percentilePrice to the insight triggers Spark to process the entire query, ensuring null values are respected and accurately reflected in the results.
    ProductName avgPrice  percentilePrice
    MS-Office
    75.00
    65.00
    Tablet
    NULL
    NULL
    Webcam
    NULL
    NULL

emanmoussa_3-1732562557287.png

Example 2: Running Queries from BI Tools:

Let’s try running the following Spark SQL queries directly from DbVisualizer:

Scenario 1:

The query below retrieves the Product Name, Average Price, and Percentile of the Price. It is processed by Spark, which ensures that null values are properly accounted for when calculating both the average and the percentile.

 

 

 

SELECT ProductName, avg(Price) as avgPrice, 
percentile(Price, 0.3) as percentilePrice 
FROM productsalesschema.productdata 
WHERE ProductName IN ('Webcam', 'Tablet', 'MS-Office') 
GROUP BY ProductName 
ORDER BY ProductName

 

 

 

Result: 

ProductName avgPrice  percentilePrice
MS-Office
75.00
65.00
Tablet
NULL
NULL
Webcam
NULL
NULL
Scenario 2:

The query below retrieves the Product Name and Average Price. The Engine processes it, and since the Null Handling feature is disabled, null values will be treated as zeros.

 

 

SELECT ProductName, avg(Price) as avgPrice 
FROM productsalesschema.productdata 
WHERE ProductName IN ('Webcam', 'Tablet', 'MS-Office') 
GROUP BY ProductName 
ORDER BY ProductName

 

 

 

 Result: 

ProductName  avgPrice
MS-Office
50.00
Tablet
0.00
Webcam
0.00

Conclusion:

This update provides greater flexibility by allowing users to enable the Advanced SQL Interface independently of Null Handling. However, users must review their configurations to ensure query results remain consistent and accurate.

 

Best Practices Index
Best Practices

Just here to browse knowledge? This might help!

Contributors
Version history
Last update:
‎12-04-2024 10:48 AM
Updated by: