02-28-2022 02:56 PM - edited 12-12-2023 11:04 AM
Incorta’s Direct Data Mapping technology is the reason that Incorta is so fast and much of what it does depends on the relationships or joins that you define between the objects that you create and load in Incorta. Understanding how to work with joins will help you set up your data model so that it gives your users what they need.
We recommend that you be familiar with these Incorta concepts before exploring this topic further.
The information in this article applies to versions 4.0 of Incorta and above.
Join definitions are what allow you to create the needed relationships between your data to guarantee you get the results that you expect on your Incorta dashboards. Incorta also uses the joins you define to create the Direct Data Mapping that gives Incorta all that speed. That said, it is important to keep a few things in mind as you work with Incorta joins.
During the load process, Incorta uses its proprietary Direct Data Mapping process to build the relationships between all the data objects that have been loaded into Incorta. This process takes into account all of the joins that have been defined and actually builds a “map” based on them. While you need to define the joins that create the right relationships between your tables, be careful about building every join that it is possible to build. In fact, if you do not need a join, do not define it. There are two benefits to not building joins that will not be used.
Aliases are created to resolve circular joins or to support role playing dimensions. For example, a sales_order table may have both ship_to_customer and bill_to_customer columns so in Incorta you would need an Alias of the customer table so that you can define two joins from the sales_order table.
Make sure that any additional joins that are needed are added for the Alias to do the job you created it for and that any joins that are not needed are removed. Find out more about Alias Join Rules in Docs.
A benefit of Incorta is that once you have brought the data into Incorta, you can join data from disparate data sources. One of the features that supports this is the ability to join tables that are in different physical schemas. We call this a cross-schema join.
Cross-schema joins are powerful, but prior to release 6 (on premises) and release 2023.7.0 (Incorta Cloud) because data is loaded by schema, if your load schedules are not synched for data loaded in cross-schema joined physical schemas, results may not be consistent. For example, if you have customer attribute data in two different systems that you import into Incorta so that you can report a unified view of the customer and one of the sources loads once per day while the other loads several times per day, you may not always see an up to date picture of your customer simply because the load schedules are not fully synched. In later Incorta releases with the Load Plan feature, you can schedule the joined schemas together and Incorta can manage the dependencies between tables for you directly.
Because Incorta is aware of every join path between tables, often it has multiple to choose from. Base table specification allows you to force Incorta to use the join path that you select. There are a number of use cases where they come in handy.
Note that you can set Base Tables from the Analyzer which means that you can apply them to whatever objects you can create from the Analyzer screen, i.e. insights, Incorta Analyzer tables, and Incorta views.
If you create formula columns at the table level, it is possible to use these for joins. This can be advantageous if you need to reshape a data element in some way in order to format it in such a way that it can be used for joins.
Note, however, that there is a table formula column use case that is not supported for use in joins. If you define a table formula column that uses the lookup() function, you will not be able to use that column for joins.
If there are related articles either on Community or in Docs or elsewhere, list the links here: