In order to achieve a holistic view of business operations, it is common for organizations to unify similar data from disparate sources. This is often required after an ERP migration, company acquisition, or migration of data from on-premises solutions to cloud-based solutions.
For simple use cases where disparate data sources share similar schemas, users can leverage the native multi-source table feature. This is the best option for simple use cases like merging two separate instances of Oracle EBS. For use cases that require more flexibility and control, it is recommended to use one of the three approaches in this article (SQL View, Incorta SQL Table, Materialized View).
Using a SQL View is a good option when you have a small amount of data (less than 100 million rows) because SQL Views are processed at runtime. As users open and interact with your dashboard, the data presented will be processed at that moment. The logic is very simple with this approach, but you will not be able to join the view to other tables across schemas.
The SQL View feature was made available with release 2022.3 of Incorta and must be activated from the CMC.
First, create a business schema with two views. One will contain data from your first source. In this example we will use Oracle EBS. The second will be similar data from your second source. We will use SAP. Both sources contain invoice data from accounts payable modules.
Then create a SQL View in your business schema to union the two sources together.
The query includes WHERE clauses that are specific to the use case. A WHERE clause is not required for all use cases.
In our example we finish with three views: (EBS_AP, SAP_AP, and a third "union" view). We can now begin building insights using the union view that combines data points from both sources.
Option 2: Incorta SQL Table
The SQL Table is very similar to the SQL view, but the data is persisted in the physical schema. SQL Views are processed at runtime. If you have a large volume of data (more than 100 million rows) and/or your use case requires other data to be joined to the resulting table, this is a better option than the SQL view. Incorta SQL Tables should be created in a new schema so the load can be scheduled to execute after the source system data is loaded.
In this example, we create a new schema that serves as a bridge between our two disparate sources. In the new schema, create a new derived table using Incorta SQL. Similar to the previous example, we will write a SQL query to union data from a business schema. We use business schemas as a way to simplify the complexity of the source systems when writing our query.
The new Incorta SQL Table will be refreshed with each load of the schema, and it can be joined to tables across other schemas for data enrichment. When this table is joined to others for enrichment, logic must be applied to ensure that data is queried from the correct source. You can learn more about business schema design for multi-source reporting here.
Option 3: Materialized View
A Materialized View (MV) should be used when you have a large volume of data to work with (more than 100 million rows) or require complex business logic to unify the data from multiple tables together. Materialized views are processed during the load of a schema. This minimizes latency when interacting with data in dashboards and allows users to join the MV table with other tables across schemas.
Similar to the last example, we will create a new schema to serve as the bridge between our Oracle EBS and SAP data. In the new schema, create a materialized view using your language of choice. We will use Spark SQL. The script below creates a table with four columns: Source Name, Transaction ID, Transaction Line ID, and Order Date.
After the bridge table is created, we can use the IDs and dates to join to the existing EBS data, SAP data, and date dimensions.
As we explore the data and build insights, Incorta intelligently builds the multi-source query allowing us to produce a holistic view of business operations in a single dashboard. Below is an example of a query plan using a bridge MV.
To ensure that dimensions and measures are pulled from the appropriate sources, logic should be defined at the business schema level. You can learn more about business schema design for multi-source reporting here.
These are three common approaches to multi-source table design in Incorta. The best approach for you will depend on your data sources, data volume, and use case(s).