Incorta now supports Load Plans, which replace individual Schema Load schedules to make loading data into Incorta more efficient. The advantage of Load Plans is that they allow Incorta to identify the dependencies between schemas and to load many schemas together. This eliminates the need to schedule related schemas individually and, more importantly, the need to manually time when each schema load should start, which removes guesswork and wasted idle time. Additionally, post-processing of loads is reduced because that step now only needs to happen once per Load Plan group instead of once for each schema.
This article provides tips and tricks for getting the most out of Load Plans for scheduling new data loads and converting existing Schema Load schedules to Load Plan schedules successfully.
As noted in the introduction, Load Plans are more efficient than their predecessors, Schema Loads, in that they allow multiple schemas to be run together. When a Load Plan runs with more than one schema, it treats all the table objects within all the schemas included in a group (more on this later) as if they belong to a single pool. It then determines all the dependencies within the pool and where it can run objects in parallel. This short planning phase is followed by the actual load, which follows the plan and executes the post-load process only once per sequential loading group. This is the biggest differentiator of Load Plans. In other respects, such as scheduling and support for full, incremental, and staging loads, they operate much like Schema Loads always have.
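To make the planning phase more concrete, here is a minimal Python sketch of the general idea: pool the table objects, then sort them into "waves" where everything in a wave has its dependencies satisfied and could load in parallel. This is an illustration only, not Incorta's actual planner. The function name `plan_waves` and the table names are hypothetical.

```python
def plan_waves(dependencies):
    """Sort pooled objects into waves that could each load in parallel.

    dependencies maps each object to the set of objects it depends on.
    Illustrative only: this is not how Incorta is implemented internally.
    """
    remaining = {obj: set(deps) for obj, deps in dependencies.items()}
    # Make sure objects that only appear as dependencies are in the pool too.
    for deps in dependencies.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    waves = []
    while remaining:
        # Objects with no unmet dependencies are ready to load now.
        ready = sorted(obj for obj, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("circular dependency between objects")
        waves.append(ready)
        for obj in ready:
            del remaining[obj]
        # Loaded objects no longer block anything that depended on them.
        for deps in remaining.values():
            deps.difference_update(ready)
    return waves


# Hypothetical pool of tables drawn from three schemas in one group.
pool = {
    "Inventory.items": set(),
    "Inventory.on_hand": {"Inventory.items"},
    "BOM.components": {"Inventory.items"},
    "WIP.jobs": {"BOM.components", "Inventory.on_hand"},
}
waves = plan_waves(pool)
for number, wave in enumerate(waves, start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

In this example, the two tables in the second wave come from different schemas but can still load side by side, which is the benefit of treating the group as a single pool.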
In the remainder of this article, we will explore how to take full advantage of the Load Plan feature.
Once you upgrade to a version of Incorta that features Load Plans, your Schema Load schedules will be replaced with Load Plan schedules on a one-for-one basis. That is to say, everywhere you had a schema scheduled, you will now have a Load Plan that contains that single schema on the same schedule. For the sake of efficiency, you will want to consolidate your schemas into fewer Load Plans. To do this:
When getting ready to define a Load Plan, there are several factors to think about as you determine which schemas to include in the Load Plan.
Schemas themselves are generally defined with tables that are related or joined to one another, but there are often multiple schemas with related data that it makes sense to load together. For example, you might want to load Inventory, BOM, and WIP schemas together. A good indicator of which schemas to load together, besides natural business fit, is the presence of cross-schema joins between their tables, which tells you that the data is related.
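If you have a list of your cross-schema joins, one way to turn it into candidate groupings is to treat schemas as connected whenever a join links them. The sketch below assumes you have already collected the joins as pairs of schema names (for example, by reviewing your schema definitions). The function name `group_related_schemas` and the sample data are hypothetical, and the code does not call any Incorta API.

```python
def group_related_schemas(schemas, cross_schema_joins):
    """Return lists of schemas that are linked by cross-schema joins.

    cross_schema_joins is a list of (schema_a, schema_b) pairs, one for
    each join that spans two schemas. Uses a simple union-find structure.
    """
    parent = {schema: schema for schema in schemas}

    def find(schema):
        # Follow parent links to the representative of this schema's set.
        while parent[schema] != schema:
            parent[schema] = parent[parent[schema]]  # shorten the path
            schema = parent[schema]
        return schema

    for left, right in cross_schema_joins:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left  # merge the two sets

    groups = {}
    for schema in schemas:
        groups.setdefault(find(schema), []).append(schema)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical example: HR has no joins to the manufacturing schemas.
schemas = ["Inventory", "BOM", "WIP", "HR"]
joins = [("BOM", "Inventory"), ("WIP", "BOM")]
candidates = group_related_schemas(schemas, joins)
print(candidates)
```

Each resulting list is only a starting point for a Load Plan. Business fit and timing, discussed in this article, should still inform the final choice.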
Another indicator that schemas are related is if their data elements are reported on together. You would want the data to stay synchronized among these schemas for the consumers of your dashboards so they see a chronologically complete picture. This sort of relationship can be identified when you see business views that use data elements from multiple physical schemas or dashboards with insights based on different physical schema data. The data lineage feature can help you identify these relationships.
An alternative but efficient way to group schemas in Load Plans is to look at when their data is ready in the source systems. The schemas in a Load Plan do not necessarily need to be related. Incorta will figure out the dependencies between them and load as many tables as it can in parallel. If there is nothing a schema needs to wait on, you can load it as soon as it is ready, along with whichever other schemas are ready at the same time.
Another way you might consider grouping schemas in Load Plans, or within Groups in Load Plans, is by the length of time that schemas typically take to process. For example, you might have four schemas that you are thinking about clubbing together in a Load Plan: two run in less than five minutes, and two run in about thirty minutes. You could put them all into one load plan without any groups (or you can think of it as one group) but you would have to wait for all four of the schemas to finish processing before the data from any of them is available.
Assuming that the short-running schemas are not dependent on the longer-running schemas, then by changing the way you set up your Load Plan or Load Plans, you can make the data from the fast-loading schemas available for analytics as soon as they complete the load process (in less than five minutes) and still make the data from the two slower loading schemas available in about thirty minutes. You would accomplish this by splitting the four schemas into two different Load Plans or by distributing the schemas into two groups within a single Load Plan. In the latter case, you would add the two fast-running schemas to a group (Group 1) that comes before the group (Group 2) containing the two long-running schemas. Note that if you use one load plan with two groups, the second group will not begin until the first group has completed processing.
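The trade-off between these layouts can be estimated with some quick arithmetic. The sketch below makes simplifying assumptions that are not taken from Incorta itself: schemas within a group load fully in parallel, a group finishes when its slowest schema finishes, post-load time is ignored, and the durations are made-up values consistent with the example (under five minutes and about thirty minutes). The names `availability`, `fast`, and `slow` are hypothetical.

```python
def availability(groups):
    """Estimate minutes until each schema's data is available.

    groups is an ordered list of dicts mapping schema name to its
    typical load time in minutes. Groups run one after another.
    Assumption: a group takes as long as its slowest schema.
    """
    elapsed = 0
    ready_at = {}
    for group in groups:
        elapsed += max(group.values())
        for schema in group:
            ready_at[schema] = elapsed
    return ready_at


# Hypothetical load times in minutes.
fast = {"fast_a": 3, "fast_b": 4}
slow = {"slow_a": 28, "slow_b": 30}

# Option 1: one Load Plan, no groups (effectively one group).
one_group = availability([{**fast, **slow}])

# Option 2: one Load Plan with Group 1 (fast) before Group 2 (slow).
two_groups = availability([fast, slow])

# Option 3: two separate Load Plans started at the same time.
two_plans = {**availability([fast]), **availability([slow])}

print("One group: ", one_group)
print("Two groups:", two_groups)
print("Two plans: ", two_plans)
```

Under these assumptions, the fast schemas are available after about four minutes in both split layouts, compared with thirty minutes in a single group. With two groups, the slow schemas finish a few minutes later than with two separate Load Plans, because the second group waits for the first to complete.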
Starting with release 2023.7.0, it became possible to define Sequential Loading Groups within a Load Plan. This gives you the ability, within a Load Plan, to orchestrate the order of the load as groups run sequentially, meaning that all the schemas within Group 1 will load before any of the schemas in Group 2 will begin to load, and so forth.
Above, we described a use case where groups enable you to control when data becomes available for analytics based on load time. Another and possibly more critical reason to use groups is that they allow you to set the order when it is necessary that one table object fully processes before another can begin to process. Here are a couple of scenarios where this could come in handy.
Load Plans with Sequential Loading Groups give you the flexibility to manage these use cases and many more.