cancel
Showing results for 
Search instead for 
Did you mean: 
amit_kothari
Employee
Employee

Problem Statement

An Incorta job runs for more than 40 hours and none of us noticed until business users contacted us and asked why they do not see the updated data on the dashboard.

Incorta provided us with the Long Spinning Job Alert before. Why does it not work?

Requirements

We should get the notification via email or messenger.

Solution

Deploy the Incorta Metadata schema

We recommend all customers deploy the Incorta metadata schema. In particular, the tables involved are

  • LDR_JOB
  • LDR_JOB_DETAILS

Please note that these two history tables are useful for Job Analysis (Incorta Loading Job Analysis ) and for sending alert after the fact, but they are not satisfying our requirement.

  • LDR_JOB_HISTORY
  • LDR_JOB_DETAIL_HISTORY

Create formula columns for displaying the Job Load State

Unfortunately Incorta currently does not ship the schema for the Incorta metadata database out of the box. You can, however, use the Schema Wizard to create these Incorta tables in a local schema. Note that you will need to create additional formula columns manually.

You can copy and paste the following formula columns:

Create the formula column STATE_NAME in LDR_JOB:

case(
     IncortaMetadata.LDR_JOB.STATE=-1,'Initial'
   , IncortaMetadata.LDR_JOB.STATE=1,'Extract Ready'
   , IncortaMetadata.LDR_JOB.STATE=2,'Extraction Started'
   , IncortaMetadata.LDR_JOB.STATE=3,'Extraction Failed'
   , IncortaMetadata.LDR_JOB.STATE=4,'Extraction Finished'
   , IncortaMetadata.LDR_JOB.STATE=5,'Loading Started'
   , IncortaMetadata.LDR_JOB.STATE=6,'Loading Failed'
   , IncortaMetadata.LDR_JOB.STATE=7,'Loading Finished'
   , IncortaMetadata.LDR_JOB.STATE=8,'Transformation Started'
   , IncortaMetadata.LDR_JOB.STATE=9,'Transformation Failed'
   , IncortaMetadata.LDR_JOB.STATE=10,'Transformation Finished'
   , IncortaMetadata.LDR_JOB.STATE=11,'PostLoad Started'
   , IncortaMetadata.LDR_JOB.STATE=-10,'Not Completed'
   , IncortaMetadata.LDR_JOB.STATE=100,'Finished with Errors'
   , IncortaMetadata.LDR_JOB.STATE=102,'Interrupting'
   , IncortaMetadata.LDR_JOB.STATE=101,'Interrupted'
   , 'N/A')

Create the formula column STATE_NAME in LDR_JOB_DETAILS:

case(
     IncortaMetadata.LDR_JOB_DETAILS.STATE=-1,'Initial'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=1,'Extract Ready'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=2,'Extraction Started'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=3,'Extraction Failed'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=4,'Extraction Finished'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=5,'Loading Started'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=6,'Loading Failed'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=7,'Loading Finished'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=8,'Transformation Started'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=9,'Transformation Failed'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=10,'Transformation Finished'
,    IncortaMetadata.LDR_JOB_DETAILS.STATE=11,'PostLoad Started'
,    'N/A')

Add the DH_UPDATE_TIME column to the extract SQL

Here are the SQL statements used in extracting data from a MySQL metadata database.

The reason why we need this column is to report the correct status to the user in the alert. We are showing the status when the metadata database is refreshed in Incorta. The status is as good as the time when the data was extracted.

SELECT
    `ID`,
    `NAME`,
    `NODE_NAME`,
    `TENANT_ID`,
    `OWNER_ID`,
    `STATE`,
    `START_TIME`,
    `LAST_MODIFIED`,
    `EXTRACT_START`,
    `EXTRACT_END`,
    `TRANSFORMATION_START`,
    `TRANSFORMATION_END`,
    `LOAD_START`,
    `LOAD_END`,
    `IS_TEST_RUN`,
    current_timestamp() DH_UPDATE_TIME
FROM
    `incorta`.`LDR_JOB`
and the other table;
SELECT
    `JOB_ID`,
    `TABLE_NAME`,
    `STATE`,
    `LOAD_TYPE`,
    `EXTRACT_START`,
    `EXTRACT_END`,
    `LOAD_START`,
    `LOAD_END`,
    `EXTRACTED_ROW_COUNT`,
    `REJECTED_ROW_COUNT`,
    `LOADED_ROW_COUNT`,
    `PRE_ROW_COUNT`,
    `POST_ROW_COUNT`,
    `MESSAGE`,
    `TRANSFORMATION_START`,
    `TRANSFORMATION_END`,
    current_timestamp() DH_UPDATE_TIME
FROM
    `incorta`.`LDR_JOB_DETAILS`;

Joins between LDR_JOB and LDR_JOB_DETAILS

Create a Current Running Jobs dashboard

Create a dashboard including the the following formula fields.

The formula used to calculate the Job level duration:

timeBetween(IncortaMetadata.LDR_JOB.DH_UPDATE_TIME, IncortaMetadata.LDR_JOB.START_TIME) / 60000

The formula used to calculate the Table level duration:

timeBetween(
ifNull(
IncortaMetadata.LDR_JOB_DETAILS.LOAD_END,
ifNull(
IncortaMetadata.LDR_JOB_DETAILS.TRANSFORMATION_END,
ifNull(
IncortaMetadata.LDR_JOB_DETAILS.EXTRACT_END,
IncortaMetadata.LDR_JOB_DETAILS.DH_UPDATE_TIME
)
)
),
ifNull(
IncortaMetadata.LDR_JOB_DETAILS.EXTRACT_START,
ifNull(
IncortaMetadata.LDR_JOB_DETAILS.TRANSFORMATION_START,
ifNull(
IncortaMetadata.LDR_JOB_DETAILS.LOAD_START,
IncortaMetadata.LDR_JOB_DETAILS.DH_UPDATE_TIME
)
)
)
) / 60000

Conditional formatting is defined to highlight a duration that is more than one hour.
This can be adjusted based on the your requirement.

Create the Data Alert - Long Running Jobs

If the Incorta Metadata is not refreshed often enough, you may get false alarms in the alerts! You need to schedule the alert and schedule the schema refresh of Incorta Metadata according to the requirement on how timely the information needs to be.

Schedule the Alert to be executed periodically. The alert needs to be there to check the job load as frequently as the minimum SLA for acting on the issue.

On the one hand, scheduling the job to run every hour does not mean that you will get an email every hour. On the other hand, when a long running job is currently running, even though it is expected to run for a long time, it is still reasonable to send an alert every hour to show the status.

Best Practices Index
Best Practices

Just here to browse knowledge? This might help!

Contributors
Version history
Last update:
‎04-27-2022 05:54 PM
Updated by: