on 03-08-2022 01:49 PM - edited on 02-16-2023 03:43 PM by Tristan
A number of useful scripts have been developed to help collect information for debugging issues or working with Incorta Support. This page contains those scripts, and more will be added as appropriate.
We recommend that you be familiar with these Incorta concepts before exploring this topic further.
These concepts apply to most releases of Incorta 4.x and later but are most useful for customers who have installed Incorta on-premises or in their own private cloud.
Some scripts will receive updates from time to time, so be sure to check back for the latest versions. If a particular script does not work in your environment, please contact Incorta Support to see if a different version is available.
To use performance.jsp you must have backend access to the analytics node server. Note that performance.jsp is only applicable to versions of Incorta prior to 5.0.
The data collection script is a bash script that can be used to collect information on your Incorta environment. Typically you would not need to run this script unless Support asks you to.
The package contains a parameters file that requires this data:
Note: The script and the parameters file must be in the same directory, because the script reads its inputs from this file.
The script will gather the following information:
To run the script after receiving it from Support:
- Download it to the server running Incorta
$ chmod +x Incorta_DataCollection.sh
$ ./Incorta_DataCollection.sh
In some environments, the Spark WebUI may not be accessible due to ports not being open or other security considerations. It is still possible to view the output files (stdout and stderr) if you have access to the Spark master machine.
Since the Python code with the Incorta Materialized View (MV) name is stored in the application's folder under the Spark work directory, we can identify the log for a specific MV:
cd <spark home>/work
ls -ld app-*-*/0/* |grep BOOK_BY
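The same lookup can be wrapped in a small helper. The following is a minimal sketch, assuming the usual Spark standalone layout (work/app-<timestamp>-<id>/<executor>/ holding stdout and stderr) and that the MV's Python file name contains the MV name. The function name find_mv_logs is hypothetical and not part of Incorta or Spark.

```shell
# find_mv_logs WORK_DIR MV_NAME  (hypothetical helper)
# Prints the expected stdout/stderr paths of executor 0 for each Spark
# application whose executor folder holds a file with MV_NAME in its name.
find_mv_logs() {
  work_dir=$1
  mv_name=$2
  for match in "$work_dir"/app-*-*/0/*"$mv_name"*; do
    [ -e "$match" ] || continue   # the glob matched nothing
    exec_dir=$(dirname "$match")
    printf '%s\n' "$exec_dir/stdout" "$exec_dir/stderr"
  done | sort -u                  # one entry per file, even with several matches
}

# Example: find_mv_logs "<spark home>/work" BOOK_BY
```

The helper only inspects executor 0, as the ls command above does. If your jobs run on several executors, the folder number in the glob would need to be widened.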
The main purpose of this tool is to identify regressions between Incorta builds. It can also be used to simulate concurrent user requests for stress testing Incorta dashboards.
You will need to have backend access to the analytics node server.
The tool has two output formats, an HTML report and a tab-separated .txt file, which are generated based on the mode.
Run Mode HTML
The HTML report lists every visited insight with its folder/dashboard hierarchy and loading time, and flags skipped and failed insights.
An insight is skipped if it is a hierarchy insight or, when the skip-not-owned-dashboards flag is on, if it belongs to a dashboard that is not owned by the running user. An example of the HTML output:
Run Mode Tab-separated .txt
A Tab-separated .txt file will be generated in the run folder to save the report for further analysis. The file will contain the following columns:
| Column | Description | Possible Values |
| --- | --- | --- |
| Time | Load Time in Milliseconds | 1 for skipped insights, -2 for failed insights |
| Run Name | The Run Identifier Name | |
| Run Timestamp | Timestamp when the run started | field, gauge, single row chart, series chart, pivot chart, flat table, aggregated table, detail table, pivot, unknown |
| Path | The Path to the entity | |
| Error | The error message if the insight failed | format, format2, sort, sort2, right_fail, fail, left_fail, mismatch (exact lines count), mismatch (different lines count) |
| Entity Type | Entity Type | Tenant, folder, dashboard, insight |
Compare Mode HTML
The HTML report for compare mode will list the count of mismatched insights with the insight type and mismatch category/sub-category. An example of the HTML output:
Compare Mode Tab-separated .txt
A Tab-separated compare.txt file will be generated in the compare folder to save the report for further analysis. The file will contain the following columns:
| Column | Description | Possible values |
| --- | --- | --- |
| User | Username for the user who rendered the insight | - |
| Tenant | Tenant name | - |
| Left | New run name | - |
| Right | Referenced (compared with) run name | - |
| Insight category | Category of the rendered insight | field, gauge, single row chart, series chart, pivot chart, flat table, aggregated table, detail table, pivot, unknown |
| Mismatch type | Identified mismatch type between the two runs | format, format2, sort, sort2, right_fail, fail, left_fail, mismatch (exact lines count), mismatch (different lines count) |
| Insight name | The insight name | If the insight is unnamed, then the guid is used |
| Path | Incorta relative insight path | - |
| File path (Left) | File system path to the new run file | - |
| File path (Right) | File system path to the old run file | - |
| Mismatch Category | Mismatch Sub Category | Description | Notes |
| --- | --- | --- | --- |
| 1- Exact Match | exact | exactly matching | |
| 2- Semantic Match | format | format mismatch | Some values have a different format |
| | format2 | format mismatch, ignoring precision | Some variation of the above around precision |
| | sort | sorting mismatch | The order does not match for reports where sorting was declared |
| | sort2 | sorting mismatch, ignoring precision | Differences in order with some variations around precision |
| | fail | both are failing | Report did not run - no comparison needed |
| | right_fail | old run failed | New run succeeded, but old run had failed |
| 4- Mismatch | left_fail | old run succeeded but the new run fails | |
| 5- Unknown | mismatch (exact lines count) | content mismatch, but the line count is exactly the same between both runs | |
| | mismatch (different lines count) | content mismatch, and the line count is different | |
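Once a compare.txt file exists, the mismatch types can be tallied from the command line before opening the file in a spreadsheet. This is a minimal sketch, assuming the file's columns appear in the order listed in the column table above, which would make Mismatch type the sixth column. The function name tally_mismatches is hypothetical, and the column position should be checked against a real file.

```shell
# tally_mismatches FILE [COLUMN]  (hypothetical helper)
# Counts rows per distinct value of one tab-separated column.
# COLUMN defaults to 6, assuming the column order listed above.
tally_mismatches() {
  awk -F '\t' -v col="${2:-6}" '
    { count[$col]++ }
    END { for (k in count) printf "%s\t%d\n", k, count[k] }
  ' "$1" | LC_ALL=C sort
}

# Example: tally_mismatches compare.txt
# Example: tally_mismatches compare.txt 5   # tally by insight category instead
```

If the file starts with a header row, the header text shows up as one extra entry with a count of 1.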
BASE_DIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
host='http://localhost:8080'
tenant='ebs_cloud'
usr='admin'
passEnc='vNurdjf9YdcpAFPBqSawXQ=='
path='/home/incorta/zakaria' #doesn't end with '/', can be empty
comparewithrunname='' #compare run name, leave it empty for run mode
compareonly='' #set it to 'true' to compare only
folderIds='' #comma separated ids '746,225,222,227', leave empty if not needed
dashboardIds='' #comma separated guids '7fa751fd-fa26-4036-9d85-cd718631bfd6,3e1e7e99-3c11-491c-8938-f5efb67515be', leave empty if not needed
runwithusers='' #comma separated usernames 'user1,user2', leave empty if not needed
workerCount=4 #number of threads
output='html' #can be html,csv,txt and data.
runname=`date "+%Y%m%d%H%M%S"` #run name. it is the current timestamp but can be anything
if [ "$output" != "data" ]; then
  curl -s -d "tenant=$tenant&login=$usr&passEnc=$passEnc&workspacePath=$path&currentRun=$runname&ref=$comparewithrunname&compareOnly=$compareonly&foldersIdsStr=$folderIds&guidsStr=$dashboardIds&userNamesStr=$runwithusers&workersCount=$workerCount&output=$output" "$host/incorta/performance.jsp" > "$runname.$output"
else #compare is not supported with 'data' output
  curl -s -d "tenant=$tenant&login=$usr&passEnc=$passEnc&workspacePath=$path&currentRun=$runname&foldersIdsStr=$folderIds&guidsStr=$dashboardIds&userNamesStr=$runwithusers&workersCount=$workerCount&output=data" "$host/incorta/performance.jsp" > "$runname.zip"
fi
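Values such as passEnc can contain characters (for example + or =) that are not safe to paste into a hand-built form body. The following is a minimal sketch of the same request using curl's standard --data-urlencode option, which encodes each value separately. The function name run_perf and the CURL override are hypothetical conveniences, the variable values are the placeholders from the script above, and whether performance.jsp treats the encoded fields identically should be verified in your environment.

```shell
# Placeholder values, mirroring the variables in the script above.
host='http://localhost:8080'
tenant='ebs_cloud'
usr='admin'
passEnc='vNurdjf9YdcpAFPBqSawXQ=='
path='/home/incorta/zakaria'
workerCount=4
output='html'

# run_perf RUN_NAME [REFERENCE_RUN]  (hypothetical helper)
# Posts one request to performance.jsp and writes the response to stdout.
# An empty REFERENCE_RUN means run mode; CURL may name a stub for dry runs.
run_perf() {
  "${CURL:-curl}" -s \
    --data-urlencode "tenant=$tenant" \
    --data-urlencode "login=$usr" \
    --data-urlencode "passEnc=$passEnc" \
    --data-urlencode "workspacePath=$path" \
    --data-urlencode "currentRun=$1" \
    --data-urlencode "ref=${2:-}" \
    --data-urlencode "compareOnly=${compareonly:-}" \
    --data-urlencode "foldersIdsStr=${folderIds:-}" \
    --data-urlencode "guidsStr=${dashboardIds:-}" \
    --data-urlencode "userNamesStr=${runwithusers:-}" \
    --data-urlencode "workersCount=$workerCount" \
    --data-urlencode "output=$output" \
    "$host/incorta/performance.jsp"
}

# Example: a baseline run, then a second run compared against it.
# run_perf baseline > baseline.html
# run_perf candidate baseline > candidate.html
```

Passing the reference run as an argument keeps the run and compare steps in one place, which suits the regression use case of running once per build and comparing against the previous run name.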
Incorta admins often need to migrate changes from a UAT or Dev instance to PROD, but they are not always sure which changes took place in either environment. Manual comparison is tedious and error-prone. Even raw XML comparison is error-prone, because the same entity (insight, schema, table, etc.) or attribute can appear in different positions in the two XML files.
Raw XML comparison also reveals changes that exist only because of Incorta schema upgrades and are not changes the user needs to be aware of.
The Tenant Comparison Tool is a standalone command-line (CLI) tool that helps users compare two tenants (for example, a UAT tenant and a PROD tenant) by comparing the dashboards, schemas, business schemas, and data sources between them.
The user can specify the output format of the tool, HTML or CSV. Using the CSV format, the user can upload the comparison output CSV files to Incorta, and use the pre-built schemas and dashboards shipped with the tool to analyze the differences between the tenants using Incorta dashboards.
‘Entity’ refers to one of the following: Schema, Business Schema, Dashboard, or Data Source.
The tool can produce comparison reports in two file formats: HTML and CSV.
| Parameter | Description |
| --- | --- |
| -b,--base-path | The path to the base tenant export (zip) file. |
| -n,--new-path | The path to the new version of the tenant export (zip) file. |
| -o,--output-path | The output directory where the comparison reports and the migrated entities are saved. The default output path is './comparison-output', which is next to the jar file. |
| -f,--output-format | The output comparison file format, either 'html' or 'csv'. The default format is HTML. |
| -s,--schema-list | A comma-separated list of schema/business-schema names to compare. If not specified, or if the value 'all' is passed, all base-new pairs of schemas are compared. |
| -d,--dashboard-list | A comma-separated list of dashboard names to compare. If not specified, or if the value 'all' is passed, all base-new pairs of dashboards are compared. |
| -c,--datasource-list | A comma-separated list of data source names to compare. If not specified, or if the value 'all' is passed, all base-new pairs of data sources are compared. |
| -m,--mode | Specify 'compare-only' to only compare the tenants. The default value is 'compare-only'. There is another mode called 'migrate' which will migrate (step-by-step) the mentioned entities from the base to the revised version (this is still under investigation). |
| -u,--create-subfolder | Whether to save the output in a sub-folder (named by the run timestamp) under the specified output path. |
This command compares the whole SAP tenant environments by providing the paths of their exports:
./compare-incorta.sh \
  -b tenant_sapeccdev_20200609.zip \
  -n tenant_sapeccdev_ent_20200609.zip \
  -o compare-tenant-output \
  -m compare-only \
  -f csv \
  -s all \
  -d all \
  -c all
Another example of comparing specific schemas (SAPECC_PP and MS_AR), specific dashboards (Unused Business Views and Sales Order Check) and all data sources would be:
./compare-incorta.sh \
  -b tenant_sapeccdev_20200609.zip \
  -n tenant_sapeccdev_ent_20200609.zip \
  -o compare-tenant-output \
  -m compare-only \
  -f csv \
  -s SAPECC_PP,MS_AR \
  -d "Unused Business Views,Sales Order Check" \
  -c all
The standard output summary for the above command:
The following screenshot shows the output index.html page generated, and each change has a hyperlink to the changed report. This is useful when dealing with an HTML output format:
The following screenshots show the output after uploading the CSV output into Incorta and previewing the Schema Summary dashboard:
Scanning Phase
The tool scans the input paths of the two tenant exports, then constructs the pairs of corresponding entities (schemas, dashboards, etc.) to be compared later. The user can name specific entities to limit the scope of the comparison or can include every existing entity (see the command-line parameters).
Comparison Phase
The pairs saved during the scanning phase are then processed one after the other, starting with schemas and business schemas, then data sources, and finally dashboards.
For each pair of corresponding entities:
It can be difficult to look for specific information in Incorta log files manually, and often the log files may be too large to be opened in a text editor. Incorta maintains several versions of log parsers that will parse out the logs from the Loader or Analytics Service into a CSV file that makes it much easier to read, search or even use as a data source for an Incorta dashboard. There are also versions that can parse the log files without having to extract the log files from a .zip file in case of very large log files. Please contact Incorta Support if you are in need of one and they can provide the appropriate version.
This Python script takes an unzipped tenant export file and produces a .csv file containing all the table and column details (data type, function, label, etc.) within a schema. Please click HERE for the latest version of the script.
Network or server issues could potentially cause nodes to go out of sync, a problem which may cause data inconsistency and information loss. The script below can be run periodically to review the successfully loaded jobs in the metadata and determine whether the sync message arrived for the node.
The jar file takes a single parameter: its absolute path, which is needed when the jar is run from a different directory. The shell script supplies this parameter automatically.
In order for the tool to function correctly you must provide a ‘logger.properties’ file which looks like this:
mail.smtp.auth=true
mail.smtp.starttls.enable=true
mail.smtp.host=smtp.gmail.com
mail.smtp.port=587
mail.smtp.ssl.trust=smtp.gmail.com
mail.session.mail.transport.protocol=smtp
from=system@incorta.com
to=[ADD EMAIL ADDRESSES]
password=[ADD PASSWORD]
env=[ENTER ENVIRONMENT NAME]
initialStart=72
log_location=[ADD PATH TO /IncortaNode/ FOLDER]
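Before the first run, it can help to confirm that none of the bracketed placeholders are left in the file. This is a minimal sketch, assuming the six keys that appear with placeholders or site-specific values in the sample above are the ones that must be filled in. The function name check_props is hypothetical and is not part of the tool.

```shell
# check_props FILE  (hypothetical helper)
# Lists the keys that are missing, empty, or still set to a [PLACEHOLDER].
# Returns non-zero if any key needs attention.
check_props() {
  rc=0
  for key in from to password env initialStart log_location; do
    # Take the text after the first '=' on the first line for this key.
    value=$(awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$1")
    case $value in
      ''|\[*\]) echo "needs a value: $key"; rc=1 ;;
    esac
  done
  return "$rc"
}

# Example: check_props logger.properties
```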