on 03-27-2023 05:06 PM - edited on 06-16-2023 12:46 PM by Tristan
What is the MV assistant tool?
The MV Assistant helps you identify an appropriate combination of values for the Spark-related configurations of each materialized view (MV) in Incorta. A dedicated dashboard displays the recommended values and compares the performance of the current and recommended configurations. In addition, the metadata database saves these recommended values per MV. As a schema developer, you can manually apply the recommended values for each MV.
How it works
The MV Assistant consists of a Spark Listener, a Heuristics Recommender, and a Cleanup job.
- The Spark Listener collects the required metrics while the Spark application of an MV load job runs, simulates different values of the Spark configurations, and saves the output to a file in the following path: <TENANT_NAME>/mvlenslogs/pending/<SCHEMA_NAME>/<TABLE_NAME>/
- The Heuristics Recommender uses the output file created by the Spark Listener, applies several heuristics to find the best set of values, and saves them to the metadata database. After the output file is used, it is moved to the <TENANT_NAME>/mvlenslogs/archived/<SCHEMA_NAME>/<TABLE_NAME>/ directory.
- The Cleanup Job, when enabled, deletes the archived files that the Heuristics Recommender has already used, either after a specific number of days or after reaching the maximum number of archived files to keep.
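The Cleanup Job's two conditions (age in days, maximum number of archived files) can be illustrated with a small Python sketch. This is not the MV Assistant's actual code: the function name, the file names, and the exact rule for combining the two limits are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def select_files_to_delete(archived, now, max_age_days, max_files):
    """Return the names of archived files a retention policy would delete.

    archived: list of (file_name, archived_at) tuples (hypothetical input).
    A file is selected if it is older than max_age_days, or if it falls
    outside the max_files most recent files.
    """
    # Sort newest first, so the first max_files entries are the ones to keep.
    newest_first = sorted(archived, key=lambda item: item[1], reverse=True)
    cutoff = now - timedelta(days=max_age_days)

    to_delete = []
    for position, (name, archived_at) in enumerate(newest_first):
        too_old = archived_at < cutoff
        over_limit = position >= max_files
        if too_old or over_limit:
            to_delete.append(name)
    return to_delete


if __name__ == "__main__":
    # Hypothetical archived files for one MV.
    files = [
        ("run_a.json", datetime(2023, 6, 15)),
        ("run_b.json", datetime(2023, 6, 10)),
        ("run_c.json", datetime(2023, 5, 1)),
    ]
    print(select_files_to_delete(files, datetime(2023, 6, 16), 30, 2))
```

With a 30-day limit and a maximum of two files, only the oldest file in this example is selected for deletion.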
The following are the configurations that the MV Assistant analyzes and recommends new values for.
Configuration | Syntax | Description |
Executor Instances | spark.executor.instances | Determines the total number of executors to allocate for the application |
Executor Cores | spark.executor.cores | Determines the number of cores per executor |
Executor Memory | spark.executor.memory | Determines the amount of memory to be allocated to each executor |
Shuffle Partitions | spark.sql.shuffle.partitions | Determines how many partitions the data is partitioned into after shuffling |
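Because the recommended values are applied manually, it can help to see which of the four properties actually differ from what an MV currently uses. The following is a minimal Python sketch of that comparison. The function name and all values are hypothetical placeholders, not recommendations, and the sketch does not read from the metadata database or the dashboard.

```python
# The four Spark properties the MV Assistant analyzes (from the table above).
ANALYZED_KEYS = (
    "spark.executor.instances",
    "spark.executor.cores",
    "spark.executor.memory",
    "spark.sql.shuffle.partitions",
)


def changed_settings(current, recommended):
    """Return {key: (current_value, recommended_value)} for keys that differ."""
    changes = {}
    for key in ANALYZED_KEYS:
        if key in recommended and current.get(key) != recommended[key]:
            changes[key] = (current.get(key), recommended[key])
    return changes


if __name__ == "__main__":
    # Placeholder values for illustration only.
    current = {
        "spark.executor.instances": "2",
        "spark.executor.cores": "2",
        "spark.executor.memory": "4g",
        "spark.sql.shuffle.partitions": "200",
    }
    recommended = {
        "spark.executor.instances": "4",
        "spark.executor.cores": "2",
        "spark.executor.memory": "4g",
        "spark.sql.shuffle.partitions": "64",
    }
    for key, (old, new) in changed_settings(current, recommended).items():
        print(f"{key}: {old} -> {new}")
```

Only the properties whose values differ are reported, so unchanged settings do not need to be touched on the MV.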
How to set up the MV assistant tool
- In the CMC, at the tenant level, enable the MV Assistant toggle.
- After enabling the toggle, restart the Loader Service for the change to take effect.
- Import the MVlens schema (attached).
- Import the MVlens dashboard (attached).
- After a few runs of the schemas on the environment, load the MVlens schema to pull the JSON files and push them to the metadata database.
- Once the MVlens schema is loaded into the database, open the dashboard to compare the current values with the values recommended by the MV Assistant.
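Before loading the MVlens schema, you may want to check that the Spark Listener has written output files for your MVs. The Python sketch below counts the files under the pending path described earlier. It assumes the tenant directory is reachable as a local or mounted path, and the function name is hypothetical. Only the mvlenslogs/pending/<SCHEMA_NAME>/<TABLE_NAME>/ layout comes from this article.

```python
from collections import Counter
from pathlib import Path


def count_pending_files(tenant_dir):
    """Count pending MV Assistant output files per (schema, table).

    tenant_dir: path to the tenant directory (assumed to be locally accessible).
    """
    pending_root = Path(tenant_dir) / "mvlenslogs" / "pending"
    counts = Counter()
    if not pending_root.is_dir():
        return counts
    # Expected layout: pending/<SCHEMA_NAME>/<TABLE_NAME>/<output file>
    for path in pending_root.glob("*/*/*"):
        if path.is_file():
            schema_name, table_name = path.parts[-3], path.parts[-2]
            counts[(schema_name, table_name)] += 1
    return counts
```

An MV that is missing from the result has no pending output yet, for example because it has not been loaded since the toggle was enabled.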
Limitations
1- The MV Assistant works only on MVs that have at least one successful run.