Tristan
Employee

Introduction

Over time, as you add new users and data sources, import additional data, and use Incorta data in different ways, the stresses on your Incorta instance architecture change as well. As with other enterprise applications, it is important to monitor the processes that Incorta requires so that your application administration team is aware of any issues that could adversely affect your users' experience. Appropriate monitoring provides insight into how your instance performs over time so that you can take corrective action to ensure a good experience or to reduce cost.

What you should know before reading this article

We recommend that you be familiar with these Incorta concepts before exploring this topic further.

  • Install Incorta
  • Hardware Sizing Guide

Applies to

These concepts apply to all releases of Incorta.  Note that for Incorta Cloud customers, Incorta will manage most of the monitoring for you.

Let's Go

Many different products can be used for monitoring applications, and in the future Incorta will offer an expanded self-monitoring capability from within the Cluster Management Console (CMC). Until the new monitoring features are available natively, choose the third-party tool that works best for you. Commonly used tools include CloudWatch (AWS), often combined with Datadog, and AppDynamics.

The table below lists the recommended areas to monitor for your Incorta implementation. We also recommend that you set up alerts on most of these measures so that your admin team is notified when a measure exceeds its threshold.

| To Monitor | What To Look For | Alert |
|---|---|---|
| CPU | If CPU usage stays above 90% for an extended period of time, you should be aware of it. Note that sometimes this is normal. | Send an alert if CPU usage is greater than 90% for more than 30 minutes. |
| I/O Wait | High I/O wait can slow Incorta down. | Send an alert if I/O wait is above 90% for more than 30 minutes. |
| RAM | If RAM usage is very high, it could cause a crash. Rising usage over time could indicate a memory leak. | Send an alert when RAM usage goes above 90%. |
| Loader Service | Check to make sure that the service is up and running. To identify the Incorta Loader Java process to monitor, run `<incorta home>/IncortaNode/listServices.sh`. The Service Location will contain a GUID that you can use to monitor the Loader process. | Send an alert if the service goes down. Note that starting with 4.9, the CMC sends an email to the designated Administrator email address if the Loader Service goes down, and up to three automatic restarts of the service are attempted. |
| Analytics Service | Check to make sure that the service is up and running. To identify the Incorta Analytics Java process to monitor, run `<incorta home>/IncortaNode/listServices.sh`. The Service Location will contain a GUID that you can use to monitor the Analytics process. | Send an alert if the service goes down. If you use a load balancer, it should be possible to configure a health check. Note that starting with 4.9, the CMC sends an email to the designated Administrator email address if the Analytics Service goes down, and up to three automatic restarts of the service are attempted. |
| Data Agent Service | If you are using a Data Agent to connect from Incorta to your data sources, check to make sure that the Data Agent service installed in your network (not in Incorta) is up and running. | Send an alert if the Data Agent service goes down. |
| CMC Service | Check to make sure that the service is up and running. | Send an alert if the service goes down. |
| Spark | Check to make sure that the service is up and running. | Send an alert if the service goes down. |
| Zookeeper | Check to make sure that the service is up and running. | Send an alert if the service goes down. |
| Active Memory (for Analytics and Loader services) | Check On Heap and Off Heap memory usage. You can check from the OS or from the CMC endpoint. | For more advanced monitoring of memory usage, consider using an Application Performance Monitoring tool such as AppDynamics. |
| Disk Usage | Check that the disk is not filling up. | Send alerts when the disk reaches 80% of capacity (Warning) and 90% of capacity (Action Required). |
| Network Traffic | Network traffic can be used as a diagnostic tool. If an issue occurs, you can look at network traffic history to help determine the root cause of the issue. | |
| Requests (count) | Track the number of requests coming into your instance. Request history can be used as a diagnostic tool. | |
| Servers | Check to see how long the server has been up so that you know if there has been a reboot. | Send an alert if a server goes down. |
| MySQL Connections (Metadata database) | By default, MySQL has a maximum of 300 connections. Incorta rarely uses more than 30 connections. | Send an alert if the number of connections rises above 125. |
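
Several of the alerts above fire only when a measure stays above a threshold for a period of time (for example, CPU above 90% for more than 30 minutes), while disk usage has two levels (80% Warning, 90% Action Required). Most monitoring tools provide these rules out of the box, so the Python sketch below is only an illustration of the logic. The function names and the sample format (minute, percent) are hypothetical and are not part of Incorta or of any monitoring product.

```python
import shutil
from typing import Iterable, Tuple


def sustained_breach(samples: Iterable[Tuple[float, float]],
                     threshold: float = 90.0,
                     min_minutes: float = 30.0) -> bool:
    """Return True if the value stays above `threshold` for more than `min_minutes`.

    `samples` holds (minute, percent) pairs in ascending time order.
    """
    breach_start = None  # minute at which the current run above the threshold began
    for minute, percent in samples:
        if percent > threshold:
            if breach_start is None:
                breach_start = minute
            if minute - breach_start > min_minutes:
                return True
        else:
            breach_start = None  # run interrupted, start over
    return False


def disk_alert_level(used_percent: float,
                     warning: float = 80.0,
                     action: float = 90.0) -> str:
    """Map disk usage to the two alert levels recommended for Disk Usage."""
    if used_percent >= action:
        return "ACTION_REQUIRED"
    if used_percent >= warning:
        return "WARNING"
    return "OK"


def disk_used_percent(path: str = "/") -> float:
    """Read the used percentage of the file system that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100.0
```

For example, disk_alert_level(disk_used_percent("/")) classifies the root file system, and sustained_breach can be fed CPU or I/O wait samples collected at whatever interval your tooling uses. Checking for a sustained run rather than a single reading matches the note that short CPU spikes are sometimes normal.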

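The Loader Service and Analytics Service entries suggest using the GUID from the Service Location reported by listServices.sh to identify the Java process to watch. As a rough sketch, assuming a Linux host where `ps -eo pid,args` is available and assuming that the GUID appears in the command line of the service's Java process (verify this on your own installation), a liveness check could look like the following. The function names are hypothetical.

```python
import subprocess
from typing import List, Optional


def find_service_pids(ps_output: str, guid: str) -> List[int]:
    """Return PIDs of Java processes whose command line contains `guid`.

    `ps_output` is the text produced by `ps -eo pid,args`.
    """
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header line and anything that is not "<pid> <command line>".
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, args = parts
        if guid in args and "java" in args:
            pids.append(int(pid))
    return pids


def service_is_up(guid: str, ps_output: Optional[str] = None) -> bool:
    """Return True if at least one matching Java process is running."""
    if ps_output is None:
        # Assumption: a procps-style `ps` is on the PATH.
        ps_output = subprocess.run(
            ["ps", "-eo", "pid,args"],
            capture_output=True, text=True, check=True,
        ).stdout
    return bool(find_service_pids(ps_output, guid))
```

A scheduled job could call service_is_up with the GUID of each service and raise an alert when it returns False. Passing ps_output explicitly makes the matching logic easy to test without a running service.
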
Reference Material

Best Practices Index
Last update: 05-26-2023 01:02 PM