on 03-14-2022 03:17 PM - edited on 07-28-2022 03:05 PM by KailaT
Backing up your Incorta environment is one of the most important tasks to perform in order to ensure you have a reliable path for restoring your Incorta reporting environment should the need arise. This document describes the recommended strategies for backing up Incorta and will assist you in selecting the appropriate components and configuration for backup.
We recommend that you be familiar with these Incorta concepts before exploring this topic further.
Making yourself aware of the above topics will assist you in determining the best strategy for backing up your Incorta environment.
These concepts apply to all releases of Incorta. Note: this document will also make reference to backup capabilities that are available and will be performed as part of the Incorta Cloud hosted environment.
There are many different strategies for backing up the Incorta environment. Which combination of strategies you choose may, as alluded to above, depend in large part on what your IT department already does for Production application servers as part of its standard backup, recovery, and disaster recovery processes. For example, there may already be processes in place to back up the server where Incorta has been installed and/or the server that hosts your metadata database, which contains critical information about your Incorta application. Similarly, if shared or network storage has been provisioned where your Incorta data files (parquet, snapshots, etc.) reside, then there may already be enterprise software backing up these storage locations. Having said that, what are the core aspects of an Incorta Backup that need to be considered?
Incorta is an interactive, dynamic and 24/7 application that services the needs of your online users as they access the data for reporting. However, Incorta is also the data ingestion engine for loading and preparing the data for reporting, and it is also the development environment where changes can be made that alter the structures, views, rules, and dashboards that define what you are presenting to the end users for reporting. Because of this, there are a lot of moving parts that need to be taken into account as it relates to the integrity of a backup.
The most common method of backing up the Incorta application that you have built is to back up the Tenant. In some contexts, it may also be referred to as exporting the Tenant. Support may ask for this, for example, in conjunction with a support ticket that has been opened. Also, as a best practice, when performing changes / migrations from one environment to the next, it is always recommended that you take a backup of the target location Tenant prior to making the changes so that you may revert if necessary. The ad hoc export of a Tenant may be accomplished in one of two manners:
It is also recommended that regular tenant backups are scheduled. This should be done at least once per day, but in heavy development environments, and especially on Dev servers, it may be beneficial to take Tenant backups more frequently in order to keep snapshots of the Tenant application at various moments in time that can be reverted to if needed. The manner in which Tenant backups are scheduled and run differs depending on what version of Incorta you are running.
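Keeping frequent Tenant backups implies some retention policy so that the backup location does not grow without bound. The following is a minimal sketch of one such policy, not an Incorta feature: it assumes the scheduled exports land as `.zip` files in a single directory, and the function name `prune_backups` and its parameters are hypothetical.

```python
from pathlib import Path


def prune_backups(backup_dir: Path, keep: int, pattern: str = "*.zip") -> list[Path]:
    """Delete all but the `keep` newest files matching `pattern`.

    Returns the paths that were deleted. The directory layout and the
    "*.zip" pattern are assumptions made for illustration.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # Sort newest first, using each file's modification time.
    backups = sorted(
        (p for p in backup_dir.glob(pattern) if p.is_file()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    stale = backups[keep:]
    for path in stale:
        path.unlink()
    return stale
```

On a Dev server that takes several Tenant backups per day, a larger `keep` value preserves more points in time to revert to.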
The data that has been loaded for your reporting is stored in physical files. If you want to back up your data as it has been loaded into Incorta, you will need to construct a process for backing up the location where the data files reside in your installation:
Incorta does not contain an automated or scheduled method for backing up your data as it resides on the file system. Therefore, as mentioned at the beginning of this document, it is important to understand if your enterprise IT already performs regular backups of the servers and/or network storage devices used in your Incorta environment. In cloud environments such as AWS for example, there may already be processes that are taking Amazon Machine Images (AMI) to back up the machine, and if Amazon EBS volumes are in use, there may be processes in place that are already creating snapshots of those volumes. These concepts are similar across clouds (Google or Microsoft) and other virtual platforms. But even if your organization is using on-premises hardened servers, your IT group should be able to assist you in setting up an appropriate backup methodology for your Incorta servers.
Important Note: As mentioned above, your application is represented by the metadata that is exported as part of the Tenant Backup. That metadata maps directly to the structures written to your data files. Therefore, if you cannot time your tenant backup to coincide with your physical backups, restore from the tenant backup that is closest in time to your file backups. This minimizes the possibility that the structure of the files does not match the structure restored from the tenant. Also, while not strictly required, it is recommended that you stop the Loader service before taking the data backups to ensure that no file write processes are active while the backup is being taken.
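The "closest tenant backup" rule can be sketched in a few lines. This is only an illustration: it assumes you already know the timestamp of each tenant backup and of the file backup, and the function name and the backup file names are hypothetical.

```python
from datetime import datetime


def closest_tenant_backup(
    tenant_backups: dict[str, datetime], file_backup_time: datetime
) -> str:
    """Return the name of the tenant backup taken nearest to the file backup."""
    if not tenant_backups:
        raise ValueError("no tenant backups to choose from")
    # Compare by absolute time difference, so a backup taken shortly
    # before or shortly after the file backup is treated the same way.
    return min(
        tenant_backups,
        key=lambda name: abs(tenant_backups[name] - file_backup_time),
    )


# Hypothetical example: two tenant backups and a file backup taken at 02:30.
candidates = {
    "tenant_0100.zip": datetime(2022, 7, 28, 1, 0),
    "tenant_1300.zip": datetime(2022, 7, 28, 13, 0),
}
print(closest_tenant_backup(candidates, datetime(2022, 7, 28, 2, 30)))
# prints "tenant_0100.zip"
```

A nearest-in-time match only reduces the risk of a structural mismatch. If schema changes were made between the two backups, the restored tenant may still not match the files.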
Incorta relies heavily on a metadata database that stores all the data about the structure of your schemas, views, dashboards, security, schedules and all aspects relating to your Incorta reporting applications. In fact, creating a Tenant backup is largely an export of much of this metadata.
The first step in understanding the options for creating a backup of your metadata database is understanding which database vendor is being used and where the database resides. For single-server, on-premises installations, Incorta has the option of installing MySQL on the same server as Incorta and using that database to run the Incorta application. However, more often than not, the customer already has either MySQL or Oracle running within their enterprise and these database servers can be used to create the "incorta database schema". In either case, backing up this database is a matter of using the approved backup procedures / software described in the database vendor's documentation. There is no function within Incorta to request or automate a backup of the metadata database itself.
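For a MySQL-hosted metadata database, one common vendor tool is `mysqldump`. The sketch below shows how a scheduled job might call it. It is an assumption-laden example rather than an Incorta-provided procedure: the schema name, host, user, and output directory are all hypothetical, credentials are assumed to come from a MySQL client option file, and an Oracle-hosted schema would need Oracle's own tooling instead.

```python
import subprocess
from datetime import datetime
from pathlib import Path
from typing import Optional


def build_dump_command(schema: str, host: str, user: str, out_file: Path) -> list[str]:
    """Build a mysqldump command line for a single schema."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--single-transaction",  # consistent snapshot for transactional (InnoDB) tables
        "--routines",            # include stored procedures and functions
        f"--result-file={out_file}",
        schema,
    ]


def dump_metadata_schema(
    schema: str, host: str, user: str, backup_dir: Path, now: Optional[datetime] = None
) -> Path:
    """Run mysqldump and return the path of the timestamped .sql file."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    backup_dir.mkdir(parents=True, exist_ok=True)
    out_file = backup_dir / f"{schema}_{stamp}.sql"
    # check=True raises CalledProcessError if mysqldump exits with an error.
    subprocess.run(build_dump_command(schema, host, user, out_file), check=True)
    return out_file
```

Separating the command construction from its execution makes it possible to review or log the exact command before it is run against a Production database.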
Important Note: Fully restoring a Tenant from a Tenant backup has the effect of fully replacing all of the information about your application in the metadata database. Therefore it is not necessary to restore all 3 components together. However, if you have a true database outage, having a backup of the database itself or at least an export of the Incorta Schema will allow you to ensure all the tables and structures that Incorta requires are in place.
Backing up the Incorta Binaries located at the <incorta install path>/IncortaAnalytics directory largely follows the same analysis as is applied to backing up the Data. The process for backing up the Incorta binaries may automatically be covered by the same process that is already backing up the OS for the server / VM itself. Similarly, in cloud environments, this may already be accomplished via the application of volume image snapshots, etc. It should be recognized that in addition to the binaries that run the core CMC, Analytics and Loader services of Incorta, the base Incorta install also includes installs of other related software such as Zookeeper and Spark, which are located within the ../IncortaAnalytics/ structure. Further, there are some key configuration files for such things as Active Directory / LDAP synchronization, SSO Login, SSL, various logs, tomcat, Spark defaults, and other settings that have an impact on how your application operates. It is very important that your Install directory is also backed up to retain all the settings that are affecting the operation of your application.
This section covers a method for performing backups that is more appropriate for smaller Incorta implementations. This link and the associated downloadable scripts describe a process that performs a FULL backup of Incorta. It includes:
The reason why this method is more suitable for smaller and even more specifically single server environments is the weakness of the way it backs up the data itself. This script accomplishes the data backup by performing a Zip of the data directories, and then moving those Zip files to an archive location. The Zip process is not very fast, so this method of backup is not an efficient way to back up large volumes of large data files. Having said that, this process can also be used in conjunction with its associated control properties file to only perform the first 3 of the above bullet steps to accomplish a fairly comprehensive backup.
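To make the zip-then-move approach concrete, here is a minimal sketch of the technique. It is not the downloadable script itself: the function name, the directory arguments, and the file naming scheme are hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def zip_and_archive(
    data_dir: Path, staging_dir: Path, archive_dir: Path, now: Optional[datetime] = None
) -> Path:
    """Zip `data_dir` into `staging_dir`, then move the zip to `archive_dir`.

    `staging_dir` must not be inside `data_dir`, otherwise the archive
    would try to include itself.
    """
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    staging_dir.mkdir(parents=True, exist_ok=True)
    archive_dir.mkdir(parents=True, exist_ok=True)
    # make_archive appends ".zip" to the base name and returns the full path.
    base = staging_dir.resolve() / f"{data_dir.name}_{stamp}"
    zip_path = Path(shutil.make_archive(str(base), "zip", root_dir=str(data_dir)))
    # Move the finished zip to its archive location.
    return Path(shutil.move(str(zip_path), str(archive_dir / zip_path.name)))
```

Because every file is read and compressed in full on each run, the run time grows with the total data volume, which is why this approach suits smaller environments.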