JeffW
Employee

Introduction

The Incorta backup and restore scripts are shell scripts that back up and restore multi-node installations of Incorta. They can back up and restore the following parts of an Incorta environment:

  • Cluster configuration files, which include:

    • CMC node configuration files which include: cmcData, conf, and incorta directories

    • Incorta nodes configuration files which include:

      • Services configuration files which include: incorta and conf directories

      • NodeAgent configuration files which include: nodeAgent.cfg file

      • node.properties file

    • Spark nodes configuration files which include: conf directory

    • Zookeeper nodes configuration files which include: conf directory

  • Tenant-related files, which include:

    • Tenant metadata (tenant export)

    • Data files

    • Parquet files

    • Snapshot files

    • Compacted parquet files

    • Schemas time log files

The scripts were tested on a multi-node environment running CentOS (version 7).

Configuration Files

You must supply a configuration file to each script (backup and restore). The file identifies the locations of the Incorta cluster nodes that the script will back up or restore and the cluster components to include in the operation. The following subsections explain the configuration file of each script in detail.

Backup Script Configuration File: backup.parameters

The 'backup.parameters' file contains the following parameters:

backup.parameters file (the full file is included in the attached backup_restore_files.zip)

Restore Configuration File: restore.parameters

You will notice that the restore configuration file contains fewer parameters. This is because all the yes/no parameters in the backup configuration file are converted to interactive command-line prompts that the user has to answer when running the restore script. 

The file 'restore.parameters' contains the following parameters that the user needs to specify:

restore.parameters file (the full file is included in the attached backup_restore_files.zip)

Scripts

Backup Script

The backup script can back up cluster configuration files, tenant-related files, or both, depending on the provided parameters file.

The backup script writes the backups it creates to a backup repository, which is a directory that consists of the following subdirectories:

  1. cluster-config-backups directory: stores backups of cluster-level configuration files (e.g. the cmcData, conf, and incorta directories from the CMC node).
  2. tenant-backups directory: stores backups of tenant-level files (e.g. parquet and compacted parquet files).
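
As a rough illustration, the repository layout above could be prepared as follows. This is a sketch, not code taken from backup.sh: the helper names init_backup_repo and new_backup_dir are hypothetical, and the timestamp format (YYYYMMDD-HHMMSS) is an assumption. Only the two subdirectory names and the <name>-backup-<timestamp> naming pattern come from this article.

```shell
#!/usr/bin/env bash
# Sketch only: init_backup_repo and new_backup_dir are hypothetical helpers,
# not functions taken from backup.sh.

# Create the two repository subdirectories if they do not exist yet.
# $1 = backup repository root
init_backup_repo() {
  local repo_dir="$1"
  mkdir -p "$repo_dir/cluster-config-backups" "$repo_dir/tenant-backups"
}

# Create a backup directory named <name>-backup-<timestamp> and print its path.
# $1 = repository root, $2 = "cluster" or "tenant", $3 = cluster or tenant name
# The timestamp format used here is an assumption.
new_backup_dir() {
  local repo_dir="$1" kind="$2" name="$3" parent stamp
  case "$kind" in
    cluster) parent="$repo_dir/cluster-config-backups" ;;
    tenant)  parent="$repo_dir/tenant-backups" ;;
    *) echo "unknown backup kind: $kind" >&2; return 1 ;;
  esac
  stamp=$(date +%Y%m%d-%H%M%S)
  mkdir -p "$parent/$name-backup-$stamp" || return 1
  echo "$parent/$name-backup-$stamp"
}
```

For example, new_backup_dir /backups tenant demo would create and print a directory under /backups/tenant-backups whose name starts with demo-backup-.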

After reading the parameters file, the script does the following:

  • Create a log file named after the pattern incorta-backup-<backup-start-timestamp>.
  • Validate the provided parameters. If any parameter is invalid, notify the user of the error and exit.
  • Apply the retention rule specified by the RETENTION_PERIOD parameter to both the cluster-config-backups and tenant-backups repositories. If any backups are older than RETENTION_PERIOD days, list them and ask the user whether to delete these backups and their log files or keep them.
  • Check whether the backup is feasible: calculate the average size of similar backups (older backups of the same cluster or tenant) and compare it with the free space remaining in the backup repository, as bounded by the BACKUPS_REPO_MAX_SIZE parameter. If the backup is not feasible, the script does not remove any old backup; it notifies the user, asks them to free some space and re-run the script, and exits.
  • Depending on the type of backup required (cluster configuration files or tenant-related files), create a backup directory following these patterns:
    • <cluster-name>-backup-<backup-start-timestamp> under cluster-config-backups
    • <tenant-name>-backup-<backup-start-timestamp> under tenant-backups
  • According to the user inputs read from the parameters file, it may back up:
    • Tenant metadata (tenant export), data, parquet, snapshot, compacted parquet, and schemas time log files
    • Cluster configuration files, that include:
      • CMC nodes configuration files which include: cmcData, conf, and incorta directories
      • Incorta nodes configuration files which include:
        • Services configuration files which include: incorta and conf directories
        • NodeAgent configuration files which include: nodeAgent.cfg file
        • node.properties file
      • Spark nodes configuration files which include: conf directory
      • Zookeeper nodes configuration files which include: conf directory
  • Back up the backup.parameters file as well, so it can be used for restoration instead of providing a separate restore.parameters file.
  • Notify the user specified in the parameters file by email, with the logs, upon success or failure of the backup operation.
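
The retention and feasibility checks described in the steps above could look roughly like the following. This sketch is not taken from backup.sh. The helper names are hypothetical, backup age is judged by directory modification time (the real script may parse the timestamp in the directory name instead), and BACKUPS_REPO_MAX_SIZE is assumed here to be expressed in kilobytes.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not taken from backup.sh.

# List backups older than the retention period.
# $1 = cluster-config-backups or tenant-backups directory
# $2 = RETENTION_PERIOD in days
# Age is taken from the directory modification time (an assumption).
# Note: find's "-mtime +N" matches entries at least N+1 full days old.
list_expired_backups() {
  local backups_dir="$1" retention_days="$2"
  find "$backups_dir" -mindepth 1 -maxdepth 1 -type d \
    -name '*-backup-*' -mtime +"$retention_days"
}

# Print the mean size in KB of earlier backups of one cluster or tenant
# (0 if there are none). $1 = backups directory, $2 = cluster or tenant name
average_backup_kb() {
  local backups_dir="$1" name="$2" total=0 count=0 size dir
  for dir in "$backups_dir"/"$name"-backup-*/; do
    [ -d "$dir" ] || continue
    size=$(du -sk "$dir" | cut -f1)
    total=$((total + size))
    count=$((count + 1))
  done
  if [ "$count" -eq 0 ]; then echo 0; else echo $((total / count)); fi
}

# Succeed if a backup of average size still fits under the size limit.
# $1 = repository root, $2 = backups directory, $3 = cluster or tenant name
# $4 = BACKUPS_REPO_MAX_SIZE, assumed here to be in KB
backup_is_feasible() {
  local repo_dir="$1" backups_dir="$2" name="$3" max_kb="$4" used_kb avg_kb
  used_kb=$(du -sk "$repo_dir" | cut -f1)
  avg_kb=$(average_backup_kb "$backups_dir" "$name")
  [ $((used_kb + avg_kb)) -le "$max_kb" ]
}
```

Listing expired backups instead of deleting them mirrors the behaviour described above, where the user decides whether old backups are removed.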

Restore Script

The restore script can be used to restore either cluster configuration files or tenant-related files or both depending on user inputs.

The script does the following:

  • Create a log file with a name following this pattern incorta-restore-<restore-start-timestamp>.
  • Ask the user which type of restoration is required: cluster configuration files or tenant-related files. Based on the answer, the script asks a set of questions to collect the parameters of the operation. The user can choose a specific backup in one of two ways:
    • By the date (YYYYMMDD) of the backup. If multiple backups exist for the same date, the script asks for the time (HHMM) of the required backup.
    • By explicitly providing the path of the backup directory, if it was generated by the backup script.
  • If cluster configuration files are to be restored, the parameters of the cluster nodes are required. The user can provide a restore.parameters file if restoring to a different cluster, or rely on the backed-up backup.parameters file if restoring to the same cluster.
  • According to the user inputs read from the parameters file it may restore:
    • Tenant metadata (tenant export), data, parquet, snapshot, compacted parquet, and schemas time log files
    • Cluster configuration files, that include:
      • CMC nodes configuration files which include: cmcData, conf, and incorta directories
      • Incorta nodes configuration files which include:
        • Services configuration files which include: incorta and conf directories
        • NodeAgent configuration files which include: nodeAgent.cfg file
        • node.properties file
      • Spark nodes configuration files which include: conf directory
      • Zookeeper nodes configuration files which include: conf directory
  • Notify the user through email upon success or failure of the restore operation with the logs.
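
The date-based selection of a backup described in the restore steps can be sketched as below. This is an illustration, not the logic of restore.sh: the helper names are hypothetical, and the directory timestamp is assumed to start with YYYYMMDD-HHMM. Where the real script prompts interactively for the time, this sketch returns status 2 so that a caller could prompt and retry.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, not taken from restore.sh.

# Print the backup directories of one cluster or tenant for a given day.
# $1 = backups directory, $2 = cluster or tenant name,
# $3 = date as YYYYMMDD, $4 = optional time as HHMM
# Assumes directory timestamps start with YYYYMMDD-HHMM.
find_backups_by_date() {
  local backups_dir="$1" name="$2" day="$3" time="${4:-}" dir
  for dir in "$backups_dir"/"$name"-backup-"$day"-"$time"*; do
    [ -d "$dir" ] && echo "$dir"
  done
  return 0
}

# Print the single matching backup directory.
# Returns 1 if nothing matches and 2 if the time is needed to disambiguate.
select_backup() {
  local backups_dir="$1" name="$2" day="$3" time="${4:-}" matches count
  matches=$(find_backups_by_date "$backups_dir" "$name" "$day" "$time")
  if [ -z "$matches" ]; then
    echo "no backup found for $name on $day" >&2
    return 1
  fi
  count=$(printf '%s\n' "$matches" | wc -l)
  if [ "$count" -gt 1 ]; then
    echo "several backups match; also provide the time (HHMM)" >&2
    return 2
  fi
  echo "$matches"
}
```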

Notes

  • If tenant-related files are included in the backup or restore operation, the scripts must be run from the CMC node so that they have access to the TMT command-line interface (tmt.sh). tmt.sh is used for the following tasks:
    • Read the tenant directory path on the shared file system
    • Export and Import the tenant metadata.
  • For password mode only, the expect command must be installed on the machine running the script. The expect command drives interactive commands on behalf of the user, so it can enter passwords when prompted.
  • The scripts can back up one cluster (cluster A) and then restore to another cluster (cluster B), but the user should make sure that cluster B has nodes with a structure similar to that of cluster A (see diagram below). In addition, some configuration files contain information (IP addresses and ports) that must be edited manually so that the restore does not break the cluster configuration. These files are:
    • CMC node configuration files which include:
      • All files under CMC_HOME/cmcData
      • CMC_HOME/conf/catalina.properties
    • Incorta nodes configuration files which include:
      • INCORTA_NODE_HOME/node.properties
      • INCORTA_NODE_HOME/services/SERVICE_GUID/conf/catalina.properties
      • INCORTA_NODE_HOME/services/SERVICE_GUID/incorta/service.properties
      • INCORTA_NODE_HOME/nodeAgent/nodeAgent.cfg
    • Spark nodes configuration files which include:
      • SPARK_NODE_HOME/conf/spark-env.sh
      • SPARK_NODE_HOME/conf/slaves (if it exists)
    • Zookeeper nodes configuration files which include:
      • ZOOKEEPER_NODE_HOME/conf/zoo.cfg
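
When restoring to a different cluster, one way to locate the lines that still need manual editing is to search the restored files for the addresses of the original cluster. The helper below is a hypothetical sketch and is not part of the provided scripts. It only reports matches and changes nothing, and it relies on the recursive search option available in GNU and BSD grep.

```shell
#!/usr/bin/env bash
# Sketch only: find_stale_addresses is a hypothetical helper.

# Print every line in the given files or directories that still mentions the
# old address, with file name and line number, for manual review.
# $1 = old IP address or host name; remaining arguments = paths to search
# -F treats the address as a literal string (dots are not wildcards) and
# -w keeps 10.0.0.1 from also matching 10.0.0.15.
find_stale_addresses() {
  local old_addr="$1"
  shift
  grep -rnwF -- "$old_addr" "$@" 2>/dev/null
}

# Example call (the variables stand for the placeholders used in the list above):
#   find_stale_addresses 10.0.0.1 "$CMC_HOME/cmcData" \
#     "$CMC_HOME/conf/catalina.properties" "$ZOOKEEPER_NODE_HOME/conf/zoo.cfg"
```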

Usage

  1. Download the scripts and parameters files to the CMC node (if a tenant-related files backup or restore operation is required) or to any node inside or outside the cluster (if only a cluster configuration files backup or restore operation is required). A Zip file named backup_restore_files.zip is attached to this document and contains all the files discussed here.
  2. Make the scripts executable:
    $ chmod +x backup.sh restore.sh
  3. Modify the parameters files to specify the cluster information and which components to back up.
  4. Run backup script:
    $ ./backup.sh
  5. Run restore script:
    $ ./restore.sh
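
Before running either script, it may help to confirm that the requirements mentioned in this article are met. The function below is a hypothetical pre-flight check and is not included in backup_restore_files.zip. The location of tmt.sh is passed in as an argument because this article does not state where it is installed.

```shell
#!/usr/bin/env bash
# Sketch only: preflight_check is a hypothetical helper.

# Return non-zero if a requirement is missing.
# $1 = directory holding backup.sh and restore.sh
# $2 = path to tmt.sh, or empty when no tenant-related files are involved
# $3 = "yes" when password mode is used (needs the expect command)
preflight_check() {
  local scripts_dir="$1" tmt_path="${2:-}" password_mode="${3:-no}"
  local status=0 script
  for script in backup.sh restore.sh; do
    if [ ! -x "$scripts_dir/$script" ]; then
      echo "$script is missing or not executable (run chmod +x)" >&2
      status=1
    fi
  done
  if [ -n "$tmt_path" ] && [ ! -f "$tmt_path" ]; then
    echo "tmt.sh not found at $tmt_path; run the scripts from the CMC node" >&2
    status=1
  fi
  if [ "$password_mode" = yes ] && ! command -v expect >/dev/null 2>&1; then
    echo "the expect command is required for password mode" >&2
    status=1
  fi
  return "$status"
}
```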
Version history
Last update:
‎03-15-2022 08:47 AM