The main objective of disaster recovery is to ensure that customers can respond to a disaster and minimize its effect on business operations. This is done by making sure that data and applications are restored to their pre-disaster functional state with minimal downtime.
This article describes how to set up and switch to the Disaster Recovery (DR) system in case of primary site failures.
This article requires an understanding of the Incorta architecture and its various components. It is geared mostly towards Incorta administrators who are responsible for installation and administration. An understanding of replication techniques for shared storage and databases is also required.
For more details on installation and administration of Incorta, please review guides at https://docs.incorta.com.
This article applies to on-premises installations for all versions of Incorta. For Incorta Cloud customers, disaster recovery is handled automatically.
An Incorta Analytics deployment consists of the following major components:
- Incorta Cluster: It mainly consists of Loader and Analytic Services
- Shared Storage: It stores data ingested from various data sources
- Spark Cluster: Incorta integrates with Spark to perform complex transformations and run queries on Parquet data
- Zookeeper Ensemble: Zookeeper cluster to coordinate various tasks within the Incorta cluster
- Metadata database: Stores core metadata information
There are various solutions to enable Disaster Recovery. The following architecture uses duplication of the primary site architecture to a Disaster Recovery site.
The above diagram illustrates the replication of the metadata database and the contents of shared storage from the primary site to the disaster recovery site.
The Metadata database is a lightweight database and is used to hold dictionary information related to Incorta. It can be MySQL or Oracle. Shared storage is used to store the actual user data extracted from source systems.
In case of total primary site failure, start Incorta on the Disaster Recovery site. Because both the actual data and the metadata are replicated from the primary site to the DR site, Incorta can be brought up there. If the replication process is near real time, data loss is limited to whatever changes had not yet been replicated when the primary site failed.
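To make "near real time" concrete, the amount of recent data missing on the DR site equals the replication lag at the moment of failure. The following is a minimal sketch of that arithmetic. The function name and the timestamps are hypothetical, and how you obtain the last replicated timestamp depends on your storage and database replication tooling.

```python
from datetime import datetime, timedelta


def data_loss_window(failure_time: datetime, last_replicated: datetime) -> timedelta:
    """Return how much recent primary-site activity is missing on the DR site.

    Both arguments are hypothetical inputs: the moment the primary failed and
    the timestamp of the last change confirmed as replicated to DR.
    """
    if last_replicated > failure_time:
        # Replication cannot be ahead of the primary; treat DR as fully caught up.
        return timedelta(0)
    return failure_time - last_replicated


# Example: the primary fails at 10:00:00 and DR last received a change at 09:59:30.
window = data_loss_window(
    datetime(2024, 1, 1, 10, 0, 0),
    datetime(2024, 1, 1, 9, 59, 30),
)
print(window.total_seconds())  # 30.0
```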
Install the Incorta cluster on the DR site so that it is identical to the primary site. This means:
- Incorta version and the number of loader and analytics services are the same as the primary site
- Installation directory structure is the same
- Network CNAMEs are the same as the primary for each of the loader and analytics nodes, as this is needed for a seamless switch from primary to DR and vice versa
Follow the same process for the Spark and Zookeeper clusters as well.
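The parity requirements above can be checked mechanically once each site is described in a common form. The sketch below is one way to do that, not an Incorta feature: the `SiteSpec` class, its field names, the `/opt/incorta` path, and the host names are all hypothetical, and the values would come from your own inventory.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SiteSpec:
    """Hypothetical description of one site, filled in by hand or by inventory tooling."""
    incorta_version: str
    loader_services: int
    analytics_services: int
    install_dir: str
    cnames: list[str] = field(default_factory=list)


def parity_mismatches(primary: SiteSpec, dr: SiteSpec) -> list[str]:
    """Return the names of the attributes that differ between the two sites."""
    mismatches = []
    for name in ("incorta_version", "loader_services", "analytics_services", "install_dir"):
        if getattr(primary, name) != getattr(dr, name):
            mismatches.append(name)
    # CNAME order is irrelevant, so compare as sets.
    if set(primary.cnames) != set(dr.cnames):
        mismatches.append("cnames")
    return mismatches


# Placeholder values: "X.Y.Z" stands for whatever version you run.
names = ["loader1.example.com", "analytics1.example.com"]
primary = SiteSpec("X.Y.Z", 2, 2, "/opt/incorta", names)
dr = SiteSpec("X.Y.Z", 2, 1, "/opt/incorta", names)
print(parity_mismatches(primary, dr))  # ['analytics_services']
```

An empty list means the DR site matches the primary on every attribute listed. The same check can be repeated with separate specs for the Spark and Zookeeper clusters.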
Shut down the Incorta cluster and Zookeeper cluster on the disaster recovery site so that nothing is accidentally written to that environment, for example by someone logging in to it. Put the metadata database in read-only mode.
Set up replication for the following:
To keep the Incorta version on both the primary and disaster recovery sites the same, replicate the whole installation directory structure from the primary to the disaster recovery site for all the nodes including CMC.
Also keep the Spark cluster and Zookeeper ensemble in sync using the same directory structure and CNAMEs. The Spark cluster can be maintained separately without replication as it is an external piece and can be plugged in when the disaster recovery site needs to be activated. To keep downtime to a minimum, it is preferred to have them replicated as well.
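The article does not prescribe a tool for replicating the installation directories. As one illustration, assuming `rsync` over SSH is acceptable in your environment, the sketch below builds a command that mirrors a directory to the same path on a DR node. The function name, the `/opt/incorta` path, and the `dr-node1.example.com` host are hypothetical. The command is only printed, not executed.

```python
import shlex


def rsync_command(install_dir: str, dr_host: str) -> list[str]:
    """Build an rsync invocation that mirrors install_dir to the same path on dr_host."""
    # A trailing slash makes rsync copy the directory contents rather than
    # nesting the directory inside the destination.
    src = install_dir.rstrip("/") + "/"
    # Using the same path on DR keeps the directory structure identical.
    dest = f"{dr_host}:{src}"
    # -a is archive mode (recursive, keeps permissions, timestamps, symlinks);
    # --delete removes files on DR that no longer exist on the primary.
    return ["rsync", "-a", "--delete", src, dest]


cmd = rsync_command("/opt/incorta", "dr-node1.example.com")
print(shlex.join(cmd))
# rsync -a --delete /opt/incorta/ dr-node1.example.com:/opt/incorta/
```

Because `--delete` removes files on the destination, it is worth rehearsing with rsync's `--dry-run` option before scheduling anything like this.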
Use the appropriate technology provided by your shared storage vendor to replicate the whole tenant directory from the primary site to the disaster recovery site's shared storage.
Replicate the metadata database from the primary site to the disaster recovery site. Make sure that the metadata database on the disaster recovery site is in read-only mode and that only replication is allowed to write to it.
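If the metadata database is MySQL, read-only mode is controlled by the `read_only` and `super_read_only` system variables, which block client writes while still letting replication threads apply changes from the primary. The sketch below only generates the statements; the function name is hypothetical, and running them requires an account with sufficient privileges. Oracle uses different mechanisms that are not covered here.

```python
def mysql_read_only_statements(enable: bool) -> list[str]:
    """Return MySQL statements that switch server-wide read-only mode on or off."""
    if enable:
        # read_only blocks ordinary clients; super_read_only also blocks
        # privileged accounts. Replication threads are not affected.
        return [
            "SET GLOBAL read_only = ON;",
            "SET GLOBAL super_read_only = ON;",
        ]
    # Turn off the stricter setting first, then the general one.
    return [
        "SET GLOBAL super_read_only = OFF;",
        "SET GLOBAL read_only = OFF;",
    ]


for statement in mysql_read_only_statements(enable=True):
    print(statement)
```

`SET GLOBAL` changes do not survive a server restart, so the same settings would normally also be placed in the server configuration file.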
When the primary site goes down and you need to activate the disaster recovery site, take the following steps:
Reverse the replication so that it runs from the Disaster Recovery site to the primary site for:
- Incorta Installation directories
- Shared Storage
- Metadata database
Start Incorta on the Disaster Recovery site. If you are using a load balancer in front of the analytics services, make sure that its URL is active.
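The activation steps are order-sensitive, so it can help to encode them as a small runbook that stops at the first failure. The following is a sketch only: `run_failover` and the step actions are hypothetical placeholders, and each lambda would be replaced by a call into your own replication, startup, and load balancer tooling.

```python
from typing import Callable


def run_failover(steps: list[tuple[str, Callable[[], bool]]]) -> list[str]:
    """Run steps in order and stop at the first one that reports failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, action in steps:
        if not action():
            break
        completed.append(name)
    return completed


# Placeholder actions that always succeed; the order follows the steps above.
steps = [
    ("reverse replication: installation directories", lambda: True),
    ("reverse replication: shared storage", lambda: True),
    ("reverse replication: metadata database", lambda: True),
    ("start Incorta on the DR site", lambda: True),
    ("verify load balancer URL", lambda: True),
]
for name in run_failover(steps):
    print("done:", name)
```

Comparing the returned list against the full list of step names shows exactly where a failed activation stopped.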