on 05-25-2022 09:21 AM
This document discusses Incorta High Availability and Disaster Recovery architectures at a high level.
Audience: Infrastructure and operations teams who are responsible for installing and maintaining Incorta.
The following is a list of architectural principles to be used for the Incorta DR Architecture
A typical Incorta High Availability architecture consists of:
The High Availability architecture deals with individual node failures and does not take care of disasters where a whole site fails. The following figure illustrates the various components in detail within a High Availability architecture.
This sample HA architecture for the primary site consists of the following:
One half of the cluster consists of Incorta Node-1, Spark Node-1 and Zookeeper Node-1 and resides on Server 1. The other half of the cluster consists of Incorta Node-2, Spark Node-2 and Zookeeper Node-2 and resides on Server 2.
A Zookeeper ensemble stays available only while a majority of its nodes are running, so it requires at least 3 nodes to tolerate the loss of one. The third Zookeeper node can be placed on any small VM.
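The sizing rule above follows from majority-quorum arithmetic. The following is a minimal sketch of that arithmetic, not Incorta or Zookeeper code; the function names are illustrative assumptions.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with the given node count."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can be lost while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


# Compare a 2-node ensemble with 3-node and 5-node ensembles.
for nodes in (2, 3, 5):
    print(nodes, "nodes: quorum", quorum_size(nodes),
          "tolerates", tolerated_failures(nodes), "failure(s)")
```

With two nodes the quorum is also two, so a single failure stops the ensemble. A third node is the smallest addition that lets the ensemble survive one node loss.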
The Metadata database should also be highly available. It can be a MySQL or Oracle cluster for earlier versions of Incorta. For later versions, it should be a MySQL cluster.
In case of individual node failures on any of the servers, the backup nodes on the other server will still be available to keep Incorta functioning.
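One way to reason about this layout is to model it and check which host losses the cluster can absorb. The sketch below is only an illustration under stated assumptions: the host names are hypothetical, and "functioning" is assumed to mean at least one Incorta node, at least one Spark node, and a Zookeeper majority remain.

```python
from collections import Counter

# Hypothetical layout mirroring the sample HA architecture; names are illustrative.
TOPOLOGY = {
    "server-1": ["incorta", "spark", "zookeeper"],
    "server-2": ["incorta", "spark", "zookeeper"],
    "small-vm": ["zookeeper"],
}


def survives(topology: dict, failed_hosts: set) -> bool:
    """Return True if the assumed minimum set of services is still running."""
    total_zk = sum(services.count("zookeeper") for services in topology.values())
    alive = Counter()
    for host, services in topology.items():
        if host not in failed_hosts:
            alive.update(services)
    has_quorum = alive["zookeeper"] >= total_zk // 2 + 1
    return alive["incorta"] >= 1 and alive["spark"] >= 1 and has_quorum


print(survives(TOPOLOGY, {"server-1"}))              # one server lost
print(survives(TOPOLOGY, {"server-1", "server-2"}))  # both servers lost
```

Under these assumptions, losing either server alone leaves the cluster functioning, while losing both servers does not. The second case is the whole-site failure that Disaster Recovery addresses.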
There are various solutions to enable Disaster Recovery. The following architecture uses duplication of the primary site High Availability architecture to a Disaster Recovery site.
The DR architecture involves replicating two key components from the primary site to the DR site: the metadata database and the contents of shared storage.
The above diagram illustrates the replication of the metadata database and the contents of shared storage from the primary site to the disaster recovery site.
The Metadata database is a lightweight database and is used to hold dictionary information related to Incorta. For earlier versions of Incorta it can be MySQL or Oracle. For later releases of Incorta, please use a MySQL database.
Shared storage is used to store the actual user data extracted from source systems.
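Whichever replication mechanism is used for shared storage, it can be useful to confirm that the DR copy matches the primary. The following is a minimal sketch, assuming both copies are reachable as local or mounted directories. It is not an Incorta tool, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path relative to root to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            sha = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large data files are not loaded whole.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    sha.update(chunk)
            digests[path.relative_to(root).as_posix()] = sha.hexdigest()
    return digests


def replication_drift(primary: Path, replica: Path) -> dict[str, list[str]]:
    """List files that are missing, extra, or different on the replica."""
    src, dst = file_digests(primary), file_digests(replica)
    return {
        "missing_on_replica": sorted(src.keys() - dst.keys()),
        "extra_on_replica": sorted(dst.keys() - src.keys()),
        "content_differs": sorted(
            name for name in src.keys() & dst.keys() if src[name] != dst[name]
        ),
    }
```

Calling `replication_drift(Path(primary_dir), Path(dr_dir))` with the two mount points returns three lists. All three are empty when the copies match. Files still being written on the primary may show up as differences, so the result is best read as a point-in-time check.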
In case of total primary site failure, Incorta on the Disaster Recovery site should be started. Because both the actual data and the metadata are replicated from the primary site to the DR site, Incorta can be brought up there with the replicated content. If the replication process is near real time, there will be little to no data loss.
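The relationship between replication delay and data loss can be made concrete with simple arithmetic. The sketch below assumes the time of the last successful replication is known. The function name and the timestamps are illustrative only.

```python
from datetime import datetime, timedelta


def data_loss_window(last_replicated_at: datetime, failed_at: datetime) -> timedelta:
    """Upper bound on how much recent change could be missing at the DR site."""
    if last_replicated_at >= failed_at:
        return timedelta(0)
    return failed_at - last_replicated_at


# Illustrative timestamps: replication last completed 21 minutes before the failure.
window = data_loss_window(
    last_replicated_at=datetime(2022, 5, 25, 9, 0),
    failed_at=datetime(2022, 5, 25, 9, 21),
)
print("Changes from the last", window, "may be missing on the DR site")
```

The shorter the interval between replication runs, the smaller this window becomes, which is why near real-time replication leaves little to no data to recover.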