on 01-22-2023 06:52 AM - edited on 01-23-2023 02:47 AM by FadiB
Apache Kafka is a distributed publish-subscribe messaging system that can handle a high volume of data and enables you to pass messages from one endpoint to another. Kafka is suitable for both offline and online message consumption. Kafka messages are persisted on disk and replicated within the cluster to prevent data loss. Kafka is built on top of the ZooKeeper synchronization service. It integrates very well with Apache Storm and Apache Spark for real-time streaming.
Kafka has better throughput, built-in partitioning, replication, and inherent fault-tolerance, which makes it a good fit for large-scale message processing applications.
Kafka is very fast and, when configured appropriately, provides strong availability and durability guarantees.
Currently, the supported Kafka version is 2.10.
In the "Add New Data Source" window:
For an Incorta Cluster with two or more Incorta Nodes that each run a Loader Service, you must specify a Kafka Consumer Service Name in the Cluster Management Console.
A Cluster Management Console (CMC) Administrator for your Incorta Cluster must configure the Kafka Consumer Service Name. Changes to this property require that the CMC Administrator restart each Loader Service.
If your Incorta Cluster contains more than two Incorta Nodes each with a Loader Service, then you must specify the Incorta Node and Loader Service to use. If you do not assign a loader service to consume the Kafka messages, Incorta assigns a Loader Service randomly. This can result in unexpected behavior and row duplication.
Here are the steps to specify the required properties for the Server Configurations:
<NODE_NAME>.<SERVICE_NAME>
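For example, with a hypothetical Incorta Node named node01 running a Loader Service named loaderService, the Kafka Consumer Service Name would be:

```
node01.loaderService
```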
This will make the Loader Service consume messages instantly from the broker defined in the data source.
Once a message is produced on Kafka, you should find it in the loader tenant logs:
/<incorta installation path>/IncortaNode/services/<loader id>/logs/incorta/<tenant>
To check whether the message is parsed properly or rejected, you need to check this log:
/<incorta installation path>/IncortaNode/services/<loader id>/logs/kafka/<tenant>
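The two log locations above can be inspected from a shell. The installation path, loader service id, and tenant name below are hypothetical placeholders; substitute the values from your own installation.

```shell
# Hypothetical values -- replace with your actual installation path,
# loader service id, and tenant name.
INCORTA_HOME=/opt/incorta
LOADER_ID=loader-service-1
TENANT=demo

# Loader tenant logs: confirm the message reached the Loader Service.
ls -lt "$INCORTA_HOME/IncortaNode/services/$LOADER_ID/logs/incorta/$TENANT" | head

# Kafka logs: check whether the message was parsed properly or rejected.
ls -lt "$INCORTA_HOME/IncortaNode/services/$LOADER_ID/logs/kafka/$TENANT" | head
```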
If you need more detail in this logging, you need to adapt the following file (please create it if it does not exist). You can also add any additional needed properties in this file:
<Tenant directory>/<tenant>/KAFKA/kafka-consumer.properties
For example:
kafka.logger.logInfo=true
kafka.logger.logWarning=true
The CSV file that holds the consumed messages is stored at the following location:
<Tenant directory>/<tenant>/KAFKA/<datasource name>/<table name>
It should be populated if the message sent by the broker is consumed properly in the loader.
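One quick way to confirm successful consumption is to watch that directory being populated. The tenant directory, tenant, data source, and table names below are hypothetical placeholders; substitute your own.

```shell
# Hypothetical placeholders -- substitute your tenant directory, tenant,
# data source name, and table name.
TENANT_DIR=/opt/incorta/tenants
TENANT=demo
DATASOURCE=my_kafka_source
TABLE=my_table

# Newly written CSV files should appear here shortly after messages are consumed.
ls -lt "$TENANT_DIR/$TENANT/KAFKA/$DATASOURCE/$TABLE" | head

# Inspect the most recently written rows.
tail -n 5 "$TENANT_DIR/$TENANT/KAFKA/$DATASOURCE/$TABLE"/*.csv
```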
1- Access the support server via SSH.
Once logged in, go to docker image 1:
2- Access the container: access_docker support_c1
3- The Kafka installation is located at:
/home/incorta/kafka_10
4- You will need to start ZooKeeper and Kafka as below:
Start ZooKeeper: ./zookeeper-server-start.sh ../config/zookeeper.properties &
Start Kafka: ./kafka-server-start.sh ../config/server.properties &
33108 --> ZooKeeper port
bin/kafka-console-producer.sh --broker-list localhost:33107 --topic <topicname> < samplerecords.json
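To verify from the Kafka side that the produced messages actually landed on the topic, the standard console consumer can be pointed at the same broker. The port is taken from the producer command above; the topic name remains a placeholder to fill in.

```shell
# Read the topic from the beginning to confirm the test messages are present.
bin/kafka-console-consumer.sh --bootstrap-server localhost:33107 --topic <topicname> --from-beginning
```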