Understanding Apache Kafka Topics, Partitions and Brokers
Here we will go through some of the theory concepts and terms of Apache Kafka. If you are new to Apache Kafka, please read the Apache Kafka introduction and installation on Windows post before you read this article. It is extremely important to understand Apache Kafka topics, partitions, brokers and offsets before we start actual CLI usage or application coding.
The topic is at the heart of everything in Kafka. A topic is a stream of data, a named location where data lives in Kafka. We can create many topics in Apache Kafka, and each topic is identified by a unique name.
Topics are split into partitions. Each partition is ordered, and each message within a partition gets an incremental, unique id called an offset.
An offset is specific to its partition: offset 0 of partition 0 is completely different from offset 0 of partition 1. The number of partitions is an important decision when we create a topic, and it is specified in the topic creation CLI command. Data is stored in a partition for a limited time, and it is immutable: once a message is written, it can't be changed.
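To make the relationship between topics, partitions and offsets concrete, here is a minimal in-memory sketch. It is only a toy model and not how Kafka is implemented; the class names Topic and Partition and the topic name "orders" are assumptions made up for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy append-only log: a message's offset is its position in the list."""
    messages: list = field(default_factory=list)

    def append(self, message: str) -> int:
        self.messages.append(message)
        return len(self.messages) - 1  # offset assigned to this message


@dataclass
class Topic:
    """Toy topic: a unique name plus a fixed number of partitions."""
    name: str
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        # The partition count is chosen when the topic is created.
        self.partitions = [Partition() for _ in range(self.num_partitions)]


orders = Topic("orders", num_partitions=2)  # hypothetical topic name
first = orders.partitions[0].append("order-a")
second = orders.partitions[0].append("order-b")
other = orders.partitions[1].append("order-c")
```

In this sketch, offsets grow only within a partition, so offset 0 exists in both partitions but points to a different message in each.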
A broker is a server (node) in Apache Kafka. An Apache Kafka cluster is composed of one or more brokers.
Each broker in the cluster is identified by a unique integer ID. Each broker holds certain partitions of a topic. When we specify the number of partitions at topic creation time, the partitions are spread across the brokers available in the cluster. For example, Topic 1 with 2 partitions uses Broker 1 and Broker 2 if the cluster contains 2 or more brokers. If there is only one broker, both partitions are stored on that broker. In general, when the broker count is less than the partition count, at least one broker holds multiple partitions of the same topic.
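The spreading of partitions over brokers can be sketched with a simple round-robin rule. This is a simplification for illustration only; the assignment logic inside Kafka is more involved, and the function name assign_partitions is an assumption, not a Kafka API.

```python
def assign_partitions(num_partitions, broker_ids):
    """Map each broker id to the partitions it hosts, using plain round-robin."""
    assignment = {broker_id: [] for broker_id in broker_ids}
    for partition in range(num_partitions):
        broker_id = broker_ids[partition % len(broker_ids)]
        assignment[broker_id].append(partition)
    return assignment


# 2 partitions, 2 brokers: one partition per broker.
two_brokers = assign_partitions(2, [1, 2])

# 2 partitions, 1 broker: both partitions land on the same broker.
one_broker = assign_partitions(2, [1])

# 3 partitions, 2 brokers: fewer brokers than partitions, so one broker holds two.
three_partitions = assign_partitions(3, [1, 2])
```

The three calls mirror the cases described above: enough brokers for every partition, a single broker, and fewer brokers than partitions.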
Apache Kafka is a distributed system, and replication (keeping copies of data) is important in distributed systems and the big data world. The topic replication factor indicates how many copies of each partition of the topic are maintained across the brokers. It is one of the parameters we specify in the topic creation CLI command.
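As a rough sketch of what a replication factor means, the snippet below places each copy of a partition on a different broker. The placement rule (next brokers in order) and the function name place_replicas are assumptions for illustration and not the strategy Kafka itself uses; the one constraint taken from Kafka is that the replication factor cannot be larger than the number of brokers.

```python
def place_replicas(num_partitions, replication_factor, broker_ids):
    """Return {partition: [broker ids holding a copy]} with distinct brokers per partition."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    placement = {}
    for partition in range(num_partitions):
        # Take the next `replication_factor` brokers, wrapping around the list.
        placement[partition] = [
            broker_ids[(partition + i) % len(broker_ids)]
            for i in range(replication_factor)
        ]
    return placement


# 2 partitions, replication factor 2, 3 brokers.
placement = place_replicas(2, 2, [1, 2, 3])
```

With a replication factor of 2, every partition ends up with two copies on two different brokers, so losing one broker still leaves a copy of each partition available.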
Producers & Consumers
Producers write data to topics, and consumers read data from topics.
Hopefully you are now clear about Kafka topics, partitions, brokers, producers and consumers. Next, we can start using Apache Kafka CLI commands to create and describe topics, to better understand these theory terms.
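The producer and consumer roles can be illustrated with a small in-memory sketch. It does not use a Kafka client library; the functions produce and consume, the log dictionary and the topic name "orders" are hypothetical names for this example only.

```python
from collections import defaultdict

# topic name -> partition number -> list of messages (list index = offset)
log = defaultdict(lambda: defaultdict(list))


def produce(topic, partition, message):
    """Append a message to a partition and return the offset it received."""
    log[topic][partition].append(message)
    return len(log[topic][partition]) - 1


def consume(topic, partition, from_offset=0):
    """Return (offset, message) pairs from a partition, starting at from_offset."""
    messages = log[topic][partition]
    return list(enumerate(messages))[from_offset:]


produce("orders", 0, "order-a")
produce("orders", 0, "order-b")
received = consume("orders", 0, from_offset=1)
```

Reading does not remove anything from the partition in this sketch: a consumer simply chooses the offset to start from, and the data stays in place.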