Understanding Apache Kafka Architecture| JavaInUse



In the previous tutorial we saw what Apache Kafka is and why it is needed. In this tutorial we will look at the Apache Kafka architecture. In the next tutorial we will look at the internal working of Apache Kafka.

Apache Kafka - Table Of Contents

  • What is Apache Kafka
  • Understanding Apache Kafka Architecture
  • Internal Working Of Apache Kafka
  • Getting Started with Apache Kafka - Hello World Example
  • Spring Boot + Apache Kafka Example

Apache Kafka Topic

Apache Kafka is a messaging system in which producers send messages and one or more consumers consume them. Producers send messages to Apache Kafka topics, and consumers then read the messages from those topics. Each topic has a unique name, which producers and consumers use for sending and consuming messages.
Topics are the base abstraction for where data lives within Kafka. A topic can be thought of as similar to a table in a database. Each topic is backed by logs, which are partitioned and distributed.
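
To make the idea of a topic backed by partitioned logs more concrete, here is a minimal in-memory sketch in plain Java. It is a toy model, not the Kafka client API. The class ToyTopic, its methods, and the rule "partition = hash of the key modulo the partition count" are assumptions made for illustration only. Real Kafka uses its own partitioner and stores the logs on the brokers' disks.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of a topic (hypothetical class, NOT the Kafka client API).
class ToyTopic {
    private final String name;
    // One append-only log per partition.
    private final List<List<String>> partitions = new ArrayList<>();

    ToyTopic(String name, int partitionCount) {
        if (partitionCount < 1) {
            throw new IllegalArgumentException("a topic needs at least one partition");
        }
        this.name = name;
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    String name() {
        return name;
    }

    // Assumption for this sketch: the key's hash decides the partition,
    // so messages with the same key always land in the same log.
    int partitionFor(String key) {
        return Math.floorMod(key.hashCode(), partitions.size());
    }

    // Producer side: append to the end of the chosen log and return the offset.
    long send(String key, String value) {
        List<String> log = partitions.get(partitionFor(key));
        log.add(value);
        return log.size() - 1;
    }

    // Consumer side: read from an offset onward. Reading removes nothing,
    // so several consumers can read the same messages.
    List<String> read(int partition, int fromOffset) {
        List<String> log = partitions.get(partition);
        int start = Math.max(0, Math.min(fromOffset, log.size()));
        return new ArrayList<>(log.subList(start, log.size()));
    }

    public static void main(String[] args) {
        ToyTopic orders = new ToyTopic("orders", 3);
        long first = orders.send("customer-1", "order created");
        long second = orders.send("customer-1", "order paid");
        int partition = orders.partitionFor("customer-1");
        System.out.println(orders.name() + ", partition " + partition + ": " + orders.read(partition, 0));
        System.out.println("offsets: " + first + ", " + second);
    }
}
```

In this sketch both messages for customer-1 end up in the same partition log, in the order they were sent, and a consumer can re-read them from any offset.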

Apache Kafka Broker

Brokers are the physical or virtual servers where topics reside. More precisely, a Kafka broker is a software process that runs on such a machine. A broker has access to resources such as the file system where the logs are maintained. A single topic can have multiple partitions stored on different brokers.
We can add more brokers to scale the Kafka cluster.
[Figure: Apache Kafka broker cluster]
The advantages of using an Apache Kafka cluster are as follows -
  • Clustering - Apache Kafka has a clustered set of brokers that work in unison.
  • Distributed - The data in Apache Kafka is distributed across the brokers. This is done by making use of partitions, which are spread over multiple brokers.
  • Fault Tolerant - Apache Kafka maintains replicated copies of data, so if a broker in the cluster goes down, the cluster keeps working as long as another copy of the data is available. This is done by setting the replication factor to a value greater than 1.
  • Application scaling - Apache Kafka can be scaled horizontally by adding brokers. This increases the throughput.
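
The distribution and fault tolerance points above can be sketched with a small simulation. This is a toy model in plain Java, not how Kafka itself assigns replicas. The class ToyCluster and its simple round-robin placement rule are assumptions for illustration. The sketch only shows why a replication factor greater than 1 lets every partition survive the loss of one broker.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of partitions replicated across brokers (hypothetical, NOT Kafka code).
class ToyCluster {
    // replicas.get(p) lists the ids of the brokers holding a copy of partition p.
    private final List<List<Integer>> replicas = new ArrayList<>();
    private final Set<Integer> downBrokers = new HashSet<>();

    ToyCluster(int brokerCount, int partitionCount, int replicationFactor) {
        if (replicationFactor < 1 || replicationFactor > brokerCount) {
            throw new IllegalArgumentException("replication factor must be between 1 and the broker count");
        }
        // Assumption: simple round-robin placement, starting at broker (p mod brokerCount).
        for (int p = 0; p < partitionCount; p++) {
            List<Integer> holders = new ArrayList<>();
            for (int r = 0; r < replicationFactor; r++) {
                holders.add((p + r) % brokerCount);
            }
            replicas.add(holders);
        }
    }

    List<Integer> replicasOf(int partition) {
        return replicas.get(partition);
    }

    void stopBroker(int brokerId) {
        downBrokers.add(brokerId);
    }

    // A partition is available while at least one broker holding a copy is up.
    boolean isAvailable(int partition) {
        for (int brokerId : replicas.get(partition)) {
            if (!downBrokers.contains(brokerId)) {
                return true;
            }
        }
        return false;
    }

    boolean allPartitionsAvailable() {
        for (int p = 0; p < replicas.size(); p++) {
            if (!isAvailable(p)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ToyCluster replicated = new ToyCluster(3, 3, 2);
        replicated.stopBroker(0);
        System.out.println("replication factor 2: " + replicated.allPartitionsAvailable());

        ToyCluster unreplicated = new ToyCluster(3, 3, 1);
        unreplicated.stopBroker(0);
        System.out.println("replication factor 1: " + unreplicated.allPartitionsAvailable());
    }
}
```

With three brokers and a replication factor of 2, stopping broker 0 leaves every partition with a live copy. With a replication factor of 1, the partition stored only on broker 0 becomes unavailable.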

Apache Zookeeper

To manage the cluster we make use of Apache Zookeeper. Apache Zookeeper is a coordination service for distributed applications that enables synchronization across a cluster. Zookeeper can be viewed as a centralized repository where distributed applications can store data and read it back. It keeps the distributed system functioning together as a single unit by providing synchronization, serialization and coordination, including selecting a leader node.
[Figure: Apache Kafka broker cluster with Zookeeper]
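
Leader selection can be illustrated with a small sketch. A common Zookeeper recipe has each candidate register under an increasing sequence number, and the candidate with the lowest number acts as leader. When it leaves, the next lowest takes over. The code below is a toy in-memory imitation of that idea in plain Java. The class ToyElection and its methods are hypothetical, and it does not use the Zookeeper client or show how Kafka itself elects leaders.

```java
import java.util.Optional;
import java.util.TreeMap;

// Toy in-memory imitation of lowest-sequence-number leader election
// (hypothetical class, NOT the Zookeeper client API).
class ToyElection {
    // Sequence number -> node name, kept sorted by sequence number.
    private final TreeMap<Integer, String> candidates = new TreeMap<>();
    private int nextSequence = 0;

    // A node joins the election and receives the next sequence number.
    int join(String nodeName) {
        int sequence = nextSequence++;
        candidates.put(sequence, nodeName);
        return sequence;
    }

    // A node leaves, for example because it crashed or lost its session.
    void leave(int sequence) {
        candidates.remove(sequence);
    }

    // The candidate with the lowest sequence number is the leader.
    Optional<String> leader() {
        if (candidates.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(candidates.firstEntry().getValue());
    }

    public static void main(String[] args) {
        ToyElection election = new ToyElection();
        int first = election.join("broker-1");
        election.join("broker-2");
        election.join("broker-3");
        System.out.println("leader: " + election.leader().orElse("none"));

        election.leave(first);
        System.out.println("leader after broker-1 leaves: " + election.leader().orElse("none"));
    }
}
```

Here broker-1 joins first and becomes leader, and once it leaves, broker-2 takes over without the remaining nodes having to rejoin.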