Apache Kafka – What Is It?

For the uninitiated, Apache Kafka is a distributed publish-subscribe messaging system. It was created at LinkedIn, open sourced in 2011, and graduated from the Apache Incubator in 2012. This post provides an overview of Kafka by introducing the ideas behind producers, topics, brokers and consumers.

Introduction to Kafka:

Kafka, written in Scala and Java, is a scalable, high-throughput, replicated, partitioned log system. It was created at LinkedIn primarily to handle high-volume, real-time data feeds such as user activity streams and operational metrics. Later on, it was open sourced so that other organizations could adopt it as well. As with other messaging systems, messages are written to and read from a server, but a Kafka cluster is designed to do this at very high throughput.

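Kafka is considered a “publish-subscribe distributed messaging system” rather than a plain “queue system” because a message received from a producer can be delivered to many consumers rather than to a single one: every consumer group subscribed to the topic gets its own copy of the message, while within a group each message is handled by one consumer.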

Architecture of Kafka:

Having seen the history of Kafka, let us move on to its architecture. These are the basic terms associated with the Kafka architecture: producer, broker, consumer and topic.

Producer:

Different producers, such as applications, relational databases and NoSQL stores, write data to the Kafka cluster. The Kafka cluster consists of many “brokers”; in layman's terms, each “broker” is a “server”. A message can carry a key, and all messages with the same key are written to the same partition (as long as the number of partitions does not change). By default the producer sends messages asynchronously: it keeps writing to the cluster without blocking on each acknowledgement, although the level of acknowledgement it requires is configurable. This asynchronous way of producing messages is a large part of what gives Kafka its speed, which is an absolute necessity with today’s live social media feeds.
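
To make the key-to-partition idea concrete, here is a minimal Python sketch. It assumes the partition is picked by hashing the key and taking the remainder by the number of partitions. The function name choose_partition and the use of CRC32 are illustrative assumptions, not Kafka's actual client code (real clients ship their own hash function).

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index (hypothetical helper)."""
    # Assumption: CRC32 stands in for the hash; Kafka clients use a different one.
    return zlib.crc32(key) % num_partitions


# Messages that share a key always map to the same partition.
for key in [b"user-1", b"user-2", b"user-1", b"user-3", b"user-1"]:
    print(key.decode(), "-> partition", choose_partition(key, 3))
```

Because the result depends only on the key and the partition count, every message keyed "user-1" lands in the same partition. It also shows why the mapping can change if partitions are added later.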

Topic:

Messages of a similar type are grouped into a ‘Topic’. A ‘Topic’ is similar to a ‘File’ structure, or more precisely an append-only log. Messages are published to a ‘Topic’, and each ‘Topic’ is split into one or more partitions.
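
A rough way to picture a topic is as a list of append-only logs, one per partition, where each message gets a sequential position (its offset) within its partition. The Topic class below is a hypothetical in-memory model written for this post, not part of any Kafka client library.

```python
class Topic:
    """Toy model of a topic: one append-only list per partition (assumption)."""

    def __init__(self, name: str, num_partitions: int):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition: int, message: str) -> int:
        """Append a message and return its offset within that partition."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1

    def read(self, partition: int, offset: int) -> list:
        """Return every message in the partition from the given offset onward."""
        return self.partitions[partition][offset:]


clicks = Topic("clicks", num_partitions=2)
print(clicks.append(0, "a"))  # 0
print(clicks.append(0, "b"))  # 1
print(clicks.append(1, "c"))  # 0 (offsets are per partition)
print(clicks.read(0, 1))      # ['b']
```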

Brokers:

The “broker” in Kafka plays the role of a traditional “broker”, an intermediary: it holds the messages written by the producer so that they can be read by the ‘consumer’.

There are many “brokers” or “servers” inside the Kafka cluster. Each “broker” hosts one or more partitions and, as already stated, each partition belongs to a ‘Topic’. The brokers receive the messages and store them for ‘n’ days, where ‘n’ is configurable. Once the ‘n’ days have passed, the messages are discarded. It is important to note that Kafka does not check whether each consumer or consumer group has read the messages before discarding them.
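
The retention rule can be sketched in a few lines of Python. This is a simplified model under stated assumptions: each log entry is a (timestamp, message) pair, and apply_retention is a hypothetical helper. In a real broker the window is set through configuration such as log.retention.hours, and old data is removed in larger chunks rather than message by message.

```python
DAY = 24 * 60 * 60  # seconds in a day


def apply_retention(log, now, retention_seconds):
    """Keep only entries still inside the retention window (hypothetical helper).

    Whether any consumer has read an entry plays no part in the decision.
    """
    cutoff = now - retention_seconds
    return [(ts, msg) for ts, msg in log if ts >= cutoff]


log = [(0 * DAY, "old"), (5 * DAY, "mid"), (9 * DAY, "new")]
kept = apply_retention(log, now=10 * DAY, retention_seconds=7 * DAY)
print([msg for _, msg in kept])  # ['mid', 'new']
```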

Consumer:

After the “producers” have sent messages to the Kafka brokers, the consumers read them. Each “consumer” or “consumer group” subscribes to one or more “topics” and reads from the “partitions” of those topics, keeping track of its own position (offset) in each partition. If one of the brokers goes down, the other brokers take over its work and keep the system running smoothly.
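
The sketch below illustrates how two consumer groups can read the same partition independently, each remembering its own offset. The ConsumerGroup class and its poll method are assumptions made for illustration; they only mimic the idea and are not the API of a real Kafka client.

```python
class ConsumerGroup:
    """Toy consumer group that tracks its own offset in one partition (assumption)."""

    def __init__(self, name: str):
        self.name = name
        self.offset = 0  # position of the next message to read

    def poll(self, partition_log: list) -> list:
        """Return messages not yet seen by this group and advance its offset."""
        messages = partition_log[self.offset:]
        self.offset = len(partition_log)
        return messages


partition_log = ["m0", "m1", "m2"]
analytics = ConsumerGroup("analytics")
billing = ConsumerGroup("billing")

print(analytics.poll(partition_log))  # ['m0', 'm1', 'm2']
partition_log.append("m3")
print(analytics.poll(partition_log))  # ['m3']
print(billing.poll(partition_log))    # ['m0', 'm1', 'm2', 'm3']
```

Reading does not remove anything from the log, which is why the billing group still sees every message even though the analytics group has already consumed them.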

ZooKeeper:

ZooKeeper’s primary responsibility is to coordinate the different components of the Kafka cluster, for example by keeping track of which brokers are alive and which broker leads each partition. The producer hands each message to the partition's “leader” broker, which writes the message to its own log, and the message is then replicated to other brokers.

LinkedIn, Yahoo, Twitter, Pinterest, Tumblr, Goldman Sachs and Netflix are just a few examples of organizations that have adopted Kafka into their production systems.

This post gave an overview of Kafka followed by its architecture. Kafka will no doubt be embraced by more organizations as time goes by.

For more information on Kafka visit: Kafka.apache.org

About Aditi Malhotra

Aditi Malhotra is the Content Marketing Manager at Whizlabs. With a Master's in Journalism and Mass Communication, she helps businesses stop playing around with Content Marketing and start seeing tangible ROI. A writer by day and a reader by night, she is a fine blend of both reality and fantasy. Apart from her professional commitments, she is also endeavoring to publish a book of her own very soon.
