Newbie’s intro to Kafka

Rohit Satwadhar
Dec 4, 2021


Recently, I started learning about Kafka, and I’ll share how I’ve come to understand it through some questions I encountered.

Kafka is defined as an “event streaming platform used to collect, store, and process real-time data streams.” To put this into context, imagine you have a website and want to track how often a specific link is clicked. Each click is an event containing data that can be used to tailor the website to user preferences. To manage this, you need a system that can capture, process, and store these events for future use. Kafka is well-suited for such tasks.
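To make the click example concrete, here is a minimal sketch of what one such event could look like as data. The field names (`link_id`, `user_id`, `timestamp_ms`) and the JSON encoding are assumptions chosen for illustration; Kafka itself does not prescribe an event schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ClickEvent:
    # All field names here are hypothetical, picked for this example only.
    link_id: str       # which link was clicked
    user_id: str       # who clicked it
    timestamp_ms: int  # when the click happened (epoch milliseconds)


def serialize(event: ClickEvent) -> bytes:
    """Encode the event as UTF-8 JSON bytes, ready to be sent as a message."""
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


def deserialize(payload: bytes) -> ClickEvent:
    """Rebuild the event from the bytes a consumer would receive."""
    return ClickEvent(**json.loads(payload.decode("utf-8")))


event = ClickEvent(link_id="pricing-page", user_id="u-123", timestamp_ms=1638576000000)
payload = serialize(event)
```

The point is only that an event is a small, self-describing record that can be turned into bytes on one side and read back on the other.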

In this scenario, the website acts as the producer. After generating the event, the website sends it to a Kafka broker. The broker, which is the Kafka server, manages multiple topics. Each topic contains messages of a particular type, which helps in organizing and processing them.
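The relationship between producer, broker, and topics can be sketched with a toy in-memory model. This is not the Kafka API: `ToyBroker`, its `send` method, and the topic names are all made up for illustration, and a real broker persists messages to disk and talks to clients over the network.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker: one list of messages per topic name."""

    def __init__(self):
        # Topic name -> messages in arrival order.
        self.topics = defaultdict(list)

    def send(self, topic: str, message: str) -> None:
        """Store the message under the named topic."""
        self.topics[topic].append(message)


# The website plays the role of the producer and sends each event
# to the topic that matches its type.
broker = ToyBroker()
broker.send("link-clicks", "user u-123 clicked pricing-page")
broker.send("page-views", "user u-123 viewed /home")
broker.send("link-clicks", "user u-456 clicked docs")
```

Because every message lands in the topic for its type, anything that only cares about link clicks can ignore page views entirely.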

Simple Kafka Architecture


Producers send events to topics, and the broker stores these messages in an append-only log. Each message is assigned an offset, a sequential number that is unique within its log (strictly, within a partition of the topic), much like an index in an array. Consumers use the offset to retrieve messages from a specific position.
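The array analogy can be taken quite literally. Below is a small sketch of a log that hands out offsets on append and lets a reader start from any position. `TopicLog`, `append`, and `read_from` are hypothetical names for this toy model, not part of any Kafka client.

```python
class TopicLog:
    """Toy append-only log: a message's offset is its index in the list."""

    def __init__(self):
        self._records = []

    def append(self, message: str) -> int:
        """Add a message at the end and return the offset it was given."""
        offset = len(self._records)
        self._records.append(message)
        return offset

    def read_from(self, offset: int, max_messages: int = 10):
        """Return up to max_messages (offset, message) pairs starting at offset."""
        chunk = self._records[offset:offset + max_messages]
        return [(offset + i, message) for i, message in enumerate(chunk)]


log = TopicLog()
offsets = [log.append(m) for m in ("click-a", "click-b", "click-c")]
tail = log.read_from(1)  # skip the first message, read the rest
```

Messages are never modified or reordered once written, which is why an offset is enough to identify a position in the log.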

Consumers are applications that retrieve messages from topics. Each consumer subscribes to a topic and pulls messages starting from the offset it has reached so far. Notably, consumers don’t know which producer created a message, which is beneficial for distributed systems because it keeps components loosely coupled. Once a consumer receives a message, it processes it according to its needs. For instance, it might use the data to train a machine learning model for generating recommendations.
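The pull model can be sketched in a few lines: each consumer remembers its own position and asks for whatever comes after it. `ToyConsumer` and `poll` are illustrative names under the assumption that the log is a plain Python list; a real Kafka consumer does this over the network and can store its position on the broker.

```python
class ToyConsumer:
    """Toy consumer that pulls from a shared list and tracks its own offset."""

    def __init__(self, log, start_offset=0):
        self.log = log
        self.position = start_offset  # offset of the next message to read

    def poll(self, max_messages=10):
        """Pull the next batch of messages and advance the position past them."""
        batch = self.log[self.position:self.position + max_messages]
        self.position += len(batch)
        return batch


log = ["click:home", "click:pricing", "click:docs"]

# Two independent consumers of the same log, each with its own position.
recommender = ToyConsumer(log)
dashboard = ToyConsumer(log, start_offset=2)

first = recommender.poll(max_messages=2)  # reads offsets 0 and 1
late = dashboard.poll()                   # reads offset 2 only

log.append("click:blog")                  # a producer writes a new message
rest = recommender.poll()                 # picks up where it left off
```

Neither consumer knows who wrote the messages, and reading does not remove them from the log, so one consumer's progress never affects the other.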

This overview covers the basic concepts of Kafka; more advanced topics such as partitions, ZooKeeper, and consumer groups are not discussed here. It should give you a solid starting point for understanding Kafka.


