inanzzz | Apache Kafka components and general tradeoffs

Apache Kafka is a distributed event store and streaming platform. In this post I'll do my best to explain Kafka, its components and some tradeoffs without getting into too much details.

Components

Zookeeper

Zookeeper is used for coordination and tracking the status of Kafka cluster nodes. It also keeps track of Kafka topics, partitions, offsets so on.

Cluster

This is not unique to Kafka hence it is rather a common terminology in the distributed computing system. It is just a group of servers that work towards a common purpose. Given Kafka is a distributed system, its cluster can have a group of servers which are called brokers. There are two types of clusters which are Single Broker Cluster and Multi-Broker Cluster.

Broker

A Kafka broker is a server that runs the Kafka software and stores data. In a Kafka cluster, there can be one or more brokers. It is responsibility for event exchange between a producer and a consumer. For Kafka producer, it acts as a event receiver and for Kafka consumer, it acts as a event sender.

Event

Events are a significant change in state. They are triggered when "something" happens in the system. When you read or write data to Kafka, you do this in the form of events. Conceptually, an event has a key, value, timestamp and optional metadata headers.

Topic

They represent "categories/groups" under which data or events are stored. A Kafka Topic is a unique name given to a data stream or event stream. For example, "alarms".

Producer

A Producer is an entity whose responsibility is to publish/send (write) events to Kafka. This is your program. The event is always sent to selected topic.

Consumer

A Consumer is an entity whose responsibility is to subscribe/receive (read) events from Kafka and process them as per their business logic. This is your program. A consumer can subscribe to one or more topics to receive the events.

Consumer Group

It represents group of consumers where multiple consumers are combined to share the workload between. It is possible that multiple consumer groups subscribing to the same or different topics. Two or more consumers in same consumer group do not receive the same message. Each consumer always receive a different message so no duplicate processing. This is because the offset moves to the next number as soon as the message is consumed by any of the consumers in that consumer group.

Partition

Each topic in Kafka can be divided into partitions. Short version, partitions as segments within a topic which enable Kafka to manage data more efficiently. When a producer writes a data to the broker using a topic, broker stores the data linked to that topic. In case of a massive set of data, broker would struggle to store it in a single machine. Given Kafka is distributes system, we can benefit from it by breaking the topic in partitions and distribute the partitions on a different machine to store data. The number of partitions for a topic is defined at topic creation.

Offset

In each partition of a Kafka topic, there is a sequence number (Offset) is assigned to each message. As soon as a new message arrives in a partition, a sequence number is assigned to the message. For any given topic, each partitions would have different offsets. This offset number is local to the topic partition so no concept of global applies here. Based on this, if you wish to locate/identify a message, you can construct name using "Topic Name - Partition No - Offset No".

Replica

In Kafka, replication means that data is written down not just to one broker, but many. The replication factor is a topic setting and is specified at topic creation. A replication factor of 1 means no replication.

Tradeoffs

This is just a very very short list but contains some "must-know" ones. You can use this visualisation tool if you wish.

Whether or not a message is consumed by a consumer, it can be deleted from its topic. This is true only if retention.ms and delete.retention.ms settings are configured at topic creation. Actual deletion kicks in slightly longer than the specified duration. e.g. (retention ms + ~5 minutes)

Avoiding to re-consume a message is done by using Consumer Group and committing read offset to its group. This is not possible without consumer groups hence same message will be re-consumed which might lead to unfortunate issues at the consumer end.

Independent consumers assigned to topics would re-consume same message if a consumer reads the message, restarted after and the message is still in the partition (hasn't expired yet due to retention policy). If you don't mind re-consuming same message or have a control mechanism in your code, it is still viable to use independent consumers rather than consumer groups.

It is possible to change the number of partitions for a topic after creation but can be disruptive.

Consumers can read from multiple partitions concurrently.

Kafka is not a message queue. If a message is marked/committed as "consumed" by a consumer group, all previous messages in the partition are also marked/committed. The last stream position (offset per partition) you commit is where consumer starts re-consuming. Based on this, if message 5 has failed at consumption and the consumer moved on to next messages in partition, message 5 will be lost so when the consumer is restarted, message 5 won't be there anymore because consumer continues from offset in that partition. This concept is applicable to consumer groups, not independent consumers.

If you want to send messages to a specific partition of a multi partition topic, make sure to key sarama.ProducerMessage.Key field. Which partition the message would go is decided based on the hashing (HashPartitioner) of the message key. Just bear in mind, if the key variation is low but the partition count is high, it is highly likely be possible that the most of the partitions won't get any messages. e.g. Given there are 6 key variations and 30 partitions, only 6 partitions would get messages which beats the purpose of partitions. When not sure, avoid setting sarama.ProducerMessage.Key to allow RandomPartitioner handle message allocation on partitions. This would at least evenly distribute messages and improve consumption performance.

All consumers within a consumer group would be assigned to an individual partition of a topic. It means a message in a partition will be consumed by its own consumer.

If a topic has N partition and there is M independent consumers, same message in each partitions will be consumed by all M consumers. e.g. There is a topic with 1 partition which contains 5 messages and 2 independent consumers, each consumers will consume 5 messages in isolation. This means each message is handled twice.
- 1 partition with 5 message, 1 consumer (X): X consumes 5 messages
- 1 partition with 5 message, 2 consumer (X, Y): X consumes 5 messages and Y consumes 5 messages
- 2 partitions with 5 messages shared between, 1 consumer (X): X consumes 5 messages from 2 partitions
- 2 partitions with 5 messages shared between, 2 consumer (X): X consumes 5 messages from 2 partitions and X consumes 5 messages from 2 partitions

Semantics

At-least-once

A message will definitely be written once but may possibly written more than once too.

At-most-once

A message will definitely be written once at max but may possibly not written at all too.

Exactly-once

A message will definitely be written once.

Suggestions

If the producer is transactional, make sure the consumer isolation level is set to consume only committed messages using isolation_level=1 - ref. If you don't do this, consumer will consume uncommitted and aborted messages.

Make sure that the message has been written to Kafka and acknowledged. For that you can use acks=all - ref.

Avoid producing duplicate messages by using idempotence. For that you can use enable.idempotence=true - ref.

In a single consumer group, do not run more consumers than the number of topic partitions. You can however run multiple consumer application instances whose count should be equal to the number of topic partitions.

Do not ignore errors in message consumption. In case of an error, forward message to a "dead letter" topic.

Do not make your producer or consumer own the responsibility of topic creation. For that you can use auto.create.topics.enable=false - ref. Handle the process manually by a dedicated service.

Some commands

// List all topics

$ kafka-topics.sh --bootstrap-server localhost:9092 --list
errors-topic
warnings-topic

// List details of a topic

$ kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic errors-topic
Topic: errors-topic	TopicId: EMvnZ6AXQ1O50G4yh4dLDA	PartitionCount: 3	ReplicationFactor: 1	Configs: retention.ms=3600000,delete.retention.ms=3600000
	Topic: errors-topic	Partition: 0	Leader: 1	Replicas: 1	Isr: 1
	Topic: errors-topic	Partition: 1	Leader: 1	Replicas: 1	Isr: 1
	Topic: errors-topic	Partition: 2	Leader: 1	Replicas: 1	Isr: 1

// List all consumer groups

$ kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
errors-topic-group
warnings-topic-group

// List details of consumer group

$ kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group errors-topic-group
GROUP              TOPIC           PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                         HOST            CLIENT-ID
errors-topic-group errors-topic    0          1               1               0               inanzzz-server-2e6992aa-107b-4eb4-82df-32478a7b9478 /172.31.0.1     inanzzz-server
errors-topic-group errors-topic    1          -               0               -               inanzzz-server-2e6992aa-107b-4eb4-82df-32478a7b9478 /172.31.0.1     inanzzz-server
errors-topic-group errors-topic    2          -               0               -               inanzzz-server-2e6992aa-107b-4eb4-82df-32478a7b9478 /172.31.0.1     inanzzz-server

// Purge messages in a topic by changing retention from 1 hour (3600000) to 1 second (1000). You should revert it afterwards!

$ kafka-configs.sh --bootstrap-server localhost:9092 --topic errors-topic --alter --add-config retention.ms=1000
Completed updating config for topic errors-topic.