Kafka Clients: Producer and Consumer

Apache Kafka has two essential client components for producing and consuming messages: the Kafka Producer and the Kafka Consumer. Let’s dive into each client and its responsibilities.

Kafka Producer

The Kafka Producer is responsible for publishing messages to one or more Kafka topics. Its primary duties include:

  1. Message Production: The producer creates messages and sends them to Kafka brokers. It can target a specific partition explicitly or let its partitioner assign one based on the message key.

  2. Serialization: Before sending messages, the producer serializes the message key and value into byte arrays using serializers specified in the producer configuration.

  3. Partitioning: The producer determines the partition to which a message should be sent. It can use a custom partitioner or rely on Kafka’s default partitioning logic.

  4. Compression: Producers can compress messages before sending them to reduce network bandwidth and storage space. Kafka supports several compression codecs, including Gzip, Snappy, LZ4, and Zstandard (zstd).
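
To make the serialization step concrete: the built-in StringSerializer essentially encodes a String as its UTF-8 bytes, and StringDeserializer reverses that. Here is a simplified plain-Java sketch of the idea (without Kafka's Serializer/Deserializer interfaces, so it runs with no broker or client library):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SerializationSketch {
    // Simplified stand-in for what StringSerializer does: encode as UTF-8 bytes.
    static byte[] serialize(String data) {
        return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
    }

    // The matching deserializer: decode UTF-8 bytes back into a String.
    static String deserialize(byte[] data) {
        return data == null ? null : new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize("value");
        System.out.println(Arrays.toString(bytes)); // the raw bytes sent on the wire
        System.out.println(deserialize(bytes));     // round-trips back to "value"
    }
}
```

Because brokers only ever see byte arrays, any format works on the wire; the producer and consumer just have to agree on matching serializer/deserializer pairs.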

Here’s an example of creating a Kafka producer using the Java client:

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

KafkaProducer<String, String> producer = new KafkaProducer<>(props);

// The key determines the target partition; records with the same key land on the same partition.
ProducerRecord<String, String> record = new ProducerRecord<>("my-topic", "key", "value");
// send() is asynchronous; close() flushes any buffered records before shutting down.
producer.send(record);

producer.close();
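
The default partitioning logic mentioned above can be sketched without a broker: Kafka's default partitioner hashes the serialized key (using murmur2) modulo the partition count, so the same key always maps to the same partition. The plain-Java sketch below uses String.hashCode() as a simplified stand-in for that hash, not Kafka's actual implementation:

```java
public class PartitioningSketch {
    // Simplified key-based partition assignment: hash the key, wrap into range.
    // Kafka's real default partitioner uses murmur2 on the serialized key bytes.
    static int partitionFor(String key, int numPartitions) {
        // Math.floorMod avoids a negative result when hashCode() is negative.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int p1 = partitionFor("user-42", 6);
        int p2 = partitionFor("user-42", 6);
        System.out.println(p1 == p2);          // true: same key, same partition
        System.out.println(p1 >= 0 && p1 < 6); // true: always a valid partition index
    }
}
```

This key-to-partition stickiness is what gives Kafka per-key ordering: all messages with the same key go to the same partition, and partitions are consumed in order.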

Kafka Consumer

The Kafka Consumer is responsible for subscribing to one or more topics and consuming messages from those topics. Its main responsibilities include:

  1. Topic Subscription: Consumers subscribe to one or more topics or partitions to receive messages from them. They can dynamically adjust their subscriptions during runtime.

  2. Message Consumption: Consumers poll for messages from the subscribed topics and process them. They can control the offset from which they want to start consuming messages.

  3. Deserialization: Consumers deserialize the received message key and value using deserializers specified in the consumer configuration.

  4. Offset Management: Consumers are responsible for keeping track of the offsets they have consumed. They can choose to commit offsets automatically or manually to mark their progress.

  5. Consumer Groups: Consumers can be part of a consumer group, where multiple consumers collaborate to consume messages from a topic. Each consumer in the group is assigned a subset of partitions to ensure fair distribution and parallel consumption.
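
The partition distribution in item 5 can also be sketched without a broker. The simplified round-robin assignment below captures the idea (Kafka's actual assignors, such as range or cooperative-sticky, are more involved): each partition is owned by exactly one consumer in the group, and partitions are spread as evenly as possible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupAssignmentSketch {
    // Round-robin sketch: partition p goes to consumer (p mod groupSize).
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new HashMap<>();
        for (String c : consumers) assignment.put(c, new ArrayList<>());
        for (int p = 0; p < numPartitions; p++) {
            assignment.get(consumers.get(p % consumers.size())).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 6 partitions across 2 consumers: each gets 3, and no partition is shared.
        System.out.println(assign(List.of("c1", "c2"), 6));
    }
}
```

Note the consequence: the partition count caps the parallelism of a group. With 6 partitions, a 7th consumer in the same group would sit idle.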

Here’s an example of creating a Kafka consumer using the Java client:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "my-group");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("my-topic"));

// Poll in a loop; with the default enable.auto.commit=true, offsets are
// committed automatically in the background.
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        System.out.printf("Received message: (key=%s, value=%s)%n", record.key(), record.value());
    }
}
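
To see why offset management matters, here is a broker-free sketch (plain Java, no Kafka dependency) of at-least-once delivery: if offsets are committed only after processing, a crash between processing and the commit means the restarted consumer re-reads that batch, so records may be processed more than once but are never silently lost.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetSketch {
    // Stand-ins for one partition's log and its committed offset.
    static final List<String> log = List.of("m0", "m1", "m2", "m3");
    static long committedOffset = 0;

    // Consume up to `max` records starting at the committed offset. If
    // `crashBeforeCommit` is true, simulate a failure after processing
    // the records but before committing the new offset.
    static List<String> poll(int max, boolean crashBeforeCommit) {
        List<String> processed = new ArrayList<>();
        long pos = committedOffset;
        for (int i = 0; i < max && pos < log.size(); i++, pos++) {
            processed.add(log.get((int) pos)); // "process" the record
        }
        if (!crashBeforeCommit) {
            committedOffset = pos;             // commit only after processing
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(poll(2, true));  // processes m0, m1, then "crashes" uncommitted
        System.out.println(poll(2, false)); // restart re-reads m0, m1: duplicates, not loss
    }
}
```

This is why consumers that need at-least-once guarantees typically disable auto-commit and commit manually after processing; downstream processing then has to tolerate duplicates or be idempotent.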

Interaction between Producer and Consumer

The interaction between producers and consumers in Kafka follows a publish-subscribe model:

  1. Producers write messages to Kafka topics without knowledge of the consumers.

  2. Consumers subscribe to topics and consume messages independently, at their own pace.

  3. Kafka acts as a highly scalable and fault-tolerant message broker, ensuring durability and allowing multiple consumers to read the same messages.

This decoupling of producers and consumers enables flexible and scalable architectures, where producers and consumers can be added or removed dynamically without affecting each other.