8. Kafka Questions

#BACKEND #WEB

This section focuses on Kafka fundamentals, including topics, partitions, consumer groups, message delivery guarantees, event-driven architecture, serialization, failure handling, and how Kafka integrates with Spring Boot in distributed systems.

1. What is Kafka?

Apache Kafka is a distributed event streaming platform used for high-throughput, fault-tolerant, real-time data processing.

Kafka is commonly used for:

Event-driven systems
Messaging
Log aggregation
Real-time analytics
Microservice communication
Data streaming pipelines

Kafka was originally developed at LinkedIn and was later open-sourced as a project of the Apache Software Foundation.

Kafka stores streams of records called events or messages.

Example event:

{
  "orderId": 1001,
  "status": "CREATED"
}

Kafka is designed for:

Horizontal scalability
Distributed systems
High durability
Massive throughput

Unlike traditional queues, Kafka keeps messages on disk after they are consumed, for a configurable retention period, so consumers can replay events.

2. What is a topic?

A topic is a logical category or stream of messages in Kafka.

Example topics:

orders
payments
notifications
user-events

Producers send messages to topics.

Consumers read messages from topics.

Example:

Producer → orders topic → Consumer

Topics are central to Kafka architecture.

3. What is a partition?

A partition is a subdivision of a topic.

Kafka topics are split into partitions for:

Scalability
Parallel processing
Distributed storage

Example:

orders topic
 ├── Partition 0
 ├── Partition 1
 └── Partition 2

Each partition is an ordered sequence of messages.

Kafka guarantees ordering only inside a partition.

Partitions allow Kafka to scale horizontally across multiple brokers.

4. What is a consumer group?

A consumer group is a set of consumers working together to process messages from a topic.

Example:

Consumer Group A
 ├── Consumer 1
 ├── Consumer 2
 └── Consumer 3

Kafka distributes partitions among consumers in the same group.

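Important rule:

Within a group, each partition is assigned to exactly one consumer. A consumer can own several partitions, and consumers beyond the partition count stay idle.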

Benefits:

Parallel processing
Scalability
Load balancing
Fault tolerance

Different consumer groups can independently consume the same topic.
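
The assignment rule can be sketched in plain Java. This is a simplified round-robin illustration, not Kafka's actual assignor logic (real assignors such as range or sticky behave differently), and the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupAssignmentSketch {

    /** Deals partitions 0..partitionCount-1 to the consumers in turn. */
    static Map<String, List<Integer>> assign(int partitionCount, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment;
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            // Each partition gets exactly one owner within the group.
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }
}
```

With 3 partitions and 2 consumers, the first consumer owns partitions 0 and 2 and the second owns partition 1. With 2 partitions and 3 consumers, the third consumer receives nothing and stays idle.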

5. What is an offset?

An offset is the position of a message inside a partition.

Example:

Partition 0
Offset 0
Offset 1
Offset 2

Kafka uses offsets to track consumption progress.

Consumers commit offsets after processing messages. The committed offset marks the next message the group will read from that partition.

Offsets are extremely important for:

Retry handling
Recovery
Delivery guarantees
Replay functionality
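
A partition can be pictured as an append-only list in which the index is the offset. The following in-memory sketch is only a mental model, with hypothetical names; it ignores durability, retention, and replication:

```java
import java.util.ArrayList;
import java.util.List;

class PartitionLogSketch {
    private final List<String> records = new ArrayList<>();

    /** Appends a record and returns the offset it was stored at. */
    long append(String record) {
        records.add(record);
        return records.size() - 1;
    }

    /** Returns every record from the given offset onward; reading never removes records. */
    List<String> readFrom(long offset) {
        int start = (int) Math.max(0L, Math.min(offset, (long) records.size()));
        return new ArrayList<>(records.subList(start, records.size()));
    }
}
```

Because reading does not delete anything, calling readFrom(0) again replays the whole partition, and a consumer that restarts simply calls readFrom with its last committed offset.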

6. What is a broker?

A broker is a Kafka server.

Kafka clusters usually contain multiple brokers.

Example:

Broker 1
Broker 2
Broker 3

Brokers are responsible for:

Storing partitions
Serving producers
Serving consumers
Replication
Fault tolerance

Kafka distributes partitions across brokers.

7. What is a producer?

A producer is an application that sends messages to Kafka topics.

Example:

kafkaTemplate.send("orders", order);

Producer responsibilities:

Serialize messages
Choose partitions
Send events
Handle retries

Example use cases:

Order service publishing events
Payment system sending updates
Logging systems sending logs

8. What is a consumer?

A consumer reads messages from Kafka topics.

Example:

@KafkaListener(topics = "orders")
public void consume(OrderEvent event) {
    // Business logic for the received event goes here.
}

Consumers pull events at their own pace, decoupled from the producers that wrote them.

Responsibilities:

Read messages
Deserialize data
Process business logic
Commit offsets

9. Why is Kafka used?

Kafka is used because it handles massive real-time event streams efficiently.

Advantages:

Feature	Benefit
High throughput	Millions of messages
Scalability	Distributed partitions
Durability	Persistent storage
Fault tolerance	Replication
Replay support	Re-read old events
Decoupling	Independent services

Kafka is heavily used in:

Microservices
Financial systems
E-commerce
Real-time analytics
Streaming systems

10. Difference between Kafka and RabbitMQ?

This is a very common interview question.

Kafka

Designed for:

Event streaming
High throughput
Distributed logs
Replayable events

Messages are persisted for configurable retention periods.

RabbitMQ

Designed for:

Traditional message queues
Complex routing
Short-lived messages

Messages are usually removed after consumption.

Main differences:

Kafka	RabbitMQ
Distributed log	Traditional message broker
Pull-based	Push-based
Very high throughput	Lower throughput
Replay support	Limited replay
Persistent event stream	Queue processing
Better for streaming	Better for task queues

11. What is event-driven architecture?

Event-driven architecture is a system design where services communicate through events.

Example:

Order Created Event
→ Payment Service
→ Inventory Service
→ Notification Service

Instead of direct synchronous calls:

Service A → Service B

services publish events asynchronously.

Benefits:

Loose coupling
Scalability
Better resilience
Independent services

Kafka is one of the most popular platforms for event-driven architecture.
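
The publish/subscribe idea can be shown with a tiny in-process event bus. This is a minimal sketch with hypothetical names; unlike Kafka it is synchronous, keeps nothing on disk, and runs in a single JVM:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class EventBusSketch {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    /** Registers a handler for one event type. */
    void subscribe(String eventType, Consumer<String> handler) {
        subscribers.computeIfAbsent(eventType, key -> new ArrayList<>()).add(handler);
    }

    /** Delivers the payload to every handler registered for the event type. */
    void publish(String eventType, String payload) {
        for (Consumer<String> handler : subscribers.getOrDefault(eventType, List.of())) {
            handler.accept(payload);
        }
    }
}
```

The publisher only knows the event type, not who listens. Adding a fourth service to the order flow means adding one more subscribe call, with no change to the code that publishes the event.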

12. What is at-least-once delivery?

At-least-once delivery guarantees that every message is delivered to the consumer at least once, so no message is lost.

However:

Duplicates may occur

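If a producer does not receive an acknowledgment, it retries the send. If a consumer fails before committing its offset, it reads the message again after restart.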

This is the most common Kafka delivery mode.

13. What is at-most-once delivery?

At-most-once delivery guarantees no duplicate processing.

However:

Messages may be lost

If a consumer commits the offset before processing succeeds and then fails, the message is skipped after restart and the data is lost.
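
The difference between at-least-once and at-most-once comes down to whether the offset is committed after or before processing. The simulation below is a sketch with hypothetical names: it models a single consumer that crashes once and then restarts from its committed offset, and it does not use the Kafka client:

```java
import java.util.ArrayList;
import java.util.List;

class DeliverySemanticsSketch {

    /**
     * Simulates one consumer that crashes once while handling the record at
     * crashOffset and then restarts from the committed offset.
     * Returns the records whose processing completed, in order.
     */
    static List<String> run(List<String> log, int crashOffset, boolean commitBeforeProcessing) {
        List<String> processed = new ArrayList<>();
        int committed = 0;       // next offset to read after a restart
        boolean crashed = false;
        int position = 0;
        while (position < log.size()) {
            String record = log.get(position);
            if (commitBeforeProcessing) {
                committed = position + 1;       // at-most-once: commit first
                if (!crashed && position == crashOffset) {
                    crashed = true;             // crash before processing finishes
                    position = committed;       // restart: the record is skipped
                    continue;
                }
                processed.add(record);
            } else {
                processed.add(record);          // at-least-once: process first
                if (!crashed && position == crashOffset) {
                    crashed = true;             // crash before the commit
                    position = committed;       // restart: the record is read again
                    continue;
                }
                committed = position + 1;
            }
            position++;
        }
        return processed;
    }
}
```

For a log of a, b, c with a crash at offset 1, committing after processing yields a, b, b, c (a duplicate), while committing before processing yields a, c (a lost message).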

14. What is exactly-once delivery?

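Exactly-once delivery guarantees that the effect of each message is applied only once, even if the message is delivered or retried more than once.

Kafka supports exactly-once semantics using:

Idempotent producers
Transactions
Consumer offsets committed as part of the producer transaction

Exactly-once adds complexity and some latency, and it covers only data that stays inside Kafka. Side effects in external systems still need idempotent handling.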
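
The client settings usually involved can be sketched as plain properties. The keys below are standard Kafka client configuration names; the class and method names are hypothetical, and a real client also needs bootstrap.servers and serializer settings that are omitted here:

```java
import java.util.Properties;

class ExactlyOnceConfigSketch {

    /** Producer settings commonly used for idempotent, transactional writes. */
    static Properties producerProps(String transactionalId) {
        Properties props = new Properties();
        props.setProperty("enable.idempotence", "true");       // broker de-duplicates producer retries
        props.setProperty("acks", "all");                       // required when idempotence is enabled
        props.setProperty("transactional.id", transactionalId); // enables transactions for this producer
        return props;
    }

    /** Consumer settings for reading only committed transactional data. */
    static Properties consumerProps(String groupId) {
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        props.setProperty("isolation.level", "read_committed"); // hide records from aborted transactions
        props.setProperty("enable.auto.commit", "false");       // offsets are committed through the transaction
        return props;
    }
}
```

These properties would be passed to a KafkaProducer and a KafkaConsumer respectively. The transactional ID should be stable across restarts of the same producer instance.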

15. What is an idempotent consumer?

An idempotent consumer can safely process the same message multiple times without producing incorrect results.

Example:

Processing the same payment event twice should not charge the customer twice.

Idempotency is critical because duplicates can occur in distributed systems.
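
One common way to get idempotency is to remember the IDs of events that were already handled. The sketch below keeps them in memory and uses hypothetical names. In a real service the processed-ID record and the business update would normally be written in the same database transaction:

```java
import java.util.HashSet;
import java.util.Set;

class IdempotentPaymentHandlerSketch {
    private final Set<String> processedEventIds = new HashSet<>();
    private long totalChargedCents = 0;

    /** Charges once per event ID; returns false when the event was already handled. */
    boolean handle(String eventId, long amountCents) {
        if (!processedEventIds.add(eventId)) {
            return false; // duplicate delivery: skip the side effect
        }
        totalChargedCents += amountCents;
        return true;
    }

    long totalChargedCents() {
        return totalChargedCents;
    }
}
```

Handling the same event ID twice charges only once: the second call returns false and the total stays unchanged.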

16. Why should consumers be idempotent?

Because duplicate delivery is possible.

Example failure scenario:

Consumer processes message
Database update succeeds
Consumer crashes before offset commit
Kafka redelivers message

Without idempotency, duplicate business operations may occur.

Examples:

Double payments
Duplicate emails
Incorrect inventory updates

17. What happens if a consumer fails after processing but before committing the offset?

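Kafka does not know whether the message was processed; it only knows the last committed offset. After a restart or rebalance, consumption resumes from that offset.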

Result:

Message is reprocessed

This is why duplicates may happen in at-least-once delivery systems.

Consumers should therefore be idempotent.

18. What is a dead letter topic?

A dead letter topic (DLT) stores messages that repeatedly fail processing.

Example:

orders-dlt

Used for:

Failed messages
Debugging
Manual investigation
Preventing endless retries

Dead letter topics are very important in production systems.

19. What is a retry topic?

Retry topics temporarily store failed messages before retrying later.

Example flow:

Main Topic
→ Retry Topic
→ Dead Letter Topic

Useful for transient failures:

Temporary database outage
Network issues
External API failure

20. How do you handle poison messages?

Poison messages are messages that always fail processing.

Handling strategies:

Strategy	Explanation
Dead Letter Topic	Move failed messages
Retry limit	Prevent infinite retries
Validation	Reject invalid data
Monitoring	Alert failures

Without proper handling, a poison message can block its partition indefinitely, because the consumer keeps retrying the same offset.
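
The retry limit and dead letter strategies combine into a simple routing decision. This is a minimal sketch: the class and method names are hypothetical and the "-retry" suffix is an assumption, while "-dlt" follows the orders-dlt naming used earlier:

```java
class RetryRoutingSketch {

    /**
     * Decides where a failed record goes next.
     * failedAttempts counts failures so far, including the one just observed.
     */
    static String nextTopic(String mainTopic, int failedAttempts, int maxAttempts) {
        if (failedAttempts < maxAttempts) {
            return mainTopic + "-retry"; // transient failure suspected: try again later
        }
        return mainTopic + "-dlt";       // retry limit reached: park the record for inspection
    }
}
```

With a limit of 3 attempts, a record from orders goes to orders-retry after its first and second failure and to orders-dlt after its third, so the main consumer is never blocked by it.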

21. What is consumer lag?

Consumer lag is the difference, per partition, between the latest offset in the log and the consumer group's committed offset:

Lag = Latest offset - Consumer offset

High lag means consumers cannot keep up with message production.

Lag monitoring is critical in production Kafka systems.
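
The lag calculation can be sketched as follows. The offset maps are supplied by the caller here; in practice they would come from an admin client or a metrics exporter. All names are hypothetical:

```java
import java.util.Map;

class ConsumerLagSketch {

    /** Lag for one partition: records written but not yet committed as consumed. */
    static long partitionLag(long logEndOffset, long committedOffset) {
        return Math.max(0L, logEndOffset - committedOffset);
    }

    /** Sums per-partition lag; partitions with no committed offset count from 0. */
    static long totalLag(Map<Integer, Long> logEndOffsets, Map<Integer, Long> committedOffsets) {
        long total = 0;
        for (Map.Entry<Integer, Long> entry : logEndOffsets.entrySet()) {
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            total += partitionLag(entry.getValue(), committed);
        }
        return total;
    }
}
```

If partition 0 ends at offset 100 with 90 committed, and partition 1 ends at 50 with 50 committed, the total lag is 10.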

22. How do you monitor Kafka?

Common monitoring metrics:

Consumer lag
Broker health
Throughput
Partition distribution
Error rates
Disk usage

Common tools:

Prometheus
Grafana
Kafka Exporter
Confluent Control Center

Monitoring is extremely important in distributed event systems.

23. What is a message key?

A message key helps Kafka determine partition assignment.

Example:

kafkaTemplate.send("orders", orderId, order);

Messages with the same key normally go to the same partition.

Benefits:

Per-key ordering guarantee
Related message grouping

24. How does Kafka choose a partition?

Kafka partition selection depends on:

Situation	Behavior
Key exists	Hash of the key modulo the partition count
No key	Round-robin in older clients; sticky (per-batch) partitioning by default since Kafka 2.4

The same key maps to the same partition as long as the partition count does not change.
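
Key-based selection can be sketched as a hash taken modulo the partition count. The hash below is a stand-in chosen for simplicity: Kafka's default partitioner applies murmur2 to the serialized key, so the actual partition numbers will differ. The names are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class KeyPartitionerSketch {

    /** Maps a key to a partition in [0, partitionCount). Same inputs give the same result. */
    static int partitionFor(String key, int partitionCount) {
        // Stand-in hash over the key bytes; Kafka itself uses murmur2 here.
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        // floorMod keeps the result non-negative even for negative hashes.
        return Math.floorMod(hash, partitionCount);
    }
}
```

Calling partitionFor("order-1001", 3) always returns the same value, which is what keeps all events for one order together. Calling it with a different partition count may return a different partition, which is why adding partitions can break per-key ordering for existing keys.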

25. How do you guarantee ordering in Kafka?

Kafka guarantees ordering only inside a partition.

To maintain ordering:

Use the same message key for related events.

Example:

order-1001

All events for the same order go to the same partition.

26. Can Kafka guarantee global ordering?

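No, not across partitions.

Kafka cannot guarantee ordering across multiple partitions.

Only partition-level ordering exists.

A topic with a single partition is totally ordered, but only one consumer per group can read it, which severely reduces scalability.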

This is a very important interview point.

27. What is a schema registry?

A schema registry is a separate service (for example, Confluent Schema Registry) that stores and versions message schemas centrally.

Commonly used with:

Avro
Protobuf

Benefits:

Schema validation
Version compatibility
Producer-consumer consistency

Without schema management, evolving message structures becomes risky.

28. What is Avro?

Avro is a binary serialization format commonly used with Kafka.

Advantages:

Compact size
Fast serialization
Strong schema support
Version compatibility

Avro works very well with Schema Registry.

29. What is JSON serialization?

JSON serialization converts objects into JSON text format.

Example:

{
  "id": 1,
  "name": "John"
}

Advantages:

Human-readable
Easy debugging
Simple integration

Disadvantages:

Larger payload
Slower than binary formats like Avro
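
The payload difference can be seen by encoding the same record both ways. The binary layout below is a hand-rolled illustration using DataOutputStream, not Avro, and the names are hypothetical:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class PayloadSizeSketch {

    /** Builds the JSON text by hand; the name is assumed to need no escaping. */
    static byte[] toJson(int id, String name) {
        String json = "{\"id\":" + id + ",\"name\":\"" + name + "\"}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    /** Writes the same fields without field names or punctuation. */
    static byte[] toBinary(int id, String name) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(id);   // fixed 4 bytes
            out.writeUTF(name); // 2-byte length prefix plus the encoded characters
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory stream
        }
        return buffer.toByteArray();
    }
}
```

For id 1 and name John, the JSON form takes 22 bytes and the binary form takes 10. JSON repeats the field names in every message, while a schema-based binary format keeps them in the schema.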

30. How do you integrate Kafka with Spring Boot?

Usually with the spring-kafka dependency.

Producer example:

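@Autowired
private KafkaTemplate<String, Object> kafkaTemplate;

public void publish(OrderEvent order) {
    // send() is asynchronous and returns a future; no key is set here.
    kafkaTemplate.send("orders", order);
}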

Consumer example:

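// The listener needs a group ID, for example via spring.kafka.consumer.group-id.
@KafkaListener(topics = "orders")
public void consume(OrderEvent event) {
    // Business logic for the received event goes here.
}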

From the application configuration, Spring Boot auto-configures:

KafkaTemplate
ConsumerFactory
ProducerFactory
Listener container factory

Example:

spring.kafka.bootstrap-servers=localhost:9092

Spring Kafka greatly simplifies Kafka integration in enterprise applications.
