Why Kafka for Pub-Sub
Modern distributed systems require reliable communication between services. When services need to share data asynchronously with durable delivery, per-partition ordering, and the ability to replay messages, Apache Kafka is a widely adopted choice for event streaming. Unlike direct HTTP requests, which fail when the receiving service is unavailable, Kafka provides a durable, replicated log that decouples producers from consumers and, when configured with appropriate replication and acknowledgment settings, protects messages against loss.
Traditional inter-service communication often relies on HTTP REST APIs, which create tight coupling between services. When a producer service makes a direct HTTP call to a consumer service, both must be available simultaneously, and the producer must handle retries, timeouts, and circuit breaking. Kafka fundamentally changes this model by introducing an intermediary event log that producers write to and consumers read from independently.
Key Benefits
- Decoupling: Producers and consumers operate independently without knowing about each other
- Durability: Messages persist on disk and survive broker restarts
- Ordering: Messages within a partition maintain strict order
- Replayability: Consumers can rewind to reprocess messages from any point
- Scalability: Partition-based parallelism enables high throughput
The publish-subscribe pattern in Kafka enhances traditional message queuing by maintaining an immutable event log. Unlike simple queues that delete messages after consumption, Kafka retains messages for a configurable period, enabling scenarios like event sourcing, audit trails, and replay for new consumers joining the system.
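The difference between a queue that deletes on consumption and a retained log can be sketched with a toy in-memory model. This is only an illustration of the idea; `ToyLog` is a hypothetical class and says nothing about how Kafka stores data internally.

```javascript
// Toy in-memory model of a retained log (not how Kafka is implemented).
class ToyLog {
  constructor() {
    this.records = [];
  }

  // Appending returns the offset (position) of the new record.
  append(value) {
    this.records.push(value);
    return this.records.length - 1;
  }

  // Reading never removes records; the caller chooses where to start.
  readFrom(offset) {
    return this.records.slice(offset);
  }
}

const log = new ToyLog();
log.append('signup');
log.append('login');
log.append('logout');

// An existing consumer that has already handled offsets 0 and 1.
const caughtUp = log.readFrom(2);
// A new consumer that replays the full history.
const replayed = log.readFrom(0);
```

Because reads do not delete anything, each consumer only needs to remember its own offset, which is what makes replay and late-joining consumers possible.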
Setting Up KafkaJS in Your Node.js Project
Getting started with KafkaJS is straightforward. Install the library using npm or yarn, then create a Kafka client instance configured for your environment. KafkaJS is a community-maintained Apache Kafka client written in pure JavaScript, with no native dependencies, and it provides a modern, Promise-based API for producing, consuming, and cluster administration.
Installation
npm install kafkajs
Creating a Kafka Client
const { Kafka } = require('kafkajs');

const kafka = new Kafka({
  clientId: 'my-app',
  brokers: ['localhost:9092'],
  retry: {
    initialRetryTime: 100,
    retries: 8
  }
});
The Kafka client manages connections to your Kafka brokers. The clientId identifies your application in the Kafka cluster, while brokers specifies the list of Kafka broker addresses to connect to. The retry configuration ensures your client gracefully handles temporary connection failures without crashing.
Connection Best Practices
For production environments, always configure appropriate retry behavior and consider using multiple broker addresses for high availability. When running in containerized environments, use service discovery to dynamically resolve broker addresses rather than hardcoding them. This approach ensures your application can handle broker failures and scaling operations seamlessly.
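One simple way to avoid hardcoding broker addresses is to read them from the environment. The sketch below assumes a comma-separated variable named `KAFKA_BROKERS`; both that name and the `parseBrokers` helper are hypothetical, not KafkaJS conventions.

```javascript
// Hypothetical helper: turns "host1:9092, host2:9092" into a broker array.
function parseBrokers(raw) {
  const brokers = String(raw || '')
    .split(',')
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
  if (brokers.length === 0) {
    throw new Error('No Kafka brokers configured');
  }
  return brokers;
}

// KAFKA_BROKERS is an assumed variable name; fall back to a local broker.
const brokers = parseBrokers(process.env.KAFKA_BROKERS || 'localhost:9092');
```

The resulting array can be passed as the `brokers` option when constructing the client. Listing several brokers means the client can still bootstrap if one of them is down.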
As noted in the KafkaJS producing documentation, configuring retry parameters is essential for building resilient clients that can recover from transient network issues.
For Node.js applications requiring reliable message processing, proper client configuration is the foundation of durable event streaming systems.
Building a Durable Producer
The producer is responsible for publishing messages to Kafka topics. Building a durable producer means configuring it to ensure messages are reliably delivered and not lost during broker failures. The KafkaJS producing API provides comprehensive options for controlling delivery guarantees.
Creating and Connecting the Producer
const producer = kafka.producer();
await producer.connect();
console.log('Producer connected');
The producer must be connected before sending messages. Connecting opens a TCP connection to a broker and, when SSL or SASL is configured on the client, performs the corresponding handshake and authentication.
Sending Messages
await producer.send({
  topic: 'user-events',
  messages: [
    {
      key: 'user-123',
      value: JSON.stringify({ event: 'signup', timestamp: Date.now() })
    }
  ]
});
Each message consists of a key and value. The key is used for partitioning--messages with the same key go to the same partition, ensuring ordering for related messages. The value contains your actual message payload, which should be a string or Buffer.
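The key-to-partition mapping can be pictured as a hash of the key taken modulo the partition count. The following is a toy sketch of that idea only: `toyPartitionFor` is a hypothetical function with a simple hash chosen for readability, and it will not produce the same partition numbers as the KafkaJS default partitioner.

```javascript
// Toy hash for illustration only; the real partitioner uses a different hash function.
function toyPartitionFor(key, numPartitions) {
  let hash = 0;
  for (const byte of Buffer.from(String(key))) {
    hash = (hash * 31 + byte) >>> 0; // keep it an unsigned 32-bit integer
  }
  return hash % numPartitions;
}

// The same key always maps to the same partition for a fixed partition count.
const first = toyPartitionFor('user-123', 3);
const second = toyPartitionFor('user-123', 3);
console.log(first === second);
```

One consequence worth noting: changing the number of partitions changes the mapping, so ordering for a given key is only preserved while the partition count stays fixed.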
Understanding Acknowledgment Levels
The acks parameter controls durability guarantees for your messages:
| Level | Description | Use Case |
|---|---|---|
| acks: 0 | Fire and forget; no acknowledgment | Highest throughput, acceptable for metrics |
| acks: 1 | Wait for leader acknowledgment | Balanced performance and reliability |
| acks: -1 | Wait for all in-sync replicas | Maximum durability for critical data |
await producer.send({
  topic: 'user-events',
  acks: -1, // Wait for all replicas
  messages: [
    { key: 'user-123', value: JSON.stringify({ event: 'signup' }) }
  ]
});
For production systems processing critical data, use acks: -1 to ensure messages survive broker failures. This setting waits for all in-sync replicas to acknowledge receipt before returning success.
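How `acks` interacts with the topic-level `min.insync.replicas` setting can be sketched as a small decision function. This is a simplified model for reasoning, not broker code; `writeAccepted` and its parameter names are hypothetical.

```javascript
// Simplified model of when a broker accepts a write (illustrative only).
// inSyncReplicas counts the leader as one of the in-sync replicas.
function writeAccepted({ acks, inSyncReplicas, minInsyncReplicas }) {
  if (acks === -1) {
    // With acks=-1 the write is rejected when too few replicas are in sync.
    return inSyncReplicas >= minInsyncReplicas;
  }
  // acks 0 and 1 do not enforce min.insync.replicas; only a leader is needed.
  return inSyncReplicas >= 1;
}

// Replication factor 3, min.insync.replicas 2, two brokers down:
console.log(writeAccepted({ acks: -1, inSyncReplicas: 1, minInsyncReplicas: 2 }));
console.log(writeAccepted({ acks: 1, inSyncReplicas: 1, minInsyncReplicas: 2 }));
```

The model illustrates the trade-off: with `acks: -1` the producer sees an error rather than silently writing to a single replica, whereas `acks: 1` keeps accepting writes at lower durability.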
Message Headers
Kafka supports headers for metadata, useful for tracing and context:
await producer.send({
  topic: 'user-events',
  messages: [{
    key: 'user-123',
    value: JSON.stringify({ event: 'signup' }),
    headers: {
      'correlation-id': 'uuid-here',
      'trace-id': 'span-id',
      'content-type': 'application/json'
    }
  }]
});
Headers are useful for adding tracing context, content type information, and custom metadata without modifying the message payload. This enables better observability and debugging capabilities in distributed systems.
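On the consuming side, header values typically arrive as Buffers rather than strings, so they need decoding before use in logs or tracing. A minimal sketch, assuming UTF-8 header values; `decodeHeaders` is a hypothetical helper.

```javascript
// Converts header values (Buffers or strings) into plain strings.
function decodeHeaders(headers) {
  const decoded = {};
  for (const [name, value] of Object.entries(headers || {})) {
    decoded[name] = Buffer.isBuffer(value) ? value.toString('utf8') : String(value);
  }
  return decoded;
}

const example = decodeHeaders({
  'correlation-id': Buffer.from('uuid-here'),
  'content-type': 'application/json'
});
console.log(example);
```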
Compression for Efficiency
KafkaJS supports multiple compression types to reduce network bandwidth:
const { CompressionTypes } = require('kafkajs');

await producer.send({
  topic: 'user-events',
  compression: CompressionTypes.GZIP,
  messages: [
    { key: 'user-123', value: largeMessagePayload }
  ]
});
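Available compression types include GZIP, Snappy, LZ4, and ZSTD. GZIP works out of the box; the other codecs need an additional codec package registered with KafkaJS. Compression reduces network bandwidth and storage requirements, especially for repetitive data such as JSON with recurring field names. The savings depend heavily on the payload: reductions of 50-80% are plausible for repetitive text, while already-compressed or random data shrinks very little.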
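A quick way to estimate what GZIP would save on your own payloads is to compress a representative sample with Node's built-in `zlib` module. The sample events below are made up for illustration; real ratios will differ.

```javascript
const zlib = require('zlib');

// Repetitive JSON, similar in shape to a batch of user events (sample data).
const events = Array.from({ length: 200 }, (_, i) => ({
  event: 'signup',
  userId: `user-${i}`,
  source: 'web'
}));
const raw = Buffer.from(JSON.stringify(events));
const compressed = zlib.gzipSync(raw);

const ratio = compressed.length / raw.length;
console.log(`raw=${raw.length}B gzip=${compressed.length}B ratio=${ratio.toFixed(2)}`);
```

Running this against real messages from your topics gives a better basis for choosing a codec than a generic percentage.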
Configure your producer for different reliability requirements:
- Acknowledgment levels: Choose between fire-and-forget, leader-only, or all-replicas acknowledgment based on your durability needs.
- Compression: Reduce network and storage overhead with GZIP, Snappy, LZ4, or ZSTD compression.
- Idempotence: Prevent duplicate messages caused by producer retries for critical business transactions.
- Retry strategy: Configure exponential backoff with jitter to handle temporary failures gracefully.
Creating Consumers and Consumer Groups
Consumer groups enable parallel processing and automatic load balancing. All consumers with the same groupId coordinate to divide topic partitions among themselves. As documented in the KafkaJS consuming documentation, this coordination ensures efficient message processing across multiple consumers.
Consumer Group Fundamentals
const consumer = kafka.consumer({ groupId: 'user-service-group' });
await consumer.connect();
await consumer.subscribe({ topic: 'user-events', fromBeginning: false });
When a consumer fails or a new consumer joins the group, Kafka initiates a rebalance operation to redistribute partitions among available consumers. This ensures continuous processing even as your consumer fleet changes.
Processing Messages with EachMessage
The eachMessage handler provides a simple API for processing messages one at a time:
await consumer.run({
  eachMessage: async ({ topic, partition, message, heartbeat }) => {
    // Keys are optional, so guard against null before calling toString()
    const key = message.key ? message.key.toString() : null;
    const value = message.value.toString();
    console.log({
      key,
      value,
      headers: message.headers,
      partition,
      offset: message.offset
    });
    // Process the message
    await processUserEvent(JSON.parse(value));
    // Heartbeat during long processing
    await heartbeat();
  }
});
This handler automatically manages offsets and heartbeats, making it ideal for most use cases. The heartbeat ensures Kafka knows the consumer is still alive during long-running operations.
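Inside a handler it helps to decode messages defensively, since keys may be absent and payloads may not be valid JSON. A minimal sketch of such a decoder; `parseMessage` is a hypothetical helper and the returned shape is an assumption, not part of KafkaJS.

```javascript
// Hypothetical helper: decodes a KafkaJS-style message without throwing.
function parseMessage(message) {
  const key = message.key ? message.key.toString() : null; // keys are optional
  const text = message.value ? message.value.toString() : '';
  try {
    return { ok: true, key, data: JSON.parse(text) };
  } catch (error) {
    // Keep the raw text so the caller can log it or route it elsewhere.
    return { ok: false, key, error: error.message, raw: text };
  }
}

const good = parseMessage({ key: null, value: Buffer.from('{"event":"signup"}') });
const bad = parseMessage({ key: Buffer.from('user-1'), value: Buffer.from('not json') });
```

Returning a result object instead of throwing lets the handler decide per message whether to skip, retry, or forward the failure.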
Batch Processing with EachBatch
For high-throughput scenarios requiring more control:
await consumer.run({
  eachBatchAutoResolve: true,
  eachBatch: async ({
    batch,
    resolveOffset,
    heartbeat,
    commitOffsetsIfNecessary,
    isRunning,
    isStale
  }) => {
    for (const message of batch.messages) {
      if (!isRunning() || isStale()) break;
      await processMessage(message);
      resolveOffset(message.offset);
      await heartbeat();
    }
    await commitOffsetsIfNecessary();
  }
});
EachBatch provides utilities for manual offset management, giving you finer control over processing semantics. This is essential when you need to process messages atomically or implement custom batching logic.
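The ordering of calls inside a batch handler matters: resolving an offset only after its message has been processed means a failing message is never marked as done. The sketch below wraps that pattern in a factory so it can be exercised with stand-in objects; `makeEachBatch` and `processMessage` are hypothetical names.

```javascript
// Builds an eachBatch-style handler around a hypothetical processMessage function.
function makeEachBatch(processMessage) {
  return async ({ batch, resolveOffset, heartbeat, isRunning, isStale }) => {
    for (const message of batch.messages) {
      if (!isRunning() || isStale()) break;
      await processMessage(message); // a throw here leaves this offset unresolved
      resolveOffset(message.offset); // mark as processed only after success
      await heartbeat();
    }
  };
}
```

The returned function could be passed as the `eachBatch` option. If processing throws, the error propagates out of the handler and the offsets after the last resolved one remain eligible for redelivery, which gives at-least-once behavior as long as `processMessage` tolerates repeats.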
Consumer Configuration Options
Key consumer options control behavior and performance:
| Option | Description | Default |
|---|---|---|
| sessionTimeout | Time before consumer is considered failed | 30000 ms |
| heartbeatInterval | Heartbeat frequency | 3000 ms |
| maxBytesPerPartition | Data fetched per partition | 1048576 bytes (1 MB) |
| maxWaitTimeInMs | Maximum fetch wait time | 5000 ms |
| autoCommitInterval | Offset commit interval | null |
| autoCommitThreshold | Commit after N messages | null |
The first four options are passed to kafka.consumer(), while autoCommitInterval and autoCommitThreshold are passed to consumer.run().
Pause and Resume
Control consumption flow for backpressure handling:
// Pause consumption
await consumer.pause([{ topic: 'user-events', partitions: [0] }]);
// Resume consumption
await consumer.resume([{ topic: 'user-events', partitions: [0] }]);
Pause/resume is useful when downstream systems are overloaded and you need to slow down message consumption without losing messages. This pattern is essential for building resilient systems that can handle traffic spikes gracefully.
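A common way to drive pause and resume is with high and low watermarks on some measure of downstream load. The sketch below assumes a hypothetical in-process queue depth as that measure; `createBackpressure` and the threshold names are illustrative, and `consumer` is any object exposing KafkaJS-style `pause` and `resume` methods.

```javascript
// Pauses or resumes a topic based on a hypothetical in-process queue depth.
function createBackpressure(consumer, topic, { high, low }) {
  let paused = false;
  return function onQueueDepth(depth) {
    if (!paused && depth >= high) {
      consumer.pause([{ topic }]); // no partitions listed: applies to the whole topic
      paused = true;
    } else if (paused && depth <= low) {
      consumer.resume([{ topic }]);
      paused = false;
    }
    return paused;
  };
}
```

Using two thresholds rather than one avoids rapid toggling when the load hovers around a single cutoff.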
Error Handling and Retry Strategies
Building resilient systems requires robust error handling. KafkaJS provides configurable retry mechanisms and patterns for handling failures gracefully. Following best practices from the LogRocket tutorial on durable pub-sub systems helps avoid common pitfalls.
Automatic Retry Configuration
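const producer = kafka.producer({
  retry: {
    initialRetryTime: 100, // first backoff, in ms
    retries: 5,
    maxRetryTime: 30000, // upper bound for a single backoff, in ms
    factor: 0.2, // randomization (jitter) factor
    multiplier: 2 // exponential growth factor
  }
});
The retry strategy uses exponential backoff with jitter to handle temporary failures without overwhelming the system. In KafkaJS, multiplier is the exponential growth factor and factor is the randomization factor, so factor is normally a fraction such as 0.2 rather than 2. Each retry waits roughly multiplier times longer than the previous one, up to maxRetryTime, and the jitter spreads out retries from many clients so they do not hit the brokers at the same moment (the thundering herd problem).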
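To get a feel for how these parameters shape the wait times, the schedule can be approximated with a few lines of arithmetic. This is a rough model of exponential backoff with jitter, not the library's exact algorithm; `backoffDelays` is a hypothetical function, and the random source is injectable so the output can be made deterministic.

```javascript
// Approximate model of exponential backoff with jitter (not KafkaJS's exact code).
function backoffDelays(options, random = Math.random) {
  const { initialRetryTime, retries, maxRetryTime, factor, multiplier } = options;
  const delays = [];
  let base = initialRetryTime;
  for (let attempt = 0; attempt < retries; attempt++) {
    // Jitter: scale the base delay by a value within +/- factor.
    const jitter = 1 + factor * (2 * random() - 1);
    delays.push(Math.min(Math.round(base * jitter), maxRetryTime));
    base = base * multiplier;
  }
  return delays;
}

// With the jitter fixed at its midpoint, the delays simply double each time.
const schedule = backoffDelays(
  { initialRetryTime: 100, retries: 5, maxRetryTime: 30000, factor: 0.2, multiplier: 2 },
  () => 0.5
);
console.log(schedule); // [ 100, 200, 400, 800, 1600 ]
```

Summing the schedule gives a rough upper bound on how long a send can block before the error reaches your own handling code, which is useful when choosing `retries`.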
Manual Error Handling
try {
  await producer.send({
    topic: 'user-events',
    messages: [{ key: 'user-123', value: JSON.stringify(data) }]
  });
} catch (error) {
  console.error('Failed to send message:', error);
  // Implement fallback: write to dead letter queue, log, etc.
}
For critical data, implement dead letter queues to capture messages that still fail after all retries. Failed messages are then kept for inspection and later reprocessing instead of being dropped.
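Dead letter queues are often applied on the consuming side, where a message that cannot be processed is forwarded to a separate topic along with some context about the failure. A minimal sketch of that variant follows; `withDeadLetter`, the `dlqTopic` parameter, and the header names are assumptions made for illustration, and `producer` is any object with a KafkaJS-style `send` method.

```javascript
// Wraps a handler so that failed messages are forwarded to a dead letter topic.
function withDeadLetter({ producer, dlqTopic, handler }) {
  return async ({ topic, partition, message }) => {
    try {
      await handler(message);
    } catch (error) {
      // Forward the original key and value untouched, plus failure context.
      await producer.send({
        topic: dlqTopic,
        messages: [{
          key: message.key,
          value: message.value,
          headers: {
            'original-topic': topic,
            'original-partition': String(partition),
            'error-message': String(error.message)
          }
        }]
      });
    }
  };
}
```

The returned function has the shape of an `eachMessage` handler. Note that if the dead letter send itself fails, the error propagates, so the original message is not silently skipped.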
Idempotent Producer
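To avoid duplicates caused by retries, enable idempotence:
const producer = kafka.producer({
  idempotent: true,
  maxInFlightRequests: 1 // the value used in KafkaJS's transactional producer examples
});
The idempotent producer prevents duplicate messages caused by producer retries. The broker tracks a producer ID and sequence numbers per partition, so a retried batch that was already written is detected and discarded. KafkaJS requires acks: -1 when idempotence is enabled. Idempotence on its own covers duplicates from retries within a single producer session; end-to-end exactly-once processing additionally needs transactions or consumers that handle repeated messages safely. This matters most for financial transactions and other critical business data.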
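The deduplication idea can be illustrated with a toy model in which a partition remembers the last sequence number it accepted from each producer. This is a deliberately simplified sketch; `ToyPartition` is hypothetical and omits details of the real protocol such as batches and producer epochs.

```javascript
// Toy model of broker-side deduplication (illustrative; not Kafka's implementation).
class ToyPartition {
  constructor() {
    this.records = [];
    this.lastSequence = new Map(); // producerId -> last accepted sequence number
  }

  append(producerId, sequence, value) {
    const last = this.lastSequence.has(producerId) ? this.lastSequence.get(producerId) : -1;
    if (sequence <= last) return false; // a retry of something already written
    this.lastSequence.set(producerId, sequence);
    this.records.push(value);
    return true;
  }
}

const partition = new ToyPartition();
const firstWrite = partition.append('producer-1', 0, 'signup');
// The acknowledgment was lost, so the producer retries the same sequence number.
const retryWrite = partition.append('producer-1', 0, 'signup');
```

In this model the retry is recognized and dropped, so the log holds the record once even though it was sent twice.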
Production Best Practices
Deploying Kafka clients in production requires attention to monitoring, graceful shutdown, and operational resilience. Following established patterns for durable pub-sub systems ensures reliable operation under production workloads.
Health Checks and Monitoring
Monitor producer and consumer connections and metrics:
producer.on('producer.connect', () => console.log('Producer connected'));
producer.on('producer.disconnect', () => console.log('Producer disconnected'));
consumer.on('consumer.connect', () => console.log('Consumer connected'));
consumer.on('consumer.disconnect', () => console.log('Consumer disconnected'));
Track metrics like request latency, error rates, and consumer lag using monitoring tools. High consumer lag indicates your consumers can't keep up with message production.
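Consumer lag per partition is the difference between the newest offset in the partition and the offset the group has committed. The sketch below computes it from two plain lists, which could be populated from the admin client or a monitoring tool; `computeLag` and the input shapes are assumptions for illustration.

```javascript
// Computes per-partition lag from two lists of { partition, offset } entries.
// Offsets are strings converted with BigInt, since they can exceed 2^53.
function computeLag(latestOffsets, committedOffsets) {
  const committed = new Map(
    committedOffsets.map((entry) => [entry.partition, BigInt(entry.offset)])
  );
  return latestOffsets.map(({ partition, offset }) => {
    const latest = BigInt(offset);
    const current = committed.has(partition) ? committed.get(partition) : -1n;
    // A negative committed offset means nothing has been committed yet; this
    // sketch then reports the latest offset as the lag, which is an approximation.
    const lag = current < 0n ? latest : latest - current;
    return { partition, lag: lag.toString() };
  });
}

const lagReport = computeLag(
  [{ partition: 0, offset: '120' }, { partition: 1, offset: '50' }],
  [{ partition: 0, offset: '100' }, { partition: 1, offset: '-1' }]
);
console.log(lagReport);
```

A lag that grows steadily over time is usually a stronger signal than any single reading, since short spikes are normal during traffic bursts or rebalances.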
Graceful Shutdown
Always disconnect properly to ensure in-flight messages complete:
async function shutdown() {
  await producer.disconnect();
  await consumer.disconnect();
  console.log('Disconnected from Kafka');
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);
Proper shutdown ensures in-flight messages complete and offsets are committed, preventing message loss during deployment or scaling operations.
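Signal handlers can fire more than once, for example when an orchestrator sends SIGTERM and an operator presses Ctrl+C. One way to make shutdown safe to call repeatedly is to cache the in-progress promise. In this sketch `createShutdown` is a hypothetical helper, and stopping the consumer before the producer is a design choice, not a requirement stated by the library.

```javascript
// Builds a shutdown function that runs once, even if several signals arrive.
function createShutdown({ consumer, producer, log = console }) {
  let shuttingDown = null;
  return function shutdown() {
    if (!shuttingDown) {
      shuttingDown = (async () => {
        // Stop consuming first so no new work starts, then close the producer.
        await consumer.disconnect();
        await producer.disconnect();
        log.log('Disconnected from Kafka');
      })();
    }
    return shuttingDown;
  };
}
```

The returned function can be registered for both SIGTERM and SIGINT. Disconnecting the consumer first matters when message handlers publish follow-up events, because the producer stays available until consumption has stopped.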
Connection Management
For applications with multiple producers/consumers, manage connections efficiently:
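// Singleton pattern example: cache the connection promise so concurrent
// callers share one producer and never receive it before connect() resolves
let producerPromise = null;
function getProducer() {
  if (!producerPromise) {
    const producer = kafka.producer();
    producerPromise = producer.connect()
      .then(() => producer)
      .catch((error) => {
        producerPromise = null; // allow a later call to retry the connection
        throw error;
      });
  }
  return producerPromise;
}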
Reusing producer connections across requests improves performance and reduces broker load. Avoid creating new connections for each message.
Topic Configuration
Consider topic-level settings for durability and performance:
const admin = kafka.admin();
await admin.createTopics({
  topics: [{
    topic: 'user-events',
    numPartitions: 3,
    replicationFactor: 3,
    configEntries: [
      { name: 'retention.ms', value: '604800000' }, // 7 days
      { name: 'min.insync.replicas', value: '2' }
    ]
  }]
});
Key topic configurations include partition count (for parallelism), replication factor (for durability), retention period (for replay capability), and minimum in-sync replicas (for write durability). Proper topic design is crucial for achieving the right balance between performance and reliability.
const { Kafka, CompressionTypes } = require('kafkajs');

async function main() {
  const kafka = new Kafka({
    clientId: 'user-service',
    brokers: ['localhost:9092'],
    retry: {
      initialRetryTime: 100,
      retries: 8
    }
  });

  // Producer with durability settings
  const producer = kafka.producer({
    allowAutoTopicCreation: true,
    transactionTimeout: 30000
  });

  await producer.connect();
  console.log('Producer connected');

  // Consumer with group management
  const consumer = kafka.consumer({
    groupId: 'user-service-handlers',
    sessionTimeout: 30000,
    heartbeatInterval: 3000
  });

  await consumer.connect();
  await consumer.subscribe({ topic: 'user-events', fromBeginning: false });

  // Start consuming messages
  await consumer.run({
    eachMessage: async ({ topic, partition, message, heartbeat }) => {
      try {
        const data = JSON.parse(message.value.toString());
        await processUserEvent(data);
        await heartbeat();
      } catch (error) {
        console.error('Processing failed:', error);
        // Implement retry or dead letter queue logic here
      }
    }
  });

  // Helper function to publish events
  async function publishEvent(event) {
    await producer.send({
      topic: 'user-events',
      compression: CompressionTypes.GZIP,
      acks: -1, // Wait for all replicas
      messages: [{
        key: event.userId,
        value: JSON.stringify(event),
        headers: {
          'event-type': event.type,
          'timestamp': Date.now().toString()
        }
      }]
    });
  }

  // Graceful shutdown handler
  async function shutdown() {
    console.log('Shutting down...');
    await producer.disconnect();
    await consumer.disconnect();
    console.log('Disconnected from Kafka');
    process.exit(0);
  }

  process.on('SIGTERM', shutdown);
  process.on('SIGINT', shutdown);
}

main().catch(console.error);
Sources
- KafkaJS Official Documentation - Producing - Primary reference for producer API, message structure, compression, and durability options
- KafkaJS Official Documentation - Consuming - Primary reference for consumer groups, offset commits, and batch processing patterns
- LogRocket: Build a durable pub-sub with Kafka in Node.js - Tutorial with practical examples and architecture patterns for production-ready systems