Build Durable Pub Sub With Kafka Node Js

Learn how to create reliable event streaming systems using Apache Kafka and Node.js. Build producers, consumers, and consumer groups for production-ready distributed architectures.

Why Kafka for Pub-Sub

Modern distributed systems require reliable communication between services. When services need to share data asynchronously with guaranteed delivery, ordering, and the ability to replay messages, Apache Kafka stands out as the industry standard for event streaming. Unlike simple HTTP requests that fail when services are unavailable, Kafka provides a durable, persistent message queue that decouples producers from consumers and ensures no message is lost.

Traditional inter-service communication often relies on HTTP REST APIs, which create tight coupling between services. When a producer service makes a direct HTTP call to a consumer service, both must be available simultaneously, and the producer must handle retries, timeouts, and circuit breaking. Kafka fundamentally changes this model by introducing an intermediary event log that producers write to and consumers read from independently.

Key Benefits

  • Decoupling: Producers and consumers operate independently without knowing about each other
  • Durability: Messages persist on disk and survive broker restarts
  • Ordering: Messages within a partition maintain strict order
  • Replayability: Consumers can rewind to reprocess messages from any point
  • Scalability: Partition-based parallelism enables high throughput

The publish-subscribe pattern in Kafka enhances traditional message queuing by maintaining an immutable event log. Unlike simple queues that delete messages after consumption, Kafka retains messages for a configurable period, enabling scenarios like event sourcing, audit trails, and replay for new consumers joining the system.

Learn more about our approach to distributed systems architecture that leverages modern messaging patterns for scalable applications.

Setting Up KafkaJS in Your Node.js Project

Getting started with KafkaJS is straightforward. Install the library using npm or yarn, then create a Kafka client instance configured for your environment. KafkaJS is the official Node.js client for Apache Kafka and provides a modern, Promise-based API for all Kafka operations.

Installation

npm install kafkajs

Creating a Kafka Client

const { Kafka } = require('kafkajs');

const kafka = new Kafka({
 clientId: 'my-app',
 brokers: ['localhost:9092'],
 retry: {
 initialRetryTime: 100,
 retries: 8
 }
});

The Kafka client manages connections to your Kafka brokers. The clientId identifies your application in the Kafka cluster, while brokers specifies the list of Kafka broker addresses to connect to. The retry configuration ensures your client gracefully handles temporary connection failures without crashing.

Connection Best Practices

For production environments, always configure appropriate retry behavior and consider using multiple broker addresses for high availability. When running in containerized environments, use service discovery to dynamically resolve broker addresses rather than hardcoding them. This approach ensures your application can handle broker failures and scaling operations seamlessly.

As noted in the KafkaJS producing documentation, configuring retry parameters is essential for building resilient clients that can recover from transient network issues.

For Node.js applications requiring reliable message processing, proper client configuration is the foundation of durable event streaming systems. Our backend development services include architecture design for distributed systems using these patterns.

Building a Durable Producer

The producer is responsible for publishing messages to Kafka topics. Building a durable producer means configuring it to ensure messages are reliably delivered and not lost during broker failures. The KafkaJS producing API provides comprehensive options for controlling delivery guarantees.

Creating and Connecting the Producer

const producer = kafka.producer();

await producer.connect();
console.log('Producer connected');

The producer must be connected before sending messages. Connection establishes the TCP connection to the Kafka broker and performs the initial handshake to authenticate and authorize the connection.

Sending Messages

await producer.send({
 topic: 'user-events',
 messages: [
 {
 key: 'user-123',
 value: JSON.stringify({ event: 'signup', timestamp: Date.now() })
 }
 ]
});

Each message consists of a key and value. The key is used for partitioning--messages with the same key go to the same partition, ensuring ordering for related messages. The value contains your actual message payload, which should be a string or Buffer.

Understanding Acknowledgment Levels

The acks parameter controls durability guarantees for your messages:

LevelDescriptionUse Case
acks: 0Fire and forget; no acknowledgmentHighest throughput, acceptable for metrics
acks: 1Wait for leader acknowledgmentBalanced performance and reliability
acks: -1Wait for all in-sync replicasMaximum durability for critical data
await producer.send({
 topic: 'user-events',
 acks: -1, // Wait for all replicas
 messages: [
 { key: 'user-123', value: JSON.stringify({ event: 'signup' }) }
 ]
});

For production systems processing critical data, use acks: -1 to ensure messages survive broker failures. This setting waits for all in-sync replicas to acknowledge receipt before returning success.

Message Headers

Kafka supports headers for metadata, useful for tracing and context:

await producer.send({
 topic: 'user-events',
 messages: [{
 key: 'user-123',
 value: JSON.stringify({ event: 'signup' }),
 headers: {
 'correlation-id': 'uuid-here',
 'trace-id': 'span-id',
 'content-type': 'application/json'
 }
 }]
});

Headers are useful for adding tracing context, content type information, and custom metadata without modifying the message payload. This enables better observability and debugging capabilities in distributed systems.

Compression for Efficiency

KafkaJS supports multiple compression types to reduce network bandwidth:

const { CompressionTypes } = require('kafkajs');

await producer.send({
 topic: 'user-events',
 compression: CompressionTypes.GZIP,
 messages: [
 { key: 'user-123', value: largeMessagePayload }
 ]
});

Available compression types include GZIP (built-in), Snappy, LZ4, and ZSTD. Compression significantly reduces network bandwidth and storage requirements, especially for repetitive data patterns. For high-throughput systems, compression can reduce message sizes by 50-80% depending on the data structure.

Implementing Kafka-based event streaming is a key component of modern microservices architecture. Our team can help you design and deploy scalable event-driven systems that improve application reliability and performance.

Producer Durability Options

Configure your producer for different reliability requirements

Acknowledgment Levels

Choose between fire-and-forget, leader-only, or all-replicas acknowledgment based on your durability needs.

Compression

Reduce network and storage overhead with GZIP, Snappy, LZ4, or ZSTD compression.

Idempotence

Prevent duplicate messages with exactly-once semantics for critical business transactions.

Retry Strategy

Configure exponential backoff with jitter to handle temporary failures gracefully.

Creating Consumers and Consumer Groups

Consumer groups enable parallel processing and automatic load balancing. All consumers with the same groupId coordinate to divide topic partitions among themselves. As documented in the KafkaJS consuming documentation, this coordination ensures efficient message processing across multiple consumers.

Consumer Group Fundamentals

const consumer = kafka.consumer({ groupId: 'user-service-group' });

await consumer.connect();
await consumer.subscribe({ topic: 'user-events', fromBeginning: false });

When a consumer fails or a new consumer joins the group, Kafka initiates a rebalance operation to redistribute partitions among available consumers. This ensures continuous processing even as your consumer fleet changes.

Processing Messages with EachMessage

The eachMessage handler provides a simple API for processing messages one at a time:

await consumer.run({
 eachMessage: async ({ topic, partition, message, heartbeat }) => {
 console.log({
 key: message.key.toString(),
 value: message.value.toString(),
 headers: message.headers,
 partition,
 offset: message.offset
 });

 // Process the message
 await processUserEvent(JSON.parse(message.value));

 // Heartbeat during long processing
 await heartbeat();
 }
});

This handler automatically manages offsets and heartbeats, making it ideal for most use cases. The heartbeat ensures Kafka knows the consumer is still alive during long-running operations.

Batch Processing with EachBatch

For high-throughput scenarios requiring more control:

await consumer.run({
 eachBatchAutoResolve: true,
 eachBatch: async ({
 batch,
 resolveOffset,
 heartbeat,
 commitOffsetsIfNecessary,
 isRunning,
 isStale
 }) => {
 for (const message of batch.messages) {
 if (!isRunning() || isStale()) break;

 await processMessage(message);
 resolveOffset(message.offset);
 await heartbeat();
 }
 await commitOffsetsIfNecessary();
 }
});

EachBatch provides utilities for manual offset management, giving you finer control over processing semantics. This is essential when you need to process messages atomically or implement custom batching logic.

Consumer Configuration Options

Key consumer options control behavior and performance:

OptionDescriptionDefault
sessionTimeoutTime before consumer is considered failed30000ms
heartbeatIntervalHeartbeat frequency3000ms
maxBytesPerPartitionData fetched per partition1MB
maxWaitTimeInMsMaximum fetch wait time5000ms
autoCommitIntervalOffset commit intervalnull
autoCommitThresholdCommit after N messagesnull

Pause and Resume

Control consumption flow for backpressure handling:

// Pause consumption
await consumer.pause([{ topic: 'user-events', partitions: [0] }]);

// Resume consumption
await consumer.resume([{ topic: 'user-events', partitions: [0] }]);

Pause/resume is useful when downstream systems are overloaded and you need to slow down message consumption without losing messages. This pattern is essential for building resilient systems that can handle traffic spikes gracefully.

For organizations building distributed systems, our AI automation services can help integrate intelligent decision-making into your event streaming workflows, enabling real-time processing and automated responses.

Error Handling and Retry Strategies

Building resilient systems requires robust error handling. KafkaJS provides configurable retry mechanisms and patterns for handling failures gracefully. Following best practices from the LogRocket tutorial on durable pub-sub systems helps avoid common pitfalls.

Automatic Retry Configuration

const producer = kafka.producer({
 retry: {
 initialRetryTime: 100,
 retries: 5,
 maxRetryTime: 30000,
 factor: 2,
 multiplier: 2
 }
});

The retry strategy uses exponential backoff with jitter to handle temporary failures without overwhelming the system. Each retry attempt waits longer than the previous, preventing thundering herd problems.

Manual Error Handling

try {
 await producer.send({
 topic: 'user-events',
 messages: [{ key: 'user-123', value: JSON.stringify(data) }]
 });
} catch (error) {
 console.error('Failed to send message:', error);
 // Implement fallback: write to dead letter queue, log, etc.
}

For critical data, implement dead letter queues to capture messages that fail after all retries. This ensures no message is permanently lost and can be reprocessed later.

Idempotent Producer

For exactly-once semantics, enable idempotence:

const producer = kafka.producer({
 idempotent: true,
 maxInFlightRequests: 5
});

The idempotent producer prevents duplicate messages caused by producer retries. Each message gets a unique sequence number, and Kafka ensures duplicates are detected and discarded. This is essential for financial transactions and critical business data.

Production Best Practices

Deploying Kafka clients in production requires attention to monitoring, graceful shutdown, and operational resilience. Following established patterns for durable pub-sub systems ensures reliable operation under production workloads.

Health Checks and Monitoring

Monitor producer and consumer connections and metrics:

producer.on('producer.connect', () => console.log('Producer connected'));
producer.on('producer.disconnect', () => console.log('Producer disconnected'));
consumer.on('consumer.connect', () => console.log('Consumer connected'));
consumer.on('consumer.disconnect', () => console.log('Consumer disconnected'));

Track metrics like request latency, error rates, and consumer lag using monitoring tools. High consumer lag indicates your consumers can't keep up with message production.

Graceful Shutdown

Always disconnect properly to ensure in-flight messages complete:

async function shutdown() {
 await producer.disconnect();
 await consumer.disconnect();
 console.log('Disconnected from Kafka');
}

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);

Proper shutdown ensures in-flight messages complete and offsets are committed, preventing message loss during deployment or scaling operations.

Connection Management

For applications with multiple producers/consumers, manage connections efficiently:

// Singleton pattern example
let producerInstance = null;

async function getProducer() {
 if (!producerInstance) {
 producerInstance = kafka.producer();
 await producerInstance.connect();
 }
 return producerInstance;
}

Reusing producer connections across requests improves performance and reduces broker load. Avoid creating new connections for each message.

Topic Configuration

Consider topic-level settings for durability and performance:

const admin = kafka.admin();
await admin.createTopics({
 topics: [{
 topic: 'user-events',
 numPartitions: 3,
 replicationFactor: 3,
 configEntries: [
 { name: 'retention.ms', value: '604800000' }, // 7 days
 { name: 'min.insync.replicas', value: '2' }
 ]
 }]
});

Key topic configurations include partition count (for parallelism), replication factor (for durability), retention period (for replay capability), and minimum in-sync replicas (for write durability). Proper topic design is crucial for achieving the right balance between performance and reliability.

For enterprise-grade event streaming architectures, our cloud infrastructure services can help design and implement robust Kafka deployments that scale with your business needs.

Complete Kafka Pub-Sub Example
1const { Kafka, CompressionTypes } = require('kafkajs');2 3async function main() {4 const kafka = new Kafka({5 clientId: 'user-service',6 brokers: ['localhost:9092'],7 retry: {8 initialRetryTime: 100,9 retries: 810 }11 });12 13 // Producer with durability settings14 const producer = kafka.producer({15 allowAutoTopicCreation: true,16 transactionTimeout: 3000017 });18 19 await producer.connect();20 console.log('Producer connected');21 22 // Consumer with group management23 const consumer = kafka.consumer({24 groupId: 'user-service-handlers',25 sessionTimeout: 30000,26 heartbeatInterval: 300027 });28 29 await consumer.connect();30 await consumer.subscribe({ topic: 'user-events', fromBeginning: false });31 32 // Start consuming messages33 await consumer.run({34 eachMessage: async ({ topic, partition, message, heartbeat }) => {35 try {36 const data = JSON.parse(message.value.toString());37 await processUserEvent(data);38 await heartbeat();39 } catch (error) {40 console.error('Processing failed:', error);41 // Implement retry or dead letter queue logic here42 }43 }44 });45 46 // Helper function to publish events47 async function publishEvent(event) {48 await producer.send({49 topic: 'user-events',50 compression: CompressionTypes.GZIP,51 acks: -1, // Wait for all replicas52 messages: [{53 key: event.userId,54 value: JSON.stringify(event),55 headers: {56 'event-type': event.type,57 'timestamp': Date.now().toString()58 }59 }]60 });61 }62 63 // Graceful shutdown handler64 async function shutdown() {65 console.log('Shutting down...');66 await producer.disconnect();67 await consumer.disconnect();68 console.log('Disconnected from Kafka');69 process.exit(0);70 }71 72 process.on('SIGTERM', shutdown);73 process.on('SIGINT', shutdown);74}75 76main().catch(console.error);

Frequently Asked Questions

Ready to Build Scalable Event Systems?

Our team specializes in designing and implementing distributed architectures using Apache Kafka and modern Node.js patterns.

Sources

  1. KafkaJS Official Documentation - Producing - Primary reference for producer API, message structure, compression, and durability options
  2. KafkaJS Official Documentation - Consuming - Primary reference for consumer groups, offset commits, and batch processing patterns
  3. LogRocket: Build a durable pub-sub with Kafka in Node.js - Tutorial with practical examples and architecture patterns for production-ready systems