
Mastering Event-Driven Architecture with Go and Apache Kafka

Jeff Taakey
21+ Year CTO & Multi-Cloud Architect.


In the landscape of modern backend development in 2025, the shift from monolithic, synchronous systems to decoupled, event-driven architectures (EDA) is not just a trend—it’s a necessity for scale. While HTTP REST and gRPC have their place, they introduce tight coupling and latency chains that can cripple high-throughput systems.

Enter the power couple: Go (Golang) and Apache Kafka.

Go’s concurrency model (goroutines and channels) is practically purpose-built for processing streams of data, while Kafka remains the undisputed king of distributed event streaming. In this guide, we are going to move beyond “Hello World.” We will build a robust producer and consumer system, handle graceful shutdowns, discuss library trade-offs, and implement best practices suitable for a production environment.

Why Event-Driven?

Before writing code, it’s crucial to understand the why.

  1. Decoupling: Services don’t need to know who (or what) consumes their data.
  2. Scalability: You can scale consumers horizontally to handle backpressure without touching the producer.
  3. Resilience: If a consumer goes down, Kafka retains the messages for the configured retention period, so nothing is lost while the consumer recovers.

Prerequisites & Environment Setup

To follow this tutorial, ensure you have the following ready on your development machine:

  • Go 1.22+: We are utilizing modern Go idioms.
  • Docker & Docker Compose: To run a local Kafka cluster without installing Java manually.
  • IDE: VS Code (with Go extension) or JetBrains GoLand.

1. Setting up the Kafka Cluster

We will use a KRaft-mode Kafka setup (no ZooKeeper required) to keep things lightweight and modern. Create a file named docker-compose.yml in your project root:

version: "3"
services:
  kafka:
    image: 'bitnami/kafka:latest'
    ports:
      - '9092:9092'
    environment:
      - KAFKA_CFG_NODE_ID=0
      - KAFKA_CFG_PROCESS_ROLES=controller,broker
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka:9093
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_AUTO_CREATE_TOPICS_ENABLE=true
    volumes:
      - kafka_data:/bitnami/kafka
volumes:
  kafka_data:
    driver: local

Run the cluster:

docker-compose up -d

Choosing the Right Go Kafka Library
#

One of the first hurdles in the Go ecosystem is choosing a client library. Unlike Java, there isn’t one single “official” client. Here is a breakdown of the top contenders as of late 2025:

| Feature | segmentio/kafka-go | confluent-kafka-go | IBM/sarama |
| --- | --- | --- | --- |
| Type | Pure Go | CGO wrapper (librdkafka) | Pure Go |
| Performance | High | Very High (native C) | Medium/High |
| API Usability | Idiomatic, standard-library feel | Slight C-style overhead | Low-level, complex config |
| Maintenance | Active | Active (official Confluent) | Active (community) |
| Best Use Case | Standard microservices | High-freq trading / max throughput | Legacy systems |

Decision: For this tutorial, we will use segmentio/kafka-go. It is pure Go (no C compiler dependencies), provides a beautiful API that leverages context.Context, and is more than fast enough for 95% of use cases.

Initialize your project:

mkdir go-kafka-eda
cd go-kafka-eda
go mod init github.com/yourname/go-kafka-eda
go get github.com/segmentio/kafka-go
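
Optional: before writing any service code, you can sanity-check that the broker from Step 1 is reachable. This is a minimal sketch using kafka-go's Dial and Controller calls, assuming the compose file above and the default localhost:9092 listener:

package main

import (
	"fmt"
	"log"

	"github.com/segmentio/kafka-go"
)

func main() {
	// Dial the broker advertised in docker-compose.yml.
	conn, err := kafka.Dial("tcp", "localhost:9092")
	if err != nil {
		log.Fatalf("cannot reach Kafka: %v", err)
	}
	defer conn.Close()

	// Asking for the controller is a cheap liveness check against cluster metadata.
	controller, err := conn.Controller()
	if err != nil {
		log.Fatalf("cannot read cluster metadata: %v", err)
	}
	fmt.Printf("Kafka is up, controller at %s:%d\n", controller.Host, controller.Port)
}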

Architecture Overview

We will simulate a simple E-Commerce scenario:

  1. Order Service (Producer): Publishes an event when a user places an order.
  2. Inventory Service (Consumer): Listens for orders to reserve stock.

sequenceDiagram
    autonumber
    participant U as User
    participant P as Order Service (Go)
    participant K as Kafka (Topic: orders)
    participant C as Inventory Service (Go)
    U->>P: POST /order
    activate P
    P->>P: Validate Order
    P->>K: Publish "OrderCreated" Event
    activate K
    K-->>P: Ack
    deactivate K
    P-->>U: 202 Accepted
    deactivate P
    loop Async Processing
        C->>K: Fetch Message
        activate C
        K-->>C: Message Payload
        C->>C: Deduct Inventory
        C->>K: Commit Offset
        deactivate C
    end

Step 1: The Producer Implementation

In production, you rarely want to create a new connection for every message. We need a persistent Writer that handles batching and retries automatically.

Create a file producer/main.go.

package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"
	"time"

	"github.com/segmentio/kafka-go"
)

// OrderEvent represents the payload we send to Kafka
type OrderEvent struct {
	OrderID   string  `json:"order_id"`
	Customer  string  `json:"customer"`
	Amount    float64 `json:"amount"`
	Timestamp int64   `json:"timestamp"`
}

func main() {
	// 1. Configure the Writer
	// The Writer handles connection pooling, batching, and retries internally.
	writer := &kafka.Writer{
		Addr:                   kafka.TCP("localhost:9092"),
		Topic:                  "orders-topic",
		Balancer:               &kafka.LeastBytes{}, // Distributes messages evenly
		AllowAutoTopicCreation: true,
		// Performance optimization: Batch settings
		BatchSize:    100,                   // Flush when 100 messages are ready
		BatchTimeout: 10 * time.Millisecond, // ...or every 10ms
		Async:        true,                  // Non-blocking writes (careful with error handling)
	}

	defer writer.Close()

	fmt.Println("🚀 Producer started. Generating orders...")

	// 2. Simulate generating orders
	for i := 1; i <= 10; i++ {
		event := OrderEvent{
			OrderID:   fmt.Sprintf("ORD-%d", i),
			Customer:  fmt.Sprintf("User-%d", i),
			Amount:    99.99,
			Timestamp: time.Now().Unix(),
		}

		payload, err := json.Marshal(event)
		if err != nil {
			log.Printf("Failed to marshal event: %v", err)
			continue
		}

		// 3. Write Message
		// We use a context with timeout to prevent hanging if Kafka is down
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		
		err = writer.WriteMessages(ctx, kafka.Message{
			Key:   []byte(event.OrderID), // Key ensures ordering for specific orders
			Value: payload,
		})
		
		cancel() // Good practice to cancel context

		if err != nil {
			log.Printf("Failed to write message: %v", err)
		} else {
			log.Printf("✅ Sent: %s", event.OrderID)
		}

		time.Sleep(500 * time.Millisecond)
	}
}

Key Takeaways for the Producer:

  1. Async vs Sync: We enabled Async: true. In high-throughput systems this is vital, but it means WriteMessages returns immediately and errors are reported asynchronously through the writer's Completion callback (see the sketch after this list); if no callback is set, failed writes can go unnoticed. For critical financial data, consider Async: false.
  2. Message Keys: We set Key: []byte(event.OrderID). This ensures that all updates for the same order land in the same Kafka partition, guaranteeing order of processing.
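
To make point 1 concrete, here is a hedged sketch of wiring up the writer's Completion hook so asynchronous failures are at least surfaced. The field exists on kafka.Writer; logging is just one reasonable reaction, pushing the failed batch to a retry queue or a metric would be another.

package main

import (
	"log"
	"time"

	"github.com/segmentio/kafka-go"
)

// newAsyncWriter returns a writer whose failed async batches are reported
// through the Completion callback instead of disappearing silently.
func newAsyncWriter(broker, topic string) *kafka.Writer {
	return &kafka.Writer{
		Addr:         kafka.TCP(broker),
		Topic:        topic,
		Balancer:     &kafka.LeastBytes{},
		BatchTimeout: 10 * time.Millisecond,
		Async:        true,
		// Invoked once per batch after the broker acknowledges (or rejects) it.
		Completion: func(messages []kafka.Message, err error) {
			if err != nil {
				log.Printf("async write failed for %d message(s): %v", len(messages), err)
			}
		},
	}
}

func main() {
	w := newAsyncWriter("localhost:9092", "orders-topic")
	defer w.Close()
	log.Println("writer configured with a Completion callback")
}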

Step 2: The Consumer Implementation
#

Consuming is where the complexity lies. We need to handle context cancellation (for graceful shutdowns) and commit offsets only after successful processing to get “at-least-once” delivery.

Create a file consumer/main.go.

package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"
	"os"
	"os/signal"
	"syscall"

	"github.com/segmentio/kafka-go"
)

type OrderEvent struct {
	OrderID   string  `json:"order_id"`
	Customer  string  `json:"customer"`
	Amount    float64 `json:"amount"`
	Timestamp int64   `json:"timestamp"`
}

func main() {
	// 1. Configure the Reader
	// Using a GroupID ensures we are part of a Consumer Group.
	// Kafka handles load balancing if we spin up multiple instances of this app.
	reader := kafka.NewReader(kafka.ReaderConfig{
		Brokers:        []string{"localhost:9092"},
		Topic:          "orders-topic",
		GroupID:        "inventory-service-group",
		MinBytes:       10e3, // 10KB
		MaxBytes:       10e6, // 10MB
		CommitInterval: 0,    // We will commit explicitly
		StartOffset:    kafka.FirstOffset,
	})

	// 2. Setup Graceful Shutdown
	// We want to finish processing the current message before quitting.
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	sigChan := make(chan os.Signal, 1)
	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)

	go func() {
		<-sigChan
		fmt.Println("\n⚠️  Shutdown signal received. Stopping consumer...")
		cancel() // Cancel the context to stop the reader loop
	}()

	fmt.Println("🎧 Inventory Service listening for orders...")

	// 3. Consumption Loop
	for {
		// FetchMessage blocks until a message is received or context is cancelled
		m, err := reader.FetchMessage(ctx)
		if err != nil {
			if ctx.Err() != nil {
				// Context cancelled, exit loop gracefully
				break
			}
			log.Printf("Error fetching message: %v", err)
			continue
		}

		// 4. Process the Message
		processOrder(ctx, m)

		// 5. Explicit Commit
		// Only commit AFTER successful processing. 
		// If the app crashes during processing, the message is re-delivered.
		if err := reader.CommitMessages(ctx, m); err != nil {
			log.Printf("Failed to commit message: %v", err)
		}
	}

	if err := reader.Close(); err != nil {
		log.Printf("failed to close reader: %v", err)
	}
	fmt.Println("👋 Consumer stopped gracefully.")
}

func processOrder(ctx context.Context, m kafka.Message) {
	var event OrderEvent
	if err := json.Unmarshal(m.Value, &event); err != nil {
		log.Printf("Error parsing JSON: %v", err)
		return
	}

	fmt.Printf("📦 Processing Order: %s | Customer: %s | Partition: %d\n", 
		event.OrderID, event.Customer, m.Partition)
	
	// Simulate business logic latency
	// time.Sleep(100 * time.Millisecond) 
}

Key Takeaways for the Consumer:

  1. Graceful Shutdown: Notice the signal.Notify. In Kubernetes, pods are killed frequently. If you kill a consumer mid-process, you might leave data in an inconsistent state. The context propagation ensures we stop fetching new messages but (ideally) finish the current one.
  2. FetchMessage vs ReadMessage (compared in the sketch below):
    • ReadMessage: Commits the offset as part of the call. Easier, but risky: if your app crashes after reading but before processing, that message is skipped and effectively lost to your application.
    • FetchMessage + CommitMessages: Manual control. This is the production standard: process first, then commit.
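
For comparison, a minimal auto-commit loop with ReadMessage looks like the sketch below (same broker, topic, and group assumptions as the consumer above). It is shorter, but because the offset is committed as part of the call, a crash between ReadMessage and your business logic can silently skip that event.

package main

import (
	"context"
	"log"

	"github.com/segmentio/kafka-go"
)

// readLoop is the auto-commit variant: simpler, but ReadMessage fetches AND
// commits in one step, so there is no "process first, commit later" window.
func readLoop(ctx context.Context) {
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		Topic:   "orders-topic",
		GroupID: "inventory-service-group",
	})
	defer r.Close()

	for {
		m, err := r.ReadMessage(ctx)
		if err != nil {
			return // context cancelled or reader closed
		}
		log.Printf("got %s at offset %d", string(m.Key), m.Offset)
	}
}

func main() {
	readLoop(context.Background())
}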

Running the Application

  1. Start Kafka: Ensure Docker is running.
  2. Start the Consumer:
    go run consumer/main.go
  3. Start the Producer (in a new terminal):
    go run producer/main.go

You should see the Producer firing off logs and the Consumer immediately picking them up and processing them.

Common Pitfalls and Solutions

1. The “Poison Pill” Message

What happens if a message contains malformed JSON? In our code above we log the error, return, and still commit, so the bad message is simply skipped. But if you instead refused to commit on an error that will never go away (a “poison pill”), the same message would be re-delivered infinitely, blocking the partition.

Solution:

  • Permanent Errors (Invalid JSON): Log it, commit the offset (skip it), and perhaps send it to a “Dead Letter Queue” (a separate Kafka topic for bad messages); see the sketch below.
  • Transient Errors (DB Down): Retry locally with exponential backoff. Do not commit until successful.
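
A dead-letter queue in this setup is just a second topic plus a second writer. The sketch below shows one possible shape; the topic name orders-dlq and the x-error header are illustrative assumptions, not conventions established earlier in this tutorial.

package main

import (
	"context"
	"errors"
	"log"

	"github.com/segmentio/kafka-go"
)

// publishToDLQ forwards a message we cannot process to a dead-letter topic so
// the main partition is not blocked. The processing error travels as a header.
func publishToDLQ(ctx context.Context, dlq *kafka.Writer, m kafka.Message, procErr error) error {
	return dlq.WriteMessages(ctx, kafka.Message{
		Key:   m.Key,
		Value: m.Value,
		Headers: []kafka.Header{
			{Key: "x-error", Value: []byte(procErr.Error())},
		},
	})
}

func main() {
	dlq := &kafka.Writer{
		Addr:                   kafka.TCP("localhost:9092"),
		Topic:                  "orders-dlq", // assumed dead-letter topic name
		AllowAutoTopicCreation: true,
	}
	defer dlq.Close()

	// Example: a poison-pill message we decided to skip.
	bad := kafka.Message{Key: []byte("ORD-13"), Value: []byte("{not-json")}
	if err := publishToDLQ(context.Background(), dlq, bad, errors.New("invalid JSON")); err != nil {
		log.Printf("failed to publish to DLQ: %v", err)
	}
}

In the consumer loop you would call something like publishToDLQ for permanent failures and then commit the original message’s offset, exactly as in step 5 above.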

2. Rebalancing Storms

If your processing logic takes too long, Kafka might think the consumer is dead and trigger a rebalance (assigning partitions to other consumers). This stops the world for your group.

Solution:

  • Tune HeartbeatInterval and SessionTimeout in the ReaderConfig (see the sketch after this list).
  • Ensure your processing loop is faster than the session timeout, or process messages asynchronously (though this complicates offset management).
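
Both knobs live directly on kafka-go’s ReaderConfig. The durations below are illustrative starting points only; the hard rules are that the heartbeat interval must be much shorter than the session timeout, and the session timeout must exceed your worst-case per-message processing time.

package main

import (
	"time"

	"github.com/segmentio/kafka-go"
)

// newTunedReader shows where the rebalance-related settings live.
func newTunedReader() *kafka.Reader {
	return kafka.NewReader(kafka.ReaderConfig{
		Brokers:           []string{"localhost:9092"},
		Topic:             "orders-topic",
		GroupID:           "inventory-service-group",
		HeartbeatInterval: 3 * time.Second,  // how often we tell Kafka "still alive"
		SessionTimeout:    60 * time.Second, // how long Kafka waits before rebalancing us out
	})
}

func main() {
	r := newTunedReader()
	defer r.Close()
}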

Performance Tuning Checklist

| Setting | Default | Recommendation for High Throughput |
| --- | --- | --- |
| BatchSize (Producer) | 100 | 100 - 1000 (depends on message size) |
| BatchTimeout (Producer) | 1s | 10ms - 50ms (reduce latency) |
| MinBytes (Consumer) | 1 | 10KB - 100KB (reduces network syscalls) |
| Compression | None | Snappy or Zstd (saves bandwidth significantly) |
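
For reference, here is roughly how those recommendations map onto kafka-go’s Writer and ReaderConfig fields. Treat the numbers as starting points to benchmark against your own message sizes rather than drop-in production values.

package main

import (
	"time"

	"github.com/segmentio/kafka-go"
)

func main() {
	// Producer side: batch aggressively and compress the batches.
	w := &kafka.Writer{
		Addr:         kafka.TCP("localhost:9092"),
		Topic:        "orders-topic",
		BatchSize:    500,                   // flush after 500 buffered messages...
		BatchTimeout: 20 * time.Millisecond, // ...or after 20ms, whichever comes first
		Compression:  kafka.Snappy,          // cheap on CPU, large bandwidth savings
	}
	defer w.Close()

	// Consumer side: ask the broker for larger batches to cut down on round trips.
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers:  []string{"localhost:9092"},
		Topic:    "orders-topic",
		GroupID:  "inventory-service-group",
		MinBytes: 50e3, // wait for ~50KB before returning a fetch
		MaxBytes: 10e6, // but never more than 10MB per fetch
	})
	defer r.Close()
}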

Conclusion

Building event-driven systems with Go and Kafka allows you to create architectures that are loosely coupled yet highly cohesive. By utilizing segmentio/kafka-go, we gain access to a clean, idiomatic Go API that handles the heavy lifting of connection management.

Remember, the code provided here is a solid foundation. As you move to production, focus heavily on your observability (metrics/logging) and your error handling strategies (Dead Letter Queues).

Next Steps:

  • Implement a Dead Letter Queue for failed messages.
  • Add Prometheus metrics to monitor lag (the delay between production and consumption); a starting point is sketched below.
  • Explore Schema Registry to enforce data contracts between services.
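
As a starting point for the metrics item above, kafka-go’s Reader exposes a Stats() snapshot that includes its current lag. A hedged sketch of exporting it with the Prometheus Go client (an extra dependency, github.com/prometheus/client_golang, not used elsewhere in this tutorial) could look like this:

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"github.com/segmentio/kafka-go"
)

func main() {
	reader := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		Topic:   "orders-topic",
		GroupID: "inventory-service-group",
	})
	defer reader.Close()

	// GaugeFunc is evaluated on every scrape, so the exported lag is always current.
	lag := prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name: "kafka_consumer_lag_messages",
		Help: "Messages between the last produced offset and the last consumed offset.",
	}, func() float64 {
		return float64(reader.Stats().Lag)
	})
	prometheus.MustRegister(lag)

	// Expose /metrics alongside the consumer (the consume loop itself is omitted here).
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":2112", nil))
}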

Happy coding!


Disclaimer: Code examples are intended for educational purposes. Always review security configurations (SSL/SASL) before deploying to production environments.