
Mastering Concurrency: Building a High-Performance Distributed Cache in Go from Scratch

Jeff Taakey
21+ Year CTO & Multi-Cloud Architect.

Introduction

In the landscape of modern backend architecture, caching is the unsung hero that stands between your database and a total meltdown. While tools like Redis or Memcached are industry standards, relying on them without understanding their internals limits your growth as a senior engineer.

There is no better way to master Golang’s concurrency primitives—channels, Goroutines, and Mutexes—than by building a distributed system from the ground up.

In this deep-dive tutorial, we aren’t just writing a map[string]string. We are building GoDistCache: a distributed, in-memory cache system that features:

  1. LRU (Least Recently Used) eviction policy.
  2. Concurrency safety using sync.Mutex.
  3. Consistent Hashing for distributing data across nodes.
  4. Distributed Nodes communicating via HTTP.
  5. Singleflight mechanism to prevent cache stampedes (the “Thundering Herd” problem).

By the end of this article, you will have a runnable system and a profound understanding of how distributed state is managed in 2025’s cloud-native environments.


Prerequisites & Environment Setup

Before we write a single line of code, ensure your environment is ready. We are targeting Go 1.23+ to leverage recent compiler optimizations and standard library improvements.

Development Environment

  • OS: Linux, macOS, or Windows (WSL2 recommended).
  • Go Version: 1.23 or higher.
  • IDE: VS Code (with Go extension) or JetBrains GoLand.
  • Terminal: Any bash/zsh capable terminal.

Project Initialization

Since we are building a modular project, let’s set up our go.mod. Because the goal is to understand the core, we will avoid external dependencies and stick to the standard library; a testing assertion or logging library could always be added later if needed.

mkdir godistcache
cd godistcache
go mod init github.com/yourname/godistcache

Directory Structure: We will organize our code cleanly:

godistcache/
├── lru/                # The underlying eviction algorithm
├── byteview.go         # Immutable view of memory
├── cache.go            # Concurrency control
├── consistenthash/     # Node selection logic
├── peers.go            # Interface for finding nodes
├── http.go             # Networking
├── singleflight/       # Request coalescing
├── godistcache.go      # Group orchestration (Step 6)
├── main.go             # Entry point
└── go.mod

Step 1: The Foundation - LRU Eviction Policy

A cache without an eviction policy is just a memory leak waiting to happen. We need a way to remove old data when memory fills up. The Least Recently Used (LRU) algorithm is the gold standard here.

Logic:

  • We maintain a map for fast lookups (O(1)).
  • We maintain a doubly linked list to track usage order.
  • When a value is accessed, move it to the front of the list.
  • When adding a value, if the cache is full, remove the item at the back of the list.

1.1 Implementing LRU

Create a folder lru and a file lru/lru.go.

package lru

import "container/list"

// Cache is an LRU cache. It is not safe for concurrent access.
type Cache struct {
	maxBytes int64      // Maximum memory allowed
	nbytes   int64      // Current memory used
	ll       *list.List // Doubly linked list
	cache    map[string]*list.Element
	// OnEvicted is an optional callback executed when an entry is purged.
	OnEvicted func(key string, value Value)
}

// Value is a generic interface for things we store.
// Implementing Len() allows us to track memory usage.
type Value interface {
	Len() int
}

type entry struct {
	key   string
	value Value
}

// New is the Constructor of Cache
func New(maxBytes int64, onEvicted func(string, Value)) *Cache {
	return &Cache{
		maxBytes:  maxBytes,
		ll:        list.New(),
		cache:     make(map[string]*list.Element),
		OnEvicted: onEvicted,
	}
}

// Get looks up a key's value
func (c *Cache) Get(key string) (value Value, ok bool) {
	if ele, ok := c.cache[key]; ok {
		// Move the element to the front (convention for "recently used")
		c.ll.MoveToFront(ele)
		kv := ele.Value.(*entry)
		return kv.value, true
	}
	return
}

// Add adds a value to the cache.
func (c *Cache) Add(key string, value Value) {
	if ele, ok := c.cache[key]; ok {
		// Update existing
		c.ll.MoveToFront(ele)
		kv := ele.Value.(*entry)
		c.nbytes += int64(value.Len()) - int64(kv.value.Len())
		kv.value = value
	} else {
		// Add new
		ele := c.ll.PushFront(&entry{key, value})
		c.cache[key] = ele
		c.nbytes += int64(len(key)) + int64(value.Len())
	}
	// Evict if necessary
	for c.maxBytes != 0 && c.maxBytes < c.nbytes {
		c.RemoveOldest()
	}
}

// RemoveOldest removes the oldest item
func (c *Cache) RemoveOldest() {
	ele := c.ll.Back()
	if ele != nil {
		c.ll.Remove(ele)
		kv := ele.Value.(*entry)
		delete(c.cache, kv.key)
		c.nbytes -= int64(len(kv.key)) + int64(kv.value.Len())
		if c.OnEvicted != nil {
			c.OnEvicted(kv.key, kv.value)
		}
	}
}
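
Before wrapping this structure in locks, it is worth a quick sanity check of the eviction behavior. Below is a minimal test sketch, assuming a hypothetical lru/lru_test.go and a throwaway String type that satisfies the Value interface:

package lru

import "testing"

// String is a trivial Value implementation used only for testing.
type String string

func (d String) Len() int { return len(d) }

func TestEviction(t *testing.T) {
	// Capacity fits exactly "k1"+"v1" and "k2"+"v2" (4 bytes each).
	c := New(int64(8), nil)
	c.Add("k1", String("v1"))
	c.Add("k2", String("v2"))
	c.Add("k3", String("v3")) // exceeds maxBytes, so "k1" should be evicted

	if _, ok := c.Get("k1"); ok {
		t.Fatal("expected k1 to be evicted")
	}
	if v, ok := c.Get("k3"); !ok || string(v.(String)) != "v3" {
		t.Fatal("expected k3 to be present")
	}
}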

1.2 The ByteView Abstraction

To ensure that cached values are immutable (so external code doesn’t modify the cache’s internal memory), we create a read-only wrapper.

Create byteview.go in the root:

package godistcache

// ByteView holds an immutable view of bytes.
type ByteView struct {
	b []byte
}

// Len returns the view's length
func (v ByteView) Len() int {
	return len(v.b)
}

// ByteSlice returns a copy of the data as a byte slice.
func (v ByteView) ByteSlice() []byte {
	return cloneBytes(v.b)
}

// String returns the data as a string
func (v ByteView) String() string {
	return string(v.b)
}

func cloneBytes(b []byte) []byte {
	c := make([]byte, len(b))
	copy(c, b)
	return c
}
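
The copy inside ByteSlice is the whole point: callers can scribble on the returned slice without corrupting what the cache holds. A tiny illustrative test (a hypothetical byteview_test.go in the same package) makes that guarantee explicit:

package godistcache

import "testing"

func TestByteViewIsImmutable(t *testing.T) {
	view := ByteView{b: []byte("hello")}

	// Mutate the copy handed out to callers.
	buf := view.ByteSlice()
	buf[0] = 'H'

	// The view's internal bytes are untouched.
	if view.String() != "hello" {
		t.Fatalf("expected internal bytes to stay %q, got %q", "hello", view.String())
	}
}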

Step 2: Concurrency Control

The LRU structure above is not thread-safe. In a high-throughput Golang application, multiple Goroutines will be hitting the cache simultaneously. We need to wrap our LRU with a sync.Mutex.

2.1 The Main Cache Structure

Create cache.go. This file acts as the bridge between the raw LRU logic and the concurrent world.

package godistcache

import (
	"godistcache/lru"
	"sync"
)

type cache struct {
	mu         sync.Mutex
	lru        *lru.Cache
	cacheBytes int64
}

func (c *cache) add(key string, value ByteView) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.lru == nil {
		c.lru = lru.New(c.cacheBytes, nil)
	}
	c.lru.Add(key, value)
}

func (c *cache) get(key string) (value ByteView, ok bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.lru == nil {
		return
	}
	if v, ok := c.lru.Get(key); ok {
		return v.(ByteView), ok
	}
	return
}

Optimization Note: In an extremely high-traffic scenario, a single mutex can become a bottleneck. A common optimization is Sharding (partitioning the cache into e.g., 256 independent maps based on the hash of the key). For this article, we stick to a single mutex for clarity, but keep sharding in mind for production optimization.
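
For reference, a rough sketch of what such sharding could look like is below. This is illustrative only and not part of the GoDistCache code we build in this article; shardedCache, shardCount, and shardFor are made-up names:

package godistcache

import "hash/fnv"

const shardCount = 256 // illustrative shard count

// shardedCache is a hypothetical alternative to `cache` that spreads keys
// across many independently locked LRU caches to reduce mutex contention.
type shardedCache struct {
	shards [shardCount]cache
}

func newShardedCache(totalBytes int64) *shardedCache {
	s := &shardedCache{}
	for i := range s.shards {
		s.shards[i].cacheBytes = totalBytes / shardCount
	}
	return s
}

// shardFor hashes the key and picks one of the shards.
// Writes to different shards never contend on the same mutex.
func (s *shardedCache) shardFor(key string) *cache {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &s.shards[h.Sum32()%shardCount]
}

func (s *shardedCache) add(key string, value ByteView) {
	s.shardFor(key).add(key, value)
}

func (s *shardedCache) get(key string) (ByteView, bool) {
	return s.shardFor(key).get(key)
}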


Step 3: Distribution & Consistent Hashing

This is where things get interesting. In a distributed system, you have multiple cache nodes (Node A, Node B, Node C). When a request for Key=“User:123” comes in, which node holds the data?

If we use simple modular hashing (hash(key) % 3), adding a new Node D changes the divisor to 4, causing nearly all keys to be remapped. The hit rate collapses toward zero, and the resulting flood of misses lands squarely on your database.

Solution: Consistent Hashing. We map both nodes and keys onto a ring (0 to 2^32-1). A key belongs to the first node found moving clockwise on the ring.

3.1 Consistent Hashing Implementation

Create consistenthash/consistenthash.go:

package consistenthash

import (
	"hash/crc32"
	"sort"
	"strconv"
)

// Hash maps bytes to uint32
type Hash func(data []byte) uint32

// Map stores keys and nodes on a hash ring
type Map struct {
	hash     Hash
	replicas int            // Virtual nodes per real node
	keys     []int          // Sorted hash ring
	hashMap  map[int]string // Mapping virtual node hash -> real node name
}

// New creates a Map instance
func New(replicas int, fn Hash) *Map {
	m := &Map{
		replicas: replicas,
		hash:     fn,
		hashMap:  make(map[int]string),
	}
	if m.hash == nil {
		m.hash = crc32.ChecksumIEEE
	}
	return m
}

// Add adds some keys to the hash.
func (m *Map) Add(keys ...string) {
	for _, key := range keys {
		for i := 0; i < m.replicas; i++ {
			// Create virtual node name: "0NodeA", "1NodeA"...
			hash := int(m.hash([]byte(strconv.Itoa(i) + key)))
			m.keys = append(m.keys, hash)
			m.hashMap[hash] = key
		}
	}
	sort.Ints(m.keys)
}

// Get gets the closest item in the hash to the provided key.
func (m *Map) Get(key string) string {
	if len(m.keys) == 0 {
		return ""
	}

	hash := int(m.hash([]byte(key)))
	// Binary search for the first node hash >= key hash
	idx := sort.Search(len(m.keys), func(i int) bool {
		return m.keys[i] >= hash
	})

	// If we reached the end, wrap around to the start (ring structure)
	if idx == len(m.keys) {
		idx = 0
	}

	return m.hashMap[m.keys[idx]]
}

Comparison: Modular vs. Consistent Hashing

Feature               | Modular Hashing (key % n)            | Consistent Hashing
Logic                 | Simple math operation.               | Ring topology + binary search.
Node addition/removal | Invalidates almost all cache keys.   | Invalidates only ~1/N of keys (those adjacent to the changed node).
Data skew             | Depends on hash function quality.    | Mitigated by virtual nodes.
Lookup complexity     | O(1)                                 | O(log N) due to binary search.

The introduction of replicas (Virtual Nodes) is crucial. It prevents data skew where one node accidentally covers a huge portion of the ring.
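
A quick way to convince yourself the ring behaves as described is a small test with a deterministic hash function (a hypothetical consistenthash/consistenthash_test.go; the numeric hash is chosen purely to make placements easy to predict):

package consistenthash

import (
	"strconv"
	"testing"
)

func TestHashing(t *testing.T) {
	// With this hash, the virtual nodes for "6" hash to 6, 16, 26, and so on.
	hash := New(3, func(key []byte) uint32 {
		i, _ := strconv.Atoi(string(key))
		return uint32(i)
	})

	// The sorted ring now contains 2/4/6/12/14/16/22/24/26.
	hash.Add("6", "4", "2")

	testCases := map[string]string{
		"2":  "2",
		"11": "2", // first hash >= 11 is 12, a virtual node of "2"
		"23": "4", // first hash >= 23 is 24, a virtual node of "4"
		"27": "2", // past 26, so we wrap around to the start of the ring
	}
	for key, want := range testCases {
		if got := hash.Get(key); got != want {
			t.Errorf("Get(%s) = %s, want %s", key, got, want)
		}
	}
}

Note how "27" wraps around to node "2", exercising the ring's wrap-around branch.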


Step 4: Distributed Communication (HTTP)

Nodes need to talk to each other. If Node A receives a request for a key that belongs to Node B, Node A must forward that request.

4.1 Abstractions

Create peers.go to define interfaces. This makes our code testable and allows swapping HTTP for gRPC later.

package godistcache

// PeerPicker is the interface that must be implemented to locate
// the peer that owns a specific key.
type PeerPicker interface {
	PickPeer(key string) (peer PeerGetter, ok bool)
}

// PeerGetter is the interface that must be implemented by a peer.
type PeerGetter interface {
	Get(group string, key string) ([]byte, error)
}

4.2 HTTP Pool

Create http.go. This acts as both a Server (handling incoming requests) and a Client (calling other nodes).

package godistcache

import (
	"fmt"
	"godistcache/consistenthash"
	"io"
	"log"
	"net/http"
	"net/url"
	"strings"
	"sync"
)

const (
	defaultBasePath = "/_geecache/"
	defaultReplicas = 50
)

// HTTPPool implements PeerPicker for a pool of HTTP peers.
type HTTPPool struct {
	self        string
	basePath    string
	mu          sync.Mutex
	peers       *consistenthash.Map
	httpGetters map[string]*httpGetter
}

func NewHTTPPool(self string) *HTTPPool {
	return &HTTPPool{
		self:     self,
		basePath: defaultBasePath,
	}
}

// Log info with server name
func (p *HTTPPool) Log(format string, v ...interface{}) {
	log.Printf("[Server %s] %s", p.self, fmt.Sprintf(format, v...))
}

// ServeHTTP handles all http requests
func (p *HTTPPool) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if !strings.HasPrefix(r.URL.Path, p.basePath) {
		panic("HTTPPool serving unexpected path: " + r.URL.Path)
	}
	p.Log("%s %s", r.Method, r.URL.Path)

	// /<basepath>/<groupname>/<key> required
	parts := strings.SplitN(r.URL.Path[len(p.basePath):], "/", 2)
	if len(parts) != 2 {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}

	groupName := parts[0]
	key := parts[1]

	group := GetGroup(groupName)
	if group == nil {
		http.Error(w, "no such group: "+groupName, http.StatusNotFound)
		return
	}

	view, err := group.Get(key)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}

	w.Header().Set("Content-Type", "application/octet-stream")
	w.Write(view.ByteSlice())
}

// Set updates the pool's list of peers.
func (p *HTTPPool) Set(peers ...string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.peers = consistenthash.New(defaultReplicas, nil)
	p.peers.Add(peers...)
	p.httpGetters = make(map[string]*httpGetter, len(peers))
	for _, peer := range peers {
		p.httpGetters[peer] = &httpGetter{baseURL: peer + p.basePath}
	}
}

// PickPeer picks a peer according to the key
func (p *HTTPPool) PickPeer(key string) (PeerGetter, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.peers.Get(key) != "" && p.peers.Get(key) != p.self {
		p.Log("Pick peer %s", p.peers.Get(key))
		return p.httpGetters[p.peers.Get(key)], true
	}
	return nil, false
}

// httpGetter implements the PeerGetter interface
type httpGetter struct {
	baseURL string
}

func (h *httpGetter) Get(group string, key string) ([]byte, error) {
	u := fmt.Sprintf(
		"%v%v/%v",
		h.baseURL,
		url.QueryEscape(group),
		url.QueryEscape(key),
	)
	res, err := http.Get(u)
	if err != nil {
		return nil, err
	}
	defer res.Body.Close()

	if res.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("server returned: %v", res.Status)
	}

	bytes, err := io.ReadAll(res.Body)
	if err != nil {
		return nil, err
	}
	return bytes, nil
}
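
We will wire up the full entry point at the end of the article, but it helps to see how a single node comes together. The following is a minimal, illustrative sketch of main.go that leans on the NewGroup and GetterFunc pieces we build in Step 6; the address and the "scores" group name are placeholders:

package main

import (
	"log"
	"net/http"

	"github.com/yourname/godistcache"
)

func main() {
	// A group backed by a dummy data source (illustrative only).
	godistcache.NewGroup("scores", 2<<20, godistcache.GetterFunc(
		func(key string) ([]byte, error) {
			log.Printf("[SlowDB] loading %s", key)
			return []byte("value-for-" + key), nil
		}))

	// This node serves on :8001 and currently only knows about itself,
	// so every key is resolved locally.
	addr := "http://localhost:8001"
	pool := godistcache.NewHTTPPool(addr)
	pool.Set(addr)

	log.Println("GoDistCache node running at", addr)
	log.Fatal(http.ListenAndServe("localhost:8001", pool))
}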

Step 5: Handling the “Thundering Herd” (Singleflight)

In a distributed system, when a hot key is missing (evicted or never loaded), thousands of requests can miss the cache at the same instant and all fall through to the database, which can take it down.

Singleflight ensures that for a given key, only one request is in flight to the source of truth (DB or remote node) at a time. All other concurrent requests wait for that one result.

Create singleflight/singleflight.go:

package singleflight

import "sync"

// call is an in-flight or completed Do call
type call struct {
	wg  sync.WaitGroup
	val interface{}
	err error
}

// Group manages calls to singleflight
type Group struct {
	mu sync.Mutex       // protects m
	m  map[string]*call // lazily initialized
}

func (g *Group) Do(key string, fn func() (interface{}, error)) (interface{}, error) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		g.mu.Unlock()
		c.wg.Wait() // Wait for the existing request to finish
		return c.val, c.err
	}
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()
	c.wg.Done()

	g.mu.Lock()
	delete(g.m, key)
	g.mu.Unlock()

	return c.val, c.err
}
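
To see the coalescing at work, here is a test sketch (a hypothetical singleflight/singleflight_test.go): one hundred goroutines ask for the same key at once, yet the expensive loader runs essentially once because later callers simply wait on the first in-flight call.

package singleflight

import (
	"sync"
	"sync/atomic"
	"testing"
	"time"
)

func TestDoCoalescesConcurrentCalls(t *testing.T) {
	var g Group
	var calls int32

	fn := func() (interface{}, error) {
		atomic.AddInt32(&calls, 1)
		time.Sleep(50 * time.Millisecond) // simulate a slow DB lookup
		return "result", nil
	}

	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			v, err := g.Do("hot-key", fn)
			if err != nil || v.(string) != "result" {
				t.Errorf("unexpected result: %v, %v", v, err)
			}
		}()
	}
	wg.Wait()

	// All 100 goroutines overlap with the 50ms call, so fn should have
	// executed far fewer than 100 times (typically exactly once).
	if got := atomic.LoadInt32(&calls); got >= 100 {
		t.Errorf("expected coalescing, fn ran %d times", got)
	}
}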

Step 6: The Main Group Logic

Now we assemble the pieces. We need a Group struct that coordinates the cache, the peers, and the data source loader.

Create godistcache.go in the project root:

package godistcache

import (
	"fmt"
	"godistcache/singleflight"
	"log"
	"sync"
)

// A Getter loads data for a key.
type Getter interface {
	Get(key string) ([]byte, error)
}

type GetterFunc func(key string) ([]byte, error)

func (f GetterFunc) Get(key string) ([]byte, error) {
	return f(key)
}

type Group struct {
	name      string
	getter    Getter
	mainCache cache
	peers     PeerPicker
	loader    *singleflight.Group
}

var (
	mu     sync.RWMutex
	groups = make(map[string]*Group)
)

func NewGroup(name string, cacheBytes int64, getter Getter) *Group {
	if getter == nil {
		panic("nil Getter")
	}
	mu.Lock()
	defer mu.Unlock()
	g := &Group{
		name:      name,
		getter:    getter,
		mainCache: cache{cacheBytes: cacheBytes},
		loader:    &singleflight.Group{},
	}
	groups[name] = g
	return g
}

func GetGroup(name string) *Group {
	mu.RLock()
	g := groups[name]
	mu.RUnlock()
	return g
}

func (g *Group) RegisterPeers(peers PeerPicker) {
	if g.peers != nil {
		panic("RegisterPeers called more than once")
	}
	g.peers = peers
}

func (g *Group) Get(key string) (ByteView, error) {
	if key == "" {
		return Byte