Architecture Patterns: Caching (Part 2)

Cache warming and cache stampede

Kislay Verma
5 min read · Feb 26, 2022

In part 1 of this series, we looked at the different types of caches and the various ways they can be used to scale up applications. Now let’s look at some nuances of using caching.

Scaling caches

Like any other part of the system design, caches come under load as the scale of the application increases. External caches are servers like any other and can buckle under the read/write traffic sent their way. Even in-memory caches can suffer degraded performance due to read locking if too many application threads try to access them, though this is much harder to hit and easier to mitigate. Let's look at some problems and solutions in scaling caches.
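One common way to mitigate lock contention on an in-memory cache is lock striping: shard the entries across several independent locks so that threads reading different keys rarely block each other. Here is a minimal sketch in Python; the class and its names are made up for illustration, not taken from any particular library.

```python
import threading

class StripedCache:
    """Illustrative in-memory cache using lock striping: entries are
    sharded across several locks so concurrent accesses to different
    keys rarely contend on the same lock."""

    def __init__(self, num_stripes=16):
        # Each stripe is an independent dict guarded by its own lock.
        self._stripes = [
            {"lock": threading.Lock(), "data": {}}
            for _ in range(num_stripes)
        ]

    def _stripe(self, key):
        # Hash the key to pick a stripe; different keys usually land
        # on different stripes, spreading contention across locks.
        return self._stripes[hash(key) % len(self._stripes)]

    def get(self, key, default=None):
        stripe = self._stripe(key)
        with stripe["lock"]:
            return stripe["data"].get(key, default)

    def put(self, key, value):
        stripe = self._stripe(key)
        with stripe["lock"]:
            stripe["data"][key] = value
```

With 16 stripes, sixteen threads touching sixteen different keys can, in the best case, proceed without ever waiting on each other, whereas a single global lock would serialize them all.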

Scaling to more traffic

A cache is essentially a data store, and the problem of scaling for traffic is a well-known one in the database domain. Rising traffic can cause scalability problems by increasing the CPU usage or by choking the network bandwidth available to a server.

The most straightforward way to scale for an increase in traffic is to have multiple servers that can serve it. In databases, we typically configure a master-slave (aka leader-follower) topology where all writes go to a single server, which replicates them across all the other servers. This way, every server has all the data, and the application can connect to any of them to read it. This reduces the load on…
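The leader-follower idea above can be sketched in a few lines. This is a deliberately toy model with hypothetical names: the leader replicates each write synchronously to every follower (real systems usually replicate asynchronously), and reads are spread across all nodes, which is exactly why read capacity grows with the number of replicas.

```python
import random

class ReplicatedStore:
    """Toy leader-follower topology: all writes go to the leader,
    which copies them to every follower; reads can be served by
    any node. Names are illustrative, not from a real library."""

    def __init__(self, num_followers=2):
        self.leader = {}
        self.followers = [{} for _ in range(num_followers)]

    def write(self, key, value):
        # Writes always hit the leader first...
        self.leader[key] = value
        # ...which then replicates them to every follower.
        # (Synchronous here for simplicity; production databases
        # typically replicate asynchronously, trading freshness
        # on the followers for lower write latency.)
        for follower in self.followers:
            follower[key] = value

    def read(self, key):
        # Any node holds the full data set, so any node can serve
        # the read -- adding followers adds read capacity.
        node = random.choice([self.leader] + self.followers)
        return node.get(key)
```

Adding a follower to this topology increases read throughput but does nothing for writes, since every write still flows through the single leader.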
