Is Caching Still Necessary?

A look at why putting a cache in front of a database usually has limited effectiveness, and some guidelines on when caching is actually a good thing.

May 24th, 2024 6:28am by Behrad Babaee


Image from kumachenkova on Shutterstock.

As I discussed in a previous article on caching, introducing a caching layer in front of a database, whether external or internal, has limited effectiveness in improving the performance of an application suffering from slow data access. The key point to remember is that end-user functionalities typically require multiple database accesses.

For caching to improve the end-user experience, all these database queries must be served from the cache. Therefore, unless the cache hit ratio is exceptionally high, caching is unlikely to be beneficial.
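
A minimal sketch of why the hit ratio has to be so high. It assumes, purely for illustration, that every query belonging to one end-user request hits the cache independently and with the same probability; real access patterns are usually more correlated than that. The function name and the sample numbers are hypothetical.

```python
def all_hits_probability(hit_ratio: float, queries_per_request: int) -> float:
    """Chance that every query of one end-user request is served from the cache.

    Assumes (for illustration only) that queries hit the cache independently.
    """
    return hit_ratio ** queries_per_request


for hit_ratio in (0.80, 0.90, 0.99):
    for n in (1, 5, 20):
        p = all_hits_probability(hit_ratio, n)
        print(f"hit ratio {hit_ratio:.0%}, {n:>2} queries: {p:.1%} of requests fully cached")
```

Under these assumptions, a 90% hit ratio with 20 queries per request leaves only about 12% of requests that never touch the database.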

You might wonder why most database technologies include an internal caching layer, or why putting a caching technology in front of a database is a common practice. The short answer is that caching effectively improves throughput, but not latency.

Throughput vs. Latency

In theory, throughput and latency are independent. This means it is possible to have a system with massive throughput but incredibly slow response time. For example, I could fill 1 petabyte of disks with the requested information and ship them to the client overnight. In this scenario, the system’s throughput would be an impressive 11.6 gigabytes per second (1 petabyte/(24 hours * 3600 seconds)). However, the latency would be a dismal one day.

In practice, however, insufficient throughput can significantly affect latency. Revisiting the previous example, let’s consider that the client now requests 10 petabytes of information, but we only have 1 petabyte of disks available for shipment. Consequently, the delivery process would span multiple days: It would take one day to ship the first petabyte and another day for the client to return the disks. This cycle would repeat until the 19th day, when all the data is finally shipped. Therefore, the lack of throughput would effectively increase the response time by a factor of 19.
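
The arithmetic in this example can be checked with a short sketch. It uses decimal units (1 petabyte = 10^15 bytes) and assumes one day per shipment plus one day for each return of the disks, as described above; the function names are made up for illustration.

```python
PETABYTE = 10**15            # bytes, decimal units
SECONDS_PER_DAY = 24 * 3600


def shipping_throughput_gb_per_s(disk_capacity_bytes: int) -> float:
    """Throughput of one overnight shipment, in gigabytes per second."""
    return disk_capacity_bytes / SECONDS_PER_DAY / 10**9


def delivery_days(request_pb: int, disk_capacity_pb: int) -> int:
    """Days until the last byte arrives when one set of disks is reused.

    Each shipment takes one day; every shipment except the last is
    followed by one day for the client to return the disks.
    """
    shipments = -(-request_pb // disk_capacity_pb)  # ceiling division
    return shipments + (shipments - 1)


print(f"{shipping_throughput_gb_per_s(PETABYTE):.1f} GB/s")  # 11.6 GB/s
print(delivery_days(1, 1))    # 1
print(delivery_days(10, 1))   # 19
```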

Increasing Throughput With a Caching Layer

Building on the previous example, introducing a caching layer is similar to setting up a local hub engineered to store 90% of the data a customer is likely to request. This hub is equipped to dispatch up to 9 petabytes of data per shipment, enabling deliveries to be completed within just one hour.

To deliver 10 petabytes of data to the customer, 9 petabytes can be shipped from the local hub within an hour, and the remaining 1 petabyte will be delivered from the main storage on the following day. By implementing a local hub, we have increased the capacity per shipment tenfold and cut the response time by a factor of 19, from 19 days to one.

It’s crucial to understand that this dramatic improvement in response time is attributable solely to the increased throughput. The hub’s faster one-hour delivery plays no part in it, because the request is complete only when the final petabyte arrives from the main storage a day later.
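
The hub scenario can be modeled with a rough sketch. The assumptions are mine, for illustration: the hub's share arrives within the hour and is counted as day 0, the remainder ships from main storage on reusable disks (one day out, one day back), and the hub share is given as a whole-number percentage to keep the arithmetic exact. The function name is hypothetical.

```python
def response_days(request_pb: int, disk_capacity_pb: int, hub_percent: int) -> int:
    """Days until the whole request has arrived.

    The hub serves `hub_percent` of the request within the hour (day 0);
    the rest ships from main storage, one day per shipment plus one day
    for each return of the disks between shipments.
    """
    # Petabytes the hub cannot serve, scaled by 100 to stay in integers.
    remainder_x100 = request_pb * (100 - hub_percent)
    if remainder_x100 == 0:
        return 0  # everything arrives from the hub within the hour
    shipments = -(-remainder_x100 // (100 * disk_capacity_pb))  # ceiling division
    return 2 * shipments - 1


print(response_days(10, 1, 0))    # 19: no hub
print(response_days(10, 1, 90))   # 1: hub holds 90%
print(response_days(10, 1, 99))   # 1: still bounded by the one-day shipment
print(response_days(10, 1, 100))  # 0: nothing left to ship from main storage
```

In this model the response time stays at one day for any hub share below 100%, because the last shipment from main storage always takes a day.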

To Cache or Not To Cache

So far, we have learned that adding a cache improves latency if and only if the sluggishness is due to insufficient throughput. In the example above, if the customer is unhappy with the one-day response time, increasing the shipment disk capacity or the percentage of data stored on the local hub to anything less than 100% will not enhance the response time.

To determine whether a cache can be beneficial, it’s essential to consider the database’s algorithms and data structures, the hardware it operates on and the application’s data access patterns. Consequently, there is no one-size-fits-all answer. Instead of settling for the vague and unsatisfying response of “it depends,” I aim to provide a detailed analysis in the following sections, outlining both the advantages and disadvantages of caching.

The Case Against Caching

I begin by presenting the case against caching because, given the capabilities of today’s commodity hardware, caching should generally be unnecessary.

This raises the question: Why do nearly all well-known database technologies still include an internal cache?

The answer is quite fascinating, requiring an exploration of numerous details. For now, consider this counterexample: Aerospike, a database that operates without a cache, manages to equal or even surpass the performance of technologies that store part or all of their data in memory. This demonstrates that caching is not indispensable for achieving optimal performance.

I agree that caching has historically been synonymous with high performance, and for many, the notion of a database operating without a caching layer seems inconceivable. However, the capabilities of modern off-the-shelf hardware, coupled with shifts in software development practices in recent years, have dramatically transformed the landscape.

For example, modern off-the-shelf solid-state drives (SSDs) can now achieve throughputs of 12 to 14 gigabytes per second — approximately 60 times faster than the spinning disks that were common a decade ago. This significant advancement is particularly noteworthy given that the clock speeds of our CPUs and the frequencies of our memory have largely remained unchanged during this period.

On the other hand, modern software applications run on the cloud, and depend on inter-component communication across networks that typically offer up to 12.5 gigabytes per second of bandwidth (100 gigabits per second). However, this figure is merely theoretical. In practice, inefficiencies in our network stack, including packet and frame size overheads, back-offs, and other factors, prevent us from using even a third of this capacity.

These changes are significant for two key reasons. First, in modern applications, the network, rather than the disk, has become the slowest component of the stack. Second, the performance gap between memory and disk has substantially narrowed; while disks were previously two to three orders of magnitude slower than memory, they are now only about an order of magnitude slower.
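
The figures quoted above can be put side by side with a few lines of arithmetic. This sketch uses only the numbers from the text (100 gigabits per second, at most a third of it usable, SSDs at 12 to 14 gigabytes per second, roughly 60 times faster than spinning disks); the variable names are illustrative.

```python
NETWORK_GBIT_PER_S = 100       # nominal link speed quoted in the text
USABLE_NETWORK_SHARE = 1 / 3   # upper bound on the usable share, per the text
SSD_GB_PER_S = (12, 14)        # modern SSD throughput range from the text
SSD_VS_HDD_SPEEDUP = 60        # SSDs are roughly 60x faster than spinning disks

network_gb_per_s = NETWORK_GBIT_PER_S / 8  # 8 bits per byte
usable_network_gb_per_s = network_gb_per_s * USABLE_NETWORK_SHARE
hdd_mb_per_s = [gb * 1000 / SSD_VS_HDD_SPEEDUP for gb in SSD_GB_PER_S]

print(f"nominal network: {network_gb_per_s:.1f} GB/s")
print(f"usable network:  at most {usable_network_gb_per_s:.2f} GB/s")
print(f"implied spinning disk: {hdd_mb_per_s[0]:.0f} to {hdd_mb_per_s[1]:.0f} MB/s")
print(f"one SSD vs usable network: {SSD_GB_PER_S[0] / usable_network_gb_per_s:.1f}x")
```

On these numbers, a single SSD at 12 gigabytes per second already delivers nearly three times what the network can carry in practice.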

For these reasons, deploying a cache in front of a database, whether internal or external, is often inefficient:

External cache: The cache must be accessed over a network, which typically provides significantly lower throughput compared to direct memory access. This arrangement can lead to underutilization of the RAM’s performance capabilities.
Internal cache: Modern computers typically incorporate multiple disks that collectively deliver throughput far exceeding what the network can handle. Therefore, the additional throughput gained from an internal cache does not necessarily translate into enhanced performance.

As highlighted at the beginning of this section, if a database can fully use all the disk throughput available to it, there is no need to cache the data in memory.

The Case in Favor of Caching

At this point, you might be thinking that I have a bias against caches! Nothing could be further from the truth. Let me provide some guidelines on when caching is actually effective:

Storing results of computations or transformations: Data retrieval sometimes involves computational operations or transformations that demand additional CPU cycles. Caching the results of these computations or transformations can effectively increase the application’s computational bandwidth, enhancing overall performance.
High throughput on a small dataset: Consider a scenario where you need to manage 400 gigabytes of data but require a throughput equivalent to 10 disks. In such cases, using an in-memory database might be a more effective solution. However, it’s crucial to remember that in-memory databases are volatile. If the data is critical, a storage-backed in-memory database is needed to safeguard against data loss.
Improving performance for a sequence of requests: While sequential access should ideally be avoided, it is sometimes inevitable. Because the latencies of sequential requests add up, every cache hit shortens the total, so a cache can enhance the user experience even with a very low hit rate.
Improving data locality: Caching data closer to users may significantly reduce networking costs if the source is located far away. For example, the static components of a website can be cached closer to the customer to reduce costs and minimize the unreliability associated with transferring data across continents over the public internet.
Eliminating network latency: Setting up a local cache on the application server can completely remove network latency, enhancing performance.
Using surplus memory: Many applications do not require significant amounts of memory, yet servers are often equipped with ample memory. Employing this surplus memory for caching can be advantageous. I want to emphasize that caches generally do not cause harm; it is the cost-effectiveness that often tips the scale against them. As long as you do not solely rely on adding more RAM as your performance-tuning strategy, leveraging excess memory for caching is a decent approach.
Using cache as an in-memory database: For applications that consistently rely on a specific portion of data, such as data from the last week or month, consider using a cache as an in-memory database to keep this frequently accessed data readily available.
Caching data locally in the process: A slight detour here — local caching isn’t directly relevant to the focus of this article or the previous one. I just aim to clarify for readers the distinction between caching technologies and local caching.
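
As an illustration of the first guideline, storing the results of computations, here is a minimal sketch using Python's standard `functools.lru_cache`. It memoizes in-process only because that is the simplest form to demonstrate; the same idea applies to results kept in an external cache. The `monthly_summary` function and its placeholder computation are hypothetical.

```python
from functools import lru_cache

compute_calls = 0  # counts how often the expensive path actually runs


@lru_cache(maxsize=1024)
def monthly_summary(customer_id: int) -> int:
    """Hypothetical transformation standing in for CPU-heavy work."""
    global compute_calls
    compute_calls += 1
    # Placeholder computation; a real one might aggregate or render data.
    return sum(i * i for i in range(customer_id * 1000))


for customer_id in (7, 7, 7, 42, 7):
    monthly_summary(customer_id)

print(compute_calls)                       # 2: one computation per distinct customer
print(monthly_summary.cache_info().hits)   # 3
```

The three repeated requests for customer 7 reuse the stored result, so the CPU cycles are spent once per distinct input rather than once per request.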

Wrapping Up

For general use cases, consider using a modern database like Aerospike, which efficiently uses disk throughput. This would eliminate the need to spend excessive money and resources on technologies that require substantial memory for caching. Beyond providing basic database functions, it can also be configured as an in-memory database, storage-backed in-memory database, in-memory cache or on-disk cache. This adaptability means that if your use case would benefit from caching, Aerospike can accommodate that requirement as well.

Aerospike version 7.1 introduces precision least recently used (LRU) cache eviction within the database core, expanding its ability to drive enterprise-grade in-memory caching use cases. Learn more at Aerospike.com.

Behrad Babaee, the principal architect at Aerospike, brings over a decade of experience in modern data storage technologies, including databases, caches and streaming platforms. A recognized thought leader, his work spans several widely adopted, scalable applications and major database technologies...