Improve API Performance with Caching

by Jeff Costa

When it comes to API caching, we’ve found that many developers want a deeper understanding of how it works and the benefits it provides. We’ve also observed that adoption of API caching has been limited by a lack of tools to define and manage caching rules for each API. Historically, it’s been impractical to establish caching rules across multiple APIs because the ability to cache responses varies widely from one API to the next.

This blog post will review several different implementation scenarios to help you understand the benefits of API response caching. In case you’re already wondering what we found as we explored these scenarios, let’s give you a key result:

Based on the test methodology described below, we observed that the test API was delivered 21% faster when cached.

In addition, we’ll highlight in this blog post how our new Akamai API Gateway empowers API developers with the tools they need to tailor independent caching policies for every API. In short, API Gateway dramatically simplifies the process of taking advantage of API caching so you can deliver more data in the same amount of time.

Key caching scenarios to consider:  

  • Locally with no application-level caching
  • Locally with application-level caching
  • IaaS (Digital Ocean) with no application-level caching
  • IaaS with application-level caching
  • Akamai with no application-level caching
  • Akamai with application-level caching
  • Akamai caching and no application-level caching
  • Akamai caching with application-level caching

We will be using the Siege load-testing and benchmarking tool to generate requests against a sample API. If you’re unfamiliar with Siege, it’s an open-source tool used to determine the load an application can withstand when receiving traffic spikes. It has more nefarious uses as well, such as adversaries attempting to DDoS a server with volumetric traffic. Setting aside the more malicious use cases, it’s a great tool to stress-test single or multiple API URLs.

A Siege run looks like the screen capture below, with request after request sent to a given URL:

The API we will use for our testing is a simple NodeJS API I wrote, backed by MongoDB, that returns adoptable cat data. After all, this is the Internet and the Internet loves cats, right? It’s a basic Express-based API using mongoose, bodyParser, etc.:

We will be testing a simple GET request that returns a 6.59 KB, 227-line JSON response of adoptable cat data:

Below are the headers returned by running the GET command. Note that we are keeping the connection to the server alive and that GZIP is enabled to reduce payload size:

For our first test, let’s run Siege with 5 concurrent users for 5 minutes from localhost against an Express server running on localhost—no network involved. We want to keep any network-related delays out of the equation to set a baseline. The Siege command line invocation tells Siege to run for 5 minutes while simulating 5 concurrent users making GET requests:

$ siege -c 5 --time=5m --content-type "application/json" GET http://localhost:3000/cats

The results of the first Siege run include data on number of hits (transactions), how long the test ran, how much data was transferred, transaction rate, and the number of successful and failed transactions:

This will serve as our (somewhat artificial) baseline.
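To make concrete what that invocation asks for, here is a rough sketch of a Siege-style run in NodeJS: a fixed number of simulated users, each sending one request after another until the time limit expires. This is not how Siege is implemented. The names `runLoadTest` and `sendRequest` are hypothetical, and the caller supplies the function that actually makes the request.

```javascript
// Rough sketch of a Siege-style run: N concurrent "users" each loop,
// issuing one request after another until the time limit expires.
// `sendRequest` is a hypothetical async function supplied by the caller;
// it should resolve on success and throw (reject) on failure.
async function runLoadTest({ concurrency, durationMs, sendRequest }) {
  const startedAt = Date.now();
  const deadline = startedAt + durationMs;
  let successful = 0;
  let failed = 0;

  // One simulated user: keep requesting until the deadline passes.
  async function simulateUser() {
    while (Date.now() < deadline) {
      try {
        await sendRequest();
        successful += 1;
      } catch (err) {
        failed += 1;
      }
    }
  }

  // Start all users at once and wait for every one of them to finish.
  const users = Array.from({ length: concurrency }, () => simulateUser());
  await Promise.all(users);

  const elapsedSeconds = (Date.now() - startedAt) / 1000;
  return {
    successful,
    failed,
    elapsedSeconds,
    // Successful transactions per second over the whole run.
    transactionRate: elapsedSeconds > 0 ? successful / elapsedSeconds : 0,
  };
}
```

With a recent NodeJS version that provides a global `fetch`, `sendRequest` could be something like `() => fetch("http://localhost:3000/cats").then((r) => { if (!r.ok) throw new Error("HTTP " + r.status); })`, with `concurrency: 5` and `durationMs: 5 * 60 * 1000` to mirror the run above. Siege reports considerably more detail than this sketch does.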

Now let’s add application-level caching into the mix. We are going to use the apicache NPM package to add simple, in-memory response caching to our app. After adding the package (“npm install apicache”), we add a few lines of configuration to enable it:

Now we turn it on for the single “get_all_cats” route. Note that the original route is commented out here, and a copy of that route is now using apicache with a directive to cache the response for 5 minutes:
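As a rough illustration of what a route-level response cache does internally, here is a minimal sketch written in the shape of Express middleware. It is not apicache's implementation: the name `cacheFor`, the Map-based store, and the choice to key on the request URL are all assumptions made for illustration.

```javascript
// Minimal in-memory response cache in the style of Express middleware.
// NOT apicache's implementation: `cacheFor`, the Map-based store, and
// keying on the request URL are assumptions made for illustration.
function cacheFor(ttlMs, now = () => Date.now()) {
  const store = new Map(); // url -> { body, expiresAt }

  return function cacheMiddleware(req, res, next) {
    // Only GET responses are cached in this sketch.
    if (req.method !== "GET") return next();

    const key = req.originalUrl || req.url;
    const hit = store.get(key);
    if (hit && hit.expiresAt > now()) {
      // Cache hit: answer from memory without running the route handler.
      return res.send(hit.body);
    }

    // Cache miss: wrap res.send so the handler's response is remembered.
    const originalSend = res.send.bind(res);
    res.send = (body) => {
      if (res.statusCode === 200) {
        store.set(key, { body, expiresAt: now() + ttlMs });
      }
      return originalSend(body);
    };
    next();
  };
}
```

In an Express app this would be mounted ahead of the route handler, for example `app.get("/cats", cacheFor(5 * 60 * 1000), getAllCats)`, where `getAllCats` is a hypothetical handler name. The apicache package exposes the same idea as `apicache.middleware("5 minutes")`.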

We will swap between these two routes as we enable and disable caching for the various test runs later in this article. The NodeJS server will now be restarted to clear memory of anything that could confound our tests. If you make a request against the server, it reveals that response caching is indeed enabled by the presence of a new header: cache-control. This header’s max-age value is set at 300 seconds (the 5 minutes we specified in the app):
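If you would rather check that header in code than by eye, a small helper can pull the max-age value out of it. This is a sketch only; `maxAgeSeconds` is a hypothetical name, and it handles just the simple unquoted form of the directive.

```javascript
// Extract max-age (in seconds) from a Cache-Control header value.
// Returns null when the directive is absent or not a plain integer.
// Quoted values such as max-age="300" are not handled in this sketch.
function maxAgeSeconds(cacheControl) {
  if (typeof cacheControl !== "string") return null;
  for (const directive of cacheControl.split(",")) {
    const [name, value] = directive.trim().split("=");
    if (name.toLowerCase() === "max-age" && /^\d+$/.test(value || "")) {
      return Number(value);
    }
  }
  return null;
}
```

For the response described above, `maxAgeSeconds("max-age=300")` yields 300, matching the five minutes configured in the app.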

You may also turn on debugging for the apicache module to verify that it is working:

Note the first request took 31ms to return data to the client, but subsequent requests did not take nearly that long. The first request is a full request, but all requests after that were served out of memory cache, and returned to the client in less than 0.5ms. This is the core benefit of caching.

Now let’s re-run the exact same Siege command as we did the first time to see if we get any difference with the application-based caching:  

$ siege -c 5 --time=5m --content-type "application/json" GET http://localhost:3000/cats

Wow! Application-based caching has made a HUGE difference. In the same five minutes of elapsed time, we served roughly 2.5 times as many hits with caching turned on. We transferred more data doing it, but were able to process 449 transactions per second, up from the original 179, an increase of about 150%! This is a pretty sizable bang for our buck by just adding caching to the API.

Now let’s add the network into the equation. For the purpose of this article, I created a Digital Ocean Droplet server with 1GB of RAM, 1 vCPU, and a 25GB SSD running in their NYC data center. This is the minimum hardware level Digital Ocean offers for a virtual machine, and I chose it deliberately; this will allow us to see what caching brings to an API in a hardware-constrained environment. I deployed the NodeJS cats API to the Droplet server with Nginx in front to proxy requests from port 80 to NodeJS’s port 3000. There is also a local instance of MongoDB running that acts as the backend for the API. No other software is running on the server.

Let’s make a request to the server to review the response headers it sends back. You can see by the presence of the Nginx banner that we are no longer running on localhost:

Now, let’s re-run the Siege command against the Digital Ocean server (by IP address) with the identical Siege command-line invocation:

$ siege -c 5 --time=5m --content-type "application/json" GET

Below is the result of the Siege run against the Digital Ocean VPS with no API response caching:

Note the changes from running this test locally to running it against an IaaS provider:

  • The number of hits and transactions drops by 35%
  • We start seeing failed transactions for the first time

Thirty-five percent is a pretty big haircut, but somewhat expected when running code on a multi-tenant cloud provider. Now let’s see what enabling apicache can do for us at Digital Ocean. We enable response caching in the API as described previously. To confirm that it is working at Digital Ocean, let’s make a single request to the API and check the response:

Again we see the presence of the cache-control header with a value of 5 minutes (i.e., 300 seconds), so let’s re-run the same Siege command against the Digital Ocean server with application caching enabled:

$ siege -c 5 --time=5m --content-type "application/json" GET

WOW! This time, enabling caching lets us complete 4,932 MORE transactions and serve them at a 21% faster rate (80/sec vs. 63/sec). We delivered more data in the same amount of time, as the server did not have to regenerate the response every time. This clearly demonstrates the value of enabling caching for API responses wherever possible.
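Percentages like these depend on which figure is taken as the baseline. As a quick worked check that uses only the transaction rates stated in this post (the function name `compareRates` is hypothetical):

```javascript
// Relative change between two throughput figures, as percentages.
function compareRates(uncached, cached) {
  const diff = cached - uncached;
  return {
    // How much higher the cached rate is, relative to the uncached rate.
    increasePct: (diff / uncached) * 100,
    // How much lower the uncached rate is, relative to the cached rate.
    shortfallPct: (diff / cached) * 100,
  };
}

const local = compareRates(179, 449); // localhost runs, transactions/sec
const droplet = compareRates(63, 80); // Digital Ocean runs, transactions/sec
```

Measured against the uncached rate, caching raised throughput by roughly 151% locally and 27% on the Droplet. Measured against the cached rate, the gaps are about 60% and 21%.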

Why should you cache API responses as I just showed you? Here are seven reasons:

  1. Most API operations are read operations.
  2. When data doesn’t change, why keep generating it?
  3. Cloud providers charge by IOPS and instance size. How much compute power could you turn off by caching instead?
  4. You can reduce network traffic.
  5. You can reduce database query/computation costs.
  6. You can serve a cached, stale response in the event of an origin outage.
  7. You can reduce reliance on rate limiting (i.e., don’t turn away valid API consumers!)
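Reason 6 is worth a quick illustration. Below is a minimal sketch, assuming a caller-supplied `fetchFromOrigin` function (a hypothetical name) that either returns fresh data or throws when the origin is unavailable. Fresh entries are served from memory, and an expired entry is still served if the origin cannot be reached.

```javascript
// Cache wrapper that serves a stale copy when the origin fails.
// `fetchFromOrigin` is a hypothetical async function that returns the
// fresh response body or throws when the origin is unavailable.
function createStaleOnErrorCache(fetchFromOrigin, ttlMs, now = () => Date.now()) {
  const entries = new Map(); // key -> { value, storedAt }

  return async function get(key) {
    const entry = entries.get(key);
    const isFresh = entry !== undefined && now() - entry.storedAt < ttlMs;
    if (isFresh) return { value: entry.value, source: "cache" };

    try {
      const value = await fetchFromOrigin(key);
      entries.set(key, { value, storedAt: now() });
      return { value, source: "origin" };
    } catch (err) {
      // Origin is down: fall back to the expired copy if one exists.
      if (entry !== undefined) return { value: entry.value, source: "stale" };
      throw err;
    }
  };
}
```

The `source` field is only there to make the behavior visible. A real deployment would also want an upper bound on how long a stale copy may be served.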

Your next question may be: “WHAT should I cache?” Here are some general guidelines:

  • Any resource accessible via HTTP GET
  • Static data
  • Responses that are immutable
  • Responses that change infrequently or at predictable intervals
  • Responses used by many clients (frequently requested data)
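One way to express guidelines like these in code is a small rule table that gives each route its own time-to-live. The sketch below is illustrative only: the paths and TTL values are invented, and `ttlFor` is a hypothetical helper, not part of any product or library.

```javascript
// Hypothetical per-route caching rules: each API gets its own TTL.
// The paths and TTL values below are invented for illustration only.
const cacheRules = [
  { pattern: /^\/cats\/breeds$/, ttlSeconds: 86400 }, // static reference data
  { pattern: /^\/cats$/, ttlSeconds: 300 }, // changes infrequently
];

// Return the TTL in seconds for a request, or 0 when it should not be cached.
function ttlFor(method, path) {
  if (method.toUpperCase() !== "GET") return 0; // only reads are cached
  const rule = cacheRules.find((r) => r.pattern.test(path));
  return rule ? rule.ttlSeconds : 0;
}
```

Anything without a matching rule is left uncached, so a new route is never cached by accident.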

Since cacheability will vary widely across your portfolio of APIs, it’s critical to implement your caching strategy on an individual API basis. One size does NOT fit all!  And this is where Akamai API Gateway really shines: One of the extraordinary things about API Gateway is that it empowers each API developer to define caching rules on an independent basis.  With API Gateway, there’s no dependency on a set of domain-level caching rules, so each API developer has full autonomy to optimize delivery of their content.

This is the first of two posts about API caching and API Gateway; I encourage you to read the second post, which builds on the information above and discusses how to improve your application-caching results with API Gateway. Also, check the API Gateway home page for lots more helpful information.

Jeff Costa is a senior product manager at Akamai Technologies.

Categories: APIs
