Use “Cache Prefreshing” in API Gateway for Extremely Time-Pressured API Calls

November 6, 2019 · by Tim Vereecke ·

Many dynamic API calls, such as payment transactions, are somewhat time-pressured and must complete within several seconds. But other API calls are even more critical and are what we call “ultradynamic.” These calls require extremely rapid refresh times (usually less than 100 milliseconds) to ensure that fast-changing data is updated quickly for an API client. For example, consider sports scores:

[Image: An API returning sports scores where sub-second freshness is crucial]

How important is it to fans of either team that these scores be as close to real-time as possible? For websites whose business is based on frequent fan traffic, this is a business-critical question.

Or consider an online retail product website where a customer sees this screen:

[Image: An API returning dynamic pricing and out-of-stock information]

If a customer tries to buy a product that says “in stock” based on the API call, but the product is actually out of stock, how frustrated would they be? Would their buying behavior change? Would they jump immediately to a competitor site? For an online retailer, these are business-critical concerns.

In addition to sports and retail inventory, ultradynamic API calls can also include:

  • Breaking news 
  • Mass transportation schedules, such as train timing or flight delays
  • Severe weather events
  • Social media commentary

So how can you best accommodate ultradynamic API calls?

Introducing “Cache Prefreshing” in Akamai API Gateway

For the scenarios above, and many more, you can now accelerate and offload your fast-changing API content using “Cache Prefreshing”—a new and powerful feature in Akamai API Gateway. Cache Prefreshing is a mechanism to refresh cached API responses prior to their full expiration time. (The term “prefresh” isn’t a real word, of course, but a mashup of “prior” and “refresh.”)

Let's take a look at how the caching process works today, without Cache Prefreshing. For fear of serving stale or out-of-date content, a fast-changing API call is often marked as not cacheable (i.e., no-store/bypass cache). Even super-short cache times are usually ruled out, for a few reasons:

  • A super-short cache time of just 1-2 seconds can result in showing out-of-date content. 
  • Super-short cache times still result in a percentage of users having to wait while fresh content is requested from the origin. This is even more evident during less-busy times of the day or if the content becomes less popular.

The “freshness” of the non-cacheable content in this scenario, however, comes at the expense of scalability, security, and performance. Here’s why: every single API call must go back to the origin to be regenerated and served. That increases latency and exposure to security threats, while reducing the number and geographical distribution of servers available to serve the content.

But with the introduction of Cache Prefreshing, there is now an alternative way for you to serve fresh (i.e., <1 second) content while maintaining solid performance, scalability, and reliability. Cache Prefreshing is, in short, the best of both worlds.

How Cache Prefreshing works

Now that we’ve reviewed the problem that prefreshing solves, let’s take a look at exactly how it works:

  1. An incoming request triggers prefreshing when the response in the Akamai cache is considered old enough (expressed as a percentage of the full, preset cache time).
  2. An API request triggering the prefresh is not delayed, as the existing response is served out of cache.
  3. When the cached response is asynchronously updated, the full cache time is reset once again.
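The steps above can be sketched in a few lines. This is a toy model for illustration only (the class and parameter names are mine, not Akamai's), with the asynchronous refresh simplified to an inline call:

```python
import time

class PrefreshCache:
    """Toy sketch of Cache Prefreshing -- not Akamai's implementation.

    ttl: full cache time in seconds.
    prefresh_pct: percentage of the TTL (e.g. 25) after which an
    incoming request triggers a refresh of the cached response.
    """

    def __init__(self, ttl, prefresh_pct, fetch_from_origin):
        self.ttl = ttl
        self.trigger_age = ttl * prefresh_pct / 100.0
        self.fetch = fetch_from_origin  # hypothetical origin callback
        self.cached = None
        self.stored_at = None

    def get(self, now=None):
        now = time.monotonic() if now is None else now
        if self.cached is None or now - self.stored_at >= self.ttl:
            # Cold or fully expired: the client must wait for the origin.
            self.cached = self.fetch()
            self.stored_at = now
            return self.cached, "MISS"
        if now - self.stored_at >= self.trigger_age:
            # Old enough: serve the existing copy immediately, refresh it
            # (inline here for simplicity; really async), and reset the TTL.
            response = self.cached
            self.cached = self.fetch()
            self.stored_at = now
            return response, "HIT+PREFRESH"
        return self.cached, "HIT"
```

Passing an explicit `now` makes the behavior easy to exercise with a fake clock; a real edge server would run the refresh in the background rather than inline, so the triggering request is never delayed.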

Let's look at this visually. Specifically, we’ll dive into two broad scenarios:

  • High-traffic scenario (which includes three different sub-scenarios)
  • Low-traffic scenario

High-traffic scenario

[Image: Sub-scenario 1: No-store, every single request goes to the origin]

The time-series shown above simulates an ultradynamic API resource being marked as no-store. In this sub-scenario, every single API call goes back to the origin, and it therefore has a performance penalty (tall red lines), along with a risk for scalability issues. The benefit of this approach is that the content is always fresh.

[Image: Sub-scenario 2: Caching for 2s]

Sub-scenario 2, shown above, illustrates what happens when you cache the same API resource with a time-to-live (TTL) of two seconds. Although a high percentage of the requests in this chart are much faster (short green lines), every two seconds one user has bad luck: they must wait for the origin to regenerate the response. There’s another risk here as well: even though the number of requests going back to the origin is dramatically reduced, the API response may be up to two seconds old (and thus outdated).

[Image: Sub-scenario 3: TTL=2s with 25% prefresh]

Sub-scenario 3, shown above, illustrates what happens when we cache for two seconds and also enable API Gateway’s Cache Prefreshing (set to 25% of the two-second TTL, or 500ms). The fourth request is served out of cache (short green line), asynchronously triggers the prefresh, and resets the TTL. In this sub-scenario, we maintain super-fast performance for all of our API calls, achieve sub-second freshness, and offload a large number of requests to the edge. In the broad, high-traffic scenario, this is how you achieve optimal performance, scalability, and reliability.
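The 500ms figure is simply the TTL scaled by the configured percentage; a tiny helper (the function name is illustrative, not part of API Gateway) makes the relationship explicit:

```python
def prefresh_trigger_age_s(ttl_s, prefresh_pct):
    """Cache age at which an incoming request starts triggering a prefresh."""
    return ttl_s * prefresh_pct / 100

# Sub-scenario 3: a 2-second TTL with a 25% prefresh setting.
print(prefresh_trigger_age_s(2.0, 25))  # 0.5 seconds, i.e. 500ms
```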

Low-traffic scenario

The cache refresh settings in sub-scenario 3 above were perfect for high-traffic API calls (e.g., live soccer games, popular pages on a news website, hot products currently on sale, etc.). Now let’s look at a low-traffic scenario to understand how prefreshing would work. 

During lower traffic periods (e.g., in the middle of the night, after the game, older content, etc.), the performance would drop again if we were still using the settings above. For example, suppose we only see one API request every 3-5 seconds. In our previous scenario of two-second caching, we would not see any benefit, and the performance might look like this:

[Image: Low-traffic scenario #1]

Also, a prefresh setting of 25% would not have helped here, because prefreshing never serves outdated content and never refreshes content on its own: it needs incoming requests to trigger it, and here the cached copy expires before the next request arrives. However, we can tune both the full TTL and the prefresh percentage to reach our performance goals. If we assume that during low-traffic periods the freshness of the content is less important, we can increase the TTL from two seconds to five seconds, and decrease the prefresh percentage from 25% to 10%.
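A quick check of the two configurations (the values come from the text; the dictionary layout is just for illustration) shows that both trigger the prefresh at the same 500ms age, while the longer TTL keeps a cached copy servable for up to five seconds:

```python
# Settings discussed above; names are illustrative, not API Gateway's schema.
profiles = {
    "high_traffic": {"ttl_s": 2.0, "prefresh_pct": 25},
    "low_traffic":  {"ttl_s": 5.0, "prefresh_pct": 10},
}

for name, p in profiles.items():
    trigger_s = p["ttl_s"] * p["prefresh_pct"] / 100
    # Both profiles compute a 0.5s trigger age.
    print(f"{name}: prefresh after {trigger_s}s, cache valid for {p['ttl_s']}s")
```

Keeping the trigger age constant while stretching the TTL is the key design choice: sparse requests can still be served from cache, and any uptick in traffic immediately restores the 500ms refresh cadence.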

With these settings, we would still maintain super-fast response times during low-traffic periods:

[Image: Low-traffic scenario #2]

As soon as traffic increases, the effective refresh interval returns to the expected 500ms:

[Image: Low-traffic scenario #3]

In short, you can create settings for cache refreshing that give you top performance in both high-traffic and low-traffic scenarios. Now, let’s measure that performance.
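As a rough sanity check on the scenarios above, here is a toy simulation (my own model, not Akamai's code) that assumes steady traffic of one request every 100ms and counts how many requests must wait on the origin under each strategy:

```python
def simulate(strategy, ttl_ms=2000, prefresh_pct=25, n_requests=100, interval_ms=100):
    """Count 'slow' requests (those that wait for the origin).

    strategy: 'no_store', 'ttl', or 'prefresh'. Toy model for illustration.
    """
    slow = 0
    stored_at = None                      # timestamp of the cached copy (ms)
    trigger_ms = ttl_ms * prefresh_pct // 100
    for i in range(n_requests):
        t = i * interval_ms               # one request every 100ms
        age = None if stored_at is None else t - stored_at
        if strategy == "no_store" or age is None or age >= ttl_ms:
            slow += 1                     # cold or expired: client waits on origin
            stored_at = t
        elif strategy == "prefresh" and age >= trigger_ms:
            stored_at = t                 # served from cache; refreshed in background
    return slow

for s in ("no_store", "ttl", "prefresh"):
    print(s, simulate(s))  # no_store 100, ttl 5, prefresh 1
```

Under this model, no-store makes all 100 requests wait, a plain 2s TTL makes one request wait every two seconds, and prefreshing makes only the very first (cold-cache) request wait.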

Measuring the performance impact of Cache Prefreshing

This section covers some real-world example timings of the performance impact that Cache Prefreshing has on ultradynamic API calls. Table 1 below shows the no-store version of the API call (as described in sub-scenario 1, above). Here, each API call goes back to the origin (a TCP_MISS, meaning that the object was not in the Akamai cache), showing a response time of ~90ms. Table 2 shows the performance gain when the same API call is served out of cache (a TCP_MEM_HIT, where the object was in the Akamai edge server's RAM), resulting in response times of ~25ms:

[Image: Table 1: An ultradynamic API call marked as no-store]
[Image: Table 2: An ultradynamic API call with cache prefresh triggered after 20% of the TTL (5s)]

Another example shows the impact as measured by real user monitoring (RUM) on an API call after switching from no-store to caching with prefresh enabled: median response times improved from ~50ms to ~20ms:

[Image: Dynamic banners chart]

The purpose of the API call in this example was to show relevant and very dynamic advertisements. Ideally, different content should be shown on each page view to reduce banner blindness. The performance of this ultradynamic API call is critical; an ad must be displayed before the user scrolls down. In this case, Cache Prefreshing was configured to trigger after only 1% of the set four-minute TTL. When a single user browses a site, they typically get fresh content every 2.4 seconds. And unless they stay longer than four minutes (i.e., full TTL) on one page, the next page will serve the advertisement call directly from cache.
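The 2.4-second figure follows directly from the stated settings; a quick check:

```python
ttl_s = 4 * 60             # four-minute TTL, in seconds
prefresh_pct = 1           # prefresh triggers after 1% of the TTL
refresh_interval_s = ttl_s * prefresh_pct / 100
print(refresh_interval_s)  # 2.4 seconds between refresh triggers
```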

Configuring Cache Prefreshing in API Gateway

Akamai API Gateway provides you with easy and granular control of the new Cache Prefreshing settings. You can configure it via Akamai Control Center or you can update the settings in a fully automated way via Akamai’s Endpoint Definition API.

Cache Prefreshing can easily be toggled on or off in API Gateway. When turned on, you instruct API Gateway when to trigger a prefresh of the content by setting a percentage of the full cache time, as indicated by the red arrow in the screenshot below. Then save the configuration (orange button in the lower right corner of the screen) and deploy it to the network.

[Image: Cache Prefreshing settings in API Gateway]

Note that you can define default settings at the API endpoint level, and optionally override these at the individual resource level for prefreshing, as seen in the following two API Gateway screenshots:

[Image: Default Cache Prefreshing settings at the API endpoint level]
[Image: Granular control of Cache Prefreshing at the resource level]


Cache Prefreshing is a powerful new feature in API Gateway: API calls that you might have previously considered to be completely uncacheable may now be cacheable, which improves your performance, scalability, and infrastructure offload. 

To try it for yourself, simply enable Cache Prefreshing in API Gateway. If you’re not yet using API Gateway, just register for the Akamai Developer Trial and take API Gateway for a free, 90-day test-drive.

Tim Vereecke is a web performance architect at Akamai Technologies.