The Akamai Intelligent Platform™ consists of many, many servers spread around the world, close to users. The platform notices when content is popular and keeps a local copy to give to the next user who requests it. Because the server communicating with the user is very close to them, that user tends to get a better experience. This is the main benefit of the process called content caching.

Caching is done using a pull model: Akamai does not know about objects for which we haven’t seen requests, and we do not push content out to our servers to pre-warm our caches. One slight exception to this is the Prefetching feature, where the servers examine the web page being downloaded or the video stream being requested and proactively make requests ahead of the user, so that the object is already in cache before the user gets around to requesting it. But this is still part of the pull model: we need a request for an object in order to learn about the related objects we should download, and the action is taken independently by each of our server deployments.

After downloading an object, the Akamai servers will save a copy of it. Servers within the same deployment are able to check each other’s caches using the Internet Cache Protocol (ICP). This means that the effective space for caching content is the sum of the storage space of all the servers in that deployment.
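As a simplified illustration of that peer check, a server can consult its deployment neighbors before going forward to the origin. This is not the ICP wire format (ICP is a small UDP query/reply exchange, followed by an HTTP fetch from the peer that answered); local_cache, peers, and fetch_from_origin are stand-in names for this sketch.

```python
# Simplified sibling-cache lookup within a deployment: check the local cache,
# then ask peer servers whether they hold the object, and only go forward to
# the origin if nobody has it. peer.lookup() stands in for a real ICP
# query/reply exchange plus the subsequent fetch from the answering peer.
def get_object(url, local_cache, peers, fetch_from_origin):
    obj = local_cache.get(url)
    if obj is not None:
        return obj                      # hit in this server's own cache
    for peer in peers:
        obj = peer.lookup(url)          # "do you have it?" query to a neighbor
        if obj is not None:
            local_cache.put(url, obj)   # keep a local copy for the next request
            return obj
    obj = fetch_from_origin(url)        # no server in the deployment has it
    local_cache.put(url, obj)
    return obj
```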

When our DNS servers reply to lookup requests, we generally return two IP addresses for any given deployment. Both of those servers will cache the objects they serve, so the effective cache space is half the total storage space of that deployment. This redundancy allows us to take machines out of service for installs or maintenance, or due to an outage, and still serve their objects without having to go forward to the origin. On a network with over 135,000 servers deployed in more than 2,200 locations around the world, it is important to ensure that these sorts of events are transparent to our customers.
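One way to picture why two addresses make these events transparent: whatever holds both addresses can simply try the next one if the first server is out of service, and that second server will already have the popular objects cached. The sketch below is a generic client-side failover with made-up addresses; it is not a description of how Akamai’s mapping system works internally.

```python
# Generic client-side failover between the two addresses returned for a
# deployment: if the first server is out for an install or maintenance, the
# second one is tried instead. Addresses and port are made up for illustration.
import socket

def connect_with_failover(addresses, port=80, timeout=2.0):
    last_error = None
    for addr in addresses:
        try:
            return socket.create_connection((addr, port), timeout=timeout)
        except OSError as exc:
            last_error = exc            # this server is unreachable; try the other
    raise ConnectionError("no server in the deployment is reachable") from last_error

# Example (documentation addresses): connect_with_failover(["192.0.2.10", "192.0.2.11"])
```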

As the cache on a server fills up, our software looks for the least recently used objects and evicts them to make room for new ones. Objects that are very popular remain in cache for a long time, while less popular objects are automatically pruned as requests for new objects arrive. Content is evicted only when requests for new objects arrive and there is no room left in the cache for them, or when the customer explicitly issues a purge request to have an object removed from the network.
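A minimal sketch of that eviction behavior for a single server, with capacity counted in bytes for illustration: nothing is removed until a new object needs room that is not available, eviction is least-recently-used first, and an explicit purge removes an object immediately. This illustrates the general LRU technique, not Akamai’s actual cache code.

```python
# Minimal LRU cache sketch, capacity counted in bytes purely for illustration.
# Objects are evicted only when a new object needs room, and purge() models a
# customer's explicit purge request.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.objects = OrderedDict()           # least recently used first

    def get(self, url):
        if url not in self.objects:
            return None
        self.objects.move_to_end(url)          # mark as most recently used
        return self.objects[url]

    def put(self, url, body):
        if url in self.objects:
            self.used -= len(self.objects.pop(url))
        while self.objects and self.used + len(body) > self.capacity:
            _, evicted = self.objects.popitem(last=False)   # evict the LRU object
            self.used -= len(evicted)
        self.objects[url] = body
        self.used += len(body)

    def purge(self, url):                      # explicit removal, e.g. customer purge
        if url in self.objects:
            self.used -= len(self.objects.pop(url))
```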

If the caches of the edge deployments are not sufficient to hold the majority of a customer’s content, we can add one or more layers of deployments between the edge and the origin to provide a Tiered Distribution hierarchy. Over time, the edge cache will contain the most popular objects, the first layer of tiered distribution will contain the most popular of what’s left, and so forth, until only the very least popular objects are retrieved from the origin.
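A rough sketch of that lookup path, reusing the LRU cache sketched above: an edge miss goes forward to its parent tier rather than straight to the origin, each layer keeps a copy on the way back, and only the last tier’s misses reach the origin. The class and parameter names here are hypothetical.

```python
# Sketch of a tiered-distribution lookup: an edge miss goes forward to its
# parent tier, and only the innermost tier's misses reach the origin. Each
# layer caches the object on the way back, so popular objects settle at the
# edge and less popular ones at the parent tiers.
class CacheTier:
    def __init__(self, cache, parent=None, origin_fetch=None):
        self.cache = cache                # e.g. the LRUCache sketched above
        self.parent = parent              # next tier toward the origin, if any
        self.origin_fetch = origin_fetch  # used only by the innermost tier

    def get(self, url):
        obj = self.cache.get(url)
        if obj is None:
            if self.parent is not None:
                obj = self.parent.get(url)    # go forward to the parent tier
            else:
                obj = self.origin_fetch(url)  # last resort: fetch from the origin
            self.cache.put(url, obj)          # keep a copy at this layer too
        return obj
```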