Understanding AWS CloudFront Content Caching

Introduction

Amazon CloudFront is a content delivery network (CDN) service from Amazon Web Services that speeds up websites by storing copies of their content at locations around the world. When you visit a website, the content is served from a location near you, so it loads faster.

But what are these nearby locations?

These nearby locations are called edge locations, and they are spread across the globe to cover most regions. According to AWS, there are over 410 points of presence (PoPs), including more than 400 edge locations and 13 regional mid-tier caches, in over 90 cities across 48 countries.

But how do edge locations function?

Each edge location consists of multiple nodes. You can think of a node as a virtual server that holds data in the form of a cache. Each node caches data for a certain amount of time, depending on a few parameters.

Whenever content is requested, the request is routed to the edge location closest to where it originated. CloudFront then checks whether the requested content is cached at that edge location. If it is, the content is served from the cache.

If the content is not cached, the request is forwarded to the origin servers where the content lives. When the origin responds, the response is cached at the edge location and then returned to the user. The next time a user requests the same content, the cached copy is served instead. This caching mechanism reduces latency for the user.
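The hit-or-miss flow described above can be sketched as a toy model in Python. This is only an illustration, not how CloudFront is implemented: the `EdgeCache` class, the `fetch_from_origin` callback, and the single fixed TTL are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one edge location: path -> (body, expiry time)."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], str],  # hypothetical origin call
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}

    def get(self, path: str) -> Tuple[str, str]:
        """Return (body, "Hit" or "Miss") for the requested path."""
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[1] > now:
            # A fresh copy exists at the edge: serve it without touching the origin.
            return entry[0], "Hit"
        # Missing or expired: go to the origin, cache the response, then return it.
        body = self._fetch(path)
        self._store[path] = (body, now + self._ttl)
        return body, "Miss"


def demo_origin(path: str) -> str:
    """Stand-in for the real origin server."""
    return f"content of {path}"


cache = EdgeCache(demo_origin, ttl_seconds=60)
print(cache.get("/index.html"))  # first request goes to the origin
print(cache.get("/index.html"))  # second request is served from the cache
```

The first call reports a miss and the second a hit, which mirrors the two cases in the text: only the first request for a piece of content pays the cost of a round trip to the origin.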

How long is content cached on the edge locations?

How long CloudFront keeps content cached depends on the factors below.

  1. Cache policy

    AWS CloudFront has a concept called a cache policy. Among other settings, a cache policy defines how long content is cached on the edge location.

    When we create a cache policy, we provide default, minimum, and maximum TTL (time to live) values. These TTL values define how long the content stays cached on the edge locations.

  2. Number of requests

    This parameter is discussed less often but plays an important role. Let's assume we have configured a cache policy for our CloudFront distribution that caches content for 1 hour. However, traffic on the distribution is very low, with hardly 10 requests. This is where this parameter comes into the picture.

    If content is requested very rarely, CloudFront may evict it from the edge location before its TTL expires to make room for more frequently requested content, even though a cache policy is set. In effect, rarely requested content may not be served from the cache.
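One way to read the three TTL values from the cache policy is as a floor, a fallback, and a ceiling. The sketch below is a simplified model under that assumption. The function name `effective_ttl` and its parameters are hypothetical, and the model ignores details such as `no-store`, `private`, and the `Expires` header. The result is also an upper bound rather than a guarantee, since low traffic can shorten the time content actually stays at the edge.

```python
from typing import Optional


def effective_ttl(
    origin_ttl: Optional[int], min_ttl: int, default_ttl: int, max_ttl: int
) -> int:
    """Seconds an edge cache would keep an object under this simplified model."""
    if not (min_ttl <= default_ttl <= max_ttl):
        raise ValueError("expected min_ttl <= default_ttl <= max_ttl")
    if origin_ttl is None:
        # The origin sent no caching lifetime: fall back to the policy default.
        return default_ttl
    # The origin sent a lifetime: keep it, but never below min or above max.
    return max(min_ttl, min(origin_ttl, max_ttl))


# Policy used for the examples: min 60 s, default 1 hour, max 1 day.
print(effective_ttl(None, 60, 3600, 86400))   # no origin value, default applies
print(effective_ttl(30, 60, 3600, 86400))     # below the floor, raised to min
print(effective_ttl(600, 60, 3600, 86400))    # inside the bounds, kept as-is
```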

How to control cache on edge locations?

Usually, cached content expires on the edge location according to the CloudFront distribution's cache policy. But sometimes we need to customize the cache expiration time to fit a specific requirement.

We can achieve this using the max-age and s-maxage directives of the Cache-Control response header. These two directives control cache expiration, and CloudFront honors them within the minimum and maximum TTL set in the cache policy.

1. max-age

This directive controls the browser-level cache. Whenever you need to control caching in the browser, set it in the Cache-Control header of the response.

Cache-Control: max-age=30

2. s-maxage

Use this directive when you need to control the cache expiration time at the edge location/CDN level. It is a standard directive for shared caches, so it works with most CDN providers, such as CloudFront and Akamai.

Cache-Control: s-maxage=30

Note: When you return only max-age, both the edge location and the browser use its value as the expiration time. s-maxage applies only to the edge location; it has no effect at the browser level. When both are present, the edge location uses s-maxage.
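To make the difference between the two concrete, here is a minimal Python sketch of how a browser and a shared cache (such as an edge location) could pick an expiration time from the same Cache-Control value. The helper names `parse_cache_control` and `freshness_lifetime` are made up for this example, and the sketch deliberately ignores other directives such as `no-store` and `private`.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into lowercase directive -> argument (or None)."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(value: str, shared_cache: bool) -> Optional[int]:
    """Seconds the response stays fresh, or None if no usable directive is present."""
    directives = parse_cache_control(value)
    # A shared cache prefers s-maxage; a browser only looks at max-age.
    candidates = ["s-maxage", "max-age"] if shared_cache else ["max-age"]
    for name in candidates:
        arg = directives.get(name)
        if arg is not None and arg.isascii() and arg.isdigit():
            return int(arg)
    return None


header = "max-age=60, s-maxage=600"
print(freshness_lifetime(header, shared_cache=False))  # browser uses max-age
print(freshness_lifetime(header, shared_cache=True))   # edge uses s-maxage
```

Sending both directives, as in the example value, lets the browser revalidate often while the edge location keeps the content for longer.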

Conclusion

Amazon CloudFront's efficient caching mechanism optimizes content delivery by strategically storing data at edge locations worldwide, significantly reducing latency and enhancing user experience across the globe.