AWS CloudFront: Delivering Content at the Speed of Light

By Sakshi Zalavadia / Aug 29, 2023

In today's digital era, delivering content with utmost speed and performance is crucial for staying ahead, and that's where AWS CloudFront shines. CloudFront is designed to efficiently distribute web content, such as images, videos, and dynamic data, to users across the globe, ensuring low latency and an exceptional user experience. By leveraging a network of strategically placed edge locations, CloudFront reduces the distance between the content and its consumers, resulting in faster load times and reduced server load. In this blog, we will explore the fundamental concepts behind AWS CloudFront, its caching mechanisms, and caching strategies to optimize content delivery. Additionally, we'll delve into the concept of CloudFront Origin Shield and how it can further enhance the performance and resilience of your content delivery architecture. So, let's embark on a journey to unravel the mysteries of AWS CloudFront and uncover the best practices for delivering content at blazing speed.

How CloudFront Works with Edge Locations

Amazon CloudFront is a web service designed to accelerate the distribution of both static and dynamic web content, including .html, .css, .js, and image files, to end-users. This content is delivered through a global network of data centres known as edge locations.

When a user requests content served by CloudFront, DNS routes the request to the CloudFront edge location (POP, or point of presence) that offers the lowest latency. CloudFront checks its cache for the requested object. If the object is in the cache, CloudFront returns it to the user. If it is not, CloudFront compares the request with the specifications in your CloudFront distribution and forwards the request to the origin server for the requested object. The origin server sends the object back to the edge location. CloudFront starts forwarding the object to the user as soon as the first byte from the origin arrives, and it also adds the object to the cache for future requests.

CloudFront with Regional Edge Cache

CloudFront also has regional edge caches, which bring more content closer to viewers even when that content is not requested often enough to stay cached at an edge location. When a viewer makes a request, DNS routes it to the nearest POP. If the content is cached at the POP, it is delivered from there; if not, the request goes to the nearest regional edge cache. CloudFront then checks the regional edge cache for the requested content. If it is cached there, CloudFront forwards it to the POP that requested it, and the POP caches the content and serves it to the user at the same time. If the content is cached in neither the POP nor the regional edge cache, CloudFront compares the request with the specifications in your CloudFront distribution and forwards it to the origin server. The response from the origin server is cached at both the regional edge cache and the POP for the next time a viewer requests it. Because all the POPs in a region share this regional cache, repeated requests to the origin server for the same object are reduced. CloudFront also keeps persistent connections with origin servers, so objects are fetched as quickly as possible.

Strategic ways to optimize caching in CloudFront

1. Improving Cache hit ratio

The cache hit ratio is the number of requests served directly from the CloudFront cache divided by the total number of requests. The higher the ratio, the fewer requests reach your origin and the better the performance. One way to raise it is to specify the longest practical time that content can stay cached before CloudFront fetches the latest version from the origin.
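As a rough illustration of the definition above, the following Python sketch computes a cache hit ratio from hit and miss counts and builds a Cache-Control value that an origin could send to request a longer caching time. The function names and the sample numbers are assumptions made for this example; they are not part of any CloudFront API.

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Return the fraction of requests served from cache (0.0 to 1.0)."""
    total = hits + misses
    if total == 0:
        return 0.0  # no traffic yet, so there is nothing to measure
    return hits / total


def cache_control_header(max_age_seconds: int) -> str:
    """Build a Cache-Control value an origin could attach to a response."""
    return f"public, max-age={max_age_seconds}"


# Hypothetical numbers: 900 requests served from cache, 100 sent to the origin.
print(cache_hit_ratio(900, 100))        # 0.9
print(cache_control_header(24 * 3600))  # public, max-age=86400
```

The longer the max-age you can tolerate for an object, the more requests are answered from cache before the object has to be fetched again.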

2. CloudFront Origin Shield

CloudFront's global network has edge locations and regional edge caches, which serve as a middle layer to provide cache hits and consolidate origin requests from nearby geographical regions. Origin Shield adds another layer of protection and performance to your content delivery network (CDN). It sits between the regional edge caches and your origin, so that all requests from the regional caches pass through it. This reduces the number of requests that your origin server must handle, which can improve performance and reliability.

Origin Shield can be used for various use cases, such as:

1. Users who are spread across different geographical locations.
2. On-premises origins that have bandwidth or capacity constraints.
3. Origins that dynamically package content for live streaming.

Now let us understand the end-to-end request path using the following example:

The request from User 1 is first routed to the closest edge location, which checks whether the requested content is cached there.
If the content is available at the edge location, it is returned to the user. This scenario is a “cache hit”.
If the content is not available at the edge location, CloudFront routes the request to the regional edge cache. This scenario is a “cache miss”.
If the content is present at the regional edge cache, it is returned to the requesting edge location (POP), which caches it and serves it to the user at the same time.
If the content is not cached at the regional edge cache either, the request is sent to Origin Shield, if it is configured.
Origin Shield is the cache closest to the origin. It consolidates similar requests from all the regional edge caches into a single request to the origin. If the content is already cached at Origin Shield, it is returned to the regional edge cache and the POP, and the response is served to the user.
If the content is not cached at Origin Shield, Origin Shield requests it from the origin of the CloudFront distribution and returns the response.
Now, if User 2, who is nearest to Edge Location 2, requests the same content as User 1, the request is sent to Edge Location 2.
Edge Location 2 does not have the requested content cached, so a cache miss occurs and the request is sent to the regional edge cache.
Because the regional edge cache already has the content (it was cached when User 1 requested it), it returns the content to Edge Location 2, which caches it and serves it to User 2.
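The request path above can be sketched as a small Python simulation. This is a simplified model, not how CloudFront is implemented: the CacheTier class, the tier names, and the sample object are all assumptions made for illustration, and it assumes both edge locations share the same regional edge cache.

```python
from typing import Dict, Optional, Tuple


class CacheTier:
    """One cache layer that falls back to its parent (or the origin) on a miss."""

    def __init__(self, name: str, parent: Optional["CacheTier"] = None) -> None:
        self.name = name
        self.parent = parent
        self.store: Dict[str, bytes] = {}

    def get(self, key: str, origin: Dict[str, bytes]) -> Tuple[bytes, str]:
        """Return (content, name of the layer that already had the content)."""
        if key in self.store:
            return self.store[key], self.name  # cache hit at this tier
        if self.parent is not None:
            content, served_by = self.parent.get(key, origin)  # miss: go up a tier
        else:
            content, served_by = origin[key], "origin"  # last tier asks the origin
        self.store[key] = content  # cache the response on the way back
        return content, served_by


origin = {"/logo.png": b"PNG-bytes"}  # hypothetical origin content
shield = CacheTier("origin-shield")
regional = CacheTier("regional-edge-cache", parent=shield)
edge1 = CacheTier("edge-location-1", parent=regional)
edge2 = CacheTier("edge-location-2", parent=regional)

print(edge1.get("/logo.png", origin)[1])  # User 1: origin (miss at every tier)
print(edge2.get("/logo.png", origin)[1])  # User 2: regional-edge-cache
print(edge1.get("/logo.png", origin)[1])  # User 1 again: edge-location-1
```

Only the first request reaches the origin. Every later request is answered by the nearest tier that already holds the object.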

3. Caching based on query string parameters

Caching based on query string parameters is a way to improve the performance of your CloudFront distribution by caching different versions of an object based on the query string parameters that are passed in the request URL.

For example, if you have an object that returns different HTML content depending on the value of the color query string parameter, you can configure CloudFront to cache two versions of the object, one for ?color=red and one for ?color=blue. This way, CloudFront can serve the correct version of the object to the user without having to forward the request to the origin server. To configure CloudFront to cache based on query string parameters, specify the parameters that you want to include in the cache key in your CloudFront distribution's cache policy. Note, however, that every additional parameter value creates another version of the object to cache, which can lower your cache hit ratio, so include only the parameters that actually change the response.
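To make the idea of a cache key concrete, here is a minimal Python sketch that derives a key from a URL while keeping only an allowlist of query string parameters. The cache_key function and the example URLs are hypothetical, and sorting the parameters is a choice made in this sketch, not a statement about how CloudFront builds its keys.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url: str, allowed_params: frozenset) -> str:
    """Build a cache key from the path plus only the allowed query parameters."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in allowed_params
    )
    query = urlencode(kept)
    return f"{parts.path}?{query}" if query else parts.path


allowed = frozenset({"color"})
# The tracking parameter is ignored, so both URLs map to the same cached object.
print(cache_key("https://example.com/shirt.html?color=red&utm_source=mail", allowed))
print(cache_key("https://example.com/shirt.html?utm_source=ad&color=red", allowed))
# A different color value produces a different key, and so a separate cached version.
print(cache_key("https://example.com/shirt.html?color=blue", allowed))
```

Parameters left out of the allowlist, such as tracking tags, no longer split the cache into needless variants.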

4. Caching based on cookies

Caching based on cookies can strongly influence content delivery performance. Cookies are small pieces of data stored in a user's browser, commonly used to maintain session information or personalize user experiences. A CloudFront distribution can cache different versions of an object based on the cookies that the user's browser sends.

For example, if you have an object that returns different HTML content depending on whether the user is logged in, you can configure CloudFront to cache two versions of the object, one for logged-in users and one for anonymous users. This way, CloudFront can serve the correct version of the object to the user without having to forward the request to the origin server. By strategically implementing caching based on cookies, you can strike a balance between delivering dynamic content and maximizing the benefits of CloudFront's robust caching capabilities.
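The choice of cookies matters a great deal for that balance. The Python sketch below counts how many distinct cached versions a handful of requests would produce under two cookie allowlists. The cookie names logged_in and session_id, the helper functions, and the sample requests are assumptions made for this illustration.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Request = Tuple[str, Dict[str, str]]  # (path, cookies sent by the browser)


def cookie_cache_key(path: str, cookies: Dict[str, str],
                     allowed: FrozenSet[str]) -> Tuple[str, tuple]:
    """Cache key built from the path plus only the allowed cookies."""
    kept = tuple(sorted((n, v) for n, v in cookies.items() if n in allowed))
    return (path, kept)


def count_variants(requests: Iterable[Request], allowed: FrozenSet[str]) -> int:
    """Number of distinct cache entries the requests would create."""
    return len({cookie_cache_key(p, c, allowed) for p, c in requests})


requests = [
    ("/home", {"logged_in": "true", "session_id": "a1"}),
    ("/home", {"logged_in": "true", "session_id": "b2"}),
    ("/home", {"logged_in": "false", "session_id": "c3"}),
    ("/home", {"logged_in": "false", "session_id": "d4"}),
]

# Keying on a low-cardinality cookie yields two shared versions.
print(count_variants(requests, frozenset({"logged_in"})))                # 2
# Adding a per-user cookie gives every user a private copy.
print(count_variants(requests, frozenset({"logged_in", "session_id"})))  # 4
```

In this model, keying on a per-user value such as a session identifier means no two users ever share a cached copy, so only cookies with a few possible values are good candidates for the cache key.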

5. Caching based on request headers

Request headers are additional pieces of information sent by a client's browser when making a request to a web server. By default, CloudFront does not include request headers in the cache key, so requests that differ only in their headers receive the same cached object. You can customize this by adding specific request headers to the cache key, so that CloudFront caches different versions of the same content based on their values. Keep the list short, because every header value you add creates another variant and can reduce cache efficiency. One example is serving content in multiple languages. Say you have a website with localized content, and the "Accept-Language" request header indicates the user's preferred language. If you configure CloudFront to cache based on the "Accept-Language" header, the first request for a specific language variant is fetched from the origin server, and subsequent requests with the same "Accept-Language" value are served directly from the cache, which gives faster response times and reduces the load on the origin server. Used selectively, caching based on request headers is a powerful part of a CloudFront caching strategy.
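Query strings, cookies, and headers are all configured in one place, the cache policy. The Python sketch below builds a configuration dictionary in the shape that, to the best of my understanding, boto3's CloudFront create_cache_policy call expects for its CachePolicyConfig argument. Check the field names against the current boto3 documentation before relying on it. The policy name, the TTL values, and the helper functions are assumptions for this example, and the sketch only builds the dictionary without calling AWS.

```python
from typing import Any, Dict, List


def _allowlist(behavior_key: str, items_key: str, names: List[str]) -> Dict[str, Any]:
    """Return a 'none' behavior for an empty list, otherwise an allowlist."""
    if not names:
        return {behavior_key: "none"}
    return {
        behavior_key: "whitelist",
        items_key: {"Quantity": len(names), "Items": list(names)},
    }


def build_cache_policy_config(name: str, headers: List[str], cookies: List[str],
                              query_strings: List[str],
                              default_ttl: int = 86400) -> Dict[str, Any]:
    """Assemble a cache policy config; TTL values here are illustrative only."""
    return {
        "Name": name,
        "MinTTL": 0,
        "DefaultTTL": default_ttl,
        "MaxTTL": 31536000,
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,
            "EnableAcceptEncodingBrotli": True,
            "HeadersConfig": _allowlist("HeaderBehavior", "Headers", headers),
            "CookiesConfig": _allowlist("CookieBehavior", "Cookies", cookies),
            "QueryStringsConfig": _allowlist(
                "QueryStringBehavior", "QueryStrings", query_strings
            ),
        },
    }


config = build_cache_policy_config(
    "localized-pages",            # hypothetical policy name
    headers=["Accept-Language"],
    cookies=["logged_in"],
    query_strings=["color"],
)
print(config["ParametersInCacheKeyAndForwardedToOrigin"]["HeadersConfig"])
# With AWS credentials configured, this dictionary could then be passed as
# boto3.client("cloudfront").create_cache_policy(CachePolicyConfig=config).
```

Keeping every allowlist as short as possible is what keeps the number of cached variants, and therefore the load on the origin, under control.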

Conclusion

In conclusion, AWS CloudFront is a powerful and efficient Content Delivery Network (CDN) service that delivers content across the globe with high speed and performance. By using its caching capabilities strategically, businesses can create a seamless and delightful user experience, solidifying their position in the digital landscape and gaining a competitive edge in today's fast-paced online world. With CloudFront's global reach and flexible caching strategies, the journey to optimize content delivery has never been smoother.