Using a far future expires header
By using a far future expires header you can efficiently control how assets are cached on the client, which results in improved performance. Here's how to set it up correctly with Apache, and some pointers on how to refresh users' caches when you modify files.
Most of you probably already know YSlow, the plugin created by Yahoo's exceptional performance team. These guys have done some awesome work on performance, and the rules behind YSlow (summed up in Steve Souders' book) are a gold mine for people who want to improve the performance of their web applications. If you don't already have it, install it now!
In their research, the exceptional performance team found that two of the most effective steps you can take to improve performance are 1) reducing the number of HTTP requests, and 2) using a far future expires header, which is what we're talking about today.
Benefits of a far future expires header
Static assets such as images, stylesheets and scripts only change when we modify our site's design and/or behaviour, so it should not be necessary for our users to download these files more than once per deployment. Unfortunately, on most sites the browser actually downloads all these files on every page request. With an expires header set far into the future, the browser will not download these files again before the given date. This means you can save a few hundred kB of assets on every page request!
Problems with far future expires headers
The problem with the far future expires header is exactly the same as the benefit: the browser will not download files again until the date in the header has passed, even if you roll out new versions that contain important bug fixes and improvements.
The way to solve this problem is to have filenames change when their contents change. Unfortunately this yields more work, and you should seek to find a way of automating this. I'll get back to this towards the end of this post.
Apache and expires headers
First, enable the mod_expires module. For systems using apt (Debian, Ubuntu and others) you can do this as follows:
# Check if module is already enabled
$ ls /etc/apache2/mods-enabled | grep expires

# If it wasn't
$ sudo ln -s /etc/apache2/mods-available/expires.load /etc/apache2/mods-enabled/

# Restart Apache
$ sudo /etc/init.d/apache2 restart
Next, tell Apache which files should get the header:
<FilesMatch "\.(ico|pdf|flv|jpe?g|png|gif|js|css|swf)$">
    ExpiresActive On
    ExpiresDefault "access plus 1 year"
</FilesMatch>
If you'd like more fine-grained control, you can use the ExpiresByType directive.
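As a rough sketch of what that could look like, the following sets a lifetime per MIME type instead of per file extension. The types and lifetimes are illustrative assumptions rather than part of the original setup, and the MIME type your server reports for JavaScript may differ (for example text/javascript), so check what your installation actually sends.

```apache
# Sketch only: the MIME types and lifetimes below are illustrative assumptions
ExpiresActive On

# Images rarely change, so let them live for the full year
ExpiresByType image/png "access plus 1 year"
ExpiresByType image/jpeg "access plus 1 year"
ExpiresByType image/gif "access plus 1 year"

# Stylesheets and scripts get a shorter lifetime in this example
ExpiresByType text/css "access plus 1 month"
ExpiresByType application/javascript "access plus 1 month"

# Types not listed here get no Expires header from mod_expires
# unless an ExpiresDefault also applies to them
```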
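The FilesMatch directive matches files based on a regular expression and can be tweaked to your liking. Note that it only sees the name of the file, not the directory it lives in, so a path such as /design/ cannot be part of a FilesMatch pattern. If you'd like these settings to apply only to your design assets (and not content images), you could match on the URL path with LocationMatch in your server or virtual host config instead:
<LocationMatch "/design/.*\.(ico|pdf|flv|jpe?g|png|gif|js|css|swf)$">
    ExpiresActive On
    ExpiresDefault "access plus 1 year"
</LocationMatch>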
FilesMatch directive can go in your
If you cannot use mod_expires for some reason, you can achieve a similar effect using mod_headers. This module allows you to set arbitrary headers, so we can set an Expires header manually like this:
<FilesMatch "\.(ico|pdf|flv|jpe?g|png|gif|js|css|swf)$">
    Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
</FilesMatch>
The date needs occasional updating. Update: As Samuli points out in the comments, you should not set an expires header more than one year into the future (according to the HTTP/1.1 RFC). This means you'll need to tend to this solution at least once a year, should you choose to do things this way.
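One way to sidestep the yearly edit, offered here as an alternative that the setup above does not use, is to send a relative lifetime instead of an absolute date. The sketch below uses mod_headers to set a Cache-Control header with max-age, which HTTP/1.1 clients give precedence over Expires when both are present. The value 31536000 is simply the number of seconds in 365 days. HTTP/1.0-only caches ignore Cache-Control, so treat this as a complement to the Expires header rather than a guaranteed replacement.

```apache
<FilesMatch "\.(ico|pdf|flv|jpe?g|png|gif|js|css|swf)$">
    # max-age is counted from the time of each response,
    # so this line never needs a date bumped by hand.
    # 31536000 seconds = 365 days, in line with the one-year limit.
    Header set Cache-Control "max-age=31536000, public"
</FilesMatch>
```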
As I mentioned, you need a mechanism for updating users' caches once you start using the far future expires header. When you roll out changes to your static assets, their filenames (or at least their URLs) need to change in order to be sure everyone gets the latest and greatest.
The simplest way to update file names is to attach a query parameter to your asset URLs. A few server side frameworks, such as Rails, automate this. The file's last modification timestamp is appended to its URL, resulting in something like this: /stylesheets/mystyles.css?1234567890. When something is added to a URL to avoid client side caching we sometimes refer to it as a "cache buster".
There is a problem with the above cache buster, though. In its default configuration, the popular caching proxy Squid used to treat a known URL with a changed query parameter as the same URL rather than a new one. In effect, users behind a Squid proxy would not have their caches refreshed by the above cache busters.
"Hard" cache busters
I like to refer to our first shot at fixing the update problem as "soft" cache busters. I consider them soft since they require no extra configuration or work on your part, other than generating the URLs.
The hard cache buster approach requires more work, but is more robust in that it will always cause clients to regard the URLs as unique and new. Instead of appending the mtime timestamp to the URL, we include it right inside the file name, following the pattern <path>-cb<mtime>.<suffix>. The stylesheet from the earlier example would become /stylesheets/mystyles-cb1234567890.css.
In order for this to work we need to either change our filenames on each deploy, or be a little clever with mod_rewrite. The following RewriteRule will cause the above URL, and all others like it, to be seamlessly rewritten to the original files (i.e. <path>.<suffix>):
RewriteRule (.*)-cb\d+\.(.*)$ $1.$2 [L]
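The pattern above is deliberately loose: it will rewrite any URL that contains -cb followed by digits and a dot. A slightly stricter variant, shown here only as a sketch, anchors the pattern at both ends and limits it to the asset extensions used in the FilesMatch examples, so unrelated URLs are left alone. The URL in the comment is an illustrative example, not one taken from a real site.

```apache
RewriteEngine On

# Only rewrite cache-busted names that end in a known static asset extension.
# Illustrative example:
#   /stylesheets/mystyles-cb1234567890.css -> /stylesheets/mystyles.css
RewriteRule ^(.+)-cb\d+\.(ico|pdf|flv|jpe?g|png|gif|js|css|swf)$ $1.$2 [L]
```

If you add new asset types to your FilesMatch pattern, remember to add them to this list as well.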
So there you go, a "use as is" setup for improving performance by way of the far future expires header:
<VirtualHost *>
    # Usual config

    <FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
        ExpiresActive On
        ExpiresDefault "access plus 1 year"
    </FilesMatch>

    RewriteEngine On
    RewriteRule (.*)-cb\d+\.(.*)$ $1.$2 [L]
</VirtualHost>
Thanks to Samuli for pointing out that the expires header should not exceed one year into the future.