List items
Items from the current list are shown below.
Blog
All items from July 2016
18 Jul 2016 : Using a CDN for regrexitmap.eu #
This weekend I've been playing around with Amazon's CloudFront CDN. I've been setting up a new site, www.regrexitmap.eu, and although I'm not expecting it to be heavily used, the site is bandwidth-heavy and entirely static on the server-side, so a good candidate for deployment via CDN. For those unfamiliar with the term, CDN stands for Content Delivery Network, able to push the content of a website out to multiple servers across the world. This moves the content closer to the end-users, in theory reducing latency and making the site feel more responsive.
There are other benefits of using a CDN. Because the site is served from multiple locations it also makes it less susceptible to denial of service attacks. Since I work in security, there's been a lot of discussion in my research group about DoS attacks and I recently saw a fascinating talk by Virgil Gligor on the subject (the paper's not yet out, but Ross Anderson has written up a convenient summary).
The availability that DoS attempts to undermine offers a wholly different dynamic from the confidentiality, integrity and authenticity that I'm more familiar with. These four together make up the CIAA 'triad' (traditionally just CIA, but authenticity is often added as another important facet of information security). Tackling DoS feels much more practical than the often cryptographic approaches used in the other three areas. An attacker can scale up their denial of service by sending from multiple sources (for example using a botnet), while a CDN redresses the balance by serving from multiple sources, so there's an elegant symmetry to it.
In addition to all of that, CloudFront looks to be pretty cheap, at least compared to spinning up an EC2 instance to serve the site. That makes it both educational and practical. What's not to like?
Amazon makes it exceptionally easy to serve a static site from an S3 bucket. Simply create a new bucket, upload the files using the Web interface and select the option to serve it as a site.
The only catch is that you also have to apply a suitable policy to the bucket to make it public. Why Amazon doesn't provide a simpler way of doing this is beyond me, but there are plenty of how-tos on the Web to plug the gap.
Driving a website from S3 offers serviceable, but not great, performance. A lot of sites do this, and already in May 2013 netcraft identified 24.7 thousand hostnames running an entire site served directly from S3 (with many more serving part of the site from S3). It's surely much higher now.
Once a site's been set up on S3, hosting it via CloudFront is preposterously straightforward. Create a new distribution, set the origin to the S3 bucket and use the new cloudfront.net address.
The default CloudFront domains aren't exactly user-friendly. This is fine if they're only used to serve static content in the background (such as the images for a site, just as the retail Amazon site does), but an end-user-facing URL needs a bit more finesse. Happily it's straightforward to set up a CNAME to alias the cloudfront subdomain. Doing this ensures Amazon can continue to manage the DNS entry it points to, including which location to serve the content from. So I spent £2.39 on the regrexitmap.eu domain and am now fully finessed.
Finally I have three different domain names all pointing to the same content.
The process, which is in theory very straightforward, was in practice somewhat glitchy. The bucket policy I've already mentioned above. The part that caused me most frustration was in getting the domain name to work. Initially the S3 bucket served redirects to the content (why? Not sure). This was picked up by CloudFront, which happily continued to serve the redirects even after I'd changed the content. The result was that visiting the CloudFront URL (or the rerexitmap.eu domain) redirected to S3, changing the URL in the process, even though the correct content was served. It took several frustrating hours before I realised I had to invalidate the material through the CloudFront Web interface before all of the edge servers would be updated. Things now seem to update immediately without the need for human intervention; it's not entirely clear what changed, but it certainly hindered progress before I realised.
The whole episode took about a day's work and next time it should be considerably shorter. The cost of running via CloudFront and S3 is a good deal less than the cost of running even the most meagre EC2 instance. Whether it gives better performance is questionable.
Comparing basic S3 access with the equivalent CloudFronted access gives a 25% speed-up when accessed from the UK. However, to put this in context, serving the same material from my basic fasthosts web server results in a further 10% speed-up on top of the CloudFront increase.
If I'm honest, I was expecting CloudFront to be faster. On the other hand this is checking only from the UK where my fasthosts server is based. The results across the world are somewhat more complex, as you can see for yourself from the table below.
We can render this data as a graph to try to make it more comprehensible. It helps a bit, but not much. In the graph, a steeper line is better, so CloudFront does well at the start and mid-table, but also has the site with the longest ping time overall. The lines jostle for the top spot, from which it's reasonable to conclude they're all giving pretty similar performance in the aggregate.
In conclusion, apart from the unexpected redirects, setting up CloudFront was really straightforward and the result is a pretty decent and cheap website serving platform. While I'm not in a position to compare with other CDN services, I'd certainly use CloudFront again even without the added incentive of wanting to know more about it.
I'm now looking in to adding an SSL cert to the site. Again Amazon have made it really straightforward to do, but the trickiest part is figuring out the cost implications. The site doesn't accept any user data and SSL would only benefit the integrity of the site (which, for this site, is of arguable benefit), so I'd only be doing it for the experience. If I do, I'll post up my experiences here.
Comment
There are other benefits of using a CDN. Because the site is served from multiple locations it also makes it less susceptible to denial of service attacks. Since I work in security, there's been a lot of discussion in my research group about DoS attacks and I recently saw a fascinating talk by Virgil Gligor on the subject (the paper's not yet out, but Ross Anderson has written up a convenient summary).
The availability that DoS attempts to undermine offers a wholly different dynamic from the confidentiality, integrity and authenticity that I'm more familiar with. These four together make up the CIAA 'triad' (traditionally just CIA, but authenticity is often added as another important facet of information security). Tackling DoS feels much more practical than the often cryptographic approaches used in the other three areas. An attacker can scale up their denial of service by sending from multiple sources (for example using a botnet), while a CDN redresses the balance by serving from multiple sources, so there's an elegant symmetry to it.
In addition to all of that, CloudFront looks to be pretty cheap, at least compared to spinning up an EC2 instance to serve the site. That makes it both educational and practical. What's not to like?
Amazon makes it exceptionally easy to serve a static site from an S3 bucket. Simply create a new bucket, upload the files using the Web interface and select the option to serve it as a site.
The only catch is that you also have to apply a suitable policy to the bucket to make it public. Why Amazon doesn't provide a simpler way of doing this is beyond me, but there are plenty of how-tos on the Web to plug the gap.
Driving a website from S3 offers serviceable, but not great, performance. A lot of sites do this, and already in May 2013 netcraft identified 24.7 thousand hostnames running an entire site served directly from S3 (with many more serving part of the site from S3). It's surely much higher now.
Once a site's been set up on S3, hosting it via CloudFront is preposterously straightforward. Create a new distribution, set the origin to the S3 bucket and use the new cloudfront.net address.
The default CloudFront domains aren't exactly user-friendly. This is fine if they're only used to serve static content in the background (such as the images for a site, just as the retail Amazon site does), but an end-user-facing URL needs a bit more finesse. Happily it's straightforward to set up a CNAME to alias the cloudfront subdomain. Doing this ensures Amazon can continue to manage the DNS entry it points to, including which location to serve the content from. So I spent £2.39 on the regrexitmap.eu domain and am now fully finessed.
Finally I have three different domain names all pointing to the same content.
- For the S3 bucket: regrexit.s3-website-eu-west-1.amazonaws.com
- For the CloudFront mirror: duymrehzkmtay.cloudfront.net
- For the end-user CNAMEd domain: www.regrexitmap.eu
The process, which is in theory very straightforward, was in practice somewhat glitchy. The bucket policy I've already mentioned above. The part that caused me most frustration was in getting the domain name to work. Initially the S3 bucket served redirects to the content (why? Not sure). This was picked up by CloudFront, which happily continued to serve the redirects even after I'd changed the content. The result was that visiting the CloudFront URL (or the rerexitmap.eu domain) redirected to S3, changing the URL in the process, even though the correct content was served. It took several frustrating hours before I realised I had to invalidate the material through the CloudFront Web interface before all of the edge servers would be updated. Things now seem to update immediately without the need for human intervention; it's not entirely clear what changed, but it certainly hindered progress before I realised.
The whole episode took about a day's work and next time it should be considerably shorter. The cost of running via CloudFront and S3 is a good deal less than the cost of running even the most meagre EC2 instance. Whether it gives better performance is questionable.
Comparing basic S3 access with the equivalent CloudFronted access gives a 25% speed-up when accessed from the UK. However, to put this in context, serving the same material from my basic fasthosts web server results in a further 10% speed-up on top of the CloudFront increase.
Loading times accessing the site on S3 (2.54s total).
Loading times accessing the site via CloudFront (1.92s total).
Loading times accessing the site on fasthosts.co.uk (1.75s total).
If I'm honest, I was expecting CloudFront to be faster. On the other hand this is checking only from the UK where my fasthosts server is based. The results across the world are somewhat more complex, as you can see for yourself from the table below.
Ping times for the three access methods from across the world (all times in ms, from dotcom).
Location | S3 | CloudFront | Fasthosts |
Amsterdam, Netherlands | 16 | 9 | 12 |
London, UK | 13 | 19 | 9 |
Paris, France | 14 | 10 | 26 |
Frankfurt, Germany | 26 | 7 | 23 |
Copenhagen, Denmark | 32 | 18 | 33 |
Warsaw, Poland | 48 | 26 | 38 |
Tel-Aviv, Israel | 79 | 58 | 72 |
VA, USA | 88 | 93 | 86 |
NY, USA | 76 | 105 | 87 |
Amazon-US, East | 80 | 99 | 100 |
Montreal, Canada | 92 | 100 | 92 |
MN, USA | 106 | 114 | 106 |
FL, USA | 107 | 118 | 109 |
TX, USA | 114 | 117 | 138 |
CO, USA | 129 | 118 | 138 |
Mumbai, India | 142 | 124 | 130 |
WA, USA | 135 | 144 | 136 |
CA, USA | 141 | 149 | 137 |
South Africa | 157 | 165 | 155 |
CA, USA (IPv6) | 278 | 149 | 230 |
Tokyo, Japan | 243 | 229 | 231 |
Buenos Aires, Argentina | 263 | 260 | 224 |
Beijing, China | 260 | 253 | 249 |
Hong Kong, China | 283 | 293 | 287 |
Sydney, AU | 298 | 351 | 334 |
Brisbane, AU | 313 | 343 | 331 |
Shanghai, China | 332 | 419 | 369 |
We can render this data as a graph to try to make it more comprehensible. It helps a bit, but not much. In the graph, a steeper line is better, so CloudFront does well at the start and mid-table, but also has the site with the longest ping time overall. The lines jostle for the top spot, from which it's reasonable to conclude they're all giving pretty similar performance in the aggregate.
In conclusion, apart from the unexpected redirects, setting up CloudFront was really straightforward and the result is a pretty decent and cheap website serving platform. While I'm not in a position to compare with other CDN services, I'd certainly use CloudFront again even without the added incentive of wanting to know more about it.
I'm now looking in to adding an SSL cert to the site. Again Amazon have made it really straightforward to do, but the trickiest part is figuring out the cost implications. The site doesn't accept any user data and SSL would only benefit the integrity of the site (which, for this site, is of arguable benefit), so I'd only be doing it for the experience. If I do, I'll post up my experiences here.