flypig.co.uk

List items

Items from the current list are shown below.

Blog

All items from July 2016

18 Jul 2016 : Using a CDN for regrexitmap.eu #
This weekend I've been playing around with Amazon's CloudFront CDN. I've been setting up a new site, www.regrexitmap.eu, and although I'm not expecting it to be heavily used, the site is bandwidth-heavy and entirely static on the server-side, so a good candidate for deployment via CDN. For those unfamiliar with the term, CDN stands for Content Delivery Network, able to push the content of a website out to multiple servers across the world. This moves the content closer to the end-users, in theory reducing latency and making the site feel more responsive.

There are other benefits of using a CDN. Because the site is served from multiple locations it also makes it less susceptible to denial of service attacks. Since I work in security, there's been a lot of discussion in my research group about DoS attacks and I recently saw a fascinating talk by Virgil Gligor on the subject (the paper's not yet out, but Ross Anderson has written up a convenient summary).

The availability that DoS attempts to undermine offers a wholly different dynamic from the confidentiality, integrity and authenticity that I'm more familiar with. These four together make up the CIAA 'triad' (traditionally just CIA, but authenticity is often added as another important facet of information security). Tackling DoS feels much more practical than the often cryptographic approaches used in the other three areas. An attacker can scale up their denial of service by sending from multiple sources (for example using a botnet), while a CDN redresses the balance by serving from multiple sources, so there's an elegant symmetry to it.

In addition to all of that, CloudFront looks to be pretty cheap, at least compared to spinning up an EC2 instance to serve the site. That makes it both educational and practical. What's not to like?

Amazon makes it exceptionally easy to serve a static site from an S3 bucket. Simply create a new bucket, upload the files using the Web interface and select the option to serve it as a site.

S3 bucket

The only catch is that you also have to apply a suitable policy to the bucket to make it public. Why Amazon doesn't provide a simpler way of doing this is beyond me, but there are plenty of how-tos on the Web to plug the gap.

S3 bucket policy

Driving a website from S3 offers serviceable, but not great, performance. A lot of sites do this, and already in May 2013 netcraft identified 24.7 thousand hostnames running an entire site served directly from S3 (with many more serving part of the site from S3). It's surely much higher now.

Once a site's been set up on S3, hosting it via CloudFront is preposterously straightforward. Create a new distribution, set the origin to the S3 bucket and use the new cloudfront.net address.

S3 distribution origin settings

The default CloudFront domains aren't exactly user-friendly. This is fine if they're only used to serve static content in the background (such as the images for a site, just as the retail Amazon site does), but an end-user-facing URL needs a bit more finesse. Happily it's straightforward to set up a CNAME to alias the cloudfront subdomain. Doing this ensures Amazon can continue to manage the DNS entry it points to, including which location to serve the content from. So I spent £2.39 on the regrexitmap.eu domain and am now fully finessed.

Finally I have three different domain names all pointing to the same content.
The process, which is in theory very straightforward, was in practice somewhat glitchy. The bucket policy I've already mentioned above. The part that caused me most frustration was in getting the domain name to work. Initially the S3 bucket served redirects to the content (why? Not sure). This was picked up by CloudFront, which happily continued to serve the redirects even after I'd changed the content. The result was that visiting the CloudFront URL (or the rerexitmap.eu domain) redirected to S3, changing the URL in the process, even though the correct content was served. It took several frustrating hours before I realised I had to invalidate the material through the CloudFront Web interface before all of the edge servers would be updated. Things now seem to update immediately without the need for human intervention; it's not entirely clear what changed, but it certainly hindered progress before I realised.

The whole episode took about a day's work and next time it should be considerably shorter. The cost of running via CloudFront and S3 is a good deal less than the cost of running even the most meagre EC2 instance. Whether it gives better performance is questionable.

Comparing basic S3 access with the equivalent CloudFronted access gives a 25% speed-up when accessed from the UK. However, to put this in context, serving the same material from my basic fasthosts web server results in a further 10% speed-up on top of the CloudFront increase.

Loading times for S3
Loading times accessing the site on S3 (2.54s total).

Loading times for CloudFront
Loading times accessing the site via CloudFront (1.92s total).

Loading times for fasthosts
Loading times accessing the site on fasthosts.co.uk (1.75s total).

If I'm honest, I was expecting CloudFront to be faster. On the other hand this is checking only from the UK where my fasthosts server is based. The results across the world are somewhat more complex, as you can see for yourself from the table below.

Ping times for the three access methods from across the world (all times in ms, from dotcom).
Location S3 CloudFront Fasthosts
Amsterdam, Netherlands 16 9 12
London, UK 13 19 9
Paris, France 14 10 26
Frankfurt, Germany 26 7 23
Copenhagen, Denmark 32 18 33
Warsaw, Poland 48 26 38
Tel-Aviv, Israel 79 58 72
VA, USA 88 93 86
NY, USA 76 105 87
Amazon-US, East 80 99 100
Montreal, Canada 92 100 92
MN, USA 106 114 106
FL, USA 107 118 109
TX, USA 114 117 138
CO, USA 129 118 138
Mumbai, India 142 124 130
WA, USA 135 144 136
CA, USA 141 149 137
South Africa 157 165 155
CA, USA (IPv6) 278 149 230
Tokyo, Japan 243 229 231
Buenos Aires, Argentina 263 260 224
Beijing, China 260 253 249
Hong Kong, China 283 293 287
Sydney, AU 298 351 334
Brisbane, AU 313 343 331
Shanghai, China 332 419 369

We can render this data as a graph to try to make it more comprehensible. It helps a bit, but not much. In the graph, a steeper line is better, so CloudFront does well at the start and mid-table, but also has the site with the longest ping time overall. The lines jostle for the top spot, from which it's reasonable to conclude they're all giving pretty similar performance in the aggregate.

Pings times cummulative over location

In conclusion, apart from the unexpected redirects, setting up CloudFront was really straightforward and the result is a pretty decent and cheap website serving platform. While I'm not in a position to compare with other CDN services, I'd certainly use CloudFront again even without the added incentive of wanting to know more about it.

I'm now looking in to adding an SSL cert to the site. Again Amazon have made it really straightforward to do, but the trickiest part is figuring out the cost implications. The site doesn't accept any user data and SSL would only benefit the integrity of the site (which, for this site, is of arguable benefit), so I'd only be doing it for the experience. If I do, I'll post up my experiences here.
Comment