13 Mar 2021 : Male Violence against Women #
Today I watched this video of Chris Hemmings. It's inspiring and I can't find a word in it to disagree with.

After watching it I was really surprised to see the reaction of many men disagreeing, stating that it wasn't their fault (they're not predators) and that the focus should be put on “bad individuals” rather than “men as a group”.
Chris Hemmings on BBC News

The fact is, as a man I've benefited from living in a patriarchal society. Men have (predominantly, if not exclusively) made the laws, shaped the culture, chosen the path that society has taken. Individual men have reaped the greatest of the rewards from this by becoming rich and powerful. Society, the society I live in, has been like this for centuries if not longer.

The idea that I, as a man, haven't benefited from sexism, that I've not benefited from an entrenched prejudice against women and towards men, is frankly absurd. The fact is that I benefit from it every day, by being paid a bit more, from being taken more seriously, by being listened to more intently, from not having to do things I would otherwise have to do. By having more freedom to make my own choices. I couldn't pick out a specific instance where I've benefited from this system. I don't know of any time I got a job instead of a more qualified female candidate. I never fought for a pay rise that would otherwise have been given to a woman. I never knowingly took the men in my computer classes more seriously than the women. In fact, I have always gone out of my way to support women in technology (I hope in a practical, rather than patronising way).

So I couldn't give you a specific instance of where I've benefited from society being patriarchal. But that doesn't mean it isn't the case. It just seems so obvious to me that I benefit in unquantifiable ways on a daily basis. How could it not be so? How could it be that male preference is woven so tightly into the history of society, but that I haven't personally benefited as a man? Of course I have.

It's true that it may not feel like this for many men. Many men feel they are treated unfairly by the world, that they've earned what they have and deserve more than they're given. The fact is, I've never met anyone who didn't believe this. Everyone feels they're working hard for little reward. If you're a man, I hate to break it to you, but in the majority of cases this just isn't true: if it wasn't for living in a male-dominated society you would simply be getting less than you do now. It's true, of course, that there are other prejudices and that many people suffer from them. And you don't have to be a minority to be the victim of prejudice. But if you're a victim of some other form of bigotry, that doesn't mean you don't also benefit from being male. The fact is that cumulatively, sexism and prejudice against women is probably the most significant injustice that exists in society. As a man, it's almost impossible to have avoided benefiting from sexism.

So now, when women are asking to be treated with respect, and when the overwhelming majority of cases are of men treating women badly, now is the time when some men want to disown the patriarchy and deny responsibility for the society that we helped create, and which we have benefited from.

Personally I don't buy it. I think we, as men, are very much to blame here. Not necessarily to blame for specific instances of violence or bigotry perpetrated against women, but certainly to blame for a society and culture that allows it to happen. As men, we've all taken the benefits, so now would be a good time to accept the responsibility.

I want to make clear that I accept my share of the responsibility. I'm not exactly sure what accepting it entails, but probably it means in the future having to accept more compromises in terms of my personal income, personal comfort, and how much my voice is heard, in order for women to gain a rightfully more dominant position in society.

The fact is I've met a lot of very competent women and the world would not be worse off if their voices had the same weight as those of their male counterparts. The men fighting against this will, I hope, be on the wrong side of history.

27 Feb 2021 : Carbon Cancel Culture #
In an ideal world with a circular economy, it might be possible to achieve something close to a carbon-neutral lifestyle. Right now this is not so easy. Even with our best efforts, and while living in lockdown, Joanna and I still managed to produce nearly 9 tonnes of CO2 last year.

So, whilst reducing production is always the best goal, it's still necessary to think about what to do with the remaining output. A quick search on the Internet will reveal a massive choice of carbon offsetting schemes, and when I looked into it last year I was basically overwhelmed. There's plenty of advice (is it good advice?) to suggest which schemes to go for. There's plenty of advice (is it good advice?) telling you that it's a pointless exercise. I don't know whether it's worth it or not, but at worst it's an opportunity to be scammed, while at best it might actually be doing some good. That pushes the risk-reward balance over into the positive for me.

Last year I ended up using Karbonaut to offset my output. The words on the website made it look legitimate, with claims to be contributing to "Gold Standard" projects. But in practice I didn't have much to go on. Well, Karbonaut is now "closed", which isn't a good sign. Not that I'm suggesting it isn't legitimate, but at least it meant this year I had to start my search all over again.

So, it was with some relief and happiness that I discovered that the UN runs a centralised carbon offsetting platform as part of the United Nations Framework Convention on Climate Change. You don't have to be a government or company to use it; any individual can just rock up and use it to contribute to a carbon offsetting project. The money goes directly to the project you choose and there's plenty of in-depth documentation about every project to help decide which to go for.

At the end of the process you even get a convincing looking Voluntary Cancellation Certificate. If you're thinking of offsetting your carbon footprint, I strongly recommend it.
Cancellation Certificate from

30 Jan 2021 : Escaping Hades #
Yesterday after much blood, toil and darkness, I finally made it out of Hades. I've made it to the surface multiple times, but before this had never managed to get past Hades himself. He is, after all, very mean. It turns out that, for me at least, the key to escaping his onslaught was all in the loadout of boons and benefits.
I'm not so much of a melee fighter, I like to get distance between me and my foes. So I'd assumed that the Exagryph was going to be the weapon that expedited my escape. But the game is smart in the way it encourages the player to make different choices on different runs and consequently I wasn't sticking to it for every attempt. So I was trying the Heart-Seeking Bow. It's also a ranged weapon, but initially I'd dismissed it because of the slow aiming. In fact I wasn't wrong about that: the main weapon wasn't so useful in the event. The special on the other hand is fast, almost always hits something and also deals a devastating amount of damage up close. That makes it quite formidable. Plus the choice of weapon doesn't affect the options for Cast, and this was an important factor in my escape.

In terms of boons, the key set that really made it possible was the following.
  1. Multiple Crystal Beam Casts from Demeter. During the final boss encounter you need to spend a lot of time hiding behind things, keeping moving and generally doing things that don't allow you to aim, or to get close to Hades. Crystal Beams make a great weapon here because you can drop them and leave them to do their work. Fire and forget. However, they're slow to work and slow to re-aim, so one isn't enough. With three Bloodstones, a 3 second regeneration cycle and a 5 second lifetime, I was able to plant four simultaneous Crystals at different points on the map, making them almost impossible for foes to evade.
  2. Super Speed from Hermes. Hades has a wide reach, and being able to get out of his way, by being not just fast but really fast, is both unexpectedly helpful and exhilaratingly fun. With boons from Hermes giving me increased movement speed, going yet faster after a dash, it was easy to evade almost all of the attacks from Hades. Combined with the Crystal Beams that allowed me to keep moving without having to stop to attack, this was a great combination.
  3. Life Recovery from Athena and Demeter. Usually during a battle I'm relying on Stubborn Defiance to get me out of a tight spot. But during this escape I picked up Stubborn Roots, a Duo boon from Athena and Demeter. This kicks in when you have no remaining Stubborn Defiance opportunities, leading to a slight but perceptible life recovery over time. This also combines well with super speed, used as a delaying tactic to avoid taking damage while allowing my life bar to rise up.
There were other advantages too, such as a beefed-up special that was useful for picking off the shades periodically spawned by Hades to help him take me down. But the combination of multiple crystals, extra speed and gradually increasing life was what really got me through.

Below is the full list of boons I accumulated on my travel through Hades. For the record, it took 71 chambers, 10453 slain foes, 53 attempts and 68 hours to make the escape.

Main Weapon
  1. Heart-Seeking Bow
Attack boons
  1. Epic Divine Strike Lv.2 - Your Attack is stronger and can deflect - Attack damage + 90%
  2. Sniper Shot - Your Attack deals +200% damage to distant foes
  3. Perfect Shot - Your Power Shot is easier to execute and deals +150% damage
Special boons
  1. Divine Flourish Lv.3 - Your special is stronger and can deflect - Special damage + 101%
  2. Chimaera Jerky - Passive: your special deals +30% damage - Duration 5 encounters
  3. Halting Flourish - Your Special deals +48% damage
Cast boons
  1. Rare Crystal Beam Lv.2 - Your cast drops a crystal that fires a beam at foes for 5 Secs. - Cast Damage 11 every 0.2 Sec.
  2. Rare Snow Burst Lv.3 - Whenever you cast, damage foes around you and inflict chill - area damage 77
  3. Legendary Fully Loaded - gain extra Bloodstones for your cast - Max Bloodstones +2
  4. Braid of Atlas - Passive; your cast deals +50% damage - Duration 5 encounters
  5. Epic Slothful Shot - Your Cast deals +75% damage
Speed boons
  1. Epic Greater Haste - You move faster - move speed 40%
  2. Epic Hyper Sprint - After you dash, briefly become sturdy and move +100% faster - sprint duration 0.7 Sec.
  3. Epic Blade Dash Lv.3 - Your dash creates a blade rift where you stand - 24 damage per hit
Life boons
  1. Duo Stubborn Roots - If you have no Stubborn Defiance, your life slowly recovers - Life Regeneration During Battle 1 every 0.8 Sec.
  2. Lucky Tooth - Automatically restore up to 100 Life when your Life Total is depleted - Once per escape attempt
  3. Rare Nourished Soul - Any Life Restoration effects are more potent - Bonus restoration +32%
  4. Quick Recovery - After taking damage, quickly Dash to recover some Life you just lost - Life recovered 30% of damage taken
Uncategorised boons
  1. Rare Pressure Point Lv.2 - Any damage you deal has a chance to be critical - Critical chance +4%
  2. Support Fire - After you cast, or hit with an Attack or Special, fire a seeking arrow - Arrow damage 10
  3. Epic Blinding Flash - Your abilities that can Deflect also make foes Exposed - Bonus Backstab Damage +75%
  4. Epic Hunter's Mark - After you deal Critical damage to a foe, a foe near you is Marked - Marked Critical Chance +54%
  5. Dark Thirst - Whenever you find Darkness, gain +20%
23 Jan 2021 : New Year's Resolutions 2021 - Reckoning and Renewal #
The year 2020 was a strange one, not just for me but for everyone. For me it was also a year full of achievement, and while I'd hate to brag, I really would like to get some of these things down on "paper", so that I can look back at them in the future and appreciate my dedication (to TV and games mostly!).

This was the year of the pandemic of course, and that shaped my life just as it did everyone else's, sometimes in very unexpected ways. Most notably, I spent much more time in Finland than I was expecting, geographically distant from Joanna. My flat is small, but it does have a balcony, a good view, good facilities and a decent Internet connection, all of which I was very thankful for.

Some people think success is about having as much impact on the world as possible. One of the things being in Finland in the midst of a pandemic taught me is that sometimes the opposite is true. Most of the things I've grown to value over the last year have been about having as little impact as possible.
You could say that I've been trying to earn my metaphorical goose feather babiches: learning, as Joel Fleischman did in the closing episodes of Northern Exposure, to go lightly through life. Coincidentally, one of my achievements this year was to finally watch the entirety of Northern Exposure from start to finish, all six series and 110 episodes. I must have watched my first episode back when it aired on Channel 4 in 1990, so that's a 30-year goal completed there. If that isn't exactly something to be proud of, it does at least make me feel fulfilled.

In a similar vein I also completed Shadow of the Tomb Raider. That's been a 17 year journey, although admittedly one with a periodically shifting finish line as new games are released. Finally, I also completed the final instalment of the Syberia adventure game series. I arrived a little late to the series in 2014, but that still mounts up to a pretty long journey. Sadly there was less snow than I was hoping for in the third instalment compared to the second, but it was still a really good feeling reaching the conclusion of the story.
Another thing I put a lot of effort into in 2020 was the development of Contrac, my Linux implementation of Google and Apple's Exposure Notification API. When I released the first version in July I naively thought it would be something I could get done and then put to one side. It didn't turn out that way. The specs have been updated and extended at least three times, and it turns out there's a never-ending potential for improvement. So it's ended up requiring considerably more commitment than I expected. What started as a simple desire to understand the protocol and hold Google and Apple to account turned into something more practical, but also more effortful.

Contrac will, no doubt, continue to eat up a lot of my spare time in 2021.

Finally, 2020 was also a rubbish year for me in a very literal way. Back in 2019 I started collecting data about the amount of waste I produce (recycling, non-recycling, returnable and compostable waste). I continued collecting this data throughout 2020, with the aim not just of understanding my waste output, but also of reducing it. I think I had some decent success with this, going from an average waste output of 304.18g per day in 2019 to a much improved 153.9g per day in 2020. That's nearly half, and well below the target of less than 300g per day I set myself last January. Here's what my waste output looked like in 2020.
Daily waste data histocurve for 2020

So what of my other new year's resolutions from last year? The repeated lockdowns, social isolation and travel restrictions all counter-intuitively helped me progress with them.

My first target was to rewrite this website from legacy ASP to PHP, so that it could be moved over to a LAMP server. I was hoping to get this done-and-dusted by the end of January and by 19th January this was looking plausible. Ultimately it took a bit longer, with the final switch over happening on 17th March. Looking back, I'm pretty proud of this result. The conversion ended up taking the original 54 files and 4871 lines of bespoke ASP code and re-writing it into 49 files and 5028 lines of PHP code. The resulting website is to all intents and purposes identical to the previous site for the end user, but now running on Linux with a TLS certificate. It was a lot of work, but I'm happy with the (almost completely unchanged) result.

Second, as already mentioned, I aimed to reduce my daily waste output to an average below 300g a day. This seemed like an achievable but still worthwhile goal. Reducing my waste to that level required a lot of work and a complete change in my buying habits. I now eat a lot more leftovers, and have shifted from glass to cardboard and plastic packaging. I forewent peanut butter for almost a year until just recently when I managed to source it in a non-glass container. I really hope I can keep my waste down at a similar level in 2021 as well.

Third I planned to work on Scintillon, my Sailfish OS app for controlling Philips Hue smart lights. This was, unfortunately, one of the victims of the pandemic, or more precisely, of my work on Contrac. While Contrac got 15 releases during the year, Scintillon didn't get any.

Fourth I planned to spend 30 minutes each day learning Finnish. This is, sadly, where I failed most miserably. This was partly due to isolation, which left my Finnish classes cancelled and my opportunity to interact with native Finns greatly reduced. But even though there was nothing stopping me progressing with my FinnishPod101 subscription, I even failed with that. Ironically this January I started using the Duolingo app and have been working my way through the Finnish lessons quite steadily. So maybe I'll achieve the goal... just a year later.

So, that's two successes and two failures. Hopefully that's just about enough momentum to propel me into 2021 ready to attack some new resolutions. So, here are my goals for 2021.
  1. Put this website code into a public git repository. Having converted the code to PHP, it's high time I released it as open source. I might even include the old ASP code too, along with all of its hideous flaws. I'm no ASP coder. And no PHP coder either.
  2. Each week by the end of the weekend, spend at least an hour doing something calming that doesn't require a computer. For example it could be solving a maths problem, doing something artistic, reading a book, writing something or just going for a walk. The key thing is that I must keep a record of what I did each week, otherwise I'm going to lose the impetus to continue.
  3. Complete the bisection analysis that Frajo and I started working on a year ago. We collected all the data and wrote the algorithm; now we just need to apply the algorithm to the data. It's one of those tasks that never seems to reach the top of my priority list, despite the fact it might generate some interesting results.

Plus, I aim to maintain the wins I made in 2020 by continuing with Duolingo, keeping my average waste output down to less than 250g per day, and keeping my carbon footprint to less than 5 tonnes of CO2 during the year.

That's a fair amount to manage, and it may be harder without the pandemic, but it's good to have goals, right?
1 Jan 2021 : Something positive from 2020: a reduced carbon footprint #
Back in April last year I reviewed my carbon footprint and found it to be much higher than I'd hoped. Because my wife Joanna works in Cambridge UK, while I work in Tampere Finland, our carbon output caused by flights was off the scale. Along with the fact that we're essentially running two households, our combined CO2 output was 14.47 tonnes in 2019, or about 7.24 tonnes each. Compared to the UK average of 6.5 tonnes, or world average of 5 tonnes, that really doesn't look good. Especially when you think that we were trying our best to keep it low (for example, I don't run a car and subscribe to a fully renewable electricity plan).

We were determined to improve on this in 2020 and gave ourselves some targets to hit. Then of course 2020 turned out to be an atypical year, to put it mildly. We both spent the majority of the year working from home. For six months we were in separate countries unable to travel to see each other. And while this was bad in many ways, it did at least have an impressive effect on our carbon footprint.

With our ability to travel seriously curtailed, the numbers look very different for 2020. Here's the complete breakdown, including the respective values for 2019 and the goals that we set ourselves back in April.

| Source | Details for 2020 | CO2 output 2019 (t) | Goal for 2020 (t) | CO2 output 2020 (t) |
|---|---|---|---|---|
| Electricity | 1 427 kWh | 0.50 | 0.25 | 0.40 |
| Natural gas | 6 869 kWh | 1.18 | 1.18 | 1.26 |
| Flights | 4 return HEL-LON | 5.76 | 3.46 | 2.26 |
| Car | 2 000 km | 1.45 | 0.97 | 0.39 |
| Bus | 40 km | 0.00 | 0.00 | 0.01 |
| National rail | 400 km | 0.08 | 0.16 | 0.01 |
| International rail | 1 368 km | 0.02 | 0.04 | 0.01 |
| Taxi | 37 km | 0.01 | 0.02 | 0.01 |
| Food and drink | | 1.69 | 1.69 | 1.11 |
| Pharmaceuticals | | 0.26 | 0.26 | 0.32 |
| Clothing | | 0.03 | 0.03 | 0.06 |
| Paper-based products | | 0.34 | 0.34 | 0.15 |
| Computer usage | | 1.30 | 1.30 | 1.48 |
| Electrical | | 0.12 | 0.12 | 0.29 |
| Non-fuel car | | 0.00 | 0.00 | 0.10 |
| Manufactured goods | | 0.50 | 0.10 | 0.03 |
| Hotels, restaurants | | 0.51 | 0.51 | 0.16 |
| Telecoms | | 0.15 | 0.15 | 0.05 |
| Finance | | 0.24 | 0.24 | 0.24 |
| Insurance | | 0.19 | 0.19 | 0.11 |
| Education | | 0.05 | 0.05 | 0.00 |
| Recreation | | 0.09 | 0.09 | 0.06 |
| Total | | 14.47 | 11.14 | 8.50 |

In some areas we didn't hit our targets, but when it comes to travel we obliterated them. The final result is a combined carbon footprint of 8.5 tonnes of CO2, or 4.25 tonnes each. That's really quite good, taking us well below the UK (6.5 tonnes) and EU (6.4 tonnes) averages, and even taking us below the worldwide average of 5 tonnes.
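
For anyone wanting to check the arithmetic, here is a minimal Python sketch that re-adds the columns of the table and lists the categories where the 2020 output ended up above the goal. The names `ROWS`, `column_totals` and `missed_goals` are purely illustrative assumptions. Because the per-category figures are already rounded to two decimal places, the recomputed totals can differ from the published ones by about 0.01 tonnes.

```python
# Per-category CO2 figures in tonnes, copied from the table:
# (source, output 2019, goal for 2020, output 2020)
ROWS = [
    ("Electricity", 0.50, 0.25, 0.40),
    ("Natural gas", 1.18, 1.18, 1.26),
    ("Flights", 5.76, 3.46, 2.26),
    ("Car", 1.45, 0.97, 0.39),
    ("Bus", 0.00, 0.00, 0.01),
    ("National rail", 0.08, 0.16, 0.01),
    ("International rail", 0.02, 0.04, 0.01),
    ("Taxi", 0.01, 0.02, 0.01),
    ("Food and drink", 1.69, 1.69, 1.11),
    ("Pharmaceuticals", 0.26, 0.26, 0.32),
    ("Clothing", 0.03, 0.03, 0.06),
    ("Paper-based products", 0.34, 0.34, 0.15),
    ("Computer usage", 1.30, 1.30, 1.48),
    ("Electrical", 0.12, 0.12, 0.29),
    ("Non-fuel car", 0.00, 0.00, 0.10),
    ("Manufactured goods", 0.50, 0.10, 0.03),
    ("Hotels, restaurants", 0.51, 0.51, 0.16),
    ("Telecoms", 0.15, 0.15, 0.05),
    ("Finance", 0.24, 0.24, 0.24),
    ("Insurance", 0.19, 0.19, 0.11),
    ("Education", 0.05, 0.05, 0.00),
    ("Recreation", 0.09, 0.09, 0.06),
]


def column_totals(rows):
    """Return (total 2019, total goal, total 2020), each rounded to 2 d.p."""
    totals = [0.0, 0.0, 0.0]
    for _name, *values in rows:
        for i, value in enumerate(values):
            totals[i] += value
    return tuple(round(t, 2) for t in totals)


def missed_goals(rows):
    """Return the sources whose 2020 output was above the 2020 goal."""
    return [name for name, _y2019, goal, actual in rows if actual > goal]


if __name__ == "__main__":
    total_2019, total_goal, total_2020 = column_totals(ROWS)
    print("Totals (t):", total_2019, total_goal, total_2020)
    # Two people share the footprint, hence the division by two.
    print("Per person in 2020 (t):", round(total_2020 / 2, 2))
    print("Goals missed:", ", ".join(missed_goals(ROWS)))
```

Keeping the rows as plain tuples makes it easy to paste in next year's figures and rerun the comparison.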

If 2020 had been a normal year we clearly would have struggled to keep our footprint so low. But it's all the same to the environment and so I'm glad for the improvement.

Turning to the future, the real question will be whether we can sustain this same low level in 2021. Given the uncertainty of what lies ahead and the peculiar circumstances we experienced last year, it doesn't seem sensible to try to set a lower target, but rather to simply aim to match what we did in 2020 and see how we get on with that.

If you're interested in calculating your own carbon footprint, I can recommend the Carbon Footprint Calculator I used to compile the values here. It really made the process surprisingly painless.
1 Nov 2020 : Occasional Tomb Raider update #
I've been keeping up-to-date with the Tomb Raider series, completing all of the games and raiding all of the tombs, for the last 17 years (yes, you read that right). Not all of the games in the series have made it on to my blog, but as both the games and the gaps between them have grown longer, it's given me more time to reflect on the journey.

The first game was released in 1996 but my first blog post on the topic wasn't until 2013, which was already a decade after my first Tomb Raiding experiences at university. By then I'd already completed fourteen games. More recently in 2017 I completed Rise of the Tomb Raider. This afternoon I reached the (ever-rising) summit once again by completing Shadow of the Tomb Raider, the latest in the series. As far as I'm aware there's no public plan or date for the next release, so I guess I'll get a bit of a break before the next one.
I like to categorise the series into six chunks:
  1. Original Game (from the very first Tomb Raider all the way through to Chronicles)
  2. Angel of Darkness
  3. First reboot (Legend, Anniversary, Underworld)
  4. Interlude (Guardian of Light, Temple of Osiris)
  5. Interlude (Relic Run, Go)
  6. Second reboot (Tomb Raider, Rise, Shadow)
There are also the Game Boy games, but I never played them as I didn't own the hardware, so I pretend to myself that these don't exist.

My feelings on each of these are mixed. Obviously I loved the original series. Even though I played them in the wrong order and well past their release dates, I loved the exactness of the game play (walk to edge, one jump back, run and jump). It was also mesmerising for me to see how they managed to achieve so much within the constraints of the technology. These were the "Integer" games with rectilinear grid-based maps. Yet the designers managed to harness them to represent not just tombs but also jungles, deserts, stately homes, cathedrals, alien spaceships and crumbling Scottish castles. They represent a masterclass in squeezing atmosphere from a very constrained set of tools. No game before managed to capture the magic of walking from a tight corridor into a huge cavernous room quite like these games.

Angel of Darkness stands on its own. It almost sank the franchise because of the terrible reviews. It certainly had its problems but I loved it in spite of them. It was the first of the games to really try to include a serious story, and the Louvre is a handsome building to break into. I would have loved to find out where Core Design would have taken it if Eidos hadn't given the licence to Crystal Dynamics.

The first reboot was enjoyable but somewhat unmemorable for me. It was a worthy reboot, capturing much of the original's atmosphere, but the stories fell halfway between the original and the Angel of Darkness approach: trying to be serious but ultimately a bit too slick. The game play did, however, manage to bridge the divide between the very exacting original games and the more free-flowing second reboot. If the original games were the Integers, these games were a shift to the Rationals.

And now the second reboot. To complete my analogy with number systems, these games are the Reals. They lack any sort of exactitude, and it never feels like you're entirely in control. On the other hand, this makes them feel far less digital and much closer to real life. The design, artwork and effects are beautiful. The stories are supposed to be tales of growth and coming of age, but I'm not really sure they quite succeed. Others have claimed that the attempt at serious narrative never quite gels with the absurd situations and gameplay. And I think I agree with that criticism. Still, they captured my interest enough for me to pour many hours of my life into completing them, and it's possible to enjoy the game while still having to separate the story from the action.
Shadow of the Tomb Raider, the game I've just completed, has lost almost all of the levity of the original games. But when you look at it from a technical or graphical perspective, it's an astonishing piece of work. The lighting, architecture, flora and fauna are phenomenal. The small touches of interaction (Lara's hands pressing against the walls, foliage moving aside as you walk through it, birds scattering, rainbows appearing in the spray of a waterfall) make things feel very vivid and alive. On the other hand interactions with other characters are less convincing. If I'm honest, solitude is part of the appeal of the games, and if the village hubs had been removed leaving only the main story and challenge tombs, I'd have been quite satisfied.

So, here's my full list of completed Tomb Raider games. Shadow of the Tomb Raider is the last in the trilogy, so I wonder where it will go from here.
  1. Tomb Raider.
  2. Unfinished Business and Shadow of the Cat.
  3. Tomb Raider II: Starring Lara Croft.
  4. Tomb Raider III: Adventures of Lara Croft.
  5. The Golden Mask.
  6. Tomb Raider: The Last Revelation.
  7. Tomb Raider: The Lost Artefact.
  8. Tomb Raider Chronicles.
  9. Tomb Raider: The Angel of Darkness.
  10. Tomb Raider Legend.
  11. Tomb Raider Anniversary.
  12. Tomb Raider Underworld.
  13. Lara Croft and the Guardian of Light.
  14. Tomb Raider (reboot).
  15. Lara Croft and the Temple of Osiris.
  16. Rise of the Tomb Raider.
  17. Shadow of the Tomb Raider.
29 Oct 2020 : Dishwasher or washing up bowl. Which is really better for the environment? #
Last week I considered whether I should be buying stuff in plastic packaging in preference to glass. So since I've started this game, I thought it would be good to move on a step and look at another part of my life.

The rented flat where I live comes with a dishwasher, but I've never actually used it. The main reason is that I don't have enough crockery to fill it, but maybe I should? I've been told in discussion, and also by advertisements, that using a dishwasher is more ecological than washing up by hand. This always seemed a bit implausible to me, but maybe it's true?

Let's find out.
Dishwasher and sink

First of all, how much energy is needed to do a batch of washing up? This depends on what you do, but my washing up regime is pretty consistent: I fill the sink with water that's as hot as I'm comfortable splooshing around in. I never use more than one sink's-worth since, as I already mentioned, I don't have much crockery anyway.

To work out how much energy it takes for me to wash up we need two things: the amount of water, and the temperature increase of the water.

For the amount I filled the sink using my kettle. It took a total of six kettle-cycles. Each cycle I weighed the kettle before and after and recorded the weight difference. Adding up all of these differences gave me the total weight of water that went into the sink: $10.234\ {\rm kg}$.

The temperature I find comfortable in the sink is $38^{\circ} {\rm C}$, which is a rise of $18^{\circ} {\rm C}$ (or 18 Kelvin) above room temperature.

A quick skim of the Web reveals that the specific heat capacity of water at this temperature is $4179.6\ {\rm J}\ {\rm kg}^{-1}\ {\rm K}^{-1}$.

So, to calculate the energy $E_S$ required (where $S$ stands for sink), we need to multiply everything together like so.
$$
\begin{align*}
E_S &= 4179.6\ {\rm J}\ {\rm kg}^{-1}\ {\rm K}^{-1} \times 10.234\ {\rm kg} \times 18\ {\rm K} \\
&= 769932\ {\rm J} \\
&= 770\ {\rm kJ}.
\end{align*}
$$
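
The same sum is easy to reproduce in a few lines of Python, which is handy if your sink or preferred temperature differs. This is only a sketch: the constant and function names are made up for illustration, and it assumes the specific heat capacity stays constant over the temperature range.

```python
# Specific heat capacity of water quoted in the text, in J / (kg K).
SPECIFIC_HEAT_WATER = 4179.6


def heating_energy_joules(mass_kg, delta_t_kelvin,
                          specific_heat=SPECIFIC_HEAT_WATER):
    """Energy needed to heat a mass of water by a temperature difference."""
    return specific_heat * mass_kg * delta_t_kelvin


# Measured values from the text: 10.234 kg of water raised by 18 K.
e_sink_joules = heating_energy_joules(10.234, 18)

if __name__ == "__main__":
    print(f"E_S = {e_sink_joules:.0f} J, about {e_sink_joules / 1000:.0f} kJ")
```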
My sink, useful for washing up

That's the first half of our comparison. Now we need the energy $E_D$ used by the dishwasher (I'll leave you to figure out what the $D$ stands for). The dishwasher is an AEG F77420W0P (energy efficiency class A++) and luckily the dishwasher manual has a handy table that lists the energy requirements of the different modes. The table only has the values in kilowatt hours, but this is just a different unit for measuring energy. In fact $1\ {\rm kWh} = 3.6 \times 10^6\ {\rm J}$, so we can calculate the kJ by multiplying the kWh values by 3600.
| Mode | Energy (kWh) | Energy (kJ) | Water (l) |
|---|---|---|---|
| ECO | 0.7 | 2520 | 9.9 |
| Auto | 0.5–1.2 | 1800–4320 | 6.0–11.0 |
| PRO | 1.3–1.4 | 4680–5040 | 11.0–13.0 |

I don't know what these different modes — ECO, Auto and PRO — are for, but let's assume we'd be using the ECO setting. This means that for my dishwasher, in ECO mode, we have $E_D = 2520\ {\rm kJ}$.
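
As a quick check of the unit conversion, the following Python sketch turns the kWh ranges from the manual's table into kJ. The dictionary layout and the names `MODES_KWH` and `kwh_to_kj` are assumptions made for this example; only the numbers come from the table.

```python
# 1 kWh = 3.6e6 J = 3600 kJ
KJ_PER_KWH = 3600

# (minimum, maximum) energy per load in kWh, as listed in the manual's table.
MODES_KWH = {
    "ECO": (0.7, 0.7),
    "Auto": (0.5, 1.2),
    "PRO": (1.3, 1.4),
}


def kwh_to_kj(kwh):
    """Convert an energy in kilowatt hours to kilojoules."""
    return kwh * KJ_PER_KWH


if __name__ == "__main__":
    for mode, (low, high) in MODES_KWH.items():
        print(f"{mode}: {kwh_to_kj(low):.0f} to {kwh_to_kj(high):.0f} kJ")
```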

And now we have what we need to do a comparison.
My dishwasher, very shiny

A washing up session takes $770\ {\rm kJ}$ whereas a dishwasher load takes $2520\ {\rm kJ}$; one dishwasher load is the equivalent of 3.27 sinks of washing up. My dishwasher is of the slim variety, but it still holds up to 14 plates, plus a bunch of other stuff. A full load therefore works out at $2520 / 14 = 180\ {\rm kJ}$ per plate, and $770 / 180 \approx 4.3$. So if I wash at least 5 plates with each sink of water, then the sink will end up being more ecological in the long run. That's not unreasonable and suggests to me that in fact, the sink and dishwasher are fairly similar in terms of their energy use.
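
The break-even point can be sketched in Python as follows. It assumes the dishwasher is always run full with 14 plates and counts only plates, ignoring the "bunch of other stuff", so treat it as a rough guide rather than a precise model. All names are illustrative.

```python
import math

E_SINK_KJ = 770          # energy for one sink of hot water
E_DISHWASHER_KJ = 2520   # energy for one ECO dishwasher load
DISHWASHER_PLATES = 14   # plates in a full (assumed) dishwasher load


def sinks_per_load():
    """How many sinks of washing up use the same energy as one load."""
    return E_DISHWASHER_KJ / E_SINK_KJ


def break_even_plates():
    """Smallest number of plates per sink at which the sink uses less
    energy per plate than a full dishwasher load."""
    dishwasher_kj_per_plate = E_DISHWASHER_KJ / DISHWASHER_PLATES
    return math.floor(E_SINK_KJ / dishwasher_kj_per_plate) + 1


if __name__ == "__main__":
    print(f"One load is worth {sinks_per_load():.2f} sinks")
    print(f"The sink wins from {break_even_plates()} plates per sink")
```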

However, another factor is the water usage. The manual states that 9.9 litres of water are needed for an ECO load. That's the same amount as a single dish washing session in the sink, so the comparison here is in favour of the dishwasher.

To summarise, it does indeed seem that if you're doing a full load, you'd be better off (environmentally speaking) using the dishwasher. If you're doing less than a full load, the sink could well be better.

None of this includes the energy needed to build the dishwasher. According to this article in The Guardian, for an appliance that lasts 10 years this could add an extra 20% environmental cost, but I've not seen the calculations and I couldn't find the actual figures for my dishwasher, so I'm not including that here.

These numbers are also all rather specific to my situation of course. A bigger dishwasher might be more efficient. For me, it's a little academic, since it would be impossible for me to fill the dishwasher, so the future for me is clear: more washing up.

26 Oct 2020 : Glass or plastic. Which is really better for the environment? #
For the last 14 months I’ve been collecting data about how much rubbish I produce, broken down into various categories (paper, card, glass, metal, returnables, compost, plastic and general). I’ve had two aims: first to gather data about how much rubbish I generate and second to try to reduce my overall output for environmental reasons.

One of the encouraging things about this process is that it seems to have worked. If I look at my waste output between mid-August and mid-October 2020 and compare it to the same period last year, my output has reduced from an average of 366 g per day to 126 g per day, a two thirds decrease. Here’s the breakdown of how the two years compare across the categories.
Waste output by category between August and October, comparing 2019 and 2020

I’ve been using a variety of different techniques to achieve this. For example my tolerance for eating food past its best-before date has increased considerably. There’s a sticker above my letter box asking not to receive any junk mail. I also buy food with lighter packaging: cardboard packets of beans instead of tins, cartons of wine instead of bottles. Wherever possible I buy plastic pots and bottles instead of glass.

Glass is really heavy, so cutting it out has been a really easy way to reduce the weight of my waste and as you can see from the graph, this is where I made my biggest decrease. But for many this choice will seem controversial, and many times when I’ve picked a plastic bottle from the shelf at the grocer instead of glass, I’ve wondered whether I was driven more by hitting my weight targets than any real environmental benefits.

So I thought I’d better look into the relative environmental impacts of glass as compared to plastic. Plastic has had a bad rap recently for having a terrible impact on the marine environment. But this is rather emotive, and is only one facet of the environmental impact of a product. Actually figuring out the full life cycle environmental impact of something is fiendishly difficult. You have to consider the production costs, transportation costs, recycling costs and much more besides. Happily Roberta Stefanini, Giulia Borghesi, Anna Ronzano and Giuseppe Vignali from the University of Parma have done all of this hard work already. Their paper “Plastic or glass: a new environmental assessment with a marine litter indicator for the comparison of pasteurized milk bottle”, recently published in the International Journal of Life Cycle Assessment, compares the environmental impact of glass and plastic polyethylene terephthalate (PET) across a range of environmental factors for the full life cycle of the packaging. This includes comparing non-recycled PET with recycled PET (R-PET) bottles, as well as non-returnable glass and returnable glass bottles.

The indicators used for comparison are “global warming (kg CO2 eq), stratospheric ozone depletion (kg CFC11 eq), terrestrial acidification (kg SO2 eq), fossil resource scarcity (kg oil eq), water consumption (m3) and human carcinogenic toxicity (kg 1.4-DCB)”. In addition they also introduce a new marine litter indicator (MLI).

What they find is surprisingly clear-cut. Across all of the indicators apart from MLI the same pattern emerges: R-PET is the least environmentally damaging, followed by PET. Returnable glass bottles follow, with non-returnable glass bottles the worst by a large margin. We can see this in the six graphs below. There’s a lot of detail in them, but I wanted to include them in full because it’s fascinating to see both how complex the results are and also how the different processes contribute to the final environmental cost. But in spite of the detail the overall conclusion from each graph is clear: non-returnable glass is worse than the others (in all of the graphs higher is worse).
Global warming of different packaging solutions stages Stratospheric ozone depletion for each stage
Terrestrial acidification for each stage Fossil resource scarcity for each stage of different packaging solutions
Water consumption for each stage of different packaging solutions Human carcinogenic toxicity for each stage of different packaging solutions

It’s a surprisingly definitive set of results. So why is it like this? The authors of the paper put this more clearly and succinctly than I could manage.
"glass bottles have the highest impact on environment, because of their production and transports. In fact, to create a glass bottle a lot of energy is used to reach high temperature. Moreover, plastics can be transported in octabins before the bottle formation in the food companies, while glass bottles are already transported in their final form, that takes up a lot of places and less bottles can be carried at each journey. Finally, glass bottle’s weight is very high, and trucks consume more, emitting more pollutants. For these reasons, glass bottle appears as the most impactful material according to global warming, stratospheric ozone depletion, terrestrial acidification, fossil resource scarcity and water consumption."

It’s worth noting that in the case of returnable glass bottles the authors assume that a bottle is reused eight times before having to be recycled. This is the number of reuses after which a bottle is likely to become broken or too scuffed to be used again. They determine that a bottle would have to be reused thirty times before its global warming potential reaches similar levels to those of a PET bottle, at which point the other criteria would still be worse environmentally.

The remaining criterion, not shown in these graphs, is that of the MLI. Here things change. MLI is proposed in the paper as an approach to comparing the relative impact on the marine ecosystem of the different packaging types. MLI is defined as follows:
{\rm MLI} = \frac{F_1^{f_1} \times F_4^{f_4}}{F_2^{f_2} \times F_3^{f_3}}
where $F_1$ is the number of dispersed containers, $F_2$ is the incentive for returning a bottle (e.g. the cash given for returning it), $F_3$ is the weight of the packaging material and $F_4$ is the material degradation over time (400 years in the case of glass, 100 years for PET). The values $f_1, \ldots, f_4$ are weights used to capture the relative importance of each of the four inputs.
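
To get a feel for how the weights affect the result, here's a minimal sketch of the formula as a JavaScript function. The function name and the input values in the usage lines are hypothetical, chosen only to show the shape of the calculation; they are not the figures used in the paper.

```javascript
// MLI = (F1^f1 * F4^f4) / (F2^f2 * F3^f3)
// F = [F1, F2, F3, F4]: dispersed containers, return incentive,
//                       packaging weight, degradation time.
// f = [f1, f2, f3, f4]: the weights applied to each input.
function marineLitterIndicator(F, f) {
  const [F1, F2, F3, F4] = F;
  const [f1, f2, f3, f4] = f;
  return (F1 ** f1 * F4 ** f4) / (F2 ** f2 * F3 ** f3);
}

// Hypothetical inputs, purely for illustration.
const exampleInputs = [2, 4, 1, 3];
console.log(marineLitterIndicator(exampleInputs, [1, 1, 1, 1]));
console.log(marineLitterIndicator(exampleInputs, [2, 1, 1, 1]));
```

Because the incentive and the weight sit in the denominator, increasing either one lowers the indicator, while more dispersed containers or a longer degradation time raise it.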

The results for various weightings are given in this table (taken from the paper but amended slightly for clarity). As with the graphs, a higher number is environmentally worse.
MLI weights $f_1, \ldots, f_4$ | PET | R-PET | Non-returnable glass | Returnable glass
3, 2, 1, 2 | 0.56 | 0.56 | 19.47 | 0.78
2, 2, 1, 1 | 5.56 | 5.56 | 21.16 | 0.85
1, 1, ½, 1 | 0.75 | 0.75 | 4.60 | 0.92
2, 2, ½, 1 | 1.24 | 1.24 | 21.16 | 0.85
2, 3, 1, 2 | 0.93 | 0.93 | 105.80 | 0.85

This table shows that independent of the weights applied, non-returnable glass has the highest environmental impact. However, the comparison between R-PET and returnable glass is more nuanced. The authors conclude the following:
“According to the MLI proposed, the best solution would be using returnable glass bottles, thanks to the low number of bottles needed and therefore dispersed, their weight and return incentives. However, it is important to remember that the environmental dispersion of bottle is strictly related to human’s behaviour: consequently, it is important to raise the consumers’ awareness on this topic.”

The paper is thorough and we’ve covered a lot of detail here, but the conclusion for me is much simpler: from an environmental perspective recycled PET plastic is clearly better than glass across multiple criteria. The only place where this doesn’t apply is for MLI, for which it’s much harder to make definitive judgements.

It seems therefore, that I should carry on choosing plastic packaging over glass whenever possible. That will benefit both my weight targets and the environment.
11 Oct 2020 : The danger of a non-transparent AI Register #
The cities of Helsinki and Amsterdam recently announced the launch of their local government AI Register (Helsinki) and Algorithm Register (Amsterdam). This is certainly forward-looking and with positive aims, but actually looking through the registers, I was surprised and a little perturbed by how vague the entries are.

If the purpose of the registers is to promote accountability then it concerns me that the current implementation only provides the veneer of transparency. If government is claiming to provide transparency when it's not, however well-intentioned, this can lead to more harm than good.

Here's the feedback I sent to the city administrations and to the company running the registers. I'm not really expecting any results, but writing out my concerns was extremely therapeutic, albeit also quite time consuming. I recommend it as a satisfying activity if you have the time to spare.
With the recent establishment of your AI/algorithm registers, it’s great that you're taking the transparency of automated processes seriously. I hesitate therefore to criticise the schemes which are clearly well-intentioned and a step in the right direction, but it concerns me — based on the data currently available in the registers — that in their current form they may do more harm than good.

My three main concerns are the following.

1. Confusion between AI and algorithms. These two things are not the same, and conflating the two degrades public understanding of the issues involved. Algorithms cover a very broad set of concepts that includes every piece of software in use today. AI (or more specifically Machine Learning) is a much narrower concept. Machine learning involves applying an algorithm to a dataset, in order to produce a separate algorithm that can then be used as the basis for decision-making (or some other task). The resulting algorithms are much more opaque, their biases much harder to understand, and the datasets much more important for providing that understanding. Right now the register seems to include a mixture of both machine learning and traditional algorithms, but without any clarity over which is which. For each of the entries it should be made clear whether machine learning is involved, and if so what type.

2. Providing the algorithms. The entries in the database provide only a very high-level overview of the algorithms being used. Frankly, these are of no real use without more detail and the code for the algorithms needs to be made available. I’m very aware that commercial sensitivity is often used as an argument for why this can't be done, but as someone who works for a company developing open source software, I’m also aware that keeping algorithms and datasets private isn’t the only way to run a commercial or public service. If the register is to have real benefit, Helsinki and Amsterdam cities need to apply pressure to companies to make their algorithms available, or else give preference to those companies that will. Otherwise the register will end up being no more than a list of names of companies supplying software to local government.

3. Providing the datasets. If the algorithms are machine learning algorithms, then the full datasets used for training need to be made available (or a recent snapshot in the case of dynamic learning). Consideration must be given to privacy, and this is a real challenge, but the good news is that there’s a wealth of existing good practice in this area, especially coming from universities with their growing culture of open data for validating research, supported and encouraged by EU funding requirements.

To reiterate, I applaud the idea behind the registers, but I’d also encourage you to go further in order to allow them to be the real tools of accountability that the public needs, and that I think you're aiming for.

I was pleased to complete the survey on your site, where I also entered these comments. When you reply please let me know if you prefer me not to make our correspondence public (I’ll assume that it’s okay for me to do so unless you state otherwise).

Thanks for your efforts with the registers, which I wish you every success with, and for taking the time to read this message.

25 Apr 2020 : The cold hard truth about my carbon footprint #
Understanding our impact on the environment has always been hard, and I've been lucky enough to live through several iterations of what being green means. At one point environmental impact was measured by the number of aerosols you used. Then it was based on how acidic you made the rain. Then it was the type of detergent you used. There were no doubt many in between that I've forgotten.

The latest metric is that of our carbon footprint: how much CO2 a person produces each year. It certainly has advantages over some of the others, for example by being measurable on a continuous scale, and by capturing a broader range of activities. But at the same time it doesn't capture every type of ecological damage. Someone with zero carbon footprint can still be destroying the ozone layer and poisoning the world's rivers with industrial waste.

Still, even if it's one of many metrics that captures our harm to the environment, it's still worth tracking it in the hope of reducing our overall impact.

With that in mind I recently calculated my carbon footprint using the aptly named "Carbon Footprint Calculator" provided by a company aptly named "Carbon Footprint Ltd.".

I actively try to reduce my carbon emissions, for example by using electricity from a renewable provider, and by walking, cycling or using public transport rather than driving. However I also have a rented flat in Finland (where I live and work), alongside a house in the UK (where my wife lives and works). Travelling between Cambridge and Tampere by boat and train is a three-day odyssey, compared to 11 hours by plane, so I fly much more than I should. Joanna and I don't really enjoy the carbon-footprint benefits of having two or more people living in a single home. Of course, the environmental consequences don't really care why the CO2 is being produced, only that it is, so we need to take an honest look at the output we're producing.

Here's a breakdown of our impact as determined by the calculator.
Source | Details | CO2 output 2019 (t) | Goal for 2020 (t)
Electricity | 1 794 kWh | 0.50 | 0.25
Natural gas | 6 433 kWh | 1.18 | 1.18
Flights | 10 return HEL-LON | 5.76 | 3.46
Car | 11 910 km | 1.45 | 0.97
National rail | 1 930 km | 0.08 | 0.16
International rail | 5 630 km | 0.02 | 0.04
Taxi | 64 km | 0.01 | 0.02
Food and drink | | 1.69 | 1.69
Pharmaceuticals | | 0.26 | 0.26
Clothing | | 0.03 | 0.03
Paper-based products | | 0.34 | 0.34
Computer usage | | 1.30 | 1.30
Electrical | | 0.12 | 0.12
Manufactured goods | | 0.50 | 0.10
Hotels, restaurants | | 0.51 | 0.51
Telecoms | | 0.15 | 0.15
Finance | | 0.24 | 0.24
Insurance | | 0.19 | 0.19
Education | | 0.05 | 0.05
Recreation | | 0.09 | 0.09
Total | | 14.47 | 11.14

Given the effort we put in to reducing our footprint, this feels like a depressingly high total. The average for two people in our circumstances is 15.16 tonnes, but the worldwide average is 10.0 tonnes, and the target needed to combat climate change is 4.0 tonnes per year. So we are way off where we really need to be.
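
If you want to run the same kind of check on your own figures, here's a small JavaScript sketch that adds up the 2019 column from the table and picks out the largest contributor. The data structure and variable names are just one illustrative way of laying it out; the numbers are copied from the table above.

```javascript
// 2019 CO2 output in tonnes, as [source, tonnes] pairs from the table.
const footprint2019 = [
  ["Electricity", 0.50], ["Natural gas", 1.18], ["Flights", 5.76],
  ["Car", 1.45], ["National rail", 0.08], ["International rail", 0.02],
  ["Taxi", 0.01], ["Food and drink", 1.69], ["Pharmaceuticals", 0.26],
  ["Clothing", 0.03], ["Paper-based products", 0.34], ["Computer usage", 1.30],
  ["Electrical", 0.12], ["Manufactured goods", 0.50], ["Hotels, restaurants", 0.51],
  ["Telecoms", 0.15], ["Finance", 0.24], ["Insurance", 0.19],
  ["Education", 0.05], ["Recreation", 0.09],
];

// Sum the column, rounding to two decimals to hide floating-point noise.
const rawTotal = footprint2019.reduce((sum, [, tonnes]) => sum + tonnes, 0);
const total2019 = Math.round(rawTotal * 100) / 100;

// Find the single largest source and its share of the total.
const [largestSource, largestTonnes] = footprint2019.reduce(
  (best, entry) => (entry[1] > best[1] ? entry : best)
);
const largestShare = Math.round((largestTonnes / rawTotal) * 100);

console.log(total2019, largestSource, largestShare + "%");
```

Laid out like this it's easy to see that flights dominate, accounting for roughly two fifths of the total.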

How could we get it down to an ecologically-safe level? Well the cold hard truth is that right now, we couldn't. Even if we took no more flights, converted our gas boiler to a renewable energy source and stopped commuting by car, that would still leave our joint carbon footprint at 6.39 tonnes for the year. Too much.

The danger is that we become nihilistic about it, so we need to set realistic goals and then just try to continue to bring it down over time. Joanna and I have been through and worked out what we think we can realistically achieve this year. The COVID-19 pandemic turns out to have some positives here, since we're not commuting or flying at all right now. We think we can realistically bring our combined carbon footprint down to 11.2 tonnes for 2020, and that's what we'll be aiming to do.

The reality is that reducing our CO2 to a sensible level is hard, and it's going to get harder. I'm hoping having something to aim for will help.
13 Apr 2020 : How to build a privacy-respecting website #
Even before mobile phones got in on the act, the Web had already ushered in the age of mass corporate surveillance. Since then we've seen a bunch of legislation passed, such as the EU ePrivacy Directive and more recently the GDPR, aiming to give Web users some of their privacy back.

That's great, but you might imagine a responsible Web developer would be aiming to provide privacy for their users independent of the legal obligations. In this world of embedded JavaScript, social widgets, mixed content and integrated third-party services, that can be easier said than done. So here's a few techniques a conscientious web developer can apply to increase the privacy of their users.

All of these techniques are things I've applied here on my site, with the result that I can be confident web users aren't being tracked when they browse it. If you want to see another example of a site that takes user privacy seriously, take a look at how Privacy International do it (and why).

1. "If you have a GDPR cookie banner, you're part of the problem, not part of the solution"

It's tempting to think that just because you have a click-through GDPR banner with the option of "functional cookies only" that you're good. But users have grown to hate the banners and click through instinctively without turning off the tracking. These banners often reduce users' trust in a site and the web as a whole. What's more, on a well designed site they're completely unnecessary (see 2). That's why you won't find a banner on this site.

2. Only set a cookie as a result of explicit user interaction

On this site I do use two cookies. One is set when you log in, the other if you successfully complete a CAPTCHA. If you don't do either of those things you don't get any cookies.

The site has some user-specific configuration options, such as changing the site style. I could have used a cookie to store those settings too (there's nothing wrong with that, it's what cookies were designed for), but I chose to add the options into the URL instead. However, if I had chosen to store the options in a cookie, I'd be sure only to set the cookie in the event the user actually switches away from the default.
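
As a rough illustration of keeping options in the URL rather than in a cookie, here's a minimal JavaScript sketch. The `style` parameter name, the `default` value and the example.com address are all hypothetical; this isn't the code this site actually uses, just one way the idea could work.

```javascript
// Hypothetical option name and default value, for illustration only.
const STYLE_PARAM = "style";
const DEFAULT_STYLE = "default";

// Return the page URL with the chosen style encoded in the query string.
// The default style adds nothing, so users who never change it carry no state.
function urlWithStyle(pageUrl, style) {
  const url = new URL(pageUrl);
  if (style === DEFAULT_STYLE) {
    url.searchParams.delete(STYLE_PARAM);
  } else {
    url.searchParams.set(STYLE_PARAM, style);
  }
  return url.toString();
}

// Read the style back out of a URL, falling back to the default.
function styleFromUrl(pageUrl) {
  return new URL(pageUrl).searchParams.get(STYLE_PARAM) || DEFAULT_STYLE;
}

console.log(urlWithStyle("https://example.com/blog?page=1", "dark"));
```

The same rule would apply to a cookie-based version: only write the cookie inside the branch where the user has actually picked a non-default option.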

In addition to these two cookies, I also use Disqus for comments, and this also sets cookies, as well as tracking the user. That's bad, but a necessary part of using the service. See section 5 below for how I've gone about addressing this.

3. Only serve material from a server you control

This is good for performance as well as privacy. This includes images, scripts, fonts, or anything else that's automatically downloaded as part of the page.

For example, many sites use Google Fonts, because it's such an excellent resource. But why does Google offer such a massive directory of free fonts? Well, I don't know if they do, but they could certainly use the server hits to better track users, and at the very least it allows them to collect usage data.

The good news is that all of the fonts have licences that allow you to copy them to your server and serve them from there. That's not encouraged by Google, but it's simple to do.

The same applies to scripts, such as jQuery and others. You can embed their copy, but if you want to offer improved privacy, serve it yourself.

Hosting all the content yourself will increase your bandwidth, but it'll also increase your users' privacy. On top of that it'll also provide a better and more consistent experience in terms of performance. Relying on a single server may sound counter-intuitive, but if your server isn't serving the content, all of the stuff around it is irrelevant already, so it's a single point of failure either way. And for your users, waiting for the very last font, image, or advert to download because it's on a random external server you don't control, even if it's done asynchronously, is no fun at all.

Your browser's developer tools are a great way to find out where all of the resources for your site are coming from. In Firefox or Chrome hit F12, select the Network tab, make sure the Disable cache option is selected, then press Ctrl-R to reload the page. You'll see something like this.
Using the developer tools to find external content

Check the Domain column and make sure it's all coming from your server. If not, make a copy of the resource on your server and update your site's code to serve it from there instead.
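
If you'd rather not scan the Domain column by eye, something like the following JavaScript sketch can do the filtering for you. The `externalHosts` function is a hypothetical helper, not part of any browser API; it simply takes a list of resource URLs and reports the hostnames that aren't yours.

```javascript
// Return the sorted, de-duplicated hostnames that differ from ownHost.
function externalHosts(resourceUrls, ownHost) {
  const hosts = new Set();
  for (const resource of resourceUrls) {
    const host = new URL(resource).hostname;
    // Skip entries without a hostname (e.g. data: URLs) and our own server.
    if (host && host !== ownHost) {
      hosts.add(host);
    }
  }
  return [...hosts].sort();
}

// Example with made-up URLs.
console.log(externalHosts([
  "https://example.com/style.css",
  "https://fonts.example.net/font.woff2",
  "https://cdn.example.org/script.js",
], "example.com"));

// In a browser console, the list of loaded resources can be taken from the
// Resource Timing API:
//   externalHosts(
//     performance.getEntriesByType("resource").map((entry) => entry.name),
//     location.hostname
//   );
```

An empty list means everything on the page was served from your own server.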

Spot the difference in the images below (click to enlarge) between a privacy-preserving site like DuckDuckGo and a site like the New York Times that doesn't care about its readers' privacy.
DuckDuckGo content source New York Times content source

4. Don't use third party analytics services

The most commonly used, but also the most intrusive, is probably Google Analytics. So many sites use Google Analytics and it's particularly nefarious because it opens up the door for Google to effectively track web users across almost every page they visit, whether they're logged into a Google service or not.

You may still want analytics for your site of course (I don't use it on my site, but I can understand the value it brings). Even just using analytics from a smaller company provides your users with increased privacy by avoiding all their data going to a single sink. Alternatively, use a self-hosted analytics platform like matomo or OWA. This keeps all of your users' data under your control while still providing plenty of useful information and pretty graphs.

5. Don't embed third-party social widgets, buttons or badges

Services can be very eager to offer little snippets of code to embed into your website, which offer things like sharing buttons or event feeds. The features are often valued by users, but the code and images used are often trojan horses to allow tracking from your site. Often you can get exactly the same functionality without the tracking, and if you can't then 2 should apply: make sure they're not able to track unless the user explicitly makes use of them.

For non-dynamic sharing buttons often the only thing needed is to move any script and images on to your server (see 3). But this isn't always the case.

For example, on this site I use Disqus for comments. Disqus is a notorious tracker, but as a commenting system it offers some nice social features, so I'd rather not remove it. My solution has been to hide the Disqus comments behind an "Uncover Disqus comments" button. Until the user clicks on the button, there's no Disqus code running on the site and no way for Disqus to track them. This fulfils my requirement 2, but it's also not an unusual interaction for the user (for example Ars Technica and Engadget are both commercial sites that do the same).

When you embed Disqus on your site the company provides some code for you to use. On my site it used to look like this:
<div id="disqus_thread"></div>
<script type="text/javascript">
var disqus_shortname = "flypig";
var disqus_identifier = "page=list&amp;list=blog&amp;list_id=692";
var disqus_url = "";

(function() { // DON'T EDIT BELOW THIS LINE
	var dsq = document.createElement("script"); dsq.type = "text/javascript"; dsq.async = true;
	dsq.src = "https://" + disqus_shortname + "";
	(document.getElementsByTagName("head")[0] || document.getElementsByTagName("body")[0]).appendChild(dsq);
})();
</script>

On page load this would automatically pull in the script, exposing the user to tracking. I've now changed it to the following.
<div id="disqus_thread"></div>
<a id="show_comments" href="#disqus_thread" onClick="return show_comments()">Uncover Disqus comments</a>
<script type="text/javascript">
    var disqus_shortname = "flypig";
    var disqus_identifier = "page=list&amp;list=blog&amp;list_id=692";
    var disqus_url = "";
    function show_comments() {
        // Hide the link once it has been clicked
        document.getElementById("show_comments").style.display = "none";
        // Only now inject the Disqus script into the page
        var dsq = document.createElement("script"); dsq.type = "text/javascript"; dsq.async = true;
        dsq.src = "https://" + disqus_shortname + "";
        (document.getElementsByTagName("head")[0] || document.getElementsByTagName("body")[0]).appendChild(dsq);
        // Returning false stops the browser following the link
        return false;
    }
</script>

The script is still loaded to show the comments, but now this will only happen after the user has clicked the Uncover Disqus comments button.

For a long time I had the same problem embedding a script for social sharing provided by AddToAny. Instead I now just provide a link directly out to AddToAny. This works just as well by reading the referer header rather than using client-side JavaScript and prevents any tracking until the user explicitly clicks on the link.

There are many useful scripts, services and social capabilities that many web users expect sites to support. For a web developer they can be so convenient and so hard to avoid that it's often much easier to give in, add a GDPR banner to a site, and move on.

6. Don't embed third-party adverts

Right now the web seems to run on advertising, so this is clearly going to be the hardest part for many sites. I don't serve any advertising at all on my site, which makes things much easier. But it also means no monetisation, which probably isn't an option for many other sites.

It's still possible to offer targeted advertising without tracking: you just have to target based on the content of the page, rather than the profile of the user. That's how it's worked in the real world for centuries, so it's not such a crazy idea.

Actually finding an ad platform that will support this is entirely another matter though. The simple truth is that right now, if you want to include third party adverts on your site, you're almost certainly going to be invading your users' privacy.

There are apparent exceptions, such as Codefund which claims not to track users. I've not used them myself and they're restricted to sites aimed at the open source community, so won't be a viable option for most sites.

Compared to many others, my site is rather simple. Certainly that makes handling my readers' privacy easier than for a more complex site. Nevertheless I hope it's clear from the approaches described here that there often are alternatives to just going with the flow and imposing trackers on your users. With a bit of thought and effort, there are other ways.
11 Apr 2020 : Google/Apple's “privacy-safe contact tracing”, a summary #
As I discussed yesterday, Google and Apple recently announced a joint privacy-preserving contact tracing API aimed at helping people find out whether they'd been in contact with someone who subsequently tested positive for COVID-19.

We've already relinquished so many rights in the fight against COVID-19, it's important that privacy isn't another one, not least because the benefit of contact tracing increases with the number of people who use it, and if it violates privacy it'll rightly put people off.

So I'm generally positive about the specification. It seems to be a fair attempt to provide privacy and functionality. Not only that, it's providing a benchmark for privacy that it would be easy for governments to fall short of if the spec weren't already available. Essentially, any government who now provides less privacy than this, is either incompetent, or has ulterior motives.

But what does the spec actually say? Apple and Google have provided a decent high-level summary in the form of a slide deck, from which the image below is taken. They've also published a (non-final) technical specification. However, for me the summary is too high-level (it explains what the system does, but not how it works) and the technical specs are too low-level (there's too much detail to get a quick understanding). So this is my attempt at a middle-ground.
A high-level overview of the approach

There are three parts to the system. There's the OS part, which is what the specification covers; there's an app provided by your regional health authority; and there's a server run by your regional health authority (or more likely, a company the health authority subcontracted to). They all act together to provide the contact tracing service.
  1. Each day the user's device generates a random secret $k$, which stays on the user's device for the time being.
  2. The device then broadcasts BLE beacons containing $h = H(k, c)$ where $H$ is a one-way hash function and $c$ is a counter. Since $k$ can't be derived from $h$, and since no pair of beacons $h_1, h_2$ can be associated with one another, the beacons can't in theory be used for tracking. This assumes that the BLE subsystem provides a level of tracking-protection, for example through MAC randomisation. Such protections don't always work, but at least in theory the contact-tracing feature doesn't make it any worse.
  3. The device also listens for any beacons sent out by other users and stores any it captures locally in a list $b_1, b_2, \ldots$.
  4. If a user tests positive for COVID-19 they are asked to notify the regional health authority through the app. This involves the app uploading their secret $k$ for the day to a central database run by the regional health authority (or their subcontractor). From what I can tell, neither Apple nor Google need to be involved in the running of this part of the system, or to have direct access to the database. Note that only $k$ is uploaded. Neither the individual beacons $h_1, h_2, \ldots$ sent, nor the beacons $b_1, b_2, \ldots$ received, need to be uploaded. This keeps data quantities down.
  5. Each day the user's phone also downloads a list $k_1, k_2, \ldots, k_m$ of secrets associated with people who tested positive. This is the list collated each day in the central database. These keys were randomly generated on the user's phone and so are pseudonymous.
  6. The user's phone then goes through the list and checks whether one of the $k_i$ is associated with someone they interacted with. It does this by re-calculating the beacons that were derived from this secret: $H(k_i, 1), H(k_i, 2), \ldots, H(k_i, n)$, where $n$ is the number of beacons generated per day, and compares each against every beacon it collected the same day.
  7. If there's a match $H(k_i, j) = b_l$, then the user is alerted that they likely interacted with someone who has subsequently tested positive. Because the phone also now knows the counter $j$ used to generate the match, it can also provide a time for when the interaction occurred.

This is a significant simplification of the protocol, but hopefully gives an idea of how it works. This is also my interpretation based on reading the specs, so liable to error. By all means criticise my summary, but please don't use this summary to criticise the original specification. If you want to do that, you should read the full specs.
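
To make the steps above a bit more concrete, here's a minimal sketch of the beacon generation and matching in JavaScript (for Node.js). It rests on several assumptions of mine: HMAC-SHA256 stands in for the one-way function $H$, the secrets are 16 random bytes, and there are 144 beacons per day. The real specification defines its own derivation functions and intervals, so treat this as an illustration of the idea rather than of the actual protocol.

```javascript
const { createHmac, randomBytes } = require("crypto");

// Assumed number of beacons per day (hypothetical, for illustration).
const BEACONS_PER_DAY = 144;

// h = H(k, c): HMAC-SHA256 is used here as a stand-in for H.
function beacon(k, c) {
  return createHmac("sha256", k).update(String(c)).digest("hex");
}

// Steps 6 and 7: re-derive every beacon for each published secret and
// look for any that this device heard during the day.
function findContacts(publishedKeys, heardBeacons) {
  const heard = new Set(heardBeacons);
  const matches = [];
  publishedKeys.forEach((k, keyIndex) => {
    for (let j = 1; j <= BEACONS_PER_DAY; j++) {
      if (heard.has(beacon(k, j))) {
        // The counter j indicates roughly when the contact happened.
        matches.push({ keyIndex, counter: j });
      }
    }
  });
  return matches;
}

// Step 1: two users each generate a random daily secret.
const aliceKey = randomBytes(16);
const bobKey = randomBytes(16);

// Steps 2 and 3: a third device hears one of Alice's beacons.
const heardBeacons = [beacon(aliceKey, 42)];

// Steps 4 and 5: both secrets are later published and downloaded.
const matches = findContacts([bobKey, aliceKey], heardBeacons);
console.log(matches); // one match, for the second key at counter 42
```

Notice that the matching happens entirely on the listening device: the beacons it heard never leave it, and only the daily secrets of people who tested positive are ever shared.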

Because of the way the specification is split between the OS and the app, the BLE beacons can be transmitted and received without the user having to install any app. It's only when the user tests positive and wants to notify their regional health authority, or when a user wants to be notified that they may have interacted with someone who tested positive, that they need to install the app. This is a nice feature as it means there's still a benefit even if users don't immediately install the app.

One of the big areas for privacy concern will be the behaviour of the apps provided by the regional health authorities. These have the ability to undermine the anonymity of the system, for example by uploading personal details alongside $k$, or by tracking the IP addresses as the upload takes place. I think these are valid concerns, especially given that governments are notorious data-hoarders, and that the system itself is unlikely to be built or run by a health authority. It would be a tragic missed opportunity if apps do undermine the privacy of the system in this way, but unfortunately it may also be difficult to know unless the sourcecode of the apps themselves is made available.
10 Apr 2020 : Initial observations on the joint Google/Apple “privacy-safe contact tracing” specification #

Apple and Google today announced a joint protocol to support contact tracing using BLE. You can read their respective posts about it on the Apple Newsroom and Google blog.

The posts offer some context, but the real meat can be found in a series of specification documents. The specs provide enough information about how the system will work to allow a decent understanding, albeit with some caveats.

With so much potential for misuse, and given that mistrust could lead to some people choosing not to use the system, it's great that Google and Apple are apparently taking privacy and interoperability so seriously. But I'm a natural sceptic, so whenever a company claims to be taking privacy seriously, I like to apply a few tests.
  1. Are the specs and implementation details (ideally sourcecode) freely and openly available?
  2. Is interoperability with other software and devices supported?
  3. Based on the information available, is there a more privacy-preserving approach that the company could have gone with, but chose not to?
The answers to these appear to be "yes" (but not the sourcecode), "mostly" and "no". It's quite unusual, even for companies like Apple that make bold claims about privacy, to satisfy any one of these, let alone more than one, so this is genuinely very encouraging. Based on the specs released so far, it seems that this has been a good-faith attempt to achieve both protection and privacy.

The catch is that the API defined by the specs provides only half of a full implementation. Apple and Google are providing an API for generating and capturing BLE beacons. They don't say what should happen to those beacons once they've been captured. Presumably this is because they expect this part of the system to be implemented by a third-party, most likely a regional public health authority (or, even more likely, a company that a health authority has subcontracted to).

Again, this makes sense, since different regions may want to implement their own client and server software to do this. In fact, by delegating this part of the system, Google and Apple strengthen their claim that they're acting in good faith. They're essentially encouraging public health authorities and their subcontractors to live up to the same privacy standards.

Apart from the privacy issues, my other main interest is in having the same system work on operating systems other than iOS and Android. My specific interest is for Sailfish OS, but there are other smartphone operating systems that people use, and locking users of alternative operating systems out of something like this would be a terrible result both for the operating system and for all users.

Delegation of the server and app portions to health authorities unfortunately makes it highly unlikely that alternative operating systems will be able to hook into the system. For this to happen, the health authority servers would also need to provide a public API. Google and Apple leave this part completely open, and the likelihood that health authorities will provide an API is unfortunately very slim.

I'd urge any organisation planning to develop the client software and servers for a fully working system to prove me wrong. Otherwise alternative operating system users like me could be left unable to access the benefits of the system. This reduces its utility for those users to nil, but it also reduces the effectiveness of the system for all users, independent of which operating system they use, because it increases the false negative rate.

There's one other aspect of the specification that intrigues me. In the overview slide deck it states that "Alice’s phone periodically downloads the broadcast beacon keys of everyone who has tested positive for COVID-19 in her region." (my emphasis). This implies some form of region-locking that's not covered by the spec. Presumably this is because the servers will be run by regional health authorities and so the user will install an app that applies to their particular region. There are many reasons why this is a good idea, not least because otherwise the amount of data a user would have to download to their device each day would be prohibitive. But there is a downside too. It essentially means that users travelling across regions won't be protected. If they interact with someone from a different region who tests positive, this interaction won't be flagged up by the system.

The spec is still very new and no doubt more details will emerge over the coming days and weeks. I'll be interested to see how it pans out, and also interested to see whether this can be implemented on devices like my Sailfish OS phone.
Reference to region-locking, taken from the overview slide deck
17 Mar 2020 : Everything about this site has changed #
Today is an important step forwards for this site. The whole site has just been moved from a Windows IIS server and generated using ASP reading data from an MS Access database, to a Linux server running Apache and served using PHP and a MySQL database. It's gone from WIAA to LAMP.

It's been written in ASP since 29th January 2006, when I converted it from the original static HTML. So this makes it the second major change since it started life on a Sun server at the University of Birmingham back in November 1998. From static, to ASP and now to PHP.

Hopefully the site will look and work the same, but in the background, everything has changed. Every piece of ASP code had to be re-written and there were also quite a few changes needed to get the CGI code to work. However, for the latter, fewer than you might expect given they were written for Windows in C. My decision to go with ASP back in 2006 may not have been the best one, but I made a better decision going for all open source libraries for my CGI code.

As well as converting the code, I also took the chance to improve it in places, with better configuration and slightly better abstraction. There's a short post covering my experiences of transitioning the code from ASP to PHP if you're interested. You can also read my original plan to convert the site when it became one of my New Year's Resolutions (one down, three to go).

The one external change you might notice is actually quite important. It's long overdue, but the site now finally has a TLS certificate. Combined with the removal of all tracking, I'm now happy with the security.

There may be some glitches to iron out with the new code, so if you notice strangeness, please let me know.
TLS and no tracking
22 Feb 2020 : These aren't the cookies you're looking for #
By far this is the best invitation to speak at a conference I've received. I wonder how much the attendees at the World Congress of Food would enjoy my talk about web browser state!
Please come and speak about cookies

I'm sure the conference itself will be very good and this is perhaps an understandable misunderstanding, but it's still quite funny (I decided not to accept).
17 Feb 2020 : Shower Gel or Soap: which is better financially and environmentally? #
Today I want to tackle one of the really big questions of our time: which is better, soap or shower gel?

For a long time I thought shower gel was basically just watered-down soap and therefore couldn't possibly be better value. I can add water to soap myself, thank you very much. But shower gel and soap are actually made in quite different ways. They're both produced through a process called saponification (yes, honestly; probably coined by the same person who came up with Unobtanium), whereby fat or oil reacts with an alkali. However, while the alkali used in the production of soap is sodium hydroxide, for liquid soaps potassium hydroxide is used instead.

Still, what you end up with in both cases is an emulsifier that makes it easier to remove oil related stuff from your skin. There are two questions which really interest me. First, which is the cheaper in use, and second which is the more environmentally friendly.

To answer the first, I performed a couple of experiments. I bought some basic soap and shower gel products from Lidl's Cien range (that's Cien, not Chien). I think it's fair to say they're both value products, which makes them great for comparison.
Shower gel (left) and soap (right)

Here are their vital stats (as of June 2019).
  1. Lidl Cien Shower Gel: 300 ml (309g actual contents weight) costing €0.89.
  2. Lidl Cien Soap: 2 bars, 150g each (140g actual contents weight) costing €0.87.
So, that's a pretty similar cost-to-weight ratio. The question then is which of the two will last longer in practice. That 300 ml bottle of shower gel (309g of contents) lasted me 19.5 days, whereas a single 140g bar of soap lasted 26 days. So that gives a pretty convincing answer from the results.
| | Cost per kg | Usage per day | Cost per day |
|---|---|---|---|
| Shower gel | €2.89 | 15.85g | €0.046 |
| Soap | €3.11 | 5.38g | €0.017 |
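
For anyone wanting to repeat the experiment with their own products, here's a small Python sketch of the arithmetic behind the table. The function name is just for illustration, and the results should agree with the table to within rounding.

```python
def summarise(price_eur, purchased_g, used_g, days):
    """Return (cost per kg, usage per day in g, cost per day) for one product."""
    cost_per_kg = price_eur / (purchased_g / 1000)
    usage_per_day = used_g / days
    cost_per_day = cost_per_kg * usage_per_day / 1000
    return cost_per_kg, usage_per_day, cost_per_day


# Shower gel: one 309g bottle for 0.89 euros, used up in 19.5 days
gel = summarise(0.89, 309, 309, 19.5)
# Soap: two 140g bars for 0.87 euros, one bar used up in 26 days
soap = summarise(0.87, 2 * 140, 140, 26)

for name, (per_kg, per_day_g, per_day_eur) in (("Shower gel", gel), ("Soap", soap)):
    print(f"{name}: {per_kg:.2f} EUR/kg, {per_day_g:.2f} g/day, {per_day_eur:.3f} EUR/day")
```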

These results are pretty clear-cut. I got through nearly three times as much shower gel per day compared to soap, making soap considerably less than half the cost of shower gel. So if your main concern is cost, soap is clearly the better option. Shower gel pretty much is watered-down soap after all.

But what about the environmental costs? There are many things to consider which make this a complex question and very challenging to answer. The transportation costs of soap will be less, because the daily weight used is less. However, in terms of the chemicals and energy needed for production, it's really hard to say.

The ingredients on the side of each packet aren't really very helpful, because the relative quantities are missing. Establishing the exact amounts turns out to be hard. However, I was able to get some relatively generic formulas from Ernest W. Flick's Cosmetic And Toiletry Formulations Volume 2. The formula for shower soap is given as follows.
| Ingredient | Quantity |
|---|---|
| Water and Preservative | 29.3% |
| MONATERI 951A | 20.8% |
| Sipon LSB | 17.9% |
| MONAMID 1089 | 5.0% |
| Ethylene Glycol Monostearate | 2.0% |

And here's the formula for shower gel.
| Ingredient | Quantity |
|---|---|
| Water | q.s. to 100% |
| Standapol T | 30% |
| Standapol E-1 | 10% |
| Lamepon S | 9% |
| Standamid LDO | 2.5% |
| Standamox LAO-30 | 3% |
| Sodium Chloride | 2% |
| Kathon CG | 0.05% |

The "q.s." here is an abbreviation of "quantum satis", meaning "the amount which is enough".

Frankly, the only ingredient that means anything to me is "Water". But at least that's something. Based on this, we can roughly conclude that soap is approximately 29% water, 71% other, whereas shower gel is approximately 43% water, 57% other. Combining this with the results from our experiment, we get the following:
| | Daily usage water | Daily usage other |
|---|---|---|
| Shower gel | 6.89g | 8.96g |
| Soap | 1.58g | 3.80g |
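
The split between water and everything else can be reproduced with a few lines of Python. This is only a sketch: it takes the shower gel's water content to be whatever remains after the listed ingredients, and the soap's to be the 29.3% "Water and Preservative" line, as in the text above.

```python
# Daily usage in grams, from the experiment
gel_usage_g = 15.85
soap_usage_g = 5.38

# Shower gel: water is "q.s. to 100%", so it's what's left after the other ingredients
gel_other_fraction = 0.30 + 0.10 + 0.09 + 0.025 + 0.03 + 0.02 + 0.0005
gel_water_fraction = 1 - gel_other_fraction
# Soap: the formula lists water (with preservative) directly
soap_water_fraction = 0.293

gel_water_g = gel_usage_g * gel_water_fraction
gel_other_g = gel_usage_g - gel_water_g
soap_water_g = soap_usage_g * soap_water_fraction
soap_other_g = soap_usage_g - soap_water_g

print(f"Shower gel: {gel_water_g:.2f}g water, {gel_other_g:.2f}g other per day")
print(f"Soap: {soap_water_g:.2f}g water, {soap_other_g:.2f}g other per day")
```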

So, whether you're concerned about the water requirements, the chemical usage or the transportation costs of either product, it looks pretty clear that soap is the better option in all cases. It's hard to get any accurate idea of how they compare environmentally, but we can conclude that the reduced amounts of soap used in practice are unlikely to be outweighed by differences in the production process.

Of course, this is based on my own usage, and on a particular product line. Maybe it's different for you or for different products. Nevertheless, this has convinced me and I know which one I'll be sticking to in future.

19 Jan 2020 : The journey from ASP to PHP #
Today I made a big step forwards in improving this website. For 14 years the site has run on an MS Access and ASP backend. Yes, that's ASP, not ASP.NET, which wasn't an option when I wrote the code. There were multiple reasons for me choosing ASP, but one of them was that, given the backing of Microsoft, it looked to have better long-term prospects than the open-source underdog PHP. Now I'm in the situation where I want to move the site over to a Linux server (primarily so I can get it TLS-enabled) and so it needs to be re-written in something that will run properly on Linux.

In order to minimise my effort, that means re-writing it in PHP. My prediction that ultimately ASP would prevail over PHP didn't quite pan out as I expected. But that's no bad thing. I'm not a fan of PHP particularly, but I'm even less a fan of ASP.

The conversion isn't just a matter of re-writing the ASP in PHP. I also need to convert the database from MS Access to MySQL. For this I've written a simple Python script that will do a complete transfer automatically. It's great because I can run it at any time to do the transfer, which is important given the site will continue to get updates (like this one) right until the switch finally happens.

Today's achievement was to finally get the site running using PHP. It's weird to see exactly the same pages popping out of completely different code running on completely different stacks. There remain a bunch of backend changes I still need to make (probably I'm no more than 20% of the way through), but this at least proves that the conversion is not only feasible, but seamlessly so.
The ASP site left, and the PHP site right

To my relief, the re-writing of the code from ASP to PHP has been surprisingly straightforward as well. Some of the key similarities:
  1. The structuring is very similar; almost identical. Code is interwoven into HTML, executed on request in a linear way, the resulting text output is the page the requester sees.
  2. Database access is using SQL, so no big changes there.
  3. Javascript and PHP are both curly-bracket-heavy, dynamically-typed, imperative languages.
  4. ASP and PHP both include files in a similar way, which should allow the file structures to remain identical.

In fact, the structure of the two codebases is so similar that it's been practically a line-by-line conversion.
The ASP code left, and the PHP code right

There are nevertheless some important differences, some of which you can see in the screenshot above.
  1. The most obvious visual difference is that all PHP variables must be prefixed with a $ symbol, whereas Javascript variables can just use pretty much any alphanumeric identifier.
  2. PHP concatenates strings using the . symbol, whereas Javascript uses the + symbol. This might seem like a minor change, but string concatenation is bread-and-butter site generation functionality, so it comes up a lot.
  3. Many Javascript types, including strings, are classes which come with their own bunch of methods. In contrast PHP seems to prefer everything to be passed as function parameters. For example: string.substring(start, end) vs. substr($string, $start, $length).
  4. PHP regex literals are given as strings, whereas in Javascript they have their own special syntax.
  5. Javascript has this nice Date class, whereas PHP seems happier relying on integer timestamps.
  6. Variable scoping seems to be different. This caused me the biggest pain, since ASP is more liberal, meaning with PHP more effort is needed passing variables around.

In practice, none of these changes are really that bad and I was able to convert my codebase with almost no thought. It just required going through and methodically fixing each of the lines in sequence. Most of it could even have been automated fairly easily.

However, as I go through converting the code I'm continually noticing both small and big ways to improve the design. Tighter SQL statements, clearer structuring, streamlining variable usage, better function structure, improved data encapsulation and so on. But in the first instance I'm sticking to this line-by-line conversion. Once it's up and running, I can think about refactoring and refining.

It feels like I'm making good progress on my plan to change the site. I was hoping to get it done by the end of January, and right now that's not looking out-of-the-question.
11 Jan 2020 : If my washing machine were a car, how fast would it travel? #
In Finland I live in a small flat, so spend more time in close proximity to my washing machine than I'd really like. But as the drum spun up to create its highest pitched whine this morning while I was cleaning my teeth, the speed of it impressed me.
My washing machine

So I wondered: if it were a vehicle, how fast would it be travelling? It shouldn't be too hard to calculate with the information available. What are the pieces needed? The radius of the drum and the angular velocity should be enough.

For the angular velocity we just need to check out the technical specs from the manual. That was easy as I already carry a copy around with me on my phone to help me figure out which programme to use.
The programme listing from the manual

Today I was running a 30°C Cotton wash, which spins at 1200 rpm.
1200 {\rm\ rpm} = \frac{1200 \times 2 \pi}{60 {\rm\ s}} = 40 \pi {\rm\ radian} {\rm\ s}^{-1}
There's nothing in the manual about the drum size, so I reverted to a tape measure.
Inside the drum with a tape measure

So that's a diameter of 47cm, or a radius of 0.235m. That's the inside of the drum of course, but that is the bit the clothes have to deal with.

This gives us a linear velocity of
40 \pi \times 0.235 = 29.5 {\rm\ m}{\rm s}^{-1} = 106.3 {\rm\ kph} = 66.1 {\rm\ mph}.
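
If you'd like to try this with your own machine, the calculation is easy to reproduce. Here's a short Python sketch using the spin speed from the manual and my tape-measure diameter. The variable names are just for illustration.

```python
import math

rpm = 1200            # spin speed from the manual
diameter_m = 0.47     # measured inside the drum
radius_m = diameter_m / 2

# Angular velocity in radians per second
omega = rpm * 2 * math.pi / 60
# Linear velocity at the inside edge of the drum
speed_ms = omega * radius_m
speed_kph = speed_ms * 3.6
speed_mph = speed_kph / 1.609344

print(f"{speed_ms:.1f} m/s = {speed_kph:.1f} kph = {speed_mph:.1f} mph")
```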

So if my washing machine were an electric car, it'd be zipping along at nearly the speed limit. That's surprisingly nippy!

6 Jan 2020 : New Year's Resolutions #
Fourteen years ago I wrote the code that powers this site. Until that time I'd used only static pages using a pre-generated templating system that were then uploaded to the site via FTP. As the site expanded and web technologies progressed, it was clear I needed to have something more structured, with content stored in a database and pages generated using server-side code. At the time I was trenchantly anti-Microsoft, but I also realised it's unfair to criticise something you don't understand. Microsoft's development technologies were in the ascendant and also seemed to be a better long-term bet than the open source alternatives. So I consciously chose ASP over PHP. Fourteen years later, even though that original ASP code has done pretty well, with the benefit of hindsight I can say with some confidence that I backed the wrong horse.
Evolution of the site, 1998, 2006 and 2020

So you could say it's been a somewhat extended evaluation period, but I've finally decided it's time to re-write the site in PHP. This will allow me to migrate to a Linux server, giving me more control and flexibility, but more importantly it'll also allow me to deploy a TLS certificate for the site (it's perfectly possible to deploy a cert on an IIS server of course, but this is a thing with my hosting provider).

To help motivate me, while I'm not usually a fan of New Year's Resolutions, this year I'm making an exception. In 2020 I've decided that this will be my first resolution: to re-write this site's code in PHP. All of the content will remain and if all goes well no-one except me will notice any difference. Maybe astute visitors will notice a padlock.

Since I'm already setting myself up for failure, I figure I may as well stick a few more items on the list too. So, in 2020, here are some other things I plan to achieve.
  1. Reduce my daily waste output to below 300g. In the last four months of 2019 it was 329g a day, 81% of which was recycled. In 2020 I want to reduce this to less than 300g waste per day. I'd prefer to decrease waste than increase recycling. I've no idea how I'll do it yet, and the year has started disastrously already, so we'll see. I'd also like to be carbon neutral, but I've not even calculated my current carbon footprint, so maybe that'd be getting ahead of myself.
  2. Scintillon is the Philips Hue app I developed and maintain for Sailfish OS phones. After having used it last year to control the lights in my flat, it's now ready for a bit of refinement. I often find myself having to switch between different pages in the app in order to control my lights the way I want, so I think it'd be a good improvement to support some extra configurability to allow users to design their own interface the way they want. I just need to carve out the time to design and implement it.
  3. I'm learning Finnish but it's difficult and I'm slow. So I need to focus. My sister generously gave me a subscription to FinnishPod101 for Christmas and now I just need to commit to using it. My aim is to spend at least 30 minutes a day learning Finnish, topping up my Finnish classes using the site.
I have so many more projects and plans lined up, like my ideas to create a gesture-based programming language or to extend the concept of a Celtic knot to n-dimensional space. But if it's taken me 14 years to write this website, I may have to leave some of those till 2021.
3 Dec 2019 : Graphs of Waste, Part 4: Pitfalls and Scope for Improvement #
In the previous three articles (part 1, part 2 and part 3) we developed the idea of a histogram into a histocurve, a graph that displays data that might otherwise be presented as a histogram, but which better captures the continuity between data items by presenting them as a curve, rather than a series of columns.

Here are a couple of graphs that show the same data plotted as a histogram and then as a histocurve. You may recall that our starting point was a realisation that simply plotting the data and joining the points gave a misleading representation of the data. The important point about these two graphs — both the histogram and the histocurve — is that the area under the graph is always a good representation of the actual quantities the data represents. In this case, it's how much recycling and rubbish I generate each day.
Stacked histogram showing my waste output
The same data shown as stacked histocurves

Having got to this point, we can see that there are also some pitfalls with using these histocurves that don't apply to histograms. I reckon it's important to be aware of them, so worth spending a bit of time considering them.

The most obvious to me is the fact that the histocurve doesn't respect the maximum or minimum bounds of the graph. In the case of my waste data, there's a very clear minimum floor because it's impossible for me to generate negative waste.

In spite of this, because the height is higher at some points than it would otherwise be as a means of maintaining continuity, it has to be lower at other points to compensate. As a result in several areas the height dips below the zero point. We can see this in the stacked curve as areas where the curve gets 'cut off' by the curve below it.

As yet, I've not been able to think of a sensible way to address this. Fixing it would require compensating for overflow in some areas by distributing the excess across other columns. This reduces accuracy and increases complexity. It's also not clear that an approach like this could always work. If you have any ideas, feel free to share them in the comments.

For some types of data this is more important than others. For example, in the case of this waste data, the notion of negative waste is pretty perplexing. However, for many types of data there is no strict maximum or minimum to speak of. Suppose for example it were measurements of water flowing in and out of a reservoir. In this case the issue would be less relevant.

Another danger is that the graph gives a false impression of accuracy. The sharp boundaries between columns in a histogram make clear where a data value starts and ends. By looking at the graph you know over which period a reading applies. With a histocurve it looks like you should be able to read a value off the graph for any given day. The reading would be a 'prediction' based on the trends, but of course we've chosen the curve of the graph in order to balance the area under the curve, rather than using any consideration of how the curve relates to the phenomenon being measured.

This leads us on to another issue: that it's hard to derive the actual readings. In the case of a histogram we can read off the height and width of a column and reverse engineer the original reading by multiplying the two together. We aren't able to do this with the histocurve, so the underlying data is more opaque.

The final problem, which I'd love to have a solution for, is that changing the frequency of readings changes the resulting curve. The current data shows readings taken roughly once per week at the weekends. Suppose I were to start taking readings mid-week as well. If the values taken midweek were exactly half the values I was measuring before (because they were taken twice as frequently) then the histogram would look identical. The histocurve on the other hand would change.

These limitations aren't terminal, they just require consideration when choosing what type of graph to use, and making clear how the viewer should interpret it. The most important characteristic of the histocurve is that it captures the results by considering the area under the curve, and none of the values along the curve itself are truly representative of the actual readings taken beyond this. As long as this is clear then there's probably a use for this type of graph out there somewhere.

That wraps up this discussion about graphs, histograms and histocurves. If you made it this far, as Chris Mason would say, congratulations: you ooze stamina!
26 Nov 2019 : Graphs of Waste, Part 3: A Continuously Differentiable Histogram Approach #
In part one we looked at how graphs can be a great tool for expressing the generalities in specific datasets, but how even seemingly minor changes in the choice of graphing technique can result in a graph that tells an inaccurate story.

In part two we found out we could draw a continuous line graph that captured several useful properties that are usually associated with histograms, notably that the area under the line graph is the same as it would be for a histogram between the measurement points along the $x$-axis.

But what if we want to go a step further and draw a smooth line, rather than one made up of straight edges? Rather than just a continuous line, can we present the same data with a continuously differentiable line? Can we do this and still respect this 'area under the graph' property?

It turns out, the answer is "yes"! And we can do it in a similar way. First we send the curve through each of the same points at the boundary of each column, then we adjust the height of the midpoint to account for any changes caused by the curvature of the graph.

There are many, many ways to draw nice curves, but one that frequently comes up in computing is the Bézier curve. It has several nice properties, in that it's nicely controllable, and depending on the order of the curve, we can control to any depth of derivative we choose. We'll use cubic (third-degree) Bézier curves, each defined by four points, meaning that we'll be able to have a continuous line and a continuous first derivative. This should keep things nice and smooth.

Bézier curves are defined parametrically, meaning that rather than having a function that takes an $x$ input and produces a $y$ output, as is the common Cartesian case, instead it takes a parameter input $t$ that falls between 0 and 1, and outputs both the $x$ and $y$ values. In order to avoid getting confused with the variables we used in part two, we're going to use $u$ and $v$ instead of $x$ and $y$ respectively.

Here's the formula for a cubic Bézier curve.

\begin{pmatrix} u \\ v \end{pmatrix} = (1 - t)^3 \begin{pmatrix} u_0 \\ v_0 \end{pmatrix} + 3(1 - t)^2 t \begin{pmatrix} u_1 \\ v_1 \end{pmatrix} + 3 (1 - t) t^2 \begin{pmatrix} u_2 \\ v_2 \end{pmatrix} + t^3 \begin{pmatrix} u_3 \\ v_3 \end{pmatrix} .

Where $\begin{pmatrix} u_0 \\ v_0 \end{pmatrix}$, $\begin{pmatrix} u_3 \\ v_3 \end{pmatrix}$ are the start and end points of the curve respectively, and $\begin{pmatrix} u_1 \\ v_1 \end{pmatrix}$, $\begin{pmatrix} u_2 \\ v_2 \end{pmatrix}$ are control points that we position in order to get our desired curve.
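
To make the parametric form concrete, here's a minimal Python sketch that evaluates the formula above for a given $t$. The function name and the example points are made up for illustration.

```python
def bezier_point(p0, p1, p2, p3, t):
    """Evaluate the Bezier curve with points p0..p3 (each a (u, v) pair) at parameter t."""
    s = 1 - t
    # The four weights applied to the start point, two control points and end point
    weights = (s**3, 3 * s**2 * t, 3 * s * t**2, t**3)
    points = (p0, p1, p2, p3)
    u = sum(w * p[0] for w, p in zip(weights, points))
    v = sum(w * p[1] for w, p in zip(weights, points))
    return u, v


# Hypothetical example points
start, control_1, control_2, end = (0, 0), (1, 2), (2, 2), (3, 0)
for t in (0, 0.25, 0.5, 0.75, 1):
    print(t, bezier_point(start, control_1, control_2, end, t))
```

At $t = 0$ the curve sits on the start point and at $t = 1$ on the end point. The control points pull the curve towards them without it necessarily passing through them.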

The fact a Bézier curve is parametric is a problem for us, because it makes it considerably more difficult to integrate under the graph. If we want to know the area under the curve, we're going to have to integrate it, so we need a way to turn the parameterised curve into a Cartesian form.

Luckily we can cheat.

If we set $\begin{pmatrix} u_1 \\ v_1 \end{pmatrix}$ and $\begin{pmatrix} u_2 \\ v_2 \end{pmatrix}$ to be $\frac{1}{3}$ and $\frac{2}{3}$ of the way along the curve respectively, then things get considerably easier. In other words, set

u_1 & = u_0 + \frac{1}{3} (u_3 - u_0) \\
    & = \frac{2}{3} u_0 + \frac{1}{3} u_3 \\
u_2 & = u_0 + \frac{2}{3} (u_3 - u_0) \\
    & = \frac{1}{3} u_0 + \frac{2}{3} u_3 .

Substituting this into our Bézier curve equation from earlier we get

u & = (1 - t)^3 u_0 + 3 (1 - t)^2 t \times \left( \frac{2}{3} u_0 + \frac{1}{3} u_3 \right) + 3 (1 - t) t^2 \times \left( \frac{1}{3} u_0 + \frac{2}{3} u_3 \right) + t^3 u_3 \\
  & = u_0 + t (u_3 - u_0) .
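
In case the jump to the final line isn't obvious, here's one way to fill in the intermediate steps. Collecting the $u_0$ and $u_3$ terms, each bracket factorises because $(1 - t)^2 + 2 (1 - t) t + t^2 = ((1 - t) + t)^2 = 1$.

```latex
\begin{align*}
u & = u_0 \left[ (1 - t)^3 + 2 (1 - t)^2 t + (1 - t) t^2 \right] + u_3 \left[ (1 - t)^2 t + 2 (1 - t) t^2 + t^3 \right] \\
  & = u_0 (1 - t) \left[ (1 - t) + t \right]^2 + u_3 \, t \left[ (1 - t) + t \right]^2 \\
  & = (1 - t) u_0 + t u_3 \\
  & = u_0 + t (u_3 - u_0) .
\end{align*}
```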

When we choose our $u_1$ and $u_2$ like this, we can perform the substitution

\psi(t) = u_0 + t(u_3 - u_0)
in order to switch between $t$ and $u$. This will make the integral much easier to solve. We note that $\psi$ is a bijection and so invertible as long as $u_3 \not= u_0$. We can therefore define the inverse:

t = \psi^{-1} (u) = \frac{u - u_0}{u_3 - u_0} .
It will also be helpful to do a bit of groundwork. We find the values at the boundary as
\psi^{-1} (u_0) & = 0, \\
\psi^{-1} (u_3) & = 1,
and we also define the following for convenience.
V(u) = v(\psi^{-1} (u)) .

We'll use these in the calculation of the integral under the Bézier curve, which goes as follows.

\int_{u_0}^{u_3} V(u) \mathrm{d}u

Using the substitution rule we get

\int_{\psi^{-1}(u_0)}^{\psi^{-1}(u_3)} V(\psi(t)) \psi'(t) \mathrm{d}t & = \int_{t = 0}^{t = 1} v(\psi^{-1}(\psi(t))) (u_3 - u_0) \mathrm{d}t \\
 & = (u_3 - u_0) \int_{0}^{1} v(t) \mathrm{d}t \\
 & = (u_3 - u_0) \int_{0}^{1} \left( (1 - t)^3 v_0 + 3 (1 - t)^2 t v_1 + 3 (1 - t) t^2 v_2 + t^3 v_3 \right) \mathrm{d}t \\
 & = (u_3 - u_0) \int_{0}^{1} \left( (1 - 3t + 3t^2 - t^3) v_0 + 3 (t - 2t^2 + t^3) v_1 + 3 (t^2 - t^3) v_2 + t^3 v_3 \right) \mathrm{d}t \\
 & = \frac{1}{4} (u_3 - u_0) (v_0 + v_1 + v_2 + v_3) .
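
As a sanity check on that result, here's a small Python sketch that compares the closed form against a brute-force numerical integration (a simple midpoint sum). It relies on $u$ being linear in $t$, as established above. The function names and example values are made up for illustration.

```python
def bezier_v(v0, v1, v2, v3, t):
    """The v component of the Bezier curve at parameter t."""
    s = 1 - t
    return s**3 * v0 + 3 * s**2 * t * v1 + 3 * s * t**2 * v2 + t**3 * v3


def area_numeric(u0, u3, v0, v1, v2, v3, n=10000):
    """Approximate the area under the curve with a midpoint sum over t in [0, 1]."""
    dt = 1 / n
    total = sum(bezier_v(v0, v1, v2, v3, (i + 0.5) * dt) for i in range(n))
    # du = (u3 - u0) dt because u = u0 + t (u3 - u0)
    return (u3 - u0) * total * dt


def area_closed_form(u0, u3, v0, v1, v2, v3):
    """The result of the integration: a quarter of the width times the sum of the heights."""
    return 0.25 * (u3 - u0) * (v0 + v1 + v2 + v3)


# Hypothetical example values
args = (0.0, 3.5, 2.0, 2.5, 4.0, 3.0)
numeric = area_numeric(*args)
closed = area_closed_form(*args)
print(numeric, closed)
```

The two values should agree to several decimal places, with the small difference down to the numerical approximation.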

We'll bank this calculation and come back to it. Let's now consider how we can wrap the Bézier curve over the points in our graph to make a nice curve. For each column we're going to end up with something like this.
Switching the straight lines for Bézier curves at the top of a column
Detail of a single Bézier curve

Now as before, we don't have control over $u_0$, $v_0$ because it affects the adjoining curve. We also don't have control over $u_1$ and $u_2$ because as just described, we have these set to allow us to perform the integration. We also must have $u_3$ set as $u_3 = u_0 + w / 2$ so that it's half way along the column.

Our initial assumption will be that $v_3 = h$, but this is the value we're going to manipulate (i.e. raising or lowering the central point) in order to get the area we need. We shouldn't need to adjust it by much.

That just leaves $v_1$ and $v_2$. We need to choose these to give us a sensible and smooth curve, which introduces some additional constraints. We'll set the gradient at the point $u_0$ to be the gradient $g_1$ of the line that connects the heights of the centrepoints of the two adjacent columns:

g_1 = \frac{y - y_L}{x - x_L}
where $x, y$ are the same points we discussed in part two, and $x_L, y_L$ are the same points for the column to the left. We'll also use $x_R, y_R$ to refer to the points for the column on the right, giving us:

g_2 = \frac{y_R - y}{x_R - x} .

Using our value for $g_1$ we then have

v_1 = v_0 + g_1 (u_1 - u_0) .

For the gradient $g$ at the centre of the column, we set this to be the gradient of the line between $y_1$ and $y_2$:

g = \frac{y_2 - y_1}{x_2 - x_1} .

We then have that

v_2 = v_3 + g (u_2 - u_3) .

From these we can calculate the area under the curve using the result from our integration calculation earlier, by simply substituting the values in. After simplifying the result, we get the following.

A_1' = \frac{1}{8}(x_2 - x_1) \left( 2y' + \frac{13}{6} y_1 - \frac{1}{6} y_2 + \frac{1}{6} g_1 (x_2 - x_1) \right)
where $y'$ is the height of the central point which we'll adjust in order to get the area we need. This looks nasty, but it'll get simpler. We can perform the same calculation for the right hand side to get

A_2' = \frac{1}{8}(x_2 - x_1) \left( 2y' + \frac{13}{6} y_2 - \frac{1}{6} y_1 - \frac{1}{6} g_2 (x_2 - x_1) \right) .
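
As a sketch of where the left-hand expression $A_1'$ comes from, here is the substitution written out. It assumes the curve for the left half runs from the boundary point to the centre point, so $v_0 = y_1$ and $v_3 = y'$, and that the control points are equally spaced horizontally, so $u_1 - u_0 = u_3 - u_2 = w/6$ with $w = x_2 - x_1$. These spacings are my reading of the setup rather than something restated in this section.

\begin{align*}
v_1 & = v_0 + g_1 (u_1 - u_0) = y_1 + \frac{w}{6} g_1 , \\
v_2 & = v_3 + g (u_2 - u_3) = y' - \frac{w}{6} \cdot \frac{y_2 - y_1}{w} = y' - \frac{1}{6} (y_2 - y_1) , \\
A_1' & = \frac{1}{4} (u_3 - u_0) (v_0 + v_1 + v_2 + v_3) \\
 & = \frac{1}{4} \cdot \frac{w}{2} \left( 2y' + \frac{13}{6} y_1 - \frac{1}{6} y_2 + \frac{1}{6} g_1 w \right) .
\end{align*}

The right-hand half follows the same pattern with $v_0 = y'$, $v_3 = y_2$ and the gradient $g_2$ at the far end.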

Adding the two to give the total area $A' = A_1' + A_2'$ allows us to do a bunch of simplification, giving us

A' = \frac{w}{2} \left( \frac{1}{2} y_1 + \frac{1}{2} y_2 + y' \right) + \frac{w^2}{48} (g_1 - g_2) .

If we now compare this to the $A$ we calculated for the straight line graph in part two, subtracting one from the other gives us that

y' = y + \frac{w}{24} (g_2 - g_1) .

This tells us how much we have to adjust $y$ by to compensate for the area change caused by the curvature of the Bézier curves.
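
To check the adjustment numerically, here is a minimal Python sketch. It builds the two Bézier curves for a single column, integrates under them with the trapezium rule, and compares the result with the histogram area $wh$. The function names and the numbers in the example are made up for illustration, and the control points are assumed to be equally spaced horizontally.

```python
def bezier(p0, p1, p2, p3, t):
    """Evaluate one coordinate of a cubic Bezier curve at parameter t."""
    s = 1.0 - t
    return s**3 * p0 + 3 * s**2 * t * p1 + 3 * s * t**2 * p2 + t**3 * p3


def area_under(us, vs, steps=2000):
    """Approximate the area under the curve (u(t), v(t)) with the trapezium rule."""
    total = 0.0
    prev_u, prev_v = us[0], vs[0]
    for i in range(1, steps + 1):
        t = i / steps
        u, v = bezier(*us, t), bezier(*vs, t)
        total += 0.5 * (v + prev_v) * (u - prev_u)
        prev_u, prev_v = u, v
    return total


def column_area(x1, x2, y1, y2, h, g1, g2):
    """Area under the two Bezier curves covering one column.

    x1, x2: left and right edges of the column; y1, y2: heights at those edges.
    h: histogram height; g1, g2: gradients at the left and right edges.
    """
    w = x2 - x1
    y = 2 * h - 0.5 * (y1 + y2)       # centre height for the straight-line graph
    y_adj = y + w / 24 * (g2 - g1)    # adjusted centre height y'
    g = (y2 - y1) / w                 # gradient at the centre of the column
    xc = x1 + w / 2
    # Assumption: control points equally spaced in u, a third of the way apart.
    left_u = (x1, x1 + w / 6, x1 + w / 3, xc)
    left_v = (y1, y1 + g1 * w / 6, y_adj - g * w / 6, y_adj)
    right_u = (xc, xc + w / 6, xc + w / 3, x2)
    right_v = (y_adj, y_adj + g * w / 6, y2 - g2 * w / 6, y2)
    return area_under(left_u, left_v) + area_under(right_u, right_v)


# Illustrative values only: a column 7 units wide with histogram height 45.
print(column_area(0.0, 7.0, 30.0, 50.0, 45.0, 1.5, -2.0), 7.0 * 45.0)
```

The two printed values should agree to within the error of the numerical integration, whatever gradients are chosen at the edges.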

What does this give us in practice? Here's the new smoothed graph based on the same data as before.
The histogram data drawn using Bézier curves

Let's overlay the three approaches (histogram, straight line and curved graphs) to see how they all compare. The important thing to note is that the areas under each of the columns (bounded above by the flat line, the straight line and the curve respectively) are all the same.
Histogram, straight lines and Bézier curves all overlaid on the same graph

Because of the neat way Bézier curves retain their area properties, we can even stack them nicely, similarly to how we stacked our histogram in part one, to get the following representation of the full set of data.
Stacked histocurves showing all the data

Putting all of this together, we now have a pretty straightforward way to present area-under-the-graph histograms of continuous data in a way that captures that continuity. I call this graph a "histocurve". A histocurve can give a clearer picture of the overall general trends of the data. For example, each of the strata in the histocurve remains unbroken, compared to the strata in a classic histogram which are liable to get broken at the boundary between every pair of columns.

That's all great, but it's certainly not perfect. In the fourth and final part of this series which I hope to get out on the 3rd December, I'll briefly discuss the pitfalls of histocurves, some of their negative properties, and things I'd love to fix but don't know how.

19 Nov 2019 : Sailfish Backup and Restore from Xperia XA2 to Xperia 10 #
It’ll come as no surprise to hear I’ve tried my share of phones running Sailfish OS, starting with the Jolla 1 and ending up with an Xperia XA2 via a Jolla C and Xperia X.

Yesterday I moved to the latest of the official Sailfish compatible phones, the Xperia 10. Having been using it now for a couple of days, I have to say that I’m exceptionally happy with it. It captures the understated aesthetic of the Xperia X, which I much preferred over the more brutal XA2 design that followed. But the screen is large and bright, and the long tall screen works really nicely with Sailfish OS which has always made good use of vertical space. Having an extra row of apps in the switcher makes the whole screen feel 33% bigger (even though in fact it's only 12% larger). Many apps, like Messages, Whisperfish, Depecher and Piepmatz, are built around a vertical scrolling SilicaFlickable. They all work beautifully on the tall thin screen. It turns out I'm much more excited at the move from the XA2 to the 10 than I expected.

There are some minor regressions. The camera bump is noticeably larger than on the X, and I'm still getting used to the button placement on the side (not only have the buttons moved, but they're also noticeably harder to distinguish using touch alone). On the other hand the fingerprint reader is better placed and more responsive.
The screen is 12% larger, but it feels 33% larger

But one area where Sailfish OS deserves some love is in the upgrade process. The strong privacy-focus that makes the phone so attractive to people like me, also means that all of the data on the phone doesn’t get automatically synced to some megacorp’s datacentre. Moving from one phone to another requires a bit of manual effort, and I thought it might help some people to hear the process I went through (and even if not, it’ll certainly help me next time I go through this process). Ultimately it was pretty painless, and there’s nothing on my old phone that I don’t now have on my new phone, but it did require a few steps.

Step 1: Perform a system backup
My first step was to perform a system backup. This will back up photos, videos, accounts (minus passwords), notes and a few other bits and pieces. I’d love for it to have greater coverage, but it’s a clean, safe and stable way to capture the basics. I performed the backup to SD card, but if you have cloud accounts you could use them instead.

Step 2: Configure the new device (low-level)
There are a few default configuration steps I always like to go through. Not everyone will want to do all of this, but some might.

A. Set up a device lock and developer mode, including allowing a remote connection.

B. Enable developer updates… and update the phone.

C. Configure the hostname.
echo NAME > /etc/hostname
hostname NAME
hostnamectl set-hostname NAME
D. Create a public-private SSH key.
Log in to your phone using the developer password.
ssh-keygen -t rsa -b 4096 -C ""

View the public key
cat ~/.ssh/

E. Configure SSH to use a private/public keypair.

Having set up developer mode you can log in to the device via SSH using a password. It makes things much easier if you can also log in using a public/private key pair as well. To set this up, access the new phone using the SDK. This will neatly configure SSH on the phone.

Then log in to the phone and add the public key of the computer you want to access your phone with to the ~/.ssh/authorized_keys file. Also add the public key of the phone you’re backing up from. If this phone doesn’t already have a public/private key pair, follow D above to create one on your old phone too.

Step 3: Install zypper
This step isn't really necessary, but I find it convenient.
pkcon install zypper

Step 4: Restore the system backup
Move the SD card from the old phone to the new phone and use the system backup restore option to restore the contents of the backup to the new device.

Step 5: Copy the non-backed up stuff
As mentioned above there are lots of things the system backup won’t capture. Many of these, like app configurations, can be neatly transferred from the old phone to the new phone anyway. To do this, log in to the old phone using SSH.

Then you can copy all the music and documents from the old phone to the new phone over the network like this.
scp -r ~/Music nemo@
scp -r ~/Documents nemo@

And copy your app configurations. You should tweak this to suit the apps you have installed.
scp -r ~/.config nemo@
scp -r ~/.local/share/harbour-received nemo@
scp -r ~/.local/share/harbour-tidings nemo@
scp -r ~/.local/share/harbour-depecher nemo@
scp -r ~/.local/share/harbour-sailotp nemo@
scp -r ~/.local/share/harbour-whisperfish nemo@
This step is actually rather neat. I was able to transfer the configurations for all of my native apps just from the contents of the ~/.config and ~/.local/share directories, saving me a boat-load of time and hassle.

Step 6: Deploy software requiring manual installation
I use Storeman, Whisperfish and the Matrix client, all of which require manual installation (the latter two aren't in the Jolla Store or OpenRepos). Here's an example of how you can install Storeman (but make sure you update the links to use the latest available version).
curl -L --output harbour-storeman.rpm
rpm -U harbour-storeman.rpm
rm harbour-storeman.rpm

Step 7: Install apps from Jolla Store and OpenRepos
I put the phones side-by-side, open the app drawer on the old phone and just go through each app one by one, installing them. Maybe there’s a better, quicker way, but this worked for me.
Checking all the right apps are installed

Step 8: Update the accounts
For each of the accounts in the Settings app, the passwords will have been stripped from the backup for security reasons. I went through each systematically and added the passwords in again. I had some problems with a couple of accounts, so I just recreated them from scratch, copying over the details from the UI of the old phone.
Refresh the accounts

Step 9: Swap the SIM cards
I use two SIM cards, which unfortunately leaves no space for the SD card.

Step 10: Manual configurations
At this point, I went through and did some manual configuration of things like the top menu, ambiances, VPNs, USB mode, Internet sharing name, Bluetooth name, Keyboards, etc.

Step 11: Install Android apps manually
Some Android apps require manual installation. For me these were FDroid, APKPure and MicroG. These are essentially prerequisites for all of the other Android apps I use. As an example, here's how I went about installing FDroid.
cd ~/Downloads
curl -L --output FDroid.apk
apkd-install FDroid.apk
rm FDroid.apk

APKPure can be installed in a similar way. MicroG is a bit more involved, but here's a summary of the steps:
A. Configure Android App Support to Disable Android App Support system package verification.
B. Add the MicroG repository to FDroid.
C. Check the fingerprint.
D. Search for and install microG Service Core and Fakestore.
E. Open the Android settings for MicroG, go back, re-enter and navigate to the Apps & Notifications > App Permissions > Enable system package replacement screen.
F. Activate the system package replacement setting for MicroG and Fakestore.
G. Open MicroG and request permissions (grant them all).
H. Stop and then restart the Android App Support from the Android page in the Settings app in order to finalise the MicroG configuration.
I. Open MicroG and check that everything is set up correctly.

Step 12: Install Android apps from the store
At this point, I install any other remaining Android apps that were on my old phone.

Step 13: Relax
Have a cup of tea, enjoy your new phone!

As a bit of a bootnote, I’m personally of the belief that several of these steps could be added to the automatic backup, such as the Music, Documents and app configurations. With a new device there will always be some need for fresh manual configuration. I’d love to see a better way for the apps to transfer across, but given that many of the apps are essentially sideloaded from OpenRepos, that may not be a realistic goal. At any rate, making the backup and restore process as smooth as possible is certainly something I think most users would find valuable.
19 Nov 2019 : Graphs of Waste, Part 2: A Continuous Histogram Approach #
In part one we looked at how graphs can be a great tool for expressing the generalities in specific datasets, but how even seemingly minor changes in the choice of graphing technique can result in a graph that tells an inaccurate story.

We finished by looking at how a histogram would be a good choice for representing the particular type of data I've been collecting, to express the quantity of various types of waste (measured by weight) as the area under the graph. Here's the example data plotted as a histogram.
All data plotted as a stacked histogram

While this is good at presenting the general picture, I really want to also express how my waste generation is part of a continuous process. In the very first graph I generated to try to understand my waste output, I drew the datapoints and joined them with lines. This wasn't totally crazy as it highlighted the trends over time. However, it gave completely the wrong impression because the area under the graph bore no relation to the amount of waste I produced.

How can we achieve both? Show a continuous change of the data by joining datapoints with lines, while also ensuring the area under the graph represents the actual amount of waste produced?

The histogram above achieves the goal of having the area under the graph represent the all-important quantities captured by the data. But it doesn't express the continuous nature of the data.

Contrariwise, if we were to take the point at the top of each histogram column and join them up, we'd have a continuous line across the graph, but the area underneath would no longer represent useful data.
If we want to capture a 'middle ground' between the two, it's helpful to apply some additional constraints.
  1. The line representing the weights should be continuous.
  2. The area under the line should be the same as the area under the histogram column for each column individually.
  3. For each reading, the line can be affected by the readings either side (this is inevitable if constraint 1 is going to be enforced), but should be independent of anything further away.

To do this, we'll adjust the position of the datapoints for each of the readings and introduce a new point in between every pair of existing datapoints as follows.
  1. Start with the datapoints positioned to be horizontally centred in each column and taken as the height of the histogram column that encloses it.
  2. For every pair of datapoints A and B, place an additional point at the boundary of the columns for A and B, and with y value set as the average between the two columns A and B.

Following these rules we end up with something like this.
Plotting between the midpoint of each histogram column

This gives us our continuous line, but as you can see from the diagram, for each column the area under the line doesn't necessarily represent the quantity captured by the data. We can see this more easily by focussing in on one of the columns. The hatched area in the picture below shows the area that used to be included, but which would be removed if we drew our line like this, making the area under the line for this particular region less than it should be.
Considering a single column of the histogram

Across the entire width of these graphs the additions might cancel out the subtractions, but that's not guaranteed, and it also fails our second requirement that the area under the line should be the same as the area under the histogram column for each column individually.

To address this we can adjust the position of the point in the centre of each column by altering its height to capture the correct amount of area. In the case shown above, we'd need to move the point higher because we've cut off some of the area and need to get it back. In other cases we may need to reduce the height of the point to remove area that we over-captured.
The elements making up the column
The area under the lines for a column
To calculate the exact height of the central point, we can use the following formula.

$$ y = 2h - \frac{1}{2} (y_1 + y_2) . $$
The area $A = A_1 + A_2 + A_3 + A_4$ under the curve can then be calculated as follows.

\begin{align*} A & = \left( \frac{w}{2} \times y_1 \right) + \left( \frac{w}{2} \times y_2 \right) + \left( \frac{1}{2} \times \frac{w}{2} \times (y - y_1) \right) + \left( \frac{1}{2} \times \frac{w}{2} \times (y - y_2) \right) \\ & = \frac{w}{2} \left( \frac{1}{2} y_1 + \frac{1}{2} y_2 + y \right) . \\ \end{align*}
Substituting $y$ into this we get the following.
\begin{align*} A & = \frac{w}{2} \left( \frac{1}{2} y_1 + \frac{1}{2} y_2 + 2h - \frac{1}{2} y_1 - \frac{1}{2} y_2 \right) \\ & = wh. \end{align*}

Which is the area of the column as required.
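
The procedure above can be sketched in a few lines of Python. This is only an illustration, not the code used to draw the graphs: the function names are my own, the example heights are invented, and at the two outer edges of the graph (where there's no neighbouring column) it simply assumes the boundary height equals the height of the end column.

```python
def adjusted_midpoints(heights):
    """Return (boundaries, centres) for the straight-line graph.

    heights: histogram column heights h, one per column.
    boundaries: y value at each column edge (average of the two neighbours).
    centres: adjusted centre height y = 2h - (y1 + y2) / 2 for each column.
    """
    # Assumption: the outermost edges just take the height of the end column.
    boundaries = [heights[0]]
    for left, right in zip(heights, heights[1:]):
        boundaries.append(0.5 * (left + right))
    boundaries.append(heights[-1])
    centres = [
        2 * h - 0.5 * (boundaries[i] + boundaries[i + 1])
        for i, h in enumerate(heights)
    ]
    return boundaries, centres


def area_under_lines(w, y1, y, y2):
    """Area under the two straight segments spanning a column of width w."""
    return 0.5 * (w / 2) * (y1 + y) + 0.5 * (w / 2) * (y + y2)


# Illustrative heights and widths only.
heights = [60.0, 40.0, 45.0]
widths = [7.0, 7.0, 7.0]
boundaries, centres = adjusted_midpoints(heights)
for i, (w, h) in enumerate(zip(widths, heights)):
    area = area_under_lines(w, boundaries[i], centres[i], boundaries[i + 1])
    print(h * w, area)  # histogram area and area under the lines
```

For each column the two printed areas should match, which is the second constraint from the list earlier.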

Following this approach we end up with a graph like this.
Line after adjusting the midpoints to account for the area under the graph

Which taken on its own gives a clear idea of the trend over time, while still capturing the overall quantity of waste produced in each period as the area under the graph.
The line without the histogram, but still retaining the area-under-the-graph property

In the next part we'll look at how we can refine this further by rendering a smooth curve, rather than straight lines, but in a way that retains the same properties we've been requiring here.

All of the graphs here were produced using the superb MatPlotLib and the equations rendered using MathJax (the first time I'm using it, and it looks like it's done a decent job).
12 Nov 2019 : Graphs of Waste, Part 1: Choose Your Graph Wisely #
I have to admit I'm a bit of a data visualisation pedant. If I see data presented in a graph, I want the type of graph chosen to match the expressive aim of the visualisation. A graph should always aim to expose some underlying aspect of the data that would be hard to discern just by looking at the data in a table. Getting this right means first and foremost choosing the correct modality, but beyond that the details are important too: colours, line thicknesses, axis formats, labels, marker styles. All of these things need careful consideration.

You may think this is all self-evident, and that anyone taking the trouble to plot data in a graph will obviously have taken these things into account, but sadly it's rarely the case. I see data visualisation abominations on a daily basis. What's more it's often the people you'd expect to be best at it who turn out to fall into the worst traps. Over fifteen years of reviewing academic papers in computer science, I've seen numerous examples of terrible data visualisation. These papers are written by people who have both access to and competence in the best visualisation tooling, and who presumably have a background in analytical thinking, and yet graphs presented in papers often fail the most basic requirements. It's not unusual to see graphs that are too small to read, with unlabelled axes, missing units, use of colour in greyscale publications, or with continuous lines drawn between unrelated discrete data points.

And that's without even mentioning pseudo-3D projections or spider graphs.

One day I'll take the time to write up some of these data visualisation horror stories, but right now I want to focus on one of my own infractions. I'll warn you up front that it's not a pretty story, but I'm hoping it will have a happy ending. I'm going to talk about how I created a most terrible graph, and how I've attempted to redeem myself by developing what I believe is a much clearer representation of the data.

Over the last couple of months I've been collecting data on how much waste and recycling I generate. Broadly speaking this is for environmental and motivational reasons: I believe that if I make myself more aware of how much rubbish I'm producing, it'll motivate me to find ways to reduce it, and also help me understand where my main areas for improvement are. If I'm honest I don't expect it'll work (many years ago I was given a device for measuring real-time electricity usage with a similar aim and I can't say that succeeded), but for now it's important to understand my motivations. It goes to the heart of what makes a good graphing choice.

So, each week I weigh my rubbish using kitchen scales, categorised into different types matching the seven different recycling bins provided for use in my apartment complex.
The bins at my apartment complex

Here's the data I've collected until now presented in a table.
Measurements of waste and recycling output (g)
Date      Paper  Card  Glass  Metal  Returnables  Compost  Plastic  General
18/08/19    221   208    534     28          114      584        0      426
25/08/19    523   304    702     24           85      365      123      282
01/09/19    517   180      0      0          115      400        0      320
06/09/19    676   127    360     14           36       87        0      117
19/09/19   1076   429    904     16            0     1661        0      417
28/09/19   1047   162   1133    105           74      341       34      237
05/10/19    781   708    218     73           76     1391       54      206
13/10/19    567   186    299    158           40      289       63      273

We can't tell a great deal from this table. We can certainly read off the measurements very easily and accurately, but beyond that the table fails to give any sort of overall picture or idea of trends.

The obvious thing to do is therefore to draw a graph and hope to tease out something that way. So, here's the graph I came up with, and which I've had posted and updated on my website for a couple of months.
Data plotted directly on a graph

What does this graph show? Well, to be precise, it's a stacked plot of the weight measurements against the dates the measurements were taken. It gives a pretty clear picture of how much waste I produced over a period of time. We can see that my waste output increased and peaked before falling again, and that this was mostly driven by changes in the weight of compost I produced.

Or does it? In fact, as the data accumulated on the graph, it became increasingly clear that this is a misleading visualisation. Even though it's an accurate plot of the measurements taken, it gives completely the wrong idea about how much waste I've been generating.

To understand this better, let's consider just one of the stacked plots. The red area down at the base is showing the measurements I took for general waste. Here's another graph that shows the same data isolated from the other types of waste and plotted on a more appropriate scale.
The line plotted for general waste

If you're really paying attention you'll notice that the start date on this second graph is different to that of the first. That's because the very first datapoint represents my waste output for the seven days prior to the reading, and we'll need those extra seven days for comparison with some of the other plots we'll be looking at shortly.

There are several things wrong with this plot, but the most serious issue, the one I want to focus on, is that it gives a completely misleading impression of how much waste I've been generating. That's because the most natural way to interpret this graph would be to read off the value for any given day and assume that's how much waste was generated that day. This would leave the area under the graph being the total amount of waste output. In fact the lines simply connect different data points. The actual datapoints themselves don't represent the amount of waste generated in a day, but in fact the amount generated in a week. And because I don't always take my measurements at the same time each week, they don't even represent a week's worth of rubbish. To find out the daily waste generated, I'd need to divide a specific reading by the number of days since the last reading.

Take for example the measurements taken on the 6th September. I usually weigh my rubbish on a Saturday, but because I went on holiday on the 7th I had to do the weighing a day early. Then I was away from home for seven days, came back and didn't weigh my rubbish again until the 19th, nearly two weeks later.

Although I spent a chunk of this time away, it still meant that the reading was high, making it look as if I'd generated a lot of waste over the two-week period. In fact, considering this was double the time of the usual readings, it was actually a relatively low reading. This should be reflected in the graph, but it's not. It looks like I generated more rubbish than expected; in fact I generated less.

We can see this more clearly if we plot the data as a column (bar) graph and as a histogram. Here's the column graph first.
General waste plotted as a bar chart

These are the same datapoints as in the previous graph, but drawn as columns with widths proportional to the duration that the readings represent. The column that spreads across from the 6th to the 19th September is the reading we've just been discussing. This is a tall, wide column because it represents a long period (nearly two weeks) and a heavier than usual weight reading (because it's more than a week's worth of rubbish). If we now convert this into a histogram, it'll give us a clearer picture of how much waste was being generated per day.
General waste plotted as a histogram

This histogram takes each of the columns and divides it by the number of days the column represents. A histogram has the nice property that the area — rather than the height — of a column represents the value being plotted. In this histogram, the area under all of the columns represents the quantity of waste that I've generated across the entire period: the more blue, the more waste.
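
As a rough sketch of that conversion, the Python below takes the general waste readings from the table and divides each by the number of days since the previous reading. The function name is my own, and the first reading is taken to cover seven days, as mentioned earlier.

```python
from datetime import date

# General waste readings (grams) from the table above.
readings = [
    (date(2019, 8, 18), 426),
    (date(2019, 8, 25), 282),
    (date(2019, 9, 1), 320),
    (date(2019, 9, 6), 117),
    (date(2019, 9, 19), 417),
    (date(2019, 9, 28), 237),
    (date(2019, 10, 5), 206),
    (date(2019, 10, 13), 273),
]


def daily_rates(readings, first_period_days=7):
    """Return (days, grams_per_day) for each reading.

    days is the column width and grams_per_day the column height, so that
    width times height gives back the original weight.
    """
    rates = []
    previous = None
    for when, grams in readings:
        days = first_period_days if previous is None else (when - previous).days
        rates.append((days, grams / days))
        previous = when
    return rates


for (when, grams), (days, rate) in zip(readings, daily_rates(readings)):
    print(f"{when:%d/%m/%y}  {grams:4d} g over {days:2d} days = {rate:5.1f} g/day")
```

The 19th September reading of 417g covers 13 days, which works out at about 32g per day, lower than any of the first three readings once they're put on the same footing.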

Not only is this a much clearer representation, it also completely changes the picture. The original graph made it look like my waste output peaked in the middle. There is a slight rise in the middle, but it's actually just a local maximum. In fact the overall trend was that my daily general waste output was decreasing until the middle of the period, and then rose slightly over time. That's a much more accurate reflection of what actually happened.

It would be possible to render the data as a stacked histogram, and to be honest I'd be happy with that. The overall picture, which ties in with my motivation for wanting the graph in the first place, indicates how much waste I'm generating based on the area under the graph.
All data plotted as a stacked histogram

But in fact I tend to be generating small bits of rubbish throughout the week, and I'd like to see the trend between readings, so it would be reasonable to draw a line between weeks rather than have them as histogram blocks or columns.

So this leads us down the path of how we might draw a graph that captures these trends, but still also retains the nice property that the area under the graph represents the amount of waste produced.

That's what I'll be exploring in part two.

All of the graphs here were generated using the superb MatPlotLib.
17 Aug 2019 : Querying the cost of sharing code between iOS and Android #
Eyal Guthmann, a Dropbox software engineer, has written an interesting piece about the difficulties of sharing C++ code across mobile platforms. I'm not questioning the truth of the difficulties Dropbox experienced, but as someone who's part of the mobile C++ dev community*, and in past lives has helped maintain C/C++ libraries shared across multiple platforms (Android, iOS, Windows, Linux, Sailfish), I don't buy all of the arguments he presents.

Let's take the points he raises one-by-one.

1) The overhead of custom frameworks and libraries - replacing language defaults

I admit this can be painful and intricate, but the main platforms already have support for cross-language library binding. When it comes to threading I'd argue the sane approach is to keep the threading in one place, on the platform-specific side, given each platform has its own slightly different approach. You can still share plenty of useful code without trying to homogenise a threading implementation across platforms.

Eyal also brings up threading in the context of debugging across language boundaries. I'd apply the same rule here: keep the threading out of the C/C++ code. That doesn't mean you can't share plenty of the code that executes inside each thread, of course.

2) The overhead of custom frameworks and libraries - replacing language defaults

Eyal cites two examples (json11 and nn) of custom libraries for replacing language defaults that Dropbox has to maintain. Combined they amount to 1812 lines of code, including comments. I find it difficult to believe Dropbox struggles with the overhead of maintaining these libraries.

3) The C++ mobile community is almost non-existent

Eyal needs to look harder. Either that or he's putting more weight on that "almost" than I think the word can reasonably sustain. Maybe he should have spoken to the devs at Qt?

4) Difference between platforms

Perhaps I'm misunderstanding what Dropbox were trying to achieve, but I'd argue the key to using cross platform C/C++ libraries is through good architecting: choosing which parts to work on cross-platform and which to leave as platform-specific. In some cases such as UI, control-flow/event handling and hardware access, it just makes more sense to use the good, bespoke, vendor-supplied tools and approaches.

5) The overhead of a custom dev environment

At least this argument has some force for me. My personal experience is that tooling is quite painful even when you stick to the most standard of environments and approaches on a single platform. Adding in multiple toolchains and environments into a single project is going to introduce some interesting and new ways to experience pain.

6) The overhead of training, hiring, and retaining developers

I work for a company that employs many C++ mobile devs and getting quality talent certainly isn't easy. Then again I've never worked anywhere that found recruiting easy. If Dropbox find it easier to recruit mobile devs with Swift or Kotlin experience, then I'm not going to argue. Reading between the lines though, it sounds like Dropbox lost a big chunk of their C++ team and failed to keep the knowledge within the company. Sometimes even the best planning can't avoid something like that happening, but it doesn't follow that the technology in question is to blame.

So, to summarise, what I'm saying is that unless you're writing your complete application using some fully cross-platform toolkit (e.g. Qt, Xamarin, etc.) in which case you accept the compromises that come with that, then you can still use C/C++ for reducing maintenance with good partitioning. Use C/C++ for core library functionality but anything less generic, including control flow and UI, should stay as platform-specific code where vendors already provide good tooling but with largely incompatible approaches anyway.

I have to say, I feel greatly privileged that I'm now being paid to develop for a single platform that's perfectly tailored for C/C++ development across the entire stack. But I acknowledge that cross-platform development is a reality for a company like Dropbox and that it's hard. It's a shame that Dropbox feel they have to give up on code-sharing for their solution.

* I'm a C/C++ developer working in the mobile space, so that makes me "part of the community", right?
24 Mar 2019 : GetiPlay 0.7-1 released #
Here's the changelog for the just-released version 0.7-1 of GetiPlay. More details below, from OpenRepos or github.

Sun Mar 24 2019 David Llewellyn-Jones <> 0.7-1
  1. Correct iterator errors when deleting media files and items from queue.
  2. Correctly trim logfile and prevent UI performance degradation over time.
  3. Correct an incorrect RPM configuration.
  4. Remove cyclic dependences in QML.
  5. Fix various other QML errors.
  6. Add scroll animation when clicking on tab to jump to the top of the page.
  7. Allow control using the lockscreen media (MPRIS) controls.
  8. Improve the button layout on the queue item info screen.
22 Mar 2019 : What does the latest petition tell us about changing attitudes to Brexit? #
With the latest petition trying to revoke article 50 and block Brexit, I've crunched the numbers again to find out how the mood is changing in different parts of the UK. Check out my regrexitmap to see which parts of the UK are moving more towards remain, and which are moving more towards Brexit. And if you're surprised by the result, you should also check out the map I generated in May 2016 using data from a similar petition which attracted over 4 million signatures three years ago. The astonishing thing is that compared to back then, very little... very very little has changed.
Comparing regrexit after three years
3 Sep 2018 : Yet More Proof that the Human Race is Screwed #

I don’t usually get angry, but something about this really hustles my hircus. I just clicked through an advertarticle on the Register about “Serverless Computing London”, a conference that claims to help developers “decide on the best path to a more efficient, scalable and secure computing future.”

The speaker roster looked interesting, because I’d never heard of any of them (that’s just me; I’m not following Serverless trends closely), so I clicked through to find out about the headline keynote, Chad Arimura, from Oracle. Chad’s image seemed to load slower than the rest of the page, which made me suspicious. So I loaded up the image separately and this is what I found.

This image is too large

Chad’s mugshot is being downloaded as a behemoth 2756 x 2756 pixel image and then scaled down on my screen to a 114 x 114 pixel image client-side. Check out those timing bars. It’s taking 1.5 seconds to download the bastard. Because it’s nearly 1 meg of data.

I did some scientific testing, and established that if the image had been scaled down at the site, it could have been served to me as 3.9kB of data. That’s 0.004 of the bandwidth. Huge swaths of time, resources and human ingenuity have gone into developing efficient image compression algorithms so that we can enjoy a rich multimedia Web, minimising the energy required while we fret about global warming due to our historical excesses. A visually-identical 114x114 pixel BMP image (circa 1995) would have taken 52kB of bandwidth.
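As a quick sanity check of those figures, here is a minimal sketch of the arithmetic. The round 1,000,000 byte size for the original image (it was "nearly 1 meg") and the 4 bytes per pixel for the uncompressed bitmap are assumptions, chosen because they reproduce the numbers quoted above.

```python
# Sizes quoted in the post, in bytes
full_size_bytes = 1000000   # assumption: "nearly 1 meg" rounded to a full megabyte
scaled_bytes = 3900         # the 3.9kB scaled-down version

# Fraction of the bandwidth actually needed
ratio = scaled_bytes / full_size_bytes

# Uncompressed 114x114 bitmap, ignoring the file header
bytes_per_pixel = 4         # assumption: 32 bits per pixel
bmp_bytes = 114 * 114 * bytes_per_pixel

# How many more pixels were downloaded than displayed
pixel_ratio = (2756 * 2756) / (114 * 114)

print(round(ratio, 3), bmp_bytes, round(pixel_ratio))
```

Even the uncompressed bitmap comes out at roughly a twentieth of what was actually served.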

This all wouldn’t be so bad if maybe Chad didn’t look quite so smug*, and if we couldn’t discern from the title of the image that someone went to the trouble of cropping it. Why didn’t they just scale it down at the same time?

But the saddest part, of course, is that this is to advertise a conference about Serverless Computing. What’s the point of Serverless Computing? To allow better allocation of resources so that server time is spent serving content, rather than waiting for requests.

I totally appreciate the irony of me spending an hour posting an image-heavy blog post about how a conference on a perfectly valid technical subject is wasting bandwidth. But I would simply say that this only strengthens my argument: I'm human too. We're all screwed.

* To be fair to Chad, it’s almost certainly not his fault (other speakers get the same treatment), and the admirable minimalism of his personal website suggests he’s actually totally bought in to the idea of efficient web design.
23 Aug 2018 : Sending emails from AWS Lambda inside a VPC without NAT #

Many websites are made up of some core stateless functionality tied to a database where all the state lives. The functionality may make changes to the database state, but all of the tricky issues related to concurrency and consistency (for example, in case two users simultaneously cause the database to be updated) are left to the database to deal with. That allows the stateless part to be partitioned off and run only when the user is actually requesting a page, making it easily scalable.

In this scenario, having a full-time monolithic server (or bank of servers) handling the website requests is overkill. Creating a new server instance for each request is potentially much more cost efficient and scalable. Each request to the site triggers a new function to be called that runs the code needed to generate a webpage (e.g. filling out a template with details for the user and view), updating the database if necessary. Once that’s done, the server is deleted and the only thing left is the database. An important benefit is that, if there are no requests coming in, there’s no server time to pay for. This is the idea behind ‘serverless’ architectures. Actually, there are lots of servers involved (receiving and actioning HTTP requests, running the database, managing the cluster) but they’re hidden and costs are handled by transaction rather than by uptime.

AWS Lambda is one of the services Amazon provides to allow this kind of serverless set up. Creating ‘Lambda functions’ (named after the Lambda calculus, but really they’re just functions) that run on various triggers, like a web request, has been made as easy as pie. Connecting these functions to an RDS database has also been made really easy. But there’s a fly in the ointment.

To get the Lambda function communicating with the RDS instance, it’s common practice to set them both up inside the same Virtual Private Cloud. This isn’t strictly necessary: it’s possible to have the database exposed on a public IP and have the Lambda function communicate with it that way. However, the obvious downside to doing it like this is that the database is exposed to the world, making it a hacking and denial-of-service target. If both the Lambda function and database are in a VPC, then assuming everything is suitably configured, the database will be effectively protected from external attack.

Setting up a VPC

The beauty of this arrangement is that the Lambda functions will still respond to the GET and POST requests for accessing the site, because these are triggered by API Gateway events rather than direct connections to the functions. It’s a nice arrangement.

However, with the Lambda function inside the VPC, just like the database, it has no public IP address. This means that by default it can’t make any outgoing connections to public IP addresses. This doesn’t necessarily matter: a website access will trigger an event, the Lambda function fires up, communicates with the database, hands over a response which is sent back to the user. The API gateway deals with the interface between the request/response and Lambda function interface.

The problem comes if the Lambda function needs to access an external resource for some other reasons. For example, it might want to send an email out to the user, which requires it to communicate with an SMTP server. Websites don’t often need to send out emails, but on the occasions they do it tends to be to ensure there’s a second communication channel, so it can’t be handled client-side. For example, when a user registers on a site it’s usual for the site to send an email with a link the user must click to complete the registration. If the user forgets their password, it’s common practice for a site to send a password reset link by email. Increasingly sites like Slack are even using emails as an alternative to using passwords.

A Lambda function inside a VPC can’t access an external SMTP server, so it can’t send out emails. One solution is to have the RDS and the Lambda function on the public Internet, but this introduces the attack surface problem mentioned above. The other solution, the one that’s commonly recommended, is to set up a NAT Gateway to allow the Lambda function to make outgoing connections to the SMTP server.

Technically this is fine: the Lambda function and RDS remain protected behind the NAT because they’re not externally addressable, but the Lambda function can still make the outgoing connection it needs to send out emails. But there’s a dark side to this. Amazon is quite happy to set up a NAT to allow all this to happen, but it’ll charge for it by the hour as if it’s a continuously allocated instance. The benefits of running a serverless site go straight out the window, because now you’ve essentially got a continuously running, continuously charged, EC2 server running just to support the NAT. D’oh.

Happily there is a solution. It’s a kludge, but it does the trick. And the trick is to use S3 as a file-based gateway between a Lambda function that’s inside a VPC, and a Lambda function that’s outside a VPC. If the Lambda function inside the VPC wants to send an email, it creates a file inside a dedicated S3 bucket. At the same time we run a Lambda function outside the VPC, triggered by a file creation event attached to the bucket. The external Lambda function reads in the newly created file to collect the parameters needed for the email (recipient, subject and body), and then interacts with an SMTP server to send it out. Because this second Lambda function is outside the VPC it has no problem contacting the external SMTP server directly.

So what’s so magical about S3 that means it can be accessed by both Lambda functions, when nothing else can? The answer is that we can create a VPC endpoint for S3, meaning that it can be accessed from inside the VPC, without affecting the ability to access it from outside the VPC. Amazon have made special provisions to support this. You’d have thought they could do something similar with SES, their Simple Email Service, as well and fix the whole issue like that. But it’s not currently possible to set SES up as a VPC endpoint, so in the meantime we’re stuck using S3 as a poor-man’s messaging interface.

The code needed to get all this up-and-running is minimal, and even the configuration of the various things required to fit it all together isn’t particularly onerous. So let’s give it a go.

Creating an AWS Lambda S3 email bridge

As we’ve discussed, the vagaries of AWS mean it’s hard to send out emails from a Lambda function that’s trapped inside a VPC alongside its RDS instance. Let’s look at how it’s possible to use S3 as a bridge between two Lambda functions, allowing one function inside the VPC to communicate with a function outside the VPC, so that we can send some emails.

At the heart of it all is an S3 bucket, so we need to set that up first. We’ll create a dedicated bucket for the purpose called ‘yfp-email-bridge’. You can call it whatever you want, but you’ll need to switch out ‘yfp-email-bridge’ in the instructions below for whatever name you choose.

Create the bucket using the Amazon S3 dashboard and create a folder inside it called email. You don’t need to do anything clever with permissions, and in fact we want everything to remain private, otherwise we introduce the potential for an evil snooper to read the emails that we’re sending.

Here’s my S3 bucket set up with the email folder viewed through the AWS console.

Create an S3 bucket with a folder called 'email' inside

Now let’s create our email sending Lambda function. We’re using Python 3.6 for this, but you can rewrite it for another language if that makes you happy.

So, open the AWS Lambda console and create a new function. You can call it whatever you like, but I’ve chosen send_email_uploaded_to_s3_bridge (which in retrospect is a bit of a mouthful, but there’s no way to rename a function after you’ve created it so I’m sticking with that). Set the runtime to Python 3.6. You can either use an existing role, or create a new one with S3 read and write permissions.

Now add an S3 trigger for when an object is created, associated with the bucket you created, for files with a prefix of email/ and a suffix of .json. That’s because we’re only interested in JSON format files that end up in the ‘email’ folder. You can see how I’ve set this up using the AWS console below.

Set up the Lambda function to trigger at the right time.

When the trigger fires, a JSON string is sent to the Lambda function with contents much like the following. Look closely and you’ll see this contains not only details of the bucket where the file was uploaded, but also the filename of the file uploaded.

{
    "Records": [
        {
            "eventVersion": "2.0",
            "eventSource": "aws:s3",
            "awsRegion": "eu-west-1",
            "eventTime": "2018-08-20T00:06:19.227Z",
            "eventName": "ObjectCreated:Put",
            "userIdentity": {
                "principalId": "A224SDAA064V4C"
            },
            "requestParameters": {
                "sourceIPAddress": "XX.XX.XX.XX"
            },
            "responseElements": {
                "x-amz-request-id": "D76E8765EFAB3C1",
                "x-amz-id-2": "KISiidNG9NdKJE9D9Ak9kJD846hfii0="
            },
            "s3": {
                "s3SchemaVersion": "1.0",
                "configurationId": "67fe8911-76ae-4e67-7e41-11f5ea793bc9",
                "bucket": {
                    "name": "yfp-email-bridge",
                    "ownerIdentity": {
                        "principalId": "9JWEJ038UEHE99"
                    },
                    "arn": "arn:aws:s3:::yfp-email-bridge"
                },
                "object": {
                    "key": "email/email.json",
                    "size": 83,
                    "eTag": "58934f00e01a75bc305872",
                    "sequencer": "0054388a73681"
                }
            }
        }
    ]
}

Now we need to add some code to be executed on this trigger. The code is handed the JSON shown above, so it will need to extract the data from it, load in the appropriate file from S3 that the JSON references, extract the contents of the file, send out an email based on the contents, and then finally delete the original JSON file. It sounds complex but is actually pretty trivial in Python. The code I use for this is the following. You can paste this directly in as your function code too, just remember to update the sender variable to the email address you want to send from.


import os, smtplib, boto3, json
from email.mime.text import MIMEText

s3_client = boto3.client('s3')

def send_email(data):
    # Set this to the email address you want to send from
    sender = ''
    recipient = data['to']
    msg = MIMEText(data['body'])
    msg['Subject'] = data['subject']
    msg['From'] = sender
    msg['To'] = recipient

    result = json.dumps({'error': False, 'result': ''})
    try:
        # Environment variables are strings, so convert the port to an integer
        with smtplib.SMTP(host=os.environ['SMTP_SERVER'], port=int(os.environ['SMTP_PORT'])) as smtp:
            smtp.login(os.environ['SMTP_USERNAME'], os.environ['SMTP_PASSWORD'])
            # The sender is listed as a recipient too, so it receives a copy
            smtp.sendmail(sender, [recipient, sender], msg.as_string())
    except smtplib.SMTPException as e:
        result = json.dumps({'error': True, 'result': str(e)})
    return result

def lambda_handler(event, context):
    for record in event['Records']:
        # Pull the bucket name and filename out of the S3 event
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        size = record['s3']['object']['size']
        # Ignore files over a certain size
        if size < (12 * 1024):
            obj = s3_client.get_object(Bucket=bucket, Key=key)
            data = json.loads(obj['Body'].read().decode('utf-8'))
            # Send out the email described by the file contents
            send_email(data)

        # Delete the file
        print("Deleting file {bucket}:{key}".format(bucket=bucket, key=key))
        s3_client.delete_object(Bucket=bucket, Key=key)

This assumes that the following environment variables have been defined:

SMTP_SERVER
SMTP_PORT
SMTP_USERNAME
SMTP_PASSWORD

The purpose of these should be self-explanatory, and you’ll need to set their values to something appropriate to match the SMTP server you plan to use. As long as you know what values to use, filling them on the page when creating your Lambda function should be straightforward, as you can see in the screenshot below.

The lambda function needs some environment variables configured.

Now save the Lambda function configuration. We’ve completed half the work, and so this is a great time to test whether things are working.
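One way to exercise the bridge is to drop a suitably formatted JSON file into the email folder of the bucket. What follows is only a minimal sketch of how the function inside the VPC might do this. The helper name queue_email and the random UUID filename are assumptions made for illustration, while the 'to', 'subject' and 'body' keys, the email/ prefix and the .json suffix match what the sending function and its trigger expect.

```python
import json
import uuid

BUCKET = 'yfp-email-bridge'  # the bucket name used in this post


def queue_email(s3_client, to, subject, body, bucket=BUCKET):
    """Write an email request into the bridge bucket and return its key.

    Hypothetical helper: s3_client is expected to behave like the
    object returned by boto3.client('s3').
    """
    # The 'email/' prefix and '.json' suffix match the S3 trigger filter
    key = 'email/{}.json'.format(uuid.uuid4())
    # These keys are the ones read by send_email() in the sending function
    payload = json.dumps({'to': to, 'subject': subject, 'body': body})
    s3_client.put_object(Bucket=bucket, Key=key, Body=payload.encode('utf-8'))
    return key
```

In a real function the first argument would be boto3.client('s3'). Passing the client in, rather than creating it inside the helper, keeps the sketch easy to try out with a stand-in object.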

29 Jun 2018 : Going QML-Live #

In my spare time I've been developing a Qt app called GetiPlay. It's a simple app that allows you to download audio and video from BBC iPlayer, for use on Sailfish OS phones. The traditional approach on Linux devices would be to use get_iplayer in a console, but for all of the progress that's been made on mobile devices in the last decade, console use still sucks. Given I spend so much time listening to or watching BBC content, slapping a simple UI over the command line get_iplayer was an obvious thing to do.

The app has been developing nicely, using Qt Creator for C++ and the UI written in QML. Historically I've not been a fan of QML, but as I grow more familiar with it, it's been growing on me. For all of the things that I find weird about it, it really does give great performance and helps build a consistent UI, as well as promoting loose coupling between the UI and underlying functional logic.

A big downside to QML is that there's no preview, so the development process follows a consistent cycle: adjust code, build code, deploy code, test, repeat. The build and deploy steps are loooong. This impacts things in three serious ways: it makes development slow, it makes me sleepy, and it incentivises against making minor tweaks or experimentation.

Is It Worth the Time?

Nevertheless, there's always a trade-off between configuring and learning new technologies, and just getting things done using those you're already using. The ever-relevant XKCD has more than one pertinent comic covering this topic.


The UI for GetiPlay is straightforward, so I was quite content to use this lengthy, but (crucially) working approach until yesterday. What prompted me to change was a feature request that needed some more subtle UI work, with animated transitions between elements that I knew would take a couple of hundred cycles round that development loop to get right. Doing the maths using Randall Munroe's automation matrix, I needed to find a more efficient approach.

So this morning I started out using QML Live. This is a pretty simple tool with an unnecessarily bulky UI that nevertheless does a great job of making the QML design approach more efficient. You build and run the app as usual, then any QML changes are directly copied over to the device (or emulator) and appear in the app immediately. Previously a build cycle took between 40 and 100 seconds. Now it's too quick to notice: less than a second.

Qt Creator IDE and QML-Live

Using a quick back of the envelope calculation, I'll perform a UI tweak that would previously have required a rebuild around 20 times a day, but probably only every-other day, so let's say 10 times a day for the next six months. So (10 * 365 * 0.5) / (60 * 24) = 1.27 days I can save. I spent about half a day configuring everything properly, so that leaves a saving of 0.77 days, or 18 hours. Not bad!
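The same estimate can be written out as a short script, which makes the hidden assumption explicit: dividing by 60 * 24 only works if each avoided rebuild saves roughly one minute, which sits within the 40 to 100 second range of the old build cycle. Treat it as a sketch of the reasoning rather than a precise measurement.

```python
# Figures taken from the estimate above
tweaks_per_day = 10            # 20 a day, every other day
days = 365 * 0.5               # the next six months
minutes_saved_per_tweak = 1    # assumption: about a minute saved per rebuild
setup_cost_days = 0.5          # time spent configuring QML Live

minutes_per_day = 60 * 24
days_saved = tweaks_per_day * days * minutes_saved_per_tweak / minutes_per_day
net_days = days_saved - setup_cost_days
net_hours = net_days * 24

print(round(days_saved, 2), round(net_days, 2), round(net_hours))
# prints: 1.27 0.77 18
```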

QML-Live certainly isn't perfect, but it's simple, neat and has made me far more likely to try out interesting and experimental UI designs. Time configuring it is time well spent, even if that extra 18 hours is just about the same amount of time I wasted dithering over the last two days!

12 Jun 2018 : GetiPlay now actually plays, too #
For some time now I've been meaning to add a proper media player to GetiPlay. Why, you may well ask, bother to do this when Sailfish already has a perfectly good media player built in? Well, there are two reasons. First, for TV and radio programmes, one of the most important controls you can have is 'jump back a few seconds'. I need this when I'm watching something and get interrupted, or miss an important bit of the narrative, or whatever. It's such a useful button, it's worth writing a completely new media player for. Second, it's just far more seamless to have it all in one application.

So I finally got to adding it in. Here's the video player screen.

The Qt framework really does make it easy to add media like this. It still took a good few days to code up of course, but it'd be a lot quicker for someone who knew what they were doing.

I'm also quite proud of the audio player, with the same, super-useful '10 seconds back' button. It also stays playing no matter where you move to in the app. Here it is, showing the controls at the bottom of the screen.

If you'd like to get these new features in your copy of GetiPlay, just download the latest version from OpenRepos, grab yourself the source from GitHub, or check out the GetiPlay page.
6 Jun 2018 : Huge GetiPlay release 0.3-1 #
I'm really pleased to release version 0.3-1 of GetiPlay, the unofficial interface for accessing BBC iPlayer stuff on Sailfish OS. This latest version is a huge update compared to previous releases, with a completely new tab-based UI and a lovely download queue so you can download multiple programmes without interruption.

Immediate info about every one of the thousands and thousands of TV and radio programmes is also now just a tap away.

Install yourself a copy from OpenRepos, grab the MIT-licensed source from GitHub or visit the GetiPlay page on this site.
30 May 2018 : My last teaching at Cambridge #
In 2016 I did my first teaching at Cambridge, and now I've just finished what is likely to be my last ever supervision at Cambridge. The course was Part IB security (the second course out of three the students study), and as with all of the Cambridge courses, the structure is lectures and small-group supervisions (tutorials with two or three students). This term I was teaching students from St John's and Peterhouse colleges. My experience this term was made particularly good by a set of diligent and engaged students. In large classes, if there are too many questions it can become overwhelming, but with small groups there's much more scope to cover questions more deeply. Security covers a breadth of topics, from those that are quite straightforward to those that are much more conceptual, and all of the students this year were on the ball, both asking very sensible questions and answering questions for each other. That makes for a much more enjoyable teaching experience (and if you're reading this: good job; I hope you enjoyed the supervisions too).

The Computer Lab, Cambridge

So, I didn't think I'd say this, but I'll miss this teaching. I've had the privilege to experience teaching across multiple HE institutions in the UK (Oxford, Birmingham, Liverpool John Moores, Cambridge). Living up to the high teaching standards of my colleagues and what the students rightfully demand has been hard across all of these, but it's been great motivation and inspiration at the same time.

And, having grown up in a household of teachers, and after twenty years in the business, I think I've now seen enough of a spectrum to understand both the importance of teaching, but also its limitations. The attitude and aptitude of students plays such a crucial role in their learning. When you only get to interact with students in one small slice of their overall curriculum, there's a limit to how much you can affect this. That's not to downplay the importance of encouraging students in the right way, but rather to emphasise that teaching is a group activity. Students need good teachers across the board, and also need to bring an appetite.

It's great to teach good, enthusiastic students, and to see them grasp ideas as they're going along. But my ultimate conclusion is a rather selfish one: the best way to learn a practical subject is to do it; the best way to learn a theoretical subject is to teach it.
8 May 2018 : Finally addressing gitweb's gitosis #

My life seems to move in cycles. Back in February 2014 I set up git on my home server to host bare repositories for my personal dev projects. Up until then I'd been using Subversion on the same machine, and since most of my projects are personal this worked fine. Inevitably git became a sensible shift to make, so I set up gitolite for administration, with the Web front-end served up using gitweb.

Unfortunately, back then I couldn't get access control for the Web front-end to synchronise with gitolite. It's been a thorn ever since, and left me avoiding my own server in favour of others. There were two parts to the reason for this. First, the inability to host truly private projects was an issue. I often start projects, such as research papers where I host the LaTeX source on git, in private but then want to make them public later, for example when the paper has been published. Second, I was just unhappy that I couldn't set things up the way I wanted. It was important for me that the access control of the Web front-end should be managed through the same config approach as used by gitolite for the git repositories themselves. Anything else just seemed backwards.

Well, I've suddenly found myself with a bit of time to look at it, and it turned out to be far easier than I'd realised. With a few global configuration changes and some edits to the repository config, it's now working as it should.

So, this isn't intended as a tutorial, but in case anyone else is suffering from the same mismatched configuration approach between gitweb and gitolite, here's a summary of how I found to set things up in a coherent way.

First, the gitweb configuration. On the server git is set up with its own user (called 'git') and with the repositories stored in the project root folder /srv/git/repositories. The gitweb configuration file is /etc/gitweb.conf. In this file, I had to add the following lines:

$projects_list = $projectroot . "/../projects.list";
$strict_export = 1;

The first tells gitweb that the Web interface should only list the projects shown in the /srv/git/projects.list file. The second tells gitweb not to allow access to any sub-project that's not listed, even if someone knows (or can guess) the direct URL for accessing it.

However, that projects.list file has to be populated somehow. For this, I had to edit the gitolite config file at /srv/git/.gitolite.rc. This was already set up mostly correctly (probably with info I put in it four years ago), apart from the following line, which I had to add:

$GL_GITCONFIG_KEYS = "gitweb.owner|gitweb.description|gitweb.category";

This tells gitolite that any of these three keys can be validly added to the overall gitolite configuration files, for them to be propagated on to the repositories. The three values are used to display owner, description and category in the Web interface served by gitweb. However, even more importantly, any project that appears in the gitolite file with one of these variables will also be added to the projects.list file automatically.

That's great, because it means I can now add entries to my gitolite.conf that look like this:

repo    myproject
        RW+     =   flypig
        R       =   @all
        R       =   gitweb
        config gitweb.owner = flypig
        config gitweb.description = "A project which is publicly accessible"
        config gitweb.category = "Public stuff"

When these changes are pushed to the gitolite-conf repository, hey-presto! gitolite will automatically add the project to the projects.list file, and the project will be accessible through the Web interface. Remove the last four lines, and the project will go dark, hidden from external access.

It's a small change, but I'm really pleased that it's finally working properly after such a long time and I can get back to developing stuff using tools set up just the way I like them.

2 Apr 2018 : Apple believes privacy is a fundamental human right #
The latest update from Apple brought with it a rather grand statement about privacy, stating that "Apple believes privacy is a fundamental human right". So do I, as it happens, so I'm glad Apple are making it known. However, we've heard similar claims from companies like Microsoft in the past (remember Scroogled?), so I'm always sceptical when large multi-national companies that run successful advertising platforms make grand claims about their customers' privacy. Maybe it's even made me a bit cynical.

I much prefer to judge companies by their privacy policies than by their slick advertising statements, and to their credit Apple seem to be delivering on their privacy claims by putting their privacy policies right in front of their users. Unfortunately they've done it in a way that's totally unusable. The fact that all of the privacy statements are in one place is great. The fact that they're in a tiny box that doesn't allow you to export -- or even select and copy out -- all of the text, is a usability clusterfuck. Please Apple, by all means put the policy front and centre of your user interface, but provide us with a nicely formatted text file or Web page to view it all on as well.

  The Apple privacy window

If you're concerned about your privacy like me, you'll want to read through this material in full. But worry not. I've gone to the trouble of selecting each individual piece of text and pasting into a markdown file that, I think, makes things much more readable. View the whole thing on Github, and if you notice any errors or changes, please submit a pull request and I'll try to keep it up-to-date.

In spite of my cynicism, I actually believe Apple, Microsoft, Google and especially Facebook take user privacy incredibly seriously. They know that the whole model is built on trust and that users will be offended if they abuse this trust. Everyone says that 'the user is the product' on platforms like Facebook, as if to suggest they don't really care about you, but all of these companies also know that their value is based on the satisfaction of their users. They have to provide a good service or users will go elsewhere. The value they get from your data is based on their ability to control your data, which means privacy is important to them.

Unfortunately, the motivation these tech companies have for protecting your data is also something that undermines your and my privacy as users of their services. Privacy is widely misunderstood as being about whether data is made public or not, whereas -- at least by one definition -- it's really about having control over who has access to information about you. By this argument a person who chooses to make all of their data public is enjoying privacy, as long as they've done it without coercion, and can change their stance later.

The tech companies have placed themselves as the means by which we maintain this control, but this means we have to trust them fully, and it also means we have to understand them fully. Privacy policies are one of the most important tools for getting this understanding. As users, we should assume that their privacy policies are the only constraint on what they'll really be willing to do with our data. Anything they write elsewhere is subordinate to the policy, and given the mixture of jurisdictions and wildly varying capabilities of oversight bodies around the world, I'd even put more weight on these policies than I would on local laws. In short, the policies are what matters, and they should be interpreted permissively.
12 Mar 2018 : Spring time at Howe Farm Zoo #
The house Joanna and I are currently renting is right on the edge of Cambridge. The city centre is due south east, but to the north and to the west it’s just fields and the odd motorway as far as the eyes can see (which it turns out, according to Google Maps, is the Cambridge American Cemetery 2 miles away).

The view according to Google.


The view according to my window.


Because it’s so close to the edge of the city, it’s really quite rural and as a result we share our house and garden with large numbers of other animals. It’s not unusual for rabbits, squirrels, deer and pheasants to wander around the grounds (all 100 square meters of it). What’s more, the boundary between the outside and inside of our house is distressingly porous, with insects and arachnids apparently enjoying free movement between the two.

Last night my programming was interrupted by a vicious buzzing sound. It turned out to be a queen wasp, awoken from its slumber over the winter and now angrily headbutting my light shade in a bid to head towards the sun. I’m not keen on wasp stings to be honest, so extracting it was quite a delicate exercise that involved gingerly opening and closing the door, dashing in and out of the room, turning the light on and off and chasing the wasp with a Tupperware box. I got it eventually and dragged it out into the cold; I’m sure it’ll return.

I take this to be a clear sign that spring has arrived. The turning of the seasons marks the four points of the year I love most, so I’m excited by this. Other signs that we’re reaching spring include the spiders that have started stalking me during my morning showers, and the arrival of beautiful clumps of daffodils on the lawn in our garden. So, roll on spring I say. Let’s get the dull winter behind us and start to sprout.

Daffodils in the garden

6 Mar 2018 : Beauty and the User Agent String #
Thanks to OSNews for linking to this great article about the messed up history of the Browser User Agent String. There's a moral in this story somewhere, but only if you can overcome the immediate feeling of despair about human progress this article induces.
25 Feb 2018 : Being successful as a thief #
Ars has a great video interviewing Paul Neurath about the troubled development of Thief. I loved sneaking around in the Thief games, from the original right through Deadly Shadows and up to the latest remake. But apart from a wonderful excuse to replay the games in my head, the real message of the video is about the challenges and time pressures of development, something I'm acutely aware of right now with Pico.

"You have to make mistakes. You try things, you go down a lot of dead ends. In this case a lot of those dead ends didn't pan out. But we were learning... That was the key thing. We finally had the mental model after doggedly pursuing this for a year. Now we know what we need to do to get this done and we figured it out and got it done."

When I was young game developers were my heroes. It's good to know that such an inspirational series of games suffered failures and challenges, but still came out as the amazing games they were. We're all working towards the moments they experienced, when "it worked and it felt great."

12 Feb 2018 : Countdown #
I'm not convinced it was a good use of my time, but I spent the weekend writing some code to solve the Countdown numbers game. In case you're not familiar with Countdown, here's a clip.

There are lots of ways to do this, but my solution hinges on being able to enumerate all binary trees with a given number of nodes. Doing this efficiently (both in terms of time and memory) turned out to be tricky, and there's a hinge for this too, based on how the trees are represented. The key is to note that each layer can't have more than n nodes, where n is the number of nodes the tree can have overall.

Each tree is stored as a list, with each item in the list representing the nodes at a given depth in the tree (a layer). Each item is a bit sequence representing which nodes in the layer have children.

These bit sequences would get long quickly if they represented every possible node in a layer (since there are 2, 4, 8, 16, 32, ... possible nodes at each layer). Instead, the index of the bit represents the index into the actual nodes in the layer, rather than the possible nodes. This greatly limits the length of the bit sequence, because there can be no more than n actual nodes in each layer, and there can be no more than n layers in total. The memory requirement is therefore at most n² bits.

Here's an example:
T = [[1], [1, 1], [1, 1, 0, 0], [0, 1, 0, 0]]
which represents a tree like this:

A binary tree
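
To make the representation a bit more concrete, here's a minimal sketch that checks a layered tree for consistency and counts its nodes. It isn't taken from the GitHub code: the function names are made up for illustration, and it assumes that a 1 bit means the node has exactly two children, and that the final layer of leaves isn't stored. Both assumptions fit the layer sizes in the example above (1, 2, 4, 4), but they are my reading rather than a definitive description.

```python
def is_valid(tree):
    """Check each layer holds two nodes per flagged node in the layer above.

    Assumption: a 1 bit means "this node has exactly two children".
    """
    expected = 1  # the root layer always holds a single node
    for layer in tree:
        if len(layer) != expected:
            return False
        expected = 2 * sum(layer)
    return True


def count_nodes(tree):
    """Count all nodes, including the final layer of leaves.

    Assumption: the children of the last stored layer are leaves that
    are not themselves stored in the list.
    """
    return sum(len(layer) for layer in tree) + 2 * sum(tree[-1])


T = [[1], [1, 1], [1, 1, 0, 0], [0, 1, 0, 0]]
print(is_valid(T), count_nodes(T))
```

Under these assumptions the example tree has six internal nodes and seven leaves, and no bit sequence is ever longer than the number of nodes actually present in its layer.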

It's really easy to cycle through all of these, because you can just enumerate each layer individually, which involves cycling through all sequences of binary strings.

It's not a new problem, but it was a fun exercise to figure out.

The code is up on GitHub in Python if you want to play around with it yourself.
30 Sep 2017 : Connecting to an iPhone using BLE and gatttool #

On the Pico project we've recently been moving from Bluetooth Classic to BLE. We have multiple motivations for this, and not just the low energy promise. In addition, BLE provides RSSI values, which means we can control proximity detection better, and frankly, Bluetooth has been causing us a lot of reliability problems that we've had to work around. We're hoping BLE will work better. Finally, we're developing an iPhone client. iOS, it seems, doesn't properly support Bluetooth Classic, so BLE is our best option for cross-platform compatibility.

One of the challenges of developing anything that uses a protocol built on top of some transport is that typically both ends of the protocol have to be developed simultaneously. This slows things down, especially when we're trying to distribute tasks across several developers. So we were hoping to use gatttool, part of bluez on Linux, as an intermediate step, to allow us to check the initial iOS BLE code worked before moving on to the Pico protocol proper.

So, here's a quick summary of how we used gatttool to write characteristics to the iPhone.

One point to note is that things weren't smooth for us. In retrospect we had the iPhone correctly running as a BLE peripheral, but we had real trouble connecting. I'll explain how we fixed this too.

Writing to a BLE peripheral using bluez is a four step process:

  1. Scan for the device using hcitool.
  2. Having got the MAC from the scan, connect to it using gatttool.
  3. Find the handle of the characteristic you want to write to.
  4. Perform the write.

The first step, scanning for a device, can be done using the following command.

sudo hcitool -i hci0 lescan

This command uses hcitool to perform a BLE scan (lescan) using the local device (-i hci0). If you have more than one Bluetooth adaptor, you may want to specify the use of something other than hci0.

When we first tried this, we kept on getting input/output errors, even when run as root. I don't know why this was, but eventually we found a solution:

sudo hciconfig hci0 down
sudo hciconfig hci0 up

Not very elegant, but it seemed to work. After this, the scan started throwing up results.

flypig@delphinus:~$ sudo hcitool -i hci0 lescan
LE Scan ...
58:C4:C5:1F:C7:70 (unknown)
58:C4:C5:1F:C7:70 Pico's iPhone
CB:A5:42:40:F8:68 (unknown)
58:C4:C5:1F:C7:70 (unknown)
58:C4:C5:1F:C7:70 Pico's iPhone

Note the repeated entries. The device I was interested in was "Pico's iPhone", where we were running our test app. On other occasions when I've performed the scan, the iPhone MAC address came up, but without the name (marked as "unknown"). Again, I don't know why this is, but just trying the MACs eventually got me connected to the correct device.
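
Since the scan repeats itself and mixes named and "(unknown)" entries, it can help to collapse the output down to one line per MAC. Here's a rough sketch of how that might be done. The parse_lescan function is just something I've made up for illustration, and it assumes the output always has the "MAC name" shape shown above.

```python
import re

# Sample output copied from the scan above
SCAN_OUTPUT = """\
LE Scan ...
58:C4:C5:1F:C7:70 (unknown)
58:C4:C5:1F:C7:70 Pico's iPhone
CB:A5:42:40:F8:68 (unknown)
58:C4:C5:1F:C7:70 (unknown)
58:C4:C5:1F:C7:70 Pico's iPhone
"""

# A MAC address followed by a single space and the advertised name
LINE = re.compile(r"^((?:[0-9A-F]{2}:){5}[0-9A-F]{2}) (.*)$", re.IGNORECASE)


def parse_lescan(text):
    """Map each MAC address to the best name seen, preferring real names."""
    devices = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip the "LE Scan ..." banner and anything unexpected
        mac, name = match.groups()
        # Only overwrite an existing entry if we now have a real name
        if name != "(unknown)" or mac not in devices:
            devices[mac] = name
    return devices


for mac, name in parse_lescan(SCAN_OUTPUT).items():
    print(mac, name)
```

A MAC that only ever shows up as "(unknown)" is still kept, since trying those addresses was sometimes the only way to find the phone.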

Having got the MAC, now it's time to connect (step 2).

sudo gatttool -t random -b 58:C4:C5:1F:C7:70 -I

What's this all about? Here we're using gatttool to connect to the remote device using its Bluetooth address (-b 58:C4:C5:1F:C7:70). Obviously if you're doing this at home you should use the correct MAC which is likely to be different from this. Our iPhone is using a random address type, so we have to specify this too (-t random). Finally, we set it to interactive mode with -I. This will open gatttool's own command console so we can do other stuff.

If everything goes well, the console prompt will change to include the MAC address.

[58:C4:C5:1F:C7:70][LE]>

So far we've only set things up and not actually connected. So we should connect.

[58:C4:C5:1F:C7:70][LE]> connect
Attempting to connect to 58:C4:C5:1F:C7:70
Connection successful

Great! Now there's a time problem. The iPhone will throw us off this connection after only a few seconds. If it does, enter 'connect' again to re-establish the connection. There's another catch though, so be careful: the iPhone will also periodically change its MAC address. If it does, you'll need to exit the gatttool console (Ctrl-D), rescan and then reconnect to the device as above.

Having connected we want to know what characteristics are available, which we do by entering 'characteristics' at the console.

[58:C4:C5:1F:C7:70][LE]> characteristics
handle: 0x0002, char properties: 0x02, char value handle: 0x0003, uuid: 00002a00-0000-1000-8000-00805f9b34fb
handle: 0x0004, char properties: 0x02, char value handle: 0x0005, uuid: 00002a01-0000-1000-8000-00805f9b34fb
handle: 0x0007, char properties: 0x20, char value handle: 0x0008, uuid: 00002a05-0000-1000-8000-00805f9b34fb
handle: 0x000b, char properties: 0x98, char value handle: 0x000c, uuid: 8667556c-9a37-4c91-84ed-54ee27d90049
handle: 0x0010, char properties: 0x98, char value handle: 0x0011, uuid: af0badb1-5b99-43cd-917a-a77bc549e3cc
handle: 0x0034, char properties: 0x12, char value handle: 0x0035, uuid: 00002a19-0000-1000-8000-00805f9b34fb
handle: 0x0038, char properties: 0x12, char value handle: 0x0039, uuid: 00002a2b-0000-1000-8000-00805f9b34fb
handle: 0x003b, char properties: 0x02, char value handle: 0x003c, uuid: 00002a0f-0000-1000-8000-00805f9b34fb
handle: 0x003e, char properties: 0x02, char value handle: 0x003f, uuid: 00002a29-0000-1000-8000-00805f9b34fb
handle: 0x0040, char properties: 0x02, char value handle: 0x0041, uuid: 00002a24-0000-1000-8000-00805f9b34fb
handle: 0x0043, char properties: 0x88, char value handle: 0x0044, uuid: 69d1d8f3-45e1-49a8-9821-9bbdfdaad9d9
handle: 0x0046, char properties: 0x10, char value handle: 0x0047, uuid: 9fbf120d-6301-42d9-8c58-25e699a21dbd
handle: 0x0049, char properties: 0x10, char value handle: 0x004a, uuid: 22eac6e9-24d6-4bb5-be44-b36ace7c7bfb
handle: 0x004d, char properties: 0x98, char value handle: 0x004e, uuid: 9b3c81d8-57b1-4a8a-b8df-0e56f7ca51c2
handle: 0x0051, char properties: 0x98, char value handle: 0x0052, uuid: 2f7cabce-808d-411f-9a0c-bb92ba96c102
handle: 0x0055, char properties: 0x8a, char value handle: 0x0056, uuid: c6b2f38c-23ab-46d8-a6ab-a3a870bbd5d7
handle: 0x0059, char properties: 0x88, char value handle: 0x005a, uuid: eb6727c4-f184-497a-a656-76b0cdac633b

In this case, there are many characteristics, but the one we're interested in is the last one, with UUID 'eb6727c4-f184-497a-a656-76b0cdac633b'. We know this is the one we're interested in, because this was the UUID we used in our iPhone app. We set this up to be a writable characteristic, so we can also write to it.
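
The 'char properties' column in that listing is a bitmask, and decoding it is a quick way to confirm a characteristic really is writable. The sketch below uses the property bits defined in the Bluetooth Core Specification. The decode_properties helper and the lower-case names are my own, not part of gatttool.

```python
# Characteristic property bits as defined in the Bluetooth Core Specification
PROPERTIES = [
    (0x01, "broadcast"),
    (0x02, "read"),
    (0x04, "write-without-response"),
    (0x08, "write"),
    (0x10, "notify"),
    (0x20, "indicate"),
    (0x40, "authenticated-signed-writes"),
    (0x80, "extended-properties"),
]


def decode_properties(value):
    """Return the names of all property bits set in a 'char properties' byte."""
    return [name for bit, name in PROPERTIES if value & bit]


# 0x88 is the value reported for our characteristic (handle 0x0059)
print(decode_properties(0x88))
# 0x02 is the value reported for the first characteristic in the list
print(decode_properties(0x02))
```

For our characteristic the 0x88 value decodes to write plus extended properties, which matches how we set it up in the app.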

[58:C4:C5:1F:C7:70][LE]> char-write-req 0x005a 5069636f205069636f205069636f205069636f205069636f205069636f205069636f20
Characteristic value was written successfully

Success! On the iPhone side, we set it up to output the characteristic to the log if it was written to. So we see the following.

2017-09-29 20:09:21.206744+0100 Pico[241:25455] QRCodeReader:start()
2017-09-29 20:09:21.802875+0100 Pico[241:25455] BLEPeripheral: State Changed
2017-09-29 20:09:21.803002+0100 Pico[241:25455] BLEPeripheral: Powered On
2017-09-29 20:09:22.801024+0100 Pico[241:25455] BLEPeripheral:start()
2017-09-29 20:10:01.122027+0100 Pico[241:25455] BLE received: Pico Pico Pico Pico Pico Pico Pico

Where did all those 'Pico's come from? That's the value we wrote in, but in hexadecimal ASCII:

50 69 63 6f 20 50 69 63 6f 20 50 69 63 6f 20 50 69 63 6f 20 50 69 63 6f 20 50 69 63 6f 20 50 69 63 6f 20
P  i  c  o     P  i  c  o     P  i  c  o     P  i  c  o     P  i  c  o     P  i  c  o     P  i  c  o    
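
If you'd rather not convert the payload by hand, a couple of lines of Python will do it. This is just a convenience sketch for producing and checking the hex string passed to char-write-req, and isn't part of our tooling.

```python
# Build the hex payload for char-write-req from a plain ASCII string
message = "Pico " * 7
payload = message.encode("ascii").hex()
print(payload)

# And go the other way, to check what a hex payload will look like on arrival
decoded = bytes.fromhex(payload).decode("ascii")
print(repr(decoded))
```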

So, to recap, the following is the command sequence we used.

sudo hciconfig hci0 down
sudo hciconfig hci0 up
sudo hcitool -i hci0 lescan
sudo gatttool -t random -b 58:C4:C5:1F:C7:70 -I
[LE]> connect
[LE]> characteristics
[LE]> char-write-req 0x005a 5069636f205069636f205069636f205069636f205069636f205069636f205069636f20

When it's working, my experience is that gatttool works well. But BLE is a peculiar paradigm, very different from general networking, and it offers lots of opportunity for confusion.

30 May 2017 : Catastrophic success #
I’ve been using computers in a serious way for the last 32 years and have been taking backups seriously for about half of that, starting with backup to CD-RW in 2002, then to a removable disk caddy a few years later, and a USB hard drive in 2007. For most of that time I’ve been aware of the importance of off-site backups, but it wasn’t until October last year that I actually started doing it. Now my machines all perform weekly incremental backups to my home server, which in turn are client-side encrypted and transferred to Amazon S3.
CD backup in 2002 Hard drive backup in 2007

Despite all of this effort I’ve never had to resort to restoring any of these backups. It’s surprising to think that over all this time, none of my hard drives have ever failed catastrophically.

That was until last Thursday, when I arrived home to discover Constantia, my home server, had suffered a serious failure due to a sequence of power cuts during the day. A bit of prodding made clear that it was the hard drive that had failed. I rely heavily on Constantia to manage my diary, cloud storage, git repos, DNS lookup and so on, so this was a pretty traumatic realisation. On Friday I ordered a replacement hard drive, which arrived Sunday morning.

Luckily Constantia has her operating system on a separate solid state drive, so with a bit of fiddling with fstab I was able to get her to boot again, allowing me to install and format the new drive. I then started the process of restoring the backup from S3.

Backup in progress
Thirteen hours and 55 minutes later, the restore is complete. Astonishingly, Constantia is now as she was before the failure. Best practice is to regularly test not just your backup process, but your restore process as well. But it’s a time-consuming and potentially dangerous process in itself, so I’m not proud to admit that this was the first time I’d attempted a restore. I’m therefore happy and astonished to say that it worked flawlessly. It’s as if I turned Constantia off and then three days later turned her back on again.

Credit goes to the duplicity and déjà-dup authors. Your hard work made my life so much easier. What could have been hugely traumatic turned out to be just some lost time. On the other hand, it also puts into perspective other events that have been happening this weekend. BA also suffered a power surge which took out its systems on Saturday morning. It took them two days to get their 500 machines spread across two data centres back up and running, while it took me three days to get my one server restored.
28 May 2017 : Catastrophic failure #
A series of power cuts last Thursday left Constantia, my home server, in a sorry state. On start-up, she would make a sort-of repeating four-note melody, then crash out to a recovery terminal.

Constantia is poorly

I've subsequently discovered that the strange noises were from the hard drive failing, presumably killed by the repeated power outages. A replacement hard drive arrived this morning (impressively on a Sunday, having been ordered from Amazon Friday evening), which I'm in the process of restoring a backup onto.

Old drive on the left, new drive on the right

Right now I'm apprehensive to say the least. This is the first real test of my backup process, which stores encrypted snapshots on Amazon S3 using Déjà Dup. If it works, I'll be happy and impressed, but I'm preparing myself for trouble.

When I made the very first non-incremental backup of Constantia to S3 it took four days. I'm hoping restoring will be faster.
6 May 2017 : Detectorists #
Last week while away in Paris at EuroUSEC I received a distraught phone call from Joanna. She'd been mowing the lawn (reason enough for distress in itself) and in the process lost her engagement ring. She was pretty upset to be honest, which made me upset being so far away and not able to help. The blame, it transpired, could be traced back to the stinging nettles in our garden. Joanna had been stung while clearing them and moved the ring onto her right hand as a result. That left it more loose than usual, and it probably then fell off while bailing grass cuttings.

We determined to search and find the ring when I got back, and as a backup plan we'd source a metal detector and try that if it came to it. Having seen every episode of Detectorists and loved them, we knew this would work. Secretly, neither of us were quite so certain.

Our unaided search proved fruitless. We scoured the garden over the whole weekend, but ultimately decided our rudimentary human senses weren't going to cut it. We ordered a £30 metal detector from Amazon. In case you're not familiar with the metal-detector landscape, that really is at the bottom end of the market. We weren't really prepared to pay more for something we anticipated using only once, and that might anyway turn out to be pointless. As you can see, we really didn't fancy our chances.

We used the metal detector for a bit, but again, didn't seem to be getting anywhere. It would happily detect my silver wedding ring, and buzzed aggressively when I swooshed it too close to my shoes (metal toe caps; they confuse airport security no end as well), but finding anything other than my feet was proving to be a lot harder.
We discovered that the detector doesn't just detect metal in general, but can differentiate between different types of metal depending on how it's configured. Joanna's ring is white gold, not silver, so we had to find another piece of white gold in the house to test it on.

Soon after that we started to uncover treasure. First a scrunched up piece of aluminium foil buried a few centimetres under our lawn. Then a rusty corner of a piece of old iron sheeting about 5mm thick, buried some 10cm below the ground. As you can imagine we were feeling a lot more confident after having found some real treasure.
And then, just a few minutes later, the detector buzzed again and scrabbling through the grass cuttings revealed Joanna's lost engagement ring, lost no more.

We were pretty chuffed with ourselves. And we were pretty chuffed with the metal detector. If the Detectorists taught us anything, it's that finding treasure is hard. Granted our treasure-hunting creds are somewhat undermined by us having lost the treasure in the first place, but we found the treasure nonetheless. And it was gold we found, so justification enough for us to perform a small version of the gold dance.
Joanna found a ring I found a piece of rusty metal
Joanna found a white-gold ring... ...while I found a rusty old sheet of iron
15 Apr 2017 : Terrible computing choices #
I've just done a terrible thing. For literally months I've been planning my next laptop upgrade, weighing the alternatives and comparing specs. This will end up being my daily workhorse, and these aren't cheap machines so it's worth getting it right. I narrowed it down to two different devices: the Dell XPS 13 and the Razer Blade Stealth.
Razer Blade Stealth Dell XPS 13

Physically the RBS is a beautifully crafted device, small and light but with a solidity and finish that left me drooling when I handled it in the Razer store in San Francisco. In comparison the XPS is dull and uninspiring. It's competently made for sure, but suffers from the sort of classic PC over-design that makes the Apple-crowd smug. For the record if I owned an RBS I'd find it hard to hide my smugness.

The XPS is indisputably the better machine. It has a larger screen in a smaller chassis and a much better battery life all for a slightly lower price. In spite of this, the excitement of the RBS won out over the cold hard specs of the XPS. The Dell is simply not an exciting machine in the same way as the RBS with its magically colourful keyboard.

Why then, after all this, have I just gone and ordered the Dell? After making my decision to buy the RBS I dug deeper into how to run Linux on it. The Web reports glitches with a flickering screen, dubious Wi-fi drivers, crashing caps-lock keys and broken HDMI output. On the other hand, Dell supports Ubuntu as a first-class OS, which reassures me that the experience will be glitch-free.

After months of deliberation I chose specs over beauty, which I fear may mean I've finally strayed into adulthood. It feels like a terrible decision, while at the same time almost certainly being the right decision. Clearly I'm still not convinced I made the right choice, but at least I finally did.


                   Razer Blade Stealth            Dell XPS 13
CPU                3.5GHz Intel Core i7-7500U     3.5GHz Intel Core i7-7500U
Memory             16GB, 1866MHz LPDDR3           16GB, 1866MHz LPDDR3
Graphics           Intel HD620                    Intel HD620
Resolution         3840 x 2160                    3200 x 1800
Screen size (in)
Battery (WHr)
Height (mm)
Width (mm)
Depth (mm)
Weight (kg)
Backlit keyboard   Whoa yes
Ports              USB-C, 2 x USB-3, HDMI, 3.5mm  USB-C, 2 x USB-3, SD card, 3.5mm, AC
Looks              Real nice                      Dull :(
Linux compat       Unsupported, glitches          Officially supported
Price (£)



20 Mar 2017 : Rise of the Tomb Raider #
Rise of the Tomb Raider was released for PC over a year ago now, so it's about time I got back on track with my quest to complete all the Tomb Raider games. After scouring caverns, military bases, villages and, well, tombs, for artefacts and challenges, I've finally got there again.
It was a good game as always, not as tight as the originals but enjoyable and kept me searching for treasure. Perhaps the biggest surprise was to find myself chasing chickens through tombs as the ultimate game finale.

Here it is, added to my ongoing list of completed Croft games, previously updated a few years back now.
  • Tomb Raider.
  • Unfinished Business and Shadow of the Cat.
  • Tomb Raider II: Starring Lara Croft.
  • Tomb Raider III: Adventures of Lara Croft.
  • The Golden Mask.
  • Tomb Raider: The Last Revelation.
  • Tomb Raider: The Lost Artefact.
  • Tomb Raider Chronicles.
  • Tomb Raider: The Angel of Darkness.
  • Tomb Raider Legend.
  • Tomb Raider Anniversary.
  • Tomb Raider Underworld.
  • Lara Croft and the Guardian of Light.
  • Tomb Raider (reboot).
  • Lara Croft and the Temple of Osiris.
  • Rise of the Tomb Raider.
And, because chickens don't make for the most visually-stunning screenshots, here's a spectacular vista from the section in Syria, including obligatory lens flare and carefully undisturbed artefact.

Classic Tomb Raider beauty
10 Mar 2017 : Minor Pico victories #
Late last night (or more correctly this morning) my SailfishOS phone completed its first ever successful authentication with my laptop using Pico over Bluetooth. A minor, but very fulfilling, victory. One step closer to making Pico a completely seamless part of my everyday life.

Authentication-wrangling results
4 Mar 2017 : A tale of woe: failing to heed the certificate-pinning warnings #
As I mentioned previously, last month I discovered rather abruptly that Firefox revoked the StartCom root certificate used to sign the TLS certificate on my site. Ouch. To ease the pain, I planned to move over to using Let's Encrypt, a free service that will automatically generate a new certificate for my site every few months. Both StartCom and Let's Encrypt use a similar technique: they verify only that I have control over the apache2 user on my server by demonstrating that I can control the contents of the site. But the pain hurt particularly badly because I'd been using certificate-pinning, which essentially prevents me using any other certificates apart from a small selection that I keep as backups. Let's Encrypt doesn't give you control over the certificates it signs. The result: anyone who visited my site in the last month (of which there are no-doubt countless millions) would be locked out of it. It's the certificate-pinning nightmare everyone warns you about. So I ratcheted the pinning down from a month to 60 seconds and waited for browsers across the world to forget my previously-pinned certificate.
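For anyone who hasn't run into pinning before, here's a rough sketch of what's going on, assuming the pinning in question is HTTP Public Key Pinning (HPKP, RFC 7469). The server sends a header listing hashes of the public keys it promises to use, along with a max-age telling browsers how long to remember them. The key bytes below are placeholders rather than real keys, and the helper names are made up for illustration.

```python
import base64
import hashlib


def pin_sha256(spki_der):
    """Base64 of the SHA-256 digest of a DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def hpkp_header(pins, max_age):
    """Build a Public-Key-Pins header line from a list of pins."""
    parts = ['pin-sha256="{}"'.format(pin) for pin in pins]
    parts.append("max-age={}".format(max_age))
    return "Public-Key-Pins: " + "; ".join(parts)


# Placeholder bytes standing in for real public keys (hypothetical values)
live_key = b"placeholder for the live key's SPKI"
backup_key = b"placeholder for a backup key's SPKI"
pins = [pin_sha256(live_key), pin_sha256(backup_key)]

# Pinned for 30 days: browsers refuse any other key for a month
print(hpkp_header(pins, 30 * 24 * 60 * 60))
# Ratcheted down to 60 seconds so that browsers forget the pins quickly
print(hpkp_header(pins, 60))
```

The catch is that reducing max-age only helps browsers that visit again and see the new header. Anyone who last visited under the 30 day policy keeps the old pins until those 30 days run out, which is why I had to wait.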
Today, the 30 days finally expired. In theory, my previously pinned certificates are no longer in force and it's safe for me to switch over to Let's Encrypt. And so this is what I've done.
Check for yourself by visiting and hitting the little green padlock that appears in the address bar. Depending on the browser it should state that it's a secure connection, verified by Let's Encrypt.
Does the stark black-and-white page render beautifully? Then great! Does it say the certificate has expired, is invalid, or has been revoked? Well, then I guess I screwed up, so please let me know.
I didn't really learn my lesson though. In my desperate need to get a good score on, I've turned certificate-pinning back on (thanks Henrik Lilleengen for leading me astray). Nothing could possibly go wrong this time, right?
22 Feb 2017 : Fedora's horribly hobbled OpenSSL implementation #
For reasons best known to their lawyers, Red Hat have chosen to hobble their implementation of OpenSSL. According to a related bug, possible patent issues have led them to remove a large number of the elliptic curve parametrisations, as you can see by comparing the curves supported on Fedora 25:
[flypig@blaise ~]$ openssl ecparam -list_curves
  secp256k1 : SECG curve over a 256 bit prime field
  secp384r1 : NIST/SECG curve over a 384 bit prime field
  secp521r1 : NIST/SECG curve over a 521 bit prime field
  prime256v1: X9.62/SECG curve over a 256 bit prime field
with those supported on Ubuntu 16.04:
flypig@Owen:~$ openssl ecparam -list_curves
  secp112r1 : SECG/WTLS curve over a 112 bit prime field
  secp112r2 : SECG curve over a 112 bit prime field
  secp128r1 : SECG curve over a 128 bit prime field
  secp128r2 : SECG curve over a 128 bit prime field
  secp160k1 : SECG curve over a 160 bit prime field
  secp160r1 : SECG curve over a 160 bit prime field
  secp160r2 : SECG/WTLS curve over a 160 bit prime field
  secp192k1 : SECG curve over a 192 bit prime field
  secp224k1 : SECG curve over a 224 bit prime field
  secp224r1 : NIST/SECG curve over a 224 bit prime field
  secp256k1 : SECG curve over a 256 bit prime field
  secp384r1 : NIST/SECG curve over a 384 bit prime field
  secp521r1 : NIST/SECG curve over a 521 bit prime field
  prime192v1: NIST/X9.62/SECG curve over a 192 bit prime field
  prime192v2: X9.62 curve over a 192 bit prime field
  prime192v3: X9.62 curve over a 192 bit prime field
  prime239v1: X9.62 curve over a 239 bit prime field
  prime239v2: X9.62 curve over a 239 bit prime field
  prime239v3: X9.62 curve over a 239 bit prime field
  prime256v1: X9.62/SECG curve over a 256 bit prime field
  sect113r1 : SECG curve over a 113 bit binary field
  sect113r2 : SECG curve over a 113 bit binary field
  sect131r1 : SECG/WTLS curve over a 131 bit binary field
  sect131r2 : SECG curve over a 131 bit binary field
  sect163k1 : NIST/SECG/WTLS curve over a 163 bit binary field
  sect163r1 : SECG curve over a 163 bit binary field
  sect163r2 : NIST/SECG curve over a 163 bit binary field
  sect193r1 : SECG curve over a 193 bit binary field
  sect193r2 : SECG curve over a 193 bit binary field
  sect233k1 : NIST/SECG/WTLS curve over a 233 bit binary field
  sect233r1 : NIST/SECG/WTLS curve over a 233 bit binary field
  sect239k1 : SECG curve over a 239 bit binary field
  sect283k1 : NIST/SECG curve over a 283 bit binary field
  sect283r1 : NIST/SECG curve over a 283 bit binary field
  sect409k1 : NIST/SECG curve over a 409 bit binary field
  sect409r1 : NIST/SECG curve over a 409 bit binary field
  sect571k1 : NIST/SECG curve over a 571 bit binary field
  sect571r1 : NIST/SECG curve over a 571 bit binary field
  c2pnb163v1: X9.62 curve over a 163 bit binary field
  c2pnb163v2: X9.62 curve over a 163 bit binary field
  c2pnb163v3: X9.62 curve over a 163 bit binary field
  c2pnb176v1: X9.62 curve over a 176 bit binary field
  c2tnb191v1: X9.62 curve over a 191 bit binary field
  c2tnb191v2: X9.62 curve over a 191 bit binary field
  c2tnb191v3: X9.62 curve over a 191 bit binary field
  c2pnb208w1: X9.62 curve over a 208 bit binary field
  c2tnb239v1: X9.62 curve over a 239 bit binary field
  c2tnb239v2: X9.62 curve over a 239 bit binary field
  c2tnb239v3: X9.62 curve over a 239 bit binary field
  c2pnb272w1: X9.62 curve over a 272 bit binary field
  c2pnb304w1: X9.62 curve over a 304 bit binary field
  c2tnb359v1: X9.62 curve over a 359 bit binary field
  c2pnb368w1: X9.62 curve over a 368 bit binary field
  c2tnb431r1: X9.62 curve over a 431 bit binary field
  wap-wsg-idm-ecid-wtls1: WTLS curve over a 113 bit binary field
  wap-wsg-idm-ecid-wtls3: NIST/SECG/WTLS curve over a 163 bit binary field
  wap-wsg-idm-ecid-wtls4: SECG curve over a 113 bit binary field
  wap-wsg-idm-ecid-wtls5: X9.62 curve over a 163 bit binary field
  wap-wsg-idm-ecid-wtls6: SECG/WTLS curve over a 112 bit prime field
  wap-wsg-idm-ecid-wtls7: SECG/WTLS curve over a 160 bit prime field
  wap-wsg-idm-ecid-wtls8: WTLS curve over a 112 bit prime field
  wap-wsg-idm-ecid-wtls9: WTLS curve over a 160 bit prime field
  wap-wsg-idm-ecid-wtls10: NIST/SECG/WTLS curve over a 233 bit binary field
  wap-wsg-idm-ecid-wtls11: NIST/SECG/WTLS curve over a 233 bit binary field
  wap-wsg-idm-ecid-wtls12: WTLS curvs over a 224 bit prime field
    IPSec/IKE/Oakley curve #3 over a 155 bit binary field.
    Not suitable for ECDSA.
    Questionable extension field!
    IPSec/IKE/Oakley curve #4 over a 185 bit binary field.
    Not suitable for ECDSA.
    Questionable extension field!
  brainpoolP160r1: RFC 5639 curve over a 160 bit prime field
  brainpoolP160t1: RFC 5639 curve over a 160 bit prime field
  brainpoolP192r1: RFC 5639 curve over a 192 bit prime field
  brainpoolP192t1: RFC 5639 curve over a 192 bit prime field
  brainpoolP224r1: RFC 5639 curve over a 224 bit prime field
  brainpoolP224t1: RFC 5639 curve over a 224 bit prime field
  brainpoolP256r1: RFC 5639 curve over a 256 bit prime field
  brainpoolP256t1: RFC 5639 curve over a 256 bit prime field
  brainpoolP320r1: RFC 5639 curve over a 320 bit prime field
  brainpoolP320t1: RFC 5639 curve over a 320 bit prime field
  brainpoolP384r1: RFC 5639 curve over a 384 bit prime field
  brainpoolP384t1: RFC 5639 curve over a 384 bit prime field
  brainpoolP512r1: RFC 5639 curve over a 512 bit prime field
  brainpoolP512t1: RFC 5639 curve over a 512 bit prime field
I only discovered this when trying to build a libpico rpm. The missing curves cause particular problems for Pico, because we use prime192v1 for our implementation of the Sigma-I protocol. Getting around this is awkward, since we don’t have a crypto-negotiation step (maybe there’s a lesson there, although protocol negotiation is also a source of vulnerabilities).
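Without a negotiation step, about the best a build can do is check up front whether the curve it needs is there. Here's a minimal sketch of that check, which parses text in the format that openssl ecparam -list_curves prints. The sample strings are cut down from the listings above, and parse_curves is a made-up helper rather than anything in libpico.

```python
# The full Fedora 25 listing from above
FEDORA_25 = """\
  secp256k1 : SECG curve over a 256 bit prime field
  secp384r1 : NIST/SECG curve over a 384 bit prime field
  secp521r1 : NIST/SECG curve over a 521 bit prime field
  prime256v1: X9.62/SECG curve over a 256 bit prime field
"""

# A short excerpt of the Ubuntu 16.04 listing from above
UBUNTU_EXCERPT = """\
  secp384r1 : NIST/SECG curve over a 384 bit prime field
  prime192v1: NIST/X9.62/SECG curve over a 192 bit prime field
  prime256v1: X9.62/SECG curve over a 256 bit prime field
    Not suitable for ECDSA.
"""

REQUIRED = "prime192v1"  # the curve Pico uses for Sigma-I


def parse_curves(output):
    """Extract curve names from 'openssl ecparam -list_curves' style output."""
    names = set()
    for line in output.splitlines():
        name, sep, _description = line.partition(":")
        # Continuation lines have no colon, so they are skipped
        if sep and name.strip():
            names.add(name.strip())
    return names


for label, output in (("Fedora 25", FEDORA_25), ("Ubuntu 16.04", UBUNTU_EXCERPT)):
    status = "available" if REQUIRED in parse_curves(output) else "missing"
    print("{}: {} is {}".format(label, REQUIRED, status))
```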
There’s already a bug report covering the missing curves, but given that the situation has persisted since at least 2007 and remains unresolved, it seems unlikely Red Hat’s lawyers will relent any time soon. They’ve added the 256-bit prime field version since this was licensed by the NSA, but the others remain AWOL.
Wikipedia shows the various patents expiring around 2020. Until then, one way to address the problem is to build yourself your own OpenSSL RPM without all of the disabled code. Daniel Pocock produced a nice tutorial back in 2013, but this was for Fedora 19 and OpenSSL 1.0.1e. Things have now moved on and his patch no longer works correctly, so I’ve updated his steps to cover Fedora 25.
Check out my blog post about it if you want to code along.
22 Feb 2017 : Building an unhobbled OpenSSL 1.0.2j RPM for Fedora 25 #
For most people it makes sense to use the latest (at time of writing) 1.0.2k version of OpenSSL on Fedora 25 (in which case, see my other blog post). However, if for some reason you need a slightly earlier build (version 1.0.2j to be precise), then you can switch out the middle part of the process I wrote about for 1.0.2k with the following set of commands.
# Install the fedora RPM with all the standard Red Hat patches
cd ~/rpmbuild/SRPMS
rpm -i openssl-1.0.2j-1.fc25.src.rpm
# Install the stock OpenSSL source which doesn’t have the ECC code removed
# Patch the spec file to avoid all of the nasty ECC-destroying patches
cd ../SPECS
patch -p0 <
# And build
rpmbuild -bb openssl.spec
And to install the resulting RPMs:
cd ~/rpmbuild/RPMS/$(uname -i)
rpm -Uvh --force openssl-1.0.2j*rpm openssl-devel-1.0.2j*rpm openssl-libs-1.0.2j*rpm
I’m not sure why you might want to use 1.0.2j over 1.0.2k, but since I already had the patch lying around, it seemed sensible to make it available.
22 Feb 2017 : Building an unhobbled OpenSSL 1.0.2k RPM for Fedora 25 #
Fedora’s OpenSSL build is actually a cut-down version with many of the elliptic curve features removed due to patent concerns. These are available in stock OpenSSL and in other distros such as Ubuntu, so it’s a pain they’re not available in Fedora. Daniel Pocock provided a nice tutorial on how to build an RPM that restores the functionality, but it’s a bit old now (Fedora 19, 2013) and generated errors when I tried to follow it more recently. Here’s an updated process that’ll work for OpenSSL 1.0.2k on Fedora 25.
Prepare the system
Remove the existing openssl-devel package and install the dependencies needed to build a new one. These all have to be done as root (e.g. by adding sudo to the front of them).
rpm -e openssl-devel
dnf install rpm-build krb5-devel zlib-devel gcc gmp-devel \
  libcurl-devel openldap-devel NetworkManager-devel \
  NetworkManager-glib-devel sqlite-devel lksctp-tools-devel \
  perl-generators rpmdevtools
Set up an rpmbuild environment
If you don’t already have one. Something like this should do the trick.
Obtain the packages and build
The following will download the sources and apply a patch to reinstate the ECC functionality. This is broadly the same as Daniel's, but with more recent package links and an updated patch to work with them.
# Install the fedora RPM with all the standard Red Hat patches
cd ~/rpmbuild/SRPMS
rpm -i openssl-1.0.2k-1.fc25.src.rpm
# Install the stock OpenSSL source which doesn’t have the ECC code removed
# Patch the spec file to avoid all of the nasty ECC-destroying patches
cd ../SPECS
patch -p0 <
# And build
rpmbuild -bb openssl.spec
Install the OpenSSL packages
cd ~/rpmbuild/RPMS/$(uname -i)
rpm -Uvh --force openssl-1.0.2k*rpm openssl-devel-1.0.2k*rpm openssl-libs-1.0.2k*rpm
Once this has completed, your ECC functionality should be restored. You can check by entering
openssl ecparam -list_curves
to list the curves your currently installed package supports. That should be it. In case you want to use the slightly older 1.0.2j version of OpenSSL, you can follow my separate post on the topic.
24 Dec 2016 : You are old, Acer Laptop #
"You are old, Acer Laptop" this blog-writer wrote,
"And your battery has become rather shite;
And yet you incessantly compile all this code –
Do you think, at your age, it is right?"

"In my youth," Acer Laptop replied to the man,
"I feared it might injure my core;
But now that I'm perfectly sure I have none
Why, I do it much more than before."

"You are old," said the man, "As I mentioned before,
And have grown most uncommonly hot;
Yet you render in Blender in HD or more –
Pray, don't you think that's rather a lot?"

"In my youth," said the Acer, as he wiggled his lid,
"I kept all my ports very supple
By the use of this app—installed for a quid—
Allow me to sell you a couple?"

"You are old," said the man, "And your threading's too weak
For anything tougher than BASIC;
Yet you ran Java 5 with its memory leak –
Pray, how do you manage to face it?"

"In my youth," said the laptop, "I took a huge risk,
And argued emacs over vim;
And the muscular strength which it gave my hard disk,
Has lasted through thick and through thin."

"You are old," said the man, "one can only surmise
That your circuits are falling apart;
Yet you balanced a bintree of astonishing size—
What made you so awfully smart?"

"I have answered three questions, now leave me alone,"
Said the Acer; "It's true I'm not brand new!
Do you think I'm like Siri on a new-fangled phone?
Be off, or I'll have to unfriend you!"

My current laptop is getting a bit long-in-the-tooth. It's an Acer Aspire S7 which Joanna and I bought cheap a couple of years ago as an ex-display machine. It's a thin, light ultrabook that's worked really well with Linux and still feels powerful enough to use as my main development machine. Amongst its excellent qualities, the only two negatives have been a rather loud fan, and a less-than-perfect keyboard.

Still, it's getting a bit worn-out now and I've used it so much some of the keys have worn through to the backlight. I've also noticed some very appealing ultrabook releases recently, including the new Acer Swift 7 and the Asus Zenbook 3. Both of these hit important milestones, with the Swift being less than 1cm thick, and the Zenbook coming in at under 1kg in weight. Impressive stuff.

My rather bruised keyboard

With these two releases having piqued my interest, and with my current machine due for renewal, it seemed like a good time to reassess the ultrabook landscape and figure out whether I can justify getting a new machine.

Most manufacturers now offer some impressive ultrabook designs. HP has its Elitebook and Spectre ranges, Apple's MacBook Pro now falls firmly within the category, Dell has the XPS devices and Razer is a newcomer to the ultrabook party with its new Blade Stealth laptop. They all seem to have received decent reviews and there's clearly been some design-love spent on them all.

However, they are also all expensive machines (around the £1000 mark). I'm going to use this as my main work machine for the next couple of years, during which time it'll get daily use, so I have no qualms about spending a lot on a good laptop. On the other hand, if I make a bad decision it'll be an expensive mistake. Given this, it's only sensible I should spend some time considering the various options and try to make a decision not just based on instinct, but on the hard specs for each machine.

There are plenty of reviews online which there's no need for me to duplicate; however I have some particular requirements and preferences, so this analysis is based firmly on these.

My requirements are for a thin, light laptop that's got a really good screen (the larger and higher the resolution the better). When I say thin, I mean ideally 1cm or thinner. By light, I mean as close to 1kg as possible. By good screen, it should be at least a 13in screen with better-than-FHD resolution (given FHD is what my current laptop supports). Any new machine must be better than my current laptop by a significant margin. My current laptop is still perfectly usable, and I'm happy with the size, weight, processing speed and resolution; but it doesn't make sense to get a new machine if it's not going to be a noticeable upgrade.

I've been single-booting Linux for many years now, and plan to do the same with whichever laptop I get next. That means the Windows/macOS distinction is irrelevant for me: they'll get wiped off as the first thing I do with the machine either way.

Before starting this task I was certain I'd end up getting the Acer Swift 7. Based on the copy I'd read, it's the thinnest 13.3in laptop you can buy and looks quite attractive to me (apart from the horrible 'rose gold' colour; ugh). If this didn't work out, I thought the numbers would point in the direction of an Apple device, given almost everyone I know in the Computer Lab uses an Apple laptop (there must be something in that, right?). After carefully working through the specs, I've been really surprised by the results.

The MacBook Pro appears to be decent in most areas, but in fact is worse than the best of its competitors in almost all respects. Since I don't want to run macOS, the only thing in its favour is the attractive design. The MacBook Air is really showing its age now, and is even beaten by the MacBook Pro on everything except price.

The Swift 7 is thin, but turns out to be a really poor choice. That just goes to show how unreliable my gut instinct is and I'm glad I didn't buy it without looking at the alternatives. It's running an M-class processor with no touchscreen or keyboard backlight. The port selection is average and in practice its only strengths are the thin chassis and fanless design. Both are nice features, but the result of the package is hardly an upgrade over my existing machine.

The Razer Blade Stealth was originally down as my alternative choice. It has a gloriously high-resolution (3840x2160) screen, and personally I love the multi-coloured keyboard lighting. Some might say it's just a gimmick, and I could never justify a purchase because of it (especially bearing in mind it almost certainly won't work properly on Linux), but I still think it's glorious. Unfortunately the Stealth turns out to have a small screen size and suffers problems running Linux. Both are show-stoppers for me.

The Zenbook also looks really appealing, with its incredible lightness. Unfortunately, like the Stealth, it suffers from a smaller screen size and Linux problems. Too bad.

I kept the Spectre in for comparison, but I could never have gone for it given its horrific aesthetics. I admit, I'm shallow. Nevertheless, it turns out it doesn't offer enough of an upgrade over my existing system anyway (same resolution, worse dimensions and weight).

The unequivocal standout winner is the Dell XPS. In some ways I'm sad about this, as in my mind I associate Dell with being the height of box-shifting PC dullness. Dell's aggressive product placement really puts me off. The machine itself doesn't have a particularly spectacular design. Yet there's no denying the numbers, and the screen really does appear to be way ahead of the competition, with its unusually thin bezel, high resolution and decent size. I was tempted by the 15in version, given its discrete graphics, but the size and weight just nudge outside the area I feel is acceptable for me.

That leaves only the XPS 13 standing. To top everything else off, Dell is the only company to officially support Linux (Ubuntu) on its machines, which it deserves credit for. I'm not sure whether I'll end up getting a new laptop at all, but if I do I'd want it to be this.

Scroll past the pictures to see my full 'analysis' of the different laptops.
Acer Aspire S7 Acer Swift 7 HP Spectre 13t
Razer Blade Stealth MacBook Pro MacBook Air
Asus Zenbook 3 UX390UA Dell XPS 13 Dell XPS 15

Colour coding:
The same as my current Acer Aspire S7

Acer Aspire S7

Acer Swift 7

HP Spectre 13t (or 13-v151nr)

Razer Blade Stealth

MacBook Pro


1.9GHz Intel Core i5-3517U

1.2GHz Intel Core i5-7Y54

2.5GHz Intel Core i7-6500U

2.7GHz Intel Core i7-7500U

3.3GHz Intel Core i7

RAM (max)

4GB, 1600MHz DDR3



16GB, 1866MHz LPDDR3

16GB, 2133MHz LPDDR3

NVM (max)







Intel HD4000

Intel HD615

Intel HD620

Intel HD620

Intel HD550





2560x1440 or 3840x2160


Screen size (in)






Battery (hours)






Height (mm)






Width (mm)






Depth (mm)






Weight (kg)


















Backlit keyboard







USB2*2, 3.5mm, HDMI, SD, AC

USB3*2, 3.5mm

USB3*3, 3.5mm

USB3*2, 3.5mm, HDMI, AC

USB3*2, 3.5mm



Linux compat


Reportedly works OK


Works with glitches (e.g. WiFi)

Currently flaky (will improve)

Price (£)






Price spec


8GB, 256GB

8GB, 256GB

16GB, 256GB

8GB, 256GB


Has been perfect, apart from the poor keyboard

Underpowered, and not big enough upgrade to be worthwhile

Ugly, ugly, ugly

Really tempting, good value, but small screen size is a problem

Quite big and heavy. Decent, but the Dell XPS 13 is better in every respect


Acer Aspire S7

MacBook Air

Asus Zenbook 3 UX390UA

Dell XPS 13

Dell XPS 15


1.9GHz Intel Core i5-3517U

2.2GHz Intel Core i7

Intel Core i7-7500U

3.1GHz Intel Core i5-7200U

3.5GHz Intel Core i7-6700HQ

RAM (max)

4GB, 1600MHz DDR3

8GB, 1600MHz LPDDR3

16GB, 2133MHz LPDDR3

8GB, 1866MHz LPDDR3

32GB, 2133MHz DDR4

NVM (max)







Intel HD4000

Intel HD6000

Intel HD620

Intel HD620








Screen size (in)






Battery (hours)






Height (mm)






Width (mm)






Depth (mm)






Weight (kg)


















Backlit keyboard







USB2*2, 3.5mm, HDMI, SD, AC

USB3*2, 3.5mm, TB, SD, AC

USB3, 3.5mm

USB3*3, 3.5mm, SD, AC

USB3*3, 3.5mm, SD, HDMI, AC



Linux compat



Works but volume, FP, HDMI issues

Officially supported

Reported to work well

Price (£)






Price spec


8GB, 256GB

16GB, 512GB

8GB, 256GB

16GB, 512GB


Has been perfect, apart from the poor keyboard

The low resolution being worse than my current laptop, as well as being thick, rules this out

Thin and really light, makes it really appealing, but the small screen size is a problem

Relatively thick and heavy, but the screen is really great

Just a bit too big and heavy to be viable

9 Dec 2016 : Cracking PwdHash #
On Wednesday Graham Rymer and I presented our work on cracking PwdHash at the Passwords 2016 conference. It's the first time I've done a joint presentation, which made for a new experience. It was also a very enjoyable one, especially having the chance to work with such a knowledgeable co-author.

The work we did allowed us to search for the original master passwords that people use with PwdHash: the passwords used to generate the more complex site-specific passwords given to websites, which may then have been exposed in hashed form by recent password leaks. We were surprised both by the number of master passwords we were able to find and by the speed with which hashcat was able to eat its way through the leaked hashes.

Running on an Amazon EC2 instance, we were able to work through the SHA1-hashed leak by generating 40 million hashes per second. In total we were able to recover 75 master passwords from the leak, as well as further master passwords from the and leaks.
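
To put the 40 million hashes per second figure in context, here is a rough back-of-the-envelope sketch in C. It assumes a plain brute-force search over a hypothetical keyspace (lowercase letters and digits, eight characters long), which is not necessarily how the search described here was run, and the function names are invented for illustration.

```c
#include <stdio.h>

/* Number of candidate passwords for a given alphabet size and length. */
double keyspace_size (unsigned int alphabet, unsigned int length) {
    double total = 1.0;
    unsigned int position;

    for (position = 0; position < length; position++) {
        total *= (double)alphabet;
    }

    return total;
}

/* Seconds needed to try every candidate at a fixed hashing rate. */
double seconds_to_exhaust (double keyspace, double hashes_per_second) {
    return keyspace / hashes_per_second;
}

/* Prints the estimate for the assumed 36-symbol, 8-character keyspace. */
void print_estimate (void) {
    double const rate = 40e6; /* 40 million hashes per second, from the text */
    double const keyspace = keyspace_size(36, 8); /* assumed keyspace */

    printf("%.0f candidates, %.1f hours\n", keyspace,
        seconds_to_exhaust(keyspace, rate) / 3600.0);
}
```

Under those assumptions the keyspace is a little under three trillion candidates, which takes roughly twenty hours to exhaust at this rate.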

Feel free to download the paper and presentation slides, or watch the video captured during the conference (unfortunately there's only audio with no video for the first segment).

Here are a few of the master passwords Graham was able to recover from the password leaks.
Domain Leaked hash Password
Stratfor e9c0873319ec03157f3fbc81566ddaa5 frogdog
Rootkit 2261bac1dfe3edeac939552c0ca88f35 zugang
Rootkit 43679e624737a28e9093e33934c7440d ub2357
Rootkit dd70307400e1c910c714c66cda138434 erpland
LinkedIn 508c2195f51a6e70ce33c2919531909736426c6a 5tgb6yhn
LinkedIn ed92efc65521fe5074d65897da554d0a629f9dc7 Superman1938
LinkedIn 5a9e7cc189fa6cf1dac2489c5b81c28a3eca8b72 Fru1tc4k3
LinkedIn ba1c6d86860c1b0fa552cdb9602fdc9440d912d4 meideprac01
LinkedIn fd08064094c29979ce0e1c751b090adaab1f7c34 jose0849
LinkedIn 5264d95e1dd41fcc1b60841dd3d9a37689e217f7 linkedin

I'll leave it as an exercise for the reader to decide whether these are sensible master passwords or not.
16 Oct 2016 : Fixing snap apps with relocatable DATADIRS #

It's luminous, not fluorescent

Recently I've been exploring how to create snaps of some of my applications. Snap is the new 'universal packaging format' that Canonical is hoping will become the default way to deliver apps on Linux. The idea is to package up an app with all of its dependencies (everything needed apart from ubuntu-core), then have the app deployed in a read-only container. The snap creator gets to set what are essentially a set of permissions for their app, with the default preventing it from doing any damage (either to itself or others). However, it's quite possible to give enough permissions to allow a snap to do bad stuff, so we still have to trust the developers of the snap (or spend your life reading through source-code to check for yourself). If you want to know more about how snaps work, the material out there is surprisingly limited right now. Most of the good stuff - and happily it turns out to be excellent - can be found at

Predictably the first thing I tried out was creating a snap for functy, which means you can now install it just by typing 'snap install functy' on Yakkety Yak. If your application already uses one of the conventional build systems like cmake or autotools, creating a snap is pretty straightforward. If it's a command-line app, just specifying a few details in a yaml file may well be enough. Here's an example for a fictional utility called useless, which you can get hold of from GitLab if you're interested (the code isn't fictional, but the utility is!).

The snapcraft file for this looks like this.
name: useless
version: 0.0.1
summary: A poem transformation program
description:
  Has a very limited purpose. It's mostly an arbitrary example of code.
confinement: strict

apps:
  useless:
    command: useless
    plugs: []

parts:
  useless:
    plugin: autotools
    source: <URL of the useless repository>
    build-packages:
      - pkg-config
      - libpcre2-dev
    stage-packages:
      - libpcre2-8-0
    after: []

This just specifies the build system (plugin), some general description, the repository for the code (source), a list of build and runtime dependencies (build-packages and stage-packages respectively) and the command to actually run the utility (command).

This really is all you need. To test it just copy the lot into a file called snapcraft.yaml, then enter this command while in the same directory.
snapcraft cleanbuild

And a snap is born.

This will create a file called useless_0.0.1_amd64.snap which you can install just fine. When you try to execute it things will go wrong though: you'll get some output like this.
flypig@Owen:~/Documents/useless/snap$ snap install --force-dangerous useless_0.0.1_amd64.snap

useless 0.0.1 installed
flypig@Owen:~/Documents/useless/snap$ useless
Opening poem file: /share/useless/dong.txt

Couldn't open file: /share/useless/dong.txt

The dong.txt file contains the Edward Lear poem "The Dong With a Luminous Nose". It's a great poem, and the utility needs it to execute properly. This file can be found in the assets folder, installed to the $(datadir)/@PACKAGE@ folder as specified in assets/
uselessdir = $(datadir)/@PACKAGE@
useless_DATA = dong.txt COPYING
EXTRA_DIST = $(useless_DATA)

In practice the file will end up being installed somewhere like /usr/local/share/useless/dong.txt depending on your distribution. One of the nice things about using autotools is that neither the developer nor the user needs to know exactly where in advance. Instead the developer can set a compile-time define that autotools will fill and embed in the app at compile time. Take a look inside src/
bin_PROGRAMS = ../useless
___useless_SOURCES = useless.c

___useless_LDADD = -lm @USELESS_LIBS@

___useless_CPPFLAGS = -DUSELESSDIR=\"$(datadir)/@PACKAGE@\" -Wall @USELESS_CFLAGS@

Here we can see the important part which sets the USELESSDIR macro define. Prefixing this in front of a filename string literal will ensure our data gets loaded from the correct place, like this (from useless.c)
char * filename = USELESSDIR "/dong.txt";

If we were to package this up as a deb or rpm package this would work fine. The application and its data get stored in the same place and the useless app can find the data files it needs at runtime.

Snappy does things differently. The files are managed in different ways at build-time and run-time, and the $(datadir) variable can't point to two different places depending on the context. As a result the wrong path gets baked into the executable and when you run the snap it complains just like we saw above. The snapcraft developers have a bug registered against the snapcraft package explaining this. Creating a generalised solution may not be straightforward, since many packages - just like functy - have been created on the assumption the build and run-time paths will be the same.

One solution is to allow the data directory location to be optionally specified at runtime as a command-line parameter. This is the approach I settled on for functy. If you want to snap an application that also has this problem, it may be worth considering something similar.
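
As a minimal sketch of the idea (not the actual functy or useless code), the runtime value can simply take priority over the path baked in at compile time. The select_datadir name and the fallback path are assumptions for illustration; in the real build USELESSDIR is supplied by the -D flag shown earlier.

```c
#include <stddef.h>

/* Normally supplied at compile time via -DUSELESSDIR=...; this fallback
 * value is only an assumption so that the sketch compiles on its own. */
#ifndef USELESSDIR
#define USELESSDIR "/usr/local/share/useless"
#endif

/* Returns the directory given at runtime if there is one, otherwise the
 * directory baked in at compile time. */
char const * select_datadir (char const * runtime_datadir) {
    if (runtime_datadir != NULL && runtime_datadir[0] != '\0') {
        return runtime_datadir;
    }

    return USELESSDIR;
}
```

The advantage of keeping the compile-time default is that deb or rpm builds carry on working exactly as before, while a snap can pass in its own location.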

The first change needed is to add a suitable command line argument (if you're packaging someone else's application, check first in case there already is one; it could save you a lot of time!). The useless app didn't previously support any command line arguments, so I augmented it with some argp magic. Here's the diff for doing this. There's a fair bit of scaffolding required, but once in, adding or changing the command line arguments in the future becomes far easier.
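
For reference, a stripped-down version of that scaffolding might look something like the following. This is a sketch assuming glibc's argp and a single --datadir option, not the actual diff; the short option letter, the help strings and the parse_arguments helper are illustrative, and the arguments structure is reduced to the one field needed here.

```c
#include <argp.h>
#include <stddef.h>

/* Reduced to the single field used in this sketch. */
typedef struct {
    char const * datadir;
} arguments;

/* One option, --datadir=DIR (short form -d, chosen for illustration). */
static struct argp_option const options[] = {
    { "datadir", 'd', "DIR", 0, "Load data files from DIR", 0 },
    { 0 }
};

/* Called by argp once for each option it encounters. */
static error_t parse_option (int key, char * arg, struct argp_state * state) {
    arguments * args = state->input;

    switch (key) {
        case 'd':
            args->datadir = arg;
            break;
        default:
            return ARGP_ERR_UNKNOWN;
    }

    return 0;
}

static struct argp const parser = {
    options, parse_option, NULL, "A poem transformation program",
    NULL, NULL, NULL
};

/* Fills in args from the command line; args should already hold defaults. */
error_t parse_arguments (int argc, char ** argv, arguments * args) {
    return argp_parse(&parser, argc, argv, 0, NULL, args);
}
```

Setting the defaults in the structure before calling parse_arguments means an absent --datadir leaves the compile-time path in place.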

The one part of this that isn't quite boilerplate is the following generate_data_path function.
char * generate_data_path (char const * leaf, arguments const * args) {
    char * result = NULL;
    int length;

    if (leaf) {
        // Measure the joined path first, then allocate room for it and its terminator
        length = snprintf(NULL, 0, "%s/%s", args->datadir, leaf);
        result = malloc(length + 2);
        if (result) {
            snprintf(result, length + 2, "%s/%s", args->datadir, leaf);
        }
    }

    return result;
}

This takes the leafname of the data file to load and patches together the full pathname using the path provided at the command line. It's simple stuff; the only catch is to remember to free the memory this function allocates once the path is no longer needed.
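
As a rough illustration of that catch, a caller might wrap the function like this. The open_data_file helper is hypothetical, and the arguments structure is reduced to the one field used here; a lightly tidied copy of generate_data_path is included only so that the sketch is self-contained.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reduced to the single field used in this sketch. */
typedef struct {
    char const * datadir;
} arguments;

/* Joins the data directory and leafname; the caller must free the result. */
char * generate_data_path (char const * leaf, arguments const * args) {
    char * result = NULL;
    int length;

    if (leaf) {
        length = snprintf(NULL, 0, "%s/%s", args->datadir, leaf);
        if (length >= 0) {
            result = malloc((size_t)length + 1);
        }
        if (result) {
            snprintf(result, (size_t)length + 1, "%s/%s", args->datadir, leaf);
        }
    }

    return result;
}

/* Opens a data file for reading, releasing the temporary path afterwards.
 * Returns NULL if the path couldn't be built or the file couldn't be opened. */
FILE * open_data_file (char const * leaf, arguments const * args) {
    FILE * file = NULL;
    char * path = generate_data_path(leaf, args);

    if (path) {
        file = fopen(path, "r");
        free(path);
    }

    return file;
}
```

Keeping the allocation and the free inside one helper means the rest of the code never has to think about who owns the path string.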

For functy I'm using GTK, so I use a combination of GOptions for command line parsing and GString for the string manipulation. The latter in particular makes for much cleaner and safer code, and helps simplify the memory management of this generate_data_path function.

Now we can execute the app and load in the dong.txt file from any location we choose.
useless --datadir="$HOME/Documents/Development/Projects/useless/assets"

There's one final step, which is to update the snapcraft file so that this gets added automatically when the snap-installed app is run. The only change now is to set the executed command as follows.
    command: useless --datadir="${SNAP}/share/useless"

Here's the full, updated, snapcraft file.
name: useless
version: 0.0.1
summary: A poem transformation program
description:
  Has a very limited purpose. It's mostly an arbitrary example of code.
confinement: strict

apps:
  useless:
    command: useless --datadir="${SNAP}/share/useless"
    plugs: []

parts:
  useless:
    plugin: autotools
    source: <URL of the useless repository>
    build-packages:
      - pkg-config
      - libpcre2-dev
    stage-packages:
      - libpcre2-8-0
    after: []

And that's it! Install the snap package, execute it (just by typing 'useless') and the utility will run and find the dong.txt file it needs.

There's definitely a sieve involved
2 Oct 2016 : Server time-travel, upgrading four years in nine days #
Years of accumulation have left me with a haphazard collection of networked computers at home, from a (still fully working) 2002 Iyonix through to the ultrabook used as my daily development machine. There are six serious machines connected to the network, if you don't count the penumbra of occasional and IoT devices (smart TV, PlayStation, Raspberry Pis, retired laptops, A9).

All of this is ably managed by Constantia, my home server. As it explains on her webpage, Constantia's a small 9 Watt fanless server I bought in 2011, designed to be powered using solar panels in places like Africa with more sun than infrastructure (and not in a Transcendence kind of way).

Constantia's physical presence

Although Constantia's been doing a phenomenal job, until recently she was stuck running Ubuntu 12.04 LTS Precise Pangolin. Since there's no fancy graphics card and 12.04 was the last version of Ubuntu to support Unity 2D, I've been reluctant to upgrade. A previous upgrade from 8.04 to 10.04 many years ago - when Constantia inhabited a different body - caused me a lot of display trouble, so she has form in this regard.

Precise is due to fall out of support next year, and I'd already started having to bolt on a myriad PPAs to keep all the services pumped up to the latest versions. So during my summer break I decided to allocate some time to performing the upgrade, giving me the scope to fix any problems that might arise in the process.

This journey, which started on 19th September, finished today, a full two weeks later.

As expected, the biggest issue was Unity, although the surprise was that it ran at all. Unity has a software graphics-rendering fallback using LLVMPipe, which was actually bearable to use, at least for the small amount of configuration needed to get a replacement desktop environment up and running. After some research comparing XFCE, LXDE and Gnome classic (the official fallback option) I decided to go for XFCE: lightweight but also mature and likely to be supported for the foreseeable future. Having been running it for a couple of weeks, I'm impressed by how polished it is, although it's not quite up there with Unity in terms of tight integration.

The XFCE desktop running on Constantia with beautiful NASA background

There were also problems with some of the cloud services I have installed. MediaWiki has evaporated, but I was hardly using it anyway. The PPA-cruft needed to support ownCloud, which I use a lot, has been building up all over the place. Happily these have now been stripped back to the standard repos, which makes me feel much more comfortable. Gitolite, bind, SVN and the rest all transferred with only minor incident.

The biggest and most exciting change is that I've switched my server backups from USB external storage to Amazon AWS S3 (client-side encrypted, of course). Following a couple of excellent tutorials, one on configuring deja-dup to use S3 by Juan Domenech and one on S3 IAM permissions by Max Goodman, I got things up and running.

But even with these great tutorials, it was a bit of a nail-biting experience. My first attempt to back things up took five days of continuous uploading to reach less than 50% before I decided to reconfigure. I've now got it down to a full backup in four days. By the end of it, I feared I might have to re-mortgage to pay the Amazon fees.

So, how much does it cost to upload and store 46 GiB? As it turns out, not so much: $1.06. I'm willing to pay that each month for effective off-site backup.

The upgrade of Constantia also triggered some other life-refactoring, including the moving of my software from SourceForge to GitLab, but that's a story for another time.

After all this, the good news is that Constantia is now fully operational and up to date with Ubuntu 16.04 LTS Xenial Xerus. This should get her all the way through to 2021. Kudos to the folks at Aleutia for creating a machine up to the task, and to Ubuntu for the unexpectedly smooth upgrade process.

The bad news is that nextCloud is now waxing as ownCloud wanes. It doesn't seem to yet be the right time to switch, but that time is approaching rapidly. At that point, I'll need another holiday.
18 Jul 2016 : Using a CDN for #
This weekend I've been playing around with Amazon's CloudFront CDN. I've been setting up a new site, and although I'm not expecting it to be heavily used, the site is bandwidth-heavy and entirely static on the server-side, so a good candidate for deployment via CDN. For those unfamiliar with the term, CDN stands for Content Delivery Network, a system able to push the content of a website out to multiple servers across the world. This moves the content closer to the end-users, in theory reducing latency and making the site feel more responsive.

There are other benefits of using a CDN. Because the site is served from multiple locations it also makes it less susceptible to denial of service attacks. Since I work in security, there's been a lot of discussion in my research group about DoS attacks and I recently saw a fascinating talk by Virgil Gligor on the subject (the paper's not yet out, but Ross Anderson has written up a convenient summary).

The availability that DoS attempts to undermine offers a wholly different dynamic from the confidentiality, integrity and authenticity that I'm more familiar with. These four together make up the CIAA 'triad' (traditionally just CIA, but authenticity is often added as another important facet of information security). Tackling DoS feels much more practical than the often cryptographic approaches used in the other three areas. An attacker can scale up their denial of service by sending from multiple sources (for example using a botnet), while a CDN redresses the balance by serving from multiple sources, so there's an elegant symmetry to it.

In addition to all of that, CloudFront looks to be pretty cheap, at least compared to spinning up an EC2 instance to serve the site. That makes it both educational and practical. What's not to like?

Amazon makes it exceptionally easy to serve a static site from an S3 bucket. Simply create a new bucket, upload the files using the Web interface and select the option to serve it as a site.

S3 bucket

The only catch is that you also have to apply a suitable policy to the bucket to make it public. Why Amazon doesn't provide a simpler way of doing this is beyond me, but there are plenty of how-tos on the Web to plug the gap.

S3 bucket policy

Driving a website from S3 offers serviceable, but not great, performance. A lot of sites do this, and already in May 2013 Netcraft identified 24.7 thousand hostnames running an entire site served directly from S3 (with many more serving part of the site from S3). It's surely much higher now.

Once a site's been set up on S3, hosting it via CloudFront is preposterously straightforward. Create a new distribution, set the origin to the S3 bucket and use the new address.

S3 distribution origin settings

The default CloudFront domains aren't exactly user-friendly. This is fine if they're only used to serve static content in the background (such as the images for a site, just as the retail Amazon site does), but an end-user-facing URL needs a bit more finesse. Happily it's straightforward to set up a CNAME to alias the cloudfront subdomain. Doing this ensures Amazon can continue to manage the DNS entry it points to, including which location to serve the content from. So I spent £2.39 on the domain and am now fully finessed.

Finally I have three different domain names all pointing to the same content.
The process, which is in theory very straightforward, was in practice somewhat glitchy. The bucket policy I've already mentioned above. The part that caused me most frustration was in getting the domain name to work. Initially the S3 bucket served redirects to the content (why? Not sure). This was picked up by CloudFront, which happily continued to serve the redirects even after I'd changed the content. The result was that visiting the CloudFront URL (or the domain) redirected to S3, changing the URL in the process, even though the correct content was served. It took several frustrating hours before I realised I had to invalidate the material through the CloudFront Web interface before all of the edge servers would be updated. Things now seem to update immediately without the need for human intervention; it's not entirely clear what changed, but it certainly hindered progress before I realised.

The whole episode took about a day's work and next time it should be considerably shorter. The cost of running via CloudFront and S3 is a good deal less than the cost of running even the most meagre EC2 instance. Whether it gives better performance is questionable.

Comparing basic S3 access with the equivalent CloudFronted access gives a 25% speed-up when accessed from the UK. However, to put this in context, serving the same material from my basic fasthosts web server results in a further 10% speed-up on top of the CloudFront increase.

Loading times for S3
Loading times accessing the site on S3 (2.54s total).

Loading times for CloudFront
Loading times accessing the site via CloudFront (1.92s total).

Loading times for fasthosts
Loading times accessing the site on (1.75s total).
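
The percentages quoted above can be checked directly from the three totals. This small sketch treats 'speed-up' as the relative reduction in loading time, which is an assumption about how the figures were derived; the function names are illustrative.

```c
#include <stdio.h>

/* Fractional reduction in loading time when moving from 'before' to 'after'. */
double relative_reduction (double before, double after) {
    return (before - after) / before;
}

/* Prints the reductions for the three measured totals (in seconds). */
void print_reductions (void) {
    double const s3 = 2.54;
    double const cloudfront = 1.92;
    double const fasthosts = 1.75;

    printf("S3 to CloudFront: %.1f%%\n",
        100.0 * relative_reduction(s3, cloudfront));
    printf("CloudFront to fasthosts: %.1f%%\n",
        100.0 * relative_reduction(cloudfront, fasthosts));
}
```

That works out at roughly 24% and 9% respectively, in line with the rounded 25% and 10% figures.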

If I'm honest, I was expecting CloudFront to be faster. On the other hand this is checking only from the UK where my fasthosts server is based. The results across the world are somewhat more complex, as you can see for yourself from the table below.

Ping times for the three access methods from across the world (all times in ms, from dotcom).
Location S3 CloudFront Fasthosts
Amsterdam, Netherlands 16 9 12
London, UK 13 19 9
Paris, France 14 10 26
Frankfurt, Germany 26 7 23
Copenhagen, Denmark 32 18 33
Warsaw, Poland 48 26 38
Tel-Aviv, Israel 79 58 72
VA, USA 88 93 86
NY, USA 76 105 87
Amazon-US, East 80 99 100
Montreal, Canada 92 100 92
MN, USA 106 114 106
FL, USA 107 118 109
TX, USA 114 117 138
CO, USA 129 118 138
Mumbai, India 142 124 130
WA, USA 135 144 136
CA, USA 141 149 137
South Africa 157 165 155
CA, USA (IPv6) 278 149 230
Tokyo, Japan 243 229 231
Buenos Aires, Argentina 263 260 224
Beijing, China 260 253 249
Hong Kong, China 283 293 287
Sydney, AU 298 351 334
Brisbane, AU 313 343 331
Shanghai, China 332 419 369

We can render this data as a graph to try to make it more comprehensible. It helps a bit, but not much. In the graph, a steeper line is better, so CloudFront does well at the start and mid-table, but also has the site with the longest ping time overall. The lines jostle for the top spot, from which it's reasonable to conclude they're all giving pretty similar performance in the aggregate.

Ping times cumulative over location
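
The cumulative values in a graph like this are presumably just running totals of each column of the table. A minimal sketch, using only the first four S3 ping times as sample data (the function and array names are illustrative):

```c
#include <stddef.h>

/* First four S3 ping times from the table, in milliseconds. */
static int const s3_sample[] = { 16, 13, 14, 26 };

/* Writes the running total of 'values' into 'totals'; both arrays must
 * hold at least 'count' elements. */
void cumulative_sum (int const * values, size_t count, int * totals) {
    int running = 0;
    size_t index;

    for (index = 0; index < count; index++) {
        running += values[index];
        totals[index] = running;
    }
}
```

For the sample this gives 16, 29, 43 and 69; running the same function over each full column produces the three lines being compared.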

In conclusion, apart from the unexpected redirects, setting up CloudFront was really straightforward and the result is a pretty decent and cheap website serving platform. While I'm not in a position to compare with other CDN services, I'd certainly use CloudFront again even without the added incentive of wanting to know more about it.

I'm now looking into adding an SSL cert to the site. Again Amazon have made it really straightforward to do, but the trickiest part is figuring out the cost implications. The site doesn't accept any user data and SSL would only benefit the integrity of the site (which, for this site, is of arguable benefit), so I'd only be doing it for the experience. If I do, I'll post up my experiences here.
24 Jun 2016 : A bit more. #
Not comfort at all, but looking at the results across the country, Cambridge was one of the few places in England outside London that voted to remain (overwhelmingly, 74% to 26%). I was also happily surprised given the north-south balance that Liverpool (58% to 41%) and the Wirral (52% to 48%) also voted to remain. That could be because both areas have benefited greatly from European investment, but that must be true of many other parts of England too. Maybe they're just saner people? Less surprising is that Castle Point voted overwhelmingly to leave (73% to 27%).
For me the argument about popular sovereignty was far more important than the argument about the economy and my guess would be that this persuaded many who voted to leave (although my darker more cynical side fears it may have been immigration). It's sad for me that this argument about sovereignty was exactly my reason for wanting to remain. So many important international decisions where the UK has now lost its voice and vote.
24 Jun 2016 : EU Referendum #
As a British European I feel like part of my identity, and part of my voice in the world, was taken away from me today. I just hope as a country, we can turn this decision to leave the EU into something positive.
27 Feb 2016 : Losing My Religion #
For the last 18 years this site has stuck rigidly to a dynamic-width template. That's because I've always believed fixed-width templates to be the result of either lazy design or a misunderstanding of HTML's strengths. Unfortunately fashion seems to be against me, so in a bid to regain credibility, I'm now testing out a fixed-width template.

Look closely at the original design from 1998 and you'll see the structure of the site has hardly changed, while the graphics - which drew heavy inspiration from the surface of the LEGO moon - have changed drastically. At the time I was pretty pleased with the design, which just goes to show how much tastes, as well as web technologies, have changed in the space of two decades.

By moving to a fixed-width template I've actually managed to annoy myself. The entire principle of HTML is supposed to be that the user has control over the visual characteristics of a site. 'Separate design and content' my jedi-master used to tell me, just before mind-tricking me into doing the dishes. The rot set in when people started using tables to lay out site content. The Web fought back with CSS, which was a pretty valiant attempt, even if we're now left with the legacy of a non-XML-based format (why W3C? Why?!).

But progress marches sideways and Javascript is the new Tables. Don't get me wrong, I think client-side programmability is a genuine case of progress, but it inevitably prevents proper distinction between content and design. It doesn't help that Javascript lives in the HTML rather than the CSS, which is where it should be if its only purpose is to affect the visual design. Except good interactive sites often mix visuals and content in a complex way, forcing dependencies across the two that are hard to partition.

Happily computing has already found a solution to this in the form of MVC. In my opinion MVC will be the inevitable next stage of web enlightenment, as the W3C strives to pull it back to its roots separating content from design. Lots of sites implement their own MVC approach, but it should be baked into the standards. The consequence will be a new level of abstraction that increases the learning-curve gradient, locks out newcomers and spawns a new generation of toolkits attempting to simplify things (by pushing the content and design together again).

Ironically, the motivation for me to move to a fixed-width came from a comment by Kochise responding to a story about how websites are becoming hideous bandwidth-hogs. Kochise linked to a motherfucking website. So much sense I thought! Then he gave a second link. This was still a motherfucking website, but claimed to be better. Was it better? Not in my opinion it wasn't. And anyway, both websites use Google Analytics, which immediately negates anything worthwhile they might have had to say. The truly remarkable insight of Maciej Cegłowski in the original article did at least provoke me into reducing the size of my site by over 50%. Go me!

It highlighted something else also. The 'better' motherfucking website, in spite of all the mental anguish it caused me, did somehow look more modern. There are no doubt many reasons, but the most prominent is the fixed column width, which just fits in better with how we expect websites to look. It's just fashion, and this is the fashion right now, but it does make a difference to how seriously people take a site.

I actually think there's something else going on as well. When people justify fixed-width sites, they say it makes the text easier to read, but on a dynamic-width site surely I can just reduce the width of the window to get the same effect? This says something about the way we interact with computers: the current paradigm is for full-screen windows with in-application tabs. As a result, changing the width of the window is actually a bit of a pain in the ass, since it involves intricate manipulation of the window border (something which the window manager makes far more painful than it should be) while simultaneously messing up the widths of all the other open tabs.

It's a rich tapestry of fail, but we are where we are. My view hasn't changed: fixed width sites are at best sacrificing user-control for fashion and at worst nothing more than bad design. But I now find myself at peace with this.

If you think the same, but unlike me you're not willing to give up just yet, there's a button on the front page to switch back to the dynamic width design.
1 Feb 2016 : Pebble SDK Review #
Although Pebble smartwatches have been around for some time, I only recently became one of the converted after buying a second-hand Pebble Classic last October. Over Christmas I was lucky enough to be upgraded to a Pebble Time Round. This version was only released recently, and the new form factor requires a new approach to app development. Not wildly different from the existing Classic and Time variants, but enough to necessitate recompilation and some UI redesign of existing apps.

As a consequence many of the apps I'd got used to on my Classic no longer appear in the app store for the Round. This, I thought, offered a perfect opportunity for me to get to grips with the SDK by upgrading some of those that are open source.

Although I'm a total newb when it comes to Pebble and smartwatch development generally, I have plenty more experience with other toolchains, SDKs and development environments, from Visual Studio and Qt Creator through to GCC and Arduino IDE, as well as the libraries and platforms that go with them. I was interested to know how the Pebble dev experience would compare to these.

It turns out there are essentially two ways of developing Pebble apps. You can use a local devchain, built around Waf, QEMU and a custom C compiler. This offers a fully command line approach without an IDE, leaving you to choose your own environment to work in. Alternatively there's the much slicker CloudPebble Web IDE. This works entirely online, including the source editor, compiler and pebble emulator.

CloudPebble IDE
I worked through some of the tutorials on CloudPebble and was very impressed by it. The emulator works astonishingly well and I didn't feel restricted by being forced to use a browser-based editor. What I found particularly impressive was the ability to clone projects from GitHub straight into CloudPebble. This makes it ideal for testing out the example projects (all of which are up on GitHub) without having to clutter up your local machine. Having checked the behaviour on the CloudPebble emulator, if it suits your needs you can then easily find the code to make it work and replicate it in your own projects.

Although there's much to recommend it, I'm always a bit suspicious of Web-based approaches. Experience suggests they can be less flexible than their command line equivalents, imposing a barrier on more complex projects. In the case of CloudPebble there's some truth to this. If you want to customise your build scripts (e.g. to pre-generate some files) or combine your watch app with an Android app, you'll end up having to move your build locally. In practice these may be the fringe cases, but it's worth being aware.

So it can be important to understand the local toolchain too. There's no particular IDE to use, but Pebble have created a Python wrapper around the various tools so they can all be accessed through the parameters of the pebble command.
Pebble Tool command:

    build               Builds the current project.
    new-project         Creates a new pebble project with the given name in a
                        new directory.
    install             Installs the given app on the watch.
    logs                Displays running logs from the watch.
    screenshot          Takes a screenshot from the watch.
    insert-pin          Inserts a pin into the timeline.
    delete-pin          Deletes a pin from the timeline.
    emu-accel           Emulates accelerometer events.
    emu-app-config      Shows the app configuration page, if one exists.
    emu-battery         Sets the emulated battery level and charging state.
    emu-bt-connection   Sets the emulated Bluetooth connectivity state.
    emu-compass         Sets the emulated compass heading and calibration
    emu-control         Control emulator interactively
    emu-tap             Emulates a tap.
    emu-time-format     Sets the emulated time format (12h or 24h).
    ping                Pings the watch.
    login               Logs you in to your Pebble account. Required to use
                        the timeline and CloudPebble connections.
    logout              Logs you out of your Pebble account.
    repl                Launches a python prompt with a 'pebble' object
                        already connected.
    transcribe          Starts a voice server listening for voice
                        transcription requests from the app
    data-logging        Get info on or download data logging data
    sdk                 Manages available SDKs
    analyze-size        Analyze the size of your pebble app.
    convert-project     Structurally converts an SDK 2 project to an SDK 3
                        project. Code changes may still be required.
    kill                Kills running emulators, if any.
    wipe                Wipes data for running emulators. By default, only
                        clears data for the current SDK version.

Although it does many things, the most important are build, install and logs. The first compiles a .pbw file (a Pebble app, essentially a zip archive containing binary and resource files); the second uploads and runs the application; and the last offers runtime debugging. These will work on both the QEMU emulator, which can mimic any of the current three watch variants (Original, Time, Time Round; or aplite, basalt and chalk for those on first name terms), or a physical watch connected via a phone on the network.
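Since a .pbw is essentially a zip archive, you can peek inside one with nothing more than Python's standard zipfile module. This is only a sketch: list_pbw_contents is a helper name I've made up, and the archive it inspects is a stand-in built in memory with invented member names, so it says nothing about what a real build actually contains.

```python
import io
import zipfile


def list_pbw_contents(pbw):
    """Return (name, size in bytes) for each member of a .pbw archive.

    pbw can be a path to a file or any seekable file-like object.
    """
    with zipfile.ZipFile(pbw) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Build a stand-in archive in memory so the sketch runs without the SDK.
# These member names are invented for illustration only.
fake_pbw = io.BytesIO()
with zipfile.ZipFile(fake_pbw, "w") as archive:
    archive.writestr("example-binary.bin", b"\x00" * 16)
    archive.writestr("example-resources.pbpack", b"\x00" * 8)
fake_pbw.seek(0)

contents = list_pbw_contents(fake_pbw)
for name, size in contents:
    print(f"{size:>8}  {name}")
```

To try it on a real app, pass the path of the .pbw produced by the build step in place of fake_pbw.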

CloudPebble IDE
It's all very well thought out and works well in practice. You quickly get used to the build-install-log cycle during day-to-day coding.

So, that's the dev tools in a nutshell, but what about the structure, coding and libraries of an actual app? The core of each app is written in C, so my first impression was that everything felt a bit OldSkool. It didn't take long for the picture to become more nuanced. Pebble have very carefully constructed a simple (from the developer's perspective) but effective event-based library. For communication between the watch and phone (and via that route to the wider Internet) the C hands over to fragments of Javascript that run on the phone. This felt bizarre and overcomplicated at first, but actually serves to bridge the otherwise rough boundary between embedded (watch) and abstract (phone) development. It also avoids having to deal with threading in the C portion of the code. All communication is performed using JSON, which gets converted to iterable key-value dictionaries when handled on the C side.

This seems to work well: the UI written in C remains fluid and lightweight with Javascript handling the more infrequent networking requirements.

The C is quite restrictive. For example, I quickly discovered there's no square root function, arguably one of the more useful maths functions on a round display (some trig is provided by cos and sin lookup functions). The libraries are split into various categories such as graphics, UI, hardware functions and so on. They're built as objects with their own hierarchy and virtual functions implemented as callbacks. It all works very well and with notable attention to detail. For example, in spite of it being C, the developers have included enough hooks for subclasses to be derived from the existing classes.
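One way around the missing square root is an integer version built on Newton's method. The sketch below is in Python for brevity, but it sticks to integer addition and division so it should port to C almost line for line. The names isqrt and half_width are my own, not part of the Pebble SDK, and the radius of 90 pixels is an assumption about the round display.

```python
def isqrt(n):
    """Integer square root: the largest r with r * r <= n (Newton's method)."""
    if n < 0:
        raise ValueError("square root of a negative number")
    if n < 2:
        return n
    x = n
    y = (x + 1) // 2
    # Each step moves the estimate down towards the root; stop once it
    # no longer decreases.
    while y < x:
        x = y
        y = (x + n // x) // 2
    return x


def half_width(radius, dy):
    """Half the width of a circle's row that sits dy pixels from its centre."""
    return isqrt(radius * radius - dy * dy)


# Assumed radius of 90 pixels for the round display.
print(half_width(90, 0))
print(half_width(90, 54))
```

A helper like half_width is the sort of thing you'd want when laying out text or graphics row by row inside a circular screen.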

The downside to all of this is that you have to be comfortably multilingual: C for the main code and interface, Javascript for communication with a phone, Java and Objective-C to build companion Apps and Python for the build scripts. Whew.

Different people will want different things in a development environment: is it well structured? Does it support a developer's particular preference of language? Is it simple at the start but flexible enough to deal with more complex projects? Does it support different development and coding styles? How much boilerplate overhead is there before you can get going? How familiar does it all feel?

It just so happens that I really like C, but dislike Javascript, although I'm certain there are many more developers who feel the exact opposite. The Pebble approach is a nice compromise. I was happy dealing with the C and the Javascript part was logical (e.g. no need to deal with browser incompatibilities). If you're a dyed-in-the-wool Web developer, there's even a pre-built JS shim for creating watch faces.

So it also seems to work well together and I've come away impressed. Many developers will find the CloudPebble interface slicker and easier to use. But after wading through the underlying complexities – and opacity – of IDEs like Visual Studio and Eclipse, the thoughtful clarity of the Pebble SDK makes for a refreshing change. I wouldn't recommend it for a complete newcomer to C or JS, but if you have any experience with these languages, you'll find yourself up and running with the Pebble toolchain in no time.
24 Jan 2016 : Deconstructing Gone Home #
They say that mastery is not a question of specialization, but sureness of purpose and dedication to craft. Gone Home demonstrates that the application of all three can generate wonderful results. While most games revel in their use of varied and multifarious mechanics – singleplayer, multiplayer, cover mechanisms, rewards, dizzying weapon counts and location changes – Gone Home sticks to a single plan with minimal mechanics and delivers it flawlessly.

Everything is driven by the narrative, which takes a layered approach. There’s no choice as such and this isn’t a choose-your-own adventure. In spite of that, a huge amount of trust is bestowed on the player which allows them to miss large portions of the story if they so choose. This trust is rooted in the mechanics of the game rather than the story, and ultimately makes the game far more rewarding.

To understand this better we need to deconstruct the game with a more analytic approach. A good place to start with this is the gameplay mechanics, but it will inevitably require us to consider the story as well. So, here be spoilers. If you’ve not yet played the game, I urge you to do so before reading any further.

There are spoilers beyond this door

Even though the game is full 3D first-person perspective, the mechanics are pretty sparse. The broad picture is that you have scope to move around the world, pick up and inspect objects, discover ‘keys’ to unlock new areas, and listen to audio diaries. This is a common mechanic used in games and even the use of audiologs has become somewhat of a gaming trope. The widely acclaimed Bioshock franchise uses them as an important (but not the only) narrative device. They’re used similarly in Dead Space, Harley Quinn’s recordings in Batman, the audiographs in Dishonored, and the audio diaries in the rebooted Tomb Raider. Variants include Deus Ex’s email conversations and Skyrim’s many books that provide context for the world. There are surely many others, but while some of these rely heavily on audiologs to maintain their story, few of them use it as a central gameplay mechanic. Bioshock, for example, emphasises fight sequences far more and includes interactions with other characters such as Atlas or Elizabeth for story development. Gone Home provides perhaps the most pure example of the use of audiologs as a central mechanic.

So mechanically this is a pure exploration game. This makes it an ideal game for further analysis, since the depth of mechanics remains tractable. As we’ll see, the mechanics in play actually feel sparser than they are. By delving just a bit into the game we find there’s more going on than we might have imagined on first inspection.

Starting with the interactions, we can categorise these into eight core ‘active’ mechanics and a further five ‘passive’ types.

Active interaction types
  1. Movement and crouching/zooming
  2. Picking up objects
    1. Full rotation
    2. Lateral rotation
    3. Reading (possibly with multiple pages)
    4. Adding an object to your backpack
    5. Triggering a journal entry
    6. Playing an audio cassette
  3. Return object to the same place
  4. Throw object
  5. Turn on/off an object (e.g. light, fan, record player, TV)
  6. Open/close door (including some with locks or one-way entry)
  7. Open/close cupboards/drawers
  8. Lock combinations
Passive interaction types
  1. Hover text
  2. Reading object
  3. Finding clues in elusive hard-to-see places
  4. Viewing places ahead-of-time (e.g. conservatory)
  5. Magic eye pictures
The distinction between active and passive is not just qualitative. All of the active interactions require specifically coded mechanisms to allow them to operate. This contrasts with the passive interactions, which capitalise on design elements made available through the existing toolset (e.g. the placement or design of objects).

While the key mechanism for driving the narrative forward is exploration through inspecting objects, it’s perhaps more enlightening to first understand the mechanisms used to restrict progress. All games must balance player agency against narrative cohesion. If the player skips too far forward they may miss information that’s essential for understanding the story. If the player is forced carefully along a particular route the sense of agency is lost, and can also lead to frustration if progress is hindered unnecessarily. Sitting in between is a middle ground that trusts the player to engage with the game and relies on them to manage and reconstruct information that may be presented out-of-order, incomplete and in multiple ways.

There are then seven main ‘bulkheads’ (K1-K7) that define eight areas that force the narrative to follow a given sequence. On top of this there are two optional ‘sidequest bulkheads’ (K8, K9). The map itself can be split into twelve areas, and the additional breakpoints help direct the flow of the player, although where no keys are indicated this occurs through psychological coercion rather than compulsion.

Gone Home progression map
These areas shown in the diagram are as follows.
  • P1. Porch
  • P2. Ground floor west
  • P3. Upstairs
  • P4. Stairs between upstairs and library
  • P5. Three secret panels
  • P6. Locker
  • P7. Basement
  • P8. Stairs between basement, guestroom and ground floor east
  • P9. Ground floor east
  • P10. Room under the stairs
  • P11. Attic
  • P12. Filing cabinet (optional)
  • P13. Safe (optional)
The keys needed to unlock progress are the following.
  • K1. Christmas duck key
  • K2. Sewing room map
  • K3. Secret panel map
  • K4. Locker combination
  • K5. Basement key
  • K6. Map to room under stairs in conservatory
  • K7. Attic key
  • K8. Safe combination in library (optional)
  • K9. Note in guestroom (optional)
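The bulkhead idea can be sketched as a small reachability calculation. The layout below is deliberately hypothetical: it assumes a purely linear chain in which area A(i) holds the key that opens area A(i+1), which is a simplification and not the game's actual map. The names KEY_LOCATION, GATE and reachable_areas are made up for illustration.

```python
# Hypothetical linear layout: key Ki is found in area Ai and opens area Ai+1.
KEY_LOCATION = {f"K{i}": f"A{i}" for i in range(1, 8)}  # key -> area holding it
GATE = {f"A{i + 1}": f"K{i}" for i in range(1, 8)}      # area -> key required


def reachable_areas(key_location, gate, start):
    """Expand from start, picking up keys and opening gated areas until stuck."""
    areas, keys = {start}, set()
    changed = True
    while changed:
        changed = False
        # Pick up every key that lies in an area already reached.
        for key, area in key_location.items():
            if area in areas and key not in keys:
                keys.add(key)
                changed = True
        # Open every gated area whose key is now held.
        for area, key in gate.items():
            if key in keys and area not in areas:
                areas.add(area)
                changed = True
    return areas, keys


all_areas, all_keys = reachable_areas(KEY_LOCATION, GATE, "A1")
print(sorted(all_areas), sorted(all_keys))

# Remove one key from the world to see how a single bulkhead halts progress.
without_k4 = {k: v for k, v in KEY_LOCATION.items() if k != "K4"}
stuck_areas, stuck_keys = reachable_areas(without_k4, GATE, "A1")
print(sorted(stuck_areas), sorted(stuck_keys))
```

With seven keys the chain yields eight areas, and taking any one key away stops progress at that point, which is all a bulkhead is there to do.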

Some of the keys in Gone Home
Given there are twenty five audio diaries, and a huge number of other written items and objects which add to the story, it’s clear that The Fullbright Company (the Gone Home developers) assume a reasonable amount of flexibility in the ordering of the information within these eight areas. It’s very easy to miss a selection of them on a single run-through of the game.

The diaries themselves only capture the main narrative arc – Sam’s coming-of-age – which interacts surprisingly loosely with the other arcs that can be found. These can be most easily understood by categorising them in terms of characters:
  1. Sam’s coming-of-age (sister)
  2. Terrance’s literary career (Dad)
  3. Jan’s affair (Mum)
  4. Oscar’s life (great uncle)
  5. Kaitlin’s travel (protagonist)
Other incidental characters are used to develop these stories, such as Carol, Jan’s college housemate whose letters are used to frame Jan’s possible affair with her new work colleague. However, these five characters and five story arcs provide the main layers that enrich the game.
An interesting feature of these stories is that they each conform to different literary genres, and this helps to obscure the nature of the story, allowing the ending to remain a surprise up until the last diary entry. Terrance’s career has elements of tragedy which are reinforced by the counterbalancing romance of Jan’s affair. Oscar’s story, which is inseparable from that of the house itself, introduces elements of horror. Kaitlin’s story is the least developed, but is perhaps seen best as a detective story driven by the player. Even though you act through Kaitlin as the protagonist, it’s clearly Sam who’s star of the show. Even though it’s clear from early on that the main narrative, seen through Sam’s eyes, is a coming-of-age story, the ending that defines the overall mood (love story, tragedy?) is left open until the very end.

Perhaps another interesting feature is the interplay between the genres and the mechanics. The feel of the game, with bleak weather, temperamental lighting, darkened rooms and careful exploration, is one of survival horror. Initially it seems like the game might fall into this category, with Oscar’s dubious past and the depressed air. This remains a recurrent theme throughout. But ultimately this is used more to provide a backdrop to Sam’s story, transporting Kaitlin through her (your) present-day experiences to those of Sam as described through her audio diaries and writings.

Ultimately then, it’s possible to deconstruct Gone Home into its thirteen main interaction types, eight areas and five narrative arcs. This provides the layering for a rich story and involving game, even though, compared to many of its contemporaries in the gaming arena, it’s mechanically rather limited. By delving into it I was hoping it might provide some insight into how the reverse can take place: the construction of a game based on a fixed set of mechanics and restricted world. It goes without saying that the impact of the story comes from its content and believability, along with pitching the trust balance in the right spot. Neither of these can be captured in an easily reproducible form.

Nonetheless it would be really neat if it were possible to derive a formal approach from this for simplifying the process of creating new material that follows a similar layered narrative approach. Unlike many games Gone Home is complex enough to be enjoyable but simple enough to understand as a whole. It was certainly one of my favourite games of 2014, and if there's a formula for getting such things right, it's a game that's worth replicating.

You Can Do *Better*
Addendum: I wrote this back in July 2014 while lecturing on the Computer Games Development course at Liverpool John Moores University and recently rediscovered it languishing on my hard drive. At the time I thought it might be of interest to my students and planned to develop it into a proper theory. Since I never got around to doing so, and now probably never will, I felt I may as well publish it in its present form.
23 Jan 2016 : How Not to Write #
Each week I read a column in the Guardian Weekly called "This column will change your life" by Oliver Burkeman. It's full of insightful but unsubstantiated claims about how efficiency, mental state, tidiness or whatnot can be improved if only you can follow some simple advice. Always a good read.

Oliver Burkeman's Column

This week it explained how getting over writer's block is simply a case of being disciplined: the trick to writing is to write often and in small doses. Not only should you create a schedule to start, but you should also create a schedule to stop. Once your time runs out, stop writing immediately ("even if you've got momentum and could write more"). It's the same advice that was given to me about revision when I was sixteen and is probably as valid now as it was then.

The advice apparently comes from a book by Robert Boice. I was a bit dismissive of the claim in the article that used copies sell for $190, but I've just checked on Amazon and FastShip is selling it for $1163 (Used - Acceptable). That's $4 per page, so it must be saturated with wisdom.

My interest was piqued by the fact that the book's aimed at academics struggling to write. I wouldn't say I struggle to write, but I would say I struggle to write well. Following Boice's advice, writing often and in small doses should probably help with that, but here are a few other things I genuinely think will probably help if - like me - you want to improve your writing ability.

  1. Read a lot. Personally I find it much easier to get started if I already have a style in mind. Mimicking a style makes the process less personal, and that distance can make it easier (at least for me, but this might only work if you suffer from repressed-Britishness). For the record and to avoid any claims of deception from those who know me, I do hardly any reading.
  2. Plan and structure. Breaking things into smaller pieces makes them more manageable and once you have the headings it's just a case of filling in the blanks. Planning what you intend to say will result in better arguments and more coherent ideas.
  3. Leave out loads of ideas. Clear ideas are rarely comprehensive and if you try to say everything you'll end up with a web of thoughts rather than a nice linear progression.
  4. Let it settle overnight. Sometimes the neatest structures and strongest ideas are the result of fermentation rather than sugar content. I don't really know what that means, but hopefully you get the idea.
  5. Don't let it settle for another night. It's better to write something than to allow it to become overwhelming.
  6. And most important of them all... oh, time's up.

How Not to Live Your Life

21 Jan 2016 : Are smartwatches better than watches? #
The Pebble Time Round is a beautiful device in many ways. Aesthetically it's one of the few smartwatches that manages to hide its programmable heart inside the slim dimensions of a classic analogue shell. This sets it apart from its existing Pebble brethren, all of which have what can only charitably be described as an eighties charm. Given one of my most treasured possessions during my teenage years was a Casio AE-9W Alarm Chrono digital watch, and for the last three months I've been proudly wearing a Pebble Classic, I feel I speak with some authority on the matter.

Pebble Classic (above) and Pebble Time Round (below)

The Pebble Time Round can't entirely shed its geek chic ancestry. The round digital face suffers from a sub-optimally wide bezel. The colour e-ink display - although with many advantages - simply isn't as vivid and crisp as most other smartwatches on the market.

In spite of this, Pebble have managed to create a near perfect smartwatch for my purposes. I still get a kick out of receiving messages on my watch. My phone, which used to sit on my desk in constant need of attention, now stays in my pocket muted and with vibration turned off. Whenever some communication arrives I can check it no matter what I'm doing in the space of three seconds. For important messages this isn't a great advantage; where the real benefit lies is in avoiding the disruption caused by all those unimportant messages that can be left until later.

Obviously the apps are great too. In practice I've found myself sticking to just a few really useful apps, but those few that do stick make me feel like I'm living in the future I was promised as a child. Most of all, the real excitement comes from being able to program the thing. There's nothing more thrilling than knowing there's a computer on my wrist that's just waiting to do anything asked of it, imagination and I/O permitting. I would say that though, wouldn't I?!

Of course, that's not just true for Pebble; you could say the same for just about any current generation smartwatch: Google Wear, iWatch, Tizen, whatever. Still, it's great that Pebble are forging a different path to these others, focussing on decent battery life, nonemissive displays and a minimalist interface design.

For the last decade I've been dismissive of watches in general and never felt the need to wear one. I arrived late to the smartwatch party, but having taken the time to properly try some out, I'm now convinced they're a viable form factor. Even if only to fulfil the childhood fantasies of middle-aged geeks like me, they'll surely be here to stay (after all, there's a lot of us around!).

3 Jan 2016 : Slimming Down in 2016 #
Today is the last day of my Christmas break and the last thing I need is distractions, but when I saw this article on The Website Obesity Crisis by Maciej Cegłowski I couldn't stop myself reading through to the end. Maciej ("MAH-tchay") is a funny guy, and the article - which is really the text and slides from a presentation he gave in October - is really worth a read.

The central point Maciej makes is that websites have become script-ridden quagmires of bloat. Even viewing a single tweet will result in nearly a megabyte of download. He identifies a few reasons for this. First that ever increasing bandwidth and decreasing latency means web developers don't notice how horrifically obese their creations have become. While the problem is well-known with no end of articles discussing the issue and presenting approaches for fixing it, they invariably miss the point. They focus on complex, clever optimisations, rather than straightforward byte-count. Those that do consider byte-count can make things worse by shifting the goalposts upwards, inflating what can be considered 'normal'. Finally, the unsustainability of the Web economy has led to the scaffold of scripts used by advertisers and trackers to accelerate in complexity.

There are some sublime examples in the presentation, like the 400-word article complaining about bloat that itself manages to somehow accumulate 1.2 megabytes of fatty residue on its way through the interflume arteries. If you've not read it, go do so now and heed its message.

Like I said, the last thing I need is distractions right now, which is why the article immediately prompted me to check my own website's bandwidth stats. Having nodded along enthusiastically with everything written in Maciej's presentation, I could hardly just leave it there. I needed to apply the Russian Literature Test:

"text-based websites should not exceed in size the major works of Russian literature"

What I found was pretty embarrassing. The root page is one of the simplest on my site. Here's what it looks like:

The root page of the site

Yet it weighed in at 800KB. That's the same size as the full text of The Master and Margarita by Bulgakov. Where's all that bandwidth going? The backend of my site is like Frankenstein's monster: cobbled together from random bits of exhumed human corpse. Nonetheless it should make it relatively terse in its output and it certainly shouldn't need all that. Checking with Mozilla's developer tools, here's what I found.

The original network analysis

There are some worrying things here. For some reason the server spent ages sitting on some of the CSS requests. More worrying yet is that the biggest single file is the widget script for AddThis. I've been using AddThis to add a 'share' button to my site. No-one ever uses it. The script for the button adds nearly a third of a megabyte to the size, and also gives AddThis the ability to track anyone visiting the site without their knowledge.

Not good; so I dug around on the Web and found an alternative called AddToAny. It doesn't use any scripts, just slurps the referrer URL if you happen to click on the link. This means it also doesn't track users unless they click on the link. Far preferable.

After making this simple change, the network stats now look a lot healthier.

The network analysis with AddThis scripts removed

Total bandwidth moved from 800KB to 341KB, cutting it by over a half (see the totals in the bottom right corners). It also reduced load time from 2s down to 1.5s.
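As a quick sanity check on those figures, the saving can be worked out in a few lines of Python. The helper name percent_saved is made up for illustration; the numbers are the ones quoted above.

```python
def percent_saved(before, after):
    """Return the reduction from before to after as a percentage."""
    return (before - after) / before * 100


# Figures quoted above: page weight in KB and load time in seconds.
print(f"Bandwidth saved: {percent_saved(800, 341):.1f}%")
print(f"Load time saved: {percent_saved(2.0, 1.5):.1f}%")
```

So "over a half" is, if anything, slightly modest for the bandwidth, while the load time improved by a quarter.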

But I wasn't done yet. I harbour a pathological distrust of Apple, Google, Facebook and Microsoft, and ditched my Google account over a year ago. I've always been sad about this because Google in particular makes some excellent products that I'd otherwise love to use. Google Fonts is a case in point, with its rich collection of high quality typefaces and a really easy API for using them on the web. Well look there in the downloads and you'll see my site pulls down 150KB of font data from Google. That's the Cabin font used on the site if you're interested.

Sadly then, in my zeal to minimise Google's ability to track me, I totally ignored the plight of those visiting my site. Every time the font is downloaded Google gets some juicy analytics for it to hoard and mine.

The solution I've chosen is to copy the fonts over to my own server (the fonts themselves are open source, so that's okay). Google's servers are considerably faster at responding than my shared-hosting server, but the change doesn't seem to impact the overall download time, and even reduces the overall size by 0.17KB (relative URLs are shorter!). Okay, that's not really a benefit, but the lack of tracking surely is.

The network analysis with Google Fonts removed

The final result has improved page load time and reduced bandwidth usage to less than Fyodor Dostoyevsky's The Gambler, which I think is fitting given Dostoyevsky was forced to keep it short, writing to a deadline to pay off his gambling debts. Russian Literature Test passed!

I feel chuffed that my diversionary tactics yielded positive results. All is not quite peachy in the orchard though. Many will argue that including a massive animated background on my page is hypocritical, and they're probably right. Although the shader's all of 2KB of source, it'll be executed 100 million times per second on a 4K monitor. Some of the pages also use Disqus for the comments. I've never really liked having to use Disqus, but I feel forced to include some kind of pretence at being social. Here's why it's a problem.

The network analysis when there are Disqus comments on the page

Not only does Disqus pull in hundreds of KB of extra data, it also provides another perfect Trojan horse for tracking. I've not yet found a decent solution to this, and I fear the Web is just too busy eating itself to allow for any kind of sensible fix.

27 Dec 2015 : Finally, Syberian snow #
Not in real life, but finally I'm getting the snow I feel I deserve in Syberia II. Good work Microïds!

Snowing in Syberia II

16 Dec 2015 : Let's not Encrypt just yet #
The TLS certificate for Constantia, my home server, ran out this evening. I've been using StartSSL for my certificate for several years now, and given their free automated service I've been very pleased. The downside is you can only generate one certificate at a time, so if you screw it up, there's not much that can be done (apart from ponying up). That always made me nervous as I've been known to screw things up in the past.

With the new Let's Encrypt service I was tempted to try that, but the certificates need renewing every 90 days, so I stuck to what I know. It seems I'm getting better at it though: the new certificate appears to have worked without a hitch.

14 Dec 2015 : Siberian Odyssey #
After many years of very careful observation, I've discovered I'm worryingly susceptible to advertising. If I see someone drinking a cool beer on TV my thirst will fire up. Technology adverts make me fiddle with my phone. Pizza ads will make me hungry. (Apparently I'm still immune to sports adverts though.)

One of the consequences is that at certain times of year I like my games to match the season. Costume Quest at Halloween, A Bird Story in the Spring, Broken Sword in the Summer. It helps me get into the right frame of mind.

Syberia game, but not in Siberia (or even Syberia)

Last Christmas I decided Syberia would be the way to get into the Christmas spirit. Lots of wintry images, ice and snow. I played through the whole game solving the puzzles and waiting for the ice and snow to kick in. Eventually, I thought, the game would have to take me to Siberia. It's the name of the f**cking game!

So, eventually after 13 hours of play I got on a train heading for Syberia, only for the game to abruptly end.

It turns out Benoît Sokal - the game's director - misjudged how long the story was and Syberia (or even Siberia) doesn't happen until game 2.

I've now waited the entire year and it's time to go for a second attempt: my game this Christmas is going to be Syberia II. I enjoyed the first game, so I don't regret having played it, but this one had better take me to Siberia or I'll be contacting trading standards!

5 Sep 2015 : Flying livestock at Gatwick #
On their journey towards Crete via Gatwick my mum and step dad noticed this rather elegant flying pig. Or maybe it's meant to be a flying horse?! I'd like to think the implied pig reference wasn't entirely unintentional!

Pegasus airlines demonstrates their appreciation for porcine aviation

25 Jul 2015 : GameJam videos #
Game Jam was exactly a month ago and while it was pretty intense at the time, it was also a load of fun.

Alongside all their incredible help with the event, OpenLab also commissioned this great video summarising the event.

If you're still up for more footage after watching that, check out the showreel of the five phenomenal games the teams created.

And you can even download, install and play the games themselves.

18 May 2015 : Compiling OpenVDB using MinGW on Windows #

OpenVDB seems to work best on Linuxy systems. Nick Avramoussis has posted some useful and clear instructions on how to build it using VC++10/11. Unfortunately C++ libraries aren't portable between compilers, and I needed it integrated into an existing project built using MinGW.

This post chronicles my experiences with getting it to work. If you're planning to travel the same path, you should know from the start that it's quite an odyssey. OpenVDB has several dependencies which also need to be built with MinGW. But it is possible. Here's how.

The Dependencies

OpenVDB relies on several libraries you'll need to build before you can even start on the feature presentation. The best place to start is therefore downloading each of these dependencies and collecting them together.

I've listed the version numbers I'm using. It's likely newer versions will work too.

  1. Boost 1.58
  2. ilmbase 1.0.3 source code (part of OpenEXR)
  3. OpenVDB 3.0.0. Not a dependency, but you're certainly going to need it
  4. TBB 4.3 Update 5 Source

You also need zlib, but MinGW comes with a version you can use for free. Finally, grab yourself this skeleton archive which contains some files needed to complete the build.

The Structure

Each of these will end up generating a library you'll link in to OpenVDB. In theory it doesn't matter where you stick them as long as you can point g++ to the appropriate headers and libraries. Still, to make this process (and description) easier, it'll be a big help if your folders are structured the same way I did it. By all means mix it around and enjoy the results!

I've unpacked each archive into its own folder all at the same level with the names boost, ilmbase, openvdb, tbb and test. The last contains a couple of test files, which you can grab from the skeleton archive. You can download a nice ASCII-art version of the folder structure I ended up with (limited to a depth of 2) to avoid any uncertainty.
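Before building anything, it can save some head-scratching to confirm the layout matches. Here's a minimal Python sketch of such a check; the function name missing_folders is my own invention, and it assumes Python 3 is installed alongside MinGW, which the original steps don't require.

```python
from pathlib import Path

# Folder names taken from the layout described above.
EXPECTED_FOLDERS = ["boost", "ilmbase", "openvdb", "tbb", "test"]


def missing_folders(root):
    """Return the expected folders that are absent from root."""
    root = Path(root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_folders(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Folder layout looks right.")
```

Run it from the folder you unpacked the archives into; an empty result means all five folders are where the later commands expect them.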

In the next few sections I'll explain how to build each of the prerequisites. This will all be done at the command line, so you should open a command window and navigate to the folder you unpacked all of the archives into.

Building Boost

Boost comes with a neat process for building with all sorts of toolchains, including MinGW. Assuming the folder structure described above, here's what I had to do.

cd boost
bootstrap.bat mingw
.\b2 toolset=gcc
cd ..

If you've downloaded the skeleton archive, you'll find the build-boost.bat script will do this for you. This will build a whole load of boost libraries inside the boost\stage\lib folder. As we'll see later, the ones you'll need are libboost_system-mgw48-mt-1_58 and libboost_iostreams-mgw48-mt-1_58.

Building ilmbase

Actually, we don't need all of ilmbase; we only need the Half.cpp file. Here's what I did to build it into the library needed.

cd ilmbase\Half
g++ -UOPENEXR_DLL -DHALF_EXPORTS=\"1\" -c -I"." -I"..\" Half.cpp
cd ..\..
ar rcs libhalf.a ilmbase\Half\*.o

This will leave you with a library libhalf.a in the root folder, which is just where you need it.

Building TBB

TBB comes with a makefile you can use straight away, which is handy. This means you can build it with this.

cd tbb
mingw32-make compiler=gcc arch=ia32 runtime=mingw tbb
cd ..

Now copy the files you need into the root.

copy tbb\build\windows_ia32_gcc_mingw_release\tbb.dll .
copy tbb\build\windows_ia32_gcc_mingw_release\tbb.def .

Building OpenVDB

Phew. If everything's gone to plan so far, you're now ready to build OpenVDB. However, there are a few changes you need to make to the code first.

Following the steps from Nick's VC++ instructions, I made these changes:

  1. Add #define NOMINMAX in Coord.h and add #define ZLIB_WINAPI in
  2. Change the include path in Types.h from <OpenEXR/half.h> to <half.h>
  3. Add #include "mkstemp.h" to the top of openvdb\io\ This is to add in the mkstemp function supplied in the skeleton archive, which for some reason isn't included as part of MinGW.

The following should now do the trick.

cd openvdb
g++ -DOPENVDB_OPENEXR_STATICLIB=\"1\" -UOPENEXR_DLL -DHALF_EXPORTS=\"1\" -c -w -mwindows -mms-bitfields -I"..\..\libzip\lib" -I".." -I"..\boost" -I"..\ilmbase\Half" -I"..\tbb\include" *.cc io\*.cc math\*.cc util\*.cc metadata\*.cc ..\mkstemp.cpp
cd ..
ar rcs libopenvdb.a openvdb\*.o

And bingo! You should have a fresh new libopenvdb.a library file in the root folder of your project.

Testing the Library

Okay, what now?

You want to use your new creation? No problemo! The skeleton archive has a couple of test programs taken from the OpenVDB cookbook.

These tests also provide a great opportunity to demonstrate how the libraries can be integrated into the MinGW build process. Here are the commands I used to build them.

g++ -DOPENVDB_OPENEXR_STATICLIB=\"1\" -UOPENEXR_DLL -DHALF_EXPORTS=\"1\" -w -c -I"." -I"boost" -I"ilmbase\Half" -I"tbb\include" test\test1.cpp
g++ -DOPENVDB_OPENEXR_STATICLIB=\"1\" -UOPENEXR_DLL -DHALF_EXPORTS=\"1\" -w -c -I"." -I"boost" -I"ilmbase\Half" -I"tbb\include" test\test2.cpp
g++ -g -O2 -static test1.o tbb.dll zlib1.dll -Wl,-luuid -L"." -o test1.exe -lhalf -lopenvdb -L"boost\stage\lib" -lboost_system-mgw48-mt-1_58 -lboost_iostreams-mgw48-mt-1_58
g++ -g -O2 -static test2.o tbb.dll zlib1.dll -Wl,-luuid -L"." -o test2.exe -lhalf -lopenvdb -L"boost\stage\lib" -lboost_system-mgw48-mt-1_58 -lboost_iostreams-mgw48-mt-1_58

The key points are the pre-processor defines for compilation:

  1. Define: OPENVDB_OPENEXR_STATICLIB
  2. Define: HALF_EXPORTS
  3. Undefine: OPENEXR_DLL

the include folders needed also for compilation:

  1. boost
  2. ilmbase\Half
  3. tbb\include

and the libraries needed during linking:

  1. tbb.dll
  2. zlib1.dll (can be found inside the MinGW folder C:\MinGW\bin)
  3. libhalf.a
  4. libopenvdb.a
  5. libboost_system-mgw48-mt-1_58.a
  6. libboost_iostreams-mgw48-mt-1_58.a

Finally you should be left with two executables test1.exe and test2.exe to try out your new creation.
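If you'd rather script the compile step than retype it, the defines and include folders above can be assembled programmatically. This is only a sketch in Python, not part of the skeleton archive: compile_command is a made-up helper, and it emits the flags in a slightly different order from the commands above, which shouldn't matter to g++.

```python
# Values taken from the build commands and lists above.
DEFINES = ['OPENVDB_OPENEXR_STATICLIB="1"', 'HALF_EXPORTS="1"']
UNDEFINES = ["OPENEXR_DLL"]
INCLUDE_DIRS = [".", "boost", r"ilmbase\Half", r"tbb\include"]


def compile_command(source_file):
    """Build the g++ argument list that compiles one source file to an object."""
    args = ["g++"]
    args += [f"-D{name}" for name in DEFINES]
    args += [f"-U{name}" for name in UNDEFINES]
    args += ["-w", "-c"]
    args += [f"-I{folder}" for folder in INCLUDE_DIRS]
    args.append(source_file)
    return args


# Print the command rather than running it, so this works without MinGW.
print(" ".join(compile_command(r"test\test1.cpp")))
```

Passing the returned list to subprocess.run(..., check=True) would actually invoke the compiler, assuming g++ is on your PATH.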

27 Apr 2015 : New home help #
A homeless friend of mine thinks he may finally be getting a place to stay and it could be an opportunity for him to turn things around. It would be great news, but the prospect of him landing in an empty flat with almost no furnishings is depressing at best.

He doesn't have access to the Internet, so asked if I'd try to track down stuff people might be throwing out, but which would make good furnishings for someone with no money moving into a new place.

Anyone know of sites to search for local people offering to have things taken off their hands for little or no cash?

Anyone in the Liverpool area have spare stuff you would otherwise be thinking of throwing away?

I want to help, but I'm not really sure where to start, so any suggestions would be good. Please drop me an email, or comment below if you have any.

7 Apr 2015 : Sailfish Really Is Linux #
One of the great things about smartphone operating systems is that, despite being really quite mature, they're nonetheless still fairly well differentiated. This means there are good reasons to choose one over another. For example iOS has a very mature app ecosystem, but with restrictions that prevent some types of software being made available (crucially restrictions on software that downloads other code). In contrast, Android and Google Play have much more liberal policies. This results in a broader ecosystem, but where the overall average quality is often said to be lower.

Android also has the claim of being Linux, which in theory means it has access to the existing - incredibly mature - Linux software ecosystem. In practice for most people this is moot, since their focus is on the very different type of software available from the Play Store. For developers though, this can be important. For me the distinction is important partly because I'm already familiar with Linux, and partly as a matter of principle. In my world computing is very much about control. I love the idea of having a computer in my pocket not because it gives me access to software, or as a means of communication, but because it's a blank slate just waiting to perform the precise tasks I ask of it. That sounds authoritarian, but better to apply it to a computer than a person. I'm pretty strict about it too. Ever since being exposed to the wonder of OPL on a Psion 3a (way back in 1998), direct programmability has always been one of the main criteria when choosing a phone.

This weekend was the Easter Bank Holiday, meaning a lengthy train ride across the country to visit my family. I wanted to download some radio programmes and possibly some videos to watch en-route, but didn't get time before we set off. I'd managed to install the Android version of BBC iPlayer on my Jolla, but for some reason this doesn't cover BBC Radio, which has been split off into a separate application. Hence I embarked on a second journey while sitting on the train: installing get_iplayer entirely using my phone. This meant no use of a laptop with the Sailfish IDE, and building things completely from source as required.

The experience was enlightening: during the course of the weekend I was able to install everything from source straight on my phone. This included the rtmp streaming library and ffmpeg audio/video converter all obtained direct from their git repositories, all just using my phone.

Banished downloaded using get_iplayer

Why would anyone want to do this when you can download the BBC radio app from the store? You wouldn't, but I still think it's very cool that you can.

Here's how it happened.

get_iplayer is kind-of underground software. It shouldn't really exist, and the BBC barely tolerates it.

It's written in Perl and is currently available from Getting it is just a matter of running the following command in the shell:

git clone git://

Perl is already installed on Sailfish OS by default (or at least was on my phone and is in the repositories otherwise). There were some other Perl libraries that needed installing, but which were also in the repositories. I was able to add them like this:

pkcon install perl-libwww-perl
pkcon install perl-URI

Because it's Perl, there's no need to build anything, and at this point get_iplayer will happily query the BBC listing index and search for programmes. However, trying to download a programme generates an error about rtmpdump being missing.

The rtmpdump library isn't in the Sailfish repositories, but can be built from source really easily. You can get it from, and I was able to clone the source from the git repository:

git clone git://

Building from source requires the OpenSSL development libraries, which are in the repositories:

pkcon install openssl-devel

After this it can be built (although note developer mode is needed to complete the install):

cd rtmpdump
make install
cd ..

As part of this build the librtmp library will be created, which needs to be added to the library path.

echo /usr/local/lib > /etc/

This should be enough to allow programmes to be downloaded in flv format. However, Sailfish won't be comfortable playing these unless you happen to have installed something to play them with. get_iplayer will convert them automatically as long as you have ffmpeg installed, so getting this up and running was the next step. Once again, the ffmpeg source can be cloned directly from its git repository:

git clone git://

ffmpeg installation

The ffmpeg developers have done an astonishing job of managing ffmpeg's dependencies. It allows many extras to be baked into it, but even without any of the other dependencies it'll use the autoconfig tools to allow a minimal build to be created:

pkcon install autotools
cd ffmpeg
make install
cd ..

ffmpeg is no small application, and compiling it on my phone took over an hour and a half. I know this because we watched an entire episode of Inspector Montalbano in the meantime, which get_iplayer helpfully tells me is 6000 seconds long!

Inspector Montalbano info from get_iplayer

Nonetheless, once that's finished the puzzle is complete, and get_iplayer will download and convert audio and video to formats that can be listened to or viewed on the Sailfish media player.
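Since get_iplayer only complains about a missing helper when you try to download something, it can be handy to check up front. Here's a small Python sketch that does so; it assumes Python 3 is available on the device (nothing above requires it), and the tool list is simply my reading of the steps in this post.

```python
import shutil

# Helper programs the walkthrough above relies on.
REQUIRED_TOOLS = ["perl", "rtmpdump", "ffmpeg"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that can't be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Still to install:", ", ".join(missing))
    else:
        print("Everything get_iplayer needs appears to be installed.")
```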

For me there's something beautiful about the ability to build, install and run these applications directly on the phone. get_iplayer is command-line, so lacks the polished GUIs of the official applications, but it's still very efficient and usable. I get that this makes me sound like Maddox, but that only makes me more right.

Three, my mobile carrier, insists I'm using tethering and cuts my connection whenever I try to download files using get_iplayer. It's annoying to say the least, but highlights the narrow gap between GNU/Linux on a laptop and GNU/Linux on a Sailfish OS phone.

7 Feb 2015 : Impressed by GitHub #
We recently started working on the Horizon 2020-funded Wi-5 project, and one of the questions that immediately came up was "where to host our code repositories?" The nature of the project is that not all of the code can be made public, so private repositories are essential. After looking at GitHub's pricing policy, I'd almost come to the conclusion we might have to rule it out, until stumbling on their Education Team. A quick submission later and they got back to say they'd upgraded the Wi-5 GitHub organisation to the Silver plan for free. I'm genuinely impressed. Thank you GitHub!

31 Dec 2014 : Automarking Progress #
I've always hated marking. Of all the tasks that gravitate around the higher education process, like lecturing, tutoring, creating coursework specifications and writing exams, marking has always felt amongst the least rewarding. I understand its importance, both as a means of providing feedback (formative) and applying judgement (summative). But good feedback takes a great deal of time, and assigning a single number that could significantly impact a student's life chances also takes a great deal of responsibility. Multiply that by the size of a class, and it can become impossible to give it the time - and energy - it deserves.

Automation has always offered the prospect of a partial solution. My secondary-school maths teacher - who was a brilliant man and worth listening to - always said that maths was for the lazy. It uncovers routes to generalisations that reduce the amount of thinking and work needed to solve a problem. Programming is the practical embodiment of this. So if there's one area which needs the support of automation in higher education, it must be marking.

Back in 1995 when I was doing my degree in Oxford, they were already using automated marking for Maple coursework. When I started at Liverpool John Moores in 2004 I was pretty astonished that they weren't doing something similar for marking programming coursework. Roll on ten years and I'm still at LJMU, and programming coursework is still being marked by hand. We have 300 students on our first year programming module, so this is no small undertaking.

To the University's credit, they've agreed to provide funds as a Curriculum Enhancement Project to research into whether this can be automated, and I'm privileged to be working alongside my colleagues Bob Askwith, Paul Fergus and Michael Mackay to try to find out. As I've implied, there are already good tools out there to help with this, but every course has its own approach and requirements. Feedback is a particularly important area for us, so we can't just give a mark based on whether a program executes correctly and gives the right outputs.

For this reason while Google has spent the tail-end of 2014 evangelising about their self-driving cars, I've been busy setting my sights for automation slightly lower. If a computer can drive me to work, surely it's only right it should then do my work for me when I get there?

There are many existing approaches and tools, along with lots of literature to back it up. For example Ceilidh/CourseMarker (Higgins, Gray, Symeonidis, & Tsintsifas, 2005; Lewis & Davies, 2004), Try (Reek, 1989), HoGG (Morris, 2003), Sphere Engine (Cheang, Kurnia, Lim, & Oon, 2003), BOSS (Joy & Luck, 1999), GAME (Blumenstein, Green, Nguyen, & Muthukkumarasamy, 2004), CodeLab, ASSYST (Jackson & Usher, 1997) and others.

Unfortunately many of these existing tools don't seem to be available either publicly or commercially. For those that are, they're not all appropriate for what we need. CourseMarker looked promising, but its site is down and I've not been able to discover any other way to access it. CodeLab is a neat site, which our students would likely benefit from, but at present it wouldn't give us the flexibility we need to fit it in with our existing course structure. The BOSS online submission system looks very viable but deploying it and getting everyone using it would be quite an undertaking; it's something I definitely plan to look into further though. Finally Sphere Engine provides a really neat and simple way to test out programs. In essence it's a simple web service that you upload a source file to, which it then compiles and executes with a given set of inputs. It returns the generated output which can then be checked. It can do this for an astonishing array of language variants (around 65 at the last count: from Ada to Whitespace) and is also the engine that powers the fantastic Sphere Online Judge. Sphere Engine were very helpful when we contacted them, and the simplicity and flexibility of their service was a real draw. Consequently the approach we're developing uses Sphere Engine as the backend processor for our marking checks.
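To make the compile-run-check idea concrete, here's a minimal local stand-in written in Python. To be clear, it doesn't use Sphere Engine or its API at all: it just runs a command on the local machine, feeds it some input and compares what comes back. The function name run_and_check and the toy "submission" are both made up for illustration.

```python
import subprocess
import sys


def run_and_check(command, stdin_text, expected_output, timeout=10):
    """Run command with stdin_text and report whether stdout matches."""
    result = subprocess.run(
        command,
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    actual = result.stdout.strip()
    return actual == expected_output.strip(), actual


# Hypothetical "submission": a one-liner that doubles its input.
submission = [sys.executable, "-c", "print(int(input()) * 2)"]
passed, output = run_and_check(submission, "21\n", "42")
print("pass" if passed else "fail", output)
```

A hosted service has the big advantage of sandboxing untrusted student code, which this local version makes no attempt to do.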

Compilation and input/output checks aren't our only concerns though. The feedback sheet we've been using for the last few years on the module covers code efficiency, good use of variable names, indentation and spacing, and appropriate commenting, as you can see in the example here.

Marking by human hand

With the aim of matching these as closely as possible, we're therefore applying a few other metrics:

Comment statistics: Our automated approach doesn't measure comments, but rather the spacing between them. For Java code the following regular expression will find all of the comments as multi-line blocks: '/\*.*?\*/|//.*?$(?!\s*//)' (beautiful huh?!). The mean and standard deviation of the gap between all comments are used as a measure of quality. Obviously this doesn't capture the actual quality of the comments, but in my anecdotal experience, students who are commenting liberally and consistently are on the right tracks.
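Here's a rough Python sketch of how that regular expression could be put to work. Several things are assumptions on my part rather than a description of the actual automarking script: the flags (the pattern needs "dot matches newline" and multi-line mode to behave as described), the definition of a gap as the number of lines strictly between two comment blocks, and the toy Java source. It will also be fooled by // inside string literals.

```python
import re
import statistics

# The pattern quoted above; DOTALL and MULTILINE are assumed flags.
COMMENT_RE = re.compile(r'/\*.*?\*/|//.*?$(?!\s*//)', re.DOTALL | re.MULTILINE)

JAVA_SOURCE = """/* Prints a total.
 * A multi-line block. */
public class Total {
    // First line of a comment
    // second line of the same block
    public static void main(String[] args) {
        int total = 0;
        // Add things up
        total += 1;
    }
}
"""


def comment_gaps(source):
    """Return the number of lines between each pair of adjacent comment blocks."""
    gaps = []
    previous_end_line = None
    for match in COMMENT_RE.finditer(source):
        start_line = source.count("\n", 0, match.start())
        end_line = source.count("\n", 0, match.end())
        if previous_end_line is not None:
            gaps.append(start_line - previous_end_line - 1)
        previous_end_line = end_line
    return gaps


gaps = comment_gaps(JAVA_SOURCE)
print(gaps, statistics.mean(gaps), statistics.stdev(gaps))
```

Note how the two consecutive // lines are picked up as a single block, thanks to the negative lookahead at the end of the pattern.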

Variable naming: Experience shows that students often use single letter or sequentially numbered variable names when they're starting out, as it feels far easier than inventing sensible names. In fact, given the first few programs they write are short and self-explanatory, this isn't unreasonable. But at this stage our job is really to teach them good habits (they'll have plenty of opportunity to break them later). So I've added a check to measure the length of variable names, and whether they have numerical postfixes by pulling variable declarations from the AST of the source code.
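The check itself is simple once the names are in hand. The Python sketch below assumes the variable names have already been extracted (the AST step is skipped entirely), and the minimum length of three characters is an arbitrary threshold I've picked for illustration, not the one used in the real script.

```python
import re


def naming_issues(names, min_length=3):
    """Map each suspicious variable name to the reasons it was flagged."""
    issues = {}
    for name in names:
        reasons = []
        if len(name) < min_length:
            reasons.append("too short")
        if re.search(r"\d+$", name):
            reasons.append("numeric postfix")
        if reasons:
            issues[name] = reasons
    return issues


print(naming_issues(["i", "total", "value1", "value2", "studentCount"]))
```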

Indentation: As any programmer knows, indentation is stylistic, unless you're using Python or Whitespace. Whether you tack your curly braces on the end of a line or give them a line of their own is a matter of choice, right? Wrong. Indentation is a question of consistency and discipline. Anything less than perfection is inexcusable! This is especially the case when just a few keypresses will provoke Eclipse into reformatting everything to perfection anyway. Okay, so I soften my stance a little with students new to programming, but in practice it's easiest for students to follow a few simple rules (Open a bracket: indent the line afterwards an extra tab. Close a bracket: indent its line one tab fewer. Everything else: indent it the same. Always use tabs, never spaces). These rules are easy to follow, and easy to test for, although in the tests I've implemented they're allowed to use spaces rather than tabs if they really insist.
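Those rules translate fairly directly into code. What follows is a simplified Python sketch rather than the test I actually implemented: it only tracks curly braces, it ignores braces hiding inside strings or comments, and the unit parameter (a tab by default) is how it allows for spaces instead.

```python
def check_indentation(source, unit="\t"):
    """Return the 1-based numbers of lines whose indentation breaks the rules."""
    bad_lines = []
    depth = 0
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue  # blank lines carry no indentation information
        # A line that starts by closing a bracket sits one level shallower.
        expected = depth - 1 if stripped.startswith("}") else depth
        indent = line[:len(line) - len(line.lstrip())]
        if indent != unit * max(expected, 0):
            bad_lines.append(number)
        # Opening brackets deepen the following lines; closing ones undo it.
        depth = max(depth + stripped.count("{") - stripped.count("}"), 0)
    return bad_lines


good = "class A {\n\tvoid f() {\n\t\tint x = 1;\n\t}\n}\n"
bad = "class A {\n\tvoid f() {\n\tint x = 1;\n\t}\n}\n"
print(check_indentation(good), check_indentation(bad))
```

For students who insist on spaces, calling check_indentation(source, unit="    ") applies the same rules with four spaces per level.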

Efficient coding: This one has me a bit stumped. Maybe something like McCabe's cyclomatic complexity would work for this, but I'm not sure. Instead, I've lumped this one in as part of the correct execution marks, which isn't right, but probably isn't too far off how it's marked in practice.

Extra functionality: This is a non-starter as far as automarking's concerned, at least in the immediate future. Maybe someone will one day come up with a clever AI technique for judging this, but in the meantime, this mark will just be thrown away.

Our automarking script performs all of these checks and spits out a marking sheet based on the feedback sheet we were previously filling out by hand. Here's an example:

Marking but not as we know it

As you can see, it's not only filling out the marks, but also adding a wodge of feedback based on the student's code at the end. This is a working implementation for the first task the students have to complete on their course. It's by far the easiest task (both in terms of assessment and marking), but the fact it's working demonstrates some kind of viability. I'm confident that most of the metrics will transfer reasonably elegantly to the later assessments too.

There's a lot of real potential here. Based on the set of scripts I marked this year, the automarking process is getting within one mark of my original assessment 80% of the time (with discrepancy mean=1.15, SD=1.5). Rather than taking an evening to mark, it now takes 39.38 seconds.
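For anyone wanting to run the same comparison on their own marks, the calculation looks something like the Python sketch below. The marks in it are entirely made up to show the arithmetic (they are not my students' results), and I'm assuming the discrepancy is the absolute difference between the two marks and that the sample standard deviation is wanted.

```python
import statistics


def agreement(human_marks, auto_marks, tolerance=1):
    """Summarise how closely the automatic marks track the human ones."""
    gaps = [abs(h - a) for h, a in zip(human_marks, auto_marks)]
    within = sum(gap <= tolerance for gap in gaps) / len(gaps)
    return within, statistics.mean(gaps), statistics.stdev(gaps)


# Made-up marks for five scripts, purely to show the calculation.
human = [14, 12, 17, 9, 15]
auto = [14, 13, 15, 9, 17]
within, mean_gap, sd_gap = agreement(human, auto)
print(f"within one mark: {within:.0%}, mean={mean_gap:.2f}, SD={sd_gap:.2f}")
```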

The ultimate goal is not just to simplify the marking process for us lazy academics, but also to provide better formative feedback to the students. If they're able to submit their code and get near-instant feedback before they submit their final coursework, then I'm confident their final marks will improve as well. Some may say this is a bit like cheating, but I've thought hard about this. Yes, it makes it easier for them to improve their marks. But their improved marks won't be chimeras, rather they'll be because the students will have grasped the concepts we've been trying to teach them. Personally I have no time for anyone who thinks it's a good idea to dumb down our courses, but if we can increase students' marks through better teaching techniques that ultimately improves their capabilities, then I'm all for it.

As we roll into 2015 I'm hoping this exercise will reduce my marking load. If that sounds good for you too, feel free to contribute or join me for the ride: the automarking code is up on GitHub, and this is all new for me, so I have a lot to learn.

14 Dec 2014 : Adafruit Backlights as Nightlights #
Yesterday I spent a fun and enlightening day at DoESLiverpool for their monthly Maker Day. It was my first time, and I'm really glad I went (if you live near Liverpool and fancy spending the day building stuff, I recommend it). I got loads of help from the other makers there, and at the end of the day I'd built a software-controllable blinking light and gained a new-found confidence for soldering (not bad for someone who's spent the last twenty years finding excuses to avoid using a soldering iron). Thanks JR, Jackie, Doris, Dan and everyone else I met on the day! Here's the little adafruit Trinket-controlled light (click to embiggen):

Adafruit Trinket with a backlight module attached, alongside a tealight

The light itself is an adafruit Backlight Module, an LED encased in acrylic that gives a nice consistent light across the surface. In the photos it looks pretty bright, and Molex1701 asked whether it'd be any good for a nightlight. Thanks for the question!

The only thing is I know nothing about lights and lumens and wouldn't trust my own judgement when wandering around in the semi-dark. So to answer the question I thought it'd be easiest to take a few photos. The only room in the flat where we get total darkness during the day is the bathroom, so I stuck the adafruit in the bath along with some helpful gubbins for reference (ruler, rubber duck, copy of Private Eye) and took some photos. As well as the backlight module, there are also some photos with the full light and a standard tealight (like in the photo above) for comparison. I reckon tealights must be a pretty universal standard for photon output levels.

These first three below (also clickable) show the same shot in different lighting conditions from afar. Respectively they're the main bathroom light, the backlight module, and a tealight.

Bathtub with standard fluorescent bulb from above
Bathtub with backlight light inside
Bathtub with tealight light inside

Here are two close-up shots with backlight and tealight respectively.

Bathtub close-up with backlight light inside
Bathtub close-up with tealight light inside

As you can see from the results, the backlight isn't as bright as a tealight. Whether it'd be bright enough to use as a nightlight is harder to judge, but my inclination is to say it probably isn't. Maybe if you ran a couple of them side by side they'd work better. It's also worth noting the backlight module is somewhat directional. There is light seepage from the back of the stick, but most of the light comes out from one side and things are brighter when in line with it.

It may also be worth saying something about power consumption. Yesterday JR, Doris and I measured the current going through it. The backlight was set up with 3.3V and drew 10 mA of current. The battery I'm using is a 150mAh Lithium Ion polymer battery, so I'm guessing the backlight should run for around 15 hours (??) on a single charge. Add in the power needed for the trinket and a pinch of reality salt and it's probably much less. Last night it ran from 8pm through to some time between 4am and 10am (it cut out while I was asleep), so that's between 8 and 14 hours.
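The back-of-envelope sum is just capacity divided by current, as this little Python sketch shows. The 150 mAh and 10 mA figures are the ones measured above; the extra 5 mA for the Trinket is a number I've invented purely to show how quickly the estimate drops, since we didn't measure it.

```python
def runtime_hours(capacity_mah, draw_ma):
    """Ideal runtime in hours, ignoring losses and battery ageing."""
    return capacity_mah / draw_ma


# Measured figures: 150 mAh battery, 10 mA drawn by the backlight alone.
print(runtime_hours(150, 10))  # 15.0

# Hypothetical extra 5 mA for the Trinket (not measured).
print(runtime_hours(150, 10 + 5))  # 10.0
```

The second figure lands inside the 8 to 14 hours it actually lasted, though that may just be a happy coincidence.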

If you do end up building a nightlight from some of these Molex1701, please do share!

11 Aug 2014 : Thieving scum! #
It's been nearly seven years since my previous venture into the criminal mind of the master thief, but as part of my holiday therapy I'm becoming Garrett again. There have been many great stealth games to fill the gap since the last Thief release, including the quite brilliant Dishonored from 2012. This was the closest yet to reproducing the setting and atmosphere of the Thief series, and many would say it surpassed it in many ways. Dunwall captured the same steampunk aesthetic, divided society and solitary exploration as The City. The no-kill stealth mechanics and multipath approach to gameplay were bloodline descendants of the original Thief. As a game it was an astonishing achievement. But it lacked one crucial element: a voice. The prospect of taking on the role of Garrett the master thief is just too exciting. To become a truly accomplished larcenist, you have to submit to his amoral self-justification. His sardonic narrative is a crucial counterbalance to the despair and suffering of the environment.

There have been criticisms levelled at the game for its gameplay linearity, repetitive ambient dialogue and failure to achieve the same level of psychological tension. These are all no doubt valid criticisms, but while I've so far only played through the first chapter, none of these are yet detracting from my enjoyment of the game. The shadows still make you feel invisible and there's still a sense of invincibility as you nick a diamond right from under the jeweller's nose. I can tell already: this is going to be really great therapy!

Thief in the rain Thief streets More Thief streets

7 Jul 2014 : Real or Render? Render or Real? #
The astonishing ability of computers to turn entirely imaginary objects into realistic representations is obvious just by watching pretty much any recent blockbuster movie. I know I go on about it a lot, but it bears repeating that with 3D printing you can take it a step further: turning entirely imaginary objects into their physical counterparts. This isn't the first time I've compared renders to reality (or is it the other way around? I forget), but the question of how close they can get remains a bit of a fascination.

So what do you think? One of these images is a render, created using Blender Cycles. The other is a photograph of a 3D print generated from the same model and cast in bronze. Which is which though?

If you're not sure, click on the image for a larger version, and leave a comment if you think you've figured it out!

Cubic Celtic knot rendered using Blender Cycles and 3D printed in raw bronze

11 Jun 2014 : How much information's created when I stare out the window? #
This afternoon I received an advertising email from the Viglen Marketing team. It boldly repeats the oft-quoted statement of Eric Schmidt from Google's Atmosphere convention in 2010:

"Between the birth of the world and 2003, there were 5 exabytes of information created. We now create 5 exabytes every 2 days."

Every time I read this quote my faith in human intelligence dies a little more. It's an old quote now, but it still riles me: it's such a patently absurd statement. I can understand Dr. Schmidt making it for the sake of theatre, but please don't repeat it as if it's fact.

There have been far more detailed and convincing critiques of the claim than I'm able to offer, but I wouldn't even extend the benefit of the doubt that these lavish on Google's Executive Chairman. The fact is, the same amount of information is being created now as has ever been the case. If you want to somehow massage the quote into plausibility you have to narrow its meaning beyond recognition. Perhaps it means data recorded, rather than information created? Perhaps it only means by humans? Perhaps it means only in sharable form? When the information is useful? On Earth? When someone is watching?

How much information is there in a cave painting? I'd wager more than Google explicitly stores in all of its data warehouses. How much information gets sucked into a black hole every second? I can't even be bothered to think about it. It's just the basic difference between discrete and continuous stuff.

Frankly, it probably means "data that has been recorded permanently by humans in discrete form". So why not say so?

This morning I was relatively happy; now I'm just annoyed.

Stupid quotes that shouldn't be repeated

22 May 2014 : Technology vs The Law #
Broken CD image by omernos

The problem with technology is that it has created a new and unique power struggle; a struggle that the law has found itself on the wrong side of. The legal bullying of Ladar Levison that ultimately resulted in him having to shut down his company Lavabit is a nasty symptom of the way the law reacts when it feels threatened.

I won't go into the details here, but recommend you take a look at Levison's description of what happened in his Guardian article.

How can the legal system have got so fucked up that this can happen? How is it - to use Levison's words - that he can find himself "standing in a secret courtroom, alone, and without any of the meaningful protections that were always supposed to be the people's defense against an abuse of the state's power"?

To understand this, we need to figure out where the law gets its power from. The nature of the law has always been inextricably linked with power. It's the people in power who define the laws and this gives them credibility through process (although it doesn't give any guarantee that the laws are just). How do you get to be in power? If you're lucky, you might live in a country where there's a process for this too. In the US they exercise what they call democracy (it's not exactly what I'd call democracy, but it's still a lot better than what we have here in the UK). Still, the legitimacy of the process is really seeded elsewhere: it's a redistribution of powers granted conditionally by those who are physically most powerful. Some might say the legitimacy comes from something like the constitution, but in practice the legitimacy of a constitution comes from the war that was won beforehand. Without the demonstration of superior power, the constitution would have rather been just a manifesto put together by a bunch of terrorists.

All laws are founded on power and all power is founded on force. Except that technology has a tendency to destabilise this equation. Take guns (I'm not a big fan of guns in practice, and I'm going to conveniently classify them as technology for the purposes of this argument). Guns have the potential to be an amazing leveller. Prior to their introduction, the force behind the power was premised on physical strength and numbers. Suddenly with guns physical strength becomes an irrelevance. And this isn't just about the advantage of being the first to have one. If everyone owned a gun then actual physical strength would no longer be a consideration since everyone would have the means to end another person's life at the click of a button. I'm not advocating this as a wise move of course (just think what would happen if there was a "Terminate user" option next to the "Report abuse" link on YouTube), but it does illustrate the point.

The law is ultimately reliant on physical force for its legitimacy. Not only does it rely on political power (which is underwritten by force), but it also uses force as its last-resort sanction. There are many intermediate sanctions (removal of money and property, restrictions of rights, threat of surveillance, storing details on a database...), but if these fail, or if someone refuses to submit to them, the ultimate sanctions are incarceration or death, both of which are physical threats. And it's not just legal outcomes, but also the legal process that relies on the threat of physical force. During an investigation, if someone refuses to comply with a search warrant, the police are within their rights to break down the door. Take away the physical threat and you leave the law impotent.

New technologies, and especially encryption and distributed networking technologies, pose a real threat to this. While you can break down a door with a sledgehammer, you can't decrypt an encrypted message by smashing open a computer. If the encryption is done right, you can't decrypt the message at all: you're fighting against the laws of nature and mathematical axioms*. Up until now, the solution sought by the law has been to go after the encryptor rather than the encryption (take for example RIPA in the UK). But technology is nibbling away at this too. Distributed technologies support actions that have no single enactor; information and processes that don't belong to anyone. You can't pursue a physical protector if none exists.

The events surrounding Lavabit and the actions of the intelligence and police services uncovered by Edward Snowden demonstrate a response by the law to try to address a threat which is conceptual rather than physical. The growing realisation that physical solutions can't work has led to laws and processes that were designed to protect being contorted into tools that many people no longer recognise as just.

Unless the law can find new ways to deal with the conceptual threats to its processes that new technologies have introduced, the temptation to become increasingly draconian will remain. There need to be new solutions that don't amount to "if we can't attack the problem with physical force, we'll attack an innocent bystander instead."

On the other hand, individuals will continue to invest in more robust cryptography and make more widespread use of distributed technologies (by which I absolutely do not mean the Cloud!) as a way of preserving the privacy and (ethical) rights that recent events suggest the law has started taking away.

* May be subject to change.

15 May 2014 : Treading More Lightly #
Footprints image by mailsparky

Some time ago I started the process of disentangling myself from Google's clutches. This morning I finally finished the process by deleting the last vestiges of my account.

When Google first appeared it demonstrated a refreshingly open and efficient approach to the Internet, so I was making prolific use of their services until a couple of years ago. Since switching away from Google's search it's felt like their other services have become increasingly irrelevant to me.

In spite of this I discovered this morning the tentacles were still embedded pretty deep. I had documents scattered all over Google Drive, a languishing Google+ profile mostly used for access to hangouts, a Google Talk account (as a front for getting people to use Jabber), Google Analytics, Android accounts, an old Blogger blog; the list goes on.

And this was just the exposed information. I dread to think about the mountain of data being amassed in the background. The reality check really hit last year when Google's services went offline for four minutes in August. Subsequent reports suggested that Internet traffic dropped by 40% as a result. That's a dangerous over-reliance we have there. I was also impressed when one of my students, involved in the CodePool project (if you're reading this: you know who you are!) attempted to remove her Web footprint; I was surprised at how successful she was.

This isn't an attempt to remove my Web presence though and sadly I don't expect the data accumulation to stop. I'm sure Google will continue to know more about my movements than anyone else, whether company or individual. The biggest problem for me is that, even though everyone knows that Google knows, we don't really know the extent of knowledge Google can derive from our data. That's a real concern.

Google still offers outstanding services. I've found no replacement for the public-facing calendar sharing of Google Calendar. I'll inevitably continue to use Google Scholar, Google Maps and Google Images but without the login. Yet most of Google's services are replicated by smaller and less intrusive companies. I'm under no illusion about the motives of these smaller rivals: they still want my data and ad-revenue. But by virtue of their size they're less of a threat to my privacy.

23 Feb 2014 : Adventures with The Other Half #
It's fair to say this is a misleading title. As you'll discover if you take the trouble to read through (and now you've started, you'd be missing out if you didn't), this has nothing to do with either feats of derring-do or my wife Joanna.

No, this is to do with my Jolla phone. Back in the day, before smartphones were ubiquitous, many phone manufacturers tried to lure in the punters by offering interchangeable fascias or backplates. Not very subtle, or high-tech, but presumably effective.

Well, Jolla have decided to return to this, while taking the opportunity to update it for the 21st Century. Each Jolla smartphone appears to be built in two halves, split parallel to the screen and with the back half ("The Other Half") replaceable to provide not just different styles, but also additional functionality. The extra functionality is provided by cleverly using NFC-detection of different covers, along with the ability for covers to draw power from and communicate with the main phone via a selection of pins on the back.

At the moment there are only four official Other Halves that I'm aware of: Snow White (the one that comes as standard), Keira Black, Aloe and Poppy Red (the preorder-only cover). They use the NFC capability to change the styling of the phone theme as the cover is changed, but in the future there's a hope that new covers might provide things like wireless charging, solar charging, pull-out keyboard, etc.

For me, the interesting thing about the phone has always been the Sailfish OS that powers it. As anyone who's ever set eyes on me will attest, I've never been particularly fashion conscious, so the prospect of switching my phone cover to match my outfit has never offered much appeal. However, since the good sailors at Jolla have released a development kit for The Other Half, and since it seemed like an ideal challenge to test out the true potential of future manufacturing - by which I mean 3D printing - this was not an opportunity I could miss.

Rather brilliantly, the development kit includes a 3D model which loads directly into Blender.


From there it's possible to export it in a suitable format for upload directly to the Shapeways site. The model is quite intricate, since it has various hooks and tabs to ensure it'll fit cleanly on to the back of the phone. Sadly this means that most of the usual materials offered by Shapeways are unavailable without making more edits to the model (it will take a bit more work before it can be printed in sterling silver or ceramic!). My attempt to print in polished Strong & Flexible failed, and eventually I had to go with Frosted Ultra Detail. Not a problem from a design perspective, but a bit more expensive.

The result was immaculate. All of the detail retained, a perfect fit on the phone and a curious transparent effect that allows the battery, sim and SD card to be seen through the plastic.


Although a perfect print, it wasn't a good look. Being able to see the innards of the phone is interesting in an industrial kind of way, but the contouring on the inside results in a fussy appearance.

The good news is that all of the undulations causing this really are on the inside. The outer face is slightly curved but otherwise smooth. The printing process results in a very slight wood-grain effect, which I wasn't anticipating, but in hindsight makes sense. The solution to all of this was therefore to sand the outside down and then add some colour.


The colour I chose was a pastel blue, or to give its full title according to the aerosol it came in, Tranquil Blue. Irrespective of the paint company's choice of name, the result was very pleasing, as you can see from the photos below. The 3D-printed surface isn't quite as nicely textured as the original Other Half cover that came with the phone, but I believe most people would be hard-pressed to identify it as a 3D-printed cover. It looks as good as you might expect from mass-produced commercial plasticware.

With the design coming straight from the developer kit, I can't claim to have made any real input to the process. And that's an amazing thing. Anyone can now generate their own 3D printed Other Half direct from Shapeways with just a few clicks (and some liberal unburdening of cash, of course!). A brand-new or updated design can be uploaded and tested out just as easily.

It's genuinely exciting to see how 3D printing can produce both practical and unique results. The next step will be to add in the NFC chip (it turns out they're very cheap and easy to source), so that the phone can identify when the cover is attached.



9 Feb 2014 : Jolla: Easy Wins #
This weekend I tried my hand at a bit of SailfishOS programming, and once again have been pleasantly surprised.

There's no shortage of places to get Apps from for a Jolla phone: the Jolla Store, the Yandex Store and the OpenRepos Warehouse being just a few. But even with this smörgåsbord of stores there are still obvious gaps. For example, I wanted to connect my phone through my home VPN, so that I can access things like SMB shares and ssh into my machines.

The iPhone has an OpenVPN client, but the frustrating file management on the iPhone meant I never got it up and running. Unsurprisingly Android has good OpenVPN support which combines well with the broad range of other good network tools for the platform.

In contrast the various SailfishOS stores are sadly bereft of OpenVPN solutions. However, a quick search using pkcon showed the command line openvpn client available in the Jolla repositories. I was astonished when, after a few commands to transfer the relevant client certificates and install the tool, it was able to connect to my VPN first time.


This is what I'm loving about SailfishOS. It speaks the same language as my other machines and runs the same software. Getting it to talk to my VPN server was really easy, even though you won't find this advertised in the headline features list.

Still, having a command line tool isn't the same as having a nicely integrated GUI App, so this seemed like a great opportunity to try out Jolla's Qt development tools. I've not done any Qt development in the past so started by working through the examples on the Sailfish site.

Qt seems to be a nice toolkit and it's set up well for the phone, but Qt Quick and QML in particular require a shift in approach compared to what I'm used to. Qt Quick obfuscates the boundary between the QML and C++ code. It's effective, but I find it a bit confusing.


Still, after a weekend of learning and coding, I've been able to knock together a simple but effective front-end for controlling OpenVPN connections from my phone.

As well as providing a simple fullscreen interface, you can also control the connection directly from the home screen using the clever SailfishOS multi-tasking cover gestures: pull the application thumbnail left or right to connect to or disconnect from the server.


What I think this demonstrates is how quick and easy it is to get a useful application up and running. The strength is the combination of the existing powerful Linux command line tools, and the ability to develop well-integrated SailfishOS user interfaces using Qt. I'm really pleased with the result given the relatively small amount of effort required.
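To give a flavour of the "thin front-end over a command line tool" idea, here's a minimal sketch. It's written in Python purely for brevity; the actual app uses Qt/QML and C++, and this isn't its code. The function name and the config file path are made up for the example. The only thing taken from openvpn itself is its `--config` option.

```python
import shlex
from typing import List


def build_openvpn_command(config_path: str, binary: str = "openvpn") -> List[str]:
    """Return the argument list that starts the openvpn client with a config file."""
    return [binary, "--config", config_path]


# Hypothetical config location; substitute your own client configuration.
cmd = build_openvpn_command("/home/nemo/vpn/client.ovpn")

# A front-end would hand this list to something like subprocess.Popen(cmd)
# when the user asks to connect, and terminate the process to disconnect.
# Here we only print the command rather than run it.
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids any shell quoting problems if the config path contains spaces.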

If I get time, there's plenty more to be done. Currently the configuration runs directly from the openvpn script, but allowing this to be configured from the front-end would be an obvious and simple improvement. After this, mounting SMB shares will be next.

2 Feb 2014 : Smartphone Homecoming #
First, a warning: if technology doesn't interest you then you're likely to find what you read below just a bit odd. If it does then you might find it a bit opinionated. If you're normal, you'll find it boring. If you're not sure which category you fall into, go ahead and read on, and then check back here to find out!

For many months now I've been stuck in the smartphone wilderness, wandering between platforms trying to find one that makes me feel empowered in the way a good computer should.

Well, I think I've finally found my nirvana, having received my Jolla smartphone yesterday. After playing around with it for just a day, it's already in a much more usable state than the iPhone it's replacing. Although the hardware's nothing to write home about, the whole package is beautifully designed with a flair you rarely see on a mobile device. Programs run well, with fluid and transparent multitasking. The gestures are simple, consistent and brilliantly effective: you can use the phone with just a single hand. For a first device, the completeness of the functionality is impressive. Best yet, the console is just a couple of clicks away, giving full access to the entire device (I already have gcc and python installed).

I have to admit, this is all very exciting. I've used multiple devices over the last year trying to find something interesting without luck, so it's worth considering the path that brought me here. It can be neatly summarised by the photo below.

My smartphone experience has been coloured by the earlier devices that defined my computing development. The strength of a device has always been measured - for me - by the potential to program directly on the device. What's the point of carrying a computer around if you can't use it to compute?! From Psions to Nokia Communicators through to the ill-fated Maemo devices, this has always been by far their most exciting trait.

When Maemo/MeeGo was killed off, the only real alternatives were iOS and Android. I tried both. Android is the spiritual successor to Windows. Its strength is defined by the software that runs on top of it, and it's open enough to interest developers. It's not so bad that people want to avoid it but nonetheless doesn't excel in any particular way. The iPhone on the other hand is an astonishing device. It achieves simplicity through a mixture of control and illusion. In its own way it's perfect, making an excellent communication device. A computing device: less so.

As an aside, both devices are also Trojan horses. Google just wants you logged in to your Google account so it can collect data. Apple wants to seduce you into its ecosystem, if necessary by making it harder to use anything else. Both are fine as long as the value proposition is worth it.

In February 2013 I finally decided to retire my N900. The provocation for this was actually the release of the Ubuntu Touch developer preview. I purchased a Nexus 4, which is a beautiful piece of hardware, and flashed it with Ubuntu. Sadly, the operating system wasn't ready yet. I've kept the OS on the phone up-to-date (the device is now dual-boot) and in fact it's still not ready yet. If it fulfils its goal of becoming a dual mobile/desktop OS, it could have real potential. But (in the immortal words of Juba) "not yet".

So, in May 2013 I moved to an iPhone. The main motivation for this was to try to establish what data Apple collects during its use, especially given the way Siri works. I've continued using it for this purpose until now, maintaining it exclusively as my main phone in order to ensure valid results. After ten months of usage I think I've given it a fair tryout, but it's definitely not for me. It implements non-standard methods where existing standards would have worked just as well. Options are scattered around the interfaces or programs through a mixture of soft-buttons, hardware-buttons and gestures. I find this constantly frustrating, since most of the time the functionality I'm after doesn't actually exist. Yes, mystery meat navigation has escaped the nineties: it's alive and well on the iPhone. The hardware - while well made - is fussy with its mixture of materials and over-elaborate bevelling. However, ultimately what rules it out is the lack of support for programming the device on the device. There are some simple programming tools, but nothing that really grants proper control.

Finally I've ended up with a Jolla phone running Sailfish OS. There's no doubt that this is the true successor to Maemo. If you have fond memories of the Internet Tablet/N900/N9/N950 line of devices, then I'd recommend a Jolla. If you like Linux and want a phone that really is Linux, rather than a Java VM that happens to be running on the Linux kernel, then I'd recommend a Jolla. Clearly, I'm still suffering from the first-flush of enthusiasm, but it definitely feels good to be finally in possession of a phone that I feel like I can control, rather than one that controls me.

For the record, the photo shows (from right to left) Ubuntu Touch running on a Nexus 4, an iPhone 5 running iOS 7.0.4, Android 4.4.2 KitKat on a Nexus 4 and a Jolla device running Sailfish OS (Naamankajärvi). There are actually only three devices here: both Nexuses are the same. The overall photo and the Android device were taken using the Jolla; the Jolla and Ubuntu phones were shot with the iPhone; the iPhone photo was taken with the Android.

I had an interesting experience getting all of the photos off the phones and onto my computer for photoshopping together. Getting the photos off the Jolla and Android devices was easy enough using Bluetooth transfer. The iPhone inexplicably doesn't support Bluetooth file transfer (except with the uselessly myopic AirDrop), and getting anything off the device is generally painful. Eventually I used a third-party application to share the photos over Wi-Fi. However, it was Ubuntu Touch that gave the most trouble. The Nexus 4 doesn't support memory cards, Ubuntu Touch doesn't yet support Bluetooth and the only option offered was to share via Facebook. I gave up on this. No doubt Ubuntu Touch will improve and ultimately outdo iOS on this, but... not yet.

8 Jan 2014 : Digital Forensics: can it really be an academic discipline? #
Although Digital Forensics isn't my main research area, it is one that I've had involvement with for some time. I work with many very talented researchers in the area of digital forensics, and have worked in the past with the Police in testing new digital forensics tools.

Yet in spite of this, I've struggled with the underpinnings of digital forensics for some time. Unlike security research, which is built on a set of clear principles that remain consistent over time, the principal techniques of digital forensics appear to me to be inevitably temporary and fleeting.

To be clear, I do understand that there are clearly defined goals for good digital forensics practice, and that the overarching aim is to collect evidence within the constraints of these requirements. For example, the need to collect data in a non-destructive way, while ensuring traceability, collecting information about provenance, and ideally supporting repeatability of collection. If digital forensics constrained itself to the pure pursuit of managing data based on these principles, then that would provide scope for a practically useful, but theoretically unremarkable area for future research.

I also understand that there are interesting questions related to how data can be analysed and interpreted to better understand it. But to me this falls under intelligence gathering rather than digital forensics. It fits into a much broader class of research (data analysis) which exists separately and independently.

Instead, at the heart of most digital forensics research you'll invariably find a data collection technique that's designed to uncover unexpected data. Data that the user never intended to persist or become accessible. As others have noted, this goal is diametrically opposed to the central goal of security, which is to offer a strict decision over what access is granted and to whom (where access can apply to not just data but also actions). Presumably, a tightly configured and accurately implemented security policy would prevent any effective digital forensics techniques from being used.

As a consequence, much digital forensics research focusses on bypassing security measures, making use of unanticipated data leaks or amalgamating data that had hitherto been considered benign. As soon as these techniques have been identified, a good security process should provide a countermeasure.

Certainly this offers a lucrative seam of challenges to undertake research around. However, each is just the exploitation of a transient mistake, framed from a perspective of intent. Consequently, when I read about digital forensics research I always struggle to understand the enduring principles which have been uncovered by it.

In contrast, the enduring principles of security research are clear. The aim there is to provide control: the ability to allow or disallow access to digital functionality or information based on a stated security policy. The security policy might change, and so the controls and feedback given to the user might also change, but good security research accommodates this without diverging from this clearly defined goal.
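As a rough illustration of what "control based on a stated security policy" means in its very simplest form, here's a toy default-deny check. The names, the triple-based policy format and the example entries are all invented for the sketch; real access control models are far richer than this.

```python
from typing import Set, Tuple

# A policy is the set of (subject, action, resource) triples that are allowed.
Policy = Set[Tuple[str, str, str]]


def is_allowed(policy: Policy, subject: str, action: str, resource: str) -> bool:
    """Default deny: access is granted only if the policy explicitly lists it."""
    return (subject, action, resource) in policy


# Hypothetical policy, purely for illustration.
policy: Policy = {
    ("alice", "read", "report.txt"),
    ("alice", "write", "report.txt"),
    ("bob", "read", "report.txt"),
}

print(is_allowed(policy, "bob", "read", "report.txt"))   # True
print(is_allowed(policy, "bob", "write", "report.txt"))  # False
```

The policy can change freely (add or remove triples) while the check itself, and the goal it serves, stays exactly the same.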

No doubt security doesn't always work like this and there are many challenges to achieving it. Security policies must be suitably complete, definable and understood by the user to achieve the intended results. There must be methods for applying the policy which ensure that the model (policy) and design (controls) align. Finally, the implementation must be correct, so that it - ideally verifiably - matches the requirements.

There will always be difficulties that arise in achieving this, but there is no reason why the methods developed today, which fulfil these objectives within a particular context, shouldn't be as applicable in the future as they are now. I'll grant that the completeness part may be an unachievable aspiration. But this doesn't make the steps towards it any less valid.

On the other hand, the goal of digital forensics is always moving: not forwards but sideways. So what are the underlying principles of digital forensics that an academic research discipline can uncover? How will digital forensics survive as a research area in the future, other than through the drive for practical outcomes? What area is there left for digital forensics to inhabit, once the security problem has been solved?

31 Dec 2013 : Music in the Air #
After wrangling for days with all of the other services to try to get them set up properly on our new home server (mostly the print and folder shares), setting up media streaming has been a breath of fresh air. A quick install of MiniDLNA from the repositories and some minor tweaking of the configuration file, and we can now access music and video from anywhere in the house. Particularly nice is the fact we can upload music via ownCloud and immediately access it direct from the TV. It's all very impressive, for negligible effort on my part (which is just how I like it)!

28 Dec 2013 : Constantia Mk II Goes Live #
After over five years running as my main server resource, the time has finally arrived to retire my mini Koolu server, called Constantia. The last few days have been spent transferring its contents over to a new server, ready to take on the same role. The switch has been necessitated by the ageing hardware of the Koolu device. While it's still running beautifully, the last Ubuntu release to support the hardware dropped out of its support period earlier this year.

The new hardware is an Aleutia T1 device. With its fanless chassis, low (10W) power consumption, tiny (20cm × 18cm) footprint and supported hardware it makes an ideal successor to the Koolu, as you can see below (Koolu on the left, Aleutia on the right).

Aleutia build the devices for projects such as solar classrooms in rural Africa, but they were also very happy to supply a single machine, even going to the trouble of preloading Ubuntu with a bespoke configuration.

I've been working with it for a couple of days now, and first impressions are good. There's a big performance improvement (noticeable even when accessing basic server tasks over the LAN). The more recent Ubuntu support means a host of new useful features and so far the new server has picked up the following roles:

  • DNS server.
  • Apache SSL/TLS web server.
  • Subversion repository.
  • SMB shared folders.
  • Shared print server.
  • OwnCloud cloud storage and services.
  • Trac project management.
  • OpenVPN secure remote access.
  • DLNA media streaming.

Most of these were transferred over seamlessly; for example clients see the Subversion repository as just a continuation of what was there before. I'm looking forward to the improved performance, increased functionality and a more robust server to run the network for the next five years or more! Constantia Mk II can be found at

23 Nov 2013 : A Very Exciting Day #
Today is very exciting. I hear you ask: is it because of the Liverpool Derby? The Day of the Doctor? The Xbox One release? 1D Day?*

No. Today is when I get to try out my new server, which will be replacing Constantia and which will basically be running my entire life. For the last five years this has been very ably managed by a Koolu box (actually an FIC built Ion A603 with an AMD Geode LX processor) running Ubuntu 8.04. It's served beautifully all this time and never let me down. Sadly, Ubuntu 8.04 drifted out of its LTS support cycle earlier this year and the hardware combination isn't usable with newer versions of Ubuntu. It's taken me ages to choose a worthy successor given my demanding requirements (very small, passively cooled, low power, silent, good Linux and software compatibility, etc.). Finally I settled on an Aleutia T1 Fanless PC.

Hence my excitement. It's not the highest specced device in the world, but it runs at 10 Watts, is fanless, with supported chipsets. It arrived yesterday and I've not yet even turned it on. Actually getting it to the stage where it can replace my existing server wholesale is going to take a lot of configuration and data transfer between the two, but that'll all be part of the fun challenge.

In my small world, this is a big event, which could very well end in disaster. If this is my last ever post, you'll know why.

* (The Liverpool what? A little. Waiting for SteamBoxen. Please save me!)

27 Jul 2013 : Raiding Revisited #

Over the years I've collected a lot of screenshots of the various games I've played. Still, the games that have captured the essence of adventure and exploration most consistently for me over a long period of time are those from the Tomb Raider series.

The thing they've consistently managed to get right throughout the series is the sense of scale needed to pull the adventure forwards. Surprisingly evocative vistas and large internal cavernous rooms (captured using clever cinematic long-shots) are balanced against intricate mazes with hidden alcoves. The large scale of the vistas offers the promise of future adventure; the claustrophobic corridors achieve the sense of exploration.

On top of this, there have even been some beautiful weather effects (contrast the atmospheric storm at Dr Willard's Scottish castle against the bright burning sunlight of the Coastal Ruins in Alexandria).

The Tomb Raider Reboot didn't disappoint. To celebrate this (it's a small, private celebration to which only me and the Internet have been invited) below is a selection of some of the more powerful screenshots captured during my playthrough of the game.

7 Jul 2013 : Tombs: Raided #

No one other than me will care about this, but I've finally completed the full complement of Tomb Raider games. It's been a long slog, over 10 years in the passing. It doesn't help that they continue to make things harder by releasing new games every so often.

Perhaps surprisingly, but fittingly, the last game that I managed to complete wasn't the latest Tomb Raider reboot, but instead was Unfinished Business, where Lara returns to the Atlantean Hive from Tomb Raider 1. To be fair, I'd already completed this, but had taken the shortcut to skip the Atlantean Stronghold level. I've now done it properly.

Although there are lots of Tomb Raider games I've not played, most of them are mobile, Gameboy or Xbox exclusives which I don't imagine I'll ever get to have access to. I like to think of them as not being canon! Here's the full list of conquered games.

  • Tomb Raider.
  • Unfinished Business and Shadow of the Cat.
  • Tomb Raider II: Starring Lara Croft.
  • Tomb Raider III: Adventures of Lara Croft.
  • The Golden Mask.
  • Tomb Raider: The Last Revelation.
  • Tomb Raider: The Lost Artefact.
  • Tomb Raider Chronicles.
  • Tomb Raider: The Angel of Darkness.
  • Tomb Raider Legend.
  • Tomb Raider Anniversary.
  • Tomb Raider Underworld.
  • Lara Croft and the Guardian of Light.
  • Tomb Raider (reboot).

From all of these, the standout levels are the Venice Level from Tomb Raider II, and (ironically, given the bad reviews) the Louvre level from Angel of Darkness. I loved leaping around those roofs. The latest Tomb Raider was a great game and worked really well as a fresh approach. Still, Edge had it spot on when they said it was a reversal of the formula: from precise platforming and loose shooting to loose platforming and precise shooting. I'd rather have precise platforming and no shooting myself. In spite of this it was a great game and it would be a lie to say I didn't enjoy it a lot.

Thankfully Tomb Raider is a bit like Doctor Who. There's more than enough non-canon material to fill a lifetime, so I have absolutely no plans to stop here. Even the original block-based adventures have their place in modern gaming as rare examples of games worth playing on a laptop without the need for a mouse (it's a dubious accolade I admit). With a bit of luck they'll continue to release great games in the future.

Below are a few more images taken from my final foray into the original world of Lara Croft: Unfinished Business.

28 Jun 2013 : Bitcash #
A while ago I traded my first Bitcoins, and now the purchased product has arrived, all the way from Switzerland. What did I buy with my Bitcoins? Well, a Bitcoin of course! Except this one is a Casascius physical Bitcoin. It's an interesting idea: create a physical coin that contains the private key to access an "actual" (virtual!) Bitcoin. The private key is printed on the coin under an opaque tamper-proof cover, so that anyone can easily ensure that the coin is worth the valid amount by checking the seal. Consequently it can be passed between people like a normal coin. With a real coin, if you want the government to make good on its promise to pay the bearer, you're out of luck. With this coin, to redeem the amount just pull off the cover to reveal the key. In practice you'd never want to do this, and the virtual Bitcoin you'd get wouldn't necessarily be worth any more (or less) than the government's promise, but it's still a neat idea.

Casascius physical Bitcoins

3 May 2013 : Finished Business #
After destroying the Scion and fighting off Natla I thought my work would be done. Not so. There were many other mysteries to solve, and given my late arrival, lots of adventures to pursue in a largely random order. Finally, after completing all of the other adventures, it was time to return to the Atlantean Stronghold and destroy the mutants created by Natla once and for all. To that end, I returned with the intent of ultimately destroying the remaining inner Hive.

From the top of the structure overlooking where the hive pyramid erupted from the rock I could see the far cliffs ahead, but no way to reach them. On the ground below, in the far distance, I perceived two golden doors, tempting me forwards as the best means of progress. Alas, despite my best investigatory efforts, there was no way to open the doors and although I knew I needed to ascend to reach my goal, the only way forwards was now down into the hive pyramid itself.

Working my way through the pyramid, I dispatched various terrestrial and winged mutants en route, including those showering me with deadly darts and explosive projectiles. Luckily many of them suffered from idiosyncratic perception difficulties (no doubt a result of the mutation process) that made them more likely to follow my shadow than me. Fooling them using acrobatic prowess (dangling from ledges and leaping on top of blocks) and showering them with persistent pistol fire while dodging their own deadly projectiles saw me prevail. Yet this was no easy fight through the chambers and passageways.

As I continued onwards the way became more treacherous still, with lava flows cutting off my path, dangerous precipices to be scaled over lethal spikes and watery pools containing hidden switches that bore the secrets to opening the passageways ahead. Oftentimes I saw glimpses of future dangers, obliquely viewable through the many impenetrable glass and tissue structures of which the pyramid was built. But these ominous forewarnings only drove me harder to complete my journey.

Eventually, working down and then higher again into the rocks above, I found myself overlooking the same pyramid again, but now from the opposite side, from an angle where my goal was visible. Leaping into the unknown, I dived through the darkened hole in the pyramid with only serendipity and an unwavering belief in the existence of a path forwards to trust in. My faith was rewarded, with the pool below deep enough to buffer the impact of my fall. I climbed out to find myself in the inner hive of the mutants, and able to finally finish what had been started all those years ago in the Peruvian Andes searching for the Scion.

28 Apr 2013 : Bitcoins #
Today I traded my very first Bitcoins. It's possibly the worst time to be buying them, given the amazing amount of publicity they've been getting recently (and the upsurge in their value that's resulted). Still, today I'm buying them for a reason rather than as an investment, so I've convinced myself that it's okay. Why the rush? I just discovered that Casascius is no longer selling physical Bitcoins to individuals. Since I'm keen to have one, the high price is just something I have to suck up. I'm looking forward to getting hold of a physical coin (even if it epitomises everything Bitcoins aren't!), and it's exciting to actually own some of the currency. The Web right now manages to make Bitcoins look a lot more daunting than they actually are, which is quite an accomplishment.
9 Dec 2012 : PiBot2 parts #
I've been really quite shocked (in a good way) at the interest that PiBot has generated. Apparently the world needs more Raspberry flavoured Lego robots, so to help anyone aspiring to own their own robot army, here's the list of parts that was used for PiBot2.

Pretty much everything came from Amazon, so most of the links are to the UK Amazon site. Apologies if you're from outside the UK or are currently boycotting Amazon for their dubious tax practices, but all of these should be readily available from lots of other places too.

The table is split into two parts. The first part covers just those bits and pieces that you're likely to need to get a Raspberry Pi up and running. If you've already got a Raspberry Pi, you probably already have all of these things. The second part covers the materials needed to get the robot working.

Parts needed for the Pi

Item | Purpose | Price
Raspberry Pi | The computer. | £25.92
Logitech K340 Wireless Keyboard | Keyboard works well with Pi. | £34.95
Logitech M505 Wireless Mouse | Mouse works well with Pi. The Logitech unifying receiver takes one USB port for both keyboard and mouse. | £30.98
HDMI cable | To connect the Pi to a screen. | £1.03
Micro USB Mains Charger | To power the Pi when it's not attached to the battery. | £2.75
16GB Micro SDHC card | To run the OS from. | £8.22

Parts needed for the Robot

Item | Purpose | Price
LEGO MINDSTORMS NXT 2.0 | The actual robot. This includes the motors and ultrasonic sensor needed for control. | £234.99
TeckNet iEP387 7000mAh 2.1Amp Output USB Battery | For powering the Pi when it's on the move. I tried cheaper less powerful chargers (including AA batteries), but they couldn't provide enough juice to keep the Pi running. | £23.97
USB 2.0 A/b Cable - 1m | For connecting the Pi to the Mindstorm control Brick. | £1.69
USB A to Micro B Cable - 6 inch | For connecting the Pi to the battery. | £2.13

The total bill for this lot was around £370. However, £235 of this is the LEGO Mindstorms kit and £65 is for the wireless keyboard and mouse, so if you've already got these I'd say the rest is pretty reasonable. I had to try a number of wireless keyboards before finding one which didn't cause the Raspberry Pi to reset randomly though. If anyone knows of a cheaper keyboard/mouse combo that works well with the Pi, let me know and I can alter the list.
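
For anyone wanting to check the sums, or to swap in prices for parts they already own, here's a minimal sketch of the arithmetic in Python. The prices are copied from the list above and held in pence to avoid floating point rounding; the variable names are just for illustration.

```python
# Prices in pence, copied from the parts list above.
pi_parts = {
    "Raspberry Pi": 2592,
    "Logitech K340 Wireless Keyboard": 3495,
    "Logitech M505 Wireless Mouse": 3098,
    "HDMI cable": 103,
    "Micro USB Mains Charger": 275,
    "16GB Micro SDHC card": 822,
}
robot_parts = {
    "LEGO MINDSTORMS NXT 2.0": 23499,
    "TeckNet iEP387 USB Battery": 2397,
    "USB 2.0 A/b Cable - 1m": 169,
    "USB A to Micro B Cable - 6 inch": 213,
}

pi_total = sum(pi_parts.values())
robot_total = sum(robot_parts.values())
grand_total = pi_total + robot_total

# What's left to buy if you already own the Mindstorms kit, keyboard and mouse.
already_owned = [
    "LEGO MINDSTORMS NXT 2.0",
    "Logitech K340 Wireless Keyboard",
    "Logitech M505 Wireless Mouse",
]
all_parts = dict(pi_parts)
all_parts.update(robot_parts)
remainder = grand_total - sum(all_parts[name] for name in already_owned)

print("Total: GBP %.2f" % (grand_total / 100.0))
print("Without the kit, keyboard and mouse: GBP %.2f" % (remainder / 100.0))
```

Editing the `already_owned` list gives the outlay for whatever mix of parts you still need.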

If you're building a PiBot, I hope this helps to get things underway. I'd be really interested to know how other people get on; it'd be fantastic to feature some other PiBot designs on the site!

9 Aug 2012 : PiBot2 #
After a frantic buying spree on Amazon and some tense anticipation each day with the post, PiBot has now been augmented (Deus Ex style) with better hardware, neater design and improved software. Meet PiBot2! The upgrades include a much larger (7000 mAh) battery, a USB connector that doesn't cut power when riding over bumps, a mere 1m long cable (as compared to the previous 5m long version), and auto-roaming code that will explore the room without intervention (mostly!).

The cable is still a good 80cm too long, and the exploration code is simple to say the least, but it's one step further on. Using PyGame for the code means proper asynchronous keyboard input, so that human-control and auto-exploration can be switched between seamlessly. The next part of the plan is to draw objects in the PyGame window as PiBot senses them. I don't expect this to work very well, but I plan to have fun trying it!

Below are a few screenshots of the new PiBot, along with the code in its latest state.

#!/usr/bin/env python
# PiBot2: keyboard control and simple auto-roaming for a Lego Mindstorms NXT
# robot, using PyGame for input and nxt-python to talk to the brick.

import pygame
from pygame.locals import *
import nxt
import nxt.locator
from nxt.sensor import *
from nxt.motor import *

# Robot states: 0 = quit, 1 = keyboard command mode, 2 = auto-explore mode.

def handle_input(events, state):
    # Act on any pending window or key events and return the updated state
    for event in events:
        if event.type == QUIT:
            state = 0
        if event.type == KEYDOWN:
            if event.key == K_q:
                print("q")
                state = 0
            elif event.key == K_w:
                print("Forwards")
                both.turn(100, 360, False)
            elif event.key == K_s:
                print("Backwards")
                both.turn(-100, 360, False)
            elif event.key == K_a:
                print("Left")
                leftboth.turn(100, 90, False)
            elif event.key == K_d:
                print("Right")
                rightboth.turn(100, 90, False)
            elif event.key == K_f:
                print("Head")
                head.turn(30, 45, False)
            elif event.key == K_r:
                state = explore(state)

    return state

def explore(state):
    # Toggle between command mode and explore mode
    if state == 1:
        state = 2
        print("Explore")
    elif state == 2:
        state = 1
        print("Command")
    return state

def autoroll():
    # If the ultrasonic reading drops below 20, back up and turn away
    if sonar.get_sample() < 20:
        both.turn(-100, 360, False)
        leftboth.turn(100, 360, False)

def update(state):
    # In explore mode the robot steers itself
    if state == 2:
        autoroll()
    return state

pygame.init()
window = pygame.display.set_mode((400, 400))
fpsClock = pygame.time.Clock()

# Connect to the brick and set up the motors and sensor
brick = nxt.locator.find_one_brick()
left = Motor(brick, PORT_B)
right = Motor(brick, PORT_C)
both = nxt.SynchronizedMotors(left, right, 0)
leftboth = nxt.SynchronizedMotors(left, right, 100)
rightboth = nxt.SynchronizedMotors(right, left, 100)
head = Motor(brick, PORT_A)
sonar = Ultrasonic(brick, PORT_2)

state = 1
print("Running")
while state > 0:
    state = handle_input(pygame.event.get(), state)
    state = update(state)
    # Cap the loop rate so it doesn't spin flat out
    fpsClock.tick(30)

print("Quit")

22 Jul 2012 : The PiBot Raspberry Pi NXT robot #

Inspired by the amazing things the Boreatton Scouts group are doing with their Raspberry Pis, as well as a conversation with David Lamb and Andrew Attwood (two colleagues of mine at LJMU), I thought it was about time I actually tried to use my Pi for something other than recompiling existing software. I'm not a hardware person. Not at all. But I do have a Lego Mindstorms NXT robot which has always had far more potential than I've ever had the energy to extract from it.

But after reading about how it's possible to control the NXT brick with Python using nxt-python, and with David pointing out how manifestly great it would be to get the first year undergraduates learning programming using it, I couldn't resist giving it a go.

It turned out to be surprisingly easy. The hard parts? First was getting the Pi to discover the NXT brick over USB. The instructions for this aren't too great, but in fact it turned out to be as simple as copying the NXT MAC address into the PyUSB configuration file. Second was getting the Pi, battery pack and 5 metres (yes, you read that right) of USB lead to balance on top of the robot!

The PiBot components

I'm not exactly sure why I bought such a huge lead given I knew it would all end up on top of the robot, but that's planning for you!

The result really is as crazy and great as I'd hoped. I wrote a 50 line python programme to read key presses and drive the robot appropriately (right, left, forward and back), and nxt-python does all of the hard work. The keyboard is wireless, attached to the Raspberry Pi using a micro dongle. The USB lead connects the Pi with the NXT brick. The Raspberry Pi is powered by a USB phone charger. The monitor lead and ethernet aren't needed when the machine's running, which means the robot/pi combination is entirely self-contained and can be controlled using the wireless keyboard.

It was also possible to read data from the sensors, allowing the robot to drive itself entirely autonomously around the room avoiding objects and generally exploring. The next step is to collect more input about the distance it's travelled so that it can be mapped on to a virtual room on the Raspberry Pi and build a picture of the world.
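
As a rough idea of what that mapping step might look like, here's a minimal dead-reckoning sketch. It isn't part of the PiBot code: the function names, the starting position and the example readings are all made up for illustration, and it assumes the robot's heading and the distance driven are already known in degrees and centimetres (working those out from the motor rotations is the hard part, and isn't shown).

```python
import math

def advance(x, y, heading_deg, distance_cm):
    """Return the new (x, y) after travelling distance_cm along heading_deg."""
    heading = math.radians(heading_deg)
    return (x + distance_cm * math.cos(heading),
            y + distance_cm * math.sin(heading))

def obstacle_position(x, y, heading_deg, range_cm):
    """Place an obstacle seen range_cm straight ahead of the robot on the map."""
    return advance(x, y, heading_deg, range_cm)

# Hypothetical run: start at the origin facing along the x axis, drive 50cm,
# turn 90 degrees anticlockwise, drive 30cm, then see something 20cm ahead.
x, y, heading = 0.0, 0.0, 0.0
x, y = advance(x, y, heading, 50)
heading += 90
x, y = advance(x, y, heading, 30)
ox, oy = obstacle_position(x, y, heading, 20)

print("Robot at (%.1f, %.1f), obstacle at (%.1f, %.1f)" % (x, y, ox, oy))
```

Each obstacle position could then be plotted as a point in the virtual room. In practice, errors in the heading and distance estimates accumulate quickly, so a real map would drift.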

Here's a video of Joanna controlling the Heath-Robinson contraption as well as some photos showing all of the different parts balanced on top of one another.

The wonderful thing about all of this is that although it requires a huge amount of effort and insight to get each of the individual pieces working, none of the effort was mine. Pulling the pieces together is really straightforward, building on so much clever work by so many people. It's got to the stage where you can grab a phone charger, some Lego, a $35 PC the size of a credit card, a wireless keyboard, an entirely open source software stack, 5m of USB cable and a Sunday afternoon and end up with a complete robot you can programme directly in Python. Brilliant.

#!/usr/bin/env python
# PiBot: drive a Lego Mindstorms NXT robot from the keyboard using nxt-python.

import nxt
import sys
import tty, termios
import nxt.locator
from nxt.sensor import *
from nxt.motor import *

# Connect to the brick and pair up the two drive motors
brick = nxt.locator.find_one_brick()
left = nxt.Motor(brick, PORT_B)
right = nxt.Motor(brick, PORT_C)
both = nxt.SynchronizedMotors(left, right, 0)
leftboth = nxt.SynchronizedMotors(left, right, 100)
rightboth = nxt.SynchronizedMotors(right, left, 100)

def getch():
    # Read a single key press without waiting for Enter, by switching the
    # terminal to raw mode and restoring its settings afterwards
    fd = sys.stdin.fileno()
    old_settings = termios.tcgetattr(fd)
    try:
        tty.setraw(fd)
        ch = sys.stdin.read(1)
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)
    return ch

ch = ' '
print("Ready")
while ch != 'q':
    ch = getch()

    if ch == 'w':
        print("Forwards")
        both.turn(100, 360, False)
    elif ch == 's':
        print("Backwards")
        both.turn(-100, 360, False)
    elif ch == 'a':
        print("Left")
        leftboth.turn(100, 90, False)
    elif ch == 'd':
        print("Right")
        rightboth.turn(100, 90, False)

print("Finished")

19 Feb 2012 : The World Wild West #
The Web used to be like the Wild West: lawless and anarchic, yet at the same time inspirational and free. But frontiers get pushed back, and beasts get tamed. Today the Web is a far 'safer' place, with much of the control ceded to governments and corporations. One of the happier casualties of this appears to be spam, which through a combination of law and technology, is now a far less aggressive problem than it was back then.
Since the start of this site, I've always used a public email address that was separate from the private email address I gave to people personally. The reason was to reduce spam, and also because companies couldn't be trusted to use my email address responsibly. Today the amount of spam I receive, even on the public address, is bearable and companies are much more likely to actually comply with the data protection laws preventing distribution of contact details. As a result, I've decided to finally move over to using just a single, simple, email address. The plan is to make my life easier and have fewer addresses to deal with. Whenever I write out my name on official forms it's always hard to fit it into the space provided. Finally I can now avoid having the same problem with my email address as well!
12 Feb 2012 : Celtic Knots: moving from 2D to 3D #
A couple more prints have arrived from Shapeways and once again I'm really pleased with the results. The first was a bit of an experimental print for a number of reasons. It's another 3D Celtic knot, but this time I tried it with much thinner threads, right down to the minimum of 0.7mm thickness recommended by Shapeways. I get the feeling this recommendation is intended for walls, so I'd feared the threads wouldn't be strong enough to hold together. In fact, the final result is perfectly sturdy and the threads seem quite robust. Second, I tried the polished version of the "white strong and flexible" material (which is apparently a kind of nylon). The polishing process involves shaking the model with lots of tiny polishing balls, so again I'd feared this might affect the model's strength. And again, it seems my fears were unfounded. Finally, I generated the model to have gaps where the threads cross over, the hope being it would be printed in four separate pieces. Unfortunately I apparently didn't give enough clearance, and some of the threads fuse at these intersections. Nonetheless, some of them are still loose, and the result is really great. I may try it again with a bit more of a gap next time though.

The second knot is a proper 2D Celtic knot. The idea is that this is generated from the same seed as the 3D knot, making it in some sense the 'same' knot. That's not really true, but until I figure out what's really meant by 'the same', this is as close as I can think of. I was pleased to find that, since they're both printed with the same dimensions, resting the 2D version on one of the faces of the 3D knot, they align nicely and really look like one is an extruded version of the other.
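
To illustrate the general idea of two models sharing a seed, here's a small sketch. It is not the actual knot generator: the grid of crossing symbols and the `knot_layout` function are purely hypothetical stand-ins. All it shows is that a seeded random number generator makes the same sequence of choices every time, so a flat layout and one face of a 3D layout built from the same seed will line up.

```python
import random

def knot_layout(seed, width, height):
    """Choose a (hypothetical) crossing style for each grid cell from a seed."""
    rng = random.Random(seed)
    return [[rng.choice("/\\|-") for _ in range(width)]
            for _ in range(height)]

# The same seed is used for both, so both layouts make identical choices.
layout_2d = knot_layout(seed=42, width=4, height=4)
layout_3d_face = knot_layout(seed=42, width=4, height=4)

for row in layout_2d:
    print(" ".join(row))
print("Layouts match: %s" % (layout_2d == layout_3d_face))
```

Whether two knots that share a seed are 'the same' in any deeper sense is, as noted above, a separate question.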
Once again, printing out these knots has provided some really nice results, leaving the biggest problem the question of what to print next.
4 Feb 2012 : From theory to practice #
Yesterday I received another print from Shapeways. It's my first metal creation using a clever printing process that takes a 3D model as input and creates a completely formed bronze object as output.
Perhaps unsurprisingly it's another 3D Celtic knot. Once again, in spite of the dubious model I provided, the result is just brilliant. It's a real chunk of metal that looks like it's been hewn and polished into a complex shape through hours of craftsmanship. It did take hours of work of course, but in reality it was largely done using completely routine machine production techniques. Here's a shot of the result.

We all have dreams about the things we want to do when we're grown up, like becoming pop stars, train drivers, footballers or whatever. As we grow older we find we have to shed some of these hopes. There comes a point when the realisation sets in that perhaps there are people better suited to fighting dragons. Having spent practically all my life working with either maths or computers, I'd pretty much given up hope of ever doing something that actually produced physical results. It sounds like a strange dream, but the prospect of being able to create something tangible has always seemed exciting.
It's surprising then to find a path of entirely abstract ideas can lead so naturally into a process of creating physical constructs. This is the solution 3D printing offers. It allows people to turn abstract ideas into physical form, without ever having to leave the comfort of a computer screen. No need to get your hands dirty.
Of course, the physical infrastructure needed to get to this point is phenomenal (electricity, Internet, banking, etc.). Someone had to build it and huge numbers of people are still needed to maintain it. But as far as I'm concerned, sitting behind a computer screen, it's still an utterly seamless and physically effortless process. Only thought required.
You can download the 3D model for this, or buy a physical artefact, direct from the Shapeways website.
6 Sep 2011 : 3D printing #
In case you're interested to know more about my recent 3D printing experience, I've put together some more words and pictures. Feel free to take a look. Alternatively, there's also a link in case you want to print a copy of the Celtic knot yourself. That's right, you really can print your own. Still doesn't seem right.
3 Sep 2011 : 3D printing #
Today I received my first ever 3D print. It's a 3D Celtic Knot which was generated by some code I put together while Joanna and I were in Tuscany a couple of weeks back. The model was sent off to a company called Shapeways in the Netherlands, and today I received the final printed object. The technology that allows you to print 3D objects is just phenomenal, both in terms of how clever it is, and the astonishing potential it promises. It really does provide the opportunity to create just about anything, turning the wildest imaginings into reality.
I was pretty nervous getting it out of its box as I really wasn't sure how well it would come out, but in fact I'm astonished at how clean the printing is and how sturdy the object is. I hope to put together the full story of my 3D printing experience, from theory to reality, tomorrow.
One of the things I love about the idea of 3D printing, is that it really seems as close as we can get right now to the Star Trek replicator way of doing things. That might seem like an irrelevance - just a nerdy reaction - but I see it as a real vision of how things will change in the future. I find the shift from small-scale-bespoke, through mass-production, to mass-bespoke just a little exhilarating to be lucky enough to experience.
3D printed celtic knot
31 Aug 2011 : Syncing my Google Calendar #
At work I use Outlook, since the University uses MS Exchange and the nature of collaboration tools is that you have to use what other people are using. However, for some time now I've also been syncing this with a Google Calendar so that I can also make some of the details available on this site. Google provides a free syncing tool, but this had various limitations, such as only being able to sync one calendar, making it no good for what I wanted. The solution was to use a piece of software called SyncMyCal. For the record, this is a great piece of software that does a straightforward task very well. Once it's properly configured, it's the kind of software that works best if you don't notice it again, which is exactly how things were until recently. It was well worth the asking price.
So, this worked great for ages, until half a year or so ago the University started upgrading the Exchange servers, and I upgraded my machine to Outlook 2010. SyncMyCal was only compatible with Outlook 2003.
My solution at the time was to continue running Outlook 2003 with SyncMyCal on a separate machine. This kind of worked, but had problems. The machine would get turned off and I wouldn't notice, or it would reboot after an automatic update leaving Outlook asleep on the hard drive. My Google calendar was only updated intermittently. Nobody really cared, except for me, since it increased the disorder in my world and kept me locked in to running an old machine just for the sake of syncing.
Until yesterday that is. On the off chance, I checked the SyncMyCal site yesterday and found they'd finally released an Outlook 2010 version of their tool. Yay!
The result is that now my calendars are syncing normally, the version on this site is telling the truth, rather than some partial version of it, and the world - for me at least - has become a little more ordered!