flypig.co.uk

Gecko-dev Diary

Starting in August 2023 I'll be upgrading the Sailfish OS browser from Gecko version ESR 78 to ESR 91. This page catalogues my progress.

Latest code changes are in the gecko-dev sailfishos-esr91 branch.

There is an index of all posts in case you want to jump to a particular day.


6 Oct 2024 : Reviewing My Browser History
For many years I thought it would be a mistake to mix my hobbies with my professional life. Blurring the two would prevent me from drawing a clear boundary between my work time and my relaxation time. I thought it could also lead to things I enjoy becoming contaminated, irreversibly harming the joy I get from them. It's not that I didn't want to enjoy my work: quite the opposite in fact. I felt in order to enjoy both I needed to maintain a separation.

As my life has progressed I've changed my opinion on this. It's great to separate work and play, but there's also immense joy to be had from doing something you love as a professional endeavour. Mixing the two together has the potential to amplify the joy from both.

Working for Jolla was what really brought this home to me. Smartphone development, user privacy and control, and Sailfish OS in particular were always part of the life I separated from my career. When I started working at Jolla I thought I was taking a risk. Would I lose my passion? Would I regret knowing what goes on inside the "sausage factory"?

My concerns were unfounded and, with this experience in hand, I now make it my aim to bring my personal passions into my professional life as well.

Now I'm at the Turing I'm no longer developing for Sailfish OS during work hours. As readers of my gecko dev diaries will know, upgrading the Sailfish browser has been one of my main activities outside of work. Finding opportunities to bring this Sailfish development into my professional world has been one of my objectives and recently just such an opportunity arose.

It's only a small overlap: in November I'll be giving a presentation about browsers at the Turing. The title of the talk will be "The Anatomy of a Browser: Embedded Mobile Lizards". Lizards being a reference to Gecko.

To help with this I've been digging a little into the history of browsers. They have a rich and often fractious history that I find fascinating, and one I want to talk a little more about today.

But to understand the history, we first need to understand a little about the internals of a browser.

Browser Internals

What are the various pieces that make up a browser? Broadly speaking we can see it as being made up of six parts:
  1. Protocol client (HTTP/S, WS/S, file, FTP,...).
  2. JavaScript engine.
  3. DOM - Document Object Model.
  4. Layout/rendering engine (HTML, CSS, SVG).
  5. Media encoder/decoder (JPEG, PNG, audio, video,...).
  6. User interface.
That's already quite a lot to think about, but each of these can be broken down into many more pieces. Let's look at them in a bit more detail.
 
A graph showing 11 nodes: Web server, JavaScript engine, Protocol client, Media encoder/decoder, Layout engine, DOM + scene graph, Renderer, Chrome, nsDocShell + nsWebBrowser, Render backend and Compositor. The nodes are connected with arrows indicating functional relationships

The protocol client handles network interactions. It opens a network connection, sends a message to the server, then waits for and interprets the response. If the protocol uses a secure transport layer it handles certificate validation, checking certificate revocation, data encryption and integrity. The latest releases of Firefox and Chrome support HTTP, HTTPS, WebSockets, Secure WebSockets, Secure Real-time Transport Protocol, file access and probably others I'm not aware of. Firefox and Chrome used to support FTP but have since dropped it. Firefox dropped support in version 90 (July 2021) while Chrome dropped it in version 95 (September 2021). Unlike the rendering and JavaScript engines, protocol clients tend not to be given their own bespoke names separate from the browser. Maybe this is because they're often built from other libraries offering support for specific protocols. Nevertheless the protocol client is both a crucial and complex part of the browser.
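To make the request and response cycle a little more concrete, here's a minimal sketch in C of the two ends of an HTTP exchange: formatting a GET request and interpreting the status line of the reply. It deliberately leaves out the network connection, TLS and header parsing, and the function names (build_get_request, parse_status_line) are my own inventions for illustration rather than anything taken from a real browser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a minimal HTTP/1.1 GET request into buf.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t size, const char *host, const char *path)
{
    assert(buf != NULL && host != NULL && path != NULL);
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     path, host);
    return (n < 0 || (size_t) n >= size) ? -1 : n;
}

/* Interpret the first line of a response, e.g. "HTTP/1.1 200 OK".
 * Returns the numeric status code, or -1 if the line is malformed. */
int parse_status_line(const char *line)
{
    int major, minor, status;
    assert(line != NULL);
    if (sscanf(line, "HTTP/%d.%d %d", &major, &minor, &status) != 3)
        return -1;
    return status;
}
```

A real client would write the request to a socket, read the reply and hand the first line to parse_status_line before moving on to the headers and body.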

I've listed the DOM as a separate piece of the browser, but it's usually tightly coupled with the layout and rendering engines. The DOM defines the internal data structures used to represent the page being rendered. For HTML, XML or SVG documents these are hierarchically built from nodes that have a parent and multiple (possibly zero) children. Typically the document structure will map naturally onto the DOM, with XML elements and attributes mapping onto nodes. Child nodes in the document will map to child nodes in the DOM. In practice nodes are likely to be represented as class objects in the code containing references to child nodes. The DOM is usually part of the rendering engine separate from the JavaScript engine, and if it weren't for JavaScript the DOM might be considered just an implementation detail. The existence of JavaScript elevates the DOM to something Web developers have to have a good understanding of, as we'll see.
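As a rough illustration of the parent and child structure described above, here's a minimal sketch of a DOM-like tree in C. Real engines use far richer C++ class hierarchies, so the DomNode structure and the dom_node_* functions are purely hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One node in the tree: a single parent and zero or more children,
 * with the children held as a singly linked list of siblings. */
typedef struct DomNode {
    char *name;                    /* element name, e.g. "body" */
    struct DomNode *parent;        /* NULL for the root node */
    struct DomNode *first_child;
    struct DomNode *last_child;
    struct DomNode *next_sibling;
} DomNode;

/* Allocate a detached node; returns NULL if out of memory. */
DomNode *dom_node_new(const char *name)
{
    DomNode *node = calloc(1, sizeof(DomNode));
    if (node == NULL)
        return NULL;
    node->name = malloc(strlen(name) + 1);
    if (node->name == NULL) {
        free(node);
        return NULL;
    }
    strcpy(node->name, name);
    return node;
}

/* Attach a detached child as the last child of parent. */
void dom_node_append_child(DomNode *parent, DomNode *child)
{
    assert(parent != NULL && child != NULL && child->parent == NULL);
    child->parent = parent;
    if (parent->last_child != NULL)
        parent->last_child->next_sibling = child;
    else
        parent->first_child = child;
    parent->last_child = child;
}

/* Count the direct children of a node. */
size_t dom_node_child_count(const DomNode *node)
{
    size_t count = 0;
    for (const DomNode *c = node->first_child; c != NULL; c = c->next_sibling)
        count++;
    return count;
}

/* Free a node together with all of its descendants. */
void dom_node_free(DomNode *node)
{
    if (node == NULL)
        return;
    DomNode *child = node->first_child;
    while (child != NULL) {
        DomNode *next = child->next_sibling;
        dom_node_free(child);
        child = next;
    }
    free(node->name);
    free(node);
}
```

Building an html node with head and body children, then appending a p node to body, gives the same shape as the document it represents.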

The JavaScript engine allows execution of JavaScript code. JavaScript has an odd history. Originally invented at Netscape by Brendan Eich, you might think that the JavaScript language has something to do with Java. In fact they're very different. Java is a statically-typed object-oriented garbage-collected language that compiles down to a bytecode representation that can be executed by a Java Virtual Machine. Although at one point Java "applets" that ran in the browser were a thing, you rarely see these nowadays (they're not supported without installing a plugin). Java is still used in server applications and to be honest, given Sun's expertise and revenue rested primarily with servers I always found it rather surprising that it was ever anything other than server-focused. JavaScript on the other hand is very much a client-side, dynamically-typed event-based scripting language with prototype-based object orientation. In recent years it's also become popular as a server-side language for reasons that I won't go into here. Both Java and JavaScript are "curly-brace" languages with similarities to C++; and while I realise I've managed to make them sound quite similar, they're actually totally different. The only reason they share a name is that in a bid to ride the wave of Java's popularity, Netscape signed a licensing agreement with Sun to use the name. Marketing genius or ontological vandalism? You decide.

Another key difference between JavaScript and Java is that the DOM is a first-class entity in JavaScript. Although they live in different parts of the browser, the development of the DOM is tightly intertwined with that of the language. When first released by Netscape JavaScript could interact with only certain elements of the page, most notably form elements. The name now given to the set of elements exposed at that time is DOM Level 0. Access to the full document didn't come until DOM Level 1. While JavaScript is a perfectly good language even without the DOM, in a browser context the two are tightly coupled.

Although JavaScript refers to and can modify the DOM, the DOM implementation is part of the layout and rendering engine. When we refer to browser engines (WebKit, Gecko, Blink,...) we're usually referring to this layout/render engine portion of the browser. The layout engine takes the document, structured using the DOM, and lays it out as elements on the page in the way they'll be viewed by the user. This allows the browser to build up the equivalent of a scene graph which is then rendered by the rendering engine to some sort of canvas (the screen or an offscreen buffer). This rendering usually uses an appropriate render backend, for example in the Sailfish Browser it issues a series of GLES commands. The layout engine follows a strict set of rules for positioning elements on the page. The HTML/CSS box model is used for rendering most items, but there are exceptions. For example SVG has its own rendering model which Gecko also supports as part of the same DOM hierarchy.
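To give a flavour of what a layout rule looks like, here's a heavily simplified sketch of the box model in C: each box has content surrounded by padding, border and margin, and block boxes are stacked one below the other. It assumes uniform edges on all four sides and ignores margin collapsing, floats, inline content and everything else a real engine has to handle. The Box structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A box with uniform padding, border and margin on all four sides (pixels).
 * x and y are outputs: the top-left corner of the margin box. */
typedef struct {
    int content_width, content_height;
    int padding, border, margin;
    int x, y;
} Box;

/* Full horizontal space taken up by the box, including its edges. */
int box_outer_width(const Box *box)
{
    assert(box != NULL);
    return box->content_width + 2 * (box->padding + box->border + box->margin);
}

/* Full vertical space taken up by the box, including its edges. */
int box_outer_height(const Box *box)
{
    assert(box != NULL);
    return box->content_height + 2 * (box->padding + box->border + box->margin);
}

/* Position the boxes one below the other starting at the given origin.
 * Returns the total height consumed by the stack. */
int layout_stack_vertically(Box *boxes, size_t count, int origin_x, int origin_y)
{
    int y = origin_y;
    for (size_t i = 0; i < count; i++) {
        boxes[i].x = origin_x;
        boxes[i].y = y;
        y += box_outer_height(&boxes[i]);
    }
    return y - origin_y;
}
```

The positions computed this way are the kind of information that ends up in the scene graph handed on to the renderer.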

HTML and SVG documents embed or reference large numbers of other file types, which the browser has to support as well. These multimedia files include images, audio, animations and video. Historically browser support for different multimedia elements has been a mess, often delegated to some other operating system component (e.g. Windows Media Player, ffmpeg, gstreamer). Each file type will have its own decoder and there may be Digital Rights Management involved as well (e.g. Widevine). In practice browsers tend to separate raster and vector images from video and audio. The former have been tightly integrated into HTML for decades whereas the latter two only became standardised in HTML 5 with the introduction of the audio and video tags. These allow audio and video to be embedded with customisable controls.

Finally we have the user interface, which is the bit that we most associate with the browser. This is a little ironic given I'd argue the depth, complexity and maintenance burden is weighted towards the other layers. But most people aren't really concerned with the rendering or JavaScript engine, they care about whether a particular user interface feature is supported or not.

And to be fair, the user interface doesn't just display an address bar. It also has to provide tabs, JavaScript pop-ups, permissions dialogues, Settings controls, password management functionality, bookmarks, history management and a whole lot more.

In the embedded browser space the user interface is intentionally minimal. The idea is that the browser gets embedded into some other application which provides the user interface elements needed over and above those provided by the rendered Web page itself. On Sailfish OS this minimal interface is provided by the WebView. The additional capabilities are managed through the WebView's Application Programming Interface. On Sailfish OS there's also a Qt-based user interface to the browser, which brings its own complexity. For simplicity I've grouped together the user interface of the browser and the application programming interface of the embeddable WebView in the "Interface" section in the diagram.

During my time upgrading the Sailfish Browser from ESR 78 to ESR 91 I routinely referred to it as a Gecko upgrade. The name Gecko covers the DOM, layout engine and rendering engine but typically doesn't include the JavaScript engine or user interface. The user interface is typically referred to by the name of the browser itself. For example Firefox uses the Gecko rendering engine, the SpiderMonkey JavaScript engine and the Firefox user interface. For Safari it's WebKit, Nitro and Safari. For Chrome it's Blink, V8 and Chrome. And so on.

Now that we've broken down the different parts of the browser we're equipped to delve into the history of Web browsers in more detail.

Libwww

We're going to start our history in 1990 when Tim Berners-Lee and Jean-François Groff, both working at CERN, created the HTTP protocol and HTML language that still define the Web today. It fascinates me that Tim Berners-Lee is so well-known as the inventor of the Web, but pioneers like Jean-François Groff and Nicola Pellow, who were there at the beginning, are scarcely recorded. But the Computer History Museum has documented a fascinating interview with Jean-François in which he gives an explanation of the very first Web engine.
 
"my main task during my days at CERN... was porting all the software libraries, I mean the software components that were on the NeXT system into a universal code library that was written in C, it's the 'libwww'. It didn't even have a name at the beginning, which is why in some history books, you see, 'Oh, libwww was released in November of 92.' No, it wasn't, you know? It was running since February '91, it just didn't have that name... We had the page rendering system, the parsing of HTML, and also all the URL mechanisms, history list, all that was abstracted into one software library as a package, as a toolset basically. And then in August 91, I think when we announced the World Wide Web, we also said 'You can use that toolset and build whatever you want with it'."

Right at the start the history is a bit messy. The WorldWideWeb browser was the graphical HTML browser (and editor) written by Tim Berners-Lee in Objective-C to run on NeXTSTEP. The first version was completed at the end of 1990 with the browser being later renamed to Nexus. The code was re-written in C by Tim and Jean-François and turned into the Libwww library to become the very first browser engine. This was then used by Nicola Pellow at CERN to write the Line Mode Browser which was text-based, usable over telnet and released in 1991.
 
A Gantt chart with eleven groups referencing different browser engines (Libwww, Trident, Navigator, Gecko, Servo, KHTML, WebKit, Blink, Presto, LibWeb and Netsurf). Horizontally years between 1990 and 2024 are shown, with bars to represent when the various browsers were supported.

This was not just the birth of the Web, but also the genesis of structures that now define what it means to be a Web browser. These same structures can be seen in how browsers are built today.

Libwww and the Line Mode Browser that was created from it continued to be developed right up until 2017. Although the library is written in C it applies an object-oriented approach. Structures have constructors and destructors with the me context variable often used in places where you might find this or self in an object-oriented language. Reading this code in the early noughties had a profound influence on me, shaping my own style of C coding to this day.
/*	Create a Context Object
**	-----------------------
*/
PRIVATE Context * Context_new (LineMode *lm, HTRequest *request, LMState state)
{
    Context * me;
    if ((me = (Context  *) HT_CALLOC(1, sizeof (Context))) == NULL)
        HT_OUTOFMEM("Context_new");
    me->state = state;
    me->request = request;
    me->lm = lm;
    HTRequest_setContext(request, (void *) me); 
    HTList_addObject(lm->active, (void *) me);
    return me;
}
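The Libwww extract above relies on the library's own macros and types, so here's a standalone sketch of the same style using only the C standard library. The HistoryEntry type and its functions are hypothetical names I've made up for illustration, not part of Libwww, but the shape is the same: a _new constructor, a _delete destructor and me standing in for this.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *title;   /* owned copy of the page title */
    int visits;    /* number of times the page has been visited */
} HistoryEntry;

/* Constructor: returns a new object, or NULL if out of memory. */
HistoryEntry *HistoryEntry_new(const char *title)
{
    HistoryEntry *me;
    size_t length = strlen(title) + 1;
    if ((me = (HistoryEntry *) calloc(1, sizeof(HistoryEntry))) == NULL)
        return NULL;
    if ((me->title = (char *) malloc(length)) == NULL) {
        free(me);
        return NULL;
    }
    memcpy(me->title, title, length);
    return me;
}

/* Methods take the object as their first parameter. */
void HistoryEntry_visit(HistoryEntry *me)
{
    assert(me != NULL);
    me->visits++;
}

int HistoryEntry_visits(const HistoryEntry *me)
{
    assert(me != NULL);
    return me->visits;
}

const char *HistoryEntry_title(const HistoryEntry *me)
{
    assert(me != NULL);
    return me->title;
}

/* Destructor: returns 1 if an object was freed, 0 if me was NULL. */
int HistoryEntry_delete(HistoryEntry *me)
{
    if (me == NULL)
        return 0;
    free(me->title);
    free(me);
    return 1;
}
```

Prefixing every function with the structure name stands in for a class namespace, which is the part of the style that stuck with me.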
Besides the Line Mode Browser, Libwww was used in countless other projects. My own port to RISC OS from 2004 is still available. I used it to extend a forensic analysis tool for use on the Web (that the University I worked for later patented).

More notably it was also used by the Amaya lightweight Web editor developed at INRIA and the Mosaic Browser developed at the NCSA. The Mosaic browser was popular in its day and the NCSA spun out a commercial entity in the form of Spyglass Mosaic which built on the NCSA Mosaic code. The company was set up to license the browser to other companies.

Trident

This Microsoft duly did. The browser engine of Internet Explorer, called Trident, was built on the Mosaic technology. The first version of Internet Explorer shipped without JavaScript support (the language hadn't been invented yet), but when it arrived in IE 3 in 1996 it was powered by Microsoft's JScript engine, the predecessor of the later Chakra engine.

The licensing agreement struck with Spyglass required Microsoft to pay a small monthly fee, with a portion of all non-Windows revenue from the browser also going to Spyglass.

As anyone who experienced the browser wars at that time will know, Microsoft proceeded to give Internet Explorer away for free with Windows. This ultimately earned them a lawsuit from Spyglass (settled out of court for $8 million) and an antitrust lawsuit from the US Government (eventually resulting in Microsoft having to change its approach to interoperability).

Internet Explorer remained as a core component of Windows until Windows 10, after which the company finally switched to offering Edge as the default browser. While Edge is built on Google's Blink engine, even that wasn't enough to dislodge Trident entirely. It remains to this day as the rendering engine powering Edge's compatibility mode. While it's not clear whether any of the original code can still be found in Edge (seems unlikely), a thirty-five year legacy is pretty good going.

Gecko

Internet Explorer's arch rival during the browser wars was Netscape Navigator, offered to consumers by countless dial-up Internet providers bundling it on free CDs alongside their own dial-up software and configurations. Netscape was the first browser to incorporate JavaScript support, which it did using the SpiderMonkey JavaScript interpreter in 1995.

Running up to 2000 Netscape completely re-wrote their browser engine. The result was what we now know as Gecko and which powers both Firefox and the Sailfish Browser. The purpose of the re-write was ostensibly to improve standards compliance and maintainability. But the highly abstracted code (arguably what has allowed the renderer to remain relevant to this day) resulted in poor performance. Netscape Navigator was a large program, incorporating not just a browser but also a full email client and Website editor. In an attempt to improve performance in 2002 the components were split up to form Firefox as a stand-alone Web browser and Thunderbird as a stand-alone email client. My recollection is that this was controversial at the time and didn't improve performance a great deal. But the separation stuck. Splitting email from Web and dropping editing entirely seems to have resonated with users.

Gecko, in the form of Firefox, has experienced ups and downs. Browser statistics are notoriously subjective, but Statcounter registers Firefox market share as having dropped to just over 3% as of January 2024, having peaked in January 2010 at just over 30%.

There's plenty more to say about Gecko's history, not least in relation to its use as an embeddable component, but let's put that aside for today and I'll return to it in a future post.

Gecko remains relevant today as the most popular alternative to the WebKit/Blink family of browsers. While technically open source, both WebKit and Blink are directed by large corporations with few concessions to open source development methodologies. Mozilla on the other hand is a not-for-profit foundation that embraces the spirit of open source as well as the letter. For many, Gecko is an important bulwark against a corporate-controlled browser monoculture.

An interesting twist in Gecko's development comes from its adoption of the Rust language. Rust was developed by Mozilla employee Graydon Hoare and officially adopted by Mozilla in 2009. Since then Mozilla has been gradually moving Gecko's internal components from C++ to Rust.

This led to the development of the Servo engine, written wholly in Rust as a Mozilla research project. While never intended to replace Gecko, elements of the Servo engine were integrated back into Gecko's WebRender rendering engine.

Servo is currently available as an engine with an intentionally bare-bones user interface. Mozilla divested itself of Servo in 2020, but development continues with the aim of specifying a WebView API during 2024 for use as an embeddable engine.

Presto

We're going to jump ahead a little in the diagram and turn our attention to the Opera browser. There are many unique and fascinating facets to Opera that it won't be possible to explore fully here, but it's still worth skimming the surface. Opera is unusual in that it was, for a long time, one of the few independent commercial browsers. When first released in 1995 it was shareware (requiring payment after a trial period). There was no JavaScript support (the language hadn't been invented yet) and at the outset the rendering engine wasn't named separately to the browser. This changed in 2000 with the introduction of the Elektra rendering engine and the Linear A JavaScript engine. In 2003 Opera switched to using what they claimed to be a new rendering engine, the internally developed Presto, alongside a new Linear B JavaScript engine. While the Presto name stuck, Opera's JavaScript engines have enjoyed periodic renaming: the Futhark JavaScript engine in 2008, followed by the Carakan JavaScript engine in 2010. Since the browser and all of these engines are closed source it's impossible to know to what extent they were really new technology as compared to an evolution of existing code.

Through much of its life Opera forged its own path. It was the first mainstream browser to introduce tabs. It integrated a (very good) email client long after Mozilla had disentangled Firefox and Thunderbird from Netscape Navigator. It even integrated its own Web server at one point. Opera also made a point of sticking to W3C standards while other browsers were still trying to lock users in to a proprietary Web.

Perhaps it's for this reason that there was much disappointment when Opera switched to using Blink and V8 in 2013, soon after Google announced it would fork WebKit. To find out how it got to this point we'll need to go back a bit again and look at the evolution of WebKit.

WebKit and Blink

At this point in time WebKit is the most popular engine for accessing the Web (in either its WebKit or Blink variants). Moreover it's also the go-to browser engine for use in embedded scenarios as we'll see shortly.

Initially part of the KDE project, WebKit provided the engine for Konqueror, the default browser for the KDE desktop environment. At that point the engine was referred to by the name KHTML, alongside the KJS JavaScript engine. It was picked up by Apple in 2001, apparently because of its small code footprint. Apple renamed KHTML and KJS to WebCore and JavaScriptCore respectively with the WebKit project encompassing both.

Contributions to WebKit came from both Apple and the KDE project, as well as from the Qt Project which offered the QtWebKit embeddable widget. Sailfish OS supported use of QtWebKit up until its deprecation in Sailfish OS 4.4 and removal in 4.5, the functionality being replaced by the Gecko WebView API.

In 2008 Google introduced its own Chrome browser also built on WebKit but using the new and Google-developed V8 JavaScript engine. Google's advertising for it emphasised speed (start times and JavaScript execution in particular). Chrome also had what was, at the time, an unusual sandboxing model with each tab executed as a separate process. This meant that crashes triggered by WebKit or V8 would only bring down a single tab, leaving other tabs and the browser intact.
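The principle behind process-per-tab isolation can be sketched in a few lines of C on a POSIX system. This is nothing like Chrome's actual implementation, which involves inter-process communication and proper sandboxing. It only shows that a crash in a child process leaves the parent running. The names tab_run_isolated, tab_task_ok and tab_task_crash are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { TAB_CRASHED = -1, TAB_ERROR = -2 };

typedef int (*TabTask)(void);

/* Run task in its own process, standing in for a tab.
 * Returns the task's exit code, TAB_CRASHED if the child process was
 * killed by a signal, or TAB_ERROR if the process couldn't be managed. */
int tab_run_isolated(TabTask task)
{
    int status = 0;
    pid_t pid;

    assert(task != NULL);
    pid = fork();
    if (pid < 0)
        return TAB_ERROR;
    if (pid == 0)
        _exit(task());   /* child: run the tab, then exit without cleanup */

    /* parent: wait for the child, retrying if interrupted by a signal */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return TAB_ERROR;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : TAB_CRASHED;
}

/* A well-behaved tab. */
int tab_task_ok(void)
{
    return 0;
}

/* A tab that crashes: abort() raises SIGABRT in the child only. */
int tab_task_crash(void)
{
    abort();
}
```

Calling tab_run_isolated(tab_task_crash) reports the crash back to the caller, which is then free to carry on running other tabs.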

Although built using many open source components, Chrome itself is made available under a proprietary licence. The Chromium project, also developed by Google, is a fully open source implementation of Chrome, but with the proprietary components removed.

From the outset Google had to make changes to WebKit to support its use in Chrome. Still it took another five years before Google officially forked WebKit in 2013, creating the Blink browser engine. Consequently Chrome now uses its own renderer and JavaScript engine combination: Blink and V8.

One of the attractive features of the Blink engine, also particularly relevant to Sailfish OS, is its embedding API which allows it to be used separately from Chrome (or Chromium) and embedded in independent applications. A common example of this usage can be found in the Electron framework, which uses Blink for rendering.

This embeddable design, which neatly separates the chrome from the engine, also makes Blink attractive for use by other browser developers. As noted earlier, Opera switched from Presto and Carakan to Blink and V8 for rendering and JavaScript respectively. Microsoft similarly chose Blink and V8 as the basis for its Edge browser in 2019.

Qt introduced the Qt WebEngine component, wrapping Blink and V8 to offer an embeddable browser, around the release of Qt 5.2 in 2013. This was intended to replace QtWebKit, which was ultimately removed in Qt 5.6. The closest KDE has to a default browser is Falkon, which uses the Qt WebEngine. This therefore completed a strange cycle, with KHTML having been started as part of KDE, forked by Apple, forked again by Google and then integrated back into KDE via Qt.

LibWeb

An unexpected entrant into the browser space was recently announced in the form of Ladybird. To understand why Ladybird exists, it helps to understand a little about Serenity OS, the operating system project it grew out of and which it has now eclipsed. According to the FAQ of Serenity OS the developers try to "maximize hackability, accountability, and fun(!) by implementing everything ourselves". And that includes the Web browser: the project developed its own renderer and JavaScript engine in the form of the imaginatively-named LibWeb and LibJS.

Recently the main Serenity OS developer, Andreas Kling, refocused his attention from the operating system to the Ladybird browser. Ladybird is built using the LibWeb and LibJS browser components of Serenity OS, which he now develops independently. This arguably represents the first new engine to be introduced with the aim of being a fully-fledged browser for over twenty years, making for a particularly interesting development.

NetSurf

Last but not least we have NetSurf which, like Ladybird, is a bit of an outlier. It too was originally developed for exclusive use on a non-mainstream operating system.

The first version of NetSurf was released in 2002. At that time it was developed exclusively for use on RISC OS, the operating system that powered the Acorn Archimedes (the first publicly available computer to use an ARM processor).

RISC OS is very different from most other operating systems available today. It makes no attempt to be Unix-like and has its own distinctive and cooperatively multitasking desktop environment. This heritage means that the browser is incredibly lightweight, with good CSS support but without viable JavaScript.

During the early days of development JavaScript support was considered out-of-scope for the browser. The reason for this is interesting: it wasn't for lack of a usable JavaScript interpreter, but because the browser lacked a standards-compliant DOM. It turns out JavaScript isn't especially useful without a standards-compliant way to access the elements of a Web page.

Despite the lack of JavaScript support NetSurf still managed to find a niche as a fast and lightweight browser, growing beyond RISC OS. As of today there are downloadable packages available for RISC OS, GTK (Linux), Haiku, AmigaOS, Atari and experimentally for Windows.

The Truth About Browsers

Browser history is a tangled Web. While writing this it quickly became clear that, when it comes to browsers, any generalised claim is likely to turn out false. The date a browser came into existence? Do you mean the date the project was first thought of? The first commit? The first release? An alpha release? A beta release? Release 1.0? Is a particular engine entirely new, the redevelopment of an old engine, or just a rename? To what extent does the code of one engine flow into another when they both share libraries? Every browser out there is like the Ship of Theseus at this point. When we talk about a browser engine are we talking about the renderer, the layout engine, the JavaScript engine, the chrome? Sometimes these things can be separated, other times they're intrinsically tied together. Is it good to have a single reference engine that all browsers use for a consistent experience across the Web, or should we be championing diversity as a way to prevent any single entity taking control? Do we even know how to calculate browser market share?

Even the question of what a browser is, presented in anything other than the most abstract terms, is likely to suffer exceptions.

What is clear is that browsers have become deeply integrated into our lives. Whether using a computer or smartphone, access to a browser has become a necessity. Over time they've continued to become more capable and more technically complex. Combined with their convoluted history, that makes them fascinating objects of study.
