flypig.co.uk

List items

Items from the current list are shown below.

Blog

5 Mar 2024 : Day 176 #
Just as I was explaining in my diary entry yesterday, the first thing I like to do after I wake up following a night of hard work — all performed by my laptop building the latest Gecko changes of course — is to scan the output for red. Red indicates errors.

When I peeked in this morning there was no red showing on the console. But my excitement was short-lived. The exception to the "errors are red" rule is when they come from the linker rather than the compiler and that's what seems to have happened here.

I'm not sure why the linker doesn't bother highlighting errors in red, but on close inspection they definitely are errors.

The fact it compiled without error is nevertheless exciting in itself, just not as exciting as actually having a binary to test. So what are these errors? As is the way with the linker, they're all "undefined reference" or symbol-related errors. This happens when the code calls something that may, for example, have a method signature but no method definition. That can be common with pure virtual functions that aren't subsequently overridden, say, but there are other reasons too. For example I might have added a method signature simply without adding in the method body. Here's a sample of the errors that came out:
403:37.88 toolkit/library/build/libxul.so
410:16.07 aarch64-meego-linux-gnu-ld: ../../../gfx/gl/GLScreenBuffer.o:
    in function `mozilla::gl::GLScreenBuffer::~GLScreenBuffer()':
410:16.07 ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:296:
    undefined reference to `mozilla::gl::SurfaceCaps::~SurfaceCaps()'
410:16.07 aarch64-meego-linux-gnu-ld: /jol403:37.88
    toolkit/library/build/libxul.so
410:16.07 aarch64-meego-linux-gnu-ld: ../../../gfx/gl/GLScreenBuffer.o:
    in function `mozilla::gl::GLScreenBuffer::~GLScreenBuffer()':
410:16.07 ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:296:
    undefined reference to `mozilla::gl::SurfaceCaps::~SurfaceCaps()'
410:16.07 aarch64-meego-linux-gnu-ld: ${PROJECT}/gecko-dev/gfx/gl/
    GLScreenBuffer.cpp:296: undefined reference to
    `mozilla::gl::SurfaceCaps::~SurfaceCaps()'
410:16.07 aarch64-meego-linux-gnu-ld: ../../../gfx/gl/Unified_cpp_gfx_gl0.o:
    in function `mozilla::gl::GLContext::GLContext(mozilla::gl::GLContextDesc
    const&, mozilla::gl::GLContext*, bool)':
410:16.07 ${PROJECT}/gecko-dev/gfx/gl/GLContext.cpp:290: undefined reference to
    `mozilla::gl::SurfaceCaps::SurfaceCaps()'
[...]
410:16.10 aarch64-meego-linux-gnu-ld: libxul.so: hidden symbol
    `_ZNK7mozilla2gl9GLContext15ChooseGLFormatsERKNS0_11SurfaceCapsE'
    isn't defined
410:16.10 aarch64-meego-linux-gnu-ld: final link failed: bad value
410:16.10 collect2: error: ld returned 1 exit status
Let me summarise all of the errors we have here and make things a bit clearer. The following are all the error locations followed by the names of the missing implementations.
GLScreenBuffer.cpp:296: SurfaceCaps::~SurfaceCaps()
GLContext.cpp:290: SurfaceCaps::SurfaceCaps()
SharedSurface.cpp:362: SurfaceCaps::SurfaceCaps(mozilla::gl::SurfaceCaps const&)
SharedSurface.cpp:361: GLContext::ChooseGLFormats(
    mozilla::gl::SurfaceCaps const&) const
SharedSurface.cpp:362: SurfaceCaps::SurfaceCaps()
SharedSurface.cpp:338: SurfaceCaps::SurfaceCaps()
SurfaceTypes.h:38: SurfaceCaps::SurfaceCaps()
SurfaceTypes.h:38: SurfaceCaps::operator=(mozilla::gl::SurfaceCaps const&)
SurfaceTypes.h:38: SurfaceCaps::~SurfaceCaps()
SharedSurface.cpp:350: SurfaceCaps::operator=(mozilla::gl::SurfaceCaps const&)
SharedSurface.cpp:338: SurfaceCaps::~SurfaceCaps()
SharedSurface.cpp:367: SurfaceCaps::~SurfaceCaps()
GLContext.cpp:296: SurfaceCaps::~SurfaceCaps()
SharedSurfaceGL.cpp:39: SurfaceCaps::SurfaceCaps()
SharedSurfaceGL.cpp:39: SurfaceCaps::~SurfaceCaps()
RefPtr.h:590: SharedSurfaceTextureClient::SharedSurfaceTextureClient(
    mozilla::layers::SharedSurfaceTextureData*, mozilla::layers::TextureFlags,
    mozilla::layers::LayersIPCChannel*)
So, what's going on with these? Let's take a look. First up we have line 296 of GLScreenBuffer.cpp. This line, it turns out, is where the GLScreenBuffer destructor implementation starts in the code:
GLScreenBuffer::~GLScreenBuffer() {
  mFactory = nullptr;
  mRead = nullptr;

  if (!mBack) return;

  // Detach mBack cleanly.
  mBack->Surf()->ProducerRelease();
}
Any instance of GLScreenBuffer will also hold an instance of SurfaceCaps as we can see in the header:
 public:
  const SurfaceCaps mCaps;
This is an actual instance of SurfaceCaps not just a pointer to it, so when the GLScreenBuffer instance is destroyed mCaps will be too, triggering a call to the SurfaceCaps destructor. If we look in the SurfaceTypes.h header file we can see that there is a destructor specified:
struct SurfaceCaps final {
[...]
  SurfaceCaps();
  SurfaceCaps(const SurfaceCaps& other);
  ~SurfaceCaps();
[...]
But nowhere is the body of the method defined. Compare that to the ESR 78 code where the SurfaceCaps class looks the same. The difference is that checking in the GLContect.cpp source we find these:
// These are defined out of line so that we don't need to include
// ISurfaceAllocator.h in SurfaceTypes.h.
SurfaceCaps::SurfaceCaps() = default;
SurfaceCaps::SurfaceCaps(const SurfaceCaps& other) = default;
SurfaceCaps& SurfaceCaps::operator=(const SurfaceCaps& other) = default;
SurfaceCaps::~SurfaceCaps() = default;
They're pretty much the simplest implementations you can get. But they are at least implementations. So with any luck adding these in to the ESR 91 code as well will solve quite a few of the linker errors. That's not everything though. After we take away the instances that these four methods deal with we're left with these two:
SharedSurface.cpp:361: GLContext::ChooseGLFormats(
    mozilla::gl::SurfaceCaps const&) const
RefPtr.h:590: SharedSurfaceTextureClient::SharedSurfaceTextureClient(
    mozilla::layers::SharedSurfaceTextureData*, mozilla::layers::TextureFlags,
    mozilla::layers::LayersIPCChannel*)
Let's check SharedSurface.cpp line 361 next. The line looks like this:
      mFormats(partialDesc.gl->ChooseGLFormats(caps)),
If we look in the GLContext class, sure enough we can see the method signature:
  // Only varies based on bpp16 and alpha.
  GLFormats ChooseGLFormats(const SurfaceCaps& caps) const;
We can see the same signature in the ESR 78 code. Once again though, the body of the method is defined in GLContext.cpp of ESR 78, whereas it's nowhere to be found in ESR 91. So I've copied the missing code over: 
GLFormats GLContext::ChooseGLFormats(const SurfaceCaps& caps) const { GLFormats formats; // If we're on ES2 hardware and we have an explicit request for 16 bits of // color or less OR we don't support full 8-bit color, return a 4444 or 565 // format. bool bpp16 = caps.bpp16; [...] 
Finally we have a reference to SharedSurfaceTextureClient. The reference to line 590 of RefPtr.h is unhelpful; what we really want to know is where the RefPtr is being used. A quick search suggests it's inside GLScreenBuffer where there are a couple of cases that look like this:
bool GLScreenBuffer::Swap(const gfx::IntSize& size) {
  RefPtr<layers::SharedSurfaceTextureClient> newBack =
      mFactory->NewTexClient(size);
[...]
There's no definition of SharedSurfaceTextureClient::SharedSurfaceTextureClient() in the ESR 91 code, but there is in the ESR 78 code, in the TextureClientSharedSurface.cpp source:
SharedSurfaceTextureClient::SharedSurfaceTextureClient(
    SharedSurfaceTextureData* aData, TextureFlags aFlags,
    LayersIPCChannel* aAllocator)
    : TextureClient(aData, aFlags, aAllocator) {
  mWorkaroundAnnoyingSharedSurfaceLifetimeIssues = true;
}
I've copied that code over as well. But my reading that's all of them. So this may mean we're ready to kick off a full rebuild. Before doing so, it's worth spending a bit of extra time to check that the changes pass our three partial builds already.
$ make -j1 -C obj-build-mer-qt-xr/gfx/gl/
[...]
$ make -j1 -C obj-build-mer-qt-xr/dom/canvas
[...]
$ make -j1 -C obj-build-mer-qt-xr/gfx/layers
[...]
The first two work fine, but the last generates a new error:
${PROJECT}/gecko-dev/gfx/layers/client/TextureClientSharedSurface.cpp:109:3:
    error: ‘mWorkaroundAnnoyingSharedSurfaceLifetimeIssues’
    was not declared in this scope
   mWorkaroundAnnoyingSharedSurfaceLifetimeIssues = true;
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This mWorkaroundAnnoyingSharedSurfaceLifetimeIssues variable is present in ESR 78, used to decide whether or not to deallocate the TextureData when the TextureClient is destroyed. Here's the logic it's part of:
void TextureClient::Destroy() {
[...]
  TextureData* data = mData;
  if (!mWorkaroundAnnoyingSharedSurfaceLifetimeIssues) {
    mData = nullptr;
  }

  if (data || actor) {
[...]
    params.workAroundSharedSurfaceOwnershipIssue =
        mWorkaroundAnnoyingSharedSurfaceOwnershipIssues;
    if (mWorkaroundAnnoyingSharedSurfaceLifetimeIssues) {
      params.data = nullptr;
    } else {
      params.data = data;
    }
[...]
    DeallocateTextureClient(params);
This logic has been removed in ESR 91 and, if I'm honest, I'm comfortable leaving it this way. Here's what the equivalent ERS 91 code looks like:
void TextureClient::Destroy() {
[...]
  TextureData* data = mData;
  mData = nullptr;

  if (data || actor) {
[...]
    DeallocateTextureClient(params);
  }
}
As you can see, there are no references to mWorkaroundAnnoyingSharedSurfaceLifetimeIssues, or anything like it, there at all. Consequently I'm just going to remove the references to it from our new SharedSurfaceTextureClient constructor as well:
SharedSurfaceTextureClient::SharedSurfaceTextureClient(
    SharedSurfaceTextureData* aData, TextureFlags aFlags,
    LayersIPCChannel* aAllocator)
    : TextureClient(aData, aFlags, aAllocator) {
}
Maybe this will cause problems later, but my guess is that if there are going to be problems, they'll be pretty straightforward to track down with the debugger, at which point we can refer back to the ESR 78 code to restore these checks.

The partial builds all now pass, so it's time to do a full rebuild. Here's the updated tally of changes once again to highlight today's progress:
$ git diff --numstat
145     16      gfx/gl/GLContext.cpp
73      8       gfx/gl/GLContext.h
11      0       gfx/gl/GLContextTypes.h
590     0       gfx/gl/GLScreenBuffer.cpp
171     1       gfx/gl/GLScreenBuffer.h
305     5       gfx/gl/SharedSurface.cpp
132     4       gfx/gl/SharedSurface.h
16      10      gfx/gl/SharedSurfaceEGL.cpp
13      3       gfx/gl/SharedSurfaceEGL.h
81      2       gfx/gl/SharedSurfaceGL.cpp
61      0       gfx/gl/SharedSurfaceGL.h
60      0       gfx/gl/SurfaceTypes.h
27      0       gfx/layers/client/TextureClientSharedSurface.cpp
25      0       gfx/layers/client/TextureClientSharedSurface.h
$ git diff --shortstat
 14 files changed, 1710 insertions(+), 49 deletions(-)
So off the build goes again:
$ sfdk build -d --with git_workaround
NOTICE: Appending changelog entries to the RPM SPEC file…
Setting version: 91.9.1+git1+sailfishos.esr91.20240302180401.
    2f1f19ac7d73+gecko.dev.5292b747b036
Directory walk started
[...]
This will likely take until the morning, at which point I'll be eagerly checking for errors once again.

[...]

An early and unusually short build run (just 5 hours 12 minutes) meant that I've got a second bit of the cherry! The build finished before the end of the day. It's not a successful build unfortunately, but by discovering this now rather than in the morning I'm able to claim an entire day back. So I'm very happy about that.

Here are the linker errors. Well, actually, I think it's just one error that's spread over many lines:
308:11.87 toolkit/library/build/libxul.so
311:51.74 aarch64-meego-linux-gnu-ld: ../../../gfx/layers/
    Unified_cpp_gfx_layers6.o: in function `mozilla::layers::
    SharedSurfaceTextureClient::SharedSurfaceTextureClient
    (mozilla::layers::SharedSurfaceTextureData*, mozilla::layers::TextureFlags,
    mozilla::layers::LayersIPCChannel*)':
311:51.74 ${PROJECT}/gecko-dev/gfx/layers/client/
    TextureClientSharedSurface.cpp:108: undefined reference to
    `vtable for mozilla::layers::SharedSurfaceTextureClient'
311:51.74 aarch64-meego-linux-gnu-ld: ${PROJECT}/gecko-dev/gfx/layers/client/
    TextureClientSharedSurface.cpp:108: undefined reference to
    `vtable for mozilla::layers::SharedSurfaceTextureClient'
311:51.74 aarch64-meego-linux-gnu-ld: libxul.so: hidden symbol
    `_ZTVN7mozilla6layers26SharedSurfaceTextureClientE' isn't defined
311:51.74 aarch64-meego-linux-gnu-ld: final link failed: bad value
311:51.75 collect2: error: ld returned 1 exit status
The essence of error is the following:
TextureClientSharedSurface.cpp:108: undefined reference to
    `vtable for mozilla::layers::SharedSurfaceTextureClient'
I'm going to hold of explaining what's going on here until tomorrow. This post is already rather long and I don't have the energy to go in to the details tonight. But I have added in a fix.

With these latest changes the partial builds still all pass. So it's time to set the full build off again. But this also means it's the end of my gecko development for the day. More tomorrow when we'll see how things have gone.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.

Comments

Uncover Disqus comments