flypig.co.uk

Gecko

2 Jul 2024 : Day 276
Last night the new packages built so that I now have a version with working WebGL and partially working WebView. Crucially, this version splits off the functionality needed for working WebGL from the WebView functionality, meaning I should now be able to reintroduce the WebView functionality without it affecting the WebGL rendering.

So the next step will be to reintroduce as many of the WebView changes as are needed to get WebView working, but preferably no more than that.

Before getting on to that I have one other issue I want to address. Although the browser is now working, along with the WebGL rendering, there are also now spurious crashes happening, presumably as a result of some of these changes. The backtrace associated with the crash is basically useless, even with a fully installed and correct set of debug symbols:
Thread 8 "GeckoWorkerThre" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 30258]
0x0000007fe5ee29a0 in ?? ()
(gdb) bt
#0  0x0000007fe5ee29a0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 
My hunch is that this relates to the changes to SharedSurface_Basic. In particular, although I changed the code so that a different constructor is used to create the SharedSurface_Basic objects depending on the type of rendering, the process of destruction, which used to rely on just the default destructor, now has bespoke functionality:
SharedSurface_Basic::~SharedSurface_Basic() {
  if (!mDesc.gl || !mDesc.gl->MakeCurrent()) return;

  if (mFB) mDesc.gl->fDeleteFramebuffers(1, &mFB);

  if (mOwnsTex) mDesc.gl->fDeleteTextures(1, &mTex);
}
This code needs to be there for the WebView to work, but it was never there in the WebGL version. My guess is that as surfaces are destroyed periodically, they're calling some of this code which is then triggering a crash. In particular, I notice that mFB isn't being set in the WebGL constructor, which means it could have a value other than zero. If this were the case, it could be causing problems when this uninitialised value gets passed to the OpenGL library, which could in turn explain why the backtrace is so poor.
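
To make the suspected failure mode concrete, here's a minimal sketch of the pattern: one destructor shared by two construction paths. This isn't the Gecko code. FakeGL, Surface and demo are hypothetical stand-ins I've made up for illustration, and the sketch assumes the only thing that matters is which values reach the delete calls. It uses default member initialisers so that every constructor starts from a known state.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a GL context: it only records what it is asked
// to delete, so the effect of the destructor can be inspected afterwards.
struct FakeGL {
  std::vector<uint32_t> deletedFramebuffers;
  std::vector<uint32_t> deletedTextures;

  bool MakeCurrent() { return true; }
  void fDeleteFramebuffers(int n, const uint32_t* names) {
    for (int i = 0; i < n; ++i) deletedFramebuffers.push_back(names[i]);
  }
  void fDeleteTextures(int n, const uint32_t* names) {
    for (int i = 0; i < n; ++i) deletedTextures.push_back(names[i]);
  }
};

// Simplified model of a surface with two constructors and one destructor.
class Surface {
 public:
  // "WebGL-style" path: the surface owns no framebuffer or texture itself.
  explicit Surface(FakeGL* gl) : mGL(gl) {}

  // "WebView-style" path: the surface takes ownership of both names.
  Surface(FakeGL* gl, uint32_t fb, uint32_t tex)
      : mGL(gl), mTex(tex), mOwnsTex(true), mFB(fb) {}

  ~Surface() {
    if (!mGL || !mGL->MakeCurrent()) return;
    // Without the "= 0" initialiser below, mFB would be indeterminate on the
    // WebGL-style path and this check could pass with a garbage value.
    if (mFB) mGL->fDeleteFramebuffers(1, &mFB);
    if (mOwnsTex) mGL->fDeleteTextures(1, &mTex);
  }

 private:
  FakeGL* mGL = nullptr;
  uint32_t mTex = 0;
  bool mOwnsTex = false;
  uint32_t mFB = 0;  // known state regardless of which constructor runs
};

// Destroys one surface of each kind and checks what reached the fake GL layer.
void demo() {
  FakeGL gl;
  { Surface webgl(&gl); }
  assert(gl.deletedFramebuffers.empty() && gl.deletedTextures.empty());
  { Surface webview(&gl, 7, 9); }
  assert(gl.deletedFramebuffers.size() == 1 && gl.deletedTextures.size() == 1);
}
```

Calling demo() from a main function exercises both paths. The point of the default initialisers is that a newly added constructor can't forget a member, which is exactly the kind of slip I suspect happened here.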

As an attempt to fix this I've added in a line to initialise mFB to zero. Combined with mOwnsTex already being set to false, this should be enough to prevent the new deletion calls in the destructor from executing.
 SharedSurface_Basic::SharedSurface_Basic(const SharedSurfaceDesc& desc,
                                          UniquePtr<MozFramebuffer>&& fb)
     : SharedSurface(desc, std::move(fb)),
       mTex(0),
       mOwnsTex(false),
+      mFB(0)
 {
 }
Great, that's done the trick! No more random crashes. So, now I can move on to the larger task of getting the WebView to work. It's a bigger task, but I'm also hoping it'll be relatively formulaic, given that I already have a working version to compare against. I just need to gradually reintroduce the code from that version until it all clicks into place and works. The plan is that, with the WebGL rendering now separated from it, what I'll end up with is a working WebView and working WebGL.

So I'm looking through the code that's been amended and it looks like there are some relatively self-contained changes related to destruction of TextureClient instances. By reverting the changes from the following files, I should get a slice of the changes back with minimal fuss:
  1. gfx/layers/CompositorTypes.h
  2. gfx/layers/client/TextureClient.h
  3. gfx/layers/client/TextureClient.cpp
  4. gfx/layers/client/TextureClientSharedSurface.cpp
  5. gfx/layers/ipc/CompositorVsyncScheduler.cpp
I'm not sure if this will actually have any effect though, so I'm going to save the diff to a file before reverting the changes. That way I can restore them if needed.
$ git diff gfx/layers/CompositorTypes.h \
    gfx/layers/client/TextureClient.h \
    gfx/layers/client/TextureClient.cpp \
    gfx/layers/client/TextureClientSharedSurface.cpp \
    gfx/layers/ipc/CompositorVsyncScheduler.cpp \
    > destroy-texture-changes.diff
$ git checkout gfx/layers/CompositorTypes.h \
    gfx/layers/client/TextureClient.h \
    gfx/layers/client/TextureClient.cpp \
    gfx/layers/client/TextureClientSharedSurface.cpp \
    gfx/layers/ipc/CompositorVsyncScheduler.cpp
I'm a little surprised to discover that, after resetting the files to revert the changes, the code still builds just fine without needing any further adjustments. But when I test out the new library I don't see any obvious positive (or indeed negative) effects. I was hoping for something different. I guess this isn't the critical difference that I'm looking for.

So I'm going to restore the changes and try something else tomorrow. I feel like getting the WebGL not to crash was a good step forwards today. It's a shame the TextureClient changes didn't work, but that's still something to cross off my list of changes to try. So that's progress.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
