flypig.co.uk

Personal Blog

View the blog index.

RSS feed Click the icon for the blog RSS feed.

Blog

5 most recent items

9 Jun 2024 : Day 258 #
As I mentioned a couple of days back, I'm taking part in a hackathon for my work during the next week, so I'm not planning to make any posts for the next five days. This coming Saturday I'll continue right back off where I leave off at the end of today though.

For today, I'm looking further into why WebGL might not be doing what it's supposed to be doing. So far I've found that there are two methods in my commit diff that get hit when executing the broken code. These are:
  1. SurfaceFactory::SurfaceFactory()
  2. TextureImageEGL::TextureImageEGL()

Looking at the code and observing the execution using the debugger I can see that the stack trace for the second of these includes TileGenFunc(), which calls TileGenFuncEGL() which then calls TextureImageEGL::TextureImageEGL(). And the flow is definitely being affected by what happens in TileGenFunc().

Here's the diff between the two versions:
 static already_AddRefed<TextureImage> TileGenFunc(
     GLContext* gl, const IntSize& aSize, TextureImage::ContentType 
    aContentType,
     TextureImage::Flags aFlags, TextureImage::ImageFormat aImageFormat) {
-  return CreateBasicTextureImage(gl, aSize, aContentType,
-                                 LOCAL_GL_CLAMP_TO_EDGE, aFlags);
+  switch (gl->GetContextType()) {
+    case GLContextType::EGL:
+      return TileGenFuncEGL(gl, aSize, aContentType, aFlags, aImageFormat);
+    default:
+      return CreateBasicTextureImage(gl, aSize, aContentType,
+                                     LOCAL_GL_CLAMP_TO_EDGE, aFlags);
+  }
 }
As we can see, the original version always calls CreateBasicTextureImage() in the original version, whereas in the new version there's a switch to contend with. That means that in the new version, rather than doing the same thing as the original it will instead on occasion call TileGenFuncEGL(). So this is clearly a candidate for where things are going wrong.

To see whether this is having an important effect I've amended the method so that it has the same approach as previously, by changing it to this:
$ git diff
diff --git a/gfx/gl/GLTextureImage.cpp b/gfx/gl/GLTextureImage.cpp
index c2def2dedb18..8152128bdc9c 100644
--- a/gfx/gl/GLTextureImage.cpp
+++ b/gfx/gl/GLTextureImage.cpp
@@ -47,6 +47,9 @@ already_AddRefed<TextureImage> CreateTextureImage(
 static already_AddRefed<TextureImage> TileGenFunc(
     GLContext* gl, const IntSize& aSize, TextureImage::ContentType 
    aContentType,
     TextureImage::Flags aFlags, TextureImage::ImageFormat aImageFormat) {
+  return CreateBasicTextureImage(gl, aSize, aContentType,
+                                 LOCAL_GL_CLAMP_TO_EDGE, aFlags);
+
   switch (gl->GetContextType()) {
     case GLContextType::EGL:
       return TileGenFuncEGL(gl, aSize, aContentType, aFlags, aImageFormat);
Now when this gets called, it will immediately call CreateBasicTextureImage() rather than going into the switch conditional. This isn't a long term solution, it's just a way for me to test things out.

Unfortunately though, rebuilding and executing this change gives me the same result as before, in that the WebGL is still not showing signs of life.

So it's back to the code again. There's also another important change in that in some cases in GLScreenBuffer I've switched use of mSwapChain for mScreen instead. The two have quite different characteristics, so I should try switching this back as well, for example like this:
 bool GLContext::ResizeScreenBuffer(const gfx::IntSize& size) {
   if (!IsOffscreenSizeAllowed(size)) return false;
 
-  return mScreen->Resize(size);
+  return mSwapChain->Resize(size);
 }
Now when I build and try this something different happens. Now the app crashes when it tries to render the WebGL. That's not a bad thing, because the debugger will tell me where the crash is taking place.

I'll need to investigate this further. Not today though as I'm out of time, and I won't be picking this up tomorrow either. Instead there will be the five-day pause I mentioned at the top of this post, but I'll be back to continue this where I've left it this coming Saturday.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
8 Jun 2024 : Day 257 #
It's an early start for me today as I'm travelling to London and back. But I had difficulty sleeping last night and am up even earlier than I usually would be, so I'm pleased to discover that the build I kicked off last night has already completed.

This means I now have two sets of RPM packages. One set that represents the last commit of ESR 91 when WebGL was working and a second set that adds a commit on top of this, but which breaks WebGL.

Here's a list of the packages, where the sailfishos.esr91 represents the most recent changes that caused the breakage, while the temp branch has these changes reverted.
$ ls webgl-broken/ webgl-working/
webgl-broken/:
xulrunner-qt5-91.9.1+git1+sailfishos.esr91.
  20240604225626.a84dc7d4765d+gecko.dev.7437a9d17284-1.aarch64.rpm
xulrunner-qt5-debuginfo-91.9.1+git1+sailfishos.esr91.
  20240604225626.a84dc7d4765d+gecko.dev.7437a9d17284-1.aarch64.rpm
xulrunner-qt5-debugsource-91.9.1+git1+sailfishos.esr91.
  20240604225626.a84dc7d4765d+gecko.dev.7437a9d17284-1.aarch64.rpm
xulrunner-qt5-devel-91.9.1+git1+sailfishos.esr91.
  20240604225626.a84dc7d4765d+gecko.dev.7437a9d17284-1.aarch64.rpm
xulrunner-qt5-misc-91.9.1+git1+sailfishos.esr91.
  20240604225626.a84dc7d4765d+gecko.dev.7437a9d17284-1.aarch64.rpm

webgl-working/:
xulrunner-qt5-91.9.1+git1+temp.
  20240212214917.9f64ce35a187-1.aarch64.rpm
xulrunner-qt5-debuginfo-91.9.1+git1+temp.
  20240212214917.9f64ce35a187-1.aarch64.rpm
xulrunner-qt5-debugsource-91.9.1+git1+temp.
  20240212214917.9f64ce35a187-1.aarch64.rpm
xulrunner-qt5-devel-91.9.1+git1+temp.
  20240212214917.9f64ce35a187-1.aarch64.rpm
xulrunner-qt5-misc-91.9.1+git1+temp.
  20240212214917.9f64ce35a187-1.aarch64.rpm
While the newly built RPMs transfer over to my phone, let me summarise what I'm expecting.

Previously the broken RPMs were crashing on a call to ToSurfaceDescriptor(). The reason for the crash is that I'd added an explicit request for the app to crash if this was ever called:
Maybe<layers::SurfaceDescriptor> SharedSurface_Basic::ToSurfaceDescriptor() {
  MOZ_CRASH(&quot;GFX: ToSurfaceDescriptor&quot;);
  return Nothing();
}
I added it for debugging purposes while working on the WebView changes. These latest packages have this MOZ_CRASH statement removed, so I'm no longer expecting a crash to happen here. However, I do expect it to crash nevertheless, just in some other location. Removing the MOZ_CRASH would be too simple a fix for it to actually work as a solution!

So I'm expecting to get a new backtrace from the crash. The question will be: what is this crash and how does it compare with the execution of the working version. As soon as I have this backtrace it will hopefully be clear the path the execution took to get there. Then I'll reinstall the working version and compare against the equivalent path there to establish what's changed.

This is the plan, at least.

They packages have copied over, so let's get to work.
$ sailfish-browser https://shadertoy.com
[...]
Created LOG for EmbedLiteLayerManager
JavaScript warning: https://www.shadertoy.com/, line 2388: WebGL warning: 
    drawArraysInstanced: Tex image TEXTURE_2D level 0 is incurring lazy 
    initialization.
[...]
Well, that's interesting. There is now no crash, so that failure was entirely self-induced. However the WebGL is broken. It's just displaying an empty canvas where the WebGL should be rendered. This makes things somewhat harder to debug, because now there's no obvious please to start from.

So my new plan is to debug the same piece of code that I debugged yesterday on the working version. Let's see if anything has changed.
(gdb) info break
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   0x0000007ff29b0cfc in mozilla::layers::
    ShareableCanvasRenderer::UpdateCompositableClient() 
                                                   at gfx/layers/
    ShareableCanvasRenderer.cpp:191
        breakpoint already hit 1 time
(gdb) c
[...]

Thread 8 &quot;GeckoWorkerThre&quot; hit Breakpoint 2, mozilla::layers::
    ShareableCanvasRenderer::UpdateCompositableClient (this=0x7fc963c520)
    at gfx/layers/ShareableCanvasRenderer.cpp:192
192         FirePreTransactionCallback();
(gdb) n
195         auto tc = fnGetExistingTc();
(gdb) n
196         if (!tc) {
(gdb) p tc
$1 = {mRawPtr = 0x0}
(gdb) n
198           tc = fnMakeTcFromSnapshot();
(gdb) n
200         if (tc != mFrontBufferFromDesc) {
(gdb) p tc
$2 = {mRawPtr = 0x7fc8ceb370}
(gdb) p tc.mRawPtr
$3 = (mozilla::layers::TextureClient *) 0x7fc8ceb370
(gdb) 
This matches the flow in the working version, so it seems this isn't where the problem is. I'm going to have to look further afield.

To help with this search I've attached breakpoints to the majority of the new functions that have been added or seen significant changes in the latest commit. Here they all are (there are quite a few):
(gdb) break GLScreenBuffer::Create
Breakpoint 4 at 0x7ff28a7d94: file gfx/gl/GLScreenBuffer.cpp, line 171.
(gdb) break InitOffscreen
Breakpoint 5 at 0x7ff28d2500: file gfx/gl/GLContext.cpp, line 2345.
(gdb) break GLContext::CreateScreenBuffer
Breakpoint 6 at 0x7ff28d2428: file gfx/gl/GLContext.cpp, line 2073.
(gdb) b WaylandGLSurface::WaylandGLSurface
Breakpoint 7 at 0x7ff28c084c: file gfx/gl/GLContextProviderEGL.cpp, line 954.
(gdb) b GLContextProviderEGL::CreateOffscreen
Breakpoint 8 at 0x7ff28d2610: file gfx/gl/GLContextProviderEGL.cpp, line 1451.
(gdb) b ReadBuffer::Create
Breakpoint 9 at 0x7ff28a6ea8: file gfx/gl/GLScreenBuffer.cpp, line 358.
(gdb) b SurfaceFactory::SurfaceFactory
Breakpoint 10 at 0x7ff28acdc4: file gfx/gl/SharedSurface.cpp, line 167.
(gdb) b SharedSurface_EGLImage::SharedSurface_EGLImage
Breakpoint 11 at 0x7ff28d363c: file gfx/gl/SharedSurfaceEGL.cpp, line 95.
(gdb) b TextureImageEGL::TextureImageEGL
Breakpoint 12 at 0x7ff28d3e80: file gfx/gl/TextureImageEGL.cpp, line 46.
(gdb) r
[...]
If any of these breakpoints hit, that means they'd be good candidates for comparing against the working version. If they're new (rather just heavily amended) methods then that'll be even more relevant, because that'll indicate a wholesale change of flow. In that case I'll need to work backwards through the call stack to see where — and why — the divergence happened.

Contrariwise if they're not hit then they're not part of the execution flow and it should be safe for me to ignore them in my investigation.

When I now debug the program there are three breakpoints that hit; or rather two breakpoints are hit a total of three times:
Thread 8 &quot;GeckoWorkerThre&quot; hit Breakpoint 10, mozilla::gl::
    SurfaceFactory::SurfaceFactory (this=0x7fc95eb220, partialDesc=..., 
    allocator=..., 
    flags=@0x7fdf29256c: mozilla::layers::TextureFlags::NO_FLAGS)
    at gfx/gl/SharedSurface.cpp:167
167     SurfaceFactory::SurfaceFactory(const PartialSharedSurfaceDesc& 
    partialDesc,

Thread 37 &quot;Compositor&quot; hit Breakpoint 12, mozilla::gl::
    TextureImageEGL::TextureImageEGL (this=0x7ed81ab2d0, aTexture=20, 
    aSize=..., aWrapMode=33071, 
    aContentType=gfxContentType::COLOR_ALPHA, aContext=0x7ed81a2780, 
    aFlags=mozilla::gl::TextureImage::OriginBottomLeft, 
    aTextureState=mozilla::gl::TextureImage::Created, aImageFormat=mozilla::gfx:
    :SurfaceFormat::B8G8R8A8)
    at gfx/gl/TextureImageEGL.cpp:46
46      TextureImageEGL::TextureImageEGL(GLuint aTexture, const gfx::IntSize& 
    aSize,

Thread 37 &quot;Compositor&quot; hit Breakpoint 12, mozilla::gl::
    TextureImageEGL::TextureImageEGL (this=0x7ed825ed80, aTexture=21, 
    aSize=..., aWrapMode=33071, 
    aContentType=gfxContentType::COLOR_ALPHA, aContext=0x7ed81a2780, 
    aFlags=mozilla::gl::TextureImage::OriginBottomLeft, 
    aTextureState=mozilla::gl::TextureImage::Created, aImageFormat=mozilla::gfx:
    :SurfaceFormat::B8G8R8A8)
    at gfx/gl/TextureImageEGL.cpp:46
46      TextureImageEGL::TextureImageEGL(GLuint aTexture, const gfx::IntSize& 
    aSize,
Let's get some backtraces from those. These are really long backtraces and I do apologise for that. I want to keep copies here for future reference, but there's no need to look at them in any detail. Certainly not right now anyway. Here's the first one:
Thread 8 &quot;GeckoWorkerThre&quot; hit Breakpoint 10, mozilla::gl::
    SurfaceFactory::SurfaceFactory (this=0x7fc95e9c10, partialDesc=..., 
    allocator=...,
    flags=@0x7fdf29256c: mozilla::layers::TextureFlags::NO_FLAGS)
    at gfx/gl/SharedSurface.cpp:167
167     SurfaceFactory::SurfaceFactory(const PartialSharedSurfaceDesc& 
    partialDesc,
(gdb) bt
#0  mozilla::gl::SurfaceFactory::SurfaceFactory (this=0x7fc95e9c10, 
    partialDesc=..., allocator=...,
    flags=@0x7fdf29256c: mozilla::layers::TextureFlags::NO_FLAGS)
    at gfx/gl/SharedSurface.cpp:167
#1  0x0000007ff28d3950 in mozilla::gl::SurfaceFactory_Basic::
    SurfaceFactory_Basic (this=0x7fc95e9c10, gl=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:113
#2  0x0000007ff369d0d4 in mozilla::MakeUnique<mozilla::gl::
    SurfaceFactory_Basic, mozilla::gl::GLContext&> ()
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/cxxalloc.h:33
#3  mozilla::WebGLContext::Present (this=this@entry=0x7fc93ab2a0, 
    xrFb=<optimized out>,
    consumerType=consumerType@entry=mozilla::layers::TextureType::Unknown, 
    webvr=webvr@entry=false)
    at dom/canvas/WebGLContext.cpp:929  
#4  0x0000007ff366511c in mozilla::HostWebGLContext::Present (webvr=false, 
    t=mozilla::layers::TextureType::Unknown, xrFb=<optimized out>,
    this=<optimized out>) at ${PROJECT}/obj-build-mer-qt-xr/dist/include/
    mozilla/RefPtr.h:280
#5  mozilla::ClientWebGLContext::Run<void (mozilla::HostWebGLContext::*)(
    unsigned long, mozilla::layers::TextureType, bool) const, &(mozilla::
    HostWebGLContext::Present(unsigned long, mozilla::layers::TextureType, 
    bool) const), unsigned long, mozilla::layers::TextureType const&, bool 
    const&> (
    this=<optimized out>, args#0=@0x7fdf2926c0: 0, args#1=@0x7fdf2926bf: 
    mozilla::layers::TextureType::Unknown, args#2=@0x7fdf2926be: false)
    at dom/canvas/ClientWebGLContext.cpp:313
#6  0x0000007ff3665284 in mozilla::ClientWebGLContext::Present (
    this=this@entry=0x7f28004210, xrFb=xrFb@entry=0x0, type=<optimized out>,
    webvr=<optimized out>, webvr@entry=false)
    at dom/canvas/ClientWebGLContext.cpp:363
#7  0x0000007ff3690a94 in mozilla::ClientWebGLContext::OnBeforePaintTransaction 
    (this=0x7f28004210)
    at dom/canvas/ClientWebGLContext.cpp:345
#8  0x0000007ff28fff30 in mozilla::layers::CanvasRenderer::
    FirePreTransactionCallback (this=this@entry=0x7fc93fb900)
    at gfx/layers/CanvasRenderer.cpp:75 
#9  0x0000007ff29b0d04 in mozilla::layers::ShareableCanvasRenderer::
    UpdateCompositableClient (this=0x7fc93fb900)
    at gfx/layers/ShareableCanvasRenderer.cpp:192
#10 0x0000007ff29f08a0 in mozilla::layers::ClientCanvasLayer::RenderLayer (
    this=0x7fc95fc380)
    at gfx/layers/client/ClientCanvasLayer.cpp:25
#11 0x0000007ff29ef9c0 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#12 0x0000007ff29ffd08 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc92fc450)
    at gfx/layers/Layers.h:1051
#13 0x0000007ff29ef9c0 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#14 0x0000007ff29ffd08 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc934a230)
    at gfx/layers/Layers.h:1051
#15 0x0000007ff29ef9c0 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#16 0x0000007ff29ffd08 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc8d123e0)
    at gfx/layers/Layers.h:1051
#17 0x0000007ff2a069ec in mozilla::layers::ClientLayerManager::
    EndTransactionInternal (this=this@entry=0x7fc8a5ea90, 
    aCallback=aCallback@entry=
    0x7ff46a31ec <mozilla::FrameLayerBuilder::DrawPaintedLayer(mozilla::layers::
    PaintedLayer*, gfxContext*, mozilla::gfx::IntRegionTyped<mozilla::gfx::
    UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::
    UnknownUnits> const&, mozilla::layers::DrawRegionClip, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, void*)>, 
    aCallbackData=aCallbackData@entry=0x7fdf293268)
    at gfx/layers/client/ClientLayerManager.cpp:341
#18 0x0000007ff2a118ec in mozilla::layers::ClientLayerManager::EndTransaction (
    this=0x7fc8a5ea90,
    aCallback=0x7ff46a31ec <mozilla::FrameLayerBuilder::DrawPaintedLayer(
    mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::
    DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> 
    const&, void*)>, aCallbackData=0x7fdf293268, aFlags=mozilla::layers::
    LayerManager::END_DEFAULT)
    at gfx/layers/client/ClientLayerManager.cpp:397
#19 0x0000007ff46a060c in nsDisplayList::PaintRoot (
    this=this@entry=0x7fdf295078, aBuilder=aBuilder@entry=0x7fdf293268, 
    aCtx=aCtx@entry=0x0,
    aFlags=aFlags@entry=13, aDisplayListBuildTime=...)
    at layout/painting/nsDisplayList.cpp:2622
#20 0x0000007ff442c968 in nsLayoutUtils::PaintFrame (
    aRenderingContext=aRenderingContext@entry=0x0, 
    aFrame=aFrame@entry=0x7fc9280d10, aDirtyRegion=...,
    aBackstop=aBackstop@entry=4294967295, 
    aBuilderMode=aBuilderMode@entry=nsDisplayListBuilderMode::Painting,
    aFlags=aFlags@entry=(nsLayoutUtils::PaintFrameFlags::WidgetLayers | 
    nsLayoutUtils::PaintFrameFlags::ExistingTransaction | nsLayoutUtils::
    PaintFrameFlags::NoComposite)) at ${PROJECT}/obj-build-mer-qt-xr/dist/
    include/mozilla/MaybeStorageBase.h:80
#21 0x0000007ff43b705c in mozilla::PresShell::Paint (
    this=this@entry=0x7fc921c9a0, aViewToPaint=aViewToPaint@entry=0x7fc8563cb0, 
    aDirtyRegion=...,
    aFlags=aFlags@entry=mozilla::PaintFlags::PaintLayers)
    at layout/base/PresShell.cpp:6400
#22 0x0000007ff41eef2c in nsViewManager::ProcessPendingUpdatesPaint (
    this=this@entry=0x7fc8563c70, aWidget=aWidget@entry=0x7fc90d0760)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/RectAbsolute.h:43
[...]
#55 0x0000007fefbab89c in ?? () from /lib64/libc.so.6
(gdb)
Here's the second one:
Thread 37 &quot;Compositor&quot; hit Breakpoint 12, mozilla::gl::
    TextureImageEGL::TextureImageEGL (this=0x7ee01faa90, aTexture=21, 
    aSize=..., aWrapMode=33071,
    aContentType=gfxContentType::COLOR_ALPHA, aContext=0x7ee01a28a0, 
    aFlags=mozilla::gl::TextureImage::OriginBottomLeft,
    aTextureState=mozilla::gl::TextureImage::Created, aImageFormat=mozilla::gfx:
    :SurfaceFormat::B8G8R8A8)
    at gfx/gl/TextureImageEGL.cpp:46
46      TextureImageEGL::TextureImageEGL(GLuint aTexture, const gfx::IntSize& 
    aSize,
(gdb) bt
#0  mozilla::gl::TextureImageEGL::TextureImageEGL (this=0x7ee01faa90, 
    aTexture=21, aSize=..., aWrapMode=33071, aContentType=gfxContentType::
    COLOR_ALPHA,
    aContext=0x7ee01a28a0, aFlags=mozilla::gl::TextureImage::OriginBottomLeft, 
    aTextureState=mozilla::gl::TextureImage::Created,
    aImageFormat=mozilla::gfx::SurfaceFormat::B8G8R8A8)
    at gfx/gl/TextureImageEGL.cpp:46
#1  0x0000007ff28d42e0 in mozilla::gl::TileGenFuncEGL (
    gl=gl@entry=0x7ee01a28a0, aSize=..., 
    aContentType=aContentType@entry=gfxContentType::COLOR_ALPHA,
    aFlags=aFlags@entry=mozilla::gl::TextureImage::OriginBottomLeft, 
    aImageFormat=aImageFormat@entry=mozilla::gfx::SurfaceFormat::B8G8R8A8)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/cxxalloc.h:33
#2  0x0000007ff28b7ec8 in mozilla::gl::TileGenFunc (aImageFormat=mozilla::gfx::
    SurfaceFormat::B8G8R8A8,
    aFlags=mozilla::gl::TextureImage::OriginBottomLeft, 
    aContentType=gfxContentType::COLOR_ALPHA, aSize=..., gl=0x7ee01a28a0)
    at gfx/gl/GLTextureImage.cpp:52
#3  mozilla::gl::TiledTextureImage::Resize (this=this@entry=0x7ee01d7660, 
    aSize=...)
    at gfx/gl/GLTextureImage.cpp:399
#4  0x0000007ff28b81cc in mozilla::gl::TiledTextureImage::TiledTextureImage (
    this=0x7ee01d7660, aGL=0x7ee01a28a0, aSize=...,
    aContentType=<optimized out>, aFlags=<optimized out>, 
    aImageFormat=<optimized out>)
    at gfx/gl/GLTextureImage.cpp:221
#5  0x0000007ff28d41f8 in mozilla::gl::CreateTextureImageEGL (
    gl=gl@entry=0x7ee01a28a0, aSize=...,
    aContentType=aContentType@entry=gfxContentType::COLOR_ALPHA, 
    aWrapMode=aWrapMode@entry=33071,
    aFlags=aFlags@entry=mozilla::gl::TextureImage::OriginBottomLeft, 
    aImageFormat=aImageFormat@entry=mozilla::gfx::SurfaceFormat::B8G8R8A8)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/cxxalloc.h:33
#6  0x0000007ff28b8350 in mozilla::gl::CreateTextureImage (
    gl=gl@entry=0x7ee01a28a0, aSize=...,
    aContentType=aContentType@entry=gfxContentType::COLOR_ALPHA, 
    aWrapMode=aWrapMode@entry=33071,
    aFlags=aFlags@entry=mozilla::gl::TextureImage::OriginBottomLeft, 
    aImageFormat=<optimized out>)
    at gfx/gl/GLTextureImage.cpp:30
#7  0x0000007ff294ec88 in mozilla::layers::TextureImageTextureSourceOGL::Update 
    (this=0x7ee01c70f0, aSurface=0x7ee019b290, aDestRegion=0x0,
    aSrcOffset=0x0, aDstOffset=0x0) at ${PROJECT}/obj-build-mer-qt-xr/dist/
    include/gfx2DGlue.h:70
#8  0x0000007ff2a43ea8 in mozilla::layers::BufferTextureHost::Upload (
    this=this@entry=0x7ee01bb470, aRegion=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#9  0x0000007ff2a444e0 in mozilla::layers::BufferTextureHost::MaybeUpload (
    this=this@entry=0x7ee01bb470, aRegion=<optimized out>)
    at gfx/layers/composite/TextureHost.cpp:1046
#10 0x0000007ff2a44808 in mozilla::layers::BufferTextureHost::UploadIfNeeded (
    this=this@entry=0x7ee01bb470)
    at gfx/layers/composite/TextureHost.cpp:1031
#11 0x0000007ff2a44824 in mozilla::layers::BufferTextureHost::Lock (
    this=0x7ee01bb470)
    at gfx/layers/composite/TextureHost.cpp:650
#12 0x0000007ff2a35d8c in mozilla::layers::ImageHost::Lock (this=0x7ee01b7c60)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#13 0x0000007ff2a3621c in mozilla::layers::AutoLockCompositableHost::
    AutoLockCompositableHost (aHost=0x7ee01b7c60, this=0x7f364e8ca0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#14 mozilla::layers::ImageHost::Composite (this=this@entry=0x7ee01b7c60, 
    aCompositor=aCompositor@entry=0x7ee0002ed0, 
    aLayer=aLayer@entry=0x7ee0265370,
    aEffectChain=..., aOpacity=1, aTransform=..., aSamplingFilter=<optimized 
    out>, aClipRect=..., aVisibleRegion=aVisibleRegion@entry=0x0, aGeometry=...)
    at gfx/layers/composite/ImageHost.cpp:197
#15 0x0000007ff2a26d3c in mozilla::layers::CanvasLayerComposite::<lambda(
    mozilla::layers::EffectChain&, const IntRect&)>::operator() (clipRect=...,
    effectChain=..., __closure=<synthetic pointer>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/MaybeStorageBase.h:50
#16 mozilla::layers::RenderWithAllMasks<mozilla::layers::CanvasLayerComposite::
    RenderLayer(const IntRect&, const mozilla::Maybe<mozilla::gfx::
    PolygonTyped<mozilla::gfx::UnknownUnits> >&)::<lambda(mozilla::layers::
    EffectChain&, const IntRect&)> >(mozilla::layers::Layer *, mozilla::layers::
    Compositor *, const mozilla::gfx::IntRect &, mozilla::layers::
    CanvasLayerComposite::<lambda(mozilla::layers::EffectChain&, const 
    IntRect&)>) (aLayer=aLayer@entry=
    0x7ee0264f60, aCompositor=<optimized out>, aClipRect=..., 
    aRenderCallback=aRenderCallback@entry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
    LayerManagerCompositeUtils.h:69
#17 0x0000007ff2a27090 in mozilla::layers::CanvasLayerComposite::RenderLayer (
    this=0x7ee0264f60, aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:289
#18 0x0000007ff2a32f88 in mozilla::layers::RenderLayers<mozilla::layers::
    ContainerLayerComposite> (aContainer=aContainer@entry=0x7ee025d580,
    aManager=aManager@entry=0x7ee01a43a0, aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Maybe.h:443
#19 0x0000007ff2a33e78 in mozilla::layers::ContainerRender<mozilla::layers::
    ContainerLayerComposite> (aContainer=0x7ee025d580, aManager=0x7ee01a43a0,
    aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/BaseRect.h:53
#20 0x0000007ff2a33fc0 in mozilla::layers::ContainerLayerComposite::RenderLayer 
    (this=<optimized out>, aClipRect=..., aGeometry=...)
    at gfx/layers/composite/ContainerLayerComposite.cpp:745
#21 0x0000007ff2a32f88 in mozilla::layers::RenderLayers<mozilla::layers::
    ContainerLayerComposite> (aContainer=aContainer@entry=0x7ee01d0140,
    aManager=aManager@entry=0x7ee01a43a0, aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Maybe.h:443
#22 0x0000007ff2a33e78 in mozilla::layers::ContainerRender<mozilla::layers::
    ContainerLayerComposite> (aContainer=0x7ee01d0140, aManager=0x7ee01a43a0,
    aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/BaseRect.h:53
#23 0x0000007ff2a33fc0 in mozilla::layers::ContainerLayerComposite::RenderLayer 
    (this=<optimized out>, aClipRect=..., aGeometry=...)
    at gfx/layers/composite/ContainerLayerComposite.cpp:745
#24 0x0000007ff2a32f88 in mozilla::layers::RenderLayers<mozilla::layers::
    ContainerLayerComposite> (aContainer=aContainer@entry=0x7ee01b0d00,
    aManager=aManager@entry=0x7ee01a43a0, aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Maybe.h:443
#25 0x0000007ff2a33e78 in mozilla::layers::ContainerRender<mozilla::layers::
    ContainerLayerComposite> (aContainer=0x7ee01b0d00, aManager=0x7ee01a43a0,
    aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/BaseRect.h:53
#26 0x0000007ff2a33fc0 in mozilla::layers::ContainerLayerComposite::RenderLayer 
    (this=<optimized out>, aClipRect=..., aGeometry=...)
    at gfx/layers/composite/ContainerLayerComposite.cpp:745
#27 0x0000007ff2a1bc84 in mozilla::layers::LayerManagerComposite::<lambda(const 
    IntRect&)>::operator()(const mozilla::gfx::IntRect &) const (
    __closure=__closure@entry=0x7f364e98c8, aClipRect=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/MaybeStorageBase.h:50
#28 0x0000007ff2a30e68 in mozilla::layers::LayerManagerComposite::Render (
    this=this@entry=0x7ee01a43a0, aInvalidRegion=..., aOpaqueRegion=...)
    at gfx/layers/composite/LayerManagerComposite.cpp:1237
#29 0x0000007ff2a3148c in mozilla::layers::LayerManagerComposite::
    UpdateAndRender (this=this@entry=0x7ee01a43a0)
    at gfx/layers/composite/LayerManagerComposite.cpp:657
#30 0x0000007ff2a3183c in mozilla::layers::LayerManagerComposite::
    EndTransaction (this=this@entry=0x7ee01a43a0, aTimeStamp=...,
    aFlags=aFlags@entry=mozilla::layers::LayerManager::END_DEFAULT)
    at gfx/layers/composite/LayerManagerComposite.cpp:572
#31 0x0000007ff2a72fbc in mozilla::layers::CompositorBridgeParent::
    CompositeToTarget (this=0x7fc89b9920, aId=..., aTarget=0x0, 
    aRect=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#32 0x0000007ff4e07e38 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    CompositeToDefaultTarget (this=0x7fc89b9920, aId=...)
    at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:160
#33 0x0000007ff2a58718 in mozilla::layers::CompositorVsyncScheduler::Composite (
    this=0x7fc8bd6dd0, aVsyncEvent=...)
    at gfx/layers/ipc/CompositorVsyncScheduler.cpp:256
#34 0x0000007ff2a50b98 in mozilla::detail::RunnableMethodArguments<mozilla::
    VsyncEvent>::applyImpl<mozilla::layers::CompositorVsyncScheduler, void (
    mozilla::layers::CompositorVsyncScheduler::*)(mozilla::VsyncEvent const&), 
    StoreCopyPassByConstLRef<mozilla::VsyncEvent>, 0ul> (args=..., m=<optimized 
    out>,
    o=<optimized out>) at ${PROJECT}/obj-build-mer-qt-xr/dist/include/
    nsThreadUtils.h:887
[...]
#46 0x0000007fefbab89c in ?? () from /lib64/libc.so.6
(gdb)                              
To prevent this becoming tiresome I'm going to skip the last backtrace, since it relates to the same TextureImageEGL::TextureImageEGL() call we've just seen.

That feels like plenty to be getting on with. Tomorrow I'll need to compare these backtraces with the working ESR 91 code to see whether it's possible to get to the same place or not and, if it is, what might have changed.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
7 Jun 2024 : Day 256 #
It's the big one! A full 2^8 days of development have gone in to this now, which seems like an absurd amount of effort.
 
2^8 in the centre of a bright coloured flash

Unfortunately, while numerically this is very exciting, the actual work I'm doing right now isn't, so there's no big reveal to impress you with. Instead I'm going to continue hacking away at the WebGL bug I discovered a couple of days back.

To elaborate, I'm currently trying to find out why the WebView rendering fix has caused WebGL rendering to fail. Both are types of offscreen rendering, so it's not surprising that one has affected the other, but it's important that both of them are working correctly.

Over the last couple of days I discovered that the problem definitely exists in the latest commit added to the code. I checked that by rolling the repository back one commit, rebuilding and checking that the problem doesn't happen with the slightly older version.

Now I need to find out what has changed in the flow of the code to make the problem appear.

From the earlier backtraces we know that the problem is a call to SharedSurface_Basic::ToSurfaceDescriptor(), which itself is called from WebGLContext::GetFrontBuffer(). Stepping through this method I can see that there's no immediate crashing happening there, and execution continues into ShareableCanvasRenderer::UpdateCompositableClient(). The code being executed there looks like this:
    // First, let's see if we can get a no-copy TextureClient from the canvas.
    auto tc = fnGetExistingTc();
    if (!tc) {
      // Otherwise, snapshot the surface and copy into a TexClient.
      tc = fnMakeTcFromSnapshot();
    }
    if (tc != mFrontBufferFromDesc) {
      mFrontBufferFromDesc = nullptr;
    }
Both fnGetExistingTc() and fnMakeTcFromSnapshot() are lambda functions defined inside the method. But the first of these is where the call to SharedSurface_Basic::ToSurfaceDescriptor() occurs. This is returning null because a call to SharedSurface_Basic::ToSurfaceDescriptor() always returns Nothing().

However, the following call to fnMakeTcFromSnapshot() is returning a value, as we can see in the following debug steps:
(gdb) n
32        return Nothing();
(gdb) n
50      ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/MaybeStorageBase.h: 
    No such file or directory.
(gdb) n
mozilla::ClientWebGLContext::GetFrontBuffer (this=this@entry=0x7fc8b4a4b0, 
    fb=fb@entry=0x0, vr=<optimized out>, vr@entry=false)
    at dom/canvas/ClientWebGLContext.cpp:368
368       const auto notLost = mNotLost;
(gdb) n
mozilla::layers::ShareableCanvasRenderer::<lambda()>::operator() (
    __closure=<synthetic pointer>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Maybe.h:443
443     ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Maybe.h: No such 
    file or directory.
(gdb) 
149         if (!desc) return nullptr;
(gdb) n
148         const auto desc = webgl->GetFrontBuffer(nullptr);
(gdb) n
mozilla::layers::ShareableCanvasRenderer::UpdateCompositableClient (
    this=0x7fc98a98e0)
    at gfx/layers/ShareableCanvasRenderer.cpp:196
196         if (!tc) {
(gdb) p tc
$8 = {mRawPtr = 0x0}
(gdb) n
198           tc = fnMakeTcFromSnapshot();
(gdb) n
200         if (tc != mFrontBufferFromDesc) {
(gdb) p tc
$9 = {mRawPtr = 0x7fc93bc9a0}
(gdb) 
This will need comparing against what happens in our newer build where the crash occurs. Thinking back, I'm now a little concerned that the sole reason for the crash is this line that I added to SharedSurface_Basic::ToSurfaceDescriptor():
Maybe<layers::SurfaceDescriptor> SharedSurface_Basic::ToSurfaceDescriptor() {
  MOZ_CRASH(&quot;GFX: ToSurfaceDescriptor&quot;);
  return Nothing();
}
Certainly this will cause a crash, but I thought I'd also tested it without this. Now I'm not so sure...

Sadly I didn't keep copies of the newer packages to install back again, but I do have a copy of the libxul.so library from back then. I'm not sure if I'll be able to debug using it, but it's worth a try. If it turns out not to be debuggable I'll just have to do another complete rebuild (although, this time, I'll keep a copy of the current packages so I can reinstall them if I need to do another comparison!).

Sadly I don't get any joy testing the library:
Thread 8 &quot;GeckoWorkerThre&quot; received signal SIGSEGV, Segmentation 
    fault.
0x0000007fe5ee13a8 in ?? ()
(gdb) bt
#0  0x0000007fe5ee13a8 in ?? ()
#1  0x0000007fdf293e08 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) 
I'm going to have to do a rebuild. This means restoring the original branch, then performing the build to create the full set of RPM packages.
$ cd gecko-dev
$ git checkout -b temp
$ git checkout FIREFOX_ESR_91_9_X_RELBRANCH_patches
$ git log --oneline -5
7437a9d17284 (HEAD -> FIREFOX_ESR_91_9_X_RELBRANCH_patches) Restore 
    GLScreenBuffer and TextureImageEGL
d3ba4df29a32 (temp) Restore NotifyDidPaint event and timers
f55057391ac0 Prevent errors from DownloadPrompter
eab04b8c0d80 Enable dconf
c6ea49286566 (origin/FIREFOX_ESR_91_9_X_RELBRANCH_patches) Disable SessionStore 
    functionality
$ cd ..
Before now performing the build I must remove the code that's guaranteed to cause a crash:
Maybe<layers::SurfaceDescriptor> SharedSurface_Basic::ToSurfaceDescriptor() {
  return Nothing();
}
Now to build:
$ sfdk build -d --with git_workaround
[...]
The build won't be ready until the morning at the earliest. So I'm going to pause there and come back to this tomorrow.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
6 Jun 2024 : Day 255 #
It's day 0b011111111 today, or to put it another way, day (2^8 - 1). That means tomorrow is the big one. I'm certainly hoping I won't need until 2^9 before ESR 91 is released, which hopefully means this will be the last big one, numerical speaking, for this project.

A couple of months back Adam Pigg (piggz) claimed he suspected me of holding out on a solution:
 
[M]y theory is that its all working just fine, and he's just dragging it out to the big reveal on day 2^8 :)

The truth was that at that stage I wasn't at all convinced I'd be able to get the WebView working in time. Thankfully it is now working, in the nick of time as it turns out, but nevertheless the task isn't quite complete. Even once I've finalised this WebView patch, there'll still be more work to do in areas including video rendering, WebRTC videoconferencing, patch refactoring and a bunch of smaller glitches to iron out. So I'm sorry to say there's still no release on the horizon just yet. But as I hope is clear by now, I'm playing the long game. Not only am I committed to getting it finished, but I'm also doing my best to help ensure the process is as streamlined as possible for the future too. Hopefully, when it comes to the next release, things will be easier.

Before I get back to coding, I also need to give advance warning that I'll not be posting entries next week. Next week is Hackweek at work, which means a week long intensive coding session with my colleagues. There's a good chance that this won't leave much in the way of free-time for me to be working on Gecko. That'll be from Monday 10th June to Friday 14th June. I'll start up right back where I leave things off on the Saturday though.

Alright, now back to coding. Yesterday you'll recall I discovered a problem with WebGL rendering. I know this was working back in February because I demoed it at FOSDEM, but some change I've made between then and now has broken it.

Yesterday I recorded a couple of backtraces around the crash. My suspicion is that the problem relates to the recent changes to offscreen rendering.

To test this theory out I've created a new branch and rolled the project back a single commit to before I started making the WebView changes. The wonders of version control! During the day today I set it building a completely fresh set of RPM packages based on this slightly older version of the code.
$ cd gecko-dev
$ git checkout -b temp
$ git log FIREFOX_ESR_91_9_X_RELBRANCH_patches_temp --oneline -5
eb40ffd47432 (FIREFOX_ESR_91_9_X_RELBRANCH_patches_temp) Restore GLScreenBuffer 
    and TextureImageEGL
d3ba4df29a32 (HEAD -> temp) Restore NotifyDidPaint event and timers
f55057391ac0 Prevent errors from DownloadPrompter
eab04b8c0d80 Enable dconf
c6ea49286566 (origin/FIREFOX_ESR_91_9_X_RELBRANCH_patches) Disable SessionStore 
    functionality
$ git reset --hard d3ba4df29a32d53c38c68e4512d1fa82073ecdf4
$ git log --oneline -4
d3ba4df29a32 (HEAD -> temp) Restore NotifyDidPaint event and timers
f55057391ac0 Prevent errors from DownloadPrompter
eab04b8c0d80 Enable dconf
c6ea49286566 (origin/FIREFOX_ESR_91_9_X_RELBRANCH_patches) Disable SessionStore 
    functionality
$ cd ..
$ sfdk build -d --with git_workaround
[...]
Testing these new packages this evening I find that WebGL is indeed working with this one-commit-older version. That narrows down the problem to somewhere in the most recent commit eb40ffd47432.

That's a big help. With the two backtraces captured yesterday my plan is to compare the execution flow with the working version to see how they differ. Here's what I believe to be the equivalent backtrace:
Thread 10 &quot;GeckoWorkerThre&quot; hit Breakpoint 2, mozilla::gl::
    SharedSurface_Basic::SharedSurface_Basic (this=0x7f81347dc0, 
    gl=0x7f815d82c0, size=...,
    hasAlpha=true, tex=1, ownsTex=true) at gfx/gl/SharedSurfaceGL.cpp:54
54      SharedSurface_Basic::SharedSurface_Basic(GLContext* gl, const IntSize& 
    size,
This leads us to the second backtraces for the ToSurfaceDescriptor conversion method:
Thread 8 &quot;GeckoWorkerThre&quot; hit Breakpoint 5, mozilla::gl::
    SharedSurface_Basic::ToSurfaceDescriptor (this=0x7fc8d8c9f0)
    at gfx/gl/SharedSurfaceGL.cpp:31
31      Maybe<layers::SurfaceDescriptor> SharedSurface_Basic::
    ToSurfaceDescriptor() {
(gdb) bt
#0  mozilla::gl::SharedSurface_Basic::ToSurfaceDescriptor (this=0x7fc8d8c9f0)
    at gfx/gl/SharedSurfaceGL.cpp:31
#1  0x0000007ff3694278 in mozilla::WebGLContext::GetFrontBuffer (
    this=this@entry=0x7fc94b8d10, xrFb=<optimized out>, webvr=webvr@entry=false)
    at dom/canvas/WebGLContext.cpp:949
#2  0x0000007ff365c528 in mozilla::HostWebGLContext::GetFrontBuffer (
    this=<optimized out>, xrFb=<optimized out>, webvr=false)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:280
#3  0x0000007ff365c5d8 in mozilla::ClientWebGLContext::GetFrontBuffer (
    this=this@entry=0x7fc8b4a4b0, fb=fb@entry=0x0, vr=<optimized out>, 
    vr@entry=false)
    at dom/canvas/ClientWebGLContext.cpp:373
#4  0x0000007ff29b2410 in mozilla::layers::ShareableCanvasRenderer::<lambda()>::
    operator() (__closure=<synthetic pointer>)
    at gfx/layers/ShareableCanvasRenderer.cpp:148
#5  mozilla::layers::ShareableCanvasRenderer::UpdateCompositableClient (
    this=0x7fc98a98e0)
    at gfx/layers/ShareableCanvasRenderer.cpp:195
#6  0x0000007ff29f1e10 in mozilla::layers::ClientCanvasLayer::RenderLayer (
    this=0x7fc959bd60)
    at gfx/layers/client/ClientCanvasLayer.cpp:25
#7  0x0000007ff29f0f30 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#8  0x0000007ff2a01054 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc9798e60)
    at gfx/layers/Layers.h:1051
#9  0x0000007ff29f0f30 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#10 0x0000007ff2a01054 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc8d810f0)
    at gfx/layers/Layers.h:1051
#11 0x0000007ff29f0f30 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#12 0x0000007ff2a01054 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc93748a0)
    at gfx/layers/Layers.h:1051
#13 0x0000007ff2a08270 in mozilla::layers::ClientLayerManager::
    EndTransactionInternal (this=this@entry=0x7fc8b18a30, 
    aCallback=aCallback@entry=0x7ff46a44d0 <mozilla::FrameLayerBuilder::
    DrawPaintedLayer(mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::
    DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> 
    const&, void*)>, aCallbackData=aCallbackData@entry=0x7fdf2dd268)
    at gfx/layers/client/ClientLayerManager.cpp:341
#14 0x0000007ff2a12be4 in mozilla::layers::ClientLayerManager::EndTransaction (
    this=0x7fc8b18a30, 
    aCallback=0x7ff46a44d0 <mozilla::FrameLayerBuilder::DrawPaintedLayer(
    mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::
    DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> 
    const&, void*)>, aCallbackData=0x7fdf2dd268, aFlags=mozilla::layers::
    LayerManager::END_DEFAULT)
    at gfx/layers/client/ClientLayerManager.cpp:397
#15 0x0000007ff46a18f0 in nsDisplayList::PaintRoot (
    this=this@entry=0x7fdf2df078, aBuilder=aBuilder@entry=0x7fdf2dd268, 
    aCtx=aCtx@entry=0x0, 
    aFlags=aFlags@entry=13, aDisplayListBuildTime=...)
    at layout/painting/nsDisplayList.cpp:2622
#16 0x0000007ff442dc4c in nsLayoutUtils::PaintFrame (
    aRenderingContext=aRenderingContext@entry=0x0, 
    aFrame=aFrame@entry=0x7fc9362940, aDirtyRegion=..., 
    aBackstop=aBackstop@entry=4294967295, 
    aBuilderMode=aBuilderMode@entry=nsDisplayListBuilderMode::Painting, 
    aFlags=aFlags@entry=(nsLayoutUtils::PaintFrameFlags::WidgetLayers | 
    nsLayoutUtils::PaintFrameFlags::ExistingTransaction | nsLayoutUtils::
    PaintFrameFlags::NoComposite)) at ${PROJECT}/obj-build-mer-qt-xr/dist/
    include/mozilla/MaybeStorageBase.h:80
#17 0x0000007ff43b8340 in mozilla::PresShell::Paint (
    this=this@entry=0x7fc92df890, aViewToPaint=aViewToPaint@entry=0x7fc8570b20, 
    aDirtyRegion=..., 
    aFlags=aFlags@entry=mozilla::PaintFlags::PaintLayers)
    at layout/base/PresShell.cpp:6400
#18 0x0000007ff41f0210 in nsViewManager::ProcessPendingUpdatesPaint (
    this=this@entry=0x7fc8570ae0, aWidget=aWidget@entry=0x7fc8570ba0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/RectAbsolute.h:43
#19 0x0000007ff41f05c4 in nsViewManager::ProcessPendingUpdatesForView (
    this=this@entry=0x7fc8570ae0, aView=<optimized out>, 
    aFlushDirtyRegion=aFlushDirtyRegion@entry=true)
    at view/nsViewManager.cpp:394
#20 0x0000007ff41f0bb4 in nsViewManager::ProcessPendingUpdates (
    this=this@entry=0x7fc8570ae0)
    at view/nsViewManager.cpp:972
[...]
#51 0x0000007fefbb189c in ?? () from /lib64/libc.so.6
(gdb)
There's actually very little difference between these calls, as we can see if we look at just the first couple of frames of each next to each other:
#0  0x0000007ff28d1ca4 in mozilla::gl::SharedSurface_Basic::ToSurfaceDescriptor 
    (this=<optimized out>)
    at gfx/gl/SharedSurfaceGL.cpp:38
#1  0x0000007ff36920a4 in mozilla::WebGLContext::GetFrontBuffer (
    this=this@entry=0x7fc94889c0, xrFb=<optimized out>, webvr=webvr@entry=false)
    at dom/canvas/WebGLContext.cpp:949
#0  mozilla::gl::SharedSurface_Basic::ToSurfaceDescriptor (this=0x7fc8d8c9f0)
    at gfx/gl/SharedSurfaceGL.cpp:31
#1  0x0000007ff3694278 in mozilla::WebGLContext::GetFrontBuffer (
    this=this@entry=0x7fc94b8d10, xrFb=<optimized out>, webvr=webvr@entry=false)
    at dom/canvas/WebGLContext.cpp:949
The first of these is the broken version, while the second is working. In order to get a deeper understanding, I'm going to want to step through the code between here and the crash.

Unfortunately after the build during the day I'm a bit short on time to delve deeper in to this now. But I'll pick this up again tomorrow to try to figure out what the difference is. Once I have that it will hopefully give a much clearer idea about how to fix the problem with my latest changeset. I can then roll back to my original commit, fix it, and... well, let's see.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
5 Jun 2024 : Day 254 #
I'm continuing with my attempts to simplify the latest commit by removing unnecessary changes and doing my best to align the changes with the upstream ESR 91 changes.

To do this I've been looking carefully through the code to try to find unused methods that were added in my latest commit. I managed to find a couple. Both GLContext::ResizeOffscreen() and GLContext::OffscreenSize() have equivalents in the GLScreenBuffer class which can be used as drop-in replacements. So those two methods, which were previously added, are now added no more.

The other change I've made is to remove the following member variable from GLContext:
  std::map<GLuint, SharedSurface*> mFBOMapping
With this removed I'm also able to remove the related code from the source file as well as the dependency on the standard map header. I've tested the result and it doesn't seem to have any negative effects on either the browser or the WebView.

The other change I've made today is to simplify how the EGLDisplay is passed around when things are still being initialised. I'd created quite a web of methods to pass this between, with all of these needing this variable to be passed in. Here they are, with the aDisplay parameter at the end being the one I'd really like to avoid the need for.
RefPtr<GLLibraryEGL> DefaultEglLibrary(nsACString* const out_failureId, 
    EGLDisplay aDisplay);
inline std::shared_ptr<EglDisplay> DefaultEglDisplay(nsACString* const 
    out_failureId, EGLDisplay aDisplay);
RefPtr<GLLibraryEGL> GLLibraryEGL::Create(nsACString* const out_failureId, 
    EGLDisplay aDisplay);
bool GLLibraryEGL::Init(bool forceAccel, nsACString* const out_failureId, 
    EGLDisplay aDisplay);
Just by rearranging things a little I've been able to remove the aDisplay parameter from all of these. It turns out that most of them were just passing the value on to one of the other methods. Once they were taken away from one, they weren't needed in the others either.

After these changes the patch is still pretty large, but looking considerably better.

However, during my testing I hit another problem. It turns out that somewhere, some of the changes I made (probably related to offscreen rendering) have broken the WebGLContext capabilities. That means that if a web page uses WebGL it'll now trigger a crash. I know for sure that this was working earlier, so something has changed.

You'll have to forgive me for including a lengthy backtrace. I admit it's not very enlightening, but I want to keep a copy here so I have something to refer to. This is the backtrace for the crash:
Thread 8 &quot;GeckoWorkerThre&quot; received signal SIGSEGV, Segmentation 
    fault.
[Switching to LWP 10166]
0x0000007ff28d1ca4 in mozilla::gl::SharedSurface_Basic::ToSurfaceDescriptor (
    this=<optimized out>)
    at gfx/gl/SharedSurfaceGL.cpp:38
38        MOZ_CRASH(&quot;GFX: ToSurfaceDescriptor&quot;);
(gdb) bt
#0  0x0000007ff28d1ca4 in mozilla::gl::SharedSurface_Basic::ToSurfaceDescriptor 
    (this=<optimized out>)
    at gfx/gl/SharedSurfaceGL.cpp:38
#1  0x0000007ff36920a4 in mozilla::WebGLContext::GetFrontBuffer (
    this=this@entry=0x7fc94889c0, xrFb=<optimized out>, webvr=webvr@entry=false)
    at dom/canvas/WebGLContext.cpp:949
#2  0x0000007ff365a410 in mozilla::HostWebGLContext::GetFrontBuffer (
    this=<optimized out>, xrFb=<optimized out>, webvr=false)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:280
#3  0x0000007ff365a4c0 in mozilla::ClientWebGLContext::GetFrontBuffer (
    this=this@entry=0x7fc8b48ec0, fb=fb@entry=0x0, vr=<optimized out>, 
    vr@entry=false)
    at dom/canvas/ClientWebGLContext.cpp:373
#4  0x0000007ff29b0084 in mozilla::layers::ShareableCanvasRenderer::<lambda()>::
    operator() (__closure=<synthetic pointer>)
    at gfx/layers/ShareableCanvasRenderer.cpp:148
#5  mozilla::layers::ShareableCanvasRenderer::UpdateCompositableClient (
    this=0x7fc96d7db0)
    at gfx/layers/ShareableCanvasRenderer.cpp:195
#6  0x0000007ff29efa84 in mozilla::layers::ClientCanvasLayer::RenderLayer (
    this=0x7fc978af90)
    at gfx/layers/client/ClientCanvasLayer.cpp:25
#7  0x0000007ff29eeba4 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#8  0x0000007ff29feeec in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc978a680)
    at gfx/layers/Layers.h:1051
#9  0x0000007ff29eeba4 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#10 0x0000007ff29feeec in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc9424e20)
    at gfx/layers/Layers.h:1051
#11 0x0000007ff29eeba4 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#12 0x0000007ff29feeec in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc8dd8d50)
    at gfx/layers/Layers.h:1051
#13 0x0000007ff2a05bd0 in mozilla::layers::ClientLayerManager::
    EndTransactionInternal (this=this@entry=0x7fc8b17440, 
    aCallback=aCallback@entry=
    0x7ff46a23d0 <mozilla::FrameLayerBuilder::DrawPaintedLayer(mozilla::layers::
    PaintedLayer*, gfxContext*, mozilla::gfx::IntRegionTyped<mozilla::gfx::
    UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::
    UnknownUnits> const&, mozilla::layers::DrawRegionClip, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, void*)>, 
    aCallbackData=aCallbackData@entry=0x7fdf293268)
    at gfx/layers/client/ClientLayerManager.cpp:341
#14 0x0000007ff2a10ad0 in mozilla::layers::ClientLayerManager::EndTransaction (
    this=0x7fc8b17440, 
    aCallback=0x7ff46a23d0 <mozilla::FrameLayerBuilder::DrawPaintedLayer(
    mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::
    DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> 
    const&, void*)>, aCallbackData=0x7fdf293268, aFlags=mozilla::layers::
    LayerManager::END_DEFAULT)
    at gfx/layers/client/ClientLayerManager.cpp:397
#15 0x0000007ff469f7f0 in nsDisplayList::PaintRoot (
    this=this@entry=0x7fdf295078, aBuilder=aBuilder@entry=0x7fdf293268, 
    aCtx=aCtx@entry=0x0, 
    aFlags=aFlags@entry=13, aDisplayListBuildTime=...)
    at layout/painting/nsDisplayList.cpp:2622
#16 0x0000007ff442bb4c in nsLayoutUtils::PaintFrame (
    aRenderingContext=aRenderingContext@entry=0x0, 
    aFrame=aFrame@entry=0x7fc932f550, aDirtyRegion=..., 
    aBackstop=aBackstop@entry=4294967295, 
    aBuilderMode=aBuilderMode@entry=nsDisplayListBuilderMode::Painting, 
    aFlags=aFlags@entry=(nsLayoutUtils::PaintFrameFlags::WidgetLayers | 
    nsLayoutUtils::PaintFrameFlags::ExistingTransaction | nsLayoutUtils::
    PaintFrameFlags::NoComposite)) at ${PROJECT}/obj-build-mer-qt-xr/dist/
    include/mozilla/MaybeStorageBase.h:80
#17 0x0000007ff43b6240 in mozilla::PresShell::Paint (
    this=this@entry=0x7fc92a8a70, aViewToPaint=aViewToPaint@entry=0x7fc9287140, 
    aDirtyRegion=..., 
    aFlags=aFlags@entry=mozilla::PaintFlags::PaintLayers)
    at layout/base/PresShell.cpp:6400
#18 0x0000007ff41ee110 in nsViewManager::ProcessPendingUpdatesPaint (
    this=this@entry=0x7fc92870d0, aWidget=aWidget@entry=0x7fc92871c0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/RectAbsolute.h:43
[...]
#51 0x0000007fefba989c in ?? () from /lib64/libc.so.6
(gdb)
Reading through just the first few items in this backtrace, it's clear that the reason is due to the call to SharedSurface_Basic::ToSurfaceDescriptor(), which is intentionally triggering a crash based on the code that's there. Given this, it's possible the problem is that the SharedSurface_Basic object that's being used should have been one of the several other alternative surface variants.

So because I think it could be helpful in getting to the bottom of this, I'm also going to record how and where this surface is being created. Here's the backtrace for its creation:
#0  mozilla::gl::SharedSurface_Basic::SharedSurface_Basic (this=0x7fc8d77e40, 
    gl=0x7fc97f9830, size=..., hasAlpha=false, tex=1, ownsTex=true)
    at gfx/gl/SharedSurfaceGL.cpp:78
#1  0x0000007ff28d3e6c in mozilla::gl::SharedSurface_Basic::Create (
    gl=0x7fc97f9830, formats=..., size=..., hasAlpha=false)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/cxxalloc.h:33
#2  0x0000007ff28a3720 in mozilla::gl::SurfaceFactory_Basic::CreateSharedImpl (
    this=<optimized out>, desc=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/WeakPtr.h:185
#3  0x0000007ff28a3628 in mozilla::gl::SurfaceFactory::CreateShared (
    this=0x7fc93e17c0, size=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefCounted.h:240
#4  0x0000007ff28a5f24 in mozilla::gl::SwapChain::Acquire (
    this=this@entry=0x7fc92ed298, size=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
#5  0x0000007ff369bed4 in mozilla::WebGLContext::PresentInto (
    this=this@entry=0x7fc92ece00, swapChain=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
#6  0x0000007ff369c304 in mozilla::WebGLContext::Present (
    this=this@entry=0x7fc92ece00, xrFb=<optimized out>, 
    consumerType=consumerType@entry=mozilla::layers::TextureType::Unknown, 
    webvr=webvr@entry=false)
    at dom/canvas/WebGLContext.cpp:936
#7  0x0000007ff3664300 in mozilla::HostWebGLContext::Present (webvr=false, 
    t=mozilla::layers::TextureType::Unknown, xrFb=<optimized out>, 
    this=<optimized out>) at ${PROJECT}/obj-build-mer-qt-xr/dist/include/
    mozilla/RefPtr.h:280
#8  mozilla::ClientWebGLContext::Run<void (mozilla::HostWebGLContext::*)(
    unsigned long, mozilla::layers::TextureType, bool) const, &(mozilla::
    HostWebGLContext::Present(unsigned long, mozilla::layers::TextureType, 
    bool) const), unsigned long, mozilla::layers::TextureType const&, bool 
    const&> (
    this=<optimized out>, args#0=@0x7fdf2926c0: 0, args#1=@0x7fdf2926bf: 
    mozilla::layers::TextureType::Unknown, args#2=@0x7fdf2926be: false)
    at dom/canvas/ClientWebGLContext.cpp:313
#9  0x0000007ff3664468 in mozilla::ClientWebGLContext::Present (
    this=this@entry=0x7fc9558160, xrFb=xrFb@entry=0x0, type=<optimized out>, 
    webvr=<optimized out>, webvr@entry=false)
    at dom/canvas/ClientWebGLContext.cpp:363
#10 0x0000007ff368fc78 in mozilla::ClientWebGLContext::OnBeforePaintTransaction 
    (this=0x7fc9558160)
    at dom/canvas/ClientWebGLContext.cpp:345
#11 0x0000007ff28ff0dc in mozilla::layers::CanvasRenderer::
    FirePreTransactionCallback (this=this@entry=0x7fc9646e50)
    at gfx/layers/CanvasRenderer.cpp:75
#12 0x0000007ff29afee8 in mozilla::layers::ShareableCanvasRenderer::
    UpdateCompositableClient (this=0x7fc9646e50)
    at gfx/layers/ShareableCanvasRenderer.cpp:192
#13 0x0000007ff29efa84 in mozilla::layers::ClientCanvasLayer::RenderLayer (
    this=0x7fc98f02c0)
    at gfx/layers/client/ClientCanvasLayer.cpp:25
#14 0x0000007ff29eeba4 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#15 0x0000007ff29feeec in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc991a8d0)
    at gfx/layers/Layers.h:1051
#16 0x0000007ff29eeba4 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#17 0x0000007ff29feeec in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc935acd0)
    at gfx/layers/Layers.h:1051
#18 0x0000007ff29eeba4 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#19 0x0000007ff29feeec in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc8d8f2f0)
    at gfx/layers/Layers.h:1051
#20 0x0000007ff2a05bd0 in mozilla::layers::ClientLayerManager::
    EndTransactionInternal (this=this@entry=0x7fc8b17e40, 
    aCallback=aCallback@entry=
    0x7ff46a23d0 <mozilla::FrameLayerBuilder::DrawPaintedLayer(mozilla::layers::
    PaintedLayer*, gfxContext*, mozilla::gfx::IntRegionTyped<mozilla::gfx::
    UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::
    UnknownUnits> const&, mozilla::layers::DrawRegionClip, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, void*)>, 
    aCallbackData=aCallbackData@entry=0x7fdf293268)
    at gfx/layers/client/ClientLayerManager.cpp:341
#21 0x0000007ff2a10ad0 in mozilla::layers::ClientLayerManager::EndTransaction (
    this=0x7fc8b17e40, 
    aCallback=0x7ff46a23d0 <mozilla::FrameLayerBuilder::DrawPaintedLayer(
    mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::
    DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> 
    const&, void*)>, aCallbackData=0x7fdf293268, aFlags=mozilla::layers::
    LayerManager::END_DEFAULT)
    at gfx/layers/client/ClientLayerManager.cpp:397
#22 0x0000007ff469f7f0 in nsDisplayList::PaintRoot (
    this=this@entry=0x7fdf295078, aBuilder=aBuilder@entry=0x7fdf293268, 
    aCtx=aCtx@entry=0x0, 
    aFlags=aFlags@entry=13, aDisplayListBuildTime=...)
    at layout/painting/nsDisplayList.cpp:2622
#23 0x0000007ff442bb4c in nsLayoutUtils::PaintFrame (
    aRenderingContext=aRenderingContext@entry=0x0, 
    aFrame=aFrame@entry=0x7fc93240a0, aDirtyRegion=..., 
    aBackstop=aBackstop@entry=4294967295, 
    aBuilderMode=aBuilderMode@entry=nsDisplayListBuilderMode::Painting, 
    aFlags=aFlags@entry=(nsLayoutUtils::PaintFrameFlags::WidgetLayers | 
    nsLayoutUtils::PaintFrameFlags::ExistingTransaction | nsLayoutUtils::
    PaintFrameFlags::NoComposite)) at ${PROJECT}/obj-build-mer-qt-xr/dist/
    include/mozilla/MaybeStorageBase.h:80
#24 0x0000007ff43b6240 in mozilla::PresShell::Paint (
    this=this@entry=0x7fc9295080, aViewToPaint=aViewToPaint@entry=0x7fc9273470, 
    aDirtyRegion=..., 
    aFlags=aFlags@entry=mozilla::PaintFlags::PaintLayers)
    at layout/base/PresShell.cpp:6400
#25 0x0000007ff41ee110 in nsViewManager::ProcessPendingUpdatesPaint (
    this=this@entry=0x7fc9273430, aWidget=aWidget@entry=0x7fc92734f0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/RectAbsolute.h:43
[...]
#58 0x0000007fefba989c in ?? () from /lib64/libc.so.6
(gdb) 
Again, apologies for the lengthy backtraces. I'm not going to try to debug and fix this today, so I'm thinking it might be helpful to have these backtraces as something to refer back to for the future.

Sadly I don't have the energy to dig into this crash further today, so I'll have to leave things there. I hope it won't be too hard to fix, but at this point it's already looking a bit tricky, so we'll have to see.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment