flypig.co.uk

2 Apr 2024 : Day 204 #
It's refreshing to be working with a fully built binary again with all the debug source aligning with the binary. Debugging is so much more fulfilling this way.

That's good, because I spent today doing lots more debugging. The first thing I did was step through the CompositorOGL::BeginFrame() method to see whether my changes were actually being executed. In particular, I wanted to know whether the colour used to clear the texture was being set to something other than white.

Debugging confirms that it is:
[...]
Thread 37 "Compositor" hit Breakpoint 1, mozilla::layers::
    CompositorOGL::BeginFrame (this=0x7ed8002ed0, aInvalidRegion=..., 
    aClipRect=..., 
    aRenderBounds=..., aOpaqueRegion=...)
    at gfx/layers/opengl/CompositorOGL.cpp:1084
1084      mClearColor.r = 0.0;
(gdb) n
[Thread 0x7f172fe7e0 (LWP 2982) exited]
1085      mClearColor.g = 0.0;
(gdb) 
1086      mClearColor.b = 1.0;
(gdb) 
[New Thread 0x7f170bc7e0 (LWP 3141)]
1087      mClearColor.a = 0.0;
(gdb) 
1088      mGLContext->fClearColor(mClearColor.r, mClearColor.g, mClearColor.b,
(gdb) 
[New Thread 0x7f1707b7e0 (LWP 3143)]
1091      mGLContext->fClear(clearBits);
(gdb) 
1093      return Some(rect);
1084      mClearColor.r = 0.0;
(gdb) disable break
(gdb) c
Continuing.
But this has no effect on the screen output. Maybe this isn't where the background colour gets set at all? My explorations with ESR 78 yesterday would seem to imply this is the case. So I've dug around the code a bit more but can't see anywhere more appropriate to add similar changes in. There are a couple of other places where textures are cleared so I tried changing the colours for those as well but to no avail.

However, while digging through the code I did discover some anomalies. Anomalies caused by patches making changes in ESR 78 that I'd not applied to ESR 91. The two patches are 0070 "Fix flipped FBO textures when rendering to an offscreen target" and 0071 "Do not flip scissor rects when rendering to an offscreen window." Both are certainly relevant to the parts of the code I'm working on. I have my doubts that rendering would be completely scuppered without them, but fixing them still looks worthwhile.

After all, the changes are likely to be needed anyway, even if they're not going to solve the rendering problem on their own.

So I've applied them both. They're both small patches and so easy to introduce into the ESR 91 code.

I've done a quick build and executed the code to check, but the screen has remained stubbornly blank. So, as suspected, these aren't the only problems I need to fix. I've now reached the point where it feels it might be helpful to step through the code side-by-side, with ESR 78 and ESR 91 running simultaneously.

The purpose of this is to try to find where the two diverge. If there are differences, this could be the source of my troubles. It's a laborious but thorough process and so feels to me to be the most fruitful way forwards at this point.

Since I'm already working in this area the CompositorOGL::CreateContext() method seems like as good a place to start as any, so I've been stepping through from there for most of the day.

Eventually I notice a difference between the execution of ESR 78 and that of ESR 91 while poring over the values assigned in the mScreen object. This is of type GLScreenBuffer and is one of the class variables in GLContext (note that this is GLContext rather than the GLContextEGL that we were focusing on last week).

The interesting parts are the frame buffer variables. Here's what they look like on ESR 78:
Thread 36 "Compositor" hit Breakpoint 1, mozilla::layers::
    CompositorOGL::CreateContext (this=0x7eac003420)
    at gfx/layers/opengl/CompositorOGL.cpp:223
223     already_AddRefed<mozilla::gl::GLContext> CompositorOGL::CreateContext() 
    {
[...]
(gdb) n
mozilla::layers::CompositorOGL::Initialize (this=0x7eac003420, 
    out_failureReason=0x7fa516f730)
    at gfx/layers/opengl/CompositorOGL.cpp:374
374       mGLContext = CreateContext();
(gdb) n
383       if (!mGLContext) {
(gdb) p mGLContext
$5 = {mRawPtr = 0x7eac109140}
(gdb) p mGLContext.mRawPtr
$6 = (mozilla::gl::GLContext *) 0x7eac109140
[...]
(gdb) p mGLContext.mRawPtr->mScreen.mTuple.mFirstA.mUserDrawFB
$12 = 0
(gdb) p mGLContext.mRawPtr->mScreen.mTuple.mFirstA.mUserReadFB
$13 = 0
(gdb) p mGLContext.mRawPtr->mScreen.mTuple.mFirstA.mInternalDrawFB
$14 = 2
(gdb) p mGLContext.mRawPtr->mScreen.mTuple.mFirstA.mInternalReadFB
$15 = 2
(gdb) 
But on ESR 91 these same values are decidedly more zero:
Thread 37 "Compositor" hit Breakpoint 1, mozilla::layers::
    CompositorOGL::CreateContext (this=this@entry=0x7ed8002f10)
    at gfx/layers/opengl/CompositorOGL.cpp:227
227     already_AddRefed<mozilla::gl::GLContext> CompositorOGL::CreateContext() 
    {
[...]
(gdb) n
401       if (!mGLContext) {
(gdb) p mGLContext
$6 = {mRawPtr = 0x7ed819aa50}
(gdb) p mGLContext.mRawPtr
$7 = (mozilla::gl::GLContext *) 0x7ed819aa50
[...]
(gdb) p mGLContext.mRawPtr->mScreen.mTuple.mFirstA.mUserDrawFB
$10 = 0
(gdb) p mGLContext.mRawPtr->mScreen.mTuple.mFirstA.mUserReadFB
$11 = 0
(gdb) p mGLContext.mRawPtr->mScreen.mTuple.mFirstA.mInternalDrawFB
$12 = 0
(gdb) p mGLContext.mRawPtr->mScreen.mTuple.mFirstA.mInternalReadFB
$13 = 0
(gdb) 
That looks suspicious to me. A zero value suggests they've not been initialised at all, compared to ESR 78 where they quite clearly have been initialised. It looks broken. But that's also great! Something very concrete to fix.

The next step is to find out where they're getting set on ESR 78. It's immediately clear from the code that the mInternalDrawFB and mInternalReadFB variables get set in the GLContext::fBindFramebuffer() method, but actually figuring out where this happens turns out to be more challenging. Here's what the relevant method looks like:
void GLContext::fBindFramebuffer(GLenum target, GLuint framebuffer) {
  if (!mScreen) {
    raw_fBindFramebuffer(target, framebuffer);
    return;
  }

  switch (target) {
    case LOCAL_GL_DRAW_FRAMEBUFFER_EXT:
      mScreen->BindDrawFB(framebuffer);
      return;

    case LOCAL_GL_READ_FRAMEBUFFER_EXT:
      mScreen->BindReadFB(framebuffer);
      return;

    case LOCAL_GL_FRAMEBUFFER:
      mScreen->BindFB(framebuffer);
      return;

    default:
      // Nothing we care about, likely an error.
      break;
  }

  raw_fBindFramebuffer(target, framebuffer);
}
It takes quite a bit of stepping through to find the right call, because the method is called multiple times and usually exits early because no mScreen value has been set.

Eventually I do get to the right point though. When reading through the debugging steps it's useful to know that the value of LOCAL_GL_FRAMEBUFFER is defined to be 0x8D40 (36160 in decimal, which is how gdb prints the target argument by default) in GLConsts.h:
#define LOCAL_GL_FRAMEBUFFER                                 0x8D40
That should help clarify what's going on here:
Thread 36 "Compositor" hit Breakpoint 4, mozilla::gl::GLContext::
    fBindFramebuffer (this=this@entry=0x7ea8109140, target=target@entry=36160, 
    framebuffer=framebuffer@entry=0) at obj-build-mer-qt-xr/dist/include/
    mozilla/UniquePtr.h:287
287     obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h: No such file or 
    directory.
(gdb) n
2325        raw_fBindFramebuffer(target, framebuffer);
(gdb) p mScreen.mTuple.mFirstA
$17 = (mozilla::gl::GLScreenBuffer *) 0x0
(gdb) c
Continuing.
[...]
Thread 36 "Compositor" hit Breakpoint 4, mozilla::gl::GLContext::
    fBindFramebuffer (this=0x7ea8109140, target=36160, framebuffer=0)
    at obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:287
287     obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h: No such file or 
    directory.
(gdb) n
2329      switch (target) {
(gdb) p /x target
$21 = 0x8d40
(gdb) n
2339          mScreen->BindFB(framebuffer);
So this is definitely the right place. But the backtrace produced from this is confusing. The second frame is the GLContext::CreateScreenBufferImpl() method, yet none of the code in that method appears to call GLContext::fBindFramebuffer() directly.
(gdb) bt
#0  mozilla::gl::GLContext::fBindFramebuffer (this=0x7ea8109140, target=36160, 
    framebuffer=0)
    at gfx/gl/GLContext.cpp:2339
#1  0x0000007fb8e81890 in mozilla::gl::GLContext::CreateScreenBufferImpl (
    this=this@entry=0x7ea8109140, size=..., caps=...)
    at gfx/gl/GLContext.cpp:2135
#2  0x0000007fb8e818ec in mozilla::gl::GLContext::CreateScreenBuffer (caps=..., 
    size=..., this=0x7ea8109140)
    at gfx/gl/GLContext.h:3517
#3  mozilla::gl::GLContext::InitOffscreen (this=0x7ea8109140, size=..., 
    caps=...)
    at gfx/gl/GLContext.cpp:2578
#4  0x0000007fb8e81ac8 in mozilla::gl::GLContextProviderEGL::CreateOffscreen (
    size=..., minCaps=..., 
    flags=flags@entry=mozilla::gl::CreateContextFlags::REQUIRE_COMPAT_PROFILE, 
    out_failureId=out_failureId@entry=0x7fa51fe378)
    at gfx/gl/GLContextProviderEGL.cpp:1443
#5  0x0000007fb8ee275c in mozilla::layers::CompositorOGL::CreateContext (
    this=0x7ea8003420)
    at gfx/layers/opengl/CompositorOGL.cpp:250
#6  mozilla::layers::CompositorOGL::CreateContext (this=0x7ea8003420)
    at gfx/layers/opengl/CompositorOGL.cpp:223
#7  0x0000007fb8f033bc in mozilla::layers::CompositorOGL::Initialize (
    this=0x7ea8003420, out_failureReason=0x7fa51fe730)
    at gfx/layers/opengl/CompositorOGL.cpp:374
#8  0x0000007fb8fdaf7c in mozilla::layers::CompositorBridgeParent::
    NewCompositor (this=this@entry=0x7f8c99db50, aBackendHints=...)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:1534
#9  0x0000007fb8fe45e8 in mozilla::layers::CompositorBridgeParent::
    InitializeLayerManager (this=this@entry=0x7f8c99db50, aBackendHints=...)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:1491
#10 0x0000007fb8fe4730 in mozilla::layers::CompositorBridgeParent::
    AllocPLayerTransactionParent (this=this@entry=0x7f8c99db50, 
    aBackendHints=..., aId=...)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:1587
#11 0x0000007fbb2e11b4 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    AllocPLayerTransactionParent (this=0x7f8c99db50, aBackendHints=..., 
    aId=...) at mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp:77
#12 0x0000007fb88bf3d0 in mozilla::layers::PCompositorBridgeParent::
    OnMessageReceived (this=0x7f8c99db50, msg__=...) at 
    PCompositorBridgeParent.cpp:1391
[...]
#28 0x0000007fbe70b89c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/
    clone.S:78
(gdb) 
It looks like the reason for this is that it's being called as part of the ScopedBindFramebuffer wrapper. But it's confusing.
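
To make that a little more concrete, here's a minimal sketch of how a scoped-bind guard can trigger a bind call from a function whose body never mentions one. Everything in it (FakeGL, ScopedBindFB, CreateScreenBufferSketch) is a hypothetical stand-in written for illustration, not the actual Gecko classes. The only assumption is that the real wrapper follows the usual RAII pattern of rebinding a framebuffer in its destructor.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a GL context: it only records bind calls.
struct FakeGL {
  uint32_t boundFB = 0;
  std::vector<uint32_t> bindLog;

  void BindFramebuffer(uint32_t fb) {
    boundFB = fb;
    bindLog.push_back(fb);
  }
};

// Hypothetical RAII guard: binds a framebuffer on construction and
// rebinds whatever was bound before when it goes out of scope.
class ScopedBindFB {
 public:
  ScopedBindFB(FakeGL& gl, uint32_t fb) : mGL(gl), mOldFB(gl.boundFB) {
    mGL.BindFramebuffer(fb);
  }
  ~ScopedBindFB() { mGL.BindFramebuffer(mOldFB); }

  ScopedBindFB(const ScopedBindFB&) = delete;
  ScopedBindFB& operator=(const ScopedBindFB&) = delete;

 private:
  FakeGL& mGL;
  uint32_t mOldFB;
};

// There is no explicit BindFramebuffer() call in this body, yet two
// binds happen: one in the guard's constructor, one in its destructor.
uint32_t CreateScreenBufferSketch(FakeGL& gl, uint32_t newFB) {
  uint32_t seenInside = 0;
  {
    ScopedBindFB guard(gl, newFB);  // constructor binds newFB
    assert(gl.boundFB == newFB);
    seenInside = gl.boundFB;
  }  // destructor rebinds the previous framebuffer here
  return seenInside;
}
```

Calling CreateScreenBufferSketch(gl, 2) on a fresh FakeGL leaves two entries in bindLog, the second coming from the destructor at the closing brace. A breakpoint on the bind method would report the enclosing function as the caller for that second hit, especially if the destructor has been inlined, which is the same shape as the backtrace above.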

Now this is all well and good, but the real question is why this is happening correctly on ESR 78 but not ESR 91. When I eventually get to checking the ESR 91 code I discover it's because on ESR 91 the fBindFramebuffer() method doesn't have the same wrapper as on ESR 78; instead it goes straight for the library method:
  void fBindFramebuffer(GLenum target, GLuint framebuffer) {
    BEFORE_GL_CALL;
    mSymbols.fBindFramebuffer(target, framebuffer);
    AFTER_GL_CALL;
  }
On ESR 78 this has been wrapped and replaced by a call to raw_fBindFramebuffer(), which looks the same as the above snippet for fBindFramebuffer(). The fix, therefore, should be to add the same wrapper from ESR 78 into the ESR 91 code.
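
The difference between the two versions can be sketched with a pair of mock contexts. This is only an illustration under stated assumptions: FakeScreen, DirectContext, WrappedContext and BindsSeenByScreen are hypothetical names, the "driver" is just a counter, the switch on target is left out, and FakeScreen::BindFB() merely records the call rather than reproducing whatever GLScreenBuffer really does with it.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical screen-buffer stand-in: it only notes that it was told
// about a bind, which is the part that matters for this comparison.
struct FakeScreen {
  int bindsSeen = 0;
  uint32_t lastFB = 0;

  void BindFB(uint32_t fb) {
    ++bindsSeen;
    lastFB = fb;
  }
};

// ESR 91 style as described above: the bind goes straight to the driver,
// so the screen object never hears about it.
struct DirectContext {
  std::unique_ptr<FakeScreen> screen;
  int driverBinds = 0;

  void fBindFramebuffer(uint32_t fb) {
    (void)fb;  // a real context would pass this to the GL library
    ++driverBinds;
  }
};

// ESR 78 style: the public method is a wrapper that routes through the
// screen once one exists, and falls back to the raw call otherwise.
struct WrappedContext {
  std::unique_ptr<FakeScreen> screen;
  int driverBinds = 0;

  void raw_fBindFramebuffer(uint32_t fb) {
    (void)fb;
    ++driverBinds;
  }

  void fBindFramebuffer(uint32_t fb) {
    if (!screen) {
      raw_fBindFramebuffer(fb);
      return;
    }
    screen->BindFB(fb);
  }
};

// Binds once before a screen exists and once after, then reports how
// many of those binds the screen object was told about.
template <typename Context>
int BindsSeenByScreen() {
  Context ctx;
  ctx.fBindFramebuffer(0);  // no screen yet: goes to the driver
  ctx.screen.reset(new FakeScreen());
  ctx.fBindFramebuffer(0);  // screen exists: wrapper may intercept
  assert(ctx.driverBinds >= 1);
  return ctx.screen->bindsSeen;
}
```

With the direct version the screen sees no binds at all, whereas with the wrapped version it sees the one made after it was created. If the real screen buffer only updates its framebuffer fields when it's told about a bind, that would be consistent with the zeroed values seen on ESR 91.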

I've done this and checked it compiles. But to properly understand the resulting effect I'm going to need to step through the code again, which means a full rebuild will be needed. It's 21:31 in the evening now so, as you know, that means an overnight build. Probably now is a good time to stop anyway. So onward to tomorrow, when we'll see if this has made any practical difference!

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
