List items
Items from the current list are shown below.
Blog
3 Apr 2024 : Day 205 #
Time to test out my overnight build. Yesterday I added in some code that had got lost in the transition from ESR 78 to ESR 91. The code allows the name of the frame buffer objects created for reading and writing to be recorded for use within gecko when they get bound to a target.
This is achieved by (sort-of) overriding the call to fBindFramebuffer. Instead of just calling the relevant OpenGL function (in this case glBindFramebuffer() as all of the other GL library methods do, this new call does a bit more work before calling this underlying function. Here's the code for the underlying method (this is the same snippet I gave in my diary entry yesterday):
I built new packages overnight, but now when I run them this morning I immediately get a segmentation fault.
This clearly isn't just your standard invalid memory read or write; it's the program running out of stack due to an accidental recursion. Fascinating! But not intentional! The GLContext::fBindFramebuffer() method is calling GLScreenBuffer::BindFB()< which is again calling GLContext::fBindFramebuffer() creating a loop of pairs of calls repeatedly calling themselves over and over again. Let's go back to the code to find out why. As it happens we already saw the code for the two methods above. And sure enough fBindFramebuffer() calls mScreen->BindFB() which then goes on to call mGL->fBindFramebuffer().
This code is guaranteed to cause problems and I should have noticed it yesterday before setting the build to run. The code in the fBindFramebuffer() method came directly from ESR 78, so the issue must be a difference in implementation between the version of BindFB() on ESR 78 compared to ESR 91. So let's check the ESR 78 version of that:
There are many other cases where fBindFramebuffer() gets called but a quick search of the ESR 78 code shows that these are the only ones that have been switched to use raw_fBindFramebuffer(), other than the couple of additional calls that we already have inside our new fBindFramebuffer() override.
With any luck, that should be enough to fix the infinite calling loop. I've set the build going, so now it's a case of finding out what effect this will have tomorrow morning.
If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
This is achieved by (sort-of) overriding the call to fBindFramebuffer. Instead of just calling the relevant OpenGL function (in this case glBindFramebuffer() as all of the other GL library methods do, this new call does a bit more work before calling this underlying function. Here's the code for the underlying method (this is the same snippet I gave in my diary entry yesterday):
void GLContext::fBindFramebuffer(GLenum target, GLuint framebuffer) { if (!mScreen) { raw_fBindFramebuffer(target, framebuffer); return; } switch (target) { case LOCAL_GL_DRAW_FRAMEBUFFER_EXT: mScreen->BindDrawFB(framebuffer); return; case LOCAL_GL_READ_FRAMEBUFFER_EXT: mScreen->BindReadFB(framebuffer); return; case LOCAL_GL_FRAMEBUFFER: mScreen->BindFB(framebuffer); return; default: // Nothing we care about, likely an error. break; } raw_fBindFramebuffer(target, framebuffer); }In this code there are a couple of calls to raw_fBindFramebuffer(). These are the calls that execute the underlying OpenGL function. But there are also calls to GLScreenBuffer::BindDrawFB(), GLScreenBuffer::BindReadFB() and GLScreenBuffer::BindFB(). These all do some additional work, for example the code for the last of these looks like this:
void GLScreenBuffer::BindFB(GLuint fb) { GLuint drawFB = DrawFB(); GLuint readFB = ReadFB(); mUserDrawFB = fb; mUserReadFB = fb; mInternalDrawFB = (fb == 0) ? drawFB : fb; mInternalReadFB = (fb == 0) ? readFB : fb; if (mInternalDrawFB == mInternalReadFB) { mGL->fBindFramebuffer(LOCAL_GL_FRAMEBUFFER, mInternalDrawFB); } else { MOZ_ASSERT(mGL->IsSupported(GLFeature::split_framebuffer)); mGL->fBindFramebuffer(LOCAL_GL_DRAW_FRAMEBUFFER_EXT, mInternalDrawFB); mGL->fBindFramebuffer(LOCAL_GL_READ_FRAMEBUFFER_EXT, mInternalReadFB); } }As we can see, this records the value of fb inside the GLSCreenBuffer object and then goes on to perform some other OpenGL binding actions. Without the overriding GLContext::fBindFramebuffer() method the frame buffer names never get recorded. That could be quite a serious gap in the rendering process, so it'll be good to have it fixed.
I built new packages overnight, but now when I run them this morning I immediately get a segmentation fault.
$ harbour-webview [D] unknown:0 - QML debugging is enabled. Only use this in a safe environment. [...] [JavaScript Error: "Unexpected event profile-after-change" {file: "resource://gre/modules/URLQueryStrippingListService.jsm" line: 228}] observe@resource://gre/modules/URLQueryStrippingListService.jsm:228:12 Created LOG for EmbedPrefs Created LOG for EmbedLiteLayerManager Segmentation fault $When I run it through the debugger to get a backtrace I get an unexpected and unusual and unexpected result:
$ gdb harbour-webview [...] (gdb) r Starting program: /usr/bin/harbour-webview [...] Created LOG for EmbedLiteLayerManager Thread 36 "Compositor" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f1f7c07e0 (LWP 18578)] mozilla::gl::GLScreenBuffer::BindFB (this=0x7fc4697440, fb=fb@entry=2) at gfx/gl/GLScreenBuffer.cpp:306 306 void GLScreenBuffer::BindFB(GLuint fb) { (gdb) bt #0 mozilla::gl::GLScreenBuffer::BindFB (this=0x7fc4697440, fb=fb@entry=2) at gfx/gl/GLScreenBuffer.cpp:306 #1 0x0000007ff110e244 in mozilla::gl::GLContext::fBindFramebuffer ( this=0x7edc19aa50, target=target@entry=36160, framebuffer=2) at gfx/gl/GLContext.cpp:2256 #2 0x0000007ff1100ecc in mozilla::gl::GLScreenBuffer::BindFB (this=<optimized out>, fb=fb@entry=2) at gfx/gl/GLScreenBuffer.cpp:316 #3 0x0000007ff110e244 in mozilla::gl::GLContext::fBindFramebuffer ( this=0x7edc19aa50, target=target@entry=36160, framebuffer=2) at gfx/gl/GLContext.cpp:2256 #4 0x0000007ff1100ecc in mozilla::gl::GLScreenBuffer::BindFB (this=<optimized out>, fb=fb@entry=2) at gfx/gl/GLScreenBuffer.cpp:316 #5 0x0000007ff110e244 in mozilla::gl::GLContext::fBindFramebuffer ( this=0x7edc19aa50, target=target@entry=36160, framebuffer=2) at gfx/gl/GLContext.cpp:2256 #6 0x0000007ff1100ecc in mozilla::gl::GLScreenBuffer::BindFB (this=<optimized out>, fb=fb@entry=2) at gfx/gl/GLScreenBuffer.cpp:316 #7 0x0000007ff110e244 in mozilla::gl::GLContext::fBindFramebuffer ( this=0x7edc19aa50, target=target@entry=36160, framebuffer=2) at gfx/gl/GLContext.cpp:2256 #8 0x0000007ff1100ecc in mozilla::gl::GLScreenBuffer::BindFB (this=<optimized out>, fb=fb@entry=2) at gfx/gl/GLScreenBuffer.cpp:316 #9 0x0000007ff110e244 in mozilla::gl::GLContext::fBindFramebuffer ( this=0x7edc19aa50, target=target@entry=36160, framebuffer=2) at gfx/gl/GLContext.cpp:2256 #10 0x0000007ff1100ecc in mozilla::gl::GLScreenBuffer::BindFB (this=<optimized out>, fb=fb@entry=2) at gfx/gl/GLScreenBuffer.cpp:316 [...] #335 0x0000007ff110e244 in mozilla::gl::GLContext::fBindFramebuffer ( this=0x7edc19aa50, target=target@entry=36160, framebuffer=2) at gfx/gl/GLContext.cpp:2256 #336 0x0000007ff1100ecc in mozilla::gl::GLScreenBuffer::BindFB (this=<optimized out>, fb=fb@entry=2) at gfx/gl/GLScreenBuffer.cpp:316 #337 0x0000007ff110e244 in mozilla::gl::GLContext::fBindFramebuffer ( this=0x7edc19aa50, target=target@entry=36160, framebuffer=2) at gfx/gl/GLContext.cpp:2256 [...]It goes on and on and on until eventually I give up following it.
This clearly isn't just your standard invalid memory read or write; it's the program running out of stack due to an accidental recursion. Fascinating! But not intentional! The GLContext::fBindFramebuffer() method is calling GLScreenBuffer::BindFB()< which is again calling GLContext::fBindFramebuffer() creating a loop of pairs of calls repeatedly calling themselves over and over again. Let's go back to the code to find out why. As it happens we already saw the code for the two methods above. And sure enough fBindFramebuffer() calls mScreen->BindFB() which then goes on to call mGL->fBindFramebuffer().
This code is guaranteed to cause problems and I should have noticed it yesterday before setting the build to run. The code in the fBindFramebuffer() method came directly from ESR 78, so the issue must be a difference in implementation between the version of BindFB() on ESR 78 compared to ESR 91. So let's check the ESR 78 version of that:
void GLScreenBuffer::BindFB(GLuint fb) { GLuint drawFB = DrawFB(); GLuint readFB = ReadFB(); mUserDrawFB = fb; mUserReadFB = fb; mInternalDrawFB = (fb == 0) ? drawFB : fb; mInternalReadFB = (fb == 0) ? readFB : fb; if (mInternalDrawFB == mInternalReadFB) { mGL->raw_fBindFramebuffer(LOCAL_GL_FRAMEBUFFER, mInternalDrawFB); } else { MOZ_ASSERT(mGL->IsSupported(GLFeature::split_framebuffer)); mGL->raw_fBindFramebuffer(LOCAL_GL_DRAW_FRAMEBUFFER_EXT, mInternalDrawFB); mGL->raw_fBindFramebuffer(LOCAL_GL_READ_FRAMEBUFFER_EXT, mInternalReadFB); } }The difference is clear. While on ESR 78 the code has been updated to call the underlying raw_fBindFramebuffer() method, I didn't make this change on ESR 91. So, time to do that now. Apart from this small change the rest of the code in the BindFB() method looks identical. There are a few other changes to make though as it turns out there are similar issues going on with BindDrawFB(), BindReadFB() and BindReadFB_Internal() as well. The last of these is rather mysterious and doesn't seem to get called anywhere, but I've updated it just in case.
There are many other cases where fBindFramebuffer() gets called but a quick search of the ESR 78 code shows that these are the only ones that have been switched to use raw_fBindFramebuffer(), other than the couple of additional calls that we already have inside our new fBindFramebuffer() override.
With any luck, that should be enough to fix the infinite calling loop. I've set the build going, so now it's a case of finding out what effect this will have tomorrow morning.
If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comments
Uncover Disqus comments