28 Feb 2024 : Day 170
Today I woke up to discover a bad result. The build I started yesterday stalled about half way through. This does happen very occasionally, but honestly since I dropped down to just using a single process, it's barely happened at all. So that's more than a little annoying. Nevertheless I've woken up early today and it does at least mean that my first task of the day is an easy one: kick off the build once again.
So here goes... Once it's done, I'll give the changes I made yesterday a go to see whether they've fixed the segfault.
[...]
Finally the build completed, second time lucky it seems. So now the SwapChain, the SurfaceFactory and the SharedSurface back buffer should all be created in that order, which should also be the correct order. Let's find out.
Now there's still a crash, but it does at least get further than last time:
    $ harbour-webview
    [D] unknown:0 - QML debugging is enabled. Only use this in a safe environment.
    [D] main:30 - WebView Example
    [D] main:44 - Using default start URL: "https://www.flypig.co.uk/search/"
    [D] main:47 - Opening webview
    [D] unknown:0 - Using Wayland-EGL
    library "libutils.so" not found
    [...]
    Created LOG for EmbedLiteLayerManager
    =============== Preparing offscreen rendering context ===============
    CONSOLE message:
    OpenGL compositor Initialized Succesfully.
    Version: OpenGL ES 3.2 V@0502.0 (GIT@704ecd9a2b, Ib3f3e69395, 1609240670) (Date:12/29/20)
    Vendor: Qualcomm
    Renderer: Adreno (TM) 619
    FBO Texture Target: TEXTURE_2D
    JSScript: ContextMenuHandler.js loaded
    JSScript: SelectionPrototype.js loaded
    JSScript: SelectionHandler.js loaded
    JSScript: SelectAsyncHelper.js loaded
    JSScript: FormAssistant.js loaded
    JSScript: InputMethodHandler.js loaded
    EmbedHelper init called
    Available locales: en-US, fi, ru
    Frame script: embedhelper.js loaded
    Segmentation fault

That's without the debugger. To find out where precisely it's crashing we can execute it again, but this time with the debugger attached:
    $ gdb harbour-webview
    GNU gdb (GDB) Mer (8.2.1+git9)
    [...]
    (gdb) r
    Starting program: /usr/bin/harbour-webview
    [...]
    Thread 36 "Compositor" received signal SIGSEGV, Segmentation fault.
    [Switching to LWP 13568]
    0x0000007ff110a378 in mozilla::gl::SwapChain::Size (this=this@entry=0x7ed81ce090)
        at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
    290 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h: No such file or directory.
    (gdb) bt
    #0 0x0000007ff110a378 in mozilla::gl::SwapChain::Size (this=this@entry=0x7ed81ce090)
        at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
    #1 0x0000007ff3667cc8 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::
        PresentOffscreenSurface (this=0x7fc4b41c20)
        at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:199
    #2 0x0000007ff3680fe0 in mozilla::embedlite::nsWindow::PostRender (this=0x7fc4c331e0, aContext=<optimized out>)
        at mobile/sailfishos/embedshared/nsWindow.cpp:248
    #3 0x0000007ff2a664fc in mozilla::widget::InProcessCompositorWidget::PostRender (this=0x7fc4658990, aContext=0x7f17ae4848)
        at widget/InProcessCompositorWidget.cpp:60
    #4 0x0000007ff1291074 in mozilla::layers::LayerManagerComposite::Render (this=this@entry=0x7ed81afa80, aInvalidRegion=..., aOpaqueRegion=...)
        at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
        Compositor.h:575
    #5 0x0000007ff12914f0 in mozilla::layers::LayerManagerComposite::
        UpdateAndRender (this=this@entry=0x7ed81afa80)
        at gfx/layers/composite/LayerManagerComposite.cpp:657
    #6 0x0000007ff12918a0 in mozilla::layers::LayerManagerComposite::
        EndTransaction (this=this@entry=0x7ed81afa80, aTimeStamp=..., aFlags=aFlags@entry=mozilla::layers::LayerManager::END_DEFAULT)
        at gfx/layers/composite/LayerManagerComposite.cpp:572
    #7 0x0000007ff12d303c in mozilla::layers::CompositorBridgeParent::
        CompositeToTarget (this=0x7fc4b41c20, aId=..., aTarget=0x0, aRect=<optimized out>)
        at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
    #8 0x0000007ff12b8784 in mozilla::layers::CompositorVsyncScheduler::Composite (this=0x7fc4d01e30, aVsyncEvent=...)
        at gfx/layers/ipc/CompositorVsyncScheduler.cpp:256
    #9 0x0000007ff12b0bfc in mozilla::detail::RunnableMethodArguments
        <mozilla::VsyncEvent>::applyImpl<mozilla::layers::CompositorVsyncScheduler,
        void (mozilla::layers::CompositorVsyncScheduler::*)(mozilla::VsyncEvent const&),
        StoreCopyPassByConstLRef<mozilla::VsyncEvent>, 0ul> (args=..., m=<optimized out>, o=<optimized out>)
        at ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsThreadUtils.h:887
    [...]
    #21 0x0000007ff6a0489c in ?? () from /lib64/libc.so.6
    (gdb) p mFrontBuffer
    $1 = std::shared_ptr<mozilla::gl::SharedSurface> (empty) = {get() = 0x0}
    (gdb)

Looking at the above, it seems that the back buffer isn't causing a crash any more. The problem now seems to be the front buffer. That's okay: that's progress!

There are only two situations in which the front buffer gets set. First it happens if the SwapChainPresenter destructor is called. In this case the back buffer held by the presenter is moved into the front buffer, then the presenter's back buffer is set to null. Second it happens when the SwapChain::Swap() method is called. In this case the back buffer held by the presenter and the front buffer held by the swap chain are switched. In some sense, the Swap() method isn't really going to help us because if the front buffer is null beforehand, afterwards the back buffer will be null, which is also no good.
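To make those two situations concrete, here's a minimal sketch of the ownership hand-over. It is not the Gecko code: Surface, SwapChainModel and PresenterModel are hypothetical stand-ins invented for illustration, and the only behaviour modelled is the two ways the front buffer gets set as described above.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for SharedSurface; only its existence matters here.
struct Surface {
  int width;
  int height;
};

struct PresenterModel;

// Hypothetical, simplified swap chain: it owns only the front buffer.
struct SwapChainModel {
  std::shared_ptr<Surface> frontBuffer;  // empty until something sets it

  // Situation 2: exchange the presenter's back buffer with the front buffer.
  void Swap(PresenterModel& presenter);
};

// Hypothetical, simplified presenter: it owns the back buffer.
struct PresenterModel {
  SwapChainModel& chain;
  std::shared_ptr<Surface> backBuffer;

  PresenterModel(SwapChainModel& aChain, std::shared_ptr<Surface> aBack)
      : chain(aChain), backBuffer(std::move(aBack)) {}

  // Situation 1: on destruction the back buffer moves into the front buffer,
  // leaving the presenter's back buffer null.
  ~PresenterModel() {
    chain.frontBuffer = std::move(backBuffer);
    backBuffer = nullptr;
  }
};

inline void SwapChainModel::Swap(PresenterModel& presenter) {
  std::swap(frontBuffer, presenter.backBuffer);
}

// Returns true if destroying the presenter populated the front buffer.
bool FrontIsSetAfterPresenterDestroyed() {
  SwapChainModel chain;
  {
    PresenterModel presenter(chain,
                             std::make_shared<Surface>(Surface{640, 480}));
    assert(!chain.frontBuffer);  // nothing has set the front buffer yet
  }
  return chain.frontBuffer != nullptr;
}

// Returns true if swapping against an empty front buffer leaves the
// presenter with a null back buffer (and the chain with the old back buffer).
bool BackIsNullAfterSwapWithEmptyFront() {
  SwapChainModel chain;
  PresenterModel presenter(chain,
                           std::make_shared<Surface>(Surface{640, 480}));
  chain.Swap(presenter);
  return presenter.backBuffer == nullptr && chain.frontBuffer != nullptr;
}
```

Both helper functions return true in this model. The second one shows the awkward part: a swap against an empty front buffer just moves the null pointer over to the back buffer side.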
Checking the ESR 78 code, there is no mFrontBuffer variable, but there is an mFront which appears to be doing ostensibly the same thing. The mFront is only ever used to switch the back buffer into it, or to be accessed by EmbedLiteCompositorBridgeParent::GetPlatformImage(). In the latter case it's used, but not set.
So the arrangement isn't so dissimilar. Perhaps the main difference is that in ESR 78 there's no call to get the size of the front buffer as there is in ESR 91. Just as a reminder again: it's this size request that's causing the crash.
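As an illustration of why the size request is the step that falls over, here's a small sketch of a size accessor that tolerates a missing front buffer. Everything in it is an assumption made for the example: SizeModel, SurfaceModel, GuardedSwapChain and CanPublish are hypothetical names, not Gecko types, and this isn't a fix for the renderer. A guard like this would only turn the null dereference into an "empty size" result; it wouldn't make a front buffer appear.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for gfx::IntSize.
struct SizeModel {
  int width;
  int height;
  bool IsEmpty() const { return width <= 0 || height <= 0; }
};

// Hypothetical stand-in for SharedSurface.
struct SurfaceModel {
  SizeModel size;
};

// Hypothetical swap chain whose Size() tolerates a missing front buffer.
struct GuardedSwapChain {
  std::shared_ptr<SurfaceModel> frontBuffer;  // may be empty

  SizeModel Size() const {
    if (!frontBuffer) {
      // No front buffer yet: report an empty size rather than dereference.
      return SizeModel{0, 0};
    }
    return frontBuffer->size;
  }
};

// An empty size is treated as "nothing to publish".
bool CanPublish(const GuardedSwapChain& chain) {
  return !chain.Size().IsEmpty();
}

// Small self-check exercising both the empty and the populated case.
void CheckGuardedSize() {
  GuardedSwapChain chain;
  assert(!CanPublish(chain));
  chain.frontBuffer =
      std::make_shared<SurfaceModel>(SurfaceModel{SizeModel{640, 480}});
  assert(CanPublish(chain));
}
```

With an empty front buffer CanPublish() returns false instead of crashing, which only moves the problem along: something still has to populate the front buffer before a frame can be published.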
In ESR 78 the Swap() method is called from PublishFrame(), which is called from EmbedLiteCompositorBridgeParent::PresentOffscreenSurface(). It would be good to try to find out whether there's anything tying these together, to understand the sequencing, but the code is too convoluted for me to figure that out by hand.
So, instead, I'm going to look at the call to SwapChain::Size(). This is a call I added myself on top of the changes since ESR 91 and which doesn't have an immediately obvious equivalent call in ESR 78, so there must have been some reason why I added it.
Looking at the code in ESR 78 I can see that this is the reason I added this call:
    GLScreenBuffer* screen = context->Screen();
    MOZ_ASSERT(screen);

    if (screen->Size().IsEmpty() || !screen->PublishFrame(screen->Size())) {
      NS_ERROR("Failed to publish context frame");
    }

Compare that to the attempt I made to replicate the functionality in ESR 91:
    // TODO: The switch from GLSCreenBuffer to SwapChain needs completing
    // See: https://phabricator.services.mozilla.com/D75055
    SwapChain* swapChain = context->GetSwapChain();
    MOZ_ASSERT(swapChain);

    const gfx::IntSize& size = swapChain->Size();
    if (size.IsEmpty() || !swapChain->PublishFrame(size)) {
      NS_ERROR("Failed to publish context frame");
    }

The obvious question is: what is context->Screen() returning in ESR 78, and where is it created? Unfortunately the answer is complex. It's returning the following member of GLContext:
    UniquePtr<GLScreenBuffer> mScreen;
    [...]
    GLScreenBuffer* Screen() const { return mScreen.get(); }

This gets created from a call to GLContext::InitOffscreen(), like this:
    Delete all breakpoints? (y or n) y
    (gdb) b CreateScreenBufferImpl
    Breakpoint 7 at 0x7fb8e837d8: file gfx/gl/GLContext.cpp, line 2120.
    (gdb) r
    The program being debugged has been started already.
    Start it from the beginning? (y or n) y
    Starting program: /usr/bin/harbour-webview
    [...]
    Thread 36 "Compositor" hit Breakpoint 7, mozilla::gl::GLContext::
        CreateScreenBufferImpl (this=this@entry=0x7eac109140, size=..., caps=...)
        at gfx/gl/GLContext.cpp:2120
    2120 const SurfaceCaps& caps) {
    (gdb) bt
    #0 mozilla::gl::GLContext::CreateScreenBufferImpl (this=this@entry=0x7eac109140, size=..., caps=...)
        at gfx/gl/GLContext.cpp:2120
    #1 0x0000007fb8e838ec in mozilla::gl::GLContext::CreateScreenBuffer (caps=..., size=..., this=0x7eac109140)
        at gfx/gl/GLContext.h:3517
    #2 mozilla::gl::GLContext::InitOffscreen (this=0x7eac109140, size=..., caps=...)
        at gfx/gl/GLContext.cpp:2578
    #3 0x0000007fb8e83ac8 in mozilla::gl::GLContextProviderEGL::CreateOffscreen (size=..., minCaps=..., flags=flags@entry=mozilla::gl::CreateContextFlags::
        REQUIRE_COMPAT_PROFILE, out_failureId=out_failureId@entry=0x7fa50ed378)
        at gfx/gl/GLContextProviderEGL.cpp:1443
    #4 0x0000007fb8ee475c in mozilla::layers::CompositorOGL::CreateContext (this=0x7eac003420)
        at gfx/layers/opengl/CompositorOGL.cpp:250
    #5 mozilla::layers::CompositorOGL::CreateContext (this=0x7eac003420)
        at gfx/layers/opengl/CompositorOGL.cpp:223
    #6 0x0000007fb8f053bc in mozilla::layers::CompositorOGL::Initialize (this=0x7eac003420, out_failureReason=0x7fa50ed730)
        at gfx/layers/opengl/CompositorOGL.cpp:374
    #7 0x0000007fb8fdcf7c in mozilla::layers::CompositorBridgeParent::NewCompositor (this=this@entry=0x7f8c99d3f0, aBackendHints=...)
        at gfx/layers/ipc/CompositorBridgeParent.cpp:1534
    #8 0x0000007fb8fe65e8 in mozilla::layers::CompositorBridgeParent::
        InitializeLayerManager (this=this@entry=0x7f8c99d3f0, aBackendHints=...)
        at gfx/layers/ipc/CompositorBridgeParent.cpp:1491
    #9 0x0000007fb8fe6730 in mozilla::layers::CompositorBridgeParent::
        AllocPLayerTransactionParent (this=this@entry=0x7f8c99d3f0, aBackendHints=..., aId=...)
        at gfx/layers/ipc/CompositorBridgeParent.cpp:1587
    #10 0x0000007fbb2e31b4 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::
        AllocPLayerTransactionParent (this=0x7f8c99d3f0, aBackendHints=..., aId=...)
        at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:77
    #11 0x0000007fb88c13d0 in mozilla::layers::PCompositorBridgeParent::
        OnMessageReceived (this=0x7f8c99d3f0, msg__=...)
        at PCompositorBridgeParent.cpp:1391
    [...]
    #27 0x0000007fbe70d89c in ?? () from /lib64/libc.so.6
    (gdb)

Recall that the call to CreateOffscreen() at frame 3 is now a call to CreateHeadless(). And it looks like that's where things really start to diverge.
After thinking long and hard about this I don't think it's going to be possible to fit everything that's needed into the current SwapChain structure. So tomorrow I'm going to start putting back in all of the pieces from ESR 78 that were ripped out of ESR 91. This should be a much more tractable exercise than trying to reconstruct the functionality from scratch. Once I've got a working renderer I can then take the diff and try to fit as much of what's needed as possible into the swap chain structure.
But I'm not going to be able to do that today as it's time for me to head to bed. I'll pick this up in the morning.
If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.