flypig.co.uk

Gecko

23 Feb 2024 : Day 165
Today I'm following up on the task I set out for myself yesterday:

"Tomorrow I'll do a sweep of the other code to check whether any attempt is being made to initialise it somewhere else. If not I'll add in some initialisation code to see what happens."

As you may recall, the WebView is crashing because the SwapChain returned from the context is null. It should be created somewhere, but it's not yet clear where. So the first question is whether there's code to create it that isn't being called, or whether no such code exists at all.

A quick grep of the code throws up a few places where it might be created:
$ grep -rIn "new SwapChain(" *
embedding/embedlite/embedthread/EmbedLiteCompositorBridgeParent.cpp:128:
    swapChain = new SwapChain();
gecko-dev/dom/webgpu/CanvasContext.cpp:95:
    mSwapChain = new SwapChain(aDesc, extent, mExternalImageId, format);
gecko-dev/dom/webgpu/CanvasContext.cpp:139:
    mSwapChain = new SwapChain(desc, extent, mExternalImageId, gfxFormat);
The most promising of these is the code in EmbedLiteCompositorBridgeParent.cpp, since that's EmbedLite-specific code. In fact, it's probably something I added myself during the ESR 91 changes (since SwapChain is new to ESR 91):
$ git blame \
    embedding/embedlite/embedthread/EmbedLiteCompositorBridgeParent.cpp \
    -L 128,128
d59d44a5bccaf (David Llewellyn-Jones 2023-09-04 09:03:49 +0100 128)
    swapChain = new SwapChain();
Confirmed. I even added a note to myself at the time to explain that this might need fixing:
$ git blame \
    embedding/embedlite/embedthread/EmbedLiteCompositorBridgeParent.cpp \
    -L 113,114
d59d44a5bccaf (David Llewellyn-Jones 2023-09-04 09:03:49 +0100 113)
    // TODO: The switch from GLSCreenBuffer to SwapChain needs completing
d59d44a5bccaf (David Llewellyn-Jones 2023-09-04 09:03:49 +0100 114)
    // See: https://phabricator.services.mozilla.com/D75055
Here's the relevant bit of code:
void
EmbedLiteCompositorBridgeParent::PrepareOffscreen()
{
  fprintf(stderr,
      "=============== Preparing offscreen rendering context ===============\n");

  const CompositorBridgeParent::LayerTreeState* state =
      CompositorBridgeParent::GetIndirectShadowTree(RootLayerTreeId());
  NS_ENSURE_TRUE(state && state->mLayerManager, );

  GLContext* context = static_cast<CompositorOGL*>(state->mLayerManager->
      GetCompositor())->gl();
  NS_ENSURE_TRUE(context, );

  // TODO: The switch from GLSCreenBuffer to SwapChain needs completing
  // See: https://phabricator.services.mozilla.com/D75055
  if (context->IsOffscreen()) {
    UniquePtr<SurfaceFactory> factory;
    if (context->GetContextType() == GLContextType::EGL) {
      // [Basic/OGL Layers, OMTC] WebGL layer init.
      factory = SurfaceFactory_EGLImage::Create(*context);
    } else {
      // [Basic Layers, OMTC] WebGL layer init.
      // Well, this *should* work...
      factory = MakeUnique<SurfaceFactory_Basic>(*context);
    }

    SwapChain* swapChain = context->GetSwapChain();
    if (swapChain == nullptr) {
      swapChain = new SwapChain();
      new SwapChainPresenter(*swapChain);
      context->mSwapChain.reset(swapChain);
    }

    if (factory) {
      swapChain->Morph(std::move(factory));
    }
  }
}
So either this method isn't being run, or context->IsOffscreen() must be returning false. Let's find out. Conveniently I can once again continue with my debugging session from yesterday:
(gdb) delete break
Delete all breakpoints? (y or n) y
(gdb) b EmbedLiteCompositorBridgeParent::PrepareOffscreen
Breakpoint 10 at 0x7ff366808c: file mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp, line 104.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/bin/harbour-webview 
[...]

Thread 36 "Compositor" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 14342]
mozilla::gl::SwapChain::OffscreenSize (this=0x0)
    at gfx/gl/GLScreenBuffer.cpp:129
129       return mPresenter->mBackBuffer->mFb->mSize;
(gdb) bt
#0  mozilla::gl::SwapChain::OffscreenSize (this=0x0)
    at gfx/gl/GLScreenBuffer.cpp:129
#1  0x0000007ff3667930 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    CompositeToDefaultTarget (this=0x7fc4aebc90, aId=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
#2  0x0000007ff12b808c in mozilla::layers::CompositorVsyncScheduler::
    ForceComposeToTarget (this=0x7fc4d0b230, aTarget=aTarget@entry=0x0, 
    aRect=aRect@entry=0x0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/LayersTypes.h:82
#3  0x0000007ff12b80e8 in mozilla::layers::CompositorBridgeParent::
    ResumeComposition (this=this@entry=0x7fc4aebc90)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#4  0x0000007ff12b8174 in mozilla::layers::CompositorBridgeParent::
    ResumeCompositionAndResize (this=0x7fc4aebc90, x=<optimized out>,
    y=<optimized out>, width=<optimized out>, height=<optimized out>)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:794
#5  0x0000007ff12b0d10 in mozilla::detail::RunnableMethodArguments<int, int,
    int, int>::applyImpl<mozilla::layers::CompositorBridgeParent, void
    (mozilla::layers::CompositorBridgeParent::*)(int, int, int, int),
    StoreCopyPassByConstLRef<int>, StoreCopyPassByConstLRef<int>,
    StoreCopyPassByConstLRef<int>, StoreCopyPassByConstLRef<int>, 0ul, 1ul,
    2ul, 3ul> (args=..., m=<optimized out>, o=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsThreadUtils.h:1151
#6  mozilla::detail::RunnableMethodArguments<int, int, int, int>::apply
    <mozilla::layers::CompositorBridgeParent, void (mozilla::layers::
    CompositorBridgeParent::*)(int, int, int, int)> (m=<optimized out>,
    o=<optimized out>, this=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsThreadUtils.h:1154
[...]
#17 0x0000007ff6a0489c in ?? () from /lib64/libc.so.6
(gdb) 
So the crash is happening before PrepareOffscreen() is called. Looking through the code there's actually only one place it is called and that's inside EmbedLiteCompositorBridgeParent::AllocPLayerTransactionParent(). There it's wrapped in a condition:
  if (!StaticPrefs::embedlite_compositor_external_gl_context()) {
    // Prepare Offscreen rendering context
    PrepareOffscreen();
  }
So I should check whether the problem is that AllocPLayerTransactionParent() isn't being called, or that the condition is false. A quick debug confirms that it's the latter: the method is entered, but the value of the static preference means the PrepareOffscreen() call is never made:
Thread 36 "Compositor" hit Breakpoint 11, mozilla::embedlite::
    EmbedLiteCompositorBridgeParent::AllocPLayerTransactionParent
    (this=0x7fc4bc68c0, aBackendHints=..., aId=...)
    at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:79
79      {
(gdb) n
80        PLayerTransactionParent* p =
(gdb) n
83        EmbedLiteWindowParent *parentWindow = EmbedLiteWindowParent::From(mWindowId);
(gdb) n
84        if (parentWindow) {
(gdb) n
85          parentWindow->GetListener()->CompositorCreated();
(gdb) n
88        if (!StaticPrefs::embedlite_compositor_external_gl_context()) {
(gdb) n
mozilla::layers::PCompositorBridgeParent::OnMessageReceived
    (this=0x7fc4bc68c0, msg__=...) at PCompositorBridgeParent.cpp:1286
1286    PCompositorBridgeParent.cpp: No such file or directory.
(gdb) 
As we can see, it comes down to this embedlite.compositor.external_gl_context static preference, which needs to be set to false for the body of the condition to be executed.
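To make that dependency concrete, here's a minimal, self-contained model of the guard. It's a sketch rather than the real Gecko code: FakeContext, FakeSwapChain, prepareOffscreen(), allocLayerTransaction() and hasSwapChainAfterAlloc() are all hypothetical stand-ins, and the preference is passed in as a plain bool rather than being read from StaticPrefs.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; these are not the real Gecko classes.
struct FakeSwapChain {};

struct FakeContext {
  std::unique_ptr<FakeSwapChain> swapChain;  // stays null until prepared
};

// Models PrepareOffscreen(): create the swap chain only if it's missing.
void prepareOffscreen(FakeContext& context) {
  if (!context.swapChain) {
    context.swapChain = std::make_unique<FakeSwapChain>();
  }
  assert(context.swapChain);  // postcondition: a swap chain now exists
}

// Models the guard in AllocPLayerTransactionParent(): preparation is
// skipped entirely when the external GL context flag is true.
void allocLayerTransaction(FakeContext& context, bool externalGlContext) {
  if (!externalGlContext) {
    prepareOffscreen(context);
  }
}

// Helper: does a fresh context end up with a swap chain after allocation?
bool hasSwapChainAfterAlloc(bool externalGlContext) {
  FakeContext context;
  allocLayerTransaction(context, externalGlContext);
  return context.swapChain != nullptr;
}
```

In this model, calling hasSwapChainAfterAlloc(true) leaves the context without a swap chain, which is the situation gdb showed for the WebView, while hasSwapChainAfterAlloc(false) goes through the preparation step and creates one.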

This preference isn't being set for the WebView, although it is set for the browser:
$ pushd ../sailfish-browser/
$ grep -rIn "embedlite.compositor.external_gl_context" *
apps/core/declarativewebutils.cpp:239:
    webEngineSettings->setPreference(QString(
    "embedlite.compositor.external_gl_context"), QVariant(true));
data/prefs.js:5:
    user_pref("embedlite.compositor.external_gl_context", true);
tests/auto/mocks/declarativewebutils/declarativewebutils.cpp:62:
    webEngineSettings->setPreference(QString(
    "embedlite.compositor.external_gl_context"), QVariant(true));
$ popd
I'm going to set it to false explicitly for the WebView. But this immediately makes me feel nervous: this setting isn't new and there's a reason it's not being touched in the WebView code. It makes me think that I'm travelling down a rendering pipeline path that I shouldn't be.

So as well as trying out this change I'm also going to ask for some expert advice from the Jolla team about this, just in case it's actually important that I don't set this to false and that the real issue is somewhere else.

But, it's the start of my work day, so that will all have to wait until later.

[...]

I've added in the change to set the embedlite.compositor.external_gl_context static preference to false:
diff --git a/lib/webenginesettings.cpp b/lib/webenginesettings.cpp
index de9e4b86..780f6555 100644
--- a/lib/webenginesettings.cpp
+++ b/lib/webenginesettings.cpp
@@ -110,6 +110,12 @@ void SailfishOS::WebEngineSettings::initialize()
     engineSettings->setPreference(QStringLiteral("intl.accept_languages"),
                                   QVariant::fromValue<QString>(langs));
 
+    // Ensure the renderer is configured correctly
+    engineSettings->setPreference(QStringLiteral("gfx.webrender.force-disabled"),
+                                  QVariant(true));
+    engineSettings->setPreference(QStringLiteral("embedlite.compositor.external_gl_context"),
+                                  QVariant(false));
+
     Silica::Theme *silicaTheme = Silica::Theme::instance();
 
     // Notify gecko when the ambience switches between light and dark
The code has successfully built; now it's time to test it. On a dry run of the new code it crashes seemingly somewhere close to where it was crashing before. But crucially the debug print from inside the PrepareOffscreen() method is now being output. So we've definitely moved a step forwards.
$ harbour-webview
[D] unknown:0 - QML debugging is enabled. Only use this in a safe environment.
[D] main:30 - WebView Example
[D] main:44 - Using default start URL:  "https://www.flypig.co.uk/search/"
[D] main:47 - Opening webview
[D] unknown:0 - Using Wayland-EGL
library "libutils.so" not found
[...]
JSComp: UserAgentOverrideHelper.js loaded
UserAgentOverrideHelper app-startup
CONSOLE message:
[JavaScript Error: "Unexpected event profile-after-change" {file:
    "resource://gre/modules/URLQueryStrippingListService.jsm" line: 228}]
observe@resource://gre/modules/URLQueryStrippingListService.jsm:228:12

Created LOG for EmbedPrefs
Created LOG for EmbedLiteLayerManager
=============== Preparing offscreen rendering context ===============
Segmentation fault
To find out what's going on we can step through. And after the new install it's finally time to start a new debugging session.
$ gdb harbour-webview 
GNU gdb (GDB) Mer (8.2.1+git9)
[...]
(gdb) b EmbedLiteCompositorBridgeParent::PrepareOffscreen
Function "EmbedLiteCompositorBridgeParent::PrepareOffscreen" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (EmbedLiteCompositorBridgeParent::PrepareOffscreen) pending.
(gdb) r
[...]
Thread 36 "Compositor" hit Breakpoint 1, mozilla::embedlite::
    EmbedLiteCompositorBridgeParent::PrepareOffscreen
    (this=this@entry=0x7fc4be7560)
    at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:104
104     {
(gdb) n
[LWP 18280 exited]
105       fprintf(stderr,
    "=============== Preparing offscreen rendering context ===============\n");
(gdb) 
=============== Preparing offscreen rendering context ===============
107       const CompositorBridgeParent::LayerTreeState* state =
    CompositorBridgeParent::GetIndirectShadowTree(RootLayerTreeId());
(gdb) 
108       NS_ENSURE_TRUE(state && state->mLayerManager, );
(gdb) 
110       GLContext* context = static_cast<CompositorOGL*>
    (state->mLayerManager->GetCompositor())->gl();
(gdb) p context
$1 = <optimized out>
(gdb) n
111       NS_ENSURE_TRUE(context, );
(gdb) n
3540    ${PROJECT}/obj-build-mer-qt-xr/dist/include/GLContext.h:
    No such file or directory.
(gdb) n
117         if (context->GetContextType() == GLContextType::EGL) {
(gdb) n
119           factory = SurfaceFactory_EGLImage::Create(*context);
(gdb) n
126         SwapChain* swapChain = context->GetSwapChain();
(gdb) n
127         if (swapChain == nullptr) {
(gdb) n
128           swapChain = new SwapChain();
(gdb) n
129           new SwapChainPresenter(*swapChain);
(gdb) n
130           context->mSwapChain.reset(swapChain);
(gdb) n
133         if (factory) {
(gdb) n
134           swapChain->Morph(std::move(factory));
(gdb) n
mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    AllocPLayerTransactionParent (this=0x7fc4be7560, aBackendHints=..., aId=...)
    at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:92
92        return p;
(gdb) c
Continuing.

Thread 36 "Compositor" received signal SIGSEGV, Segmentation fault.
0x0000007ff110a38c in mozilla::gl::SwapChain::OffscreenSize
    (this=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
290     ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:
    No such file or directory.
(gdb) bt
#0  0x0000007ff110a38c in mozilla::gl::SwapChain::OffscreenSize
    (this=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
#1  0x0000007ff3667930 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    CompositeToDefaultTarget (this=0x7fc4be7560, aId=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
#2  0x0000007ff12b808c in mozilla::layers::CompositorVsyncScheduler::
    ForceComposeToTarget (this=0x7fc4d0b1a0, aTarget=aTarget@entry=0x0, 
    aRect=aRect@entry=0x0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
    LayersTypes.h:82
#3  0x0000007ff12b80e8 in mozilla::layers::CompositorBridgeParent::
    ResumeComposition (this=this@entry=0x7fc4be7560)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#4  0x0000007ff12b8174 in mozilla::layers::CompositorBridgeParent::
    ResumeCompositionAndResize (this=0x7fc4be7560, x=<optimized out>,
    y=<optimized out>, width=<optimized out>, height=<optimized out>)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:794
#5  0x0000007ff12b0d10 in mozilla::detail::RunnableMethodArguments<int, int,
    int, int>::applyImpl<mozilla::layers::CompositorBridgeParent, void
    (mozilla::layers::CompositorBridgeParent::*)(int, int, int, int),
    StoreCopyPassByConstLRef<int>, StoreCopyPassByConstLRef<int>,
    StoreCopyPassByConstLRef<int>, StoreCopyPassByConstLRef<int>, 0ul, 1ul,
    2ul, 3ul> (args=..., m=<optimized out>, o=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsThreadUtils.h:1151
[...]
#17 0x0000007ff6a0489c in ?? () from /lib64/libc.so.6
(gdb) 
Here's the code that's causing the crash:
const gfx::IntSize& SwapChain::OffscreenSize() const {
  return mPresenter->mBackBuffer->mFb->mSize;
}
It would be helpful to know which of these values is null, but unhelpfully the values have been optimised out.
(gdb) p mPresenter
value has been optimized out
(gdb) p this
$2 = <optimized out>
However if we go up a stack frame we can have better luck, applying the trick we used on Day 164 to extract the SwapChain object from the context via the UniquePtr class:
(gdb) frame 1
#1  0x0000007ff3667930 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    CompositeToDefaultTarget (this=0x7fc4be7560, aId=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
290     ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:
    No such file or directory.
(gdb) p context
$3 = (mozilla::gl::GLContext *) 0x7ee019ee40
(gdb) p context->mSwapChain.mTuple.mFirstA
$5 = (mozilla::gl::SwapChain *) 0x7ee01ce090
(gdb) p context->mSwapChain.mTuple.mFirstA->mPresenter
$6 = (mozilla::gl::SwapChainPresenter *) 0x7ee01a1380
(gdb) p context->mSwapChain.mTuple.mFirstA->mPresenter->mBackBuffer
$7 = std::shared_ptr<mozilla::gl::SharedSurface> (empty) = {get() = 0x0}
(gdb) p context->mSwapChain.mTuple.mFirstA->mPresenter->mBackBuffer->mFb
Cannot access memory at address 0x20
(gdb) 
As we can see from this, the SwapChain and its SwapChainPresenter both exist now; the null value is mBackBuffer, the shared pointer held by the SwapChainPresenter.
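The link-by-link check done by hand in gdb can also be written as code. The following is only a sketch under stated assumptions: the structs are hypothetical stand-ins that borrow the member names seen in the debugger (mPresenter, mBackBuffer, mFb), not the real Gecko types, and firstMissingLink() and stateSeenInDebugger() are names invented for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins mirroring only the member names seen in gdb.
struct Framebuffer { int width = 0; int height = 0; };
struct SharedSurface { std::unique_ptr<Framebuffer> mFb; };
struct Presenter { std::shared_ptr<SharedSurface> mBackBuffer; };
struct SwapChain { Presenter* mPresenter = nullptr; };

// Walks swapChain->mPresenter->mBackBuffer->mFb one link at a time and
// returns the name of the first null link, or "none" if all are set.
std::string firstMissingLink(const SwapChain* swapChain) {
  if (!swapChain) return "swapChain";
  if (!swapChain->mPresenter) return "mPresenter";
  if (!swapChain->mPresenter->mBackBuffer) return "mBackBuffer";
  if (!swapChain->mPresenter->mBackBuffer->mFb) return "mFb";
  return "none";
}

// Rebuilds the state gdb reported: the swap chain and presenter exist,
// but the back buffer is an empty shared_ptr.
std::string stateSeenInDebugger() {
  Presenter presenter;
  assert(!presenter.mBackBuffer);  // default-constructed shared_ptr is empty
  SwapChain swapChain;
  swapChain.mPresenter = &presenter;
  return firstMissingLink(&swapChain);
}
```

Checking each pointer before following it is what separates a clean diagnosis from a segfault: the one-line accessor dereferences the whole chain unconditionally, so the first empty link brings the compositor thread down.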

It's clear what the next step is: find out why the mBackBuffer value isn't being set and, if necessary, set it. But that's a task that'll have to wait until tomorrow.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
