flypig.co.uk

Gecko

10 Mar 2024 : Day 181
Today it's time to test out the changes I made yesterday, adding in the SharedSurface_Basic functionality that got lost in the transition to ESR 91. The key change is that ProdTexture() will now be overridden, so the call to it from SurfaceFactory during initialisation should, I'm hoping, no longer trigger a crash.

One of the downsides of using the massive partial libxul.so builds packed full of debugging information is that they take up so much room on the device. The first attempt to reinstall the packages fails for lack of space, but after deleting the existing library the reinstallation goes through without a hitch. It does strike me as a little odd that the calculation of how much space is needed doesn't take into account how much will be removed as well as how much will be added. I guess this is important for allowing transactional updates.
$ rpm -U xulrunner-qt5-91.*.rpm xulrunner-qt5-debuginfo-91.*.rpm \
    xulrunner-qt5-debugsource-91.*.rpm xulrunner-qt5-misc-91.*.rpm
        installing package xulrunner-qt5-91.9.1+git1.aarch64 needs 7MB more
        space on the / filesystem
$ rm /usr/lib64/xulrunner-qt5-91.9.1/libxul.so
$ rpm -U xulrunner-qt5-91.*.rpm xulrunner-qt5-debuginfo-91.*.rpm \
    xulrunner-qt5-debugsource-91.*.rpm xulrunner-qt5-misc-91.*.rpm
When running the new code it still almost immediately crashes. And that's not a surprise; I'm expecting at least a few more cycles of this "run-crash-debug-fix" process before we have something working.
Thread 37 "Compositor" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 4396]
0x0000007ff1107dc4 in mozilla::gl::SharedSurface::ProdTexture
    (this=<optimized out>)
    at gfx/gl/SharedSurface.h:157
157         MOZ_CRASH("GFX: Did you forget to override this function?");
(gdb) bt
#0  0x0000007ff1107dc4 in mozilla::gl::SharedSurface::ProdTexture
    (this=<optimized out>)
    at gfx/gl/SharedSurface.h:157
#1  0x0000007ff1106cc4 in mozilla::gl::ReadBuffer::Attach (this=0x7ed41a1700,
    surf=surf@entry=0x7ed419f9c0)
    at gfx/gl/GLScreenBuffer.cpp:718
#2  0x0000007ff1106ebc in mozilla::gl::GLScreenBuffer::Attach
    (this=this@entry=0x5555642e30, surf=0x7ed419f9c0, size=...)
    at gfx/gl/GLScreenBuffer.cpp:486
#3  0x0000007ff1106f60 in mozilla::gl::GLScreenBuffer::Swap
    (this=this@entry=0x5555642e30, size=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
[...]
#25 0x0000007ff6a0489c in ?? () from /lib64/libc.so.6
(gdb) p this
$6 = <optimized out>
(gdb) frame 1
#1  0x0000007ff1106cc4 in mozilla::gl::ReadBuffer::Attach (this=0x7ed41a1700, surf=surf@entry=0x7ed419f9c0)
    at gfx/gl/GLScreenBuffer.cpp:718
718             colorTex = surf->ProdTexture();
(gdb) p surf
$3 = (mozilla::gl::SharedSurface *) 0x7ed419f9c0
(gdb) set print object on
(gdb) p surf
$5 = (mozilla::gl::SharedSurface_EGLImage *) 0x7ed419f9c0
(gdb) set print object off
(gdb) 
My initial reaction is that this is the same error that I spent yesterday trying to fix. But on closer inspection it's actually a little different. So maybe the changes made yesterday were actually worthwhile after all?

Nevertheless, this still seems to be a crash due to a missing override, as we can see from the error message that's output: "GFX: Did you forget to override this function?". It's an induced crash again. But this time the surface is of type SharedSurface_EGLImage. We'll probably have to add the overrides to this class as well. This will be similar to the work I did yesterday, but this time applied to a different class that also inherits from SharedSurface.
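To make the mechanism concrete, here's a minimal standalone sketch of how a crashing default implementation catches a missing override. It isn't the Gecko code: the class names Surface, SurfaceWithoutOverride and SurfaceWithOverride, and the AttachColorTexture() helper, are all hypothetical stand-ins, and std::abort() plays the role of MOZ_CRASH.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

using GLuint = unsigned int;  // stand-in for the OpenGL typedef

// Hypothetical, simplified stand-in for mozilla::gl::SharedSurface.
class Surface {
 public:
  virtual ~Surface() = default;

  // Default implementation: deliberately abort if a subclass relies on it.
  virtual GLuint ProdTexture() {
    std::fputs("Did you forget to override this function?\n", stderr);
    std::abort();  // plays the role of MOZ_CRASH in this sketch
  }
};

// No override: calling ProdTexture() through this type hits the abort above.
class SurfaceWithoutOverride : public Surface {};

// With the override and its backing member restored, the call succeeds.
class SurfaceWithOverride : public Surface {
 public:
  explicit SurfaceWithOverride(GLuint prodTex) : mProdTex(prodTex) {}
  GLuint ProdTexture() override { return mProdTex; }

 private:
  GLuint mProdTex;
};

// Loosely mirrors the call site in the backtrace: the static type is the base
// class, so the dynamic type of surf decides which ProdTexture() runs.
GLuint AttachColorTexture(Surface* surf) {
  assert(surf != nullptr);
  return surf->ProdTexture();
}
```

Passing a SurfaceWithOverride to AttachColorTexture() returns the stored texture name, whereas passing a SurfaceWithoutOverride aborts. That's the same shape as the backtrace above, where the pointer is declared as a SharedSurface but gdb's "set print object on" reveals the dynamic type.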

Looking at the SharedSurface_EGLImage class definition in SharedSurfaceEGL.h there are some very distinct differences between ESR 78 and ESR 91, including the lack of a ProdTexture() override in ESR 91. Here are the relevant pieces of code from ESR 78 (I've reordered some of the lines for clarity):
class SharedSurface_EGLImage : public SharedSurface {
[...]
 protected:
  mutable Mutex mMutex;
  const GLFormats mFormats;
  GLuint mProdTex;
[...]
  virtual GLuint ProdTexture() override { return mProdTex; }
[...]
In comparison, the ProdTexture() method and associated mProdTex member variable are both missing in ESR 91. I'll need to add them in, along with all the logic associated with them.

The mFormats variable is also missing from ESR 91, but I can't see anywhere that it's used in a meaningful way, so I'll leave that out. The Cast() method has also been removed. But the logic for this is pretty simple, and it looks like it has just been replaced with the same check and a direct cast at the various places it's used in the code, rather than living in a separate method. Given this, there looks to be no need to revert this particular change.
  static SharedSurface_EGLImage* Cast(SharedSurface* surf) {
    MOZ_ASSERT(surf->mType == SharedSurfaceType::EGLImageShare);

    return (SharedSurface_EGLImage*)surf;
  }
Possibly these were changes I made myself at some point in the (now distant!) past while performing this update.
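As a rough illustration of why dropping Cast() is harmless, the sketch below puts the helper form and the inlined form side by side. The types here (SurfaceType, Surface, EGLImageSurface) and the mImage payload are hypothetical simplifications, not the real Gecko declarations, and a plain assert stands in for MOZ_ASSERT.

```cpp
#include <cassert>

// Hypothetical type tag, loosely modelled on SharedSurfaceType.
enum class SurfaceType { Basic, EGLImageShare };

struct Surface {
  explicit Surface(SurfaceType type) : mType(type) {}
  virtual ~Surface() = default;
  const SurfaceType mType;
};

struct EGLImageSurface : Surface {
  EGLImageSurface() : Surface(SurfaceType::EGLImageShare) {}

  int mImage = 7;  // placeholder payload so the cast result can be used

  // Helper form: check the type tag, then downcast.
  static EGLImageSurface* Cast(Surface* surf) {
    assert(surf->mType == SurfaceType::EGLImageShare);
    return static_cast<EGLImageSurface*>(surf);
  }
};

// Call site using the helper.
int ImageViaHelper(Surface* surf) {
  return EGLImageSurface::Cast(surf)->mImage;
}

// Call site with the same check and cast written inline.
int ImageViaInlineCast(Surface* surf) {
  assert(surf->mType == SurfaceType::EGLImageShare);
  return static_cast<EGLImageSurface*>(surf)->mImage;
}
```

Both call sites behave identically, so the only cost of removing the helper is repeating the check wherever the cast is needed.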

The other change I've had to make is to the SharedSurface_EGLImage::Create() and SurfaceFactory_EGLImage::Create() methods. I've changed their implementations slightly and redirected them to use the new (old?) constructors.
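The general shape of a Create() method that forwards to a constructor might look something like the sketch below. This is an assumption-laden illustration rather than the actual change: the EGLImageSurface class, its single prodTex parameter and the zero check are all hypothetical, and std::unique_ptr stands in for Gecko's UniquePtr.

```cpp
#include <cassert>
#include <memory>

using GLuint = unsigned int;  // stand-in for the OpenGL typedef

// Hypothetical, heavily simplified surface class.
class EGLImageSurface {
 public:
  // Factory: validate the input, then hand off to the private constructor.
  // Returns nullptr when the surface can't be created.
  static std::unique_ptr<EGLImageSurface> Create(GLuint prodTex) {
    if (prodTex == 0) {
      return nullptr;  // 0 is not a name that glGenTextures hands out
    }
    return std::unique_ptr<EGLImageSurface>(new EGLImageSurface(prodTex));
  }

  GLuint ProdTexture() const { return mProdTex; }

 private:
  // Only reachable through Create(), so every instance has been validated.
  explicit EGLImageSurface(GLuint prodTex) : mProdTex(prodTex) {
    assert(prodTex != 0);
  }

  const GLuint mProdTex;
};
```

Keeping the constructor private means callers have to go through Create(), which is what lets the factory report failure with a null pointer instead of producing a half-initialised object.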

With all of these changes in place, compilation of the partial build now goes through. I've linked the partial build, transferred it to my phone and manually copied it into the correct directory. Now to see whether it's had any effect.
$ harbour-webview 
[D] unknown:0 - QML debugging is enabled. Only use this in a safe environment.
[D] main:30 - WebView Example
[D] main:47 - Opening webview
[D] unknown:0 - Using Wayland-EGL
[...]
Created LOG for EmbedLiteLayerManager
=============== Preparing offscreen rendering context ===============
CONSOLE message:
OpenGL compositor Initialized Succesfully.
[...]
Frame script: embedhelper.js loaded
CONSOLE message:
[JavaScript Warning: "This page uses the non standard property “zoom”. Consider
    using calc() in the relevant property values, or using “transform” along
    with “transform-origin: 0 0”." {file: "https://sailfishos.org/" line: 0}]
CONSOLE message:
[JavaScript Warning: "Layout was forced before the page was fully loaded. If
    stylesheets are not yet loaded this may cause a flash of unstyled content."
    {file: "https://sailfishos.org/wp-includes/js/jquery/
    jquery.min.js?ver=3.5.1" line: 2}]
The good news is the WebView test app is no longer crashing. It stays running and even responds to touch input. But it's not rendering. The screen is just showing a completely white page. This is definitely good progress though. It means that tomorrow I can dive back into the debugger to compare execution with ESR 78, see where the two diverge and hopefully bring them gradually closer into alignment until the rendering works.

I'm afraid to say, there's still a long journey ahead of us. But we are still, slowly but surely, moving forwards.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
