flypig.co.uk

Gecko

26 Jun 2024 : Day 270
I've been frustrated now for a while because I don't understand how the same libxul.so library can work when installed on top of one set of packages, but then break when installed on top of another set of packages. I find it baffling. Originally I thought it might be to do with API changes causing dynamic linking to fail against the browser or QtMozEmbed code. But I know this can't be the case, because the sailfish-browser binary and QtMozEmbed libraries are staying the same. They both access libxul.so directly, so there can't be anything else in the xulrunner packages that's changing their relationship to them either.

Yesterday I ran the various different arrangements, including using the harbour-webview app, which failed. But when I checked the backtrace none of the method names were defined. That's because I'd copied the new libxul.so library in place of the old one. Having done that the debug source and symbols were no longer valid, making it impossible to generate a sensible backtrace.

Consequently, if I want to continue development to track down the error, I'm going to have to figure out this peculiar situation with the library working when copied on top of some packages, but not others.

That means there must be something else that's being installed by the package which is causing the incompatibility. But what?

In the hope it might shed some light on things I'm going to checksum all of the installed files to see which ones change.

First, here are the checksums for the installed files from the new packages I've built (the non-working packages):
$ find /usr/lib64/xulrunner-qt5-91.9.1/ -type f -exec sha256sum {} \;
7f267b67c763f9dcee815b63f3d9beda01e05a8633ba0229ed0692877943  iblgpllibs.so
96638b0343cba81b65e13f835c5b7b554a26c6a2a65f0ee95ef2c00875ea  mni.ja
46628085d9d2912973453035688e6d6c632a51a2cade000a8180defa93b8  lugin-container
8cdd39dfc3f2f982d909314aa7af160a3144564f354126c85d2a07f258b4  ependentlibs.list
facb73cd418a6647bd9b4d7914206257a4a97e5857355ab430b3737d2917  latform.ini
b40edc79cfa6e82237381a1e92f2c562f4fa844637779739e687967c0f6a  ibmozavcodec.so
233025ceb162a45f5ca2d7dac3e511c6c7f188539b7859a88b16277264a3  ibxul.so
fd7bd48c5d68e6d9c13fed833d2a6880ab614d0af54dd4d0c903e2752775  ibmozavutil.so
b84c50b60b264df22ccc0bcc262eeb86f7764f20ffec9a073e3ba99ef703  pplication.ini
We can compare those against the checksums for the files that come from the package which has working WebGL but broken WebView:
$ find /usr/lib64/xulrunner-qt5-91.9.1/ -type f -exec sha256sum {} \;
ed00ccd7d2faadbf872f61436dc5041857d4464c05ba080147f88fc3e35c  iblgpllibs.so
224d5f398864a708b6bf6a9a091d101adb2c9d94c7374837fd38ee8090ce  mni.ja
7375b5d4a9445e3e6e169bda464253b33e132bad6bdd9b9de96a7c7399d1  lugin-container
8cdd39dfc3f2f982d909314aa7af160a3144564f354126c85d2a07f258b4  ependentlibs.list
ce90838911024163f58961d92cf6d810389e730c08ada4e364f1d592050a  latform.ini
b60a6fa988fb3c763c21322e898721c5cd0f1aea5fad3b9ce4f938a0569c  ibmozavcodec.so
bb0068273b939c4352f20a8e8d3095b387a33a24b82ecbf6bd1df280fcd9  ibxul.so
0f559421c9fded2b93e25818b98b178960d1956e2b0139721ba5085622df  ibmozavutil.so
d007134ac436f8e806e3e6760855fd67235d523fe1fcb13c9a25f75dc3cb  pplication.ini
Unfortunately this is less enlightening than I was hoping for: all of the files have changed, apart from the dependentlibs.list file. This unchanged file isn't very interesting, given it simply contains the name of the library:
$ cat /usr/lib64/xulrunner-qt5-91.9.1/dependentlibs.list 
libxul.so
Literally all of the other files have changed. That's going to make it harder to pin down where the problem lies. I'm thinking that this avenue of investigation using checksums isn't going to be especially fruitful.
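Comparing two long checksum listings by eye is error-prone, so here is a minimal sketch of how the comparison could be scripted. The function names and manifest file names are hypothetical, and it assumes sha256sum prints its usual "checksum, two spaces, path" format.

```shell
#!/bin/sh

# Print a "checksum  ./relative/path" line for every file under the
# directory given as $1, sorted by path.
checksum_manifest() {
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

# Given two saved manifests, print the paths whose checksum differs, or
# which appear in only one of the two manifests.
changed_files() {
    # Lines that are identical in both manifests occur twice and are
    # dropped by "uniq -u"; whatever remains belongs to a changed file.
    sort "$1" "$2" | uniq -u | sed 's/^[0-9a-f]*  //' | sort -u
}

# Example usage (the manifest names are made up):
#   checksum_manifest /usr/lib64/xulrunner-qt5-91.9.1 > broken.sums
#   ... install the other set of packages ...
#   checksum_manifest /usr/lib64/xulrunner-qt5-91.9.1 > working.sums
#   changed_files broken.sums working.sums
```

With the two listings above this would print every file except dependentlibs.list, so it wouldn't have narrowed things down any further, but it does make the comparison repeatable.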

It's possible that there's a problem with the boundary between the gecko and EmbedLite code, although I'm not sure I really understand how this can be. Nevertheless I've reverted the changes in EmbedLite just in case and set the build going again.

Unfortunately the build fails:
177:27.12 mobile/sailfishos
177:36.83 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp: In member function ‘void mozilla::
    embedlite::EmbedLiteCompositorBridgeParent::PrepareOffscreen()’:
177:36.83 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp:121:20: error: ‘class mozilla::gl::
    GLScreenBuffer’ has no member named ‘mCaps’
177:36.83        if (!screen->mCaps.premultAlpha) {
177:36.83                     ^~~~~
177:36.84 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp:127:68: error: ‘class mozilla::gl::
    GLScreenBuffer’ has no member named ‘mCaps’
177:36.84          factory = SurfaceFactory_EGLImage::Create(context, 
    screen->mCaps, nullptr, flags);
177:36.84                                                                     
    ^~~~~
177:36.84 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp:131:30: error: 
    ‘SurfaceFactory_GLTexture’ was not declared in this scope
177:36.84          factory = MakeUnique<SurfaceFactory_GLTexture>(context, 
    screen->mCaps, nullptr, flags);
177:36.84                               ^~~~~~~~~~~~~~~~~~~~~~~~
177:36.92 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp:131:30: note: suggested alternative: 
    ‘SurfaceDescriptorSharedGLTexture’
177:36.92          factory = MakeUnique<SurfaceFactory_GLTexture>(context, 
    screen->mCaps, nullptr, flags);
177:36.92                               ^~~~~~~~~~~~~~~~~~~~~~~~
177:36.92                               SurfaceDescriptorSharedGLTexture
177:36.92 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp:131:73: error: ‘class mozilla::gl::
    GLScreenBuffer’ has no member named ‘mCaps’
177:36.92          factory = MakeUnique<SurfaceFactory_GLTexture>(context, 
    screen->mCaps, nullptr, flags);
177:36.92                                                                       
       ^~~~~
177:36.92 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp: In member function ‘virtual void 
    mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    CompositeToDefaultTarget(mozilla::layers::PCompositorBridgeParent::
    VsyncId)’:
177:36.92 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp:156:18: error: ‘class mozilla::gl::
    GLContext’ has no member named ‘OffscreenSize’; did you mean ‘IsOffscreen’?
177:36.92      if (context->OffscreenSize() != mEGLSurfaceSize && 
    !context->ResizeOffscreen(mEGLSurfaceSize)) {
177:36.92                   ^~~~~~~~~~~~~
177:36.92                   IsOffscreen
177:36.92 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp:156:66: error: ‘class mozilla::gl::
    GLContext’ has no member named ‘ResizeOffscreen’; did you mean 
    ‘IsOffscreen’?
177:36.92      if (context->OffscreenSize() != mEGLSurfaceSize && 
    !context->ResizeOffscreen(mEGLSurfaceSize)) {
177:36.92                                                                   
    ^~~~~~~~~~~~~~~
177:36.92                                                                   
    IsOffscreen
177:38.59 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:694: 
    EmbedLiteCompositorBridgeParent.o] Error 1
The interesting thing with these errors is that the missing members, mCaps on GLScreenBuffer and OffscreenSize() and ResizeOffscreen() on GLContext, no longer exist in the updated code. It turns out that's because although I added only a single commit to the gecko side, that was split into two commits on the EmbedLite side. Removing the extra EmbedLite commit should remove these extra elements and allow the build to go through.

But as I'm checking the diff between the two different versions of the code, I notice that it's not just the C++ code that has changed. There's also this addition to the embedding.js file:
// Make gecko compositor use GL context/surface provided by the application.
pref("embedlite.compositor.external_gl_context", false);
// Request the application to create GLContext for the compositor as
// soon as the top level PuppetWidget is created for the view. Setting
// this pref only makes sense when using external compositor gl context.
pref("embedlite.compositor.request_external_gl_context_early", false);
This is significant in being the only non-C++ change. This gets packaged into the omni.ja file which, crucially, won't get updated when I copy over a new libxul.so library.

So, it looks like this could be where the problem is.
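A pref change like this is easy to miss in a diff that's mostly C++. As a rough sketch, and assuming both versions of embedding.js are to hand as plain files, the prefs added between two versions could be listed like this. The function names are hypothetical, and the sed pattern only handles pref lines of the simple form shown above.

```shell
#!/bin/sh

# Print the name of every pref set in the prefs file given as $1, one per
# line and sorted. Commented-out pref lines are ignored because they
# don't start with "pref(".
pref_names() {
    sed -n -E 's/^[[:space:]]*pref\("([^"]+)".*/\1/p' "$1" | sort -u
}

# Print the prefs that are set in the new file ($2) but not in the old
# file ($1).
added_prefs() {
    old_names=$(mktemp)
    new_names=$(mktemp)
    pref_names "$1" > "$old_names"
    pref_names "$2" > "$new_names"
    # comm -13 keeps only the lines unique to the second (new) list.
    comm -13 "$old_names" "$new_names"
    rm -f "$old_names" "$new_names"
}

# Example usage (the file names are made up):
#   added_prefs embedding-old.js embedding-new.js
```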

If you cast your mind back to Day 94, you may recall that I have a script for cleanly packing and unpacking the omni.ja archive. So I can test this out really easily using that. The steps are:
 
  1. Install the full set of new packages that are broken.
  2. Unpack omni.ja.
  3. Edit the embedding.js file to comment out the code shown above.
  4. Repack omni.ja.
  5. Test the browser.


If my hypothesis is correct, this should fix the problem.
$ cd omni
$ ./omni.sh unpack
$ vim omni/defaults/pref/embedding.js
$ ./omni.sh pack
$ cd ..
$ sailfish-browser 
[...]
And indeed now the browser works correctly. So that clarifies the mystery that's been baffling me for the last week. I feel much better now.
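I made the edit in step 3 by hand in vim, but it could also be scripted so the test is easy to repeat. This is only a sketch: the function name is hypothetical, and it assumes the two pref lines appear in embedding.js exactly as shown earlier, each starting at the beginning of a line.

```shell
#!/bin/sh

# Comment out the two external GL context prefs in the prefs file given
# as $1. Lines that are already commented out are left alone, so it's
# safe to run this more than once.
comment_out_gl_prefs() {
    sed -E 's#^(pref\("embedlite\.compositor\.(external_gl_context|request_external_gl_context_early)",.*)$#// \1#' "$1" > "$1.tmp" \
        && mv "$1.tmp" "$1"
}

# Example usage, run between the unpack and pack steps:
#   comment_out_gl_prefs omni/defaults/pref/embedding.js
```

Writing to a temporary file and moving it into place avoids relying on sed's in-place flag, which behaves differently between implementations.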

To wrap things up, I've edited the embedding.js file in the source tree as well, so that the change will get baked into the package in future. At some stage I'll need to restore it, because this is essential for getting the WebView to work. But in the meantime, at least I now have an answer to my conundrum, which will make things much easier to handle in the future.

I now have a set of packages that contain GLScreenBuffer, which have working WebGL and which don't crash on start-up. The WebView is broken, so the next step will be to hook in the GLScreenBuffer and try to find out why that's breaking the WebGL. That's my plan for tomorrow.

I'm going to sleep much more soundly tonight.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
