13 Aug 2024 : Day 318
During the last 318 days of posting these diary entries I've found myself enjoying a variety of different modes of transport. Mostly trains, but also buses, cars and planes. So far buses have been the most awkward by far. Today I'm trying something new: developing on a ferry. I have to say, it's at the other extreme, being by far the most comfortable yet. Even better than developing on a train: calmer and with more space. At least, that'll be true until we leave cellphone range and I can no longer access the Internet; at that point I may not find it so comfortable!
Yesterday I was looking into two issues, both related to video. The first is the discolouration issue, which it turns out is due to the Y'CbCr channels being interpreted as RGB channels. The second is a call to eglTerminate() which is causing the browser to crash during video playback.
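To give a feel for what the first issue means in practice, here's a minimal sketch of the difference between converting Y'CbCr to RGB and passing the channels straight through. It assumes the full-range BT.601 conversion matrix, which may not be the one the video actually uses, and the names Rgb, ToByte(), YCbCrToRgb(), MisreadAsRgb() and ExampleWhitePixel() are all made up for illustration rather than taken from the gecko code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

struct Rgb {
  std::uint8_t r, g, b;
};

// Rounds and clamps a channel value into the 0..255 range.
static std::uint8_t ToByte(double value) {
  const double clamped = std::min(255.0, std::max(0.0, value));
  return static_cast<std::uint8_t>(std::lround(clamped));
}

// Correct path: full-range BT.601 Y'CbCr to RGB (an assumed matrix).
Rgb YCbCrToRgb(std::uint8_t y, std::uint8_t cb, std::uint8_t cr) {
  const double cbOffset = cb - 128.0;
  const double crOffset = cr - 128.0;
  return Rgb{ToByte(y + 1.402 * crOffset),
             ToByte(y - 0.344136 * cbOffset - 0.714136 * crOffset),
             ToByte(y + 1.772 * cbOffset)};
}

// Faulty path: the three channels are passed through as if already RGB.
Rgb MisreadAsRgb(std::uint8_t y, std::uint8_t cb, std::uint8_t cr) {
  return Rgb{y, cb, cr};
}

// Usage: a white pixel is Y'=255 with neutral chroma (Cb=Cr=128).
void ExampleWhitePixel() {
  const Rgb good = YCbCrToRgb(255, 128, 128);
  const Rgb bad = MisreadAsRgb(255, 128, 128);
  assert(good.r == 255 && good.g == 255 && good.b == 255);  // white
  assert(bad.r == 255 && bad.g == 128 && bad.b == 128);     // pink tint
}
```

Under this model neutral colours pick up a tint because the chroma midpoint of 128 is read as a half-intensity green and blue.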
Last night and this morning I've been repeatedly playing the YouTube video on the Jolla test page. This is the same video that triggered the crash with the backtrace yesterday. I built a new version of the browser that has the call to eglTerminate() replaced by a call to send some debug output to the console instead:
    EglDisplay::~EglDisplay() {
      printf_stderr("CRASH: EglDisplay destructor\n");
      //fTerminate();
      mLib->mActiveDisplays.erase(mDisplay);
    }

During my testing since making this change I've yet to experience a crash. The debug output appears on the first playthrough, so far up to three times, but no more after that:
    Created LOG for EmbedLite
    Created LOG for EmbedPrefs
    Created LOG for EmbedLiteLayerManager
    CRASH: EglDisplay destructor
    library "libandroidicu.so" needed or dlopened by "/system/lib64/libmedia.so" is not accessible for the namespace "(default)"
    library "/apex/com.android.vndk.v30/lib64/hw/android.hidl.memory@1.0-impl.so" needed or dlopened by "/usr/libexec/droid-hybris/system/lib64/libvndksupport.so" is not accessible for the namespace "sphal"
    CRASH: EglDisplay destructor
    CRASH: EglDisplay destructor

I don't really want to remove the call to eglTerminate() completely as this will likely result in a resource leak. The documentation for the function states the following:
Name: eglTerminate — terminate an EGL display connection
C Specification: EGLBoolean eglTerminate(EGLDisplay display);
Parameters: display Specifies the EGL display connection to terminate.
Description: eglTerminate releases resources associated with an EGL display connection. Termination marks all EGL resources associated with the EGL display connection for deletion. If contexts or surfaces associated with display is current to any thread, they are not released until they are no longer current as a result of eglMakeCurrent.
Terminating an already terminated EGL display connection has no effect. A terminated display may be re-initialized by calling eglInitialize again.
Errors: EGL_FALSE is returned if eglTerminate fails, EGL_TRUE otherwise. EGL_BAD_DISPLAY is generated if display is not an EGL display connection.
See Also: eglInitialize, eglMakeCurrent.
The purpose of the function definitely seems to relate to releasing resources once they're no longer needed. The obvious questions are: what's the value of display being passed in and what's the return value coming back? To find these out I've added a line of debug output to the constructor to check the display value at creation time:
    EglDisplay::EglDisplay(const PrivateUseOnly&, GLLibraryEGL& lib,
                           const EGLDisplay disp, const bool isWarp)
        : mLib(&lib), mDisplay(disp), mIsWARP(isWarp) {
      printf_stderr("EGL: constructor, display: %d\n", mDisplay);
      [...]
    }

I've also updated the debug output in the destructor so that we can check the display value going in to the eglTerminate() call and the return value coming out:
    EglDisplay::~EglDisplay() {
      printf_stderr("EGL: destructor, display: %d\n", mDisplay);
      EGLBoolean result = fTerminate();
      printf_stderr("EGL: eglTerminate return: %d\n", result);
      mLib->mActiveDisplays.erase(mDisplay);
    }

The result is unexpected in a number of ways. The first time I run it I get the output below. What's notable here is first that there's a repeated cycle of construction and destruction calls. Each time I refresh the page the video loads and, after a short pause of between one and two seconds, the context is constructed and then almost immediately destructed. Second is that two copies of the context seem to be constructed with the same display value of 1, with multiple contexts being active at the same time and using the same display:
    [...]
    Created LOG for EmbedLite
    Created LOG for EmbedPrefs
    Created LOG for EmbedLiteLayerManager
    EGL: constructor, display: 1
    EGL: destructor, display: 1
    EGL: eglTerminate return: 1
    EGL: constructor, display: 1
    [D] unknown:0 - AMBIENCE: received embedliteviewcreated
    library "libandroidicu.so" needed or dlopened by "/system/lib64/libmedia.so" is not accessible for the namespace "(default)"
    library "/apex/com.android.vndk.v30/lib64/hw/android.hidl.memory@1.0-impl.so" needed or dlopened by "/usr/libexec/droid-hybris/system/lib64/libvndksupport.so" is not accessible for the namespace "sphal"
    EGL: constructor, display: 1
    EGL: destructor, display: 1
    EGL: eglTerminate return: 1
    EGL: constructor, display: 1
    EGL: destructor, display: 1
    EGL: eglTerminate return: 1
    EGL: constructor, display: 1
    EGL: destructor, display: 1
    EGL: eglTerminate return: 1
    EGL: constructor, display: 1
    EGL: destructor, display: 1
    EGL: eglTerminate return: 1
    [...]

For most of the cycles we can see the context gets destructed before the next is constructed, but on the fourth line prefixed with EGL a context is constructed that's not being immediately destructed. Then on the fifth EGL line there's a new context constructed using the same display value.
That doesn't look quite right to me. It might make more sense if a different display value were being used, but that's not what we're seeing. At least when I run this there are no crashes. After restarting I get similar output, but this time there's also a crash on the fourth runthrough of the video:
    [...]
    Created LOG for EmbedLite
    Created LOG for EmbedPrefs
    Created LOG for EmbedLiteLayerManager
    EGL: constructor, display: 1
    EGL: destructor, display: 1
    EGL: eglTerminate return: 1
    EGL: constructor, display: 1
    [D] unknown:0 - AMBIENCE: received embedliteviewcreated
    library "libandroidicu.so" needed or dlopened by "/system/lib64/libmedia.so" is not accessible for the namespace "(default)"
    library "/apex/com.android.vndk.v30/lib64/hw/android.hidl.memory@1.0-impl.so" needed or dlopened by "/usr/libexec/droid-hybris/system/lib64/libvndksupport.so" is not accessible for the namespace "sphal"
    EGL: constructor, display: 1
    EGL: destructor, display: 1
    EGL: eglTerminate return: 1
    EGL: constructor, display: 1
    EGL: destructor, display: 1
    EGL: eglTerminate return: 1
    EGL: constructor, display: 1
    EGL: destructor, display: 1
    Segmentation fault

Once again the crash happens as a result of the call to eglTerminate(), which means that the return value from the method is never output to the console.
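To make the worry about shared display values concrete, here's a toy model of two wrappers holding the same handle, where destroying one terminates the handle for both. It's only a sketch: FakeDriver, DisplayWrapper and LongLivedSurvivesShortLived() are invented for illustration, and the real EGL driver state is of course far more involved than a set of integers.

```cpp
#include <cassert>
#include <set>

// Toy stand-in for the driver's table of initialised displays.
// Hypothetical: real EGL state lives inside the driver.
struct FakeDriver {
  std::set<int> initialised;
  void Initialize(int display) { initialised.insert(display); }
  bool Terminate(int display) { return initialised.erase(display) > 0; }
  bool IsUsable(int display) const { return initialised.count(display) > 0; }
};

// Hypothetical wrapper that terminates its handle on destruction,
// mirroring the shape of the EglDisplay destructor shown earlier.
class DisplayWrapper {
 public:
  DisplayWrapper(FakeDriver& driver, int display)
      : mDriver(driver), mDisplay(display) {
    mDriver.Initialize(mDisplay);
  }
  ~DisplayWrapper() { mDriver.Terminate(mDisplay); }
  bool Usable() const { return mDriver.IsUsable(mDisplay); }

 private:
  FakeDriver& mDriver;
  int mDisplay;
};

// Returns whether a long-lived wrapper is still usable after a
// short-lived wrapper for the same handle has been destroyed.
bool LongLivedSurvivesShortLived() {
  FakeDriver driver;
  DisplayWrapper longLived(driver, 1);
  assert(longLived.Usable());  // fine while it's the only wrapper
  {
    DisplayWrapper shortLived(driver, 1);  // same handle value
  }  // shortLived's destructor terminates display 1 here
  return longLived.Usable();
}
```

In this model LongLivedSurvivesShortLived() returns false: the surviving wrapper is left holding a handle that has been terminated underneath it.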
To try to figure out what's going on I've placed some breakpoints on the EglDisplay constructor. I'm curious to know how the displays are getting created and why we're not getting different values each time. This is what we get for the very first occurrence of a call to the constructor:
    Thread 39 "Compositor" hit Breakpoint 1, mozilla::gl::EglDisplay::EglDisplay (this=0x7ed019b090, lib=..., disp=0x1, isWarp=false)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:689
    689     ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp: No such file or directory.
    (gdb) bt
    #0  mozilla::gl::EglDisplay::EglDisplay (this=0x7ed019b090, lib=..., disp=0x1, isWarp=false)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:689
    #1  0x0000007ff2397378 in __gnu_cxx::new_allocator<mozilla::gl::EglDisplay>::construct<mozilla::gl::EglDisplay, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__p=0x7ed019b090, this=<optimized out>)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/new:169
    #2  std::allocator_traits<std::allocator<mozilla::gl::EglDisplay> >::construct<mozilla::gl::EglDisplay, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__p=0x7ed019b090, __a=...)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/alloc_traits.h:475
    #3  std::_Sp_counted_ptr_inplace<mozilla::gl::EglDisplay, std::allocator<mozilla::gl::EglDisplay>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__a=..., this=0x7ed019b080)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr_base.h:545
    #4  std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<mozilla::gl::EglDisplay, std::allocator<mozilla::gl::EglDisplay>, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__a=..., __p=<synthetic pointer>: <optimized out>, this=<synthetic pointer>)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr_base.h:677
    #5  std::__shared_ptr<mozilla::gl::EglDisplay, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<mozilla::gl::EglDisplay>, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__tag=..., this=<synthetic pointer>)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr_base.h:1342
    #6  std::shared_ptr<mozilla::gl::EglDisplay>::shared_ptr<std::allocator<mozilla::gl::EglDisplay>, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__tag=..., this=<synthetic pointer>)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr.h:359
    #7  std::allocate_shared<mozilla::gl::EglDisplay, std::allocator<mozilla::gl::EglDisplay>, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__a=...)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr.h:706
    #8  std::make_shared<mozilla::gl::EglDisplay, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> ()
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr.h:722
    #9  mozilla::gl::EglDisplay::Create (lib=..., display=<optimized out>, isWarp=isWarp@entry=false)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:684
    #10 0x0000007ff23974c4 in mozilla::gl::GetAndInitDisplay (egl=..., displayType=displayType@entry=0x0, display=<optimized out>, display@entry=0x1)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:154
    #11 0x0000007ff2397a34 in mozilla::gl::GLLibraryEGL::CreateDisplay (this=this@entry=0x7ed01a2660, forceAccel=forceAccel@entry=false, out_failureId=out_failureId@entry=0x7f2d90af50, aDisplay=aDisplay@entry=0x1)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:817
    #12 0x0000007ff2397e1c in mozilla::gl::GLLibraryEGL::Init (this=this@entry=0x7ed01a2660, forceAccel=forceAccel@entry=false, out_failureId=out_failureId@entry=0x7f2d90af50, aDisplay=aDisplay@entry=0x1)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:504
    #13 0x0000007ff2398b48 in mozilla::gl::GLContextProviderEGL::CreateWrappingExisting (aContext=0x7ed00042f0, aSurface=0x5555985ee0, aDisplay=0x1)
        at ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1008
    #14 0x0000007ff4c589ac in mozilla::embedlite::nsWindow::GetGLContext (this=this@entry=0x7fb8b7b940)
        at ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/nsWindow.cpp:405
    #15 0x0000007ff4c58b78 in mozilla::embedlite::nsWindow::GetNativeData (this=0x7fb8b7b940, aDataType=12)
        at ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/nsWindow.cpp:173
    #16 0x0000007ff24120ac in mozilla::layers::CompositorOGL::CreateContext (this=this@entry=0x7ed01a22a0)
        at ${PROJECT}/gecko-dev/gfx/layers/opengl/CompositorOGL.cpp:232
    #17 0x0000007ff2427964 in mozilla::layers::CompositorOGL::Initialize (this=0x7ed01a22a0, out_failureReason=0x7f2d90b510)
        at ${PROJECT}/gecko-dev/gfx/layers/opengl/CompositorOGL.cpp:387
    #18 0x0000007ff253d6f4 in mozilla::layers::CompositorBridgeParent::NewCompositor (this=this@entry=0x7fb8bdec20, aBackendHints=...)
        at ${PROJECT}/gecko-dev/gfx/layers/ipc/CompositorBridgeParent.cpp:1493

There are a lot of uninteresting allocator calls here. We don't get to the interesting bit until frame 13, below which we have the following sequence of calls:
- GLContextProviderEGL::CreateWrappingExisting()
- nsWindow::GetGLContext()
- nsWindow::GetNativeData()
- CompositorOGL::CreateContext()
- CompositorOGL::Initialize()
- CompositorBridgeParent::NewCompositor()
    Thread 10 "GeckoWorkerThre" hit Breakpoint 1, mozilla::gl::EglDisplay::EglDisplay (this=0x7fb9ec51e0, lib=..., disp=0x1, isWarp=false)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:689
    689     in ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp
    (gdb) bt
    #0  mozilla::gl::EglDisplay::EglDisplay (this=0x7fb9ec51e0, lib=..., disp=0x1, isWarp=false)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:689
    #1  0x0000007ff2397378 in __gnu_cxx::new_allocator<mozilla::gl::EglDisplay>::construct<mozilla::gl::EglDisplay, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__p=0x7fb9ec51e0, this=<optimized out>)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/new:169
    #2  std::allocator_traits<std::allocator<mozilla::gl::EglDisplay> >::construct<mozilla::gl::EglDisplay, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__p=0x7fb9ec51e0, __a=...)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/alloc_traits.h:475
    #3  std::_Sp_counted_ptr_inplace<mozilla::gl::EglDisplay, std::allocator<mozilla::gl::EglDisplay>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__a=..., this=0x7fb9ec51d0)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr_base.h:545
    #4  std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<mozilla::gl::EglDisplay, std::allocator<mozilla::gl::EglDisplay>, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__a=..., __p=<synthetic pointer>: <optimized out>, this=<synthetic pointer>)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr_base.h:677
    #5  std::__shared_ptr<mozilla::gl::EglDisplay, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<mozilla::gl::EglDisplay>, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__tag=..., this=<synthetic pointer>)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr_base.h:1342
    #6  std::shared_ptr<mozilla::gl::EglDisplay>::shared_ptr<std::allocator<mozilla::gl::EglDisplay>, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__tag=..., this=<synthetic pointer>)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr.h:359
    #7  std::allocate_shared<mozilla::gl::EglDisplay, std::allocator<mozilla::gl::EglDisplay>, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> (__a=...)
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr.h:706
    #8  std::make_shared<mozilla::gl::EglDisplay, mozilla::gl::EglDisplay::PrivateUseOnly, mozilla::gl::GLLibraryEGL&, void* const&, bool const&> ()
        at /srv/mer/toolings/SailfishOS-4.5.0.18/opt/cross/aarch64-meego-linux-gnu/include/c++/8.3.0/bits/shared_ptr.h:722
    #9  mozilla::gl::EglDisplay::Create (lib=..., display=<optimized out>, isWarp=isWarp@entry=false)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:684
    #10 0x0000007ff23974c4 in mozilla::gl::GetAndInitDisplay (egl=..., displayType=displayType@entry=0x0, display=<optimized out>, display@entry=0x0)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:154
    #11 0x0000007ff2397a34 in mozilla::gl::GLLibraryEGL::CreateDisplay (this=this@entry=0x7fb99b9f30, forceAccel=forceAccel@entry=false, out_failureId=out_failureId@entry=0x7fde768ac8, aDisplay=aDisplay@entry=0x0)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:817
    #12 0x0000007ff2397e1c in mozilla::gl::GLLibraryEGL::Init (this=this@entry=0x7fb99b9f30, forceAccel=forceAccel@entry=false, out_failureId=out_failureId@entry=0x7fde768ac8, aDisplay=aDisplay@entry=0x0)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:504
    #13 0x0000007ff2398664 in mozilla::gl::GLLibraryEGL::Create (out_failureId=out_failureId@entry=0x7fde768ac8)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:345
    #14 0x0000007ff23987bc in mozilla::gl::DefaultEglLibrary (out_failureId=out_failureId@entry=0x7fde768ac8)
        at ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1307
    #15 0x0000007ff23a9ec8 in mozilla::gl::DefaultEglDisplay (out_failureId=0x7fde768ac8)
        at ${PROJECT}/gecko-dev/gfx/gl/GLContextEGL.h:29
    #16 mozilla::gl::GLContextProviderEGL::CreateHeadless (desc=..., desc@entry=<error reading variable: value has been optimized out>, out_failureId=0x7fde768ac8, out_failureId@entry=<error reading variable: value has been optimized out>)
        at ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1248

The interesting parts of this distinct backtrace start at frame 12:
- GLLibraryEGL::Init()
- GLLibraryEGL::Create()
- DefaultEglLibrary()
- DefaultEglDisplay()
- GLContextProviderEGL::CreateHeadless()
That's my hypothesis anyway. It turns out I'm nearly right, but not quite. After stepping through the methods seen in the two backtraces above, it becomes clear that it's not the context that's being duplicated, but rather the GLLibraryEGL. The reason is that in all the mess of trying to figure out how to get the WebView working alongside WebGL, I ended up with two different ways to create the library.
We can see the result of this by adding a breakpoint to EglDisplay::Create(). A pointer to the library that requested it is passed in as a parameter, so we can query this parameter to check that only one library is in use at any one time.
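The same check could also be done without a debugger by recording the address of every library that requests a display. The following is just a sketch of that idea under my own assumptions: Library stands in for GLLibraryEGL, and LibraryTracker, Note(), Distinct() and ExampleTwoLibraries() are hypothetical helpers that don't exist in the gecko code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <set>

struct Library {};  // hypothetical stand-in for GLLibraryEGL

// Records the address of every library instance it is shown and
// reports each new one on stderr the first time it appears.
class LibraryTracker {
 public:
  void Note(const Library& lib) {
    if (mSeen.insert(&lib).second) {
      std::fprintf(stderr, "new library instance at %p\n",
                   static_cast<const void*>(&lib));
    }
  }
  // Number of distinct instances seen so far; anything above one
  // means more than one library has been requesting displays.
  std::size_t Distinct() const { return mSeen.size(); }

 private:
  std::set<const Library*> mSeen;
};

// Usage: the same instance noted twice counts once, a second
// instance pushes the count to two.
void ExampleTwoLibraries() {
  Library first;
  Library second;
  LibraryTracker tracker;
  tracker.Note(first);
  tracker.Note(first);
  assert(tracker.Distinct() == 1);
  tracker.Note(second);
  assert(tracker.Distinct() == 2);
}
```

A call to Note() at the top of a function like EglDisplay::Create() would play the same role as inspecting the lib parameter at a breakpoint.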
The following two cases are from the same execution of ESR 91. Here's the first hit. Notice that the lib parameter is pointing to memory location 0x7ed01a21b0:
    Thread 39 "Compositor" hit Breakpoint 2, mozilla::gl::EglDisplay::Create (lib=..., display=0x1, isWarp=isWarp@entry=false)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:664
    664     ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp: No such file or directory.
    (gdb) p lib
    $1 = (mozilla::gl::GLLibraryEGL &) @0x7ed01a21b0: {mRefCnt = {static isThreadSafe = true,
        mValue = {<std::__atomic_base<unsigned long>> = {static _S_alignment = 8, _M_i = 1}, static is_always_lock_free = true}},
      mEGLLibrary = 0x7ed01a23e0, mGLLibrary = 0x7ed01a2470, mIsANGLE = false, mAvailableExtensions = std::bitset,
      mDefaultDisplay = std::weak_ptr<mozilla::gl::EglDisplay> (empty) = {get() = 0x0},
      mActiveDisplays = std::unordered_map with 0 elements,
      mSymbols = {fGetProcAddress = 0x7fef07cdc0 <eglGetProcAddress>, fGetDisplay = 0x7fef07d440 <eglGetDisplay>,
        fGetPlatformDisplay = 0x0, fTerminate = 0x7fef07d480 <eglTerminate>,
        fGetCurrentSurface = 0x7fef07e010 <eglGetCurrentSurface>, fGetCurrentContext = 0x7fef07dfb0 <eglGetCurrentContext>,
        fMakeCurrent = 0x7fef07df18 <eglMakeCurrent>, fDestroyContext = 0x7fef07de90 <eglDestroyContext>,
        fCreateContext = 0x7fef07cac8 <eglCreateContext>, fDestroySurface = 0x7fef07d0b8 <eglDestroySurface>,
        fCreateWindowSurface = 0x7fef07d500 <eglCreateWindowSurface>, fCreatePbufferSurface = 0x7fef07d940 <eglCreatePbufferSurface>,
        fCreatePbufferFromClientBuffer = 0x7fef07dc20 <eglCreatePbufferFromClientBuffer>,
        fCreatePixmapSurface = 0x7fef07ca30 <eglCreatePixmapSurface>, fBindAPI = 0x7fef07da80 <eglBindAPI>,
        fInitialize = 0x7fef07d618 <eglInitialize>, fChooseConfig = 0x7fef07d7e0 <eglChooseConfig>,
        fGetError = 0x7fef07c9a0 <eglGetError>, fGetConfigAttrib = 0x7fef07d898 <eglGetConfigAttrib>,
        fGetConfigs = 0x7fef07d738 <eglGetConfigs>, fWaitNative = 0x7fef07e1f8 <eglWaitNative>,
        fSwapBuffers = 0x7fef07d248 <eglSwapBuffers>, fCopyBuffers = 0x7fef07e278 <eglCopyBuffers>,
        fQueryString = 0x7fef07d6b0 <eglQueryString>, fQueryContext = 0x7fef07e0f0 <eglQueryContext>,
        fBindTexImage = 0x7fef07dd60 <eglBindTexImage>, fReleaseTexImage = 0x7fef07ddf8 <eglReleaseTexImage>,
        fSwapInterval = 0x7fef07cc18 <eglSwapInterval>, fCreateImageKHR = 0x0, fDestroyImageKHR = 0x0,
        fQuerySurface = 0x7fef07d9d8 <eglQuerySurface>, fQuerySurfacePointerANGLE = 0x0,
        fCreateSyncKHR = 0x0, fDestroySyncKHR = 0x0, fClientWaitSyncKHR = 0x0, fGetSyncAttribKHR = 0x0,
        fWaitSyncKHR = 0x0, fDupNativeFenceFDANDROID = 0x0, fCreateStreamKHR = 0x0, fDestroyStreamKHR = 0x0,
        fQueryStreamKHR = 0x0, fStreamConsumerGLTextureExternalKHR = 0x0, fStreamConsumerAcquireKHR = 0x0,
        fStreamConsumerReleaseKHR = 0x0, fQueryDisplayAttribEXT = 0x0, fQueryDeviceAttribEXT = 0x0,
        fStreamConsumerGLTextureExternalAttribsNV = 0x0, fCreateStreamProducerD3DTextureANGLE = 0x0,
        fStreamPostD3DTextureANGLE = 0x0, fCreateDeviceANGLE = 0x0, fReleaseDeviceANGLE = 0x0,
        fSwapBuffersWithDamage = 0x0, fSetDamageRegion = 0x0, fGetNativeClientBufferANDROID = 0x0}}
    (gdb)

For the next case of this call the lib parameter is pointing elsewhere, to memory location 0x7fb9350e50:
    Thread 10 "GeckoWorkerThre" hit Breakpoint 2, mozilla::gl::EglDisplay::Create (lib=..., display=0x1, isWarp=isWarp@entry=false)
        at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:664
    664     in ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp
    (gdb) p lib
    $5 = (mozilla::gl::GLLibraryEGL &) @0x7fb9350e50: {mRefCnt = {static isThreadSafe = true,
        mValue = {<std::__atomic_base<unsigned long>> = {static _S_alignment = 8, _M_i = 1}, static is_always_lock_free = true}},
      mEGLLibrary = 0x7ed01a23e0, mGLLibrary = 0x7ed01a2470, mIsANGLE = false, mAvailableExtensions = std::bitset,
      mDefaultDisplay = std::weak_ptr<mozilla::gl::EglDisplay> (empty) = {get() = 0x0},
      mActiveDisplays = std::unordered_map with 0 elements,
      mSymbols = {fGetProcAddress = 0x7fef07cdc0 <eglGetProcAddress>, fGetDisplay = 0x7fef07d440 <eglGetDisplay>,
        fGetPlatformDisplay = 0x0, fTerminate = 0x7fef07d480 <eglTerminate>,
        fGetCurrentSurface = 0x7fef07e010 <eglGetCurrentSurface>, fGetCurrentContext = 0x7fef07dfb0 <eglGetCurrentContext>,
        fMakeCurrent = 0x7fef07df18 <eglMakeCurrent>, fDestroyContext = 0x7fef07de90 <eglDestroyContext>,
        fCreateContext = 0x7fef07cac8 <eglCreateContext>, fDestroySurface = 0x7fef07d0b8 <eglDestroySurface>,
        fCreateWindowSurface = 0x7fef07d500 <eglCreateWindowSurface>, fCreatePbufferSurface = 0x7fef07d940 <eglCreatePbufferSurface>,
        fCreatePbufferFromClientBuffer = 0x7fef07dc20 <eglCreatePbufferFromClientBuffer>,
        fCreatePixmapSurface = 0x7fef07ca30 <eglCreatePixmapSurface>, fBindAPI = 0x7fef07da80 <eglBindAPI>,
        fInitialize = 0x7fef07d618 <eglInitialize>, fChooseConfig = 0x7fef07d7e0 <eglChooseConfig>,
        fGetError = 0x7fef07c9a0 <eglGetError>, fGetConfigAttrib = 0x7fef07d898 <eglGetConfigAttrib>,
        fGetConfigs = 0x7fef07d738 <eglGetConfigs>, fWaitNative = 0x7fef07e1f8 <eglWaitNative>,
        fSwapBuffers = 0x7fef07d248 <eglSwapBuffers>, fCopyBuffers = 0x7fef07e278 <eglCopyBuffers>,
        fQueryString = 0x7fef07d6b0 <eglQueryString>, fQueryContext = 0x7fef07e0f0 <eglQueryContext>,
        fBindTexImage = 0x7fef07dd60 <eglBindTexImage>, fReleaseTexImage = 0x7fef07ddf8 <eglReleaseTexImage>,
        fSwapInterval = 0x7fef07cc18 <eglSwapInterval>, fCreateImageKHR = 0x0, fDestroyImageKHR = 0x0,
        fQuerySurface = 0x7fef07d9d8 <eglQuerySurface>, fQuerySurfacePointerANGLE = 0x0,
        fCreateSyncKHR = 0x0, fDestroySyncKHR = 0x0, fClientWaitSyncKHR = 0x0, fGetSyncAttribKHR = 0x0,
        fWaitSyncKHR = 0x0, fDupNativeFenceFDANDROID = 0x0, fCreateStreamKHR = 0x0, fDestroyStreamKHR = 0x0,
        fQueryStreamKHR = 0x0, fStreamConsumerGLTextureExternalKHR = 0x0, fStreamConsumerAcquireKHR = 0x0,
        fStreamConsumerReleaseKHR = 0x0, fQueryDisplayAttribEXT = 0x0, fQueryDeviceAttribEXT = 0x0,
        fStreamConsumerGLTextureExternalAttribsNV = 0x0, fCreateStreamProducerD3DTextureANGLE = 0x0,
        fStreamPostD3DTextureANGLE = 0x0, fCreateDeviceANGLE = 0x0, fReleaseDeviceANGLE = 0x0,
        fSwapBuffersWithDamage = 0x0, fSetDamageRegion = 0x0, fGetNativeClientBufferANDROID = 0x0}}
    (gdb)

As we continue through the execution we find that both of these two GLLibraryEGL instances are in use throughout. Both of the ways to create an instance of GLLibraryEGL are in GLContextProviderEGL. The first happens when a call is made to GLContextProviderEGL::CreateWrappingExisting(). It's clear from the code that a new copy of the library will be created each time this method is called.
The second happens when there's a call to DefaultEglLibrary(). In this case there's an interesting twist, because DefaultEglLibrary() stores the result in gDefaultEglLibrary, which is a static variable. A new instance is constructed only if this is set to null, essentially making GLLibraryEGL a singleton when created via this route.
Since they're both in the same file, the solution I've come up with is to make CreateWrappingExisting() use gDefaultEglLibrary as well. That way, whichever is called first will create the canonical instance of the library. Any subsequent calls will reuse the same instance.
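The idea can be sketched in isolation. This is a simplified model rather than the actual gecko change: Library, gSharedLibrary, GetOrCreateLibrary(), WrapExistingRoute(), DefaultRoute() and ExampleBothRoutes() are hypothetical names standing in for GLLibraryEGL, gDefaultEglLibrary and the two creation routes, and the mutex is my own addition because the backtraces show the two routes being reached from different threads.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for GLLibraryEGL; counts constructions so
// that duplicated instances are easy to spot.
struct Library {
  static int sConstructed;
  Library() { ++sConstructed; }
};
int Library::sConstructed = 0;

// Plays the role of gDefaultEglLibrary: one shared slot for the
// canonical instance, guarded by a mutex (an assumption of this sketch).
static std::shared_ptr<Library> gSharedLibrary;
static std::mutex gSharedLibraryMutex;

// Creates the library on first use, then hands back the same one.
std::shared_ptr<Library> GetOrCreateLibrary() {
  std::lock_guard<std::mutex> lock(gSharedLibraryMutex);
  if (!gSharedLibrary) {
    gSharedLibrary = std::make_shared<Library>();
  }
  return gSharedLibrary;
}

// Route 1: models CreateWrappingExisting() once it shares the slot.
std::shared_ptr<Library> WrapExistingRoute() { return GetOrCreateLibrary(); }

// Route 2: models DefaultEglLibrary().
std::shared_ptr<Library> DefaultRoute() { return GetOrCreateLibrary(); }

// Usage: whichever route runs first creates the instance and the
// other route reuses it.
void ExampleBothRoutes() {
  const std::shared_ptr<Library> viaWrap = WrapExistingRoute();
  const std::shared_ptr<Library> viaDefault = DefaultRoute();
  assert(viaWrap == viaDefault);
  assert(Library::sConstructed == 1);
}
```

The order of the two calls in ExampleBothRoutes() can be swapped without changing the outcome, which is the property the fix relies on.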
Testing this out with the browser I get good results. It's clear that eglTerminate() is no longer being called partway through the video; in fact, there's now no construction or destruction of a new display at all while the video is playing. I've not yet managed to trigger a crash, but will have to use the browser a bit more before I feel more confident about this.
Just as importantly, I've tested the browser and the WebView app with both standard browsing and pages that contain WebGL. So far the results have been stable as well.
If this really has prevented a crash then this will be a great result. I have a suspicion that something similar was happening on ESR 78, which would periodically crash when videos were playing. This was problematic on many pages with embedded videos. Often it would appear that the browser was crashing at random. If this fixes these crashes, the browser will be far more enjoyable to use in general.
None of these changes will fix the discolouration issue and I won't have time to do more work on that today. But I will certainly return to it tomorrow with a fresh pair of eyes.
If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.