List items
Items from the current list are shown below.
Gecko
19 Mar 2024 : Day 190 #
Yesterday I collected valgrind logs from ESR 78 and ESR 91. Even though both logs contain a huge number of errors, I whittled the interesting ones down to just two.
First there's an invalid read size coming from GetAndInitDisplay(). This should be straightforward to track down.
The reason the for the EGLDisplay being deleted is because of this call here:
In ESR 78 the logic around EglDisplay didn't exist. Back then there was just an EGLDisplay value, which is a pointer to an opaque EGL structure and which could be passed around everywhere. In ESR 91 an effort has been made to generalise this, so that multiple such displays can be handled simultaneously. As part of this move the value was wrapped in the similarly — but not identically — named EglDisplay, which contains an EGLDisplay pointer along with some other values and helper methods.
It looks like there's something going wrong with this. Disentangling it could be horrific, but there is something that works in our favour, which is that the two calls don't happen to far apart from one another.
In fact, the call stack for both seems to originate in this method:
The actual failure is happening inside libEGL. I looked at the code there and, to be honest, it's not clear to me what the underlying reason is. Nevertheless, it's clear that there's something going wrong here in the gecko code that can be fixed and which should address it.
When the WebView is initialised it has to create an EGL display. Starting at the DefaultEglDisplay() method I copied out above there's a problem and the problem is that the display is created not once, but twice. To understand this, we have to note the fact that some time ago inside the GLLibraryEGL::Init() method I added the following code:
This piece of code is a problem for two reasons, which become clear when we follow the flow through DefaultEglDisplay():
I'm about to make this change when something else peculiar jumps out at me. The call to GLLibraryEGL::DefaultDisplay() checks whether the value of mDefaultDisplay is valid and only calls CreateDisplay() to create a new one if it's not. That's strange, since the value should already have been set in GLLibraryEGL::Init(). The exception is if CreateDisplay() returns null. It is possible that this is what's causing this issue. Let's check:
I've deleted the five lines. I've set the build running.
While that chugs away, let's briefly look at the other potentially relevant issue that valgrind threw up. This issue relates to APZCTreeManager::PrintAPZCInfo(). Although that doesn't sound especially relevant, the code is from the Compositor thread, so may be worth looking in to just in case.
It's been a long one today and I apologise for that. Sometimes it takes a lot to unravel the code and structure my thoughts around it. The key question now is whether the small change I've made will make any difference to the rendering. We'll find that out tomorrow.
If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
First there's an invalid read size coming from GetAndInitDisplay(). This should be straightforward to track down.
==29044== Thread 32 Compositor: ==29044== Invalid read of size 8 ==29044== at 0xCCF62E0: hybris_egl_display_get_mapping (in /usr/lib64/ libEGL.so.1.0.0) ==29044== by 0xCCF63BB: ??? (in /usr/lib64/libEGL.so.1.0.0) ==29044== by 0x758848B: fGetDisplay (GLLibraryEGL.h:193) ==29044== by 0x758848B: mozilla::gl::GetAndInitDisplay(mozilla::gl:: GLLibraryEGL&, void*, void*) (GLLibraryEGL.cpp:151) ==29044== by 0x75889C7: mozilla::gl::GLLibraryEGL::CreateDisplay(bool, nsTSubstring<char>*, void*) (GLLibraryEGL.cpp:813) ==29044== by 0x7589917: mozilla::gl::GLLibraryEGL::DefaultDisplay( nsTSubstring<char>*) (GLLibraryEGL.cpp:745) ==29044== by 0x759AFBF: DefaultEglDisplay (GLContextEGL.h:33) ==29044== by 0x759AFBF: mozilla::gl::GLContextProviderEGL::CreateHeadless( mozilla::gl::GLContextCreateDesc const&, nsTSubstring<char>*) ( GLContextProviderEGL.cpp:1246) ==29044== by 0x759B89B: mozilla::gl::GLContextProviderEGL::CreateOffscreen( mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gl:: SurfaceCaps const&, mozilla::gl::CreateContextFlags, nsTSubstring<char>*) ( GLContextProviderEGL.cpp:1288) ==29044== by 0x76042BB: mozilla::layers::CompositorOGL::CreateContext() ( CompositorOGL.cpp:256) [...] ==29044== Address 0x14c39ef0 is 0 bytes inside a block of size 40 free'd ==29044== at 0x484EAD8: operator delete(void*, unsigned long) ( vg_replace_malloc.c:935) ==29044== by 0xCCF64BB: eglTerminate (in /usr/lib64/libEGL.so.1.0.0) ==29044== by 0x7587473: fTerminate (GLLibraryEGL.h:234) ==29044== by 0x7587473: fTerminate (GLLibraryEGL.h:639) ==29044== by 0x7587473: mozilla::gl::EglDisplay::~EglDisplay() ( GLLibraryEGL.cpp:734) ==29044== by 0x758752B: destroy<mozilla::gl::EglDisplay> (new_allocator.h: 140) ==29044== by 0x758752B: destroy<mozilla::gl::EglDisplay> (alloc_traits.h:487) ==29044== by 0x758752B: std::_Sp_counted_ptr_inplace<mozilla::gl:: EglDisplay, std::allocator<mozilla::gl::EglDisplay>, (__gnu_cxx:: _Lock_policy)2>::_M_dispose() (shared_ptr_base.h:554) ==29044== by 0x7575F33: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>:: _M_release() (shared_ptr_base.h:155) ==29044== by 0x758937F: ~__shared_count (shared_ptr_base.h:728) ==29044== by 0x758937F: ~__shared_ptr (shared_ptr_base.h:1167) ==29044== by 0x758937F: ~shared_ptr (shared_ptr.h:103) ==29044== by 0x758937F: mozilla::gl::GLLibraryEGL::Init(bool, nsTSubstring<char>*, void*) (GLLibraryEGL.cpp:504) ==29044== by 0x75895F7: mozilla::gl::GLLibraryEGL::Create( nsTSubstring<char>*) (GLLibraryEGL.cpp:345) ==29044== by 0x758974F: mozilla::gl::DefaultEglLibrary(nsTSubstring<char>*) ( GLContextProviderEGL.cpp:1331) ==29044== by 0x759AFAB: DefaultEglDisplay (GLContextEGL.h:29) ==29044== by 0x759AFAB: mozilla::gl::GLContextProviderEGL::CreateHeadless( mozilla::gl::GLContextCreateDesc const&, nsTSubstring<char>*) ( GLContextProviderEGL.cpp:1246) ==29044== by 0x759B89B: mozilla::gl::GLContextProviderEGL::CreateOffscreen( mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gl:: SurfaceCaps const&, mozilla::gl::CreateContextFlags, nsTSubstring<char>*) ( GLContextProviderEGL.cpp:1288) ==29044== by 0x76042BB: mozilla::layers::CompositorOGL::CreateContext() ( CompositorOGL.cpp:256) [...] ==29044== Block was alloc'd at ==29044== at 0x484BF24: operator new(unsigned long) (vg_replace_malloc.c:422) ==29044== by 0x14E619FB: waylandws_GetDisplay (in /usr/lib64/libhybris/ eglplatform_wayland.so) ==29044== by 0xCCF63C7: ??? (in /usr/lib64/libEGL.so.1.0.0) ==29044== by 0x14960F73: ??? (in /usr/lib64/qt5/plugins/ wayland-graphics-integration-client/libwayland-egl.so) ==29044== by 0x14778947: QtWaylandClient::QWaylandIntegration:: initializeClientBufferIntegration() (in /usr/lib64/ libQt5WaylandClient.so.5.6.3) [...]Let's dig into this one first. The code that's identified in the error output is the following:
static std::shared_ptr<EglDisplay> GetAndInitDisplay(GLLibraryEGL& egl, void* displayType, EGLDisplay display = EGL_NO_DISPLAY) { if (display == EGL_NO_DISPLAY) { display = egl.fGetDisplay(displayType); } if (!display) return nullptr; return EglDisplay::Create(egl, display, false); }It's the call to egl.fGetDisplay() that we can see towards the top of the error backtrace. But note that there are two call stacks in the output. The first is for the data being read that leads to the code above, the second is for where the memory being accessed was previously deleted:
==29044== Address 0x14c39ef0 is 0 bytes inside a block of size 40 free'd ==29044== at 0x484EAD8: operator delete(void*, unsigned long) ( vg_replace_malloc.c:935)The error itself is "Invalid read of size 8" which makes sense: we're reading from memory that's no longer allocated. Checking the call stack, here's the code (it's specifically the call to fTerminate()) that frees the memory that's subsequently read:
EglDisplay::~EglDisplay() { fTerminate(); mLib->mActiveDisplays.erase(mDisplay); }It looks like value of egl that's being passed into GetAndInitDisplay() is an instance of EglDisplay that has already been deleted. Not good.
The reason the for the EGLDisplay being deleted is because of this call here:
std::shared_ptr<EglDisplay> defaultDisplay = CreateDisplay(forceAccel, out_failureId, aDisplay);It's presumably not the call to CreateDisplay() that's causing the deletion, but rather the replacement of the EGlDisplay contained in the shared_ptr with the one that's returned from it.
In ESR 78 the logic around EglDisplay didn't exist. Back then there was just an EGLDisplay value, which is a pointer to an opaque EGL structure and which could be passed around everywhere. In ESR 91 an effort has been made to generalise this, so that multiple such displays can be handled simultaneously. As part of this move the value was wrapped in the similarly — but not identically — named EglDisplay, which contains an EGLDisplay pointer along with some other values and helper methods.
It looks like there's something going wrong with this. Disentangling it could be horrific, but there is something that works in our favour, which is that the two calls don't happen to far apart from one another.
In fact, the call stack for both seems to originate in this method:
inline std::shared_ptr<EglDisplay> DefaultEglDisplay( nsACString* const out_failureId) { const auto lib = DefaultEglLibrary(out_failureId); if (!lib) { return nullptr; } return lib->DefaultDisplay(out_failureId); }The call to DefaultEglLibrary() in this code ends up calling eglTerminate() which deletes the value. Then the call to DefaultDisplay() attempts to read the value in again. It reads the value that was just deleted, whereas it should — presumably — be using the newly created value.
The actual failure is happening inside libEGL. I looked at the code there and, to be honest, it's not clear to me what the underlying reason is. Nevertheless, it's clear that there's something going wrong here in the gecko code that can be fixed and which should address it.
When the WebView is initialised it has to create an EGL display. Starting at the DefaultEglDisplay() method I copied out above there's a problem and the problem is that the display is created not once, but twice. To understand this, we have to note the fact that some time ago inside the GLLibraryEGL::Init() method I added the following code:
std::shared_ptr<EglDisplay> defaultDisplay = CreateDisplay(forceAccel, out_failureId, aDisplay); if (!defaultDisplay) { return false; } mDefaultDisplay = defaultDisplay;Why did I add this? It was to replace the following code which had been removed from ESR 78 by D85496 in order to "Support multiple EglDisplays per GLLibraryEGL":
mEGLDisplay = CreateDisplay(forceAccel, gfxInfo, out_failureId, aDisplay); if (!mEGLDisplay) { return false; }It seemed natural, when reversing some of the changes moving from ESR 78 to ER 91, to reintroduce the creation of the display, albeit it's now an EglDisplay wrapper rather than the EGLDisplay itself. I remember these changes being a huge source of angst back when I made them around Day 77.
This piece of code is a problem for two reasons, which become clear when we follow the flow through DefaultEglDisplay():
- DefaultEglLibrary() is called. At this stage things are still being initialised and so there is no GLLibraryEGL just yet. As a result the code is called to create it.
- This leads to a call to GLLibraryEGL::Init() which initialises the library.
- Inside this Init() method the code I added to call CreateDisplay() is called, the intention being to mimic the behaviour of ESR 78.
- defaultDisplay is a shared pointer so when the assignment happens there's a call made to the EglDisplay destructor. I wasn't expecting this, but it seems to be borne out by what the debugger shows. This is the first problem: there should be no construction or destruction happening here apart from via the call to CreateDisplay().
- Execution continues until we return back into the body of the DefaultEglDisplay() method. The local lib variable now contains a usable GLLibraryEGL, off of which the DefaultDisplay() method is called.
- DefaultDisplay() then goes ahead and calls CreateDisplay() all over again in order to set up the default EGL display. This is the second problem: the display has already been created by this point; we shouldn't be doing it again.
(gdb) b EglDisplay::~EglDisplay (gdb) b GetAndInitDisplay (gdb) r [...] Thread 38 "Compositor" hit Breakpoint 4, mozilla::gl:: GetAndInitDisplay (egl=..., displayType=displayType@entry=0x0, display=display@entry=0x0) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:149 149 ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp: No such file or directory. (gdb) bt #0 mozilla::gl::GetAndInitDisplay (egl=..., displayType=displayType@entry=0x0, display=display@entry=0x0) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:149 #1 0x0000007ff111c9c8 in mozilla::gl::GLLibraryEGL::CreateDisplay ( this=this@entry=0x7ed80031c0, forceAccel=forceAccel@entry=false, out_failureId=out_failureId@entry=0x7fd40eb1c8, aDisplay=aDisplay@entry=0x0) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:813 #2 0x0000007ff111cdb0 in mozilla::gl::GLLibraryEGL::Init ( this=this@entry=0x7ed80031c0, forceAccel=forceAccel@entry=false, out_failureId=out_failureId@entry=0x7fd40eb1c8, aDisplay=aDisplay@entry=0x0) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:504 #3 0x0000007ff111d5f8 in mozilla::gl::GLLibraryEGL::Create ( out_failureId=out_failureId@entry=0x7fd40eb1c8) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:345 #4 0x0000007ff111d750 in mozilla::gl::DefaultEglLibrary ( out_failureId=out_failureId@entry=0x7fd40eb1c8) at ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1331 [...] #29 0x0000007ff6a0289c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/ clone.S:78 (gdb) c Continuing. Thread 38 "Compositor" hit Breakpoint 3, mozilla::gl::EglDisplay:: ~EglDisplay (this=0x7ed8003550, __in_chrg=<optimized out>) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:733 733 in ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp (gdb) bt #0 mozilla::gl::EglDisplay::~EglDisplay (this=0x7ed8003550, __in_chrg=<optimized out>) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:733 #1 0x0000007ff111b52c in __gnu_cxx::new_allocator<mozilla::gl::EglDisplay>:: destroy<mozilla::gl::EglDisplay> (__p=<optimized out>, this=<optimized out>) at include/c++/8.3.0/ext/new_allocator.h:140 #2 std::allocator_traits<std::allocator<mozilla::gl::EglDisplay> >:: destroy<mozilla::gl::EglDisplay> (__p=<optimized out>, __a=...) at include/c++/8.3.0/bits/alloc_traits.h:487 #3 std::_Sp_counted_ptr_inplace<mozilla::gl::EglDisplay, std:: allocator<mozilla::gl::EglDisplay>, (__gnu_cxx::_Lock_policy)2>::_M_dispose ( this=<optimized out>) at include/c++/8.3.0/bits/shared_ptr_base.h:554 #4 0x0000007ff1109f34 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>:: _M_release (this=0x7ed8003540) at include/c++/8.3.0/ext/atomicity.h:69 #5 0x0000007ff111d380 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>:: ~__shared_count (this=0x7fd40ea790, __in_chrg=<optimized out>) at include/c++/8.3.0/bits/shared_ptr_base.h:1167 #6 std::__shared_ptr<mozilla::gl::EglDisplay, (__gnu_cxx::_Lock_policy)2>:: ~__shared_ptr (this=0x7fd40ea788, __in_chrg=<optimized out>) at include/c++/8.3.0/bits/shared_ptr_base.h:1167 #7 std::shared_ptr<mozilla::gl::EglDisplay>::~shared_ptr (this=0x7fd40ea788, __in_chrg=<optimized out>) at include/c++/8.3.0/bits/shared_ptr.h:103 #8 mozilla::gl::GLLibraryEGL::Init (this=this@entry=0x7ed80031c0, forceAccel=forceAccel@entry=false, out_failureId=out_failureId@entry=0x7fd40eb1c8, aDisplay=aDisplay@entry=0x0) at ${PROJECT}/gecko-dev/gfx/gl/ GLLibraryEGL.cpp:504 #9 0x0000007ff111d5f8 in mozilla::gl::GLLibraryEGL::Create ( out_failureId=out_failureId@entry=0x7fd40eb1c8) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:345 #10 0x0000007ff111d750 in mozilla::gl::DefaultEglLibrary ( out_failureId=out_failureId@entry=0x7fd40eb1c8) at ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1331 [...] #35 0x0000007ff6a0289c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/ clone.S:78 (gdb) c Continuing. Thread 38 "Compositor" hit Breakpoint 4, mozilla::gl:: GetAndInitDisplay (egl=..., displayType=displayType@entry=0x0, display=display@entry=0x0) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:149 149 in ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp (gdb) bt #0 mozilla::gl::GetAndInitDisplay (egl=..., displayType=displayType@entry=0x0, display=display@entry=0x0) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:149 #1 0x0000007ff111c9c8 in mozilla::gl::GLLibraryEGL::CreateDisplay ( this=this@entry=0x7ed80031c0, forceAccel=forceAccel@entry=false, out_failureId=out_failureId@entry=0x7fd40eb1c8, aDisplay=aDisplay@entry=0x0) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:813 #2 0x0000007ff111d918 in mozilla::gl::GLLibraryEGL::DefaultDisplay ( this=0x7ed80031c0, out_failureId=out_failureId@entry=0x7fd40eb1c8) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:745 #3 0x0000007ff112efc0 in mozilla::gl::DefaultEglDisplay ( out_failureId=0x7fd40eb1c8) at ${PROJECT}/gecko-dev/gfx/gl/GLContextEGL.h:33 #4 mozilla::gl::GLContextProviderEGL::CreateHeadless (desc=..., out_failureId=out_failureId@entry=0x7fd40eb1c8) at ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1246 #5 0x0000007ff112f89c in mozilla::gl::GLContextProviderEGL::CreateOffscreen ( size=..., minCaps=..., flags=flags@entry=mozilla::gl::CreateContextFlags::REQUIRE_COMPAT_PROFILE, out_failureId=out_failureId@entry=0x7fd40eb1c8) at ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1288 #6 0x0000007ff11982bc in mozilla::layers::CompositorOGL::CreateContext ( this=this@entry=0x7ed8002ed0) at ${PROJECT}/gecko-dev/gfx/layers/opengl/CompositorOGL.cpp:254 [...] #27 0x0000007ff6a0289c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/ clone.S:78 (gdb) c [...]Assuming that all of the above is correct, on the face of it there seems to be a simple solution, which is to remove the code I added to GLLibraryEGL::Init() which calls CreateDisplay(). An important consequence of this is that mDefaultDisplay will then no longer be set. We can't just set this inside DefaultEglDisplay() because that's in the wrong context. But it's already being set by GLLibraryEGL::DefaultDisplay(), which is in the right context (it's part of the GLLibraryEGL) and should be enough already.
I'm about to make this change when something else peculiar jumps out at me. The call to GLLibraryEGL::DefaultDisplay() checks whether the value of mDefaultDisplay is valid and only calls CreateDisplay() to create a new one if it's not. That's strange, since the value should already have been set in GLLibraryEGL::Init(). The exception is if CreateDisplay() returns null. It is possible that this is what's causing this issue. Let's check:
(gdb) b GLLibraryEGL.cpp:504 Breakpoint 5 at 0x7ff111cd98: file ${PROJECT}/gecko-dev/gfx/gl/ GLLibraryEGL.cpp, line 504. (gdb) r [...] Thread 37 "Compositor" hit Breakpoint 6, mozilla::gl::GLLibraryEGL:: Init (this=this@entry=0x7ee00031c0, forceAccel=forceAccel@entry=false, out_failureId=out_failureId@entry=0x7f1f7bf1c8, aDisplay=aDisplay@entry=0x0) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:504 504 ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp: No such file or directory. (gdb) p defaultDisplay $14 = std::shared_ptr<mozilla::gl::EglDisplay> (use count -134204000, weak count 125) = {get() = 0x0} (gdb) p mDefaultDisplay $15 = std::weak_ptr<mozilla::gl::EglDisplay> (empty) = {get() = 0x0} (gdb) n 505 in ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp (gdb) p defaultDisplay $16 = std::shared_ptr<mozilla::gl::EglDisplay> (use count 1, weak count 1) = {get() = 0x7ee0003570} (gdb) p mDefaultDisplay $17 = std::weak_ptr<mozilla::gl::EglDisplay> (empty) = {get() = 0x0} (gdb) n 508 in ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp (gdb) n 510 in ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp (gdb) p mDefaultDisplay $18 = std::weak_ptr<mozilla::gl::EglDisplay> (use count 1, weak count 2) = {get( ) = 0x7ee0003570} (gdb) p this $19 = (mozilla::gl::GLLibraryEGL * const) 0x7ee00031c0As we can see that's not what's happening. The actual reason is that when it comes to the call to GLLibraryEGL::DefaultDisplay() the pointer has expired and hence it's created all over again.
(gdb) b GLLibraryEGL::DefaultDisplay Breakpoint 7 at 0x7ff111d7f4: file ${PROJECT}/gecko-dev/gfx/gl/ GLLibraryEGL.cpp, line 741. (gdb) c Continuing. Thread 37 "Compositor" hit Breakpoint 7, mozilla::gl::GLLibraryEGL:: DefaultDisplay (this=0x7ee00031c0, out_failureId=out_failureId@entry=0x7f1f7bf1c8) at ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp:741 741 in ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp (gdb) n 742 in ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp (gdb) p this $20 = (mozilla::gl::GLLibraryEGL * const) 0x7ee00031c0 (gdb) p mDefaultDisplay $21 = std::weak_ptr<mozilla::gl::EglDisplay> (expired, weak count 1) = {get() = 0x7ee0003570} (gdb) p ret $22 = std::shared_ptr<mozilla::gl::EglDisplay> (use count 1634882661, weak count 25954) = {get() = 0x7f1f7bf160} (gdb) n 745 in ${PROJECT}/gecko-dev/gfx/gl/GLLibraryEGL.cpp (gdb)The flow all makes sense now and the solution to all of these issues is the same: remove the code I added to Init(). As long as nothing is attempting to access the mDefaultDisplay before GLLibraryEGL::DefaultDisplay() is called, it should all work out fine.
I've deleted the five lines. I've set the build running.
While that chugs away, let's briefly look at the other potentially relevant issue that valgrind threw up. This issue relates to APZCTreeManager::PrintAPZCInfo(). Although that doesn't sound especially relevant, the code is from the Compositor thread, so may be worth looking in to just in case.
==29044== Thread 32 Compositor: ==29044== Mismatched free() / delete / delete [] ==29044== at 0x484E858: operator delete(void*) (vg_replace_malloc.c:923) ==29044== by 0x5CEFFFF: std::__cxx11::basic_stringstream<char, std:: char_traits<char>, std::allocator<char> >::~basic_stringstream() (in /usr/ lib64/libstdc++.so.6.0.25) ==29044== by 0x767C23F: void mozilla::layers::APZCTreeManager:: PrintAPZCInfo<mozilla::layers::LayerMetricsWrapper>(mozilla::layers:: LayerMetricsWrapper const&, mozilla::layers::AsyncPanZoomController const*) (APZCTreeManager.cpp:1014) ==29044== by 0x768DC63: mozilla::layers::HitTestingTreeNode* mozilla::layers: :APZCTreeManager::PrepareNodeForLayer<mozilla::layers::LayerMetricsWrapper>( mozilla::RecursiveMutexAutoLock const&, mozilla::layers:: LayerMetricsWrapper const&, mozilla::layers::FrameMetrics const&, mozilla:: layers::LayersId, mozilla::Maybe<mozilla::layers::ZoomConstraints> const&, mozilla::layers::AncestorTransform const&, mozilla::layers:: HitTestingTreeNode*, mozilla::layers::HitTestingTreeNode*, mozilla::layers:: APZCTreeManager::TreeBuildingState&) (APZCTreeManager.cpp:1323) ==29044== by 0x768E55F: mozilla::layers::APZCTreeManager:: UpdateHitTestingTreeImpl<mozilla::layers::LayerMetricsWrapper>(mozilla:: layers::LayerMetricsWrapper const&, bool, mozilla::layers::LayersId, unsigned int)::{lambda(mozilla::layers::LayerMetricsWrapper)#2}::operator()( mozilla::layers::LayerMetricsWrapper) const (APZCTreeManager.cpp:481) [...] ==29044== Address 0x18c00050 is 0 bytes inside a block of size 513 alloc'd ==29044== at 0x484B7C0: malloc (vg_replace_malloc.c:381) ==29044== by 0x48F61EB: moz_xmalloc (in /usr/lib64/ libqt5embedwidget.so.1.53.9) ==29044== by 0x48F644B: std::__cxx11::basic_string<char, std:: char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) (in /usr/lib64/libqt5embedwidget.so.1.53.9) ==29044== by 0x5CFC2E7: std::__cxx11::basic_string<char, std:: char_traits<char>, std::allocator<char> >::reserve(unsigned long) (in /usr/ lib64/libstdc++.so.6.0.25) ==29044== by 0x5CF14D7: std::__cxx11::basic_stringbuf<char, std:: char_traits<char>, std::allocator<char> >::overflow(int) (in /usr/lib64/ libstdc++.so.6.0.25) ==29044== by 0x5CFA7B7: std::basic_streambuf<char, std::char_traits<char> >:: xsputn(char const*, long) (in /usr/lib64/libstdc++.so.6.0.25) ==29044== by 0x5CEC1AB: std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std:: basic_ostream<char, std::char_traits<char> >&, char const*, long) (in /usr/ lib64/libstdc++.so.6.0.25) ==29044== by 0x765AC0B: operator<< <std::char_traits<char> > (ostream:561) ==29044== by 0x765AC0B: mozilla::layers::operator<<(std::ostream&, mozilla:: layers::ScrollableLayerGuid const&) (ScrollableLayerGuid.cpp:52) ==29044== by 0x767BE3F: void mozilla::layers::APZCTreeManager:: PrintAPZCInfo<mozilla::layers::LayerMetricsWrapper>(mozilla::layers:: LayerMetricsWrapper const&, mozilla::layers::AsyncPanZoomController const*) (APZCTreeManager.cpp:1015) ==29044== by 0x768DC63: mozilla::layers::HitTestingTreeNode* mozilla::layers: :APZCTreeManager::PrepareNodeForLayer<mozilla::layers::LayerMetricsWrapper>( mozilla::RecursiveMutexAutoLock const&, mozilla::layers:: LayerMetricsWrapper const&, mozilla::layers::FrameMetrics const&, mozilla:: layers::LayersId, mozilla::Maybe<mozilla::layers::ZoomConstraints> const&, mozilla::layers::AncestorTransform const&, mozilla::layers:: HitTestingTreeNode*, mozilla::layers::HitTestingTreeNode*, mozilla::layers:: APZCTreeManager::TreeBuildingState&) (APZCTreeManager.cpp:1323) [...]The definition of the APZCTreeManager::PrintAPZCInfo() method that's referred to in the output above looks like this:
template <class ScrollNode> void APZCTreeManager::PrintAPZCInfo(const ScrollNode& aLayer, const AsyncPanZoomController* apzc) { const FrameMetrics& metrics = aLayer.Metrics(); std::stringstream guidStr; guidStr << apzc->GetGuid(); mApzcTreeLog << "APZC " << guidStr.str() << "\tcb=" << metrics.GetCompositionBounds() << "\tsr=" << metrics.GetScrollableRect() << (metrics.IsScrollInfoLayer() ? "\tscrollinfo" : "") << (apzc->HasScrollgrab() ? "\tscrollgrab" : "") << "\t" << aLayer.Metadata().GetContentDescription().get(); }There may be something to do here, but this is so far removed from the rendering pipeline that I don't see how it can be related to the issues I'm trying to fix. So I'm going to ignore this one and potentially come back to it later. I'm thinking there's some work to be done cleaning up the valgrind output and if I do eventually do so, this would fall under that.
It's been a long one today and I apologise for that. Sometimes it takes a lot to unravel the code and structure my thoughts around it. The key question now is whether the small change I've made will make any difference to the rendering. We'll find that out tomorrow.
If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comments
Uncover Disqus comments