flypig.co.uk

Gecko-dev Diary

Starting in August 2023 I'll be upgrading the Sailfish OS browser from Gecko version ESR 78 to ESR 91. This page catalogues my progress.

Latest code changes are in the gecko-dev sailfishos-esr91 branch.

There is an index of all posts in case you want to jump to a particular day.

Gecko RSS feed Click the icon for the Gecko-dev Diary RSS feed.

Gecko

5 most recent items

24 May 2024 : Day 242 #
It's funny how even when asleep I'm capable of making stupid mistakes. Yesterday I applied patch 0044 in order to restore the OnFirstPaint() trigger functionality. This is critical for getting the offscreen rendering pipeline to work. But as is the case after applying a patch I needed to rebuild the packages before they could be tested on my phone.

I kicked the build off to run overnight and then — d'oh! — promptly sent my laptop to sleep. It doesn't take a genius to realise that the build isn't going to run while the laptop is sleeping.

So here I am this morning without packages to test. At least my laptop is now awake and the build is continuing. But this has lost me the opportunity to do some development this morning.

Instead I've spent the time ordering myself a Jolla C2, the new Sailfish OS phone announced at Jolla Love Day 2 this last Monday. It's an interesting move by Jolla. For years now members of the community have been encouraging Jolla to offer Sailfish OS pre-installed. On the other hand, many in the community are also keen to see Sailfish OS offered on a very specific selection of handsets, be they PinePHones, Fairphones or brands that are more widely available in Europe, especially those known for being more "open".

My own opinion is that this is a good move by Jolla. There is no perfect phone and every choice has its benefits and drawbacks. Going with Sony phones all those years ago was controversial, but Sony hardware has (in my opinion) been both excellent and offering good value for the hardware. Nice materials, unassuming design and unexpectedly solid features (with headphones jacks, SD card slots, notification lights, and excellent screens to boot). Even more importantly Sony retained their commitment to openness over the long term, when other manufacturers have long since lost interest. On the other hand PinePhone and Fairphone have credentials that are really appealing for Sailfish users, who tend to prize openness and ethical considerations over price and features alone.

The new Reeder phone offers something a little different to either of these, which is the opportunity for Jolla to work hand-in-hand with the manufacturer. Jolla has a good track record doing this and the results historically have been very good. The new Reeder phone might not be open in the same way that the Sony or PinePhone are, but it's likely Jolla will have been given access to much more of what they need, including support, in order to build a really solid software experience for the hardware.

I'm speculating of course and only time will tell, but given we're still expecting Sailfish OS releases for the Xperia 10 IV and Xperia 10 V, and given I already have one of the latter, these are exciting times as far as I'm concerned.

Alright, let's get back to development. The build has carried on for a while but eventually failed with an error. Here's the debug output generated from the build process.
813:10.10 ${PROJECT}/gecko-dev/layout/base/nsRefreshDriver.cpp: In member 
    function ‘void nsRefreshDriver::Tick(mozilla::VsyncId, mozilla::TimeStamp, 
    nsRefreshDriver::IsExtraTick)’:
813:10.10 ${PROJECT}/gecko-dev/layout/base/nsRefreshDriver.cpp:2430:41: error: 
    ‘mNextTransactionId’ was not declared in this scope
813:10.10        transactionId.AppendInt((uint64_t)mNextTransactionId);
813:10.10                                          ^~~~~~~~~~~~~~~~~~
813:10.12 ${PROJECT}/gecko-dev/layout/base/nsRefreshDriver.cpp:2430:41: note: 
    suggested alternative: ‘GetTransactionId’
813:10.13        transactionId.AppendInt((uint64_t)mNextTransactionId);
813:10.13                                          ^~~~~~~~~~~~~~~~~~
813:10.13                                          GetTransactionId
813:14.49 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:694: 
    nsRefreshDriver.o] Error 1
The error is telling us that mNextTransactionIddoesn't exist and that's correct because it was removed by the patch. But there's still an instance remaining that the patch didn't capture. To get this through I've had to make one small change. This code looks to be related to profiling so probably isn't really needed for our purposes anyway, but we do need it to compile.
$ git diff layout/base/nsRefreshDriver.cpp
diff --git a/layout/base/nsRefreshDriver.cpp b/layout/base/nsRefreshDriver.cpp
index 33c90bb1f324..bb43dea053ea 100644
--- a/layout/base/nsRefreshDriver.cpp
+++ b/layout/base/nsRefreshDriver.cpp
@@ -2427,7 +2427,7 @@ void nsRefreshDriver::Tick(VsyncId aId, TimeStamp 
    aNowTime,
     nsCString transactionId;
     if (profiler_can_accept_markers()) {
       transactionId.AppendLiteral("Transaction ID: ");
-      transactionId.AppendInt((uint64_t)mNextTransactionId);
+      transactionId.AppendInt((uint64_t)mPendingTransaction);
     }
     AUTO_PROFILER_MARKER_TEXT(
         "ViewManagerFlush", GRAPHICS,
This is a similar change to that made elsewhere in the code, so should hopefully be pretty benign.

As the build is still taking a while I've decide I can't let all this potential development time go to waste. So I've decided to take the plunge and perform a test while the build continues. The current issue my most recent changes are aimed at addressing is the fact that the mIsPainted variable in the QMozViewPrivate class never gets set to true. So I'm wondering what will happen if I forcefully set it to true at a suitable time using the debugger.

If I'm honest I'm a little nervous about performing this test. If it turns out to have no effect then it'll leave me scratching my head yet again. I really want this rendering to work.

But I've plucked up the courage and am going to give it a go. First set the breakpoint on the method where the test for this flag is being made.
(gdb) b QuickMozView::updatePaintNode
Function "QuickMozView::updatePaintNode" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (QuickMozView::updatePaintNode) pending.
(gdb) r
Starting program: /usr/bin/harbour-webview
[...]
Now run the program through to the first time this breakpoint is hit. I don't want to flip the flag here because the rendering pipeline and textures aren't fully set up yet, so I'm going to wait for a later hit of the breakpoint.
Thread 9 "QSGRenderThread" hit Breakpoint 1, QuickMozView::
    updatePaintNode (this=0x555586dc20, oldNode=0x0) at quickmozview.cpp:172
172         if (width() <= 0 || height() <= 0) {
(gdb) c
Continuing.
[...]
=============== Preparing offscreen rendering context ===============
[...]
As we can see the rendering context has now been configured. Following this the breakpoint is hit again. Now is the time to change the flag from true to false. Here goes:
Thread 9 &quot;QSGRenderThread&quot; hit Breakpoint 1, QuickMozView::
    updatePaintNode (this=0x555586dc20, oldNode=0x7fc800c640) at 
    quickmozview.cpp:172
172         if (width() <= 0 || height() <= 0) {
(gdb) p d->mIsPainted
$1 = false
(gdb) set d->mIsPainted = true
(gdb) p d->mIsPainted
$2 = true
(gdb) disable break
(gdb) c
Continuing.
[...]
Finally I disabled the breakpoint and set the program running again. And the result? I'm astonished to find that the result is a fully rendered Sailfish OS website!
 
A phone showing the sailfishos.org website next to a laptop on a table with the debugger running in a console window.

Scrolling around, selecting links and interacting with the site all seem to work nicely. This is really encouraging. This means that once this patch is properly applied, that should be the final piece of the puzzle needed to get the WebView working again.
 
Three screenshots of the WebView rendering: the sailfish.org site at the top; the same site scrolled down a bit; and shop.jolla.com with the new Jolla C2 on display

I'm quite excited by this!

Unfortunately this is a debugging hack so we're not there yet. And the build has failed again. At least now, motivated by my excitement, I have plenty of energy to fix the build.

Here's the latest output from the failed build:
302:48.25 In file included from Unified_cpp_layout_base2.cpp:20:
302:48.25 ${PROJECT}/gecko-dev/layout/base/nsPresContext.cpp: In member 
    function ‘void nsPresContext::DetachPresShell()’:
302:48.25 ${PROJECT}/gecko-dev/layout/base/nsPresContext.cpp:843:15: error: 
    ‘class nsRootPresContext’ has no member named 
    ‘CancelApplyPluginGeometryTimer’; did you mean ‘mApplyPluginGeometryTimer’?
302:48.25      thisRoot->CancelApplyPluginGeometryTimer();
302:48.25                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
302:48.25                mApplyPluginGeometryTimer
302:48.40 ${PROJECT}/gecko-dev/layout/base/nsPresContext.cpp: In destructor 
    ‘virtual nsRootPresContext::~nsRootPresContext()’:
302:48.40 ${PROJECT}/gecko-dev/layout/base/nsPresContext.cpp:2683:3: error: 
    ‘CancelApplyPluginGeometryTimer’ was not declared in this scope
302:48.40    CancelApplyPluginGeometryTimer();
302:48.40    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
302:48.47 ${PROJECT}/gecko-dev/layout/base/nsPresContext.cpp:2683:3: note: 
    suggested alternative:  mApplyPluginGeometryTimer’
302:48.47    CancelApplyPluginGeometryTimer();
302:48.47    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
302:48.47    mApplyPluginGeometryTimer
302:48.51 ${PROJECT}/gecko-dev/layout/base/nsPresContext.cpp: In member 
    function ‘void nsRootPresContext::ComputePluginGeometryUpdates(nsIFrame*, 
    nsDisplayListBuilder*, nsDisplayList*)’:
302:48.51 ${PROJECT}/gecko-dev/layout/base/nsPresContext.cpp:2712:21: error: 
    ‘nsPluginFrame’ does not name a type; did you mean ‘nsPageFrame’?
302:48.51          static_cast<nsPluginFrame*>(iter.Get()->GetKey(
    )->GetPrimaryFrame());
302:48.51                      ^~~~~~~~~~~~~
302:48.51                      nsPageFrame
These are actually rather odd errors. For example nsPageFrame has been added to the code even though it's not part of the upstream change. It must have been pulled in due to some changes applied later.

I'm not very happy with this, it's all a bit messy. So I've decided to revert my attempt to automatically merge in the patch and go back to a manual approach instead.

This will require a bit of a rework, which unfortunately I don't have time to do tonight. But I'll start on it in the morning and with any luck may be able to get it done tomorrow. Which means that's it for today. But we're definitely getting there!

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
23 May 2024 : Day 241 #
It's a good job I stopped working on this last night when I did. On returning to it this morning with a clearer head it didn't take long to identify that the changes needed to revert the D5005 upstream code changes had already been wrapped up into a patch — patch 0044 to be precise — that I'd not yet applied.

If the patch can be applied cleanly, it could save me a lot of time. So first step is to do a dry run of applying the patch. This won't actually make any changes to the code, but it will tell me whether the patch is going to apply without incident if I attempt it properly.
$ patch --dry-run -d gecko-dev -p1 < rpm/
    0044-Revert-Bug-1445570-Remove-EnsureEventualAfterPaint-t.patch 
checking file dom/base/nsDOMWindowUtils.cpp
Hunk #1 succeeded at 407 (offset 51 lines).
checking file layout/base/nsDocumentViewer.cpp
Hunk #1 succeeded at 1573 (offset -44 lines).
Hunk #2 succeeded at 1718 (offset -45 lines).
Hunk #3 succeeded at 3649 (offset -254 lines).
checking file layout/base/nsPresContext.cpp
Hunk #1 succeeded at 314 (offset 3 lines).
Hunk #2 FAILED at 804.
Hunk #3 succeeded at 2021 (offset 66 lines).
Hunk #4 FAILED at 2039.
Hunk #5 FAILED at 2121.
Hunk #6 FAILED at 2613.
Hunk #7 succeeded at 2716 with fuzz 2 (offset -132 lines).
4 out of 7 hunks FAILED
checking file layout/base/nsPresContext.h
Hunk #1 succeeded at 355 (offset -5 lines).
Hunk #2 succeeded at 912 (offset -4 lines).
Hunk #3 FAILED at 1363.
Hunk #4 FAILED at 1455.
2 out of 4 hunks FAILED
checking file layout/base/nsRefreshDriver.cpp
Hunk #1 FAILED at 1166.
Hunk #2 FAILED at 1239.
Hunk #3 FAILED at 2337.
Hunk #4 FAILED at 2396.
Hunk #5 FAILED at 2404.
5 out of 5 hunks FAILED
checking file layout/base/nsRefreshDriver.h
Hunk #1 FAILED at 507.
1 out of 1 hunk FAILED
checking file layout/style/test/test_restyles_in_smil_animation.html
There are quite a few nice succeeded lines there. But also a disturbing number of FAILED lines as well. That's FAILED in all caps, just in case the word itself doesn't sufficiently convey the damage being done.

The advice Raine would give me in this situation is to apply a three-way merge using git. So that's what I'm going to try.
$ git am --3way ../rpm/
    0044-Revert-Bug-1445570-Remove-EnsureEventualAfterPaint-t.patch
Applying: Revert &quot;Bug 1445570 - Remove EnsureEventualAfterPaint timer. 
    r=tnikkel&quot;
Using index info to reconstruct a base tree...
M       dom/base/nsDOMWindowUtils.cpp
M       layout/base/nsDocumentViewer.cpp
M       layout/base/nsPresContext.cpp
M       layout/base/nsPresContext.h
M       layout/base/nsRefreshDriver.cpp
M       layout/base/nsRefreshDriver.h
Falling back to patching base and 3-way merge...
Auto-merging layout/base/nsRefreshDriver.h
CONFLICT (content): Merge conflict in layout/base/nsRefreshDriver.h
Auto-merging layout/base/nsRefreshDriver.cpp
CONFLICT (content): Merge conflict in layout/base/nsRefreshDriver.cpp
Auto-merging layout/base/nsPresContext.h
CONFLICT (content): Merge conflict in layout/base/nsPresContext.h
Auto-merging layout/base/nsPresContext.cpp
CONFLICT (content): Merge conflict in layout/base/nsPresContext.cpp
Auto-merging layout/base/nsDocumentViewer.cpp
Auto-merging dom/base/nsDOMWindowUtils.cpp
error: Failed to merge in the changes.
Patch failed at 0001 Revert &quot;Bug 1445570 - Remove EnsureEventualAfterPaint 
    timer. r=tnikkel&quot;
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run &quot;git am --continue&quot;.
If you prefer to skip this patch, run &quot;git am --skip&quot; instead.
To restore the original branch and stop patching, run &quot;git am 
    --abort&quot;.
This has done a better job than the patch did, but there are still some problems which require manual intervention. But after lots of manual changes, comparing the source code against the patch file, eventually I'm able to get it there.
git am --continue
It's still the morning and before I start work. It would be nice to kick off the build before I have to move on to my proper work day. That way there's an outside chance that the build will be completed and ready to test by this evening. So although I'd normally read carefully through the changes to check whether they look sound, I'm just going to kick off the build and hope for the best. Here we go!
sfdk build -d -p --with git_workaround
Sadly as the build trundles on into my work day it hits an error in the code.
27:10.41 In file included from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
    mozilla/dom/DocumentTimeline.h:16,
27:10.41                  from ${PROJECT}/gecko-dev/dom/animation/Animation.cpp:
    15,
27:10.41                  from Unified_cpp_dom_animation0.cpp:2:
27:10.41 ${PROJECT}/gecko-dev/layout/base/nsRefreshDriver.h: In member function 
    ‘bool nsRefreshDriver::AtPendingTransactionLimit()’:
27:10.41 ${PROJECT}/gecko-dev/layout/base/nsRefreshDriver.h:514:12: error: 
    ‘mPendingTransactions’ was not declared in this scope
27:10.41      return mPendingTransactions.Length() == 2;
27:10.42             ^~~~~~~~~~~~~~~~~~~~
27:10.42 ${PROJECT}/gecko-dev/layout/base/nsRefreshDriver.h:514:12: note: 
    suggested alternative: ‘mPendingTransaction’
27:10.42      return mPendingTransactions.Length() == 2;
27:10.42             ^~~~~~~~~~~~~~~~~~~~
27:10.42             mPendingTransaction
27:10.42 ${PROJECT}/gecko-dev/layout/base/nsRefreshDriver.h: In member function 
    ‘bool nsRefreshDriver::TooManyPendingTransactions()’:
27:10.42 ${PROJECT}/gecko-dev/layout/base/nsRefreshDriver.h:517:12: error: 
    ‘mPendingTransactions’ was not declared in this scope
27:10.42      return mPendingTransactions.Length() >= 2;
27:10.43             ^~~~~~~~~~~~~~~~~~~~
27:10.43 ${PROJECT}/gecko-dev/layout/base/nsRefreshDriver.h:517:12: note: 
    suggested alternative: ‘mPendingTransaction’
27:10.43      return mPendingTransactions.Length() >= 2;
27:10.43             ^~~~~~~~~~~~~~~~~~~~
27:10.43             mPendingTransaction
27:24.10 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:694: 
    Unified_cpp_dom_animation0.o] Error 1
27:24.10 make[3]: *** [${PROJECT}/gecko-dev/config/recurse.mk:72: dom/animation/
    target-objects] Error 2
These errors are clearly all caused by the patches I applied. Unfortunately I can't fix them while I'm at work, which means looking in to them in the evening. If I can get them done today there's a chance I'll be able to run the build overnight, but let's see.

The error is about use of mPendingTransactions in the nsRefreshDriver class. It's true that the member variable isn't defined there. Should it be?

It's not defined in the code in the upstream changeset on either side of the diff. Nor does it appear in the patch that we applied. Which is a little odd, except that there is reference in all of those to a mPendingTransaction member variable. Note the subtle difference: mPendingTransaction vs. mPendingTransactions (singular vs. plural).

And looking more carefully at the patch we can see the reason. Here are the changes that the patch tried to apply:
diff --git a/layout/base/nsRefreshDriver.h b/layout/base/nsRefreshDriver.h
index 07feab12e079..deec935f25f4 100644
--- a/layout/base/nsRefreshDriver.h
+++ b/layout/base/nsRefreshDriver.h
@@ -507,12 +507,7 @@ class nsRefreshDriver final : public mozilla::layers::
    TransactionIdAllocator,
   RefPtr<nsRefreshDriver> mRootRefresh;
 
   // The most recently allocated transaction id.
-  TransactionId mNextTransactionId;
-  // This number is mCompletedTransaction + (pending transaction count).
-  // When we revoke a transaction id, we revert this number (since it's
-  // no longer outstanding), but not mNextTransactionId (since we don't
-  // want to reuse the number).
-  TransactionId mOutstandingTransactionId;
+  TransactionId mPendingTransaction;
   // The most recently completed transaction id.
   TransactionId mCompletedTransaction;
But this failed and I made a manual intervention; one which I thought looked pretty similar, but which in fact included a crucial destructive change:
diff --git a/layout/base/nsRefreshDriver.h b/layout/base/nsRefreshDriver.h
index 29c595ba5ba8..b9a0fa4bb2ed 100644
--- a/layout/base/nsRefreshDriver.h
+++ b/layout/base/nsRefreshDriver.h
@@ -528,8 +528,9 @@ class nsRefreshDriver final : public mozilla::layers::
    TransactionIdAllocator,
   RefPtr<nsRefreshDriver> mRootRefresh;
 
   // The most recently allocated transaction id.
-  TransactionId mNextTransactionId;
-  AutoTArray<TransactionId, 3> mPendingTransactions;
+  TransactionId mPendingTransaction;
+  // The most recently completed transaction id.
+  TransactionId mCompletedTransaction;
As you can see, my change removed the mPendingTransactions (plural) variable. The good news is that after applying the patch this mPendingTransactions variable is only used in two places:
  bool AtPendingTransactionLimit() {
    return mPendingTransactions.Length() == 2;
  }
  bool TooManyPendingTransactions() {
    return mPendingTransactions.Length() >= 2;
  }
Neither of these two methods is used anywhere, so it should be safe to remove them. On the other hand, it seems that by applying the patch I've trampled over rather more changes than I'd anticipated. So while removing these two methods to get the build to pass is a simple solution, I'm also going to need to spend some time checking the other changes.

But, having worked through the patch and read through it side-by-side with the changes made to the ESR 91 code, it all looks sensible, albeit rather messy. It looks like the ESR 91 code now matches the code that would have been expected had the patch been applied to ESR 78 as a base.

I've therefore set the build running again. If it builds that'll be a good sign. The good news is also that once it's built, due to the work I've been doing over the last few days, it should be really easy to check whether this change has had the desired effect. All I need to do is check whether the OnFirstPaint() method is being called. If it is, then that's a good sign things have worked out.

Since it's building now I can't do much else until it completes, so I'm going to call it a night. By the morning it should be done and we can continue with some testing.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
22 May 2024 : Day 240 #
We're continuing to dig deeper into the first paint trigger code today. I've spent a couple of days digging around this already now and it's turning out to be a surprisingly complex investigation. I'm hoping the reward will be worth it, and to be honest, I think it will be: WebView rendering isn't going to work without getting to the bottom of this.

Yesterday we saw that the crucial OnFirstPaint() call was being made from EmbedLiteViewParent::RecvOnFirstPaint(). That's because the QMozViewPrivate class implements the EmbedLiteViewListener interface like this:
class QMozViewPrivate : public QObject,
    public mozilla::embedlite::EmbedLiteViewListener
{
[...]
This listener is set up to get called by EmbedLiteViewParent on receipt of the OnFirstPaint message:
mozilla::ipc::IPCResult EmbedLiteViewParent::RecvOnFirstPaint(const int32_t &aX,
                                                              const int32_t &aY)
{
  LOGT();
  NS_ENSURE_TRUE(mView && !mViewAPIDestroyed, IPC_OK());

  mView->GetListener()->OnFirstPaint(aX, aY);
  return IPC_OK();
}
This is called at the far end of a channel after receipt of a message sent potentially across a thread or process boundary. Sending things across process boundaries like this messes up the backtraces but are otherwise a pretty commonplace in Gecko. We just need to find out where the message is being sent from. Searching the code for where the Send happens is usually the way to go with this, and here it's no different. These send and receive messages are auto-generated from the interface files, so they always follow a strict pattern. A search for a call to SendOnFirstPaint() therefore throws up the following in the ESR 78 code:
NS_IMETHODIMP
EmbedLiteViewChild::OnFirstPaint(int32_t aX, int32_t aY)
{
  if (mDOMWindow) {
    nsCOMPtr<nsIDocShell> docShell = mDOMWindow->GetDocShell();
    if (docShell) {
      RefPtr<PresShell> presShell = docShell->GetPresShell();
      if (presShell) {
        nscolor bgcolor = presShell->GetCanvasBackground();
        Unused << SendSetBackgroundColor(bgcolor);
      }
    }
  }

  return SendOnFirstPaint(aX, aY) ? NS_OK : NS_ERROR_FAILURE;
}
We want to find out where this is called and for that we can use the debugger. Using it gives us the following backtrace.
Thread 7 &quot;GeckoWorkerThre&quot; hit Breakpoint 2, non-virtual thunk to 
    mozilla::embedlite::EmbedLiteViewChild::OnFirstPaint(int, int) ()
    at mobile/sailfishos/embedshared/EmbedLiteViewChild.h:57
57        NS_DECL_NSIEMBEDBROWSERCHROMELISTENER
(gdb) bt
#0  non-virtual thunk to mozilla::embedlite::EmbedLiteViewChild::OnFirstPaint(
    int, int) ()
    at mobile/sailfishos/embedshared/EmbedLiteViewChild.h:57
#1  0x0000007fbb306f38 in WebBrowserChrome::HandleEvent (this=0x7f8ccba130, 
    aEvent=0x7f8dd57a60)
    at mobile/sailfishos/utils/WebBrowserChrome.cpp:443
#2  0x0000007fb9d4b4a0 in mozilla::EventListenerManager::HandleEventSubType (
    this=this@entry=0x7f8ccbd3d0, aListener=<optimized out>, 
    aListener@entry=0x7f8cdb7d78, aDOMEvent=0x7f8dd57a60, 
    aCurrentTarget=<optimized out>, aCurrentTarget@entry=0x7eac138da0)
    at dom/events/EventListenerManager.cpp:1087
#3  0x0000007fb9d4e984 in mozilla::EventListenerManager::HandleEventInternal (
    this=0x7f8ccbd3d0, aPresContext=0x7f8cf82940, aEvent=0x7f8d5c7e80, 
    aDOMEvent=0x7fa7972858, aCurrentTarget=<optimized out>, 
    aEventStatus=<optimized out>, aItemInShadowTree=<optimized out>)
    at dom/events/EventListenerManager.cpp:1279
#4  0x0000007fb9d4f604 in mozilla::EventTargetChainItem::HandleEventTargetChain 
    (aChain=..., aVisitor=..., aCallback=aCallback@entry=0x0, aCd=...)
    at dom/events/EventDispatcher.cpp:558
#5  0x0000007fb9d50eb8 in mozilla::EventDispatcher::Dispatch (
    aTarget=<optimized out>, aPresContext=0x7f8cf82940, aEvent=0x7f8d5c7e80, 
    aDOMEvent=0x7f8dd57a60, aEventStatus=<optimized out>, aCallback=0x0, 
    aTargets=<optimized out>)
    at dom/events/EventDispatcher.cpp:1055
#6  0x0000007fba7f8f50 in nsPresContext::FireDOMPaintEvent (this=0x7f8cf82940, 
    aList=aList@entry=0x7ecc05ae78, aTransactionId=..., aTimeStamp=...)
    at obj-build-mer-qt-xr/dist/include/nsCOMPtr.h:847
#7  0x0000007fba7f9090 in DelayedFireDOMPaintEvent::Run (this=0x7ecc05ae50)
    at obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#8  0x0000007fb917d1a8 in nsContentUtils::RemoveScriptBlocker ()
    at dom/base/nsContentUtils.cpp:5407
#9  0x0000007fb917d218 in nsContentUtils::RemoveScriptBlocker ()
    at dom/base/nsContentUtils.cpp:5386
#10 0x0000007fba7fd020 in nsAutoScriptBlocker::~nsAutoScriptBlocker (
    this=<synthetic pointer>, __in_chrg=<optimized out>)
    at dom/base/nsContentUtils.h:3461
#11 nsRootPresContext::<lambda()>::operator() (__closure=0x7f8dd1cfe8)
    at layout/base/nsPresContext.cpp:2825
#12 mozilla::layers::GenericNamedTimerCallback<nsRootPresContext::
    EnsureEventualDidPaintEvent(nsPresContext::TransactionId)::<lambda()> >::
    Notify(nsITimer *) (this=0x7f8dd1cfd0) at obj-build-mer-qt-xr/dist/include/
    mozilla/layers/APZThreadUtils.h:81
#13 0x0000007fb84653e0 in nsTimerImpl::Fire (this=0x7f8cecfb50, aGeneration=1)
    at obj-build-mer-qt-xr/dist/include/mozilla/TimeStamp.h:136
#14 0x0000007fb84654c0 in nsTimerEvent::Run (this=0x7ed4001ab0)
    at obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#15 0x0000007fb845e4c4 in nsThread::ProcessNextEvent (aResult=0x7fa7972d77, 
    aMayWait=<optimized out>, this=0x7f8c02f610)
    at xpcom/threads/nsThread.cpp:1211
[...]
#32 0x0000007fbe70b89c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/
    clone.S:78
(gdb) 
That's a long backtrace but the magic appears to be happening inside a call to DelayedFireDOMPaintEvent() and we can check for this by searching through the code. It gets set up in nsPresContext::NotifyDidPaintForSubtree() inside nsPresContext.cpp where we have the following code, which is basically the same in both ESR 78 and ESR 91. This is a long piece of code, but the important parts are the two calls to DelayedFireDOMPaintEvent() which you can see around the middle and towards the end of the method:
void nsPresContext::NotifyDidPaintForSubtree(
    TransactionId aTransactionId, const mozilla::TimeStamp& aTimeStamp) {
  if (mFirstContentfulPaintTransactionId && !mHadContentfulPaintComposite) {
    if (aTransactionId >= *mFirstContentfulPaintTransactionId) {
      mHadContentfulPaintComposite = true;
      RefPtr<nsDOMNavigationTiming> timing = mDocument->GetNavigationTiming();
      if (timing && !aTimeStamp.IsNull()) {
        timing->NotifyContentfulPaintForRootContentDocument(aTimeStamp);
      }
    }
  }

  if (IsRoot()) {
    static_cast<nsRootPresContext*>(this)->CancelDidPaintTimers(aTransactionId);

    if (mTransactions.IsEmpty()) {
      return;
    }
  }

  if (!PresShell()->IsVisible() && mTransactions.IsEmpty()) {
    return;
  }

  // Non-root prescontexts fire MozAfterPaint to all their descendants
  // unconditionally, even if no invalidations have been collected. This is
  // because we don't want to eat the cost of collecting invalidations for
  // every subdocument (which would require putting every subdocument in its
  // own layer).

  bool sent = false;
  uint32_t i = 0;
  while (i < mTransactions.Length()) {
    if (mTransactions[i].mTransactionId <= aTransactionId) {
      if (!mTransactions[i].mInvalidations.IsEmpty()) {
        nsCOMPtr<nsIRunnable> ev = new DelayedFireDOMPaintEvent(
            this, &mTransactions[i].mInvalidations,
            mTransactions[i].mTransactionId, aTimeStamp);
        nsContentUtils::AddScriptRunner(ev);
        sent = true;
      }
      mTransactions.RemoveElementAt(i);
    } else {
      i++;
    }
  }

  if (!sent) {
    nsTArray<nsRect> dummy;
    nsCOMPtr<nsIRunnable> ev =
        new DelayedFireDOMPaintEvent(this, &dummy, aTransactionId, aTimeStamp);
    nsContentUtils::AddScriptRunner(ev);
  }
As anticipated there's no hit for this method on ESR 91, but on ESR 78 we can get a backtrace:
(gdb) break DelayedFireDOMPaintEvent::DelayedFireDOMPaintEvent
Breakpoint 8 at 0x7fba7fcc8c: DelayedFireDOMPaintEvent::
    DelayedFireDOMPaintEvent. (2 locations)
(gdb) r
[...]
Thread 7 &quot;GeckoWorkerThre&quot; hit Breakpoint 8, 0x0000007fba7fcc8c in 
    DelayedFireDOMPaintEvent::DelayedFireDOMPaintEvent (aTimeStamp=..., 
    aTransactionId=..., aList=<optimized out>, aPresContext=<optimized out>, 
    this=<optimized out>)
    at layout/base/nsPresContext.h:173
173         return mPresShell;
(gdb) bt
#0  0x0000007fba7fcc8c in DelayedFireDOMPaintEvent::DelayedFireDOMPaintEvent (
    aTimeStamp=..., aTransactionId=..., aList=<optimized out>, 
    aPresContext=<optimized out>, this=<optimized out>)
    at layout/base/nsPresContext.h:173
#1  nsPresContext::NotifyDidPaintForSubtree (this=0x7f8ce6dda0, 
    aTransactionId=..., aTimeStamp=...)
    at layout/base/nsPresContext.cpp:2082
#2  0x0000007fba7fd01c in nsRootPresContext::<lambda()>::operator() (
    __closure=0x7f8dc1af58)
    at obj-build-mer-qt-xr/dist/include/mozilla/TimeStamp.h:400
#3  mozilla::layers::GenericNamedTimerCallback<nsRootPresContext::
    EnsureEventualDidPaintEvent(nsPresContext::TransactionId)::<lambda()> >::
    Notify(nsITimer *) (this=0x7f8dc1af40) at obj-build-mer-qt-xr/dist/include/
    mozilla/layers/APZThreadUtils.h:81
#4  0x0000007fb84653e0 in nsTimerImpl::Fire (this=0x7f8c72bfd0, aGeneration=1)
    at obj-build-mer-qt-xr/dist/include/mozilla/TimeStamp.h:136
#5  0x0000007fb84654c0 in nsTimerEvent::Run (this=0x7ecc001ab0)
    at obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#6  0x0000007fb845e4c4 in nsThread::ProcessNextEvent (aResult=0x7fa7972d77, 
    aMayWait=<optimized out>, this=0x7f8c02f610)
    at xpcom/threads/nsThread.cpp:1211
#7  nsThread::ProcessNextEvent (this=0x7f8c02f610, aMayWait=<optimized out>, 
    aResult=0x7fa7972d77)
    at xpcom/threads/nsThread.cpp:1047
#8  0x0000007fb845cdcc in NS_ProcessPendingEvents (
    aThread=aThread@entry=0x7f8c02f610, aTimeout=10)
    at xpcom/threads/nsThreadUtils.cpp:449
#9  0x0000007fba63c430 in nsBaseAppShell::NativeEventCallback (
    this=0x7f8c3e6ac0)
    at widget/nsBaseAppShell.cpp:87
#10 0x0000007fba64f0bc in nsAppShell::event (this=<optimized out>, e=<optimized 
    out>)
    at widget/qt/nsAppShell.cpp:75
[...]
#23 0x0000007fbe70b89c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/
    clone.S:78
(gdb) 
Frustratingly it's called within a lambda function, which is messing up the backtrace yet again. So we'll have to resort to digging through the code again. There we can see that the NotifyDidPaintForSubtree() method gets called in a lambda function that's triggered inside itself. But this can't be the one we're after because we'd have seen the hit on the containing method first.

But there's another case where a lambda function gets configured to then call nsPresContext::NotifyDidPaintForSubtree() and that's in the following method from nsPresContext.cpp (this is the ESR 78 version):
void
nsRootPresContext::EnsureEventualDidPaintEvent(TransactionId aTransactionId)
{
  for (NotifyDidPaintTimer& t : mNotifyDidPaintTimers) {
    if (t.mTransactionId == aTransactionId) {
      return;
    }
  }

  nsCOMPtr<nsITimer> timer;
  RefPtr<nsRootPresContext> self = this;
  nsresult rv = NS_NewTimerWithCallback(
    getter_AddRefs(timer),
    NewNamedTimerCallback([self, aTransactionId](){
      nsAutoScriptBlocker blockScripts;
      self->NotifyDidPaintForSubtree(aTransactionId);
     }, &quot;NotifyDidPaintForSubtree&quot;), 100, nsITimer::TYPE_ONE_SHOT,
    Document()->EventTargetFor(TaskCategory::Other));

  if (NS_SUCCEEDED(rv)) {
    NotifyDidPaintTimer* t = mNotifyDidPaintTimers.AppendElement();
    t->mTransactionId = aTransactionId;
    t->mTimer = timer;
  }
}
That's the only other case I can see, so we could check to find out how this EnsureEventualDidPaintEvent() method gets called. However as I was in the process of configuring the debugger I noticed that this method doesn't exist in ESR 91. That's unexpected. In ESR 78 it's called from nsPresContext::NotifyInvalidation() like this:
void nsPresContext::NotifyInvalidation(TransactionId aTransactionId,
                                       const nsRect& aRect) {
  MOZ_ASSERT(GetContainerWeak(), &quot;Invalidation in detached pres 
    context&quot;);

  // If there is no paint event listener, then we don't need to fire
  // the asynchronous event. We don't even need to record invalidation.
  // MayHavePaintEventListener is pretty cheap and we could make it
  // even cheaper by providing a more efficient
  // nsPIDOMWindow::GetListenerManager.

  nsPresContext* pc;
  for (pc = this; pc; pc = pc->GetParentPresContext()) {
    TransactionInvalidations* transaction =
        pc->GetInvalidations(aTransactionId);
    if (transaction) {
      break;
    } else {
      transaction = pc->mTransactions.AppendElement();
      transaction->mTransactionId = aTransactionId;
    }
  }
  if (!pc) {
    nsRootPresContext* rpc = GetRootPresContext();
    if (rpc) {
      rpc->EnsureEventualDidPaintEvent(aTransactionId);
    }
  }

  TransactionInvalidations* transaction = GetInvalidations(aTransactionId);
  MOZ_ASSERT(transaction);
  transaction->mInvalidations.AppendElement(aRect);
}
But as we can see, the call to EnsureEventualDidPaintEvent() has been removed in the ESR 91 version of the same method:
void nsPresContext::NotifyInvalidation(TransactionId aTransactionId,
                                       const nsRect& aRect) {
  MOZ_ASSERT(GetContainerWeak(), &quot;Invalidation in detached pres 
    context&quot;);

  // If there is no paint event listener, then we don't need to fire
  // the asynchronous event. We don't even need to record invalidation.
  // MayHavePaintEventListener is pretty cheap and we could make it
  // even cheaper by providing a more efficient
  // nsPIDOMWindow::GetListenerManager.

  nsPresContext* pc;
  for (pc = this; pc; pc = pc->GetParentPresContext()) {
    TransactionInvalidations* transaction =
        pc->GetInvalidations(aTransactionId);
    if (transaction) {
      break;
    } else {
      transaction = pc->mTransactions.AppendElement();
      transaction->mTransactionId = aTransactionId;
    }
  }

  TransactionInvalidations* transaction = GetInvalidations(aTransactionId);
  MOZ_ASSERT(transaction);
  transaction->mInvalidations.AppendElement(aRect);
}
I wasn't expecting that. For completeness, the backtrace from ESR 78 confirms that this is indeed the correct method to be looking at, at least on the ESR 78 side:
Thread 7 &quot;GeckoWorkerThre&quot; hit Breakpoint 9, nsRootPresContext::
    EnsureEventualDidPaintEvent (this=0x7f8ce70330, 
    aTransactionId=aTransactionId@entry=...)
    at layout/base/nsPresContext.cpp:2813
2813    {
(gdb) bt
#0  nsRootPresContext::EnsureEventualDidPaintEvent (this=0x7f8ce70330, 
    aTransactionId=aTransactionId@entry=...)
    at layout/base/nsPresContext.cpp:2813
#1  0x0000007fba7fb480 in nsPresContext::NotifyInvalidation (
    this=this@entry=0x7f8ce70330, aTransactionId=..., aRect=...)
    at layout/base/nsPresContext.cpp:1964
#2  0x0000007fba7fb5d8 in nsPresContext::NotifyInvalidation (
    this=this@entry=0x7f8ce70330, aTransactionId=..., aRect=...)
    at layout/base/nsPresContext.cpp:1927
#3  0x0000007fbaa4dbe4 in nsDisplayList::PaintRoot (
    this=this@entry=0x7fa7971f48, aBuilder=aBuilder@entry=0x7fa7970170, 
    aCtx=aCtx@entry=0x0, 
    aFlags=aFlags@entry=13) at layout/painting/nsDisplayList.cpp:2526
#4  0x0000007fba7d656c in nsLayoutUtils::PaintFrame (
    aRenderingContext=aRenderingContext@entry=0x0, 
    aFrame=aFrame@entry=0x7f8dba5c40, aDirtyRegion=..., 
    aBackstop=aBackstop@entry=4294967295, 
    aBuilderMode=aBuilderMode@entry=nsDisplayListBuilderMode::Painting, 
    aFlags=<optimized out>, 
    aFlags@entry=(nsLayoutUtils::PaintFrameFlags::WidgetLayers | nsLayoutUtils::
    PaintFrameFlags::ExistingTransaction | nsLayoutUtils::PaintFrameFlags::
    NoComposite)) at layout/base/nsLayoutUtils.cpp:4168
#5  0x0000007fba79a258 in mozilla::PresShell::Paint (
    this=this@entry=0x7f8d2dd790, aViewToPaint=aViewToPaint@entry=0x7f8d210390, 
    aDirtyRegion=..., 
    aFlags=aFlags@entry=mozilla::PaintFlags::PaintLayers)
    at layout/base/PresShell.cpp:6254
#6  0x0000007fba6110ac in nsViewManager::ProcessPendingUpdatesPaint (
    this=this@entry=0x7f8d210df0, aWidget=aWidget@entry=0x7f8d210e60)
    at obj-build-mer-qt-xr/dist/include/nsTArray.h:554
#7  0x0000007fba6113c0 in nsViewManager::ProcessPendingUpdatesForView (
    this=this@entry=0x7f8d210df0, aView=<optimized out>, 
    aFlushDirtyRegion=aFlushDirtyRegion@entry=true) at view/nsViewManager.cpp:
    395
#8  0x0000007fba611b50 in nsViewManager::ProcessPendingUpdates (
    this=0x7f8d210df0)
    at view/nsViewManager.cpp:1018
#9  nsViewManager::ProcessPendingUpdates (this=<optimized out>)
    at view/nsViewManager.cpp:1004
#10 0x0000007fba611c0c in nsViewManager::WillPaintWindow (
    this=this@entry=0x7f8d210df0, aWidget=0x7f8d210e60)
    at view/nsViewManager.cpp:664
#11 0x0000007fba611c88 in nsView::WillPaintWindow (this=<optimized out>, 
    aWidget=<optimized out>)
    at view/nsView.cpp:1048
#12 0x0000007fbb2f6344 in mozilla::embedlite::PuppetWidgetBase::Invalidate (
    aRect=..., this=0x7f8d210e60)
    at mobile/sailfishos/embedshared/PuppetWidgetBase.cpp:268
#13 mozilla::embedlite::PuppetWidgetBase::Invalidate (this=0x7f8d210e60, 
    aRect=...)
    at mobile/sailfishos/embedshared/PuppetWidgetBase.cpp:253
#14 0x0000007fbb2ef768 in mozilla::embedlite::EmbedLitePuppetWidget::Show (
    this=0x7f8d210e60, aState=<optimized out>)
    at mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:97
#15 0x0000007fba7f03e8 in nsDocumentViewer::Show (this=0x7f8d15df90)
    at layout/base/nsDocumentViewer.cpp:2132
#16 0x0000007fba7f8b7c in nsPresContext::EnsureVisible (this=0x7f8ce70330)
    at obj-build-mer-qt-xr/dist/include/nsCOMPtr.h:857
#17 0x0000007fba79c114 in mozilla::PresShell::UnsuppressAndInvalidate (
    this=0x7f8d2dd790)
    at obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#18 0x0000007fba7a1638 in mozilla::PresShell::ProcessReflowCommands (
    this=this@entry=0x7f8d2dd790, aInterruptible=aInterruptible@entry=false)
    at layout/base/PresShell.cpp:9712
#19 0x0000007fba7a0630 in mozilla::PresShell::DoFlushPendingNotifications (
    this=this@entry=0x7f8d2dd790, aFlush=...)
    at layout/base/PresShell.cpp:4209
[...]
#33 0x0000007fba772840 in mozilla::VsyncRefreshDriverTimer::
    RefreshDriverVsyncObserver::NotifyVsync (this=0x7f8cc9de00, aVsync=...)
    at layout/base/nsRefreshDriver.cpp:644
[...]
#53 0x0000007fbe70b89c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/
    clone.S:78
(gdb)
Alright, let's rewind back to the top again. We've now reached the point where we know how the OnFirstPaint() method is getting called in ESR 78. And we can see why it's not happening the same way in ESR 91, viz. that the code to call it has been removed from the NotifyInvalidation() method. The next step is to find out why it's been removed and what should be triggering the OnFirstPaint() instead (assuming there is still something).

The best way to investigate this further will be to look a the Bugzilla ticket and associated changeset where this change was made upstream. We've searched for these kinds of changes many times before and now is no different. Here's the log for the change:
$ git log -S EnsureEventualDidPaintEvent -1 -- layout/base/nsPresContext.cpp
commit 1deccd7ac14706ad1849343cfb2b93df191a1c42
Author: Mantaroh Yoshinaga <mantaroh@gmail.com>
Date:   Thu Sep 6 02:21:39 2018 +0000

    Bug 1445570 - Remove EnsureEventualAfterPaint timer. r=tnikkel
    
    MozReview-Commit-ID: C7WICJ5Q0ES
    
    Differential Revision: https://phabricator.services.mozilla.com/D5005
    
    --HG--
    extra : moz-landing-system : lando
After reading through Bug 1445570 and the associated code changes my conclusion is that unravelling this is going to be too complicated for my brain in its current state, so best to let it sink in overnight. I'll pick this up again tomorrow where we'll try to find out what's replacing the trigger that we need in the updated ESR 91 code.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
21 May 2024 : Day 239 #
I got a bit carried away yesterday delving into the reasons why QuickMozView::updatePaintNode() was failing to fire on ESR 91. My conclusion was that it came down to the mIsPainted flag, part of the QMozViewPrivate class and which was remaining stubbornly set to false when running ESR 91.

While in this state the invalidTexture flag inside updatePaintNode() never gets set to false since it's defined like this:
    const bool invalidTexture = !mComposited
            || !d->mIsPainted
            || !d->mViewInitialized
            || !d->mHasCompositor
            || !d->mContext->registeredWindow()
            || !d->mMozWindow;
Notice how if any single value in this list is set to false then the combined value of invalidTexture will be set to true. The consequence of this is that the following clause is always executed:
    if (!mTexture && invalidTexture) {
        QSGSimpleRectNode *node = static_cast<QSGSimpleRectNode *>(oldNode);
        if (!node) {
            node = new QSGSimpleRectNode;
        }
        node->setColor(d->mBackgroundColor);
        node->setRect(boundingRect);

        return node;
    }
The return at the end of this means the updatePaintNode() method returns early. This means that the code beyond the clause never gets executed and crucially this section is never entered:
    if (!node) {
        QMozExtTexture * const texture = new QMozExtTexture;
        mTexture = texture;

        connect(texture, &QMozExtTexture::getPlatformImage, d->mMozWindow, 
            &QMozWindow::getPlatformImage, Qt::DirectConnection);

        node = new MozExtMaterialNode;

        node->setTexture(mTexture);
    }
Here we can see both mTexture and the MozExtMaterialNode object being created. That's what we need for the rendering to proceed. So if this portion of code is never executed, we'll never get the rendered page to appear.

That all sounds very plausible and comprehensible, but it begs the question: "why is mIsPainted never being set to true?". That's what I plan to find out over the coming days.

Unsurprisingly we can see the initial value it's set to in the QMozViewPrivate constructor, where it's initially set to false:
QMozViewPrivate::QMozViewPrivate(IMozQViewIface *aViewIface, QObject *publicPtr)
    : mViewIface(aViewIface)
[...]
    , mIsPainted(false)
[...]
It's also set to false after a call to QMozViewPrivate::reset():
QMozViewPrivate::reset()
{
    if (mIsPainted) {
        mIsPainted = false;
        mViewIface->firstPaint(-1, -1);
    }
[...]
But the important part is where it gets set to true and that only happens in one place; here in the OnFirstPaint() method:
void QMozViewPrivate::OnFirstPaint(int32_t aX, int32_t aY)
{
    mIsPainted = true;
    mViewIface->firstPaint(aX, aY);
}
This method doesn't get explicitly called anywhere in the QtMozEmbed code, so to find out where it's actually getting used I'm going to drop to the debugger on my ESR 78 device.
(gdb) break QMozViewPrivate::OnFirstPaint
Breakpoint 13 at 0x7fbfbf56b8: file qmozview_p.cpp, line 1113.
(gdb) r
[...]
Thread 1 &quot;harbour-webview&quot; hit Breakpoint 13, QMozViewPrivate::
    OnFirstPaint (this=0x55557b08f0, aX=0, aY=0) at qmozview_p.cpp:1113
1113        mIsPainted = true;
(gdb) bt                         
#0  QMozViewPrivate::OnFirstPaint (this=0x55557b08f0, aX=0, aY=0) at 
    qmozview_p.cpp:1113
#1  0x0000007fbb2f5d90 in ?? () from /usr/lib64/xulrunner-qt5-78.15.1/libxul.so
#2  0x0000007fb8932f50 in ?? () from /usr/lib64/xulrunner-qt5-78.15.1/libxul.so
#3  0x0000007fb8920238 in ?? () from /usr/lib64/xulrunner-qt5-78.15.1/libxul.so
#4  0x0000007fb88428cc in ?? () from /usr/lib64/xulrunner-qt5-78.15.1/libxul.so
#5  0x0000007fb88482d8 in ?? () from /usr/lib64/xulrunner-qt5-78.15.1/libxul.so
#6  0x0000007fb8849d58 in ?? () from /usr/lib64/xulrunner-qt5-78.15.1/libxul.so
#7  0x0000007fb8809d48 in ?? () from /usr/lib64/xulrunner-qt5-78.15.1/libxul.so
#8  0x0000007fb880e740 in ?? () from /usr/lib64/xulrunner-qt5-78.15.1/libxul.so
#9  0x0000007fbfbed924 in MessagePumpQt::HandleDispatch (this=0x555578ef30) at 
    qmessagepump.cpp:63
#10 0x0000007fbfbeda9c in MessagePumpQt::event (this=<optimized out>, 
    e=<optimized out>) at qmessagepump.cpp:51
#11 0x0000007fbebe1144 in QCoreApplication::notify(QObject*, QEvent*) () from /
    usr/lib64/libQt5Core.so.5
#12 0x0000007fbebe12e8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) (
    ) from /usr/lib64/libQt5Core.so.5
#13 0x0000007fbebe36b8 in QCoreApplicationPrivate::sendPostedEvents(QObject*, 
    int, QThreadData*) () from /usr/lib64/libQt5Core.so.5
[...]
#21 0x0000005555557bcc in main ()
(gdb) 
Well that's an unhelpful mess. But it is at least getting called from inside the Gecko code and that in itself is both helpful to know and a good thing (I was concerned it might be called from inside Qt which would have made it considerably harder to track down; the fact it's coming from Gecko means we can skip a layer of Qt indirection and get closer to the source of the problem more quickly).

But as you can see the Gecko debug symbols are messed up: we can see eight calls to methods inside libxul in a row, but it's not telling us the names of the methods.

So I'm going to reinstall the packages, including the debug packages, to get the symbols back. Before the gap I was using a custom build with extra debug output and exporting of textures, but I don't need any of that any more so I can just reinstall the packages from the official repositories. Much easier and quicker than rebuilding the packages locally and installing them that way:
# zypper install --force xulrunner-qt5-78.15.1+git33.2-1.21.1.jolla 
    xulrunner-qt5-debuginfo-78.15.1+git33.2-1.21.1.jolla xul
runner-qt5-debugsource-78.15.1+git33.2-1.21.1.jolla 
    xulrunner-qt5-misc-78.15.1+git33.2-1.21.1.jolla
Loading repository data...
Reading installed packages...
Forcing installation of 'xulrunner-qt5-78.15.1+git33.2-1.21.1.jolla.aarch64' 
    from repository 'jolla'.
Forcing installation of 
    'xulrunner-qt5-debuginfo-78.15.1+git33.2-1.21.1.jolla.aarch64' from 
    repository 'jolla'.
Forcing installation of 
    'xulrunner-qt5-debugsource-78.15.1+git33.2-1.21.1.jolla.aarch64' from 
    repository 'jolla'.
Forcing installation of 
    'xulrunner-qt5-misc-78.15.1+git33.2-1.21.1.jolla.aarch64' from repository 
    'jolla'.
[...]
2 packages to downgrade, 2 to reinstall, 2  to change vendor.
Overall download size: 758.8 MiB. Already cached: 0 B. After the operation, 2.4 
    GiB will be freed.
Continue? [y/n/v/...? shows all options] (y): y
[...]
Okay, with the debug symbols now properly aligned we can try getting the backtrace again, but hopefully this time with a bit more helpful detail.
Thread 1 &quot;harbour-webview&quot; hit Breakpoint 1, QMozViewPrivate::
    OnFirstPaint (this=0x55555cb830, aX=0, aY=0) at qmozview_p.cpp:1113
1113        mIsPainted = true;
(gdb) bt
#0  QMozViewPrivate::OnFirstPaint (this=0x55555cb830, aX=0, aY=0) at 
    qmozview_p.cpp:1113
#1  0x0000007fbb2f5d90 in mozilla::embedlite::EmbedLiteViewParent::
    RecvOnFirstPaint (this=0x55555d9d70, aX=@0x7fffffea18: 0, aY=@0x7fffffea28: 
    0)
    at mobile/sailfishos/embedshared/EmbedLiteViewParent.cpp:237
#2  0x0000007fb8932f50 in mozilla::embedlite::PEmbedLiteViewParent::
    OnMessageReceived (this=0x55555d9d70, msg__=...) at 
    PEmbedLiteViewParent.cpp:1600
#3  0x0000007fb8920238 in mozilla::embedlite::PEmbedLiteAppParent::
    OnMessageReceived (this=<optimized out>, msg__=...)
    at obj-build-mer-qt-xr/dist/include/mozilla/ipc/ProtocolUtils.h:866
#4  0x0000007fb88428cc in mozilla::ipc::MessageChannel::DispatchAsyncMessage (
    this=this@entry=0x7f8c888578, aProxy=aProxy@entry=0x555560d1d0, aMsg=...)
    at obj-build-mer-qt-xr/dist/include/mozilla/ipc/ProtocolUtils.h:866
#5  0x0000007fb88482d8 in mozilla::ipc::MessageChannel::DispatchMessage (
    this=0x7f8c888578, aMsg=...)
    at ipc/glue/MessageChannel.cpp:2100
#6  0x0000007fb8849b50 in mozilla::ipc::MessageChannel::RunMessage (
    this=<optimized out>, aTask=...)
    at ipc/glue/MessageChannel.cpp:1959
#7  0x0000007fb8849d58 in mozilla::ipc::MessageChannel::MessageTask::Run (
    this=0x7f8e127650)
    at obj-build-mer-qt-xr/dist/include/mozilla/ipc/MessageChannel.h:610
#8  0x0000007fb8809d48 in MessageLoop::RunTask (aTask=..., this=0x55557c5200)
    at ipc/chromium/src/base/message_loop.cc:487
#9  MessageLoop::DeferOrRunPendingTask (pending_task=..., this=0x55557c5200)
    at ipc/chromium/src/base/message_loop.cc:478
#10 MessageLoop::DeferOrRunPendingTask (this=0x55557c5200, pending_task=...)
    at ipc/chromium/src/base/message_loop.cc:476
#11 0x0000007fb880e740 in MessageLoop::DoWork (this=<optimized out>)
    at ipc/chromium/src/base/message_loop.cc:551
#12 MessageLoop::DoWork (this=0x55557c5200) at ipc/chromium/src/base/
    message_loop.cc:530
#13 0x0000007fbfbed924 in MessagePumpQt::HandleDispatch (this=0x55557a6f40) at 
    qmessagepump.cpp:63
#14 0x0000007fbfbeda9c in MessagePumpQt::event (this=<optimized out>, 
    e=<optimized out>) at qmessagepump.cpp:51
#15 0x0000007fbebe1144 in QCoreApplication::notify(QObject*, QEvent*) () from /
    usr/lib64/libQt5Core.so.5
#16 0x0000007fbebe12e8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) (
    ) from /usr/lib64/libQt5Core.so.5
#17 0x0000007fbebe36b8 in QCoreApplicationPrivate::sendPostedEvents(QObject*, 
    int, QThreadData*) () from /usr/lib64/libQt5Core.so.5
[...]
#25 0x0000005555557bcc in main ()
(gdb) 
That's much better and gives us a much clearer idea of where OnFirstPaint() is getting called. But as we can see from this it's actually being triggered by receipt of a message. The next step will therefore be to find out where the OnFirstPaint message is being sent from.

For completeness I also performed the same checks on ESR 91. You'll not be surprised to hear that the QMozViewPrivate::OnFirstPaint() method is never called in the newer build.

The ESR 78 backtrace tells us what should be happening on ESR 91. So our task now is to find out why it's not. This takes us a step forwards and gives us something to look into further; we'll pick this avenue of investigation up again tomorrow.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
20 May 2024 : Day 238 #
It's been an exciting and packed day of Jolla and Sailfish OS news today with the announcement of not one but two new Jolla devices, plus some significant changes to the way Sailfish OS will be developed and funded in the future. This isn't the place to dwell on these topics, but I am excited for the future of the operating system. I didn't want it to go without saying, but I do want to focus on Gecko here, so let's head straight in to development.

Yesterday I made what could be an important discovery. The QMozExtTexture::updateTexture() method is being called on ESR 78, but not on ESR 91. If I can track down the underlying reason for this, then it may well help getting the offscreen rendering back up and running again.

The next step is to find out at which point the failure is happening and try to understand why. Before we get into it I want to warn you that this is rather a long post, mostly containing debugging output which is dull at the best of times. But it does lead to some useful conclusions, so if you're interest to follow but don't want to have to read through all of the details, don't feel bad if you want to skip straight to the end!

If you're still reading this then I'll assume you're in it for the long haul. Don't say I didn't warn you though! So the best place to start this investigation is with the backtrace of the call. We saw this yesterday and it looked like this:
#0  QMozExtTexture::updateTexture (this=0x7f90024e30) at qmozexttexture.cpp:64
#1  0x0000007fbfc04f18 in MozMaterialNode::preprocess (this=0x7f9001ef60) at 
    qmozextmaterialnode.cpp:81
#2  0x0000007fbf8f0c44 in QSGRenderer::preprocess() () from /usr/lib64/
    libQt5Quick.so.5
#3  0x0000007fbf8f0e5c in QSGRenderer::renderScene(QSGBindable const&) () from /
    usr/lib64/libQt5Quick.so.5
#4  0x0000007fbf8f14c0 in QSGRenderer::renderScene(unsigned int) () from /usr/
    lib64/libQt5Quick.so.5
#5  0x0000007fbf8ffe58 in QSGRenderContext::renderNextFrame(QSGRenderer*, 
    unsigned int) () from /usr/lib64/libQt5Quick.so.5
#6  0x0000007fbf9429d0 in QQuickWindowPrivate::renderSceneGraph(QSize const&) (
    ) from /usr/lib64/libQt5Quick.so.5
#7  0x0000007fbf916c04 in ?? () from /usr/lib64/libQt5Quick.so.5
#8  0x0000007fbf91cc10 in ?? () from /usr/lib64/libQt5Quick.so.5
#9  0x0000007fbea290e8 in ?? () from /usr/lib64/libQt5Core.so.5
#10 0x0000007fb74b0a4c in start_thread (arg=0x7fffffe9bf) at pthread_create.c:
    479
#11 0x0000007fbe70b89c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/
    clone.S:78
This backtrace is, of course, from the ESR 78 build. There is no equivalent backtrace for the ESR 91 build because the QMozExtTexture::updateTexture() method never gets called in ESR 91. But maybe some of the other methods in the backtrace are being called. Let's find out.

To establish this I'm putting breakpoints on each of the functions in the backtrace and then running the ESR 91 build. If one of them hits I move to one higher up the stack (that is, the next line up that has a smaller number). Eventually one will fail to hit and at that point I'll know where the failure is happening. So here goes.
Thread 68 &quot;QSGRenderThread&quot; hit Breakpoint 4, 0x0000007ff7b6e984 in 
    QQuickWindowPrivate::renderSceneGraph(QSize const&)@plt ()
   from /usr/lib64/libQt5Quick.so.5
(gdb) delete break 4
(gdb) b QSGRenderContext::renderNextFrame
Breakpoint 5 at 0x7fe82e4474 (2 locations)
(gdb) c
Continuing.

Thread 68 &quot;QSGRenderThread&quot; hit Breakpoint 5, 0x0000007fe82e4474 in 
    QSGRenderContext::renderNextFrame(QSGRenderer*, unsigned int)@plt ()
   from /usr/lib64/qt5/plugins/scenegraph/libcustomcontext.so
(gdb) delete break 5
(gdb) b QSGRenderer::preprocess
Breakpoint 6 at 0x7ff7be7a50
(gdb) c
Continuing.

Thread 68 &quot;QSGRenderThread&quot; hit Breakpoint 6, 0x0000007ff7be7a50 in 
    QSGRenderer::preprocess() () from /usr/lib64/libQt5Quick.so.5
(gdb) delete break 6
(gdb) b MozMaterialNode::preprocess
Breakpoint 7 at 0x7ff7ef899c
(gdb) c
Continuing.
frame008.data: Colour before: (0, 0, 0, 255), 1
frame009.data: Colour before: (0, 0, 0, 255), 1
frame010.data: Colour before: (0, 0, 0, 255), 1
frame011.data: Colour before: (0, 0, 0, 255), 1
[...]
As we can see from the above, the QQuickWindowPrivate::renderSceneGraph() method does get called. So does the QSGRenderContext::renderNextFrame() method. but the QSGRenderer::preprocess() method? That never gets called.

That means the failure is happening inside the QSGRenderContext::renderNextFrame() method. This call isn't part of the Gecko code, or indeed part of the QtMozEmbed code, but rather part of the Qt stack. In particular, part of the "Qt Scene Graph" stack (hence the "QSG" prefix of the method).

I'm not super familiar with how the Qt Scene Graph does it's thing, but there is plenty of good documentation out there. So I've done some reading around to get a better idea.

One part sprang out at me, since it specifically relates to the method that isn't getting called. The explanation that follows is taken from the Qt documentation:
 
void QSGNode::preprocess()

Override this function to do processing on the node before it is rendered.

Preprocessing needs to be explicitly enabled by setting the flag QSGNode::UsePreprocess. The flag needs to be set before the node is added to the scene graph and will cause the preprocess() function to be called for every frame the node is rendered.

So maybe the problem is that the QSGNode::UsePreprocess flag is never being set? A quick scan of the QtMozEmbed code throws up the following bit of code for controlling the flag, in the qmozextmaterialnode.cpp file:
MozMaterialNode::MozMaterialNode()
{
    setFlag(UsePreprocess);

    setGeometry(&m_geometry);
}
So, we should try to establish whether this is getting called or not. Running with some breakpoints shows that it is indeed getting called on the ESR 78 build.
(gdb) break MozMaterialNode::MozMaterialNode
Breakpoint 2 at 0x7fbfbdadc4 (2 locations)
(gdb) r
[...]
Thread 11 &quot;QSGRenderThread&quot; hit Breakpoint 2, MozMaterialNode::
    MozMaterialNode (this=this@entry=0x7f0001ef60) at qmozextmaterialnode.cpp:27
27      MozMaterialNode::MozMaterialNode()
(gdb) n
619     /usr/include/qt5/QtCore/qrect.h: No such file or directory.
(gdb) 
27      MozMaterialNode::MozMaterialNode()
(gdb) 
31          setGeometry(&m_geometry);
(gdb) 
That's no big surprise, since if it were not being called the other parts of the render pipeline would also be failing and they're not. How about on ESR 91 though? We know the render pipeline isn't working there and that could be because this call isn't happening. Let's find out by adding the same breakpoint on the ESR 91 build and executing it just as we did for ESR 78:
(gdb) break MozMaterialNode::MozMaterialNode
Breakpoint 8 at 0x7ff7ece9e4 (2 locations)
(gdb) r
[...]
frame014.data: Colour before: (8, 0, 152, 255), 1
frame015.data: Colour before: (1, 0, 255, 255), 1
frame016.data: Colour before: (223, 5, 0, 255), 1
frame017.data: Colour before: (71, 94, 162, 255), 1
^C
Thread 1 &quot;harbour-webview&quot; received signal SIGINT, Interrupt.
0x0000007ff69f8740 in poll () from /lib64/libc.so.6
(gdb) info break
Num     Type           Disp Enb Address            What
8       breakpoint     keep y   <MULTIPLE>         
8.1                         y     0x0000007ff7ece9e4 <MozMaterialNode::
    MozMaterialNode()@plt+4>
8.2                         y     0x0000007ff7ef9028 <MozMaterialNode::
    MozMaterialNode()+16>
(gdb) 
Here the breakpoint doesn't get called. You can see that I double-checked that the breakpoint is set correctly and it is. So this is looking very much like a smoking gun.

We'll use the same approach we used before to try to figure out why this isn't getting called. We take the backtrace of the ESR 78 build where it is getting called and work our way up through it until we find out what's failing on the ESR 91 side. Here's the backtrace, taken from ESR 78:
#0  MozMaterialNode::MozMaterialNode (this=this@entry=0x7f0001ef60) at 
    qmozextmaterialnode.cpp:31
#1  0x0000007fbfc05834 in MozExtMaterialNode::MozExtMaterialNode (
    this=0x7f0001ef60) at qmozextmaterialnode.cpp:376
#2  0x0000007fbfc03ea4 in QuickMozView::updatePaintNode (this=0x55558433a0, 
    oldNode=<optimized out>)
    at /usr/include/xulrunner-qt5-78.15.1/mozilla/cxxalloc.h:33
#3  0x0000007fbf941c50 in QQuickWindowPrivate::updateDirtyNode(QQuickItem*) () 
    from /usr/lib64/libQt5Quick.so.5
#4  0x0000007fbf94214c in QQuickWindowPrivate::updateDirtyNodes() () from /usr/
    lib64/libQt5Quick.so.5
#5  0x0000007fbf943270 in QQuickWindowPrivate::syncSceneGraph() () from /usr/
    lib64/libQt5Quick.so.5
#6  0x0000007fbf9164a8 in ?? () from /usr/lib64/libQt5Quick.so.5
#7  0x0000007fbf917134 in ?? () from /usr/lib64/libQt5Quick.so.5
#8  0x0000007fbf91cc10 in ?? () from /usr/lib64/libQt5Quick.so.5
#9  0x0000007fbea290e8 in ?? () from /usr/lib64/libQt5Core.so.5
#10 0x0000007fb74b0a4c in start_thread (arg=0x7fffffe9bf) at pthread_create.c:
    479
#11 0x0000007fbe70b89c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/
    clone.S:78
It's important to note that the call at the top to MozMaterialNode::MozMaterialNode() — which is the one we're interested in — isn't a normal call, it's a constructor. It's not being called directly, but rather as a side-effect of a call to the MozExtMaterialNode class constructor. This inherits from the MozMaterialNode class as we can see in this code copied from qozextmaterialnode.cpp:
class MozExtMaterialNode : public MozMaterialNode
{
public:
    MozExtMaterialNode();
    ~MozExtMaterialNode();
[...]
As an aside, there's also a MozRgbMaterialNode class, but it looks like we're not using that:
class MozRgbMaterialNode : public MozMaterialNode
{
public:
    MozRgbMaterialNode();
    ~MozRgbMaterialNode();
[...]
A quick check confirms that it's definitely only the MozExtMaterialNode class in use for us here:
(gdb) break MozRgbMaterialNode
Breakpoint 3 at 0x7fbfc056a0: file qmozextmaterialnode.cpp, line 176.
(gdb) b MozExtMaterialNode
Breakpoint 4 at 0x7fbfbdc174 (2 locations)
(gdb) r
[...]
Thread 10 &quot;QSGRenderThread&quot; hit Breakpoint 4, 0x0000007fbfbdc174 in 
    MozExtMaterialNode::MozExtMaterialNode()@plt () from /usr/lib64/
    libqt5embedwidget.so.1
(gdb)                            
Another check of the backtrace and an examination of the code shows that the place where this is constructed is inside the QuickMozView::updatePaintNode() method; part of QtMozEmbed. We're going to be spending a bit of time inside this code, so it's worth taking a look for yourself at the source. But for reference, here's the full source for the method we're dealing with:
QSGNode *
QuickMozView::updatePaintNode(QSGNode *oldNode, UpdatePaintNodeData *)
{
    // If the dimensions are entirely invalid return no node.
    if (width() <= 0 || height() <= 0) {
        delete oldNode;

        delete mTexture;
        mTexture = nullptr;

        return nullptr;
    }

    const bool invalidTexture = !mComposited
            || !d->mIsPainted
            || !d->mViewInitialized
            || !d->mHasCompositor
            || !d->mContext->registeredWindow()
            || !d->mMozWindow;

    if (mTexture && invalidTexture) {
        delete oldNode;
        oldNode = nullptr;

        delete mTexture;
        mTexture = nullptr;
    }

    QRectF boundingRect(d->renderingOffset(), d->mSize);

    if (!mTexture && invalidTexture) {
        QSGSimpleRectNode *node = static_cast<QSGSimpleRectNode *>(oldNode);
        if (!node) {
            node = new QSGSimpleRectNode;
        }
        node->setColor(d->mBackgroundColor);
        node->setRect(boundingRect);

        return node;
    }

    if (!mTexture) {
        delete oldNode;
        oldNode = nullptr;
    }

    MozMaterialNode *node = static_cast<MozMaterialNode *>(oldNode);

    if (!node) {
#if defined(QT_OPENGL_ES_2)
        QMozExtTexture * const texture = new QMozExtTexture;
        mTexture = texture;

        connect(texture, &QMozExtTexture::getPlatformImage, d->mMozWindow, 
    &QMozWindow::getPlatformImage, Qt::DirectConnection);

        node = new MozExtMaterialNode;
#else
#warning &quot;Implement me for non ES2 platform&quot;
//        node = new MozRgbMaterialNode;
        return nullptr;
#endif
Note the call there to new MozExtMaterialNode. That's the line we're really interested in. On ESR 78 when we step through the code we initially get an execution flow that enters the (!mTexture && invalidTexture) clause. Once we're in there we know the method is going to return early and the MozExtMaterialNode constructor isn't going to get called until we re-enter the method.

After a few cycles through like this, there's suddenly a change. While the mTexture value remains unset, the invalidTexture value flips to false which means the clause gets skipped. We then end up inside the (!node) section, which is where things get interesting. At that point the texture is created, as is the MozExtMaterialNode object and now things are properly set up for rendering.

Let's break this down a bit further. Here's what the initial few cycles look like:
(gdb) break QuickMozView::updatePaintNode
Breakpoint 8 at 0x7faf06f764 (2 locations)
(gdb) r
[...]
Thread 10 &quot;QSGRenderThread&quot; hit Breakpoint 8, QuickMozView::
    updatePaintNode (this=0x55558454e0, oldNode=0x0) at quickmozview.cpp:172
172         if (width() <= 0 || height() <= 0) {
(gdb) delete break
Delete all breakpoints? (y or n) y
(gdb) break quickmozview.cpp:214
Breakpoint 9 at 0x7fbfc03fac: file quickmozview.cpp, line 214.
(gdb) c
Continuing.
[...]
Thread 10 &quot;QSGRenderThread&quot; hit Breakpoint 9, QuickMozView::
    updatePaintNode (this=0x55558454e0, oldNode=0x7f0801eab0) at 
    quickmozview.cpp:216
216         if (!node) {
(gdb) 
Stepping through this section more carefully we can see why the invalidTexture flag is set to true in all these cases:
Thread 10 &quot;QSGRenderThread&quot; hit Breakpoint 12, QuickMozView::
    updatePaintNode (this=0x5555842660, oldNode=0x7f0800c400) at 
    quickmozview.cpp:172
172         if (width() <= 0 || height() <= 0) {
(gdb) p mComposited
$23 = true
(gdb) p d->mIsPainted
$24 = false
(gdb) p d->mViewInitialized
$25 = true
(gdb) p d->mHasCompositor
$26 = true
(gdb) p d->mContext->registeredWindow()
$27 = (QMozWindow *) 0x555560ce60
(gdb) p d->mMozWindow
$28 = {wp = {d = 0x5555bbe8e0, value = 0x555560ce60}}
(gdb) 
Crucially here, we can see that p d->mIsPainted is set to false. While any of the items shown here are set to false the invalidTexture flag will be set to true. But then after three or four cycles like this the d->mIsPainted flag suddenly flips to true at which point everything else flows through as we want.
Thread 10 &quot;QSGRenderThread&quot; hit Breakpoint 12, QuickMozView::
    updatePaintNode (this=0x5555842660, oldNode=0x7f0800c400) at 
    quickmozview.cpp:172
172         if (width() <= 0 || height() <= 0) {
(gdb) p d->mIsPainted
$37 = true
(gdb) n
181         const bool invalidTexture = !mComposited
(gdb) 
186                 || !d->mMozWindow;
(gdb) 
196         QRectF boundingRect(d->renderingOffset(), d->mSize);
(gdb) 
198         if (!mTexture && invalidTexture) {
(gdb) 
210             delete oldNode;
(gdb) 
218             QMozExtTexture * const texture = new QMozExtTexture;
(gdb) 
219             mTexture = texture;
(gdb) 
221             connect(texture, &QMozExtTexture::getPlatformImage, 
    d->mMozWindow, &QMozWindow::getPlatformImage, Qt::DirectConnection);
(gdb) p mTexture
$38 = (QSGTexture *) 0x7f08024070
(gdb) n
223             node = new MozExtMaterialNode;
(gdb) 
230             node->setTexture(mTexture);
(gdb) 
Nice! But this never happens on ESR 91. In the case of ESR 91 the d->mIsPainted flag remains set to false throughout. Consequently we never get to the point where the texture or MozExtMaterialNode object are created on ESR 91:
(gdb) break QuickMozView::updatePaintNode
Breakpoint 9 at 0x7fe790b764 (2 locations)
(gdb) r
[...]
Thread 9 &quot;QSGRenderThread&quot; hit Breakpoint 1, QuickMozView::
    updatePaintNode (this=0x555586cf70, oldNode=0x0) at quickmozview.cpp:172
172         if (width() <= 0 || height() <= 0) {

(gdb) break quickmozview.cpp:223
Breakpoint 11 at 0x7fbfc03ea4: file quickmozview.cpp, line 230.
(gdb) r
[...]
We end up just getting this same sequence again and again and again:
Thread 9 &quot;QSGRenderThread&quot; hit Breakpoint 7, QuickMozView::
    updatePaintNode (this=0x555586d0d0, oldNode=0x7fc800c640) at 
    quickmozview.cpp:172
172         if (width() <= 0 || height() <= 0) {
(gdb) n
181         const bool invalidTexture = !mComposited
(gdb) p mComposited
$7 = true
(gdb) p d->mIsPainted
$8 = false
(gdb) n
188         if (mTexture && invalidTexture) {
(gdb) p mTexture
$9 = (QSGTexture *) 0x0
(gdb) n
196         QRectF boundingRect(d->renderingOffset(), d->mSize);
(gdb) 
198         if (!mTexture && invalidTexture) {
(gdb) 
200             if (!node) {
(gdb) 
203             node->setColor(d->mBackgroundColor);
(gdb) 
204             node->setRect(boundingRect);
(gdb) 
206             return node;
(gdb) c
Clearly the question we have to answer is "why is d->mIsPainted never set to true?". If we can answer this, who knows, we might even be on the right path to solving the rendering problem.

This has been a rather long investigation today already, so I'm going to leave it there for now. But tomorrow I'm going to come back to this in order to try to answer this question. This feels like it could be a very fruitful line of enquiry!

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment