flypig.co.uk

List items

Items from the current list are shown below.

Gecko

8 Dec 2023 : Day 101 #
It's bright and early, the build completed successfully, it's time to continue with the PDF printing work I started yesterday. First order of the day is to check whether the changes I made yesterday have had the desired effect.

You'll recall I'd restored the DownloadPDFSaver execution path, code which I eventually plan to move into EmbedliteDownloadManager; and that I'd then removed the [noscript] annotations from the print() method of the nsIWebBrowserPrint interface.

Those annotations were preventing the print() method from being called. With any luck making that small change will restore the functionality (at least, up to the next point at which it's broken). Let's see.
$ EMBED_CONSOLE=1 sailfish-browser
[D] unknown:0 - Using Wayland-EGL
library "libGLESv2_adreno.so" not found
library "eglSubDriverAndroid.so" not found
greHome from GRE_HOME:/usr/bin
[...]
Segmentation fault (core dumped)
The error has gone, but it's been replaced by something more dramatic. The PDF print option now generates no output except the consequences of a segfault. That's okay though, because we can use the debugger to find out where the segfault is occurring. And this isn't totally unexpected: as I mentioned previously, the changes I made to get the print code to build were pretty barbaric and I always expected to have to revisit them.
$ EMBED_CONSOLE=1 gdb sailfish-browser
[...]
Thread 8 "GeckoWorkerThre" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 23729]
nsPrintJob::FindFocusedDocument (this=this@entry=0x7f8a637e20,
    aDoc=aDoc@entry=0x7f890638e0)
    at /usr/src/debug/xulrunner-qt5-91.9.1-1.aarch64/layout/printing/
    nsPrintJob.cpp:2411
2411      nsPIDOMWindowOuter* window = aDoc->GetOriginalDocument()->GetWindow();
(gdb) bt
#0  nsPrintJob::FindFocusedDocument (this=this@entry=0x7f8a637e20,
    aDoc=aDoc@entry=0x7f890638e0)
    at /usr/src/debug/xulrunner-qt5-91.9.1-1.aarch64/layout/printing/
    nsPrintJob.cpp:2411
#1  0x0000007fbc3bac6c in nsPrintJob::DoCommonPrint
    (this=this@entry=0x7f8a637e20, aIsPrintPreview=aIsPrintPreview@entry=false, 
    aPrintSettings=aPrintSettings@entry=0x7f89bc2ff0,
    aWebProgressListener=aWebProgressListener@entry=0x7f8a9f68a0,
    aDoc=aDoc@entry=0x7f890638e0)
    at /usr/src/debug/xulrunner-qt5-91.9.1-1.aarch64/layout/printing/
    nsPrintJob.cpp:548
#2  0x0000007fbc3bb718 in nsPrintJob::CommonPrint (this=this@entry=0x7f8a637e20,
    aIsPrintPreview=aIsPrintPreview@entry=false, 
    aPrintSettings=aPrintSettings@entry=0x7f89bc2ff0,
    aWebProgressListener=aWebProgressListener@entry=0x7f8a9f68a0, 
    aSourceDoc=aSourceDoc@entry=0x7f890638e0) at /usr/src/debug/
    xulrunner-qt5-91.9.1-1.aarch64/layout/printing/nsPrintJob.cpp:488
#3  0x0000007fbc3bb840 in nsPrintJob::Print (this=this@entry=0x7f8a637e20,
    aSourceDoc=<optimized out>, aPrintSettings=aPrintSettings@entry=0x7f89bc2ff0, 
    aWebProgressListener=aWebProgressListener@entry=0x7f8a9f68a0) at /usr/src/
    debug/xulrunner-qt5-91.9.1-1.aarch64/layout/printing/nsPrintJob.cpp:824
#4  0x0000007fbc108fe4 in nsDocumentViewer::Print (this=0x7f89066fe0,
    aPrintSettings=0x7f89bc2ff0, aWebProgressListener=0x7f8a9f68a0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsCOMPtr.h:859
[...]
#35 0x0000007fbd16635c in js::jit::MaybeEnterJit (cx=0x7f881df990, state=...)
    at /usr/src/debug/xulrunner-qt5-91.9.1-1.aarch64/js/src/jit/Jit.cpp:207
#36 0x0000007f8824c3d1 in ?? ()
Backtrace stopped: Cannot access memory at address 0x764fdb5afe82
(gdb) 
That's quite a backtrace. The error is certainly happening inside the print stack with the top of the callstack in nsPrintJob::FindFocusedDocument(). Maybe there's some difficulty figuring out which document is to be printed?

While doing this I notice that the remorse timer now opens and a file is output to disk. When I saw this I admit it got me pretty excited; it hinted at the slight possibility the error was happening after the print completed. That would have made things easier. But in fact it's just outputting empty files right now:
$ ls -l ~/Downloads/
total 60
-rw-------    1 defaultu defaultu         0 Dec  5 22:17 Jolla(2).pdf
-rw-------    1 defaultu defaultu         0 Dec  5 22:17 Jolla(3).pdf
-rw-------    1 defaultu defaultu         0 Dec  6 08:23 Jolla(4).pdf
-rw-------    1 defaultu defaultu         0 Dec  6 08:27 Jolla(5).pdf
-rw-------    1 defaultu defaultu         0 Dec  5 22:16 Jolla.pdf
The generation of these files actually happens in the JavaScript code, before any printing has been attempted, as you can see here from our DownloadPDFSaver class:
    // An empty target file must exist for the PDF printer to work correctly.
    let file = await OS.File.open(targetPath, { truncate: true });
    await file.close();
So we can't really divine much from the existence of these files after all unfortunately. It demonstrates the print code is getting called, but we knew that already from the debugger output.

So looking at the code in nsPrintJob.cpp the line that's causing the segfault is the following:
  nsPIDOMWindowOuter* window = aDoc->GetOriginalDocument()->GetWindow();
It looks like the problem here isn't really to do with printing, but rather to do with getting the Window info. We can see from the backtrace that aDoc isn't null, so presumably aDoc->GetOriginalDocument() is returning null. Here's what that method, defined in Document.h, does:
  /**
   * If this document is a static clone, this returns the original
   * document.
   */
  Document* GetOriginalDocument() const {
    MOZ_ASSERT(!mOriginalDocument || !mOriginalDocument->GetOriginalDocument());
    return mOriginalDocument;
  }
Let's check this with the debugger then.
Thread 8 "GeckoWorkerThre" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 16121]
nsPrintJob::FindFocusedDocument (this=this@entry=0x7f88ec2740, aDoc=aDoc@entry=0x7f89154290)
    at /usr/src/debug/xulrunner-qt5-91.9.1-1.aarch64/layout/printing/nsPrintJob.cpp:2411
2411      nsPIDOMWindowOuter* window = aDoc->GetOriginalDocument()->GetWindow();
(gdb) p aDoc
$1 = (nsPrintJob::Document *) 0x7f89154290
(gdb) p aDoc->mIsStaticDocument
$3 = false
(gdb) p aDoc->mOriginalDocument
$2 = {mRawPtr = 0x0}
Definitely null. In theory this will only be non-null if the document is a static clone, flagged by mIsStaticDocument as you can see in the comment here:
  // If mIsStaticDocument is true, mOriginalDocument points to the original
  // document.
  RefPtr<Document> mOriginalDocument;
There's also some useful info in the comments for the method that returns mStaticDocument. It looks like a static clone of the document should be being made as part of the print process:
  /**
   * Returns true if this document is a static clone of a normal document.
   *
   * We create static clones for print preview and printing (possibly other
   * things in future).
   *
   * Note that static documents are also "loaded as data" (if this method
   * returns true, IsLoadedAsData() will also return true).
   */
  bool IsStaticDocument() const { return mIsStaticDocument; }
Either this isn't happening, or the process is somehow broken.

The process for all this happening is through a call to Document::CreateStaticClone(). This sets the mCreatingStaticClone flag to true. This flag then gets transferred over to mIsStaticDocument in a call to Document::CloneDocHelper().

One place where I notice this CreateStaticClone() method is called is in BrowserChild.cpp where it does seem to be in relation to printing:
mozilla::ipc::IPCResult BrowserChild::RecvCloneDocumentTreeIntoSelf(
    const MaybeDiscarded<BrowsingContext>& aSourceBC,
    const embedding::PrintData& aPrintData) {
#ifdef NS_PRINTING
[...]
  printSettingsSvc->DeserializeToPrintSettings(aPrintData, printSettings);

  RefPtr<Document> clone;
  {
    AutoPrintEventDispatcher dispatcher(*sourceDocument);
    nsAutoScriptBlocker scriptBlocker;
    bool hasInProcessCallbacks = false;
    clone = sourceDocument->CreateStaticClone(ourDocShell, cv, printSettings,
                                              &hasInProcessCallbacks);
    if (NS_WARN_IF(!clone)) {
      return IPC_OK();
    }
  }

  return RecvUpdateRemotePrintSettings(aPrintData);
#endif
  return IPC_OK();
}
It might be useful to know if this is being called at any point during our print process. I've attached a breakpoint to it and we can see. Just in case, I've also added a breakpoint to RecvCloneDocumentTreeIntoSelf().

When printing, neither breakpoint is hit.

So I'm now visually comparing the execution flow in nsPrintJob::DoCommonPrint() (this is the second method in the call stack) between ESR 78 and ESR 91. The code has changed quite a bit. The main changes seem to have happened in this commit:
$ git log -1 ada89eea81989
commit ada89eea819891081b040ab527fdea5752e77e89
Author: Bob Owen <bobowencode@gmail.com>
Date:   Mon Aug 3 14:23:56 2020 +0000

    Bug 1653334 part 2: Cache the selection ranges on subdocuments as we build
    the nsPrintObject tree. r=jwatt
    
    This also refactors the selection printing code, so that as we build the tree we
    record which nsPrintObject should be used if printing a Selection is chosen.
    
    Differential Revision: https://phabricator.services.mozilla.com/D85600
Rather unexpectedly though, the code that's causing the segfault is also clearly present in the ESR 78 code, albeit in a different method. That's really helpful since it means we can compare the execution flow between the two, find the parts that are being missed in ESR 91 and (hopefully) figure out why. This makes me feel a whole lot more confident about finding a solution for this.

But that's it for today. Tomorrow I'll start this comparison between the ESR 78 and ESR 91 print execution flow.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.

Comments

Uncover Disqus comments