19 Mar 2025 : Day 10
Yesterday I started trying to get the NewPipe Extractor build to complete on my phone. Unfortunately I hit some issues with resources. I attempted to fix this by reducing the maximum number of allowed threads to one, but this caused the build to stall. So I then set a build running overnight allowing a maximum of two threads.
Unfortunately this resulted in my phone rebooting during the night. Presumably the phone became unresponsive and the Sailfish OS watchdog stepped in to trigger the reboot.
$ ./compile.sh
transfering data
~/Documents/Development/projects/newpipe/NewPipeExtractor ~/Documents/Development/projects/newpipe
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image"
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image-configure"
~/Documents/Development/projects/newpipe
starting compilation
Starting a Gradle Daemon (subsequent builds will be faster)
> Task :timeago-parser:compileJava
[...]
===============================================================================
GraalVM Native Image: Generating 'appwrapper' (shared library)...
===============================================================================
For detailed information and explanations on the build output, visit:
https://github.com/oracle/graal/blob/master/docs/reference-manual/native-image/BuildOutput.md
-------------------------------------------------------------------------------
[1/8] Initializing...                                            (33.0s @ 0.13GB)
[...]
-------------------------------------------------------------------------------
1 experimental option(s) unlocked:
 - '-H:DeadlockWatchdogInterval' (origin(s): command line)
-------------------------------------------------------------------------------
Build resources:
 - 4.06GB of memory (75.6% of 5.38GB system memory, determined at start)
 - 2 thread(s) (25.0% of 8 available processor(s), set via '--parallelism=2')
[2/8] Performing analysis...  [*******]                         (408.5s @ 2.62GB)
   21,377 reachable types   (93.0% of   22,980 total)
   34,005 reachable fields  (53.3% of   63,823 total)
  116,082 reachable methods (69.4% of  167,276 total)
   11,524 runtime compiled methods    ( 6.9% of  167,276 total)
    6,289 types,    70 fields, and   902 methods registered for reflection
       83 types,    61 fields, and   192 methods registered for JNI access
        4 native libraries: dl, pthread, rt, z
client_loop: send disconnect: Broken pipe

My phone isn't just a phone, or just a development tool. It's also my alarm clock!
But thankfully it didn't prevent the alarms from triggering this morning to wake me up!
This leaves us in a bit of a quandary. There's no intermediate ground between one and two threads, so we'll have to stick to the two thread limit if we're going to make progress. We need something else to control.
It would also be useful to try to better understand the reason the build is triggering a reboot. My suspicion is that it's a memory issue: the build is using more memory than the available RAM. But it would be good to confirm this using some measurements.
Here's the status of my memory while the phone is idling:
$ free -h
                  total        used        free      shared  buff/cache   available
Mem:               5.4G        2.1G      964.9M       66.4M        2.4G        3.6G
Swap:           1024.0M      475.0M      549.0M

Increasing swap size might be a way to approach this, but in the first instance I'm just going to try compiling using various different flags to set different memory limits. At the same time I plan to capture data about memory usage to try to get an idea of whether the flags are actually making any difference, as well as to try to understand what sort of memory usage we're looking at.
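If swap does turn out to be the bottleneck, one option would be a temporary swap file. This is only a sketch under assumptions (the path, the 4096 MiB size and the need to run it as root on the device are all mine, not something from the build docs); it's written as a dry run that just prints the commands:

```shell
# Hypothetical dry run: the commands I'd expect to need for a temporary
# 4 GiB swap file. Path and size are assumptions. Drop the echos and
# run as root on the device to apply it for real.
SWAPFILE=/home/.buildswap
SWAPSIZE_MIB=4096

echo dd if=/dev/zero of="$SWAPFILE" bs=1M count="$SWAPSIZE_MIB"
echo chmod 600 "$SWAPFILE"
echo mkswap "$SWAPFILE"
echo swapon "$SWAPFILE"
# Afterwards: swapoff "$SWAPFILE" && rm "$SWAPFILE"
```

The nice thing about a swap file over a partition is that it can be removed again once the build is done.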
I've also updated the gradle build command so it now includes the verbose flag. I'm hoping this will output not just progress but also the exact commands being used for java and the native-image calls. Here's what I'm using now:
ssh $COMPILEHOST "cd $COMPILEHOST_WORKSPACE; \
    GRAALVM_HOME=$COMPILEHOST_GRAAL JAVA_HOME=$COMPILEHOST_GRAAL \
    ./gradlew nativeCompile" --console verbose

And here are the additional flags in the graalvmNative section of my build.gradle build configuration. As you can see, I've added the verbose flag here as well, plus the Xmx and Xms flags for setting the maximum and initial memory allocations respectively.
$ git diff
diff --git a/appwrapper/build.gradle b/appwrapper/build.gradle
index 39a9503e..c0bd305e 100644
--- a/appwrapper/build.gradle
+++ b/appwrapper/build.gradle
@@ -35,6 +35,12 @@ graalvmNative {
             buildArgs.add('-H:+AddAllCharsets')
             // Enable network protocols
             buildArgs.add('--enable-url-protocols=http,https')
+            buildArgs.add('--parallelism=2')
+            buildArgs.add('-H:DeadlockWatchdogInterval=0')
+            buildArgs.add('-H:+UnlockExperimentalVMOptions')
+            buildArgs.add('--verbose')
+            buildArgs.add('-J-Xmx256m')
+            buildArgs.add('-J-Xms128m')
         }
     }

Back when I was working on Gecko I also had to capture memory usage and ended up using psrecord for this. It's a Python utility which can easily be installed in a virtual environment, and which can then be attached to a process to capture CPU and memory usage. Here's how I've installed it directly on my phone:
$ python3 -m venv venv
$ . ./venv/bin/activate
$ pip install --upgrade pip
$ pip install psrecord matplotlib
$ psrecord --plot mem-gradle.png --interval 0.2 --include-children <PID>

There are multiple ways to attach psrecord to a process, but I've decided to go for a manual approach. Once the gradle build has started I can find the process using ps and attach psrecord to this so as to capture resource usage. Because gradle will spawn a bunch of other tools I've added the include-children flag to the psrecord command:
$ ps aux | grep gradle
22624 defaultu /home/defaultuser/Documents/Development/newpipe/graalvm/graalvm-jdk-23.0.2+7.1/bin/java -Xmx64m -Xms64m -Dorg.gradle.appname=gradlew -classpath /home/defaultuser/Documents/Development/newpipe/NewPipeExtractor/gradle/wrapper/gradle-wrapper.jar org.gradle.wrapper.GradleWrapperMain nativeCompile --console verbose
22664 defaultu grep gradle
$ psrecord --plot mem-gradle.png --interval 0.2 --include-children 22624
Attaching to process 22624

For my first attempt I set the memory to a maximum of 256 MiB. This is very low, but at this level with two threads it should at least prevent my phone from crashing.
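Finding the PID by eye works, but it could be scripted. Here's a sketch, assuming procps's pgrep is available on the device and that the wrapper's main class stays org.gradle.wrapper.GradleWrapperMain:

```shell
# Find the Gradle wrapper JVM by matching its full command line with
# pgrep -f, then attach psrecord to it. Guarded so nothing happens if
# either the build isn't running or psrecord isn't installed.
pid=$(pgrep -f org.gradle.wrapper.GradleWrapperMain | head -n 1)
if [ -n "$pid" ] && command -v psrecord >/dev/null 2>&1; then
    psrecord --plot mem-gradle.png --interval 0.2 --include-children "$pid"
else
    echo "nothing to record (no build running, or psrecord not installed)"
fi
```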
$ ./compile.sh
[...]
-------------------------------------------------------------------------------
1 experimental option(s) unlocked:
 - '-H:DeadlockWatchdogInterval' (origin(s): command line)
-------------------------------------------------------------------------------
Build resources:
 - 0.22GB of memory (4.1% of 5.38GB system memory, set via '-Xmx256m')
 - 2 thread(s) (25.0% of 8 available processor(s), set via '--parallelism=2')
Terminating due to java.lang.OutOfMemoryError: Java heap space
The Native Image build process ran out of memory.
Please make sure your build system has more memory available.
[...]
veimage.driver/com.oracle.svm.driver.NativeImage.performBuild(NativeImage.java:1847)
        at org.graalvm.nativeimage.driver/com.oracle.svm.driver.NativeImage.main(NativeImage.java:1829)
        at java.base@23.0.2/java.lang.invoke.LambdaForm$DMH/sa346b79c.invokeStaticInit(LambdaForm$DMH)
FAILURE: Build failed with an exception.
[...]

With these numbers the build fails pretty swiftly with the advice "Please make sure your build system has more memory available". The actual error message is:
veimage.driver/com.oracle.svm.driver.NativeImage.performBuild(NativeImage.java:1847)
        at org.graalvm.nativeimage.driver/com.oracle.svm.driver.NativeImage.main(NativeImage.java:1829)
        at java.base@23.0.2/java.lang.invoke.LambdaForm$DMH/sa346b79c.invokeStaticInit(LambdaForm$DMH)

Let's take a look at the graph generated by psrecord for memory usage.
As we can see the memory usage gradually increases through the 300 second run. It peaks at about 1280 MiB, which seems high considering the builder JVM's heap is capped at 256 MiB (the -Xmx value limits the whole process heap, not each of the two threads separately), but not completely absurdly so. Presumably the Gradle processes and the builder's non-heap overhead are consuming the remaining 1024 MiB.
The memory drops down abruptly just before the 300 second mark. That's when the build process crashes.
At least now the phone isn't running out of memory; rather, the Java build process is running out of heap inside the constrained JVM.
So I'll have to try increasing the JVM memory and give it another go. Before I do that it's worth checking the command that was used, which is now visible in the output after the inclusion of the verbose flag.
The command executed is incredibly long so I've abridged it slightly, while attempting to retain the most relevant parts.
Executing [
    HOME=/home/defaultuser \
    PATH=/usr/local/bin:/bin:/usr/bin \
    PWD=/home/defaultuser/Documents/Development/newpipe/NewPipeExtractor \
    USE_NATIVE_IMAGE_JAVA_PLATFORM_MODULE_SYSTEM=true \
    /home/defaultuser/Documents/Development/newpipe/graalvm/graalvm-jdk-23.0.2+7.1/bin/java \
    -XX:+UseParallelGC \
    -XX:+UnlockExperimentalVMOptions \
    -XX:+EnableJVMCI \
    -Dtruffle.TrustAllTruffleRuntimeProviders=true \
    -Dtruffle.TruffleRuntime=com.oracle.truffle.api.impl.DefaultTruffleRuntime \
    -Dgraalvm.ForcePolyglotInvalid=true \
    -Dgraalvm.locatorDisabled=true \
    [...]
    -XX:+UseJVMCINativeLibrary \
    -Xss10m \
    -XX:MaxRAMPercentage=84.99999999999999 \
    -XX:GCTimeRatio=9 \
    -XX:+ExitOnOutOfMemoryError \
    -Djava.awt.headless=true \
    '-Dorg.graalvm.vendor=Oracle Corporation' \
    -Dorg.graalvm.vendorurl=https://www.graalvm.org/ \
    '-Dorg.graalvm.vendorversion=Oracle GraalVM 23.0.2+7.1' \
    -Dorg.graalvm.version=24.1.2 \
    -Dcom.oracle.graalvm.isaot=true \
    -Djava.system.class.loader=com.oracle.svm.hosted.NativeImageSystemClassLoader \
    -Xshare:off \
    -Dtruffle.TruffleRuntime=com.oracle.svm.truffle.api.SubstrateTruffleRuntime \
    -Dgraalvm.ForcePolyglotInvalid=false \
    -Dgraalvm.ForcePolyglotInvalid=false \
    -Djdk.reflect.useOldSerializableConstructor=true \
    -Djdk.internal.lambda.disableEagerInitialization=true \
    -Djdk.internal.lambda.eagerlyInitialize=false \
    -Djava.lang.invoke.InnerClassLambdaMetafactory.initializeLambdas=false \
    -Djava.lang.invoke.MethodHandle.DONT_INLINE_THRESHOLD=-1 \
    -Djava.lang.invoke.MethodHandle.PROFILE_GWT=false \
    -Xmx256m \
    -Xms128m \
    --add-modules=ALL-DEFAULT \
    [...]
    -H:+AddAllCharsets@user \
    -H:EnableURLProtocols@user+api=http,https \
    -H:NumberOfThreads@user+api=2 \
    -H:DeadlockWatchdogInterval@user=0 \
    -H:+UnlockExperimentalVMOptions@user \
    -H:-UnlockExperimentalVMOptions@user \
    [...]
]

There's nothing unexpected in any of that. In particular, we can see some of the flags related to memory use, all of which are as expected:
-Xss10m \
-XX:MaxRAMPercentage=84.99999999999999 \
-XX:GCTimeRatio=9 \
-XX:+ExitOnOutOfMemoryError \
[...]
-Xmx256m \
-Xms128m \

My plan now is to gradually increase the memory until something different happens, all the time capturing the outputs. I tried 384 MiB, 512 MiB and 768 MiB, all of which resulted in the build failing with the following error:
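Stepping through the sizes by hand gets tedious, so a sweep along these lines could automate it. Purely a sketch: the sed pattern assumes the -J-Xmx flag added to build.gradle above, and the log file names are my own invention.

```shell
# Hypothetical sweep over heap sizes: rewrite the -J-Xmx flag in
# build.gradle for each size, rebuild, and keep one log per run.
GRADLE_FILE=appwrapper/build.gradle
for XMX in 384 512 768 1024 2048; do
    if [ -f "$GRADLE_FILE" ]; then
        # Replace whatever -J-XmxNNNm value is currently in the file
        sed -i "s/-J-Xmx[0-9]*m/-J-Xmx${XMX}m/" "$GRADLE_FILE"
        ./compile.sh 2>&1 | tee "build-xmx${XMX}.log"
    else
        echo "skipping ${XMX} MiB: no $GRADLE_FILE here"
    fi
done
```

Capturing each run to its own log also makes it easier to diff the failure modes afterwards.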
com.oracle.svm.driver.NativeImage$NativeImageError
        at org.graalvm.nati
The Native Image build process ran out of memory.
Please make sure your build system has more memory available.

At 1024 MiB the error changed, but was still clearly related to a lack of memory. The processes are running in parallel so sometimes the output arrives with different ordering.
Please make sure your build system has more memory available.
[...]
com.oracle.svm.driver.NativeImage$NativeImageError
        at org.graalvm.nativeimage.driver/com.oracle.svm.driver.NativeImage.showError(NativeImage.java:2300)
        at org.graalvm.nativeimage.driver/com.oracle.svm.driver.NativeImage.build(NativeImage.java:1897)
        at org.graalvm.nativeimage.driver/com.oracle.svm.driver.NativeImage.performBuild(NativeImage.java:1847)
        at org.graalvm.nativeimage.driver/com.oracle.svm.driver.NativeImage.main(NativeImage.java:1829)
        at java.base@23.0.2/java.lang.invoke.LambdaForm$DMH/sa346b79c.invokeStaticInit(LambdaForm$DMH)

Increasing the maximum allowable memory to 2048 MiB gives similar results again:
$ ./compile.sh
[...]
Build resources:
 - 1.78GB of memory (33.1% of 5.38GB system memory, set via '-Xmx2048m')
 - 2 thread(s) (25.0% of 8 available processor(s), set via '--parallelism=2')
Terminating due to java.lang.OutOfMemoryError: Java heap space
The Native Image build process ran out of memory.
Please make sure your build system has more memory available.
<==========---> 81% EXECUTING [34m 2s]
> :appwrapper:nativeCompile
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
com.oracle.svm.driver.NativeImage$NativeImageError
        at org.graalvm.nati

At this point I decided to try doubling the memory to 4096 MiB — a rather large jump — and this time the build didn't fail as such, but it did cause my phone to reboot again. So I guess we now know that, if there is a configuration that's going to work, it'll be with memory set somewhere between 2048 MiB and 4096 MiB.
Whether there's a sweet spot somewhere between those two values isn't clear. I'm also making the assumption that there's only minimal swap space available on my phone and it may turn out that I need to increase the swap space to get it to work.
Here are the memory graphs from the different runs. Bear in mind that the x-axis scales represent time and aren't all directly comparable. Similarly there are two y-axis scales. The one we're interested in is the memory usage shown along the right hand edge of each graph. Note that these don't all share the same scale either, so they're not directly comparable.
I've included the 256 MiB graph again for comparison. Since psrecord writes out the image on completion I don't have the graph for 4096 MiB because the phone rebooted before the image could be written.
All of these graphs show a similar pattern: CPU usage and memory both increase and then plateau. After a period of time the build fails and both drop down to zero. The maximum memory usage increases from run to run, reaching around 1280, 1500, 1560, 1800, 2000 and 2800 MiB respectively. It's hard to correlate these directly to the maximum JVM memory we set, except to say that as the setting increases, so does the maximum memory used by the build. That is at least what we'd expect.
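To make that relationship a little more concrete, here's a quick back-of-envelope calculation. The peak values are my readings from the graphs above; subtracting the -Xmx cap from each peak gives a rough figure for everything the build uses outside the builder's heap:

```shell
# Peak RSS from each run against the -Xmx setting used for it; the
# difference approximates the non-heap overhead (Gradle processes,
# JIT, metaspace, native-image bookkeeping and so on).
printf '%s\n' \
    '256 1280' '384 1500' '512 1560' \
    '768 1800' '1024 2000' '2048 2800' |
awk '{ printf "Xmx %4d MiB -> peak %4d MiB, overhead ~%d MiB\n", $1, $2, $2 - $1 }'
```

On these numbers the overhead sits somewhere around 750 to 1100 MiB in every run, which would suggest budgeting roughly a gigabyte on top of whatever heap the analysis phase actually needs.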
It's also notable that the last three graphs for 768 MiB, 1024 MiB and 2048 MiB show a slightly different curve from the earlier graphs. These seem to increase, plateau, but then start to increase again before reaching a second plateau. This may be because they're reaching later portions of the build process.
Finally, the time taken for each build doesn't seem to correspond to the amount of memory available. The build with 768 MiB took especially long for some reason. It's possible this is just down to how the compiler is attempting to shuffle memory around to compensate for the available memory being restricted.
Tomorrow I'll have to do a few more experiments and will need to examine these graphs in more detail to try to figure out how much memory we're likely to need in practice.
If you've been following on Mastodon you'll already know, but I'm happy to share here that Thigg has also been looking into this, attempting to get the build to complete in a third environment: a Hetzner aarch64 VM:
"Just for fun I tried to build the libjavafloodjava lib on an aarch64 VM on hetzner today. First the glibc version did not match up so I tried static linking with musl (maybe it would have been easier to use a compatible glibc somehow...) this did not work natively on the machine so I tried the docker images of graal but the aarch64 images do not come with musl. That's where I ended today. Maybe one could install musl into the docker images and just use the docker container to compile the library on some beefy VM?"
It wouldn't have occurred to me to try this, but it would be an excellent way to circumvent both the challenges of getting the code to build on a phone and the fact it won't build in the Sailfish SDK.
Thigg and I were planning to have a call to discuss ways forward with this but unfortunately it didn't happen, entirely my fault. I'm very much hoping that we can still have a conversation about it this week.
18 Mar 2025 : Day 9
Yesterday I worked through and removed all final references to the Rhino JavaScript interpreter, so that we can use the GraalJS interpreter instead. Just as I was about to commit the code I had a change of heart: rather than putting all of the replaced functionality inside a single Utils class I decided to split it across replacement Context, Kit and ScriptRuntime classes instead.
This increases the number of files, but reduces the diff required from the existing code, so I think it ends up a lot cleaner and nicer.
Having committed my changes it's time to move to the next stage, which is attempting to build and run the NewPipe source for Sailfish OS. Since we already attempted to build it locally using the Sailfish SDK unsuccessfully on Day 6 I'm going to jump straight to building it directly on my phone today.
This means adjusting the compile.sh file we used to build sailing-the-flood-to-java earlier for the NewPipe codebase.
My first attempt looks like this:
#!/bin/bash

COMPILEHOST=defaultuser@172.28.172.1
COMPILEHOST_WORKSPACE=/home/defaultuser/Documents/Development/newpipe/NewPipeExtractor
COMPILEHOST_GRAAL=/home/defaultuser/Documents/Development/newpipe/graalvm/graalvm-jdk-23.0.2+7.1
COMPILEHOST_MAVENLOCAL=/home/defaultuser/Documents/Development/newpipe/graalvm/m2

echo "transfering data"
pushd NewPipeExtractor
rsync . $COMPILEHOST:$COMPILEHOST_WORKSPACE -r
popd

echo "starting compilation"
ssh $COMPILEHOST "cd $COMPILEHOST_WORKSPACE; GRAALVM_HOME=$COMPILEHOST_GRAAL \
JAVA_HOME=$COMPILEHOST_GRAAL ./gradlew tasks"

All this does is sync the NewPipeExtractor source directory, then call gradlew tasks inside it. The result is good: the various tasks are listed to the console, including a bunch of native build and run tasks. There are a couple of warnings during the rsync operation though:
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image"
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image-configure"

That's because these files are symlinks:
$ ls -lh graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image
lrwxrwxrwx 1 flypig flypig 27 Jan  7 14:16 native-image -> ../lib/svm/bin/native-image
$ ls -lh graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image-configure
lrwxrwxrwx 1 flypig flypig 37 Jan  7 14:16 native-image-configure -> ../lib/svm/bin/native-image-configure

I could fix those manually on my phone, but in practice I don't think they should matter because I've configured the build process to use the GraalVM toolchain from a different location (the location that was used for sailing-the-flood-to-java as it happens). In future, I think the right way to deal with this is to ensure the build tools get downloaded and unpacked directly on my phone during the build process. That'll be something for the future.
The next step then has to be to try to build the native binaries using nativeCompile. I've therefore updated the final line of the script so it looks like this:
ssh $COMPILEHOST "cd $COMPILEHOST_WORKSPACE; GRAALVM_HOME=$COMPILEHOST_GRAAL \
JAVA_HOME=$COMPILEHOST_GRAAL ./gradlew nativeCompile"

Let's try executing it to see what happens.
$ ./compile.sh
transfering data
~/Documents/Development/projects/newpipe/NewPipeExtractor ~/Documents/Development/projects/newpipe
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image"
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image-configure"
~/Documents/Development/projects/newpipe
starting compilation
[...]

As the build progresses I notice that my email client, calendar, Whisperfish and the Documents app all running on my phone have been killed due to lack of memory. This also happened when building sailing-the-flood-to-java so this isn't unexpected. But it makes me wonder whether there will be enough resources for the much larger NewPipe build to get through in its entirety.
Perhaps unsurprisingly the build is also taking a lot longer now it's happening on my phone. I should time it and the build on my laptop to compare.
I'm currently travelling to work and it's taking so long in fact that the build is still running as my train pulls in to King's Cross, the final stop of my journey. Although I'm running in a gnu screen session, the session is running on my laptop, not on my phone. I can send my laptop to sleep but I doubt the SSH connection will survive the interruption.
So I've cancelled the build and will have to kick it off again later.
And as expected, as my train comes in to land I've had to close the connection and stop the build.
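Before kicking it off again, one way to stop a dropped SSH connection killing the build would be to detach it on the phone itself, so that the session from the laptop is no longer load-bearing. A sketch reusing the variables from compile.sh (the build.log name is my own choice, and this is guarded so it does nothing when those variables aren't set):

```shell
# Start the build detached on the phone; nohup keeps it alive after the
# SSH connection (and the laptop) goes away.
if [ -n "${COMPILEHOST:-}" ]; then
    ssh "$COMPILEHOST" "cd $COMPILEHOST_WORKSPACE && \
        nohup env GRAALVM_HOME=$COMPILEHOST_GRAAL JAVA_HOME=$COMPILEHOST_GRAAL \
            ./gradlew nativeCompile > build.log 2>&1 &"
    # Later, from any machine, check how it's getting on:
    ssh "$COMPILEHOST" "tail -n 20 $COMPILEHOST_WORKSPACE/build.log"
else
    echo "COMPILEHOST not set"
fi
```

Running screen or tmux on the phone rather than the laptop would achieve the same thing with the bonus of being able to reattach interactively.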
$ ./compile.sh
transfering data
~/Documents/Development/projects/newpipe/NewPipeExtractor ~/Documents/Development/projects/newpipe
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image"
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image-configure"
~/Documents/Development/projects/newpipe
starting compilation
Starting a Gradle Daemon (subsequent builds will be faster)
[...]
===============================================================================
GraalVM Native Image: Generating 'appwrapper' (shared library)...
===============================================================================
For detailed information and explanations on the build output, visit:
https://github.com/oracle/graal/blob/master/docs/reference-manual/native-image/BuildOutput.md
-------------------------------------------------------------------------------
[1/8] Initializing...                                            (31.9s @ 0.14GB)
[...]
-------------------------------------------------------------------------------
Build resources:
 - 4.06GB of memory (75.6% of 5.38GB system memory, determined at start)
 - 8 thread(s) (100.0% of 8 available processor(s), determined at start)
[2/8] Performing analysis...  [*******]                         (337.2s @ 2.77GB)
   21,380 reachable types   (93.0% of   22,982 total)
   34,005 reachable fields  (53.3% of   63,827 total)
  116,104 reachable methods (69.4% of  167,289 total)
   11,524 runtime compiled methods    ( 6.9% of  167,289 total)
    6,290 types,    70 fields, and   905 methods registered for reflection
       83 types,    61 fields, and   192 methods registered for JNI access
        4 native libraries: dl, pthread, rt, z
^C

I attempted a further build during the day. It seemed to make progress until it caused my phone to become unresponsive and reboot. So I'll need a way to reduce the resource consumption of the build. Looking at the output from earlier, I see both memory and threads listed as resources being claimed.
I don't see any flags available for the native-image tool for constraining memory usage, but there is a flag for controlling the number of threads that are used. The flag in question is --parallelism, which you can pass with a number to restrict the maximum number of threads.
I'm therefore adding this to the configuration so that no more than one thread is used. This will potentially slow the build down by a factor of eight, which is a huge difference, but still better than a failed build and a rebooted phone.
$ git diff
diff --git a/appwrapper/build.gradle b/appwrapper/build.gradle
index 39a9503e..c3c19dc3 100644
--- a/appwrapper/build.gradle
+++ b/appwrapper/build.gradle
@@ -35,6 +35,7 @@ graalvmNative {
             buildArgs.add('-H:+AddAllCharsets')
             // Enable network protocols
             buildArgs.add('--enable-url-protocols=http,https')
+            buildArgs.add('--parallelism=1')
         }
     }

With this in place, I kick the build off again. But now there appears to be a different issue. Brace yourself for a rather long console transcript. Feel free to skip past this; I really just want to capture the errors for future reference.
$ ./compile.sh
transfering data
~/Documents/Development/projects/newpipe/NewPipeExtractor ~/Documents/Development/projects/newpipe
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image"
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image-configure"
~/Documents/Development/projects/newpipe
starting compilation
Starting a Gradle Daemon (subsequent builds will be faster)
[...]
> Task :appwrapper:nativeCompile
[native-image-plugin] GraalVM Toolchain detection is disabled
[native-image-plugin] GraalVM location read from environment variable: GRAALVM_HOME
[native-image-plugin] Native Image executable path: graalvm-jdk-23.0.2+7.1/lib/svm/bin/native-image
Loading classes is taking a long time. This can be caused by class- or module-path entries that point to large directory structures.
Total processed entries: 17989, current entry: jar:file:///home/defaultuser/.gradle/caches/modules-2/files-2.1/com.google.code.findbugs/jsr305/3.0.2/25ea2e8b0c338a877313bd4672d3fe056ea78f0d/jsr305-3.0.2.jar!/META-INF/maven/com.google.code.findbugs/jsr305
[...]
=== Image generator watchdog detected no activity. This can be a sign of a deadlock during image building. Dumping all stack traces.
Current time: Tue Feb 25 21:11:04 GMT 2025
"main" Id=1 in WAITING on lock=java.util.stream.ForEachOps$ForEachTask@2cb0e449
    at java.base@23.0.2/jdk.internal.misc.Unsafe.park(Native Method)
    at java.base@23.0.2/java.util.concurrent.locks.LockSupport.park(LockSupport.java:371)
    at java.base@23.0.2/java.util.concurrent.ForkJoinTask.awaitDone(ForkJoinTask.java:441)
    at java.base@23.0.2/java.util.concurrent.ForkJoinTask.awaitDone(ForkJoinTask.java:496)
    at java.base@23.0.2/java.util.concurrent.ForkJoinTask.join(ForkJoinTask.java:662)
    at java.base@23.0.2/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:677)
    at java.base@23.0.2/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
    at java.base@23.0.2/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)
    at java.base@23.0.2/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:264)
    at java.base@23.0.2/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:636)
    at java.base@23.0.2/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:810)
    at app/org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageClassLoaderSupport$LoadClassHandler.run(NativeImageClassLoaderSupport.java:678)
    at app/org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageClassLoaderSupport.loadAllClasses(NativeImageClassLoaderSupport.java:243)
    at app/org.graalvm.nativeimage.builder/com.oracle.svm.hosted.ImageClassLoader.loadAllClasses(ImageClassLoader.java:88)
    at app/org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.buildImage(NativeImageGeneratorRunner.java:386)
    at app/org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.build(NativeImageGeneratorRunner.java:711)
    at app/org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.start(NativeImageGeneratorRunner.java:139)
    at app/org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.main(NativeImageGeneratorRunner.java:94)
"Reference Handler" Id=5 in RUNNABLE
    at java.base@23.0.2/java.lang.ref.Reference.waitForReferencePendingList(Native Method)
    at java.base@23.0.2/java.lang.ref.Reference.processPendingReferences(Reference.java:246)
    at java.base@23.0.2/java.lang.ref.Reference$ReferenceHandler.run(Reference.java:208)
"Finalizer" Id=6 in WAITING on lock=java.lang.ref.NativeReferenceQueue$Lock@28e6ff9f
    at java.base@23.0.2/java.lang.Object.wait0(Native Method)
    at java.base@23.0.2/java.lang.Object.wait(Object.java:378)
    at java.base@23.0.2/java.lang.Object.wait(Object.java:352)
    at java.base@23.0.2/java.lang.ref.NativeReferenceQueue.await(NativeReferenceQueue.java:48)
    at java.base@23.0.2/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:166)
    at java.base@23.0.2/java.lang.ref.NativeReferenceQueue.remove(NativeReferenceQueue.java:89)
    at java.base@23.0.2/java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:173)
"Signal Dispatcher" Id=7 in RUNNABLE
"Common-Cleaner" Id=14 in TIMED_WAITING on lock=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4321868d
    at java.base@23.0.2/jdk.internal.misc.Unsafe.park(Native Method)
    at java.base@23.0.2/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:269)
    at java.base@23.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1852)
    at java.base@23.0.2/java.lang.ref.ReferenceQueue.await(ReferenceQueue.java:79)
    at java.base@23.0.2/java.lang.ref.ReferenceQueue.remove0(ReferenceQueue.java:151)
    at java.base@23.0.2/java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:229)
    at java.base@23.0.2/jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
    at java.base@23.0.2/java.lang.Thread.runWith(Thread.java:1588)
    at java.base@23.0.2/java.lang.Thread.run(Thread.java:1575)
    at java.base@23.0.2/jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:186)
"Notification Thread" Id=15 in RUNNABLE
[...]
=== Memory statistics (in MB):
=== Used heap size: 84
=== Free heap size: 53
=== Maximum heap size: 4162
=== Image generator watchdog is aborting image generation.

$ ./compile.sh
transfering data
~/Documents/Development/projects/newpipe/NewPipeExtractor ~/Documents/Development/projects/newpipe
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image"
skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/native-image-configure"
~/Documents/Development/projects/newpipe
starting compilation
Starting a Gradle Daemon (subsequent builds will be faster)
[...]
===============================================================================
GraalVM Native Image: Generating 'appwrapper' (shared library)...
===============================================================================
For detailed information and explanations on the build output, visit:
https://github.com/oracle/graal/blob/master/docs/reference-manual/native-image/BuildOutput.md
-------------------------------------------------------------------------------
[1/8] Initializing...                                            (33.0s @ 0.13GB)
[...]
-------------------------------------------------------------------------------
1 experimental option(s) unlocked:
 - '-H:DeadlockWatchdogInterval' (origin(s): command line)
-------------------------------------------------------------------------------
Build resources:
 - 4.06GB of memory (75.6% of 5.38GB system memory, determined at start)
 - 2 thread(s) (25.0% of 8 available processor(s), set via '--parallelism=2')
To configure the watchdog, use the options -H:DeadlockWatchdogInterval=10 and -H:+DeadlockWatchdogExitOnTimeout > Task :appwrapper:nativeCompile FAILED Error: Image build request for 'appwrapper' (pid: 59178, path: /home/defaultuser/Documents/Development/newpipe/NewPipeExtractor/appwrapper /build/native/nativeCompile) failed with exit status 30 [Incubating] Problems report is available at: file:///home/defaultuser/Documents/Development/newpipe/NewPipeExtractor /build/reports/problems/problems-report.html FAILURE: Build failed with an exception. * What went wrong: Execution failed for task ':appwrapper:nativeCompile'. > Process 'command '/home/defaultuser/Documents/Development/newpipe/graalvm /graalvm-jdk-23.0.2+7.1/bin/native-image'' finished with non-zero exit value 30 [...] BUILD FAILED in 11m 42sThe build has clearly failed here, but despite all those backtraces it doesn't appear to be due to an error, but rather is triggered by the watchdog mechanism. This is supposed to identify builds that have hung, killing the build process after a certain duration of unresponsiveness. Rather conveniently the error message also provides some advice for how to work around this:
watchdog is aborting image generation.
To configure the watchdog, use the options -H:DeadlockWatchdogInterval=10
and -H:+DeadlockWatchdogExitOnTimeout

I'm not sure what a good timeout would be, so instead I'm going to disable the watchdog completely by adding -H:DeadlockWatchdogInterval=0 as a build parameter, like this:
$ git diff
diff --git a/appwrapper/build.gradle b/appwrapper/build.gradle
index 39a9503e..dbf0a12b 100644
--- a/appwrapper/build.gradle
+++ b/appwrapper/build.gradle
@@ -35,6 +35,8 @@ graalvmNative {
             buildArgs.add('-H:+AddAllCharsets')
             // Enable network protocols
             buildArgs.add('--enable-url-protocols=http,https')
+            buildArgs.add('--parallelism=1')
+            buildArgs.add('-H:DeadlockWatchdogInterval=0')
         }
     }

With this set the build is now running and not being killed. But it's taking a very long time and appears to be stuck in a loop even before reaching the first stage:
$ ./compile.sh transfering data ~/Documents/Development/projects/newpipe/NewPipeExtractor ~/Documents/ Development/projects/newpipe skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/ native-image" skipping non-regular file "graalvm/graalvm-jdk-23.0.2+7.1/bin/ native-image-configure" ~/Documents/Development/projects/newpipe starting compilation Starting a Gradle Daemon (subsequent builds will be faster) [...] > Task :appwrapper:nativeCompile [native-image-plugin] GraalVM Toolchain detection is disabled [native-image-plugin] GraalVM location read from environment variable: GRAALVM_HOME [native-image-plugin] Native Image executable path: graalvm/graalvm-jdk-23.0.2+7.1/lib/svm/bin/native-image Warning: The option '-H:DeadlockWatchdogInterval=0' is experimental and must be enabled via '-H:+UnlockExperimentalVMOptions' in the future. Warning: Please re-evaluate whether any experimental option is required, and either remove or unlock it. The build output lists all active experimental options, including where they come from and possible alternatives. If you think an experimental option should be considered as stable, please file an issue. Loading classes is taking a long time. This can be caused by class- or module-path entries that point to large directory structures. Total processed entries: 17989, current entry: jar:file:///home/defaultuser/.gradle/caches/modules-2/files-2.1/com.google .code.findbugs/jsr305/3.0.2/25ea2e8b0c338a877313bd4672d3fe056ea78f0d /jsr305-3.0.2.jar!/META-INF/maven/com.google.code.findbugs/jsr305 Total processed entries: 17989, current entry: jar:file:///home/defaultuser/.gradle/caches/modules-2/files-2.1/com.google .code.findbugs/jsr305/3.0.2/25ea2e8b0c338a877313bd4672d3fe056ea78f0d /jsr305-3.0.2.jar!/META-INF/maven/com.google.code.findbugs/jsr305 [...] 
Total processed entries: 17989, current entry: jar:file:///home/defaultuser/.gradle/caches/modules-2/files-2.1/com.google.code.findbugs/jsr305/3.0.2/25ea2e8b0c338a877313bd4672d3fe056ea78f0d/jsr305-3.0.2.jar!/META-INF/maven/com.google.code.findbugs/jsr305
[...]

That same message is repeated 25 times and the "Total processed entries" count never increases beyond 17989. It seems that, with only one thread, the process will never make progress. I've therefore increased the parallelism so that it once again allows up to two threads. It's late, so I'm going to leave it running overnight and see how far it's progressed by morning.
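Since so much of this comes down to juggling thread counts against a constrained JVM heap, it can be useful to confirm what limits the JVM itself believes it's running under. This is just a diagnostic sketch using only the standard library, not part of the build:

```java
// Prints the heap limits the running JVM reports; handy for checking that
// -Xmx-style constraints are actually taking effect.
public class HeapCheck {
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (MiB):   " + maxHeapMiB());
        System.out.println("total heap (MiB): " + rt.totalMemory() / (1024 * 1024));
        System.out.println("free heap (MiB):  " + rt.freeMemory() / (1024 * 1024));
    }
}
```

Running it with an explicit cap, for example `java -Xmx256m HeapCheck`, should report a maximum heap of roughly 256 MiB (often slightly less, depending on the JVM).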
Tomorrow we'll find out the result. If it's failed I'll need to look more deeply into how to fix it. If it's succeeded then it'll be time to test the resulting library!
17 Mar 2025 : Day 8 #
Yesterday turned out to be an unexpectedly productive day. I was able to get the NewPipe Extractor code built on Linux as a native executable, linked to some C code that was able to collect metadata and a video download URL from YouTube.
As part of this I had to convert some code that executed JavaScript functions using Mozilla's Rhino JavaScript engine to use Polyglot instead. Under the hood, this is making use of the GraalJS JavaScript interpreter.
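I won't reproduce the actual NewPipe changes here, but the general shape of the Polyglot API involved looks something like the sketch below: evaluate some JavaScript, look up a function by name in the global bindings, and call it. The function and names here are invented for illustration:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Value;

// Minimal GraalJS-via-Polyglot sketch: evaluate JavaScript source, fetch a
// function from the global bindings, execute it and read back the result.
public class PolyglotSketch {
    public static String callJsFunction(String jsCode, String name, String arg) {
        try (Context context = Context.create("js")) {
            context.eval("js", jsCode);
            Value function = context.getBindings("js").getMember(name);
            return function.execute(arg).asString();
        }
    }

    public static void main(String[] args) {
        String js = "function reverse(s) { return s.split('').reverse().join(''); }";
        System.out.println(callJsFunction(js, "reverse", "flypig"));
    }
}
```

This needs the org.graalvm.polyglot:polyglot and org.graalvm.polyglot:js artefacts on the classpath, as we'll see in the build configuration.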
But, although I was able to switch out Rhino for GraalJS in the places needed for our test application, there are other places where Rhino is still used, most notably in the TokenStream.java source. Rhino is much more tightly intertwined with this code, so replacing it is going to be considerably harder.
In fact, looking at the header, it even seems like some of this source was taken directly from the Rhino codebase:
/*
 * Source: Mozilla Rhino, org.mozilla.javascript.Token
 *
 * This Source Code Form is subject to the terms of the Mozilla Public
 * License, v. 2.0. If a copy of the MPL was not distributed with this
 * file, You can obtain one at http://mozilla.org/MPL/2.0/.
 */

class TokenStream {
[...]

Digging through the code, it appears this is used to extract function names from JavaScript files, needed in two places:
- YoutubeSignatureUtils: used to get the signature de-obfuscation code of YouTube's base JavaScript file;
So what's the actual code that's being used here? There's a lot of code in the TokenStream class, but the actual Rhino code being used is somewhat more limited. There are a couple of static ints taken from the Context class. These are just integers, so it'd be pretty easy to redefine them (although I must also look into how they're being used and whether they need to be updated for use with GraalJS). There are also five methods that are made use of:
- Kit.xDigitToInt()
- Kit.codeBug()
- ScriptRuntime.isJSLineTerminator()
- ObjToIntMap.ObjToIntMap()
- ObjToIntMap.intern()
Looking at these five methods, it seems they can mostly be replaced quite easily, potentially just by pulling the small portion of code needed directly into the source file in the NewPipe Extractor code. Since the code is MPL licensed and the MPL copyright notice is already at the top of the file, doing this shouldn't be problematic from a licensing perspective either.
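To give a flavour of what "pulling in" looks like, here's a sketch of the two smallest helpers. The semantics are my reading of the Rhino originals, so treat the details as assumptions rather than verified copies:

```java
// Local stand-ins for two of the Rhino helpers used by TokenStream.
public final class RhinoHelpers {
    // Presumed behaviour of Kit.xDigitToInt: shift the accumulator left four
    // bits and add the value of hex digit c, or return -1 for a non-hex digit.
    public static int xDigitToInt(int c, int accumulator) {
        if (c >= '0' && c <= '9') {
            c -= '0';
        } else if (c >= 'a' && c <= 'f') {
            c = c - 'a' + 10;
        } else if (c >= 'A' && c <= 'F') {
            c = c - 'A' + 10;
        } else {
            return -1;
        }
        return (accumulator << 4) | c;
    }

    // ECMAScript line terminators: LF, CR, LINE SEPARATOR, PARAGRAPH SEPARATOR.
    public static boolean isJSLineTerminator(int c) {
        return c == '\n' || c == '\r' || c == 0x2028 || c == 0x2029;
    }
}
```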
First, checking where the version integers from Context are being used, I can actually find only one case, which is this, from Lexer.java:
/**
 * Create a new JavaScript lexer with the given source code
 *
 * @param js JavaScript code
 */
public Lexer(final String js) {
    this(js, Context.VERSION_DEFAULT);
}

The VERSION_DEFAULT integer is also defined in the Rhino source and simply takes the value 0. In that case, these values should be safe for us to define explicitly.
So I've done some work to recreate the missing code. I've added three new files, all in the extractor/src/main/java/org/schabi/newpipe/extractor/utils/jsextractor directory:
- ObjToIntMap.java
- UniqueTag.java
- Utils.java
public final class UniqueTag implements Serializable {
    private static final int ID_NULL_VALUE = 2;

    public static final UniqueTag NULL_VALUE = new UniqueTag(ID_NULL_VALUE);

    private final int tagId;

    private UniqueTag(int tagId) {
        this.tagId = tagId;
    }
}

Finally the Utils class contains the fields and methods from Context, Kit and ScriptRuntime that are needed for TokenStream. Because these are now in a class called Utils I also had to make minor adjustments to the TokenStream code to use the new names.
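For completeness, ObjToIntMap can in principle be stood in for with a thin wrapper over the standard library, assuming TokenStream only needs simple put/get behaviour. That's an assumption on my part; Rhino's original implements its own hash table, so this is a hypothetical replacement rather than a copy:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal stand-in for Rhino's ObjToIntMap, assuming only
// simple put/get/has usage; not a copy of the original's hash table.
public final class SimpleObjToIntMap {
    private final Map<Object, Integer> map;

    public SimpleObjToIntMap(int expectedSize) {
        map = new HashMap<>(expectedSize);
    }

    public void put(Object key, int value) {
        map.put(key, value);
    }

    public boolean has(Object key) {
        return map.containsKey(key);
    }

    public int getExisting(Object key) {
        Integer value = map.get(key);
        if (value == null) {
            throw new IllegalStateException("key not present: " + key);
        }
        return value;
    }
}
```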
It might end up cleaner if I split these out into three separate files again, but for now this small set of changes will do the trick.
To check that things compile now without the need to reference Rhino, I've also removed Rhino as a dependency from the build configuration.
$ git diff extractor/build.gradle
diff --git a/extractor/build.gradle b/extractor/build.gradle
index 16d34e00..e617e39f 100644
--- a/extractor/build.gradle
+++ b/extractor/build.gradle
@@ -30,7 +30,6 @@ dependencies {
     implementation 'org.jsoup:jsoup:1.17.2'
     implementation "com.google.code.findbugs:jsr305:$jsr305Version"

-    implementation 'org.mozilla:rhino:1.7.15'
     implementation("org.graalvm.polyglot:polyglot:$graalVMVersion")
     implementation("org.graalvm.polyglot:js:$graalVMVersion")

Happily the build completes successfully despite this. And the example still runs.
This seems like a good place to stop for today. We've stripped the code of references to Rhino and replaced it so that GraalJS is used instead. This puts us in a good position to try building and running the code on Sailfish OS tomorrow.
16 Mar 2025 : Day 7 #
Sadly it wasn't possible to get GraalVM working correctly using scratchbox2 yesterday, which means I'll have to continue running builds on my phone. Maybe I'll figure out a better approach in future, but for now this will do the job. Time to move on.
And it feels like we're going to start our adventure properly today as we attempt to get the NewPipe Extractor code working on GraalVM. I'm expecting multiple challenges. First, NewPipe Extractor is set up to be built using Gradle, whereas our existing GraalVM pipeline uses Maven. So there's going to be some work needed either to adjust the Gradle configuration or to switch over to using Maven. Second, GraalVM doesn't support the full feature-set of Java. For example, some of the reflection features aren't supported. I have no idea what features NewPipe Extractor relies on, but these could result in some bumps along the way. Finally NewPipe Extractor is a much larger codebase than the sailing-the-flood-to-java code. So we may well hit memory issues.
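On the reflection point: native-image performs a closed-world analysis at build time, so reflective look-ups like the one sketched below have to be declared in reachability metadata (reflect-config.json and friends) or they fail at run time in the native binary. This snippet is just an illustration of the kind of code that needs such configuration, not something taken from NewPipe:

```java
import java.lang.reflect.Method;

// On a normal JVM this just works; in a native image the reflective method
// look-up must be registered in reflection metadata at build time.
public class ReflectionProbe {
    public static Object callByName(Object target, String methodName) throws Exception {
        Method method = target.getClass().getMethod(methodName);
        return method.invoke(target);
    }

    public static void main(String[] args) throws Exception {
        // "hello".length() invoked reflectively; prints 5 on a normal JVM
        System.out.println(callByName("hello", "length"));
    }
}
```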
Those are the known unknowns. There will surely be unknown unknowns as well!
For the entirety of today I won't be working on-phone, I'll be working on my Linux laptop. Moving to the phone will be for a future post.
As I mentioned, the GraalVM build process we've been using up until now has used Maven. So for the first step today I'm going to attempt to integrate GraalVM's native build process into the Gradle build scripts used for NewPipe Extractor.
I've therefore added the following to the build.gradle files for extractor and appwrapper:
plugins {
    id 'checkstyle'
    id 'org.graalvm.buildtools.native' version '0.10.5'
}

The version number — 0.10.5 — I got from the GraalVM instructions for building native images using Gradle. The value there mirrors the latest version of the native build tools available from the source repository. This should match up with the version of GraalVM we'll be downloading, which is also the latest version:
$ mkdir graalvm
$ pushd graalvm/
~/dev/graalvm ~/dev
$ curl -O https://download.oracle.com/graalvm/23/latest/graalvm-jdk-23_linux-x64_bin.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  361M  100  361M    0     0  4133k      0  0:01:29  0:01:29 --:--:-- 3495k
$ tar --totals -xf graalvm-jdk-23_linux-x64_bin.tar.gz
Total bytes read: 835983360 (798MiB, 106MiB/s)
$ GRAAL=${PWD}/graalvm-jdk-23.0.2+7.1/
$ popd
~/dev

With the plugin added and the GraalVM tools available for use, we can now see there are a bunch of new "native" tasks available when running Gradle:
$ GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./gradlew tasks | grep native
collectReachabilityMetadata - Obtains native reachability metadata for the runtime classpath configuration
nativeCompile - Compiles a native image for the main binary
nativeRun - Executes the main native binary
nativeTestCompile - Compiles a native image for the test binary
nativeTest - Executes the test native binary

That's all very encouraging. I'm always in favour of just trying things out, so why don't we go ahead and just try building a native library and see what happens...
$ GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./gradlew nativeCompile > Task :extractor:nativeCompile [native-image-plugin] GraalVM Toolchain detection is disabled [native-image-plugin] GraalVM location read from environment variable: GRAALVM_HOME [native-image-plugin] Native Image executable path: /home/flypig/dev/graalvm/ graalvm-jdk-23.0.2+7.1/lib/svm/bin/native-image =============================================================================== GraalVM Native Image: Generating 'extractor' (shared library)... =============================================================================== [...] [1/8] Initializing... (4.8s @ 0.08GB) [...] [2/8] Performing analysis... [******] (5.4s @ 0.25GB) [...] [3/8] Building universe... (1.0s @ 0.28GB) [4/8] Parsing methods... [*] (1.7s @ 0.26GB) [5/8] Inlining methods... [***] (0.8s @ 0.30GB) <===========--> 90% EXECUTING [16s] > :extractor:nativeCompile [...] /usr/libexec/gcc/x86_64-linux-gnu/13/collect2 -plugin /usr/libexec/gcc/ x86_64-linux-gnu/13/liblto_plugin.so -plugin-opt=/usr/libexec/gcc/ x86_64-linux-gnu/13/lto-wrapper -plugin-opt=-fresolution=/tmp/ccyIUKnP.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --build-id --eh-frame-hdr -m elf_x86_64 --hash-style=gnu --as-needed -shared -z relro -o ~/dev/extractor/build/ native/nativeCompile/extractor.so -z noexecstack -z text /usr/lib/gcc/ x86_64-linux-gnu/13/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/ x86_64-linux-gnu/13/crtbeginS.o -L/tmp/SVM-11991437499249190113 -L~/dev/ graalvm/graalvm-jdk-23.0.2+7.1/lib/static/linux-amd64/glibc -L~/dev/graalvm/ graalvm-jdk-23.0.2+7.1/lib/svm/clibraries/linux-amd64/glibc -L~/dev/graalvm/ graalvm-jdk-23.0.2+7.1/lib/svm/clibraries/linux-amd64 -L~/dev/graalvm/ graalvm-jdk-23.0.2+7.1/lib/svm/clibraries -L/usr/lib/gcc/x86_64-linux-gnu/ 13 -L/usr/lib/gcc/x86_64-linux-gnu/13/../../../x86_64-linux-gnu -L/usr/lib/ 
gcc/x86_64-linux-gnu/13/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/13/../../.. --gc-sections --version-script /tmp/SVM-11991437499249190113/exported_symbols.list extractor.o ~/dev/graalvm/graalvm-jdk-23.0.2+7.1/lib/svm/clibraries/linux-amd64/glibc/liblibchelper.a ~/dev/graalvm/graalvm-jdk-23.0.2+7.1/lib/static/linux-amd64/glibc/libnet.a ~/dev/graalvm/graalvm-jdk-23.0.2+7.1/lib/static/linux-amd64/glibc/libnio.a ~/dev/graalvm/graalvm-jdk-23.0.2+7.1/lib/static/linux-amd64/glibc/libjava.a ~/dev/graalvm/graalvm-jdk-23.0.2+7.1/lib/svm/clibraries/linux-amd64/glibc/libjvm.a -lz -ldl -lpthread -lrt -lgcc --push-state --as-needed -lgcc_s --pop-state -lc -lgcc --push-state --as-needed -lgcc_s --pop-state /usr/lib/gcc/x86_64-linux-gnu/13/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/13/../../../x86_64-linux-gnu/crtn.o
/usr/bin/ld: cannot find -lz: No such file or directory
collect2: error: ld returned 1 exit status
[...]
BUILD FAILED in 34s
6 actionable tasks: 1 executed, 5 up-to-date

Okay, well, things got a fair way through but seem to have failed at the linker stage. The error says that the linker failed to resolve the -lz flag. This is usually a reference to the zlib compression library. So maybe I just need to install it?
Let's find out.
$ sudo apt install zlib1g-dev
$ GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./gradlew nativeCompile
[...]
[1/8] Initializing...                                            (4.6s @ 0.08GB)
[...]
[2/8] Performing analysis...  [******]                           (4.9s @ 0.27GB)
[...]
[3/8] Building universe...                                       (1.0s @ 0.29GB)
[4/8] Parsing methods...      [*]                                (1.5s @ 0.24GB)
[5/8] Inlining methods...     [***]                              (0.7s @ 0.31GB)
[6/8] Compiling methods...    [****]                            (13.3s @ 0.43GB)
[7/8] Laying out methods...   [*]                                (1.4s @ 0.45GB)
[8/8] Creating image...       [*]                                (1.0s @ 0.50GB)
[...]
Build artifacts:
 ~/dev/extractor/build/native/nativeCompile/extractor.so (shared_library)
 ~/dev/extractor/build/native/nativeCompile/graal_isolate.h (c_header)
 ~/dev/extractor/build/native/nativeCompile/graal_isolate_dynamic.h (c_header)
===============================================================================
Finished generating 'extractor' in 29.4s.
[native-image-plugin] Native Image written to: ~/dev/extractor/build/native/nativeCompile
[...]
BUILD SUCCESSFUL in 31s
6 actionable tasks: 1 executed, 5 up-to-date

Amazing! The thing actually just went ahead and built a native library. Honestly, I'm pretty astonished at how easy this looks to have been. But maybe my excitement is premature? Let's take a look at what actually got built.
$ ls -hl extractor/build/native/nativeCompile/
total 7.2M
-rwxr-xr-x 1 flypig flypig 7.1M Feb 23 12:24 extractor.so
-rw-r--r-- 1 flypig flypig 5.3K Feb 23 12:24 graal_isolate.h
-rw-r--r-- 1 flypig flypig 5.5K Feb 23 12:24 graal_isolate_dynamic.h
-rw-r--r-- 1 flypig flypig  33K Feb 23 12:15 svm_err_b_20250223T121513.601_pid1562.md
-rw-r--r-- 1 flypig flypig  33K Feb 23 12:22 svm_err_b_20250223T122232.035_pid1963.md
$ file extractor/build/native/nativeCompile/extractor.so
extractor/build/native/nativeCompile/extractor.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=21b90538c3a9a3f151a578c48b4469b6cb37ebed, stripped

That looks pretty good actually. The only thing I find a little concerning is that there's no extractor.h header file to build against. This is something I was expecting. We'll return to that shortly, but at this point it's probably also a good idea to run the automated test suite to see whether things are actually still working or not.
$ GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./gradlew nativeTest > Task :extractor:compileTestJava Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. > Task :extractor:test [...] YoutubeStreamExtractorDefaultTest$StreamSegmentsTestTagesschau > testRelatedItems() FAILED org.opentest4j.AssertionFailedError: List of items is empty ==> expected: <false> but was: <true> at app//org.schabi.newpipe.extractor.services.DefaultTests .defaultTestListOfItems(DefaultTests.java:35) at app//org.schabi.newpipe.extractor.services .DefaultStreamExtractorTest.testRelatedItems (DefaultStreamExtractorTest.java:244) at java.base@23.0.2/java.lang.reflect.Method.invoke(Method.java:580) at java.base@23.0.2/java.util.ArrayList.forEach(ArrayList.java:1597) at java.base@23.0.2/java.util.ArrayList.forEach(ArrayList.java:1597) YoutubeStreamExtractorRelatedMixTest > testRelatedItems() FAILED org.opentest4j.AssertionFailedError: Unexpected normal playlist in related items ==> expected: not equal but was: <NORMAL> at app//org.junit.jupiter.api.AssertionFailureBuilder .build(AssertionFailureBuilder.java:152) at app//org.junit.jupiter.api.AssertionFailureBuilder .buildAndThrow(AssertionFailureBuilder.java:132) at app//org.junit.jupiter.api.AssertNotEquals .failEqual(AssertNotEquals.java:277) at app//org.junit.jupiter.api.AssertNotEquals .assertNotEquals(AssertNotEquals.java:263) at app//org.junit.jupiter.api.Assertions .assertNotEquals(Assertions.java:2832) at app//org.schabi.newpipe.extractor.services.youtube.stream .YoutubeStreamExtractorRelatedMixTest.lambda$testRelatedItems$0 (YoutubeStreamExtractorRelatedMixTest.java:95) at java.base@23.0.2/java.util.ArrayList.forEach(ArrayList.java:1597) at app//org.schabi.newpipe.extractor.services.youtube.stream .YoutubeStreamExtractorRelatedMixTest.testRelatedItems 
(YoutubeStreamExtractorRelatedMixTest.java:95)

2264 tests completed, 2 failed, 99 skipped
[...]
BUILD FAILED in 7m 46s
8 actionable tasks: 4 executed, 4 up-to-date

Only two test failures. It would be nice if there were none of course, but when I ran the tests without the native build I got similar results, so this isn't unexpected.
This is just the extractor. If you cast your mind back to Day 4 you may recall I added an appwrapper directory to the repository that contained a small example application. This made use of the library by outputting some info about a YouTube video, along with a URL that allows the video to be downloaded.
Because I also added org.graalvm.buildtools.native as a plugin to the build.gradle for the appwrapper code, it means that this was also run through the native build tooling. Let's take a look at what we got from that.
$ pushd appwrapper/build/native/nativeCompile/
~/dev/appwrapper/build/native/nativeCompile ~/dev
$ gcc -I ./ -L ./ -Wl,-rpath ./ -o main main.c -l:appwrapper.so
$ ls -lh
total 182M
-rw-r--r-- 1 flypig flypig  201 Feb 23 19:10 appwrapper.h
-rwxr-xr-x 1 flypig flypig 182M Feb 23 19:10 appwrapper.so
-rw-r--r-- 1 flypig flypig  225 Feb 23 19:10 appwrapper_dynamic.h
-rw-r--r-- 1 flypig flypig 5.3K Feb 23 19:10 graal_isolate.h
-rw-r--r-- 1 flypig flypig 5.5K Feb 23 19:10 graal_isolate_dynamic.h
drwxr-xr-x 3 flypig flypig 4.0K Feb 23 19:01 resources
$ popd
~/dev

This looks a bit more encouraging: we have an appwrapper.h header file and, if we look inside it, we can see that it exposes a single callable function called run_main():
#include <graal_isolate.h>

#if defined(__cplusplus)
extern "C" {
#endif

int run_main(int argc, char** argv);

#if defined(__cplusplus)
}
#endif

That's encouraging because our appwrapper code does indeed include a main() method. Calling this method should execute our test application, which will be a great way to check whether things are working as expected or not.
In order to test this out, I've written a very simple wrapper application in C that will initialise the native library, call this method and then quit. Here's the code, which I've stored in the file appwrapper/src/main.c.
#include <stdio.h>
#include <stdlib.h>

#include "appwrapper.h"

int main(int argc, char **argv) {
    graal_isolate_t *isolate = NULL;
    graal_isolatethread_t *thread = NULL;

    if (graal_create_isolate(NULL, &isolate, &thread) != 0) {
        fprintf(stderr, "initialization error\n");
        return 1;
    }

    int result = run_main(argc, argv);
    graal_tear_down_isolate(thread);
    return result;
}

Having created this file we can try to build and link it against the appwrapper.so and extractor.so dynamic libraries that contain our native-built Java code.
$ gcc -I appwrapper/build/native/nativeCompile/ \
    -L appwrapper/build/native/nativeCompile/ \
    -L extractor/build/native/nativeCompile/ \
    -Wl,-rpath appwrapper/build/native/nativeCompile/ \
    -Wl,-rpath extractor/build/native/nativeCompile/ \
    -o main appwrapper/src/main.c -l:appwrapper.so -l:extractor.so
$ ls -lh main
-rwxr-xr-x 1 flypig flypig 16K Feb 23 22:06 main
$ file main
main: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=b279e9d824c33b2c1fac1dff7a883a4360ba4c22, for GNU/Linux 3.2.0, not stripped

Here we're calling gcc with our main.c code, but including the directories that contain the header and library files that were generated by the Java native build tools. The result is a 64-bit executable ELF file.
Since it's executable, we should try to execute it.
$ ./main
Initialising
Exception in thread "main" java.lang.ExceptionInInitializerError
    at okhttp3.OkHttpClient.<clinit>(OkHttpClient.java:127)
    at okhttp3.OkHttpClient$Builder.<init>(OkHttpClient.java:475)
    at uk.co.flypig.Main.main(Main.java:40)
    at java.base@23.0.2/java.lang.invoke.LambdaForm$DMH/sa346b79c.invokeStaticInit(LambdaForm$DMH)
Caused by: java.nio.charset.UnsupportedCharsetException: UTF-32BE
    at java.base@23.0.2/java.nio.charset.Charset.forName(Charset.java:559)
    at okhttp3.internal.Util.<clinit>(Util.java:75)
    ... 4 more

Well, this is interesting. The code is being executed. We can see that because the Initialising output is coming from our appwrapper code:
public static void main(final String[] args) {
    System.out.println("Initialising");
    [...]

But then there's an error, indicating that the OkHttpClient that we're using has failed to initialise due to an UnsupportedCharsetException error. This appears to be a known issue and, according to the bug report for it, the solution is to add the -H:+AddAllCharsets flag to the native image generator. In an attempt to fix this, I've added the following snippet to the gradle build configuration in appwrapper/build.gradle:
graalvmNative {
    binaries.all {
        // Avoid java.lang.ExceptionInInitializerError runtime error
        // See https://github.com/oracle/graal/issues/1294
        buildArgs.add('-H:+AddAllCharsets')
    }
}

Let's try rebuilding the libraries, recompiling the main executable and executing the resulting binary again.
$ GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./gradlew nativeCompile
[...]
$ pushd appwrapper/build/native/nativeCompile/
$ gcc -I ./ -L ./ -Wl,-rpath ./ -o main main.c -l:appwrapper.so
$ ./main
Initialising
Downloading video URL: https://www.youtube.com/watch?v=xvFZjo5PgG0
Exception: org.schabi.newpipe.extractor.exceptions.ParsingException: Malformed url: https://www.youtube.com/watch?v=xvFZjo5PgG0
Completed

Well, the previous exception is gone and the code is getting a little further now, but it seems we have a new exception to deal with: a ParsingException, apparently due to a malformed URL. But the URL isn't malformed. The good news is that there's only one place in the code where this error string appears:
$ grep -rIn "Malformed url" *
extractor/src/main/java/org/schabi/newpipe/extractor/utils/Utils.java:235:            throw new ParsingException("Malformed url: " + url, e);

We can use this to our advantage: with a small adjustment here we can find out what the underlying error message actually is. Here's the change I've made:
$ git diff extractor/src/main/java/org/schabi/newpipe/extractor/utils/Utils.java
diff --git a/Utils.java b/Utils.java
index c061ce30..815015ff 100644
--- a/extractor/src/main/java/org/schabi/newpipe/extractor/utils/Utils.java
+++ b/extractor/src/main/java/org/schabi/newpipe/extractor/utils/Utils.java
@@ -232,7 +232,7 @@ public final class Utils {
                 return message.substring("unknown protocol: ".length());
             }
-            throw new ParsingException("Malformed url: " + url, e);
+            throw new ParsingException(e.getMessage(), e);
         }
     }

After rebuilding and executing the code again, we now get a much more helpful and explanatory exception message when running our test application:
Exception: org.schabi.newpipe.extractor.exceptions.ParsingException: Accessing a URL protocol that was not enabled. The URL protocol https is supported but not enabled by default. It must be enabled by adding the --enable-url-protocols=https option to the native-image command.

It seems that although they can be made available, GraalVM disables both the HTTP and HTTPS protocols by default. The rationale for this is to keep the output binaries as small as possible by only enabling the things that are really needed.
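To make the failure mode concrete, here's a small illustrative sketch (ProtocolCheck is my own name, not part of the project) of the plain java.net.URL construction the flag protects. On a regular JVM the https handler is always registered, so this succeeds; in a GraalVM native image built without --enable-url-protocols=https, the handler is left out and the construction is what fails at runtime.

```java
import java.net.MalformedURLException;
import java.net.URI;

public class ProtocolCheck {
    public static String protocolOf(final String spec) {
        try {
            // Looks up the registered protocol handler for the URL's scheme
            return URI.create(spec).toURL().getProtocol();
        } catch (MalformedURLException e) {
            // In a native image without the flag, we'd end up here
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(protocolOf("https://www.youtube.com/watch?v=xvFZjo5PgG0"));
    }
}
```

The Extractor never calls anything unusual; it's the ordinary URL machinery underneath that the native image strips out unless asked not to.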
Well, for our purposes we really are going to need that HTTPS protocol. It's not so clear whether we'll need HTTP, but to avoid us hitting that problem in the future I'm going to enable it as well. I've done that by adding the relevant flags to the configuration from earlier:
graalvmNative {
    binaries.all {
        // Avoid java.lang.ExceptionInInitializerError runtime error
        // See https://github.com/oracle/graal/issues/1294
        buildArgs.add('-H:+AddAllCharsets')

        // Enable network protocols
        buildArgs.add('--enable-url-protocols=http')
        buildArgs.add('--enable-url-protocols=https')
    }
}

Time to rebuild the Java code and recompile the C code again.
$ GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./gradlew nativeCompile
[...]
$ gcc -I appwrapper/build/native/nativeCompile/ \
    -L appwrapper/build/native/nativeCompile/ \
    -L extractor/build/native/nativeCompile/ \
    -Wl,-rpath appwrapper/build/native/nativeCompile/ \
    -Wl,-rpath extractor/build/native/nativeCompile/ \
    -o main appwrapper/src/main.c -l:appwrapper.so -l:extractor.so
$ ./main
Initialising
Downloading video URL: https://www.youtube.com/watch?v=xvFZjo5PgG0
Video name: Rick Roll (Different link + no ads)
Uploader: Duran
Category: Entertainment
Likes: 167725
Views: 16915531
Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.mozilla.javascript.VMBridge
    at org.mozilla.javascript.Context.exit(Context.java:482)
    at org.schabi.newpipe.extractor.utils.JavaScript.compileOrThrow(JavaScript.java:20)
    at org.schabi.newpipe.extractor.services.youtube.YoutubeSignatureUtils.getDeobfuscationCode(YoutubeSignatureUtils.java:88)
    at org.schabi.newpipe.extractor.services.youtube.YoutubeJavaScriptPlayerManager.deobfuscateSignature(YoutubeJavaScriptPlayerManager.java:145)
    at org.schabi.newpipe.extractor.services.youtube.extractors.YoutubeStreamExtractor.buildAndAddItagInfoToList(YoutubeStreamExtractor.java:1387)
    at org.schabi.newpipe.extractor.services.youtube.extractors.YoutubeStreamExtractor.lambda$getStreamsFromStreamingDataKey$15(YoutubeStreamExtractor.java:1360)
    [...]
    at java.base@23.0.2/java.util.stream.ReferencePipeline.forEachOrdered(ReferencePipeline.java:641)
    at org.schabi.newpipe.extractor.services.youtube.extractors.YoutubeStreamExtractor.getItags(YoutubeStreamExtractor.java:1215)
    at org.schabi.newpipe.extractor.services.youtube.extractors.YoutubeStreamExtractor.getVideoStreams(YoutubeStreamExtractor.java:657)
    at uk.co.flypig.Main.main(Main.java:57)
    at java.base@23.0.2/java.lang.invoke.LambdaForm$DMH/sa346b79c.invokeStaticInit(LambdaForm$DMH)

Argh! Yet another exception. So far, all of these have been unknown unknowns. But at least we're making progress: execution is now getting a lot further than it did before, and the video name, uploader, category, likes and views have all been correctly extracted. However, the all-important URL for downloading a copy of the video isn't being output.
We seem to be hitting a NoClassDefFoundError before that can happen.
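As an aside, the "Could not initialize class" wording of this NoClassDefFoundError tells us the static initialiser of VMBridge already failed once earlier in the run: after that first failure the JVM marks the class as erroneous, and every later use fails with this terser error. A minimal reproduction, using a deliberately fragile class of my own (none of these names are from the project):

```java
public class InitFailureDemo {
    static class Fragile {
        static {
            // Deliberately fail class initialisation
            if (true) {
                throw new RuntimeException("static initialiser failed");
            }
        }
        static int value = 42; // non-final, so access triggers initialisation
    }

    static String access() {
        try {
            return "ok: " + Fragile.value;
        } catch (LinkageError e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // First access reports ExceptionInInitializerError;
        // every access after that reports NoClassDefFoundError.
        System.out.println(access());
        System.out.println(access());
    }
}
```

So the real cause of the Rhino failure happened during VMBridge's static initialisation, somewhere before this stacktrace was printed.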
The missing class is org.mozilla.javascript.VMBridge. In a previous life I did a decent amount of work with Java, but that was well over a decade ago and I'm not especially familiar with the Java landscape as it exists now. But a bit of digging around on the Web uncovers the fact that this is part of Mozilla's Rhino: a JavaScript engine written in Java. Although Rhino is still being developed, it seems that GraalVM has chosen a different approach, relying instead on its own JavaScript interpreter, part of its Polyglot approach to interoperability which allows a whole host of languages to play nicely together.
Having now read up a bit more on this, it seems there are potentially two approaches here. I could try to get Rhino integrated with the build, or I could amend the code to use Polyglot instead. Since it looks like Rhino may not be supported at all with GraalVM, I've decided to give the latter a go.
As we can see if we follow the exception stacktrace above, the problematic code is in the JavaScript.java file, which is part of the NewPipe Extractor code. So I've made the following changes to it, which essentially replace the calls that use Rhino with calls to Polyglot instead.
$ git diff extractor/src/main/java/org/schabi/newpipe/extractor/utils/JavaScript.java
diff --git a/JavaScript.java b/JavaScript.java
index ab30ed80..ee0468a7 100644
--- a/extractor/src/main/java/org/schabi/newpipe/extractor/utils/JavaScript.java
+++ b/extractor/src/main/java/org/schabi/newpipe/extractor/utils/JavaScript.java
@@ -1,8 +1,8 @@
 package org.schabi.newpipe.extractor.utils;

-import org.mozilla.javascript.Context;
-import org.mozilla.javascript.Function;
-import org.mozilla.javascript.ScriptableObject;
+import org.graalvm.polyglot.Context;
+import org.graalvm.polyglot.Source;
+import org.graalvm.polyglot.Value;

 public final class JavaScript {

@@ -10,31 +10,28 @@ public final class JavaScript {
     }

     public static void compileOrThrow(final String function) {
+        Value value;
+        final Context context = Context.create();
         try {
-            final Context context = Context.enter();
-            context.setOptimizationLevel(-1);
-
-            // If it doesn't compile it throws an exception here
-            context.compileString(function, null, 1, null);
+            final Source source = Source.create("js", function);
+            value = context.parse(source);
         } finally {
-            Context.exit();
+            context.close(true);
         }
     }

     public static String run(final String function,
                              final String functionName,
                              final String... parameters) {
+        final Context context = Context.create();
         try {
-            final Context context = Context.enter();
-            context.setOptimizationLevel(-1);
-            final ScriptableObject scope = context.initSafeStandardObjects();
+            final Source source = Source.create("js", function);
+            final Value value = context.eval("js", function);

-            context.evaluateString(scope, function, functionName, 1, null);
-            final Function jsFunction = (Function) scope.get(functionName, scope);
-            final Object result = jsFunction.call(context, scope, scope, parameters);
+            Value result = value.execute((Object) parameters);
             return result.toString();
         } finally {
-            Context.exit();
+            context.close(true);
         }
     }

There are actually fewer changes here than I'd feared might be necessary. If this works, I think it'll make for a nice solution, given that GraalVM seems to favour Polyglot. So what happens when we rebuild and execute?
$ GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./gradlew nativeCompile
[...]
BUILD SUCCESSFUL in 5m 13s
10 actionable tasks: 6 executed, 4 up-to-date
$ gcc -I appwrapper/build/native/nativeCompile/ \
    -L appwrapper/build/native/nativeCompile/ \
    -L extractor/build/native/nativeCompile/ \
    -Wl,-rpath appwrapper/build/native/nativeCompile/ \
    -Wl,-rpath extractor/build/native/nativeCompile/ \
    -o main appwrapper/src/main.c -l:appwrapper.so -l:extractor.so
$ ./main
Initialising
Downloading video URL: https://www.youtube.com/watch?v=xvFZjo5PgG0
Video name: Rick Roll (Different link + no ads)
Uploader: Duran
Category: Entertainment
Likes: 167747
Views: 16918500
Content: https://rr1---sn-1xopouxgoxu-aigl.googlevideo.com/videoplayback?expire=1740359515&ei=-3K7Z7avGr_ep-oPzcOw6QM&ip=62.3.65.133&id=o-AIiqvBpexPfeF_OtovTgn8l1hpOBffKxoHfspDOTcYrC&itag=18&source=youtube&requiressl=yes&xpc=EgVo2aDSNQ%3D%3D&met=1740337915%2C&mh=bl&mm=31%2C29&mn=sn-1xopouxgoxu-aigl%2Csn-aigl6ned&ms=au%2Crdu&mv=m&mvi=1&pl=22&rms=au%2Cau&initcwndbps=3486250&bui=AUWDL3x9n8km503f_C6UeJ-nmAK-f2JXxFo3iHX8HgXWUrBxXlsxrKr63R0LnSqXd7G4MsxTbK3rHtSa&spc=RjZbSTMxxHkVycQ394WOmaNPcAiMR28iW5oscKuzyUPoVk6lOQ&vprv=1&svpuc=1&mime=video%2Fmp4&rqh=1&cnr=14&ratebypass=yes&dur=7.685&lmt=1708738867597515&mt=1740337503&fvip=4&fexp=51326932&c=ANDROID&txp=4530434&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cxpc%2Cbui%2Cspc%2Cvprv%2Csvpuc%2Cmime%2Crqh%2Ccnr%2Cratebypass%2Cdur%2Clmt&sig=AJfQdSswRgIhANW5FqNuItTPiGSvgBHqzpa0UGLBegd9wUYd8-yHjH49AiEAmf8RsXSBT_Z4EpLKY7Mx6APBVjdK80dKnSodnJHKdn4%3D&lsparams=met%2Cmh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Crms%2Cinitcwndbps&lsig=AGluJ3MwRQIgajuyXNfNyrHYnh6kS5ZbH47Hj8MO_-3vRauDTqz8nKYCIQDIHsgn6W7JBb6GEo7Wr5qhLwk9Lx1FJW_r6UeENOWE-w%3D%3D&cpn=nQp7YGHcYfwig4NE
Completed

And there it is! The example application, compiled from C code, linked to the native Java code, is producing exactly the result it should be!
This is a good place to stop for today, but before signing off, I'm going to spend just a little time reflecting on where we've got to and what still needs to be done.
There are a few takeaways. First is that building native binaries using Gradle turns out to be pretty straightforward, but with a few easy-to-hit gotchas, including the fact that HTTP and HTTPS requests are disabled by default.
The other takeaway is that we have to switch out the Rhino JavaScript interpreter for the JavaScript interpreter provided by Polyglot. We did this for the one place it was used for our test application, but a quick grep suggests it's used in several other places as well:
$ grep -rIn "org.mozilla.javascript" * --include="*.java"
extractor/src/main/java/org/schabi/newpipe/extractor/utils/jsextractor/TokenStream.java:3:import org.mozilla.javascript.Context;
extractor/src/main/java/org/schabi/newpipe/extractor/utils/jsextractor/TokenStream.java:4:import org.mozilla.javascript.Kit;
extractor/src/main/java/org/schabi/newpipe/extractor/utils/jsextractor/TokenStream.java:5:import org.mozilla.javascript.ObjToIntMap;
extractor/src/main/java/org/schabi/newpipe/extractor/utils/jsextractor/TokenStream.java:6:import org.mozilla.javascript.ScriptRuntime;
extractor/src/main/java/org/schabi/newpipe/extractor/utils/jsextractor/TokenStream.java:9:/* Source: Mozilla Rhino, org.mozilla.javascript.Token
extractor/src/main/java/org/schabi/newpipe/extractor/utils/jsextractor/Lexer.java:3:import org.mozilla.javascript.Context;

Nevertheless, when we run the test application it's providing relevant metadata associated with a YouTube video and offering up a URL from which we're able to download the video. I find this really encouraging, because it suggests that it should be possible to get things to work properly when the Extractor is built into a native binary for Sailfish OS.
Up until this point that wasn't clear; now it looks a lot more promising.
So where next? Tomorrow I'll need to review the Extractor code again and remove all references to Rhino. That means taking a look at the TokenStream.java code to see whether we can switch out Rhino for Polyglot. Once I've done that, I'll then switch back to Sailfish OS to see whether we can get all this working there.
Assuming we can, the next step after that will be figuring out how to expose the Extractor functionality in a form that we can use from our Sailfish OS app. That'll require a lot of thought and work, but at least by that point I'll no longer be scrabbling in the dark. By then, we'll know whether the result we want is possible or not.
15 Mar 2025 : Day 6 #
Yesterday we looked at the code needed to expose the Java internals of NewPipe Extractor to C++ when using GraalVM in a way that would allow us to pass richer datatypes and structures between them.
Today we're taking another slight detour, this time to find out whether it's possible to build the Java library using the Sailfish SDK, rather than on the phone itself. You'll recall we looked at building the library on-phone on Day 4.
Before getting in to things I'm going to update my SDK. I want the absolute latest tooling and targets to maximise the chances of success.
$ ./SDKMaintenanceTool --silentUpdate -v
IFW Version: 3.2.3, built with Qt 5.15.14.
Build date: Aug 21 2024
Installer Framework SHA1: 7699eb32
[0] Language: en-GB
[...]
[1231] Install size: 3 components
[...]
[144072] 100% ...
[...]
[239236] Sync completed
[245881] Target 'SailfishOS-5.0.0.55EA-aarch64' set up
[245899] Done
[...]
[246115] Stopping the build engine… (this may take some time)
[248920] Components updated successfully.

Great! That got all of the updates. Let's see what we have.
$ sfdk tools target list
sfdk: [I] Starting the build engine…
SailfishOS-3.4.0.24-aarch64    sdk-provided
SailfishOS-3.4.0.24-armv7hl    sdk-provided
SailfishOS-3.4.0.24-i486       sdk-provided
SailfishOS-4.6.0.13-aarch64    sdk-provided,latest
SailfishOS-4.6.0.13-armv7hl    sdk-provided,latest
SailfishOS-4.6.0.13-i486       sdk-provided,latest
SailfishOS-5.0.0.55EA-aarch64  sdk-provided,early-access

I'll be using the new SailfishOS-5.0.0.55EA-aarch64 target, so the next step is to configure the SDK to use this by default.
$ sfdk config --global target=SailfishOS-5.0.0.55EA-aarch64
$ sfdk config
# ---- command scope ---------
# <clear>

# ---- session scope ---------
# <clear>

# ---- global scope ---------
target = SailfishOS-5.0.0.55EA-aarch64
output-prefix = ~/RPMS
device = kolbe

Next I need to install the GraalVM tooling inside the SDK. There are two layers to the SDK: first of all you enter the build engine using the sfdk command, following which you enter the scratchbox2 target using the sb2 command.
Even though I've set a default target, when entering scratchbox2 manually like this, we also need to specify the target explicitly.
$ sfdk engine exec
$ sb2 -t SailfishOS-5.0.0.55EA-aarch64
$ ls
compile.sh  flatbuffers.sh  mvnw  pom.xml  src
$ mkdir graalvm
$ pushd graalvm/
graalvm sailing-the-flood-to-java/java-part
$ curl -O https://download.oracle.com/graalvm/23/latest/graalvm-jdk-23_linux-aarch64_bin.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  334M  100  334M    0     0  4245k      0  0:01:20  0:01:20 --:--:-- 4457k
$ tar -xf graalvm-jdk-23_linux-aarch64_bin.tar.gz
$ GRAAL=${PWD}/graalvm-jdk-23.0.2+7.1
$ mkdir m2
$ MAVENLOCAL=${PWD}/m2
$ popd
sailing-the-flood-to-java/java-part

That's the tooling set up, now let's try building as if we were running the command on a phone:
$ GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./mvnw -Dmaven.repo.local=$MAVENLOCAL \
    clean install -Pnative
Error occurred during initialization of VM
Could not reserve enough space for 8118272KB object heap

This isn't totally unexpected: the build hit a limit while reserving memory for the job. The problem here isn't Maven, our build tool, but rather the Java Virtual Machine that it's attempting to spawn. So for testing purposes we can jump straight to the JVM and see if we have better luck calling it directly.
$ GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL graalvm/graalvm-jdk-23.0.2+7.1/bin/java \
    --help
Error occurred during initialization of VM
Could not reserve enough space for 8118272KB object heap

The same error. Thankfully there are some ways to control the amount of memory that the JVM will try to claim. Unfortunately we can't output the help text using java --help inside scratchbox2 (the SDK engine target) itself, because apparently the JVM has to get up and running before it'll even print out the help.
So instead I'm going to output it on my laptop with a JVM installed, outside of scratchbox2. Here are the relevant parts of the help text output:
$ java --help-extra | grep size
    -Xmn<size>    sets the initial and maximum size (in bytes) of the heap
    -Xms<size>    set initial Java heap size
    -Xmx<size>    set maximum Java heap size
    -Xss<size>    set java thread stack size

On top of these we also need to disable Compressed Class Space using the -XX:-UseCompressedClassPointers and -XX:-UseCompressedOops flags. These aren't listed in the Java man pages; nor are they listed in either the --help or --help-extra output. But without these we simply get an Out-Of-Memory error:
$ JVMMEM=512m _JAVA_OPTIONS="-Xmn${JVMMEM} -Xms${JVMMEM} -Xmx${JVMMEM} \
    -Xss${JVMMEM}" GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./mvnw \
    -Dmaven.repo.local=$MAVENLOCAL clean install -Pnative
Picked up _JAVA_OPTIONS: -Xmn512m -Xms512m -Xmx512m -Xss512m
Error occurred during initialization of VM
Could not allocate compressed class space: 1073741824 bytes

Notice here that I'm setting the flags using _JAVA_OPTIONS. That way they get picked up automatically by every JVM invocation, even though the JVM commands are being called by Maven. Notice also that we get some output stating that the arguments have been picked up, so we know this approach is working.
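A quick way to double-check which heap limit a JVM run actually picked up is to ask the runtime itself. This tiny probe (my own illustrative snippet, not part of the project) reports the maximum heap; run it with -Xmx or via _JAVA_OPTIONS and compare the figure against what was requested:

```java
public class HeapProbe {
    // Maximum heap the JVM will attempt to use, in MiB
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

For example, launching it with _JAVA_OPTIONS="-Xmx512m" should report a figure close to 512 MiB.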
With the compressed class space flags added we get some slightly more nuanced output. For example if we use a memory value that's too low we get a segmentation fault:
$ JVMMEM=128m _JAVA_OPTIONS="-Xmn${JVMMEM} -Xms${JVMMEM} -Xmx${JVMMEM} \
    -Xss${JVMMEM} -XX:-UseCompressedClassPointers -XX:-UseCompressedOops" \
    GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./mvnw -Dmaven.repo.local=$MAVENLOCAL \
    clean install -Pnative
Picked up _JAVA_OPTIONS: -Xmn128m -Xms128m -Xmx128m -Xss128m -XX:-UseCompressedClassPointers -XX:-UseCompressedOops
Segmentation fault (core dumped)

Raise it higher, to around 768 MiB, and it looks like the JVM is managing to get further through its initialisation sequence:
$ JVMMEM=768m _JAVA_OPTIONS="-Xmn${JVMMEM} -Xms${JVMMEM} -Xmx${JVMMEM} \
    -Xss${JVMMEM} -XX:-UseCompressedClassPointers -XX:-UseCompressedOops" \
    GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./mvnw -Dmaven.repo.local=$MAVENLOCAL \
    clean install -Pnative
Picked up _JAVA_OPTIONS: -Xmn768m -Xms768m -Xmx768m -Xss768m -XX:-UseCompressedClassPointers -XX:-UseCompressedOops
[0.773s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 786432k, guardsize: 0k, detached.
[0.782s][warning][os,thread] Failed to start the native thread for java.lang.Thread "Finalizer"
Error occurred during initialization of VM
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
    at java.lang.Thread.start0(java.base/Native Method)
    at java.lang.Thread.start(java.base/Thread.java:1518)
    at java.lang.ref.Finalizer.startFinalizerThread(java.base/Finalizer.java:190)
    at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:319)
    at java.lang.System.initPhase1(java.base/System.java:2214)

This error seems to happen when the JVM fails to start threads, but I'm not able to fix this by increasing the thread limit using ulimit -u. So I suspect this may actually be memory related as well.
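One observation worth flagging: the stacksize reported in the pthread_create warning, 786432k, is exactly the 768 MiB we're passing via -Xss, and -Xss (as the help text above says) sets the stack size of each thread individually rather than a shared total. So every new thread is asking for a 768 MiB stack, which may itself be contributing to the thread-creation failures. Java exposes the same per-thread knob on the Thread constructor; this illustrative sketch (my own code, not the project's) shows it:

```java
public class StackSizeDemo {
    // Create a thread with an explicit per-thread stack size hint,
    // the programmatic equivalent of -Xss for a single thread.
    public static Thread withStack(final Runnable task, final long stackBytes) {
        return new Thread(null, task, "big-stack", stackBytes);
    }

    public static void main(String[] args) throws InterruptedException {
        // A modest 1 MiB stack starts fine; asking for hundreds of MiB
        // per thread is when pthread_create can refuse, as in the log.
        final Thread t = withStack(() -> System.out.println("thread ran"), 1 << 20);
        t.start();
        t.join();
    }
}
```

If the stack size is the culprit, keeping -Xss small (or omitting it) while raising only -Xmx might behave differently, though I haven't confirmed that here.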
Increasing the memory to nearly a gigabyte leaves us with similar output:
$ JVMMEM=1020m _JAVA_OPTIONS="-Xmn${JVMMEM} -Xms${JVMMEM} -Xmx${JVMMEM} \
    -Xss${JVMMEM} -XX:-UseCompressedClassPointers -XX:-UseCompressedOops" \
    GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./mvnw -Dmaven.repo.local=$MAVENLOCAL \
    clean install -Pnative
Picked up _JAVA_OPTIONS: -Xmn1020m -Xms1020m -Xmx1020m -Xss1020m -XX:-UseCompressedClassPointers -XX:-UseCompressedOops
[0.769s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1044480k, guardsize: 0k, detached.
[0.777s][warning][os,thread] Failed to start the native thread for java.lang.Thread "Reference Handler"
Error occurred during initialization of VM
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
    at java.lang.Thread.start0(java.base/Native Method)
    at java.lang.Thread.start(java.base/Thread.java:1518)
    at java.lang.ref.Reference.startReferenceHandlerThread(java.base/Reference.java:306)
    at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:318)
    at java.lang.System.initPhase1(java.base/System.java:2214)

But as the configuration reaches a gigabyte, the original error returns:
$ JVMMEM=1021m _JAVA_OPTIONS="-Xmn${JVMMEM} -Xms${JVMMEM} -Xmx${JVMMEM} \
    -Xss${JVMMEM} -XX:-UseCompressedClassPointers -XX:-UseCompressedOops" \
    GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL ./mvnw -Dmaven.repo.local=$MAVENLOCAL \
    clean install -Pnative
Picked up _JAVA_OPTIONS: -Xmn1021m -Xms1021m -Xmx1021m -Xss1021m -XX:-UseCompressedClassPointers -XX:-UseCompressedOops
Error occurred during initialization of VM
Could not reserve enough space for 1046528KB object heap

I also tried performing similar actions using java directly rather than going via Maven, but with very similar results:
$ JVMMEM=1020m _JAVA_OPTIONS="-Xmn${JVMMEM} -Xms${JVMMEM} -Xmx${JVMMEM} \
    -Xss${JVMMEM} -XX:-UseCompressedClassPointers -XX:-UseCompressedOops" \
    GRAALVM_HOME=$GRAAL JAVA_HOME=$GRAAL \
    graalvm/graalvm-jdk-23.0.2+7.1/bin/java --help
Picked up _JAVA_OPTIONS: -Xmn1020m -Xms1020m -Xmx1020m -Xss1020m -XX:-UseCompressedClassPointers -XX:-UseCompressedOops
[0.796s][warning][os,thread] Failed to start thread "Unknown thread" - pthread_create failed (EAGAIN) for attributes: stacksize: 1044480k, guardsize: 0k, detached.
[0.805s][warning][os,thread] Failed to start the native thread for java.lang.Thread "Finalizer"
Error occurred during initialization of VM
java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
    at java.lang.Thread.start0(java.base/Native Method)
    at java.lang.Thread.start(java.base/Thread.java:1518)
    at java.lang.ref.Finalizer.startFinalizerThread(java.base/Finalizer.java:190)
    at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:319)
    at java.lang.System.initPhase1(java.base/System.java:2214)

This is all rather sad. When using scratchbox2 the JVM is being executed inside a QEMU container. It's possible there's a 32-bit/64-bit problem here, or that there's some deeper incompatibility (it has been known for things to simply fail when run within QEMU). I'd really hoped this might have changed or been fixed since I last tried back in 2022, but sadly that's not the case. What's more, I'm not convinced I have the skill or knowledge to fix it myself.
Maybe as we continue our journey a new path will open up, maybe someone out there has an idea for something to try (in which case, please do let me know!), or maybe I'll muster up the courage to try to better understand and fix the underlying issue. But for now, it means reverting to our fallback of executing GraalVM on the phone.
That's it for today. Tomorrow we move into new territory: attempting to get NewPipe Extractor built using GraalVM!