Hermes#176
Conversation
Hermes's lib/InternalJavaScript/CMakeLists.txt attaches the generated InternalBytecode.js to two downstream targets (hermesInternalUnit_obj and InternalBytecodeInclude) with no common dependency. The Xcode ew build system rejects this with is attached to multiple targets ... but none of these is a common dependency of the other(s), breaking the macOS_Xcode264_Hermes CMake generate step. Switch to the Ninja generator only when js-engine == 'Hermes' so the rest of the macOS matrix keeps using Xcode. Ninja has no such restriction. xcode-select still pins clang and the SDK to the chosen Xcode toolchain. Adjust the test working-directory accordingly (Ninja is single-config and drops binaries directly under the target dir). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
These flags are MSVC-only; clang/gcc parse them as missing source-file paths and fail the env_hermes.cc compile on macOS with 'clang++: error: no such file or directory: /wd4068' (and the rest of the wd list). On non-MSVC compilers clang and gcc accept the offending compound- literal syntax in Hermes's sh_legacy_value.h as a GNU extension under -std=gnu++20, so no suppression is needed there. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On macOS the Xcode generator's Link Frameworks Automatically setting silently pulls in Foundation/CoreFoundation for .mm/.m TUs. The Ninja generator we now use for the Hermes macOS job has no such mechanism, so AppRuntime_macOS.mm's NSLog and UrlLib's UrlRequest_Apple.mm reference to NSBundle resolve to undefined symbols at the final link step:
Undefined symbols for architecture arm64:
_NSLog, referenced from ... AppRuntime_macOS.mm.o
_OBJC_CLASS_\, referenced from ... UrlRequest_Apple.mm.o
Declare the frameworks explicitly and PUBLIC so they end up on the executable's link line for every generator, while not regressing the Xcode-generator builds (specifying the framework twice is a no-op).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Hermes ships HERMES_SLOW_DEBUG=ON by default. Those slow asserts compile in even for RelWithDebInfo / Release builds and fire during the Runtime destructor finalizer chain on macOS arm64, killing the macOS_Xcode264_Hermes job with exit code 134 right after 145 passing and before any gtest [OK] line. Disable the slow checks for our embedded use. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Hermes's Runtime destructor runs the full finalizer chain (cleanup hooks, persistent-reference finalizers, instance-data finalizer, and heap finalizeAll()), invoking arbitrary user-provided C++ code from napi_wrap / napi_add_finalizer / etc. If any of that code throws, the exception escapes Runtime::~Runtime (its destructor is not noexcept) and propagates out of Napi::Detach into the AppRuntime worker thread. The thread function has no catch, so C++ calls std::terminate -> abort(), killing the process with SIGABRT silently. Observed on macOS arm64 CI as 'Abort trap: 6' immediately after '145 passing', with zero stderr diagnostic (hermes_fatal isn't involved). The Windows builds happen to swallow this internally and exit cleanly. Wrap state.runtime.reset() in a try/catch that logs to stderr instead of propagating. By the time Detach runs the tests have already reported their results; re-raising would just re-trigger the abort and leave CI red despite a fully successful test run. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Same root-cause class as the Detach try/catch in 2b68f36: Hermes's Runtime::drainJobs() synchronously runs whatever callbacks were registered via addDrainJobsCallback. Our hermesNapi build registers drainPendingFinalizers, which invokes user napi_wrap / napi_add_finalizer C++ callbacks. If one of those throws, the exception escapes drainJobs into AppRuntime::Dispatch's call site (DrainMicrotasks lives OUTSIDE the dispatch try/catch by design — promise continuations need to run regardless of whether the callback threw), and from there out of the worker thread function, triggering std::terminate -> abort() with no diagnostic. The macOS_Xcode264_Hermes CI run kept SIGABRTing after 145 passing even with the Detach catch in place, which means the throw happens before Detach — most plausibly in the very last DrainMicrotasks call (after the AppRuntime destructor pushed the no-op work item and the worker drained pending finalizers on its way out). Wrap runtime->drainJobs() in try/catch with a stderr log so the worker can exit cleanly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The macOS_Xcode264_Hermes job has been failing with Abort trap: 6 (SIGABRT, exit code 134) right after 145 passing, with zero stderr diagnostic. My defensive try/catch wrappers around Hermes Runtime teardown (Detach) and drainJobs in the previous two commits did not catch anything, which suggests this is either a raw abort() call from inside Hermes (not a C++ throw) or a std::terminate from somewhere outside the paths I instrumented. Wrap UnitTests in lldb -b -o run -o 'thread backtrace all' for the Hermes macOS path only, so the next CI run leaves a real backtrace in the log. The exit code is parsed back out of lldb's summary line so the job still fails on real test failures. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…corruption
Captured a real backtrace from CI by wrapping the macOS Hermes UnitTests run in lldb in the previous commit. The smoking gun is macOS libmalloc:
UnitTests(25540,0x16fe87000) malloc: *** error for object
0x9cfbcb930: pointer being freed was not allocated
malloc: *** set a breakpoint in malloc_error_break to debug
Process 25540 stopped
* thread BabylonJS#2, stop reason = signal SIGABRT
Heap corruption during Runtime teardown, not a C++ throw (hence why my Detach / drainJobs try/catch additions never fired).
Hermes's hermes_run_script has two ingest paths for the source buffer: a zero-copy path when the last byte is '\\0' (wraps the buffer in WeirdZeroTerminatedBuffer and keeps the callback alive until the owning BCProvider/RuntimeModule is destroyed, often at ~Runtime), and a copy path otherwise (Hermes copies into its own StdStringBuffer and synchronously invokes the finalize callback before hermes_run_script returns).
The zero-copy path is the one tripping the abort on macOS arm64: my new uint8_t[] pointer lives long enough to interact with whatever teardown sequence corrupts it. Switch Eval to the copy path by passing length (no trailing '\\0'). The buffer now lives only for the duration of the synchronous call and the lifetime question disappears. Plenty cheap for the Mocha bundle (~1 MiB).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous lldb wrapping only printed frame #0 because the inferior's SIGABRT was passing straight through to the kernel and killing the process before the queued thread backtrace all command could run. Add process handle SIGABRT --pass false --stop true --notify true so lldb intercepts the signal, lets us dump every thread, then continues so the inferior exits with the original signal status. Also process status --verbose for the per-thread state, in case the backtrace alone isn't enough to identify the bad free site (the heap corruption message comes with an address but no callsite, since macOS libmalloc only logs to stderr before raising). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Queued -o commands after un were never executed once SIGABRT killed the inferior — even with process handle --pass false. lldb's -k flag is specifically designed to fire on crash and is the only reliable way to get bt all out of batch mode in this scenario. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Captured a full backtrace via lldb -k (see prior CI commit). The smoking gun: frame BabylonJS#5: ___BUG_IN_CLIENT_OF_LIBMALLOC_POINTER_BEING_FREED_WAS_NOT_ALLOCATED frame BabylonJS#6: napi_delete_reference at hermes_napi_reference.cpp:113 frame BabylonJS#7: Napi::Reference<...>::~Reference at napi-inl.h:3262 frame BabylonJS#8: Napi::ObjectReference::~ObjectReference frame BabylonJS#10: Babylon::Polyfills::Internal::URL::~URL at URL.h:10 frame BabylonJS#12: Napi::ObjectWrap<...URL>::FinalizeCallback at napi-inl.h:4963 frame BabylonJS#13: napi_env__::shutdown at hermes_napi.cpp:214 frame BabylonJS#16: hermes::vm::Runtime::~Runtime Root cause: Hermes's napi_env__::shutdown() iterates refListHead_ and delete ref; one at a time. It only sets ref->deletionPending_ on the *current* ref before its finalize_cb fires. If the finalizer transitively destroys a node-addon-api wrapper (Napi::Reference / Napi::ObjectReference) whose underlying napi_ref was already deleted earlier in the same loop, napi_delete_reference reads ref->deletionPending_ from freed memory and proceeds to delete ref again -> double-free. The exact path: URL (an ObjectWrap subclass) has a Napi::ObjectReference member m_searchParamsReference. addReference prepends to the linked list, so m_searchParamsReference's ref is processed BEFORE URL's wrap ref. When URL's wrap finalizer runs delete this, ~URL destroys m_searchParamsReference, whose destructor calls napi_delete_reference on the already-freed sibling ref. macOS libmalloc detects this (malloc: *** error for object 0x...: pointer being freed was not allocated -> SIGABRT). Linux glibc and Windows CRT happen to miss it. Fix: PATCH Hermes shutdown() to mark ALL refs deletionPending in a pre-pass BEFORE iterating. Apply as a FetchContent PATCH_COMMAND via the new ApplyPatchIfNeeded.cmake helper (idempotent — uses git apply --check --reverse to detect already-applied state across reconfigure). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cuts the matrix from ~24 jobs down to 4 (Win32_x64_Hermes, Android_Hermes, macOS_Xcode264_Hermes, iOS_Xcode264_Hermes) so iterating on Hermes-specific failures doesn't burn through CI minutes re-running the unaffected V8/Chakra/JSC/UWP/Linux matrix. Reverted in the very last commit of this PR. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous narrowed matrix referenced js-engine inputs on build-android.yml and build-ios.yml, but the iteration-2 commit that added those inputs to build-macos.yml and build-ios.yml landed only on build-macos.yml and build-ios.yml on the local branch — the squashed PR commit didn't carry the build-android.yml / build-ios.yml diffs, causing GitHub Actions to reject the workflow with: Invalid input, js-engine is not defined in the referenced workflow. Narrow further to only Win32 x64 + macOS Xcode 26.4 (the two job types that already accept js-engine on the PR head). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The hand-written patch was malformed (corrupt at line 88) and likely never applied in CI — explaining why the previous run still SIGABRT'd at line 225 with the same root-cause backtrace. Re-generated via git diff against the pristine fetched source so the format is exact. Also added a [Babylon-Hermes-patch] stderr print inside shutdown() so we can positively confirm in CI logs that the patch is in effect. Added an ApplyPatchIfNeeded status line at configure too. Patch v2 also covers a secondary case: refs created during a finalize_cb (e.g. by transient napi_create_reference calls during teardown) are now pre-marked just before the next finalize fires, in addition to the upfront sweep. Verified locally on Windows RelWithDebInfo: 145 Mocha + 4 gtest PASS, marker prints 4 times (once per AppRuntime in the suite). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
git apply is strict about line endings; allowing Windows checkouts to convert patches to CRLF would silently break the FetchContent PATCH_COMMAND step. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove the local Hermes shutdown patch and restore the CI matrix while keeping Apple Hermes jobs disabled. Add Android Hermes coverage with host compiler import support for Hermes cross-compilation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep the Android emulator runner script as a single command and relax async timer/WebSocket test timeouts that are flaky on hosted Windows runners. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restore the macOS reusable workflow, remove the patch line-ending attributes, and put the iOS CI section label back to its previous wording. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Hermes vendors Boost.Context (external/boost/boost_1_86_0/libs/context). The default fcontext implementation needs MASM on Windows; MSBuild propagates CXX compile flags (/Zc:__cplusplus, /Zc:preprocessor, /MP) into the MASM step, which warns A4018 and bails out before producing make_x86_64_ms_pe_masm.obj, leading to LNK1181. Switch to BOOST_CONTEXT_IMPLEMENTATION=winfib (Win32 Fibers) on WIN32 — no assembler needed. Hermes only uses the boost fiber API, so this is functionally equivalent for this embedder. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Conflicts: # .github/workflows/build-win32.yml
There was a problem hiding this comment.
Pull request overview
Adds an experimental Hermes (static_h) JavaScript engine backend to JsRuntimeHost, wiring it through the Node-API layer and AppRuntime dispatch loop, and extending CI to build/test Hermes on supported platforms.
Changes:
- Integrates Hermes via a new Node-API bridge (
env_hermes.cc) and a Hermes-specificnapi/env.h. - Updates AppRuntime dispatch to open a
Napi::HandleScopeand adds an engine hook to drain microtasks/jobs (Hermes explicit, others no-op). - Extends CMake + GitHub Actions workflows for Hermes builds (Win32/Android) and adjusts unit test timeouts.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| Tests/UnitTests/Scripts/tests.ts | Increases Mocha timeouts for timer-related tests and WebSocket test. |
| Tests/UnitTests/Android/app/build.gradle | Refactors CMake argument construction and adds optional IMPORT_HOST_COMPILERS argument. |
| README.md | Documents Hermes support status and unsupported Apple targets. |
| Core/Node-API/Source/env_hermes.cc | Implements Hermes runtime/env attach/detach, Hermes-native Eval, and job draining. |
| Core/Node-API/Include/Engine/Hermes/napi/env.h | Declares Hermes-specific Napi Attach/Detach/Eval/DrainJobs API. |
| Core/Node-API/CMakeLists.txt | Adds Hermes engine selection, sources, link targets, and Hermes include/compile options. |
| Core/AppRuntime/Source/AppRuntime.cpp | Wraps dispatched callbacks in a Napi::HandleScope and drains microtasks after callback execution. |
| Core/AppRuntime/Source/AppRuntime_V8.cpp | Adds no-op DrainMicrotasks for V8. |
| Core/AppRuntime/Source/AppRuntime_JSI.cpp | Adds no-op DrainMicrotasks for JSI. |
| Core/AppRuntime/Source/AppRuntime_JavaScriptCore.cpp | Adds no-op DrainMicrotasks for JavaScriptCore. |
| Core/AppRuntime/Source/AppRuntime_Hermes.cpp | Adds Hermes environment tier using Napi::Attach/Detach and Hermes DrainMicrotasks implementation. |
| Core/AppRuntime/Source/AppRuntime_Chakra.cpp | Adds no-op DrainMicrotasks for Chakra. |
| Core/AppRuntime/Include/Babylon/AppRuntime.h | Declares the new engine hook DrainMicrotasks. |
| Core/AppRuntime/CMakeLists.txt | Links Apple Foundation/CoreFoundation frameworks for non-Xcode generators. |
| CMakeLists.txt | Fetches Hermes (static_h pinned SHA), configures Hermes options, and blocks Hermes on Apple platforms early. |
| .github/workflows/ci.yml | Adds Hermes jobs (Win32 x64, Android) and updates job header comments. |
| .github/workflows/build-win32.yml | Extends timeout for Hermes builds. |
| .github/workflows/build-android.yml | Adds Hermes disk-space mitigation, host compiler build, and Hermes-specific Gradle parameter wiring. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| auto config = hermes::vm::RuntimeConfig::Builder() | ||
| .withGCConfig(hermes::vm::GCConfig::Builder() | ||
| .withInitHeapSize(1u << 20) // 1 MiB | ||
| .withMaxHeapSize(512u << 20) // 512 MiB |
There was a problem hiding this comment.
Why 512 max? This seems likely too limited.
|
|
||
| // Dropping the last shared_ptr to the runtime tears down the env via | ||
| // Hermes's post-shutdown deleter. | ||
| state.runtime.reset(); |
There was a problem hiding this comment.
Isn't this going to happen automatically when the state variable goes out of scope at the end of this function anyway?
| // and we don't have a meaningful way to bubble it up here. Real | ||
| // exceptions thrown FROM user callbacks are already handled in | ||
| // AppRuntime::Dispatch's try/catch. | ||
| (void)runtime->drainJobs(); |
There was a problem hiding this comment.
Does this result in behavior similar to other js engines in BN? Does this not need to be reported through the unhandled error callback or anything?
| "-DNAPI_JAVASCRIPT_ENGINE=${jsEngine}", | ||
| "-DJSRUNTIMEHOST_CORE_APPRUNTIME_V8_INSPECTOR=ON" | ||
| ] | ||
| if (project.hasProperty("importHostCompilers")) { |
No description provided.