feat: on-demand native lib download and optional feature splitting #1039
msluszniak wants to merge 15 commits into
|
TODO: split the xnnpack and coreml backends into separate libs, so they can be opted out the same way as opencv etc. |
|
Adding support for vulkan is in progress |
e69584e to 5aa72cc
|
Status: verified working state
PR is in a tested, working state with a clear opt-in/opt-out model for every backend except XNNPACK on Android (which stays baked into
Verified on a Galaxy S26 Ultra and an iPhone 17 Pro simulator (Xcode 26.4.1):
Supporting executorch fork branch:
Next work
Make XNNPACK separable on Android too: apply the same QNN-style |
5aa72cc to eb14dbe
|
Update: XNNPACK on Android is now separable too
Every backend is now opt-in symmetrically across both platforms:
Two changes on the executorch fork (
Verified on the same Galaxy S26 Ultra:
This finishes the work tracked above. PR is ready for review. |
The npm tarball ships without prebuilt native binaries — they are
downloaded from GitHub Releases at postinstall and extracted into
third-party/, where the existing CMake / podspec configurations pick them
up unchanged. Apps can opt out of features they don't need to skip both
the download and the native compilation:
  "react-native-executorch": {
    "extras": ["opencv", "phonemizer", "xnnpack", "coreml", "vulkan"]
  }
Defaults to all enabled. Each extra trims one or more artifacts and toggles
a corresponding RNE_ENABLE_* CMake / podspec flag, dropping its sources from
compilation and its libraries from the final link.
Per-platform behavior:
- opencv:     Android + iOS (iOS provided via opencv-rne CocoaPod)
- phonemizer: Android + iOS
- xnnpack:    iOS-only as a force-loaded XnnpackBackend.xcframework;
              baked into libexecutorch.so on Android
- coreml:     iOS-only as a force-loaded CoreMLBackend.xcframework
- vulkan:     Android-only as a separately-loaded
              libvulkan_executorch_backend.so
Vulkan ships as its own shared library (mirroring the QNN backend pattern)
so its load-time backend registration runs only when the user opts in. The
.so links only against vulkan_backend + vulkan_schema + executorch_core,
not the CPU kernel registries, so it does not cause duplicate kernel
registration when loaded alongside libexecutorch.so.
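A minimal sketch of the extras-to-flag mapping described in this commit message, assuming that listing an extra means it is enabled and that each flag is simply `RNE_ENABLE_` plus the upper-cased extra name. Only the `RNE_ENABLE_*` pattern comes from the commit message; `flagsForExtras` and the exact flag spellings are hypothetical.

```javascript
// Hypothetical helper: only the RNE_ENABLE_* naming pattern comes from the
// commit message; the exact mapping below is an assumption for illustration.
const ALL_EXTRAS = ['opencv', 'phonemizer', 'xnnpack', 'coreml', 'vulkan'];

// With no argument every extra stays enabled ("Defaults to all enabled").
function flagsForExtras(extras = ALL_EXTRAS) {
  const enabled = new Set(extras);
  return Object.fromEntries(
    ALL_EXTRAS.map((name) => [`RNE_ENABLE_${name.toUpperCase()}`, enabled.has(name)])
  );
}

// An app that keeps only xnnpack and opencv.
console.log(flagsForExtras(['xnnpack', 'opencv']));
```

A build script could then forward each entry to CMake or the podspec as a boolean switch; how the PR actually passes these values is not shown here.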
eb14dbe to ef7ab0e
Bring NATIVE_LIBS_PIPELINE.md in sync with the current
msluszniak/executorch@ms/separate-backends tip:
- Pin SHA bumped from 1a5c0f267 to bd24ac7681.
- Patch list expanded to cover all 10 commits on the fork branch:
the original 4 (version-script removal, vulkan-shared, flatcc-Werror,
xnnpack-shared) plus the 6 added in this round (tokenizers submodule
switched to software-mansion-labs/pytorch-tokenizers@build,
build_android_library.sh forwards BACKEND_SHARED env vars to cmake,
XNNWeightsCache null-ptr fix, ANDROID_SUPPORT_FLEXIBLE_PAGE_SIZES,
iOS create_frameworks.sh keeps merged .a files, iOS CMakePresets
disable XNNPACK_ENABLE_ARM_SME{,2}).
- iOS build section rewritten as a three-stage flow (fork .a build →
stage into RNE → repackage via ExecutorchLib/build.sh).
|
Thanks @msluszniak. Mind splitting this into a small stack of PRs so it's easier to review? |
|
@kirklandsign this particular PR solely addresses the React Native ExecuTorch repo, but what is probably interesting for you is a not-yet-published PR from the https://github.com/msluszniak/executorch/tree/@ms/separate-backends branch to executorch. For that one, I can certainly proceed with a series of small PRs :)). I just want to make sure the changes are aligned with the RNE repo first, and I still need to check whether iOS works the same way as Android. |
…steps
The fork branch grew an 11th patch on top of bd24ac7681 to address the flatccrt build failure on Apple clang 21 inside the iOS / macOS presets (separate from the host-side flatcc_ep fix). Bump the pinned SHA to 3ce953dbde73035e733442f99c082f5b6fedff5b and add the patch to the list.
Rewrite the iOS build section so a fresh machine can reproduce the artifacts exactly:
- Drop the install_executorch.sh step: torch_pin.py points at a pruned nightly. Pin torch==2.11.0 directly + install requirements-dev + certifi/zstd (Buck2 resolver) instead.
- Switch from a bare 'create_frameworks.sh' invocation to 'build_apple_frameworks.sh --Release', which orchestrates the cmake builds and calls create_frameworks.sh with the required flags.
- Document the harmless 'no binary artifact for *_debug.xcframework' Swift Package error at the end of --Release runs.
- Spell out the exact per-slice .a file list to stage into RNE.
- Call out that libkleidiai_*.a and the non-executorch prebuilts (cpuinfo / pthreadpool / phonemis) are kept as-is, not rebuilt.
- Add the iOS 26.4 simulator NSURLSession regression as a callout so future maintainers don't lose another day chasing it.
…-libs
# Conflicts:
# .cspell-wordlist.txt
…-libs
# Conflicts:
# packages/react-native-executorch/android/src/main/cpp/CMakeLists.txt
# packages/react-native-executorch/react-native-executorch.podspec
# packages/react-native-executorch/third-party/ios/ExecutorchLib.xcframework/ios-arm64-simulator/ExecutorchLib.framework/Info.plist
# packages/react-native-executorch/third-party/ios/ExecutorchLib.xcframework/ios-arm64/ExecutorchLib.framework/Info.plist
- Three optional arrays in package.json: `backends`, `libs`, `features` (all merged).
- `features` is sugar: each one expands to (backends, libs) via FEATURE_MAP matching the documented use* hooks (16 features, e.g. `llm`, `textToSpeech`, `objectDetection`).
- Backends and libs are validated; unknown values throw with a list of supported ones.
- Legacy `extras` is rejected with a migration error.
- Defaults stay all-on when no config is given.
- ModelHostObject.h: gate TextToSpeech include + Kokoro if-constexpr block with RNE_ENABLE_PHONEMIZER, which unblocks builds when the phonemizer lib is disabled.
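The resolution rules in this commit message could be sketched roughly as follows. This is not the PR's actual postinstall code: `resolveConfig`, `assertSupported`, the error wording, and the three `FEATURE_MAP` entries are hypothetical and chosen only to illustrate merging, validation, the `extras` rejection, and the all-on default. Treating "no config" as `undefined`, and rejecting unknown features the same way as unknown backends and libs, are also assumptions.

```javascript
// Illustrative subset only; the PR's real FEATURE_MAP covers more features.
const SUPPORTED = {
  backends: ['xnnpack', 'coreml', 'vulkan'],
  libs: ['opencv', 'phonemizer'],
};
const FEATURE_MAP = {
  llm: { backends: ['xnnpack'], libs: [] },
  textToSpeech: { backends: ['xnnpack'], libs: ['phonemizer'] },
  objectDetection: { backends: ['xnnpack', 'coreml'], libs: ['opencv'] },
};

// Throws on the first unsupported value, listing the supported ones.
function assertSupported(kind, values) {
  for (const value of values) {
    if (!SUPPORTED[kind].includes(value)) {
      throw new Error(`Unknown ${kind} "${value}". Supported: ${SUPPORTED[kind].join(', ')}`);
    }
  }
}

// Hypothetical resolver: merges the three arrays into one set per kind.
function resolveConfig(config) {
  if (config === undefined) {
    // Assumption: "no config" means the package.json key is absent.
    return { backends: [...SUPPORTED.backends], libs: [...SUPPORTED.libs] };
  }
  if ('extras' in config) {
    throw new Error('"extras" is no longer supported; use "backends", "libs", or "features".');
  }
  const backends = new Set(config.backends ?? []);
  const libs = new Set(config.libs ?? []);
  for (const feature of config.features ?? []) {
    if (!Object.hasOwn(FEATURE_MAP, feature)) {
      throw new Error(`Unknown feature "${feature}". Supported: ${Object.keys(FEATURE_MAP).join(', ')}`);
    }
    FEATURE_MAP[feature].backends.forEach((b) => backends.add(b));
    FEATURE_MAP[feature].libs.forEach((l) => libs.add(l));
  }
  assertSupported('backends', backends);
  assertSupported('libs', libs);
  return { backends: [...backends].sort(), libs: [...libs].sort() };
}

// A feature plus an explicitly requested lib are merged into one result.
console.log(resolveConfig({ features: ['llm'], libs: ['opencv'] }));
```

Keeping the explicit arrays and the feature expansion in the same sets means a value requested twice is only counted once, which matches the "all merged" wording.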
…ps' features
- FEATURE_MAP rebuilt from src/constants/modelRegistry.ts. Removed guessed
backends. LLMs / TTS / VAD / OCR / pose / sem-seg / textEmbeddings / textToImage
are xnnpack-only; classification / objectDetection / instanceSegmentation /
styleTransfer / speechToText / segmentAnything also ship coreml.
No model family currently ships a vulkan variant.
- Added privacyFilter feature.
- Demo apps now declare their actual feature set:
  - bare-rn: [llm]
  - llm: [llm, multimodalLLM, privacyFilter]
  - speech: [llm, speechToText, textToSpeech, vad]
  - text-embeddings: [textEmbeddings, imageEmbeddings]
  - computer-vision: [classification, imageEmbeddings, instanceSegmentation,
    ocr, objectDetection, poseEstimation, semanticSegmentation,
    styleTransfer, textEmbeddings, textToImage, verticalOCR]
|
PR is ready for review. Right now, I'm testing all demo apps on both iOS and Android. |
|
Started a review but had a moment of reflection. I think we should take a step back and consider along which "axes", so to speak, the splitting should operate. The backends axis is an obvious and natural choice. The second (and final) axis, however, should imo be the domain (e.g. computer vision, nlp, speech, etc.), so that we can easily specify something along the lines of "for my app I want to do computer vision and need CoreML for iOS", which would translate to a union of the artefacts (lib binaries, etc.) required for the CV task and the CoreML backend respectively. Per-feature splitting will lead to too much fragmentation and bloat in my opinion, and I don't really see any benefits of this more granular division, but perhaps there are some? The libs choice, on the other hand, is particularly peculiar as it reverses the semantics: instead of declaratively specifying "I want CV with CoreML and XNNPACK", it imperatively states that for something we will need e.g. OpenCV. As such it seems unsuitable. |
|
@barhanc indeed, there is at least one motivation behind choosing features over broader categories such as CV. The split only makes sense if we are actually able to exclude some libs from being bundled into an app, which at a low level translates into the particular backends and libs used. In NLP there are things like VLMs, which use OpenCV, so OpenCV won't be disabled if we go with the less granular approach. The same goes for phonemis and the speech category. Because we take a union, broader categories end up "propagating" libs, and the final benefit of opting out is weaker. So splitting by category is more intuitive at first sight, but when I tried to implement this mechanism optimally, I started to see that a split by libs is more accurate for implementation purposes, and the more granular split is not perfectly aligned with it. |
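The propagation argument above can be made concrete with a small comparison. This is only a sketch: the category groupings are hypothetical, and the per-feature lib lists assume that `multimodalLLM` needs opencv and `textToSpeech` needs the phonemizer lib (both stated in this thread) while the other listed features need neither.

```javascript
// Hypothetical per-feature lib requirements. multimodalLLM -> opencv and
// textToSpeech -> phonemizer follow the thread; the rest are assumed empty.
const FEATURE_LIBS = {
  llm: [],
  multimodalLLM: ['opencv'],
  textEmbeddings: [],
  speechToText: [],
  textToSpeech: ['phonemizer'],
  vad: [],
};

// Hypothetical category grouping, used only for this comparison.
const CATEGORY_FEATURES = {
  nlp: ['llm', 'multimodalLLM', 'textEmbeddings'],
  speech: ['speechToText', 'textToSpeech', 'vad'],
};

// Union of the libs required by a list of features.
function libsForFeatures(features) {
  return [...new Set(features.flatMap((f) => FEATURE_LIBS[f]))].sort();
}

// A category pulls in the union over every feature it contains.
function libsForCategory(category) {
  return libsForFeatures(CATEGORY_FEATURES[category]);
}

console.log('category nlp:', libsForCategory('nlp'));
console.log('feature llm:', libsForFeatures(['llm']));
console.log('category speech:', libsForCategory('speech'));
console.log('feature speechToText:', libsForFeatures(['speechToText']));
```

Under these assumptions, an app that only runs a text LLM still bundles opencv when it opts in by category, but not when it opts in by feature.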
Brings in PR #1223 (ExecuTorch 1.3 + MLX iOS + Gemma 4 E2B) and related commits. Reconciles the on-demand-libs branch with main's committed v1.3 artifacts.
Conflicts resolved:
- prebuilt binaries (.a, .so, framework) re-deleted (our PR serves them via the tarball pipeline)
- classes.jar + LLM.cpp taken from main (v1.3 + new generateMultimodal)
- podspec keeps our feature-splitting structure; main's MLX metallib placement note merged as a comment
PoseEstimation inherits from VisionModel and LLM's multimodal path uses the VisionEncoder/MultimodalRunner cpp files — all of which are excluded when opencv is off. Move pose_estimation under the opencv group and gate LLM's multimodal include + constructor branch behind RNE_ENABLE_OPENCV so apps with the "llm" feature (but not "multimodalLLM") link cleanly.
Splits MLX out of ExecutorchLib.xcframework into its own MLXBackend
xcframework + mlx.metallib resource, served by the on-demand tarball
pipeline as mlx-ios.tar.gz. Now consistent with xnnpack/coreml: apps
that don't declare the "multimodalLLM" (or any MLX-using) feature
don't ship the ~6MB MLX runtime.
- podspec: enable_mlx config flag, force_load MLXBackend.a per-slice,
Metal/MetalKit/MPS system frameworks, mlx.metallib via s.ios.resource
(CocoaPods drops it at the app bundle root, where the static-archive
constructor symbol resolves it via dladdr)
- ExecutorchLib.xcodeproj: strip MLX libs/metallib entries (moved out)
- build.sh: build MLXBackend.xcframework from libbackend_mlx_{ios,sim}.a
- download-libs.js + package-release-artifacts.sh: add mlx-ios artifact
- android/build.gradle: enableMlx fallback (Android has no MLX yet, but
keeps the schema symmetric with the podspec)
- docs/getting-started: list mlx as iOS backend in feature/platform table
- .gitignore: third-party/ios/MLXBackend.xcframework (downloaded)
- untrack libbackend_mlx_*.a + mlx.metallib (now served via tarball)
Two iOS link issues surfaced once xnnpack/coreml/mlx were split out of ExecutorchLib.xcframework:
1. pthreadpool v2: libbackend_xnnpack references _pthreadpool_create_v2, which exists in libthreadpool_*.a but only as a local (non-exported) symbol inside ExecutorchLib.framework. Link libthreadpool_*.a directly from the app: it carries both pthreadpool (v1+v2) and cpuinfo, so we can drop the separate libpthreadpool.a + libcpuinfo.a inputs.
2. MLX on iOS Simulator: _MTLTensorDomain / _MTLIOErrorDomain ship in iPhoneOS.sdk but NOT iPhoneSimulator.sdk. Match main's behavior: only force_load libMLXBackend.a in the device slice; simulator builds get no MLX, and MLX-exported models on simulator fail at load time (which is fine, simulator can't drive Metal MLX anyway).
|
One question - why
Also, I do believe Phonemis is currently being fetched from main. I think I will release it with some tag and we should bind it to this tag. |
Match the upstream library's actual name. Renames in PR-introduced surface area only:
- JS lib name in the user's package.json: "phonemizer" → "phonemis"
- build-config field: enablePhonemizer → enablePhonemis
- compile macro: RNE_ENABLE_PHONEMIZER → RNE_ENABLE_PHONEMIS
- CMake var: PHONEMIZER_CPP_SOURCES → PHONEMIS_CPP_SOURCES
- podspec ruby var: phonemizer_source_files → phonemis_source_files
Pre-existing public API (TextToSpeechPhonemizerConfig, phonemis::phonemizer::Config from the upstream library, phonemizer_*.pte model file names on HuggingFace, etc.) is unchanged.
|
Good catch, renamed in 2b45f7fde: the feature lib is now
On the submodule: agreed. The |
Description
Removes prebuilt native binaries from the npm tarball and downloads them at postinstall from GitHub Releases. Apps declare exactly what they need under `react-native-executorch` in `package.json`: three optional arrays, all merged into a single set:
    {
      "react-native-executorch": {
        "backends": ["xnnpack", "coreml", "vulkan"],
        "libs": ["opencv", "phonemizer"],
        "features": ["llm", "textToSpeech", "objectDetection"]
      }
    }
`features` is the friendly opt-in: list the `use*` hooks you'll use and the postinstall expands each one to its required backends + libs (per `FEATURE_MAP`). `backends` / `libs` are the precise opt-in.
The 18 features track the documented `use*` hooks one-for-one; each row's backend list is the union of what at least one model in that family ships today (sourced from `src/constants/modelRegistry.ts`).
| Name | Kind | iOS | Android |
| --- | --- | --- | --- |
| `xnnpack` | backend | `XnnpackBackend.xcframework` force-loaded | `libxnnpack_executorch_backend.so` |
| `coreml` | backend | `CoreMLBackend.xcframework` force-loaded | - |
| `vulkan` | backend | - | `libvulkan_executorch_backend.so` |
| `opencv` | lib | `opencv-rne` CocoaPod | `libopencv_*.a` + KleidiCV HAL (arm64) |
| `phonemizer` | lib | `third-party/common/phonemis` submodule | `add_subdirectory` |
Each Android backend `.so` links only `<backend>_backend (--whole-archive) + <backend>_schema + executorch_core` (no CPU kernel registries), so loading multiple side-by-side does not trigger duplicate kernel registration.
Supporting executorch fork branch: `software-mansion-labs/executorch@ms/separate-backends` (`EXECUTORCH_BUILD_XNNPACK_BACKEND_SHARED` + `EXECUTORCH_BUILD_VULKAN_BACKEND_SHARED` switches, a `custom_ops` fix to stop the transitive XNNPACK link from leaking into `libexecutorch_jni.so`, and a flatcc `-Werror` workaround for Apple clang 21).
Demo apps in this PR declare their actual feature set:
- `bare-rn` → `["llm"]`
- `llm` → `["llm", "multimodalLLM", "privacyFilter"]`
- `speech` → `["llm", "speechToText", "textToSpeech", "vad"]`
- `text-embeddings` → `["textEmbeddings", "imageEmbeddings"]`
- `computer-vision` → `["classification", "imageEmbeddings", "instanceSegmentation", "ocr", "objectDetection", "poseEstimation", "semanticSegmentation", "styleTransfer", "textEmbeddings", "textToImage", "verticalOCR"]`
Introduces a breaking change?
Removes the legacy `extras` field. Apps that set it get an install-time error pointing at the new shape.
Type of change
Tested on
Testing instructions
Test the download flow:
Test feature opt-in (drop opencv / coreml by listing only LLM-shaped features):
- In `package.json` set `"features": ["llm"]`.
- `yarn install` → `rne-build-config.json` should show `enableOpencv:false`, `enableCoreml:false`, `enableXnnpack:true`.
- `Backend CoreMLBackend is not registered` error rather than crashing.
Test phonemizer opt-out:
- In `package.json` set `"features": ["textEmbeddings"]` (no TTS).
- `yarn install` (regenerates `rne-build-config.json`), then `pod install` / regenerate `android/`.
Test that legacy `extras` is rejected:
- In `package.json` set `"extras": ["opencv"]`.
- `yarn install` fails with an explicit migration error pointing at `backends`, `libs`, `features`.
Related issues
Builds toward pytorch/executorch#10457 (hot-pluggable Android backends).
Checklist