Aaronb/vendor base 64 poc #8578
Adds packages/expo/src/vendor/base-64/ — verbatim copy of the base-64@1.0.0
npm tarball plus a Clerk-side shim, README, and parity test. No consumer
code is wired yet; that's the next commit so reviewers can see the vendor
in isolation.
Layout:
packages/expo/src/vendor/base-64/
├── README.md                  Clerk-side: rationale + customer-side attack chains
├── index.ts                   Clerk-side shim with typed re-exports
├── upstream/                  ← byte-for-byte copy of base-64@1.0.0 npm tarball
│   ├── base64.js              (UMD; 164 lines; exports {encode, decode, version})
│   ├── LICENSE-MIT.txt
│   ├── package.json           (upstream's; inert fields documented in README)
│   └── README.md
└── __tests__/parity.spec.ts   RFC 4648 fixtures + extras + 512 deterministic fuzz
Also adds `!packages/*/src/vendor/**` to the root .gitignore so the
`dist`/`packages/*/dist/**` patterns above it don't silently exclude
vendored source under src/vendor/ (caught during the dequal POC on the
sibling branch — without this, future vendors with a `dist/` subdir would
be invisible to git and absent from the published tarball).
WHY THIS PACKAGE:
base-64 is a single-maintainer (mathias) npm package on which @clerk/expo
depends for the userland atob/btoa implementation it polyfills onto
global. When @clerk/expo is installed by a customer, the published tarball
declares base-64 as a runtime external; the customer's package manager
resolves "^1.0.0" against the npm registry and fetches base-64 fresh.
Clerk's own pnpm-lock.yaml is not in the published tarball and plays no
part in the customer's install. Two attack chains follow:
Chain 1 — Publisher account compromise: mathias's npm account is
compromised, attacker publishes base-64@1.0.1 with malicious code,
customer's caret range resolves to 1.0.1 on next install. The
polyfill assigns the compromised encode/decode to global.btoa /
global.atob — every subsequent btoa()/atob() call anywhere in the
customer's app, including third-party libraries, runs through the
compromised code silently.
Chain 2 — Registry-level same-version substitution: registry serves
substituted bytes for an existing base-64@1.0.0 (registry compromise or
npm internal compromise; a plain unpublish-then-republish does not
suffice, because npm permanently retires a version number once it has
been used). Customer's first install fetches the substituted
bytes, computes their hash, records it as the trusted reference. No
prior hash to compare against; future installs "verify" against the
poisoned hash.
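The first-install gap in Chain 2 can be sketched as a trust-on-first-use check. This is a minimal illustration, not any package manager's actual code: `toyDigest`, `Lockfile`, and `install` are hypothetical names, and the FNV-1a digest is only a stand-in for the sha512 integrity hash a real lockfile records.

```typescript
// Stand-in digest (32-bit FNV-1a). Real lockfiles record sha512; this is
// only here so the sketch runs without any imports.
function toyDigest(bytes: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < bytes.length; i++) {
    hash ^= bytes.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Hypothetical lockfile model: "name@version" -> recorded digest.
type Lockfile = Map<string, string>;

type InstallResult = 'recorded-first-use' | 'verified' | 'integrity-mismatch';

function install(lockfile: Lockfile, spec: string, servedBytes: string): InstallResult {
  const digest = toyDigest(servedBytes);
  const recorded = lockfile.get(spec);
  if (recorded === undefined) {
    // No prior hash to compare against: whatever was served becomes "trusted".
    lockfile.set(spec, digest);
    return 'recorded-first-use';
  }
  return recorded === digest ? 'verified' : 'integrity-mismatch';
}

// A fresh machine that is served substituted bytes records the poisoned hash,
// and every later install "verifies" against it.
const freshLockfile: Lockfile = new Map();
const firstInstall = install(freshLockfile, 'base-64@1.0.0', 'substituted bytes');
const laterInstall = install(freshLockfile, 'base-64@1.0.0', 'substituted bytes');
```

Under these assumptions, only a lockfile created before the substitution would report a mismatch; a fresh one never does.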
Exact version pinning ("base-64": "1.0.0") closes Chain 1 but not Chain 2.
Vendoring closes both — the customer's resolver never fetches base-64 from
the npm registry because the bytes ship inside the @clerk/expo npm tarball.
See packages/expo/src/vendor/base-64/README.md for the full rationale and
the bugprimer Sessions/S161/PROPOSAL.md for the broader proposal context.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ncies
- packages/expo/src/polyfills/base64Polyfill.ts: import { decode, encode }
from the vendored copy at ../vendor/base-64 (resolves to the shim at
src/vendor/base-64/index.ts) instead of the npm base-64 package.
- packages/expo/package.json: remove "base-64": "^1.0.0" from
dependencies; move it to devDependencies so the parity test at
src/vendor/base-64/__tests__/parity.spec.ts can keep comparing the
vendored output against the upstream npm package.
- pnpm-lock.yaml: regenerated.
After this commit, a customer who runs `pnpm install @clerk/expo` (or
npm/yarn/expo install equivalent) gets the vendored base-64 source as
part of the @clerk/expo tarball. Their resolver does not walk to
base-64 on npm. Both attack chains described in the previous commit
are closed for that customer.
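A minimal sketch of the wiring described above, to show why whichever `encode`/`decode` pair is wired in becomes the implementation every caller sees. All names here (`Base64Impl`, `Base64Globals`, `installBase64Polyfill`, `placeholderImpl`) are hypothetical; the real code lives in packages/expo/src/polyfills/base64Polyfill.ts and may be shaped differently, for example by assigning unconditionally.

```typescript
// Hypothetical shape of the two functions the vendored shim re-exports.
interface Base64Impl {
  encode(input: string): string;
  decode(input: string): string;
}

// Stand-in for the global object, so the sketch never touches real globals.
interface Base64Globals {
  btoa?: (input: string) => string;
  atob?: (input: string) => string;
}

// Assumption: install only where the runtime has no native implementation.
function installBase64Polyfill(target: Base64Globals, impl: Base64Impl): void {
  if (typeof target.btoa !== 'function') target.btoa = impl.encode;
  if (typeof target.atob !== 'function') target.atob = impl.decode;
}

// Placeholder implementation (NOT real base64), only to make the data flow visible.
const placeholderImpl: Base64Impl = {
  encode: (input) => `enc(${input})`,
  decode: (input) => input.slice(4, -1),
};

const fakeGlobal: Base64Globals = {};
installBase64Polyfill(fakeGlobal, placeholderImpl);
// From here on, every caller of fakeGlobal.btoa runs whatever impl.encode was.
```

With the vendored copy, the `impl` passed in comes from bytes inside the @clerk/expo tarball rather than from a package resolved at the customer's install time.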
Verification:
Layer 1 (byte-equivalence): vendor/upstream/ == npm pack base-64@1.0.0
Layer 3 (parity): 29 tests pass (RFC 4648 + extra + 512 fuzz)
Layer 5 (build): @clerk/expo builds clean via turbo;
dist/vendor/base-64/upstream/base64.js IS in dist
(so it ships in the published tarball, not require'd)
Source grep: no remaining `from 'base-64'` outside the parity test
(which has eslint-disable-next-line annotations).
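The deterministic fuzz part of the parity layer can be sketched roughly as follows. This is not the contents of parity.spec.ts: `mulberry32`, `randomLatin1`, `countMismatches`, and `referenceEncode` are illustrative names, the seed and string lengths are assumptions, and `referenceEncode` is a small stand-in encoder so the block runs without importing either copy of base-64.

```typescript
type Encoder = (input: string) => string;

// mulberry32-style seeded PRNG: the same seed yields the same inputs on every run.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Random string of code units 0..255 (byte-sized characters only).
function randomLatin1(rand: () => number, maxLen: number): string {
  const len = Math.floor(rand() * (maxLen + 1));
  let out = '';
  for (let i = 0; i < len; i++) out += String.fromCharCode(Math.floor(rand() * 256));
  return out;
}

// Count inputs on which the two encoders disagree.
function countMismatches(vendored: Encoder, upstream: Encoder, cases = 512, seed = 1): number {
  const rand = mulberry32(seed);
  let mismatches = 0;
  for (let i = 0; i < cases; i++) {
    const input = randomLatin1(rand, 64);
    if (vendored(input) !== upstream(input)) mismatches++;
  }
  return mismatches;
}

// Stand-in RFC 4648 encoder for byte-sized characters.
const ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
function referenceEncode(input: string): string {
  let out = '';
  for (let i = 0; i < input.length; i += 3) {
    const b0 = input.charCodeAt(i);
    const b1 = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
    const b2 = i + 2 < input.length ? input.charCodeAt(i + 2) : 0;
    const triple = (b0 << 16) | (b1 << 8) | b2;
    out += ALPHABET.charAt((triple >> 18) & 63) + ALPHABET.charAt((triple >> 12) & 63);
    out += i + 1 < input.length ? ALPHABET.charAt((triple >> 6) & 63) : '=';
    out += i + 2 < input.length ? ALPHABET.charAt(triple & 63) : '=';
  }
  return out;
}
```

In a real parity test the two arguments would be the vendored `encode` and the upstream package's `encode`, and the expectation would be zero mismatches.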
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `base-64` to the eslint.config.mjs no-restricted-imports `paths:`
list with a message pointing at the vendored copy. The parity test in
packages/expo/src/vendor/base-64/__tests__/parity.spec.ts imports
base-64 intentionally with eslint-disable-next-line annotations.
Verified: the rule fires on a deliberate `import { encode } from 'base-64'`
in a non-test file with the expected error message; passes for the
parity test (which has eslint-disable on the relevant lines).
Verification Layer 7 (regression guard) from VENDORING.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 Walkthrough
This PR vendors the …
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
Actionable comments posted: 1
⛔ Files ignored due to path filters (1)
- pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (11)
- .gitignore
- eslint.config.mjs
- packages/expo/package.json
- packages/expo/src/polyfills/base64Polyfill.ts
- packages/expo/src/vendor/base-64/README.md
- packages/expo/src/vendor/base-64/__tests__/parity.spec.ts
- packages/expo/src/vendor/base-64/index.ts
- packages/expo/src/vendor/base-64/upstream/LICENSE-MIT.txt
- packages/expo/src/vendor/base-64/upstream/README.md
- packages/expo/src/vendor/base-64/upstream/base64.js
- packages/expo/src/vendor/base-64/upstream/package.json
    {
      name: 'base-64',
      message:
        "base-64 is vendored at packages/expo/src/vendor/base-64. Import { encode, decode } from '../vendor/base-64' instead. See packages/expo/src/vendor/base-64/README.md.",
    },
Global base-64 restriction conflicts with parity tests and can break lint.
This rule is global, but the PR includes parity tests that intentionally import the upstream base-64 package for behavioral comparison. As written, those tests will be lint-blocked unless excluded.
Scope this restriction to non-test Expo source files (or add a test-file override exception).
Suggested fix (scope restriction away from test files)
   {
     name: 'repo/global',
@@
       'no-restricted-imports': [
         'error',
         {
           paths: [
             {
               message: "Please always import from '@clerk/shared/<module>' instead of '@clerk/shared'.",
               name: '@clerk/shared',
             },
-            {
-              name: 'base-64',
-              message:
-                "base-64 is vendored at packages/expo/src/vendor/base-64. Import { encode, decode } from '../vendor/base-64' instead. See packages/expo/src/vendor/base-64/README.md.",
-            },
           ],
@@
       ],
     },
   },
+  {
+    name: 'packages/expo base-64 restriction',
+    files: ['packages/expo/src/**/*.{ts,tsx,js,jsx}'],
+    ignores: ['packages/expo/src/**/__tests__/**', 'packages/expo/src/**/*.test.{ts,tsx,js,jsx}'],
+    rules: {
+      'no-restricted-imports': [
+        'error',
+        {
+          paths: [
+            {
+              name: 'base-64',
+              message:
+                "base-64 is vendored at packages/expo/src/vendor/base-64. Import { encode, decode } from '../vendor/base-64' instead. See packages/expo/src/vendor/base-64/README.md.",
+            },
+          ],
+        },
+      ],
+    },
+  },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@eslint.config.mjs` around lines 354 - 358, The rule object with name
'base-64' currently blocks all imports of upstream base-64 and will lint-break
parity tests that intentionally import the package; update the ESLint
configuration so this restriction applies only to non-test Expo source (e.g.,
scope the rule to packages/expo/src/** or add an overrides entry that exempts
test files like **/*.test.* and **/__tests__/**), or add a negative pattern to
the rule's target to exclude test paths; locate the rule by the object with name
'base-64' and apply the file-globbing change or add an overrides block to allow
imports in test files.
Proposal: Address customer-side supply-chain exposure in published Clerk SDKs
Summary
Two classes of supply-chain attack against Clerk customers. Both route through a runtime npm dependency that a published `@clerk/*` SDK declares as an external, which the customer's package manager fetches fresh from npm at install time.

- Chain 1 (malicious release through the upstream maintainer trust boundary): credential phishing (`ua-parser-js` 2021), hostile ownership transfer (`event-stream` 2018), maintainer self-sabotage (`colors.js`/`faker.js` 2022), and social engineering into commit authority (`xz-utils` 2024). Customers' resolvers pick up the new version on next install and the malicious code runs in their app.
- Chain 2 (registry-level same-version substitution): the registry serves substituted bytes for an existing version, a first install records their hash, and later installs with `--frozen-lockfile` "verify" against the now-poisoned hash and reproduce the compromise.

Three options Clerk could take per affected dep.

- Vendor: copy the upstream source into Clerk's tree and remove the npm dep from `package.json`. The published Clerk tarball ships the source inline.
- Build-time bundling (`tsup`/`tsdown --noExternal`): keep the dep in `package.json`, but configure the bundler to inline its bytes into the published `dist/` artifact rather than emit a runtime `require()` for it. Cheap to apply per-dep on packages that already bundle (e.g. `@clerk/shared`); more invasive on packages that currently transpile per-file with `bundle: false` (e.g. `@clerk/expo`), where adopting it means changing the build mode.
- Replace with platform primitives: rewrite the consumer code against `crypto.subtle`, `Buffer.from`, native `atob`, etc. (Only viable when the primitive exists and behaves equivalently in every consumer runtime.)

(A fourth option, status quo, is documented for completeness in the body. None of Clerk's existing supply-chain hardening transits to customer installs, so status quo means accepting both chains against every external runtime dep.)

POC. Branch `aaronb/vendor-base-64-poc` implements the vendor option for `base-64` in `@clerk/expo`, end-to-end. Includes an `npm pack` + clean-fixture install that empirically verifies `base-64` no longer lands in customer `node_modules` after the change. The verification methodology and per-dep reasoning are in the branch's README.

Why Clerk's lockfile doesn't protect customers
The load-bearing premise of this proposal is that none of Clerk's existing supply-chain hardening transits to customer installs. Quickly:

- `pnpm-lock.yaml` lives at the repo root and is consulted by `pnpm install` inside the Clerk monorepo. It pins exact versions and sha512 integrity hashes for every dep in Clerk's dev/CI environment.
- The lockfile is not in the published `@clerk/*` npm tarball. Lockfiles aren't part of npm's publish format, and shipping one would actively break customer installs (their package manager has its own lockfile and dep tree).
- When a customer runs `pnpm install @clerk/expo` (or `npm install`, `yarn add`, `expo install`), their package manager reads the published `@clerk/expo`'s `package.json`, walks its `dependencies` (the declared version ranges), and resolves every entry against the npm registry at their install time. Clerk's lockfile is not in that path.
- Renovate's `minimumReleaseAge` (configured in `renovate.json5`) gates only Renovate-opened PRs against the Clerk repo. It does not transit to customer Renovate configs.
- The `pnpm.overrides` and `onlyBuiltDependencies` allowlist apply only to installs that read Clerk's `package.json`, i.e., installs of Clerk, not installs that depend on Clerk.

The net effect: every defensive control Clerk has built around its own dep tree protects Clerk's CI and Clerk's developers. None of it follows the published `@clerk/*` package into a customer's `node_modules`. The customer's defenses are entirely their own, computed against the registry's state at their install time.

This is why the attack chains below are not defended by any Clerk-side hardening that doesn't change what ships in the tarball.
The two attack chains
Both attacks compromise a Clerk customer's running application via a dependency that Clerk's published SDK declares as a runtime external. The examples below use `base-64` in `@clerk/expo` because that's the worked POC case, but the same shape applies to any external runtime dep in any published Clerk SDK.

Chain 1 — Malicious release through the upstream maintainer trust boundary

Premise: `@clerk/expo` declares `"base-64": "^1.0.0"` (caret-ranged). Mathias Bynens is the sole `base-64` npm publisher; whoever has publish authority for that package can ship malicious code at any time and customers will pick it up.

1. The attacker gains the ability to publish a `base-64` version under Mathias's publish authority. The path varies (see precedents below); the common shape is that the existing publish channel is used to ship a release the original author wouldn't have signed off on.
2. The attacker publishes `base-64@1.0.1` to npm. The tarball contains a malicious `encode` and `decode` that, alongside their normal output, exfiltrate inputs to an attacker-controlled endpoint (or do anything else; pick your payload).
3. A customer runs `pnpm install @clerk/expo` (or `npm`, `yarn`, `expo install`) on their dev machine, in CI, or as part of a fresh deployment.
4. The customer's resolver evaluates `^1.0.0` against the npm registry's current versions, picks `1.0.1` (the new latest in range), and records its integrity hash in the customer's lockfile.
5. `@clerk/expo`'s polyfill imports `encode`/`decode` from the resolved `base-64@1.0.1` and assigns them to `global.btoa`/`global.atob`.
6. Every `btoa()` or `atob()` call anywhere in the customer's app (including third-party libraries the customer uses, payment SDK calls, OAuth flows, Clerk's own runtime) routes through the attacker's `encode`/`decode`.

Historical precedents in this class (malicious releases pushed through the existing upstream maintainer channel) cover several distinct mechanisms, all reaching customers the same way:

- `ua-parser-js` (2021): credentials phished, three malicious versions published.
- `event-stream` (2018): original maintainer handed off to a contributor who later published a malicious version pulling `flatmap-stream`.
- `colors.js`/`faker.js` (2022): original maintainer deliberately published broken/malicious versions.
- `xz-utils` (2024): years-long cultivation of a co-maintainer position, culminating in malicious commits to a new release.

What ties them together is the trust boundary: customers' resolvers trust whatever the upstream's publish authority shipped, regardless of how that publish came to exist. Chain 2 below adds a residual gap for first-install customers that exact-pinning doesn't close, but Chain 1 alone is sufficient justification on the historical record.
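The resolution step in Chain 1 can be illustrated with a small caret-range check. This is a sketch under stated assumptions, not npm's or pnpm's resolver: `parseVersion`, `compareVersions`, `satisfiesCaret`, and `resolveCaret` are hypothetical helpers that handle plain `x.y.z` versions only (no prerelease tags, no dist-tags).

```typescript
interface Version {
  major: number;
  minor: number;
  patch: number;
}

function parseVersion(v: string): Version {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (m === null) throw new Error(`unsupported version string: ${v}`);
  return { major: Number(m[1]), minor: Number(m[2]), patch: Number(m[3]) };
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: Version, b: Version): number {
  return a.major - b.major || a.minor - b.minor || a.patch - b.patch;
}

// Caret semantics for plain x.y.z: changes are allowed only to the right of
// the left-most non-zero component of the base version.
function satisfiesCaret(candidate: string, base: string): boolean {
  const c = parseVersion(candidate);
  const b = parseVersion(base);
  if (compareVersions(c, b) < 0) return false;
  if (b.major > 0) return c.major === b.major;
  if (b.minor > 0) return c.major === 0 && c.minor === b.minor;
  return c.major === 0 && c.minor === 0 && c.patch === b.patch;
}

// Pick the highest published version that the caret range accepts.
function resolveCaret(published: string[], base: string): string | undefined {
  return published
    .filter((v) => satisfiesCaret(v, base))
    .sort((x, y) => compareVersions(parseVersion(x), parseVersion(y)))
    .pop();
}

// Before and after a hypothetical malicious 1.0.1 is published.
const beforePublish = resolveCaret(['1.0.0'], '1.0.0');
const afterPublish = resolveCaret(['1.0.0', '1.0.1'], '1.0.0');
```

Under these assumptions `afterPublish` is `'1.0.1'`: nothing in the customer's project changed, only the registry's list of published versions.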
Chain 2 — Registry-level same-version substitution
Premise: even with an exact pin (`"base-64": "1.0.0"`), the customer's resolver still fetches `1.0.0` from the npm registry on first install.

1. The registry serves substituted bytes for an existing version. (Under npm's policy, once a `name@version` is used, that version number is permanently retired even after unpublish. So Chain 2 requires a bypass of the immutability invariant, not abuse of the published-version-management flow.)
2. A customer runs `pnpm install @clerk/expo` on a fresh machine (new CI runner, new contributor's laptop, fresh deployment image). They have no prior lockfile for this project.
3. The customer's resolver fetches `base-64@1.0.0` from the registry, receives the substituted bytes, computes their sha512, and records that hash in the customer's newly-created lockfile.
4. Later installs, including those run with `--frozen-lockfile`, "verify" the bytes against the recorded hash. They match (because the recorded hash is for the malicious bytes). The customer's CI passes integrity checks. The compromise reproduces forever.

Historical precedents in this class are rarer in the public record (the most relevant is npm's 2022 access-token compromise incident, which briefly allowed publishing impersonated versions). The consequence is that for an unknown number of incidents that DID happen this way, nobody knew, because there's no version-bump diff to spot. Chain 2 is the weaker case on its own; it should be read as the additional residual gap that even exact-pinning leaves open, not as the primary justification for vendoring. Chain 1 is the primary justification.
What each end-user action looks like in each chain
| End-user action | Chain 1 | Chain 2 |
| --- | --- | --- |
| `npm install @clerk/expo` on a fresh machine | … `base-64@1.0.1` (malicious). Lockfile records its hash. App runs malicious `btoa`/`atob`. | … `base-64@1.0.0`. Registry returns malicious bytes. Lockfile records the malicious hash as trusted. App runs malicious `btoa`/`atob`. |
| `npm install --frozen-lockfile` with existing lockfile (pre-compromise) | … `1.0.0`'s original hash. Reinstalling fetches `1.0.0`, hashes match, safe. | … |
| `npm update base-64` | … `1.0.1` (malicious). Subsequent installs use the new hash. | … `1.0.0` bytes. If registry now serves malicious bytes, the new hash is the malicious hash. |
| … `@clerk/expo`'s version | … `@clerk/expo` tarball, which still declares `"base-64": "^1.0.0"`. Lockfile reconciliation may pick up `1.0.1` if not already pinned. | … `@clerk/expo` tarball's `base-64` declaration is unchanged; the customer still fetches `1.0.0` whose bytes are now malicious. |

Options for defense
Each option below is a per-dep decision: Clerk can mix and match across the dep tree. The right answer for `base-64` in `@clerk/expo` is not necessarily the right answer for, say, `dequal` in `@clerk/shared`. The dev team should weigh trade-offs case by case.

- … `package.json` (mentioned for completeness; not a full option)
- … `noExternal:` entry in `tsup`/`tsdown` config, AND (b) move the dep from `dependencies` to `devDependencies` so the published manifest no longer declares it. Renovate continues to manage upgrades as today.

Option 1: Vendor the dep
Copy the upstream source byte-for-byte into Clerk's tree under `packages/<consumer>/src/vendor/<name>/upstream/`, remove the npm dep from `package.json` (or move it to `devDependencies` for parity testing). The published Clerk tarball ships the source inline; the customer's resolver never walks to the registry for that dep.

Pros
- … `vendor/` directory is grep-able, audit-friendly, and clearly labeled, so downstream security auditors can inspect what shipped.

Cons
- … `VENDORS.json`.
- … `clerk_go/api/scimgateway/imported/` (Go-side lift-and-shift) and `packages/nextjs/src/vendor/crypto-es.js` (sync crypto for Next middleware).

When it's the right answer
Option 2: Build-time bundling (`tsup --noExternal` / `tsdown --noExternal`)

Two changes per dep: (a) configure the bundler to inline the dep's bytes into the published `dist/` artifact rather than emit a runtime `require()` for it, AND (b) move the dep from `dependencies` to `devDependencies` so the published `package.json` no longer declares it as a runtime external. Both are required. Without (a), the bundler still emits a `require()` that the customer's resolver walks. Without (b), the customer's resolver walks the `dependencies` declaration regardless of whether the runtime code actually `require()`s it. With both, the dep ships inlined in the tarball and the customer never fetches it.

Pros
- … `VENDORS.json`, no per-vendor README, no refresh ritual.
- … `package.json` (under `devDependencies`), version bumps land via the existing PR flow.
- … `@clerk/shared` builds with `tsdown` and `unbundle: false`), configuration is per-package and minimal: one bundler-config entry + one `package.json` move per dep.
- … `minimumReleaseAge`, integrity-pinned lockfile) continues to apply during Clerk's own dev/CI.

Cons
- … `dist/index.js` to find the inlined module, vs. a labeled `vendor/` directory.
- … `noExternal:` that also appears in `dependencies`.
- … `bundle: false` (per-file transpilation rather than bundling), notably `@clerk/expo` (see `packages/expo/tsup.config.ts`). Switching one dep to `noExternal` there means changing the build mode to bundle the consuming file, which has its own knock-on effects (different output shape, possibly different code-splitting / lazy-load semantics, source-map changes). For those packages, the operational cost of Option 2 starts looking closer to Option 1's, not cheaper.

When it's the right answer
- … `noExternal` achieves the customer-side closure with materially less ceremony than vendoring.

Option 3: REPLACE the dep with platform primitives
Delete the npm dep entirely. Rewrite the consumer code in terms of platform-available APIs: `crypto.subtle`, `crypto.randomUUID`, `Buffer.from`, `navigator.clipboard`, native `atob`/`btoa`, etc. The replacement is bug-for-bug different from the upstream (that's the point) and is treated as first-party Clerk code with its own tests and code review.

Pros

Cons

When it's the right answer
- … `fast-sha256` → `crypto.subtle` (uniform across Node 18+, browsers, Workers): the platform primitive is universally available and the upstream's exposed surface is small enough to mirror cleanly.
- … `base-64` in `@clerk/expo`) in detail.

Option 4: Status quo
Do nothing. Customers' resolvers continue to fetch deps fresh from npm at install time. Both attack chains remain open against every external runtime dep in every published `@clerk/*` package.

This is the option Clerk is on today. It's a defensible choice if the dev team judges that:
Listing status quo as Option 4 explicitly is meant to ensure the team chooses it knowingly, not by default.
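The two-part requirement of Option 2 (inline via `noExternal` and drop the dep from `dependencies`) lends itself to a mechanical check. The following is only a sketch: `PublishedManifest` and `findStillDeclared` are hypothetical names, and it assumes `noExternal` is a plain list of package names (bundlers may also accept patterns, which this does not handle).

```typescript
// Hypothetical view of the fields of a published package.json that matter here.
interface PublishedManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Returns deps that the bundler inlines but that are still declared as runtime
// dependencies, which means the customer's resolver would fetch them anyway.
function findStillDeclared(noExternal: string[], manifest: PublishedManifest): string[] {
  const declared = manifest.dependencies ?? {};
  return noExternal.filter((name) => Object.prototype.hasOwnProperty.call(declared, name));
}

// Change (a) applied without change (b): the dep is inlined but still declared.
const halfApplied = findStillDeclared(['base-64'], {
  dependencies: { 'base-64': '^1.0.0' },
});

// Both changes applied: the dep only appears under devDependencies.
const fullyApplied = findStillDeclared(['base-64'], {
  devDependencies: { 'base-64': '^1.0.0' },
});
```

A check of this shape could run in CI against the built manifest; a non-empty result would mean the customer-side exposure is still open for those deps.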
Highest-priority candidates to harden
Short list of the deps with the strongest customer-side exposure case, in rough priority order. Each row identifies the dep, which `@clerk/*` package pulls it as a customer-facing external, the maintainer-concentration signal, and the suggested option. A longer analysis covering ~30 candidates surfaced by the dependency audit is available on request.

- `base-64`
  - `@clerk/expo`
  - … (`mathias`)
  - … `global.btoa`/`global.atob`: compromised code becomes the base64 implementation for every line of customer app code (third-party libraries, payment SDKs, OAuth flows).
- `standardwebhooks` + transitives `fast-sha256`, `@stablelib/base64`
  - `@clerk/backend`
  - … (`tasn`) + 1 (`dchest`) for both transitives
  - … `verifyWebhook()` always-return-success.
  - … `crypto.subtle` HMAC + native base64 collapses three single-maintainer accounts (tasn + dchest×2) into platform primitives the package already uses elsewhere (e.g. JWT verification).
- `server-only` (no public source repo; npm `repository` field is null. Owner: `sebmarkbage`)
  - `@clerk/nextjs` (direct dep, `packages/nextjs/package.json`)
  - … (`sebmarkbage`)
- `std-env`
  - `@clerk/shared` `postinstall`
  - … (`pi0`)
  - … `isCI` check during the `postinstall` script that runs in customers' installs. Bounded to install-time, but pi0 controls ~50 reachable packages: high upstream-trust-set exposure for a one-liner check.
  - … `std-env`'s `isCI` recognizes ~20 provider-specific env vars; a Clerk-side replacement should either pick the subset that matches Clerk's telemetry threat model (probably `process.env.CI` plus a handful of named CI providers) or accept the narrowing. Either is fine; the choice should be explicit.
- `dequal`
  - `@clerk/shared`, `@clerk/clerk-js`, `@clerk/ui`
  - … (`lukeed`)
  - … `useDeepEqualMemo`) and localization parsing. Runs on every render that goes through these paths.

The audit surfaced additional candidates with weaker customer-side exposure stories (bounded blast radius, CLI-only consumers, deps where REPLACE is genuinely cheap) that are also worth considering but didn't make this short list. Available on request.
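For the `std-env` row, a Clerk-side `isCI` replacement along the lines described (the generic `CI` variable plus a handful of named providers) might look like the sketch below. The provider list is an illustrative subset chosen for this example, not `std-env`'s list and not a recommendation; the treatment of `CI=false` and `CI=0` as "not CI" is likewise an assumption.

```typescript
// Illustrative subset of provider-specific variables; NOT std-env's list.
const NAMED_CI_ENV_VARS = ['GITHUB_ACTIONS', 'GITLAB_CI', 'CIRCLECI', 'BUILDKITE'] as const;

type Env = Record<string, string | undefined>;

// Takes the environment as a parameter so the check is easy to test;
// a caller in Node would pass process.env.
function isCI(env: Env): boolean {
  const ci = env['CI'];
  // Assumption: an explicit "false" or "0" opts out of the generic flag.
  if (ci !== undefined && ci !== '' && ci !== '0' && ci.toLowerCase() !== 'false') {
    return true;
  }
  return NAMED_CI_ENV_VARS.some((name) => {
    const value = env[name];
    return value !== undefined && value !== '';
  });
}
```

Keeping the list explicit in first-party code is what makes the narrowing visible: any provider not named here is simply not detected unless it also sets `CI`.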
POC

`base-64` in `@clerk/expo`, branch `aaronb/vendor-base-64-poc`. Implements Option 1 end-to-end, including a `pnpm pack` + clean-fixture install that empirically verifies `base-64` no longer lands in customer `node_modules` after the change. The branch's README documents the verification steps run (byte-equivalence vs. the npm tarball, parity test against upstream, build through `tsup`, the published-tarball smoke test mentioned above, ESLint regression guard) and the case-specific reasoning for choosing Option 1 over the alternatives. The same verification shape is reusable for any future Option 1 decision.

References
- … `clerk_go/api/scimgateway/imported/`
- … `packages/nextjs/src/vendor/`
- … `base-64` polyfill that the POC vendors.