
Fix resource.enc race when functions share a bundle #6750

Open
vlechemin wants to merge 1 commit into anomalyco:dev from grasp-gg:fix/resource-enc-per-function-bundle-race

@vlechemin

Problem

When multiple aws.Function resources share a pre-built bundle directory via the bundle: option, Pulumi runs their Runtime.Build RPCs concurrently. Each RPC encrypts the function's links and writes them to {bundle}/resource.enc, so the writes race.

In the happy case the last writer simply wins, but partial or interleaved writes can leave a truncated or mixed ciphertext on disk. We consistently see trailing-byte corruption after the 16-byte AES-GCM auth tag. The Lambda then fails to authenticate the ciphertext on its first cold start and the SDK throws at init:

ERROR: Error: Unsupported state or unable to authenticate data
    at Decipheriv.final (node:internal/crypto/cipher:198:29)
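To illustrate why a few stray trailing bytes are enough to trigger this error, here is a minimal standalone Go sketch. It is not SST code: `sealLinks`, `openLinks`, the all-zero 32-byte key, and the sample JSON payload are assumptions made for the example. Only the all-zero 12-byte nonce mirrors the `gcm.Seal` call quoted under Root cause. GCM treats the last 16 bytes of the input as the auth tag, so appending bytes shifts the tag window and authentication fails.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from key (hypothetical helper).
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealLinks encrypts payload with an all-zero nonce, as in the quoted snippet.
func sealLinks(key, payload []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	return gcm.Seal(nil, make([]byte, gcm.NonceSize()), payload, nil), nil
}

// openLinks decrypts and authenticates ciphertext (ciphertext || 16-byte tag).
func openLinks(key, ciphertext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	return gcm.Open(nil, make([]byte, gcm.NonceSize()), ciphertext, nil)
}

func main() {
	key := make([]byte, 32) // assumption: placeholder key for the demo only
	clean, err := sealLinks(key, []byte(`{"MyBucket":{"name":"example"}}`))
	if err != nil {
		panic(err)
	}

	// Simulate the observed corruption: 6 stray bytes after ciphertext+tag.
	corrupted := append(append([]byte{}, clean...), 0xde, 0xad, 0xbe, 0xef, 0x00, 0x01)

	_, errClean := openLinks(key, clean)
	_, errCorrupt := openLinks(key, corrupted)
	fmt.Println("clean:", errClean)
	fmt.Println("corrupted:", errCorrupt)
}
```

The clean ciphertext opens without error, while the padded one is rejected by `Open`. This would be the Go-side analogue of the `Decipheriv.final` failure seen in the Lambda.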

Root cause

pkg/runtime/runtime.go Collection.Build:

ciphertext := gcm.Seal(nil, make([]byte, 12), json, nil)
err = os.WriteFile(filepath.Join(result.Out, "resource.enc"), ciphertext, 0644)

With input.Bundle != "", result.Out == input.Bundle and the filename is constant, so N concurrent builds all race on the same path.

In the non-bundle path, each function gets its own .sst/artifacts/{FunctionID}-src/ directory, so there's no collision.

Fix

Namespace the file by FunctionID when a shared bundle is in play (resource-{FunctionID}.enc) and point each Lambda at its own file via SST_KEY_FILE:

  • pkg/runtime/runtime.go — write resource-{FunctionID}.enc when input.Bundle != "", keep resource.enc otherwise
  • platform/src/components/aws/function.ts — set SST_KEY_FILE to the per-function filename when args.bundle is set

The default (non-bundle) path is unchanged, so existing deployments keep the old filename.

Minor downside: each function's uploaded zip contains every resource-{FunctionID}.enc in the shared bundle dir (Lambda only reads its own via SST_KEY_FILE). This is a few extra KB per function and avoids more invasive changes to the bundle-packaging code path. Happy to explore trimming if reviewers prefer.

Test coverage

Adds TestCollectionBuildEncryptedResourceFileWithBundle to pkg/runtime/runtime_test.go:

  • Builds with Bundle != "" and checks resource-{FunctionID}.enc is written while the legacy resource.enc is not.
  • Builds two distinct functions into the same bundle and checks each ends up with its own file.

Real-world repro

This is hitting us in production (grasp-gg). We share a single pre-built bundle across ~40 Lambdas (Turbo-cached handlers, one-shot esbuild, bundle: + relative handler:). After every deploy, a subset of Lambdas cold-start with Decipheriv errors until they're re-invoked or re-deployed. Byte-level inspection of the deployed zip consistently shows 6 garbage bytes after the expected ciphertext+tag, which is what you'd expect when an interrupted partial write from one build lands on top of a complete write from another.

Happy to iterate on naming / scope / packaging if maintainers want a different shape.

When multiple `aws.Function` resources share a pre-built bundle
directory (via `bundle:`), Pulumi runs their `Runtime.Build` RPCs
concurrently. Each RPC encrypts the function's links and writes them
to `{bundle}/resource.enc` — so the writes race. The last writer wins,
but partial / interleaved writes can also leave a truncated or mixed
ciphertext that then fails AES-GCM authentication on the Lambda's
first cold start (surfaces as a `Decipheriv` error during SDK init).

Namespace the file by `FunctionID` when a bundle is shared
(`resource-{FunctionID}.enc`) and point the Lambda at its own file
via `SST_KEY_FILE`. The default (non-bundle) path keeps writing
`resource.enc` into each function's own artifact directory, so
existing deployments are unaffected.

Covered by new tests in `pkg/runtime/runtime_test.go`.