ci: add retry logic and caching for zstd source download #116

Open: tadjik1 wants to merge 5 commits into main from ci/zstd-download-retry-cache

tadjik1 (Contributor) commented Jun 8, 2026

Description

Summary of Changes

Adds curl retry logic to the zstd source download and caches the build on host jobs to prevent intermittent CI failures. Adds a 30-minute timeout to the container jobs so a QEMU build hang fails fast instead of running into the 6-hour default.

What is the motivation for this change?

Concurrent CI matrix jobs all download zstd-1.5.6.tar.gz from GitHub releases at the same time and hit the unauthenticated rate limit. Instead of the tarball they receive a ~92-byte error response, which tar rejects with gzip: stdin: not in gzip format. The failures are non-deterministic and also occur on main with no PR changes involved.
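
The failure mode described above can be caught before tar runs by checking for the gzip magic bytes (1f 8b) at the start of the downloaded file. The following is a minimal sketch; the helper name is_gzip is hypothetical and is not part of this PR.

```shell
# Hypothetical helper (not from the PR): succeeds only if the file
# starts with the two gzip magic bytes 0x1f 0x8b.
is_gzip() {
  local file="$1"
  local magic
  # Read the first two bytes and render them as lowercase hex, e.g. "1f8b".
  magic="$(head -c 2 "$file" | od -An -tx1 | tr -d ' \n')"
  [ "$magic" = "1f8b" ]
}
```

A short text error body saved under a .tar.gz name fails this check, which gives a clearer signal than the downstream "not in gzip format" message from tar.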

Double check the following

  • Lint is passing (npm run check:lint)
  • Self-review completed using the steps outlined here
  • PR title follows the correct format: type(NODE-xxxx)[!]: description
    • Example: feat(NODE-1234)!: rewriting everything in coffeescript
  • Changes are covered by tests
  • New TODOs have a related JIRA ticket

The zstd source tarball is downloaded from GitHub releases via an
unauthenticated curl in etc/install-zstd.sh. When many CI jobs run in
parallel they hit GitHub's rate limit, receiving a ~92-byte error
response that causes tar to fail with "not in gzip format".

- Add --retry 5 / --retry-delay 5 / --retry-all-errors / --fail to
  the curl invocation so rate-limited requests are retried rather than
  silently producing a corrupt archive
- Download to a temp file before extracting so curl and tar failures
  are decoupled and clearly attributed
- Make the script idempotent: skip download and build if deps/zstd/out
  already exists (enables cache restoration)
- Add actions/cache@v5 for host tests, keyed by OS/arch and hashes of
  package.json + package-lock.json; matrix jobs with the same key share
  the cache, so once it is populated later jobs skip the download
- Add Docker Buildx GHA cache (--cache-from/--cache-to type=gha) to
  container tests with per arch+node scopes
- Remove --no-cache from the musl buildx command which was explicitly
  defeating layer caching
- Restructure both Dockerfiles to COPY package.json, package-lock.json
  and the install script before running npm run install-zstd, then COPY
  the rest; this pins the zstd download layer to those files only so it
  is not invalidated on every source commit
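
The script changes listed above (retrying curl, downloading to a temp file, skipping when the build output exists) can be sketched roughly as follows. This is not the contents of etc/install-zstd.sh: the function names, the ZSTD_DIR variable, the download URL layout, and the extra curl flags (--location, --silent, --show-error) are assumptions for illustration. Only the retry flags, the version, and the deps/zstd/out path come from the PR text.

```shell
ZSTD_VERSION="1.5.6"                 # version named in the PR description
ZSTD_DIR="${ZSTD_DIR:-deps/zstd}"    # assumed source dir; the PR names deps/zstd/out
# Assumed URL layout for the GitHub release asset.
ZSTD_URL="https://github.com/facebook/zstd/releases/download/v${ZSTD_VERSION}/zstd-${ZSTD_VERSION}.tar.gz"

# Download to the given path. --fail turns HTTP errors into a non-zero
# exit, and the retry flags are the ones listed in the commit message.
download_zstd() {
  local dest="$1"
  curl --location --silent --show-error --fail \
       --retry 5 --retry-delay 5 --retry-all-errors \
       --output "$dest" "$ZSTD_URL"
}

install_zstd() {
  # Idempotent path: a restored cache leaves the build output in place.
  if [ -d "$ZSTD_DIR/out" ]; then
    echo "zstd already built at $ZSTD_DIR/out, skipping"
    return 0
  fi

  local tarball
  tarball="$(mktemp)" || return 1

  # Download and extraction are separate steps so each failure is
  # reported on its own.
  if ! download_zstd "$tarball"; then
    echo "zstd download failed" >&2
    rm -f "$tarball"
    return 1
  fi

  mkdir -p "$ZSTD_DIR"
  if ! tar -xzf "$tarball" -C "$ZSTD_DIR" --strip-components=1; then
    echo "zstd extraction failed" >&2
    rm -f "$tarball"
    return 1
  fi
  rm -f "$tarball"
  # The compile step that produces "$ZSTD_DIR/out" is omitted from this sketch.
}
```

Calling install_zstd twice in a row would download only while $ZSTD_DIR/out is absent. Note that --retry-all-errors needs a reasonably recent curl (7.71.0 or later).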
Copilot AI review requested due to automatic review settings June 8, 2026 08:26
tadjik1 requested a review from a team as a code owner June 8, 2026 08:26
tadjik1 marked this pull request as draft June 8, 2026 08:28

Copilot AI left a comment

Pull request overview

This PR aims to reduce CI flakiness and speed up builds by making the zstd source download/build more resilient (retries, temp file) and enabling caching across CI runs (actions/cache for host builds; Buildx GHA cache for container builds).

Changes:

  • Updates etc/install-zstd.sh to retry downloads, download to a temp file before extraction, and skip work when a prior build is present.
  • Adds actions/cache@v5 to cache the built deps directory for host test matrix jobs.
  • Enables Docker Buildx GHA cache for glibc/musl container builds and restructures Dockerfiles to maximize layer reuse.
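
The cache keying described for the host jobs (OS/arch plus hashes of package.json and package-lock.json) can be illustrated outside of a workflow. In the workflow itself the key comes from an expression passed to actions/cache; the shell functions below only mimic the idea, and their names and the zstd-deps- prefix are hypothetical.

```shell
# Hash the concatenated contents of the given files with SHA-256.
# Falls back to shasum where sha256sum is unavailable (e.g. macOS).
hash_files() {
  if command -v sha256sum >/dev/null 2>&1; then
    cat "$@" | sha256sum | cut -d' ' -f1
  else
    cat "$@" | shasum -a 256 | cut -d' ' -f1
  fi
}

# Build a key from OS, architecture, and the manifest hashes, so the
# cache is reused until either manifest file changes.
cache_key() {
  local dir="${1:-.}"
  local digest
  digest="$(hash_files "$dir/package.json" "$dir/package-lock.json")" || return 1
  printf 'zstd-deps-%s-%s-%s\n' "$(uname -s)" "$(uname -m)" "$digest"
}
```

Two jobs on the same OS and architecture with unchanged manifests compute the same key and so can restore the same cached deps directory.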

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| etc/install-zstd.sh | Adds retry + temp-file download and introduces an idempotent “skip if built” path. |
| .github/workflows/test.yml | Adds host-side deps caching and enables Buildx GHA cache for container builds. |
| .github/docker/Dockerfile.musl | Reorders COPY/RUN steps to isolate zstd build into a stable cacheable layer. |
| .github/docker/Dockerfile.glibc | Reorders COPY/RUN steps to isolate zstd build into a stable cacheable layer. |

Comment thread etc/install-zstd.sh Outdated
Comment thread etc/install-zstd.sh
Comment thread .github/workflows/test.yml
tadjik1 added 4 commits June 8, 2026 10:32
The type=gha cache never functioned: raw `docker buildx build` does not
receive the Actions cache-service token env vars (only build-push-action
or third-party token-export actions provide them), so import/export
silently no-op'd and every container build ran from scratch anyway.

Rather than pull in a third-party action to expose the runtime token,
drop container-layer caching entirely. The download flakiness that
originally broke CI is fixed by the retry loop in install-zstd.sh, and
QEMU hangs by timeout-minutes; layer caching was only a speed
optimization. Revert the Dockerfile COPY reordering since its sole
purpose was cache granularity.

Host-job actions/cache (which does work) and the container timeout are
retained.
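
The silent no-op described in this commit message could be made visible with a guard like the one below. This is not something the PR does (it drops container caching instead); it is a sketch that assumes the cache backend relies on the ACTIONS_RUNTIME_TOKEN and ACTIONS_CACHE_URL (or ACTIONS_RESULTS_URL) environment variables, and the function name gha_cache_flags is hypothetical.

```shell
# Print buildx GHA cache flags only when the Actions cache credentials
# are visible to the current step; otherwise warn on stderr.
gha_cache_flags() {
  local scope="$1"
  if [ -n "${ACTIONS_RUNTIME_TOKEN:-}" ] \
     && [ -n "${ACTIONS_CACHE_URL:-}${ACTIONS_RESULTS_URL:-}" ]; then
    printf '%s\n' "--cache-from type=gha,scope=${scope} --cache-to type=gha,mode=max,scope=${scope}"
  else
    echo "GHA cache credentials not visible to this step; building without cache" >&2
  fi
}
```

In a plain run step the function prints nothing to stdout and emits the warning, which matches the behaviour reported above: the build proceeds from scratch, but the reason is now logged.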
tadjik1 marked this pull request as ready for review June 8, 2026 10:40

2 participants