ci: introduce dynamic matrix balancer for GHA #17036

Draft
ohmayr wants to merge 1 commit into main from add-gha-matrix-balancer


ohmayr (Contributor) commented May 12, 2026

Adds gha_matrix_balancer.py to dynamically chunk and distribute Python packages across GitHub Actions runners.

This orchestrator sets the foundation to replace legacy sequential bash loops, eliminating massive execution latency and 6-hour CI timeouts.
It provides:

  1. Protection against GitHub's 256-job limit via dynamic bucketing.
  2. "Heavy Lifter" isolation, ensuring complex packages (e.g., spanner) get dedicated VMs and don't bottleneck smaller GAPIC clients.
  3. Clean GHA UI labels for clear developer telemetry.

Note: This is a zero-risk foundation PR. It adds the script and tests but does not yet modify any active .yaml workflows.
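
For readers skimming the description, the bucketing idea in items 1 and 2 can be sketched roughly as follows. This is not the code in gha_matrix_balancer.py, which is not shown in this conversation. The HEAVY_LIFTERS set, the sketch_distribute and sketch_label names, the round-robin strategy, and the "+ N more" label format are all assumptions made for illustration.

```python
from typing import List

# Hypothetical set; the PR only names spanner as an example of a heavy package.
HEAVY_LIFTERS = {"google-cloud-spanner"}


def sketch_distribute(packages: List[str], max_buckets: int) -> List[List[str]]:
    """Give heavy packages dedicated buckets, then round-robin the rest."""
    if max_buckets < 1:
        raise ValueError("max_buckets must be at least 1")
    names = sorted(packages)
    heavy = [p for p in names if p in HEAVY_LIFTERS]
    light = [p for p in names if p not in HEAVY_LIFTERS]

    # Keep one bucket free whenever something still has to share a VM.
    reserve = 1 if (light or len(heavy) > max_buckets) else 0
    dedicated = heavy[: max(0, max_buckets - reserve)]
    overflow = heavy[len(dedicated):]
    buckets = [[p] for p in dedicated]

    shared = light + overflow
    if shared:
        n_shared = min(len(shared), max_buckets - len(buckets))
        shared_buckets: List[List[str]] = [[] for _ in range(n_shared)]
        for i, pkg in enumerate(shared):
            shared_buckets[i % n_shared].append(pkg)
        buckets.extend(shared_buckets)
    return buckets


def sketch_label(bucket: List[str]) -> str:
    """Short job label for the GHA UI (format is an assumption)."""
    extra = len(bucket) - 1
    return bucket[0] if extra == 0 else f"{bucket[0]} + {extra} more"


demo = sketch_distribute(["google-cloud-spanner", "a", "b", "c", "d"], 3)
print(demo)
print([sketch_label(b) for b in demo])
```

With three buckets available, the heavy package lands alone in the first bucket and the four light packages are split across the remaining two, so the job count never exceeds max_buckets regardless of how many packages changed.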

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces a GitHub Actions matrix balancer script and its corresponding unit tests to optimize CI performance by isolating heavy-load packages. The review feedback correctly identifies several robustness issues: the script needs to ensure a valid JSON output even when no packages are found to prevent workflow failures, handle potential division-by-zero or value errors in bucket calculations, and normalize input paths to correctly identify packages and generate job labels.

Comment on lines +56 to +63
    if not packages:
        return

    # Protect against GitHub's 256-job hard limit
    max_buckets = min(250 // args.matrix_multiplier, args.max_vms)
    buckets = distribute_packages(packages, max_buckets)

    jobs_json = json.dumps(build_github_actions_jobs(buckets))

high

This section has two robustness issues:

  1. If no valid packages are found, the script returns early without setting the buckets output. This will cause fromJson() to fail in GitHub Actions workflows. It is better to always output a valid JSON array (e.g., []).
  2. The max_buckets calculation can crash with a ZeroDivisionError if matrix-multiplier is 0, or cause a ValueError in distribute_packages if max_buckets evaluates to 0 (which happens if matrix-multiplier > 250 or max-vms is 0).
    jobs = []
    if packages:
        # Protect against GitHub's 256-job hard limit and ensure at least 1 bucket
        multiplier = max(1, args.matrix_multiplier)
        max_buckets = max(1, min(250 // multiplier, args.max_vms))

        buckets = distribute_packages(packages, max_buckets)
        jobs = build_github_actions_jobs(buckets)

    jobs_json = json.dumps(jobs)
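
To make the edge cases concrete, the clamping arithmetic from the suggestion can be exercised in isolation. The snippet below is a minimal sketch: safe_max_buckets and format_output are hypothetical helper names, not functions from the PR, and the 250 budget is copied from the reviewed code. format_output only builds the name=value line; in a real step that line would be appended to the file named by the GITHUB_OUTPUT environment variable.

```python
import json
from typing import Any, List

JOB_BUDGET = 250  # headroom under GitHub's 256-job limit, as in the reviewed code


def safe_max_buckets(matrix_multiplier: int, max_vms: int) -> int:
    """Clamp both inputs so the result is always at least 1."""
    multiplier = max(1, matrix_multiplier)  # avoids ZeroDivisionError
    return max(1, min(JOB_BUDGET // multiplier, max_vms))


def format_output(name: str, jobs: List[Any]) -> str:
    """Build the name=value line for a step output; always valid JSON."""
    return f"{name}={json.dumps(jobs)}"


print(safe_max_buckets(3, 20))    # 20: 250 // 3 = 83, capped by max_vms
print(safe_max_buckets(0, 20))    # 20: multiplier 0 is treated as 1
print(safe_max_buckets(300, 20))  # 1: 250 // 300 = 0, raised to 1
print(safe_max_buckets(3, 0))     # 1: max_vms 0 is raised to 1
print(format_output("buckets", []))  # buckets=[]
```

Emitting buckets=[] when nothing changed keeps fromJson() happy, although a downstream job that uses the array as its matrix would likely still need an if: guard to skip the empty case.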

parser.add_argument("--max-vms", type=int, default=20)
args = parser.parse_args()

changed_dirs = os.environ.get("CHANGED_DIRS", "").split()

medium

If CHANGED_DIRS contains paths with trailing slashes (e.g., packages/my-pkg/), os.path.basename() will return an empty string. This breaks the heavy lifter detection and results in invalid job labels like + 5. Normalizing the paths ensures consistent behavior across different environments and input formats.

Suggested change
- changed_dirs = os.environ.get("CHANGED_DIRS", "").split()
+ changed_dirs = [os.path.normpath(d) for d in os.environ.get("CHANGED_DIRS", "").split()]
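
The trailing-slash behavior is easy to reproduce outside the workflow. The sketch below assumes POSIX-style paths, as on a Linux runner. package_names is a hypothetical helper written for this illustration, not a function from the PR.

```python
import os
from typing import List


def package_names(changed_dirs_env: str) -> List[str]:
    """Normalize each whitespace-separated path, then take its last component."""
    return [
        os.path.basename(os.path.normpath(d))
        for d in changed_dirs_env.split()
    ]


# Without normalization, a trailing slash yields an empty package name.
print(repr(os.path.basename("packages/my-pkg/")))  # ''

# With normalization, both spellings resolve to the same name.
print(package_names("packages/my-pkg/ packages/my-pkg"))  # ['my-pkg', 'my-pkg']

# An unset or empty CHANGED_DIRS still produces an empty list.
print(package_names(""))  # []
```

Because str.split() with no arguments drops empty tokens, normpath is never called on an empty string here, so the "." result it returns for empty input does not leak into the package list.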
