fix: wire four dropped injection seams (F1–F4 from the latent-bug sweep) by theagenticguy · Pull Request #263 · theagenticguy/opencodehub

theagenticguy · 2026-06-29T17:07:49Z

Summary

Fixes the four latent bugs from the sweep — all the same class: an injection seam that defaults empty/no-op when unset, where the tests always inject a real value (so the suite stays green) but the production caller forgot to wire it, leaving the feature silently hollow with exit 0. Each fix ships with a regression test that exercises the production path (the blind spot the class hides in).

The four fixes (one commit each)

F4 — fix(ingestion): --allow-build-scripts is no longer a dead flag. The flag was parsed and passed to runAnalyze, then dropped: analyze.ts never spread it into the pipeline options, PipelineOptions had no such field, and scip-index.ts read only the CODEHUB_ALLOW_BUILD_SCRIPTS env var. Wired through all three layers; the COBOL proleap deep-parse + rust/java/kotlin build-script indexers are now reachable via the documented flag. Regression: the orchestrator "threads every PipelineOptions field" test now covers it.

F2 — fix(mcp): MCP scan no longer silently lints zero OpenAPI contracts. The MCP tool called createDefaultWrappers(specs) with no context, so Spectral's contractFiles defaulted empty. Lifted the file-discovery into a shared buildScannerFileContext in @opencodehub/scanners that both the CLI and MCP now call (killing the duplication + the drift). The CLI's two private discoverers are deleted.

F3 — fix(analysis): verdict ownership policies can pass on graph-owned paths. ownersByPath was hardcoded empty, so an ownership_required rule with empty require_approval_from (relying on graph owners) unconditionally blocked with "no owners". Added collectOwnersByPath (one batched OWNED_BY query, per-path, keyed by contributor email) and threaded it in — only when a policy is loaded, so the starter repo pays no graph walk.

F1 — fix(analysis): scan.sarif now carries graph signals. enrichWithProperties had zero production callers. Added buildScanEnrichment (maps each result to its File node's bus factor + fix-follow-feat density) and ran it in the scan pipeline after fingerprinting. Keyed by primaryLocationLineHash (run-structure-independent) and normalizing absolute result uris to the repo-relative File-node key — without that normalization the enrichment matched nothing, which would have been the same latent-bug class.

Scope note on F1

Only the file-granular signals that are a direct, cheap read off the File node are emitted (busFactor, temporalFixDensity). ownershipDrift/cochangeScore (community/temporal-table) and the symbol-level signals (blastRadius/community/centrality, which need per-finding runImpact/PageRank) are deliberately omitted rather than approximated — a follow-up can add them behind a per-finding budget. Every ResultEnrichment field is optional, so this is honest, not lossy.

Verification

Full workspace test suite: 0 failures. New regression tests: orchestrator field round-trip (F4), file-context.test.ts (F2), owners.test.ts (F3), scan-enrich.test.ts incl. the absolute-uri normalization (F1).
End-to-end on a real repo: --allow-build-scripts populates ast-chunks/commit/pins; MCP scan wires contract files; verdict ownership rule resolves graph owners; scan.sarif finding carries properties.opencodehub = {busFactor: 1} and is byte-identical across two scans.
Typecheck, repo-wide Biome, banned-strings all clean locally.

🤖 Generated with Claude Code

…ndex phase `codehub analyze --allow-build-scripts proleap` parsed the flag and passed it into runAnalyze, but analyze.ts never spread allowBuildScripts into the pipeline options, the PipelineOptions type had no such field, and scip-index.ts read only the CODEHUB_ALLOW_BUILD_SCRIPTS env var. So the documented invocation that "wakes the JVM COBOL deep-parse bridge" silently ran the regex hot path only — exit 0, no warning — and the rust/java/kotlin build-script indexers were unreachable except via the undocumented env var. Wire it through all three layers: - add `allowBuildScripts?: readonly "proleap"[]` to PipelineOptions; stripPhaseKeys already forwards every PipelineOptions field to ctx.options. - spread opts.allowBuildScripts into analyze.ts pipelineOptions. - scip-index.ts reads ctx.options.allowBuildScripts (env var kept as fallback), appends the cobol-proleap indexer to candidates when proleap is opted-in on a COBOL project, and passes allowedBuildScripts to runIndexer (which still gates). Regression guard: the orchestrator "threads every PipelineOptions field" test now covers allowBuildScripts, so a future dropped field surfaces immediately. ingestion 638 + cli 346 tests green.

The MCP `scan` tool called createDefaultWrappers(specs) with no context arg, so Spectral's contractFiles defaulted empty and it short-circuited to "no API contract files to lint" — an agent scanning an OpenAPI repo over MCP got 0 findings even though specs existed on disk and the profile selected Spectral. The CLI path wired it; the MCP path skipped exactly that step. Lift the file-discovery (findOpenApiFiles, findDockerfiles) into a shared `buildScannerFileContext` in @opencodehub/scanners (pure node:fs walks, no new deps) and call it from BOTH surfaces, so they can never drift again: - MCP scan.ts builds a DefaultWrapperContext (shared file discovery + checkov frameworks from the profile) and passes it to createDefaultWrappers. - CLI scan.ts now composes on the shared discoverer; its two private findDockerfiles/findOpenApiFiles copies are deleted. Regression guard: file-context.test.ts covers contract/Dockerfile discovery (incl. node_modules/.git/.codehub exclusion) and that buildScannerFileContext populates spectral/hadolint only for selected specs. scanners 94 + mcp 209 + cli 346 tests green.

…can pass The verdict CLI hardcoded `ownersByPath: new Map()`, so an `ownership_required` policy rule with an empty `require_approval_from` (relying on graph-derived owners) hit the evaluator's `acceptable.size === 0` branch and unconditionally blocked with "path has no owners" — it could never pass in production, even though OWNED_BY edges existed and the verdict already derives recommended reviewers from them. Add `collectOwnersByPath(graph, files)` to @opencodehub/analysis: one batched OWNED_BY edge query over the changed-file node ids, grouped per path, keyed by contributor email (falling back to email hash). The verdict CLI calls it over verdict.changedFiles — only when a policy is actually loaded, so the starter (no-policy) repo pays no graph walk — and threads the map into buildPolicyContext. Defensive: a store missing the edge/node readers (minimal test fakes) yields an empty map rather than throwing, preserving prior behavior. Team/handle reconciliation (mapping an email to a GitHub team) stays a separate design item; operators bridge it via `require_approval_from` as before. Regression guard: owners.test.ts covers the per-path mapping, the no-owner-edge omission, the emailHash fallback, and per-path isolation. analysis 159 + cli 346 tests green.

`enrichWithProperties` (stamps graph-derived signals under `properties.opencodehub.*` on each scan result) had zero production callers — only its own tests. So `scan.sarif` shipped with none of the documented graph signals, even though the data sits in the indexed graph. Add `buildScanEnrichment(graph, sarif, repoPath)` in @opencodehub/analysis: it maps each result's primary-location file to its File node and reads the file-granular signals already materialized there — bus factor and fix-follow-feat density (mapped to temporalFixDensity). The scan CLI runs it after `enrichWithFingerprints` (the enrichment is keyed by `primaryLocationLineHash`) and before baseline/suppressions, opening the scanned repo's graph read-only best-effort — a missing/empty graph leaves the SARIF untouched, like the existing ingest step. Two correctness points the e2e surfaced: - result uris are normalized to the repo-relative form the File node id uses; scanners emit absolute uris, so without stripping the repoPath prefix the lookup silently matched nothing — itself the same latent-bug class. Covered by a dedicated test. - keyed by fingerprint, not result index: scan merges each scanner into its own SARIF run and the enricher indexes per-run, so a global index misaligns. - run-level stamp carries no clock/run id, so a re-scan stays byte-identical. Scope: ownershipDrift/cochangeScore (community/temporal-table) and the symbol-level signals (blastRadius/community/centrality, which need per-finding runImpact/PageRank) are deliberately omitted rather than approximated — a follow-up can add them behind a per-finding budget. All ResultEnrichment fields are optional, so this is honest, not lossy. Verified end-to-end: a scanned repo's finding now carries result.properties.opencodehub = {busFactor: 1}, and two scans are byte-identical. analysis 164 + cli 346 tests green; full workspace 0 failures.

CI runs `biome ci .`, which enforces the organize-imports/exports assist that `biome check .` only auto-fixes locally. The new buildScanEnrichment / buildScannerFileContext exports landed out of sorted order; reorder them. No behavior change — export statements only.

Windows CI caught a real cross-platform defect: `findOpenApiFiles` / `findDockerfiles` returned `path.relative` output with the OS separator, so on Windows they yielded `api\swagger.json`. The graph keys File nodes by `/`-joined repo-relative paths and the scanner wrappers expect the same, so the backslash form would never match — Spectral/hadolint wiring (F2) and the scan enrichment lookup (F1) would silently miss on Windows, the very class these fixes close. Normalize walked paths to `/` in file-context's walker, and harden scan-enrich's `toRepoRelative` to fold backslashes before the prefix-strip so the uri and repoPath compare consistently on Windows. The existing file-context test (asserts `api/swagger.json`) is the regression guard; it failed on windows-latest and passes after the fix. scanners 99 + analysis 164 tests green; biome ci clean.

🤖 Automated release via release-please --- <details><summary>root: 0.10.3</summary> ## [0.10.3](root-v0.10.2...root-v0.10.3) (2026-06-29) ### Features * **frameworks:** wire stage-3 config-AST evidence into detection ([#264](#264)) ([18e08b2](18e08b2)) * **pack:** context-bom read-receipt (9th BOM item) + real production provenance ([#261](#261)) ([b936af2](b936af2)) ### Bug Fixes * wire four dropped injection seams (F1–F4 from the latent-bug sweep) ([#263](#263)) ([dde590e](dde590e)) </details> <details><summary>cli: 0.10.3</summary> ## [0.10.3](cli-v0.10.2...cli-v0.10.3) (2026-06-29) ### Features * **pack:** context-bom read-receipt (9th BOM item) + real production provenance ([#261](#261)) ([b936af2](b936af2)) ### Bug Fixes * wire four dropped injection seams (F1–F4 from the latent-bug sweep) ([#263](#263)) ([dde590e](dde590e)) </details> --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Laith Al-Saadoon <9553966+theagenticguy@users.noreply.github.com>

theagenticguy added 6 commits June 29, 2026 16:40

theagenticguy merged commit dde590e into main Jun 29, 2026
38 checks passed

theagenticguy deleted the fix/wire-dropped-seams-f1-f4 branch June 29, 2026 17:22

github-actions Bot mentioned this pull request Jun 29, 2026

chore: release main #262

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: wire four dropped injection seams (F1–F4 from the latent-bug sweep)#263

fix: wire four dropped injection seams (F1–F4 from the latent-bug sweep)#263
theagenticguy merged 6 commits into
mainfrom
fix/wire-dropped-seams-f1-f4

theagenticguy commented Jun 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

theagenticguy commented Jun 29, 2026

Summary

The four fixes (one commit each)

Scope note on F1

Verification

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant