fix: wire four dropped injection seams (F1–F4 from the latent-bug sweep)#263
Merged
Conversation
…ndex phase `codehub analyze --allow-build-scripts proleap` parsed the flag and passed it into runAnalyze, but analyze.ts never spread allowBuildScripts into the pipeline options, the PipelineOptions type had no such field, and scip-index.ts read only the CODEHUB_ALLOW_BUILD_SCRIPTS env var. So the documented invocation that "wakes the JVM COBOL deep-parse bridge" silently ran the regex hot path only — exit 0, no warning — and the rust/java/kotlin build-script indexers were unreachable except via the undocumented env var. Wire it through all three layers: - add `allowBuildScripts?: readonly "proleap"[]` to PipelineOptions; stripPhaseKeys already forwards every PipelineOptions field to ctx.options. - spread opts.allowBuildScripts into analyze.ts pipelineOptions. - scip-index.ts reads ctx.options.allowBuildScripts (env var kept as fallback), appends the cobol-proleap indexer to candidates when proleap is opted-in on a COBOL project, and passes allowedBuildScripts to runIndexer (which still gates). Regression guard: the orchestrator "threads every PipelineOptions field" test now covers allowBuildScripts, so a future dropped field surfaces immediately. ingestion 638 + cli 346 tests green.
The MCP `scan` tool called createDefaultWrappers(specs) with no context arg, so Spectral's contractFiles defaulted empty and it short-circuited to "no API contract files to lint" — an agent scanning an OpenAPI repo over MCP got 0 findings even though specs existed on disk and the profile selected Spectral. The CLI path wired it; the MCP path skipped exactly that step. Lift the file-discovery (findOpenApiFiles, findDockerfiles) into a shared `buildScannerFileContext` in @opencodehub/scanners (pure node:fs walks, no new deps) and call it from BOTH surfaces, so they can never drift again: - MCP scan.ts builds a DefaultWrapperContext (shared file discovery + checkov frameworks from the profile) and passes it to createDefaultWrappers. - CLI scan.ts now composes on the shared discoverer; its two private findDockerfiles/findOpenApiFiles copies are deleted. Regression guard: file-context.test.ts covers contract/Dockerfile discovery (incl. node_modules/.git/.codehub exclusion) and that buildScannerFileContext populates spectral/hadolint only for selected specs. scanners 94 + mcp 209 + cli 346 tests green.
…can pass The verdict CLI hardcoded `ownersByPath: new Map()`, so an `ownership_required` policy rule with an empty `require_approval_from` (relying on graph-derived owners) hit the evaluator's `acceptable.size === 0` branch and unconditionally blocked with "path has no owners" — it could never pass in production, even though OWNED_BY edges existed and the verdict already derives recommended reviewers from them. Add `collectOwnersByPath(graph, files)` to @opencodehub/analysis: one batched OWNED_BY edge query over the changed-file node ids, grouped per path, keyed by contributor email (falling back to email hash). The verdict CLI calls it over verdict.changedFiles — only when a policy is actually loaded, so the starter (no-policy) repo pays no graph walk — and threads the map into buildPolicyContext. Defensive: a store missing the edge/node readers (minimal test fakes) yields an empty map rather than throwing, preserving prior behavior. Team/handle reconciliation (mapping an email to a GitHub team) stays a separate design item; operators bridge it via `require_approval_from` as before. Regression guard: owners.test.ts covers the per-path mapping, the no-owner-edge omission, the emailHash fallback, and per-path isolation. analysis 159 + cli 346 tests green.
`enrichWithProperties` (stamps graph-derived signals under
`properties.opencodehub.*` on each scan result) had zero production callers —
only its own tests. So `scan.sarif` shipped with none of the documented graph
signals, even though the data sits in the indexed graph.
Add `buildScanEnrichment(graph, sarif, repoPath)` in @opencodehub/analysis: it
maps each result's primary-location file to its File node and reads the
file-granular signals already materialized there — bus factor and
fix-follow-feat density (mapped to temporalFixDensity). The scan CLI runs it
after `enrichWithFingerprints` (the enrichment is keyed by
`primaryLocationLineHash`) and before baseline/suppressions, opening the
scanned repo's graph read-only best-effort — a missing/empty graph leaves the
SARIF untouched, like the existing ingest step.
Two correctness points the e2e surfaced:
- result uris are normalized to the repo-relative form the File node id uses;
scanners emit absolute uris, so without stripping the repoPath prefix the
lookup silently matched nothing — itself the same latent-bug class. Covered
by a dedicated test.
- keyed by fingerprint, not result index: scan merges each scanner into its
own SARIF run and the enricher indexes per-run, so a global index misaligns.
- run-level stamp carries no clock/run id, so a re-scan stays byte-identical.
Scope: ownershipDrift/cochangeScore (community/temporal-table) and the
symbol-level signals (blastRadius/community/centrality, which need per-finding
runImpact/PageRank) are deliberately omitted rather than approximated — a
follow-up can add them behind a per-finding budget. All ResultEnrichment fields
are optional, so this is honest, not lossy.
Verified end-to-end: a scanned repo's finding now carries
result.properties.opencodehub = {busFactor: 1}, and two scans are
byte-identical. analysis 164 + cli 346 tests green; full workspace 0 failures.
CI runs `biome ci .`, which enforces the organize-imports/exports assist that `biome check .` only auto-fixes locally. The new buildScanEnrichment / buildScannerFileContext exports landed out of sorted order; reorder them. No behavior change — export statements only.
Windows CI caught a real cross-platform defect: `findOpenApiFiles` / `findDockerfiles` returned `path.relative` output with the OS separator, so on Windows they yielded `api\swagger.json`. The graph keys File nodes by `/`-joined repo-relative paths and the scanner wrappers expect the same, so the backslash form would never match — Spectral/hadolint wiring (F2) and the scan enrichment lookup (F1) would silently miss on Windows, the very class these fixes close. Normalize walked paths to `/` in file-context's walker, and harden scan-enrich's `toRepoRelative` to fold backslashes before the prefix-strip so the uri and repoPath compare consistently on Windows. The existing file-context test (asserts `api/swagger.json`) is the regression guard; it failed on windows-latest and passes after the fix. scanners 99 + analysis 164 tests green; biome ci clean.
Merged
theagenticguy
added a commit
that referenced
this pull request
Jun 29, 2026
🤖 Automated release via release-please --- <details><summary>root: 0.10.3</summary> ## [0.10.3](root-v0.10.2...root-v0.10.3) (2026-06-29) ### Features * **frameworks:** wire stage-3 config-AST evidence into detection ([#264](#264)) ([18e08b2](18e08b2)) * **pack:** context-bom read-receipt (9th BOM item) + real production provenance ([#261](#261)) ([b936af2](b936af2)) ### Bug Fixes * wire four dropped injection seams (F1–F4 from the latent-bug sweep) ([#263](#263)) ([dde590e](dde590e)) </details> <details><summary>cli: 0.10.3</summary> ## [0.10.3](cli-v0.10.2...cli-v0.10.3) (2026-06-29) ### Features * **pack:** context-bom read-receipt (9th BOM item) + real production provenance ([#261](#261)) ([b936af2](b936af2)) ### Bug Fixes * wire four dropped injection seams (F1–F4 from the latent-bug sweep) ([#263](#263)) ([dde590e](dde590e)) </details> --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Laith Al-Saadoon <9553966+theagenticguy@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the four latent bugs from the sweep — all the same class: an injection seam that defaults empty/no-op when unset, where the tests always inject a real value (so the suite stays green) but the production caller forgot to wire it, leaving the feature silently hollow with exit 0. Each fix ships with a regression test that exercises the production path (the blind spot the class hides in).
The four fixes (one commit each)
F4 —
fix(ingestion):--allow-build-scriptsis no longer a dead flag. The flag was parsed and passed torunAnalyze, then dropped:analyze.tsnever spread it into the pipeline options,PipelineOptionshad no such field, andscip-index.tsread only theCODEHUB_ALLOW_BUILD_SCRIPTSenv var. Wired through all three layers; the COBOL proleap deep-parse + rust/java/kotlin build-script indexers are now reachable via the documented flag. Regression: the orchestrator "threads every PipelineOptions field" test now covers it.F2 —
fix(mcp): MCPscanno longer silently lints zero OpenAPI contracts. The MCP tool calledcreateDefaultWrappers(specs)with no context, so Spectral'scontractFilesdefaulted empty. Lifted the file-discovery into a sharedbuildScannerFileContextin@opencodehub/scannersthat both the CLI and MCP now call (killing the duplication + the drift). The CLI's two private discoverers are deleted.F3 —
fix(analysis): verdict ownership policies can pass on graph-owned paths.ownersByPathwas hardcoded empty, so anownership_requiredrule with emptyrequire_approval_from(relying on graph owners) unconditionally blocked with "no owners". AddedcollectOwnersByPath(one batched OWNED_BY query, per-path, keyed by contributor email) and threaded it in — only when a policy is loaded, so the starter repo pays no graph walk.F1 —
fix(analysis):scan.sarifnow carries graph signals.enrichWithPropertieshad zero production callers. AddedbuildScanEnrichment(maps each result to its File node's bus factor + fix-follow-feat density) and ran it in the scan pipeline after fingerprinting. Keyed byprimaryLocationLineHash(run-structure-independent) and normalizing absolute result uris to the repo-relative File-node key — without that normalization the enrichment matched nothing, which would have been the same latent-bug class.Scope note on F1
Only the file-granular signals that are a direct, cheap read off the File node are emitted (busFactor, temporalFixDensity).
ownershipDrift/cochangeScore(community/temporal-table) and the symbol-level signals (blastRadius/community/centrality, which need per-findingrunImpact/PageRank) are deliberately omitted rather than approximated — a follow-up can add them behind a per-finding budget. EveryResultEnrichmentfield is optional, so this is honest, not lossy.Verification
file-context.test.ts(F2),owners.test.ts(F3),scan-enrich.test.tsincl. the absolute-uri normalization (F1).--allow-build-scriptspopulates ast-chunks/commit/pins; MCP scan wires contract files; verdict ownership rule resolves graph owners; scan.sarif finding carriesproperties.opencodehub = {busFactor: 1}and is byte-identical across two scans.🤖 Generated with Claude Code