diff --git a/planning/README.md b/planning/README.md index e964e94..d952de4 100644 --- a/planning/README.md +++ b/planning/README.md @@ -70,7 +70,7 @@ carry **no** frontmatter — living prose, dated by git. ### Active -_None._ +- **[deep-audit](changes/active/2026-06-14.01-deep-audit/design.md)** (2026-06-14) — Full-codebase deep audit covering the perf/security/supply-chain gaps the 2026-06-07 audit skipped, plus correctness, concurrency, refactoring, and test quality. Report: [audits/2026-06-14-deep-audit.md](audits/2026-06-14-deep-audit.md) — 35 confirmed, 33 distinct after dedup (1 High, 4 Medium, 14 Low, 14 nits); headline is an `architecture/extras.md` pydantic-isolation accuracy bug. Report-only; confirmed findings spawn follow-up bundles. ### Archived (shipped) diff --git a/planning/audits/2026-06-14-deep-audit.md b/planning/audits/2026-06-14-deep-audit.md new file mode 100644 index 0000000..53ed8c6 --- /dev/null +++ b/planning/audits/2026-06-14-deep-audit.md @@ -0,0 +1,568 @@ +# httpware deep audit — 2026-06-14 + +**Status:** complete +**Method:** ten adversarial finders fanned out across the codebase → every candidate run through a 3-lens verify panel (code_reality, reproducer, spec_grounded) → only candidates surviving ≥2/3 lenses kept → single synthesis pass for triage, dedup, and report. + +## Summary + +35 confirmed findings survived verification (33 distinct after dedup: the two pydantic-import findings were folded into one, as were the two `middleware/__init__` `__all__` findings). Severity applied strictly: a missing test or a duplicated-but-not-yet-diverged block is not a bug, and a defect reachable only under a non-default knob is capped at low. 
+ +- Blockers: 0 +- High: 1 +- Medium: 4 +- Low: 14 +- Nits: 14 + +**Headline:** `architecture/extras.md` asserts that the pydantic extra is imported behind an `is__installed` guard *inside* `decoders/pydantic.py` "never at package top level" — but `pydantic.py:13` does `from pydantic import TypeAdapter` unconditionally at module top, so the documented isolation invariant (and its grep self-check) is false, and in a real no-pydantic environment the friendly `ImportError` guard is dead code, replaced by a bare `ModuleNotFoundError`. + +**Not covered:** no dynamic/runtime execution, fuzzing, or live-network testing was performed — all findings are static (source/test/doc reading plus single-call reproducer reasoning). Performance findings were uniformly refuted as micro-optimizations with no observable defect, so this pass yields no actionable performance work. No dependency-CVE / supply-chain scan was run beyond version-constraint inspection; no type-checker or linter was executed as part of the audit. + +## Findings + +### High + +#### `architecture/extras.md` claims the pydantic import is guarded inside `decoders/pydantic.py`, but it is unguarded at module top +*(accuracy / architecture_docs — verified)* + +`architecture/extras.md:22` + +The doc states the extra is imported "**inside** that module behind an `is__installed` guard … never at package top level," and offers a grep self-check that "returns exactly one indented line." In reality `decoders/pydantic.py:13` imports `TypeAdapter` unconditionally at module top, the line is not indented, and isolation is actually achieved by a lazy import in `client.py`'s `_build_default_decoders()`. The documented invariant and its verification command are both wrong; only the msgspec sibling matches the description. + +``` +The `import` of the extra happens **inside** that module behind an `is__installed` guard +from `_internal/import_checker.py` — never at package top level. 
… `grep -rnE 'from pydantic|import +pydantic' src/httpware/ | grep -v import_checker` returns exactly one indented line (the guarded +import in `decoders/pydantic.py`). +``` + +Panel 2/3: spec_grounded, spec_grounded. Suggested direction: reconcile the doc with reality — either describe the actual lazy-import-in-`client.py` isolation mechanism for pydantic, or treat the asymmetry as the source defect (see the two Medium pydantic-import findings below) and document a single consistent pattern. + +### Medium + +#### `decoders/pydantic.py` has an unguarded module-level pydantic import; the `__init__` fallback is dead code without the extra +*(optional_extras / correctness — verified)* + +`src/httpware/decoders/pydantic.py:13` + +Line 13 runs `from pydantic import TypeAdapter` unconditionally, so `import httpware.decoders.pydantic` (or `from … import PydanticDecoder`) raises `ModuleNotFoundError` at module-load time when pydantic is absent — before the friendly `ImportError(MISSING_DEPENDENCY_MESSAGE)` guard in `PydanticDecoder.__init__` can ever run. The guard only fires in the synthetic case where pydantic is installed but `is_pydantic_installed` is monkeypatched False. `decoders/msgspec.py` does this correctly with a module-level `if import_checker.is_msgspec_installed: import msgspec`. *(Folds two confirmed findings — the optional_extras "unguarded import" and the correctness "guard unreachable" — into one; same file:line, same root cause.)* + +```python +from pydantic import TypeAdapter + +from httpware._internal import import_checker + ... + def __init__(self) -> None: + if not import_checker.is_pydantic_installed: + raise ImportError(MISSING_DEPENDENCY_MESSAGE) +``` + +Panel 3/3: code_reality, reproducer, spec_grounded. Suggested direction: mirror the msgspec module-level conditional-import pattern so the module imports cleanly without the extra and the `__init__` guard becomes the real fail-fast path; this also fixes the High doc finding above. 
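
A minimal sketch of the msgspec-style conditional import applied to a pydantic decoder, for illustration only. The flag is computed here with `importlib.util.find_spec` as a stand-in for `import_checker.is_pydantic_installed`, and the message text is a placeholder; both are assumptions rather than the repository's actual code.

```python
import importlib.util

# Stand-in (assumption) for httpware._internal.import_checker.is_pydantic_installed.
is_pydantic_installed = importlib.util.find_spec("pydantic") is not None

# Placeholder text (assumption); the real constant lives in the package.
MISSING_DEPENDENCY_MESSAGE = "PydanticDecoder requires the 'pydantic' extra."

# Guarded module-level import: this module loads cleanly without pydantic.
if is_pydantic_installed:
    from pydantic import TypeAdapter


class PydanticDecoder:
    def __init__(self) -> None:
        # With the import guarded above, this check is the real fail-fast path
        # instead of being pre-empted by ModuleNotFoundError at import time.
        if not is_pydantic_installed:
            raise ImportError(MISSING_DEPENDENCY_MESSAGE)

    def decode(self, content: bytes, model: type) -> object:
        # Only reachable when pydantic is installed, so TypeAdapter is defined.
        return TypeAdapter(model).validate_json(content)
```

With this shape, importing the module never fails for lack of the extra, and constructing the decoder raises the friendly `ImportError` when pydantic is absent.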
+ +#### No response body size limit before deserialization — attacker-controlled server can drive unbounded allocation +*(deserialization-safety / security — verified)* + +`src/httpware/client.py:180` + +When `response_model` is provided, `send()` / `send_with_response()` (sync and async) read `response.content` — buffering the whole body — then hand the raw bytes to the decoder. There is no upper bound; httpx2 imposes no default max body size either. An attacker-controlled server can return an arbitrarily large body and force memory allocation proportional to it before any decode begins. + +```python +response = await self._dispatch(request) +try: + return decoder.decode(response.content, response_model) +``` + +Panel 2/3: code_reality, code_reality. Suggested direction: consider an opt-in max-decode-size guard at the decode seam (Seam B) that checks `Content-Length` / accumulated bytes before buffering, raising a typed `ClientError`; document that the streaming API does not help here because decode requires `.content`. + +#### `tests/test_error_mapping_terminal.py` covers AsyncClient only; the sync `Client._terminal` status-raising path has no parallel suite +*(test-coverage / tests — verified)* + +`tests/test_error_mapping_terminal.py:1` + +All 11 tests are `async def` against `AsyncClient`. The sync `Client._terminal` (`client.py:884`) calls the same `_raise_on_status_error`, but no suite exercises unknown-4xx→`ClientStatusError`, unknown-5xx→`ServerStatusError`, 3xx non-raise, or transport-exception mapping on the sync terminal; `test_client_sync.py` has a single 404 test and zero fallback-class tests. The invariants are unproven on the sync surface. + +``` +"""Tests for the AsyncClient internal terminal's exception mapping.""" +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: add sync mirrors using `Client(httpx2_client=httpx2.Client(transport=...))` covering the unknown-4xx/5xx fallback classes and the 3xx non-raise. 
*(Note: this is the broadest of several sync-parity test gaps; the narrower ones are bucketed Low/Nit below.)* + +#### `architecture/client.md` streaming section omits `Client.stream()` — documents only `AsyncClient.stream()` +*(accuracy / architecture_docs — verified)* + +`architecture/client.md:17` + +The doc says "`AsyncClient.stream()` provides a context-manager API … It bypasses the middleware chain by design," but `client.py:1496-1551` defines `Client.stream()` with identical chain-bypass semantics (its own docstring says "matches AsyncClient.stream() behavior"). A reader consulting the architecture doc would conclude the sync client has no streaming surface. + +``` +AsyncClient.stream() provides a context-manager API for chunked response bodies. It bypasses the +middleware chain by design. +``` + +Panel 3/3: code_reality, spec_grounded, spec_grounded. Suggested direction: add `Client.stream()` to the streaming section as the sync peer, noting both bypass the chain. + +### Low + +#### RetryBudget token withdrawn before the `Retry-After > max_delay` give-up check (sync and async) +*(correctness — verified)* + +`src/httpware/middleware/resilience/retry.py:162` + +In both `AsyncRetry.__call__` (line 162) and `Retry.__call__` (line 300), `budget.try_withdraw()` debits a token *before* the `retry_after > self.max_delay` guard (line 187 / 325). When a server's `Retry-After` exceeds `max_delay`, the middleware re-raises without retrying, yet a token has already been spent — a sustained `Retry-After`-flood drains shared-budget capacity in proportion to request rate, suppressing retries for unrelated well-behaved requests. Reachable only when `respect_retry_after` is on and a server sends an over-large header. + +```python +if not self.budget.try_withdraw(): + ... + raise RetryBudgetExhaustedError(...) from last_exc +... +if retry_after is not None and retry_after > self.max_delay: + ... + raise last_exc +``` + +Panel 2/3: code_reality, reproducer. 
Suggested direction: evaluate the `Retry-After > max_delay` give-up condition before withdrawing from the budget; mirror in both classes. + +#### `RetryBudget`'s `threading.Lock` can block the asyncio event-loop thread when shared sync↔async +*(concurrency — verified)* + +`src/httpware/middleware/resilience/budget.py:54` + +`deposit()` and `try_withdraw()` unconditionally acquire a `threading.Lock`. When one budget is shared by a sync `Client` (on a thread-pool thread) and an `AsyncClient` (on the loop thread), a sync thread holding the lock blocks the event-loop thread's acquisition, stalling all coroutines for the lock-hold duration. The docstring advertises "asyncio-safe" without qualifying that "safe" means no corruption, not non-blocking. + +``` +Thread-safe and asyncio-safe: all mutations go through a threading.Lock. +A single RetryBudget instance is safe to share across threads, across +coroutines on one event loop, and across (sync Client, AsyncClient) pairs +in the same process. +``` + +Panel 2/3: code_reality, spec_grounded. Suggested direction: qualify the docstring's "asyncio-safe" claim to clarify the blocking caveat, and/or keep the critical section minimal; a real fix is out of scope for a thin lock. + +#### `_parse_retry_after` swallows `ValueError` but not `OverflowError` — a crafted header crashes the retry loop +*(untrusted-response / error_contract — verified)* + +`src/httpware/middleware/resilience/retry.py:60` + +A `Retry-After` value of 309–4300 decimal digits makes `float(int(value))` raise `OverflowError`, which is not caught by `except ValueError`. The exception propagates unhandled through both `AsyncRetry` and `Retry`, surfacing to the caller as an unexpected crash instead of being treated as a malformed header. + +```python +return max(0.0, float(int(value))) # clamp: negative integers are malformed servers + except ValueError: + pass +``` + +Panel 2/3: code_reality, reproducer. 
Suggested direction: broaden the guard to `except (ValueError, OverflowError)` so any unparseable header degrades to "no Retry-After hint." + +#### Query-string secrets are logged unredacted in all resilience-middleware observability events +*(secret-leakage / security — verified)* + +`src/httpware/middleware/resilience/retry.py:155` + +Every resilience middleware emits `"url": str(request.url)` into log records and OTel span events. `str(request.url)` includes the full query string, so tokens embedded as query params (`?api_key=…`) are written to logs and telemetry across retry.py (lines 136/155/171/274/293/309), bulkhead.py (117/173), circuit_breaker.py (193), timeout.py (72). `errors.py` documents this gap for tracebacks, but the middleware applies no redaction. + +```python +attributes={ + "method": request.method, + "url": str(request.url), +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: introduce a shared `_redact_url_for_logs` helper (strip userinfo *and* query string) and route all middleware `url` attributes through it. + +#### Query-string credentials survive `_strip_userinfo` and appear verbatim in `StatusError.__str__`/`__repr__` +*(secret-leakage / security — verified)* + +`src/httpware/errors.py:7` + +The module docstring admits "Query-string secrets are NOT stripped here." Consequently `str(exc)`/`repr(exc)` for any `StatusError` include the full URL with query string, so `?access_token=…` / `?api_key=…` tokens land in exception messages, log lines, Sentry reports, and the notes `AsyncRetry` adds via `last_exc.add_note(...)`. + +``` +Query-string secrets are NOT stripped here. +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: extend the `_strip_userinfo` sanitizer (or add a sibling) to redact known-sensitive query parameters before composing the error summary; coordinate with the middleware redaction helper above. 
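
One possible shape for the shared redaction helper suggested in the two query-string findings above, as a hedged sketch. The name `redact_url_for_logs` and the `REDACTED` marker are hypothetical. This version drops userinfo, replaces any query string wholesale rather than filtering known-sensitive parameters, and discards the fragment.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url_for_logs(url: str) -> str:
    """Return a copy of ``url`` that is safe to put in logs and span events."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # ``hostname`` strips the brackets from IPv6 literals; restore them.
    if ":" in host:
        host = f"[{host}]"
    # Rebuild the authority from host and port only, which drops user:password@.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # Keep a marker so readers can tell a query string was present.
    query = "REDACTED" if parts.query else ""
    # The fragment is never sent to the server, so it is dropped as well.
    return urlunsplit((parts.scheme, netloc, parts.path, query, ""))


print(redact_url_for_logs("https://user:pw@example.test:8443/v1/items?api_key=secret"))
```

Replacing the whole query is the conservative choice; an allow-list of parameters known to be safe would keep more diagnostic detail at the cost of maintaining that list.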
+ +#### `StatusError.response.request` carries full request headers (`Authorization`, `Cookie`) reachable from any handler +*(secret-leakage / security — verified)* + +`src/httpware/errors.py:70` + +`StatusError` stores the whole `httpx2.Response`, which references the `httpx2.Request` and its outgoing headers. Any handler that logs or serializes a caught `StatusError` (e.g. `exc.response.request.headers`) exposes `Authorization`/`Cookie`/`Proxy-Authorization`; `__repr__` redacts only URL userinfo, not headers. This is a documented trust-boundary item rather than a bug, but downstream error handlers must be aware. + +```python +def _summary(self) -> str: + method = self.response.request.method + url = _strip_userinfo(str(self.response.request.url)) + return f"{self.response.status_code} {method} {url}" +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: add an explicit "secrets reachable via `exc.response.request`" callout to `architecture/errors.md` so handler authors redact before logging. + +#### `stream()` pre-reads the full error body unconditionally on 4xx/5xx +*(deserialization-safety / security — verified)* + +`src/httpware/client.py:788` + +In both `AsyncClient.stream()` and `Client.stream()`, a 4xx/5xx status triggers a full `response.aread()` / `response.read()` so `exc.response.content` is populated — with no size limit. A 500 with a 1 GB body buffers 1 GB unconditionally, even though the caller asked for streaming. + +```python +if HTTPStatus.BAD_REQUEST <= response.status_code < 600: + await response.aread() # pre-read body so exc.response.content works + _raise_on_status_error(response) +``` + +Panel 2/3: code_reality, code_reality. Suggested direction: bound the error-body pre-read (or make it opt-in), so a hostile error body cannot defeat the streaming memory profile. 
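
A sketch of how the error-body pre-read could be bounded, written against a plain iterable of byte chunks so that it runs without httpx2. `read_capped` and `max_bytes` are hypothetical names; wiring the helper to the response's chunk iterator and choosing the cap are assumptions left to the follow-up work.

```python
from collections.abc import Iterable


def read_capped(chunks: Iterable[bytes], max_bytes: int) -> tuple[bytes, bool]:
    """Buffer at most ``max_bytes`` from ``chunks``.

    Returns the buffered bytes and a flag saying whether the body was truncated.
    """
    buf = bytearray()
    for chunk in chunks:
        remaining = max_bytes - len(buf)
        if len(chunk) > remaining:
            # Keep only what fits, then stop pulling from the source.
            buf += chunk[:remaining]
            return bytes(buf), True
        buf += chunk
    return bytes(buf), False


body, truncated = read_capped([b"abc", b"def"], max_bytes=4)
print(body, truncated)
```

The truncation flag lets the caller record on the raised exception that the error body is partial, instead of silently presenting a clipped payload as complete.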
+ +#### `middleware/__init__.py` defines no `__all__`, leaking 9+ unintended star-import symbols and breaking subpackage symmetry +*(public_api — verified)* + +`src/httpware/middleware/__init__.py:1` + +With no `__all__`, `from httpware.middleware import *` re-exports `Awaitable`, `Callable`, `Protocol`, `TypeAlias`, `runtime_checkable`, the third-party `httpx2`, and the internal `chain`/`resilience` submodules — none of them intended surface. The sibling `resilience/__init__.py` and `decoders/__init__.py` both define `__all__`, making `middleware` the inconsistent case. *(Folds two confirmed findings — the star-import leak and the subpackage-inconsistency observation — into one; same file:line.)* + +``` +Public middleware namespace: ['AsyncMiddleware', 'AsyncNext', 'Awaitable', 'Callable', 'Middleware', +'Next', 'Protocol', 'TypeAlias', 'after_response', 'async_after_response', 'async_before_request', +'async_on_error', 'before_request', 'chain', 'httpx2', 'on_error', 'resilience', 'runtime_checkable'] +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: add an explicit `__all__` listing the ten public middleware names, bringing the subpackage in line with its siblings. + +#### `test_retry_props.py` is described as testing "retry interleaving" but contains no concurrent tasks +*(concurrency / tests — verified)* + +`tests/test_retry_props.py:1` + +The discover map labels the file "Hypothesis property-based tests for retry interleaving," yet every test issues one sequential request per example — no `gather`, `create_task`, or threads — and the budget-interaction tests are even synchronous. No test exercises two concurrent retries racing on a shared `RetryBudget`. + +```python +async def test_total_attempts_never_exceeds_max_attempts( + max_attempts: int, status: int, method: str, +) -> None: + ... + await client.request(method, "https://example.test/x") +``` + +Panel 2/3: code_reality, spec_grounded. 
Suggested direction: either add a genuinely interleaved property/concurrency test for shared-budget retries, or correct the discover-map description to "sequential retry-policy bounds." + +#### `test_bulkhead_sync_props.py` uses a hard `time.sleep(0.005)` to synchronize thread startup — flaky on slow CI +*(concurrency / tests — verified)* + +`tests/test_bulkhead_sync_props.py:96` + +The test submits up to 4 holder tasks to a thread pool, then sleeps 5 ms assuming all holder threads have started *and* acquired their semaphore slots. Thread-startup overhead on loaded CI can exceed 5 ms, leaving a slot free so the expected `BulkheadFullError` is not raised; Hypothesis re-runs this many times per session, compounding the risk. + +```python +holders = [pool.submit(client.get, f"https://example.test/hold-{i}") for i in range(max_concurrent)] +time.sleep(0.005) +for i in range(extra_requests): + with pytest.raises(BulkheadFullError): + client.get(f"https://example.test/extra-{i}") +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: replace the fixed sleep with a deterministic barrier (e.g. a `threading.Barrier` or per-holder "acquired" event) so the test waits on actual slot acquisition. *(Note: the async sibling using `asyncio.sleep` was refuted — the event loop drains ready callbacks deterministically — so only the sync thread-pool variant is flaky.)* + +#### No test asserts `StatusError` leaf subclasses do not override `__init__` +*(test-coverage / tests — verified)* + +`tests/test_errors.py:46` + +CLAUDE.md and `architecture/errors.md` mandate that all `StatusError` subclasses must not override `__init__`; this is enforced only by review. No test checks `'__init__' not in cls.__dict__` for any of the nine leaves, so a future subclass that adds an `__init__` would pass the whole suite. + +```python +def test_inheritance_tree() -> None: + ... 
+ for exc in (BadRequestError, UnauthorizedError, ForbiddenError, ForbiddenError, + ConflictError, UnprocessableEntityError, RateLimitedError): + assert issubclass(exc, ClientStatusError), exc +``` + +Panel 3/3: code_reality, reproducer, spec_grounded. Suggested direction: add a parametrized test over all nine leaves asserting `'__init__' not in cls.__dict__`. + +#### No test exercises `TimeoutError` as a CircuitBreaker failure trigger (async or sync) +*(coverage_gap / tests — verified)* + +`tests/test_circuit_breaker.py:158` + +`circuit_breaker.py` counts both `NetworkError` and `TimeoutError` as failures (`except (NetworkError, TimeoutError)`). The tests cover `NetworkError` tripping the breaker (via `ConnectError`) but never a `TimeoutError` driving the counter and opening the circuit; the same gap exists in `test_circuit_breaker_sync.py`. A regression that stopped counting timeouts would pass. + +```python +except (NetworkError, TimeoutError): + self._state.on_failure(role, request) + raise +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: add a test where the handler raises `httpx2.ReadTimeout`, `failure_threshold=2`, asserting two such requests open the circuit; mirror on the sync side. + +#### `architecture/client.md` attributes the `httpx2.Client.send` call to `Client.send` instead of `Client._terminal` +*(accuracy / architecture_docs — verified)* + +`architecture/client.md:7` + +The doc says "`Client.send` calls `httpx2.Client.send`, `AsyncClient.send` calls `httpx2.AsyncClient.send`." But `Client.send` calls `self._dispatch` (the composed middleware chain); it is `Client._terminal` that calls `self._httpx2_client.send`. The statement misattributes the terminal httpx2 call to the public `.send()`. + +``` +The same terminal lifecycle holds in both worlds — `Client.send` calls `httpx2.Client.send`, +`AsyncClient.send` calls `httpx2.AsyncClient.send`. +``` + +Panel 2/3: code_reality, spec_grounded. 
Suggested direction: attribute the `httpx2` send to `_terminal` and note that `.send()` enters the chain first. + +### Nits + +#### `full_jitter_delay` raises `OverflowError` for `attempt_index >= 1024` despite a docstring claiming saturation to `inf` +*(correctness — verified)* + +`src/httpware/middleware/resilience/_backoff.py:25` + +The docstring claims `2.0 ** attempt_index` "saturates to `math.inf`" for `attempt_index >= 1024` so `min` clamps to `max_delay`; in fact Python's float `**` raises `OverflowError` at `2.0 ** 1024`, so the clamp never fires and the call crashes. Reachable only when `max_attempts >= 1026` with every attempt failing — practically unreachable, but the docstring is factually wrong. + +```python +ceiling = min(max_delay, base_delay * (2.0**attempt_index)) +return _random_uniform(0.0, ceiling) +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: correct the docstring, and clamp `attempt_index` (or wrap the exponentiation) so the documented saturation behavior actually holds. + +#### `_is_streaming_body_async` does not detect sync iterables, while `_is_streaming_body_sync` does +*(correctness — verified)* + +`src/httpware/_internal/status.py:32` + +The async detector only checks `__aiter__`; the sync detector excludes replayable types then checks `__iter__`. A sync generator passed to `AsyncClient` is not marked non-replayable, so `AsyncRetry`'s replay guard is absent — correctness is preserved only because httpx2 itself raises a `RuntimeError` for sync bodies on an async client. The async invariant rests on an undocumented httpx2 detail. + +```python +def _is_streaming_body_async(value: object) -> bool: + ... + return hasattr(value, "__aiter__") + +def _is_streaming_body_sync(value: object) -> bool: + ... + return hasattr(value, "__iter__") +``` + +Panel 2/2: code_reality, code_reality. 
Suggested direction: document the reliance on httpx2's sync-on-async guard, or symmetrize the async detector to also mark sync iterables non-replayable. + +#### `_strip_userinfo` produces a malformed `http:///path` URL when the netloc has credentials but no hostname +*(correctness — verified)* + +`src/httpware/errors.py:29` + +For `http://user:pass@/path`, `parts.hostname` is `None`, so `netloc` becomes `''` and `urlunsplit` yields the triple-slash `http:///path`. Credentials are still stripped (no security regression), but the sanitized URL in error messages and `__repr__` is malformed — and these are exactly the URLs the function exists to sanitize. + +```python +hostname = parts.hostname or "" +... +netloc = hostname +if parts.port is not None: + netloc = f"{netloc}:{parts.port}" +return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment)) +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: when hostname is empty, preserve the original (already credential-free) authority shape rather than emitting a triple-slash URL. + +#### `errors.py` module docstring attributes the auto-raise rule to `AsyncClient` only, omitting three other raise sites +*(correctness / architecture_docs — verified)* + +`src/httpware/errors.py:3` + +The docstring says "Auto-raise rule lives at AsyncClient's internal terminal." There are four raise sites: `AsyncClient._terminal`, `Client._terminal`, `AsyncClient.stream()`, and `Client.stream()`. A reader consulting it to find where status errors originate would miss three of four. + +``` +Auto-raise rule lives at AsyncClient's internal terminal (see client.py). +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: list all four raise sites (or say "both clients' terminals and both `stream()` methods"). 
+ +#### `trust_env=True` by default — httpware silently honors `HTTP_PROXY`/`HTTPS_PROXY` +*(inherited-httpx2-surface / security — verified)* + +`src/httpware/client.py:130` + +When httpware builds its own httpx2 client it does not set `trust_env=False`, so httpx2's default reads proxy env vars and routes traffic accordingly. In a compromised environment this can silently route all traffic through an attacker proxy; callers injecting their own `httpx2_client` can disable it, but there is no httpware-level control or doc callout. + +```python +self._httpx2_client = httpx2.AsyncClient(**kwargs) +``` + +Panel 2/2: code_reality, code_reality. Suggested direction: add a documentation callout that proxy/TLS env trust is inherited from httpx2 and how to opt out via injection. + +#### `decoders/msgspec.py` `_contains_custom_type` has unguarded runtime `msgspec.*` references that would `NameError` if called when msgspec is absent +*(correctness — verified)* + +`src/httpware/decoders/msgspec.py:29` + +When `is_msgspec_installed` is False the module-level `import msgspec` is skipped, leaving `msgspec` undefined. `_contains_custom_type` then uses bare `msgspec.inspect.CustomType`/`Type` at runtime, so direct invocation (or post-load flag patching) raises `NameError` instead of a friendly `ImportError`. The `__init__` guard blocks normal instantiation but not direct calls to this module-level function. + +```python +if isinstance(info, msgspec.inspect.CustomType): + return True +... +if isinstance(value, msgspec.inspect.Type): +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: gate `_contains_custom_type` behind the installed flag, or treat it as private-and-unreachable-without-the-extra and document that. 
+ +#### `test_threading_with_shared_budget.py`'s exact deposit-count assertion embeds a no-purge assumption as a comment +*(concurrency / tests — verified)* + +`tests/test_threading_with_shared_budget.py:78` + +The test asserts `len(budget._deposits) == expected_deposits`, relying on a comment ("TTL is 60.0 so no purge fires during the sub-second runtime") rather than an assertion. Shortening the TTL or a slow machine would trigger `_purge`, dropping the count and producing a false failure that masks correct behavior. + +```python +expected_deposits = (_N_SYNC_THREADS * _N_OPS_PER_THREAD) + _N_ASYNC_TASKS +assert len(budget._deposits) == expected_deposits, ( + f"expected {expected_deposits} deposits, got {len(budget._deposits)}" +) +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: pin the injected clock so no real time elapses, making the no-purge assumption an enforced invariant rather than a fragile comment. + +#### `ForbiddenError` (403), `ConflictError` (409), `UnprocessableEntityError` (422) are never instantiated in tests +*(test-coverage / tests — verified)* + +`tests/test_errors.py:126` + +`test_per_status_subclasses_construct` exercises only 6 of 9 `STATUS_TO_EXCEPTION` entries (400/401/404/429/500/503). The three omitted classes appear only in inheritance/table checks — none is constructed, none has `.response` or `str()` verified. + +```python +@pytest.mark.parametrize(("status", "expected"), [ + (400, BadRequestError), (401, UnauthorizedError), (404, NotFoundError), + (429, RateLimitedError), (500, InternalServerError), (503, ServiceUnavailableError), +]) +def test_per_status_subclasses_construct(status: int, expected: type[StatusError]) -> None: +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: add `(403, ForbiddenError)`, `(409, ConflictError)`, `(422, UnprocessableEntityError)` to the parametrize list. 
+ +#### No test for `full_jitter_delay` with `attempt_index >= 1024` — the documented overflow-safety path +*(coverage_gap / tests — verified)* + +`tests/test_backoff.py:1` + +The `_backoff.py` docstring specifically calls out the `attempt_index >= 1024` saturation edge, but tests cover only `attempt_index` 0 and 10. The documented (and, per the Nit above, actually broken) overflow path has no test, so a regression to integer exponentiation would go uncaught. + +``` +Uses ``2.0 **`` … so that ``attempt_index >= 1024`` saturates to ``math.inf`` and ``min`` clamps +to ``max_delay`` — ``2 ** 1024`` would raise ``OverflowError`` +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: add a large-`attempt_index` test asserting a finite clamped delay — which will also surface the `OverflowError` correctness Nit above. + +#### `test_observability` no-active-span test has no assertion — passes for the wrong reason +*(mock_transport_fidelity / tests — verified)* + +`tests/test_observability.py:85` + +`test_emit_event_works_when_otel_installed_but_no_active_span` calls `_emit_event(...)` with no mock and no assertion ("the absence of an exception IS the assertion"). It would pass even if `_emit_event` became a no-op; it does not capture the log record to confirm the log-only fallback fired. + +```python +def test_emit_event_works_when_otel_installed_but_no_active_span() -> None: + ... + # No assertion needed — the absence of an exception IS the assertion. +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: assert via `caplog` that the expected log record (level + event name) was emitted. + +#### No sync-overload typing test for `Client` — `test_client_typing.py` covers `AsyncClient` only +*(coverage_gap / tests — verified)* + +`tests/test_client_typing.py:1` + +All four overload tests (get/send × with/without model) exercise only `AsyncClient`. 
The sync `Client` has identical overload signatures and similar dispatch; a regression in sync overload resolution would not be caught. + +``` +"""Static-typing tests for AsyncClient overloads. +... +from httpware import AsyncClient +``` + +Panel 2/2: code_reality, code_reality. Suggested direction: add sync equivalents (`client.get(...) → httpx2.Response`, `client.get(..., response_model=_User) → _User`, and the `send` pair). + +#### No sync counterpart to `test_status_error_raised_before_decoder_runs` / `test_async_decode_error_caught_by_client_error` +*(coverage_gap / tests — verified)* + +`tests/test_client_response_model.py:63` + +The async tests confirm a 4xx raises a `StatusError` (not `DecodeError`) before decode, and that `DecodeError` is-a `ClientError` at integration level. The sync client has the schema-mismatch/malformed-JSON tests but no sync mirror for the status-before-decode ordering or the `DecodeError`-is-`ClientError` integration check. + +```python +async def test_status_error_raised_before_decoder_runs() -> None: + ... +async def test_async_decode_error_caught_by_client_error() -> None: + # (no sync counterpart for either) +``` + +Panel 2/3: code_reality, reproducer. Suggested direction: add `test_sync_status_error_raised_before_decoder_runs` and `test_sync_decode_error_caught_by_client_error`. + +#### No test exercises the `httpx2.CookieConflict` mapping branch in `map_httpx2_exception` +*(coverage_gap / tests — verified)* + +`tests/test_error_mapping_terminal.py:95` + +`map_httpx2_exception` maps `(httpx2.InvalidURL, httpx2.CookieConflict) → TransportError`. Tests cover `InvalidURL` but not `CookieConflict`; a refactor that moved `CookieConflict` to `NetworkError` would go uncaught. + +```python +if isinstance(exc, (httpx2.InvalidURL, httpx2.CookieConflict)): + return TransportError(str(exc)) +``` + +Panel 2/3: code_reality, reproducer. 
Suggested direction: add a test asserting a handler raising `httpx2.CookieConflict` surfaces `TransportError`, not `NetworkError`. + +#### Docs use submodule import paths for symbols already in `httpware.__all__`, creating dual canonical paths +*(public_api / architecture_docs — verified)* + +`docs/middleware.md:41` + +Several examples import middleware symbols via the submodule path (`from httpware.middleware import AsyncNext`, `from httpware.errors import NetworkError`) even though all are in `httpware.__all__`. Affected: middleware.md lines 41/67/155 and recipes/phase-decorator-patterns.md lines 17/46/86/130. Two equally documented paths leave readers unsure which is canonical. + +``` +from httpware.middleware import async_before_request, async_after_response, async_on_error +from httpware.middleware import AsyncNext +from httpware.middleware import Next +``` + +Panel 2/3: code_reality, spec_grounded. Suggested direction: standardize docs on the root `from httpware import X` path for any symbol in the root `__all__`. + +#### `architecture/extras.md` shows the pydantic constraint without its upper bound, mismatching `pyproject.toml` +*(accuracy / architecture_docs — verified)* + +`architecture/extras.md:18` + +The doc snippet shows `pydantic = ["pydantic>=2"]`, but `pyproject.toml` pins `pydantic = ["pydantic>=2.0,<3.0"]`. The illustrative snippet drops the `<3.0` ceiling. + +``` +pydantic = ["pydantic>=2"] +msgspec = ["msgspec>=0.18"] +``` + +Panel 2/3: code_reality, spec_grounded. Suggested direction: sync the snippet to the real constraint, or mark it explicitly as abbreviated. + +## Negative results (verified correct) + +Investigated and refuted (did not survive the panel), or invariants the finders checked and found holding: + +- **CircuitBreaker `_consecutive_successes` across OPEN→HALF_OPEN→OPEN** — `_open()` unconditionally resets the counter on every path back to OPEN; existing `test_success_threshold_probe_failure_mid_streak_reopens` proves it. 
No accumulation bug. +- **`RecursionError` from deeply nested JSON (msgspec)** — refuted on a factual error: `RecursionError` *is* an `Exception` subclass, so the `except Exception` decode guard catches it and wraps it as `DecodeError`. No raw propagation. +- **`Retry-After` far-future date form** — the `retry_after > max_delay` guard caps it at the default `max_delay=5.0`; the only "sleep for years" path requires a self-inflicted `max_delay=1e12`. Behaves to spec. +- **`PydanticDecoder.decode` TypeError→fresh `TypeAdapter` per call** — unreachable: `can_decode()` already returns False for unhashable models, so decode is never dispatched for them. The fallback branch is dead in normal flow. +- **`StatusError.response` bare annotation / `__reduce__`** — the pickle round-trip uses `cls(response)`, which runs `__init__`; the AttributeError scenario requires a third-party lib to bypass `__init__`, not evidenced anywhere. +- **`AsyncBulkhead`/`AsyncCircuitBreaker` unguarded fast-path read of `self._loop`** — pointer reads are atomic on real architectures; the worst free-threaded outcome is a harmless extra lock acquisition corrected by the in-lock double-check. No torn-value bug. +- **`AsyncBulkhead` semaphore created before a running loop** — safe under Python 3.10+ (binds on first await); the proposed break requires deliberate private `_sem.acquire()` bypassing `__call__`/`_check_loop`. The semaphore also has its own cross-loop guard. +- **Async `test_bulkhead_props` `asyncio.sleep(0.005)` startup sync** — deterministic, not flaky: the event loop drains all ready holder callbacks (which have no `await` before `sem.acquire()`) before the timer fires. +- **`_raise_on_status_error` `>= 600` silent passthrough** — intentional (inline `noqa` documents the synthetic 600 upper bound); no realistic middleware synthesizes 6xx. 
+- **Sync vs async `BulkheadFullError` `__cause__` difference, and divergent test-timeout constants** — both raise `BulkheadFullError` with correct fields; `__cause__` is not a documented contract, and the larger sync test timeout is intentional jitter headroom. +- **Performance findings (uniformly refuted as micro-optimizations with no observable defect):** the empty-kwargs dict in `_request_with_body`; the `tuple(...)` allocation on the `MissingDecoderError` error path; the O(n) `_dispatch_decoder` linear scan (per the documented first-match contract); the per-request coroutine allocation in `chain.py` (composition is folded once at construction); the in-lock `int(...)` floor and the `{**attributes, ...}` copy in `_emit_event` (the copy is deliberate to avoid mutating the OTel `attributes` dict); the double-checked-locking `_check_loop`; the sync CircuitBreaker's two lock round-trips per request (the documented thread-safety mechanism); `dir(info)` in `_contains_custom_type` (cached, compact node types). One finder's "floor + `self._now()` inside the lock" claim was factually wrong — `self._now()` is sampled before the lock. +- **Sync/async duplication, no divergence yet (maintainability, not bugs):** `_httpx2_exception_mapper` vs `_sync`; `_dispatch_decoder` in both clients; the send/`send_with_response` decode-and-wrap block ×4; the `_request_with_body` kwargs block; `_check_loop` in bulkhead vs circuit_breaker; `can_decode` memoization in both decoders; the `_owns_client` lifecycle guard ×4; the `__init__` httpx2-construction/conflict-dict block; the `AsyncRetry`/`Retry` bodies (the differing AssertionError class-name prefixes are correct and the guards are `# pragma: no cover` unreachable). None produces wrong output today. +- **`_reconstruct_*` pickle helpers** — all five non-status reconstructors and their `__reduce__` methods are consistently shaped; `MissingDecoderError` matches its siblings. No inconsistency. 
+- **`msgspec.py` module-level conditional import "not truly lazy"** — refuted: import caching means the conditional fires once on first import, not per `AsyncClient()`; the docstring makes no lazy-import promise. +- **Several "missing sync property test" parity gaps** (`test_retry_sync_props.py`, sync CircuitBreaker/Retry props, sync `test_error_mapping_terminal`) — real absences, but the sync wrappers delegate to the same shared state machines / mappers, so a mirror test would pass rather than expose a defect; treated as coverage observations, not confirmed bugs. (The one broad sync-terminal status-raising gap that *does* touch distinct fallback assertions was kept as the Medium above.) +- **`test_budget_props` "double-counts" / vacuous-zero** — no double-count exists; `permitted == expected_ceiling` is sound for non-zero ceilings; only the zero-ceiling example is vacuous. +- **Public-API guards** — `test_no_removed_symbols_leaked` is a "was-removed, don't re-add" regression denylist (correctly excludes never-exported `MsgspecDecoder`); the post-0.8.0 sync-rename names verifiably resolve to sync objects (`iscoroutinefunction` is False). `httpware.decoders.T` is a free method-level TypeVar, correctly absent from `__all__`. +- **`test_client_decoders_default.py` msgspec-only resolution** — already covered (the finder misread the file: `test_async_default_msgspec_only` / `test_sync_default_msgspec_only` exist). +- **`Retry(max_attempts=0)` validation untested** — false: both `test_retry.py` and `test_retry_sync.py` assert `ValueError` on `max_attempts=0`. +- **MockTransport sync-callable "fidelity" concern** — MockTransport's `handle_async_request` is itself async and supports async handlers; cancellation-in-flight is already covered by `test_cancellation_propagates_cleanly`. 
+- **`errors.md` `asyncio.wait_for` / `builtins.TimeoutError` wording** — correct: the doc describes the *catch* form user code uses, and `httpware.TimeoutError` does inherit from `builtins.TimeoutError`. Finder misread. +- **Optional-extras isolation / pydantic-missing test scoping** — the patch-flag tests do guard `MISSING_DEPENDENCY_MESSAGE` and the `__init__` `ImportError`; the uncovered cold-import path is degenerate. (The genuine doc/source asymmetry it gestures at is captured by the High + Medium pydantic findings.) +- **`httpware` not forwarding `verify`/`follow_redirects`/`cert`** — deliberate thin-wrapper design; unknown kwargs raise `TypeError` immediately (no silent misconfiguration), and injection is the documented escape hatch. +- **msgspec/opentelemetry version floors without ceilings** — dependency-hygiene notes, not reproducible bugs; no published release currently breaks the adapters. +- **`StatusError.__init__` / `_dispatch_decoder` missing-docstring observations** — `D1`/missing-docstring is explicitly ignored by project convention; the class docstring already states the `__init__` contract. diff --git a/planning/audits/scripts/_discover-2026-06-14.json b/planning/audits/scripts/_discover-2026-06-14.json new file mode 100644 index 0000000..f510ed7 --- /dev/null +++ b/planning/audits/scripts/_discover-2026-06-14.json @@ -0,0 +1,329 @@ +{ + "generated": "2026-06-14", + "repo": "httpware", + "description": "Module map of httpware: every file under src/httpware/, tests/, docs/, and planning/ with line count and one-sentence purpose, plus the load-bearing invariants extracted verbatim from CLAUDE.md.", + "modules": { + "src/httpware": { + "src/httpware/__init__.py": { + "lines": 97, + "purpose": "Public package surface: re-exports AsyncClient/Client, ResponseDecoder, the error tree, and middleware symbols, and defines __all__." 
+ }, + "src/httpware/client.py": { + "lines": 1552, + "purpose": "Defines Client and AsyncClient as thin httpx2 wrappers that compose the middleware chain, run typed decoding, and raise the status-keyed error tree." + }, + "src/httpware/errors.py": { + "lines": 325, + "purpose": "The status-keyed exception hierarchy (StatusError 4xx/5xx tree plus non-status ClientError subclasses) holding the httpx2.Response." + }, + "src/httpware/_internal/__init__.py": { + "lines": 1, + "purpose": "Marks _internal as the private cross-module helpers package (not part of the public API)." + }, + "src/httpware/_internal/exception_mapping.py": { + "lines": 28, + "purpose": "Maps raw httpx2 transport/network/timeout exceptions to httpware NetworkError/TimeoutError/TransportError." + }, + "src/httpware/_internal/import_checker.py": { + "lines": 26, + "purpose": "Detects whether optional extras are installed without importing them, used by adapter modules to gate hard imports." + }, + "src/httpware/_internal/observability.py": { + "lines": 55, + "purpose": "Observability emission helper providing structured logging and opt-in OpenTelemetry span events." + }, + "src/httpware/_internal/status.py": { + "lines": 47, + "purpose": "Status-code dispatch and streaming-body detection used to decide which StatusError subclass to raise." + }, + "src/httpware/middleware/__init__.py": { + "lines": 143, + "purpose": "Defines the Middleware/AsyncMiddleware protocols, Next/AsyncNext type aliases, and the phase-shortcut decorators." + }, + "src/httpware/middleware/chain.py": { + "lines": 48, + "purpose": "Composes the middleware stack into a single callable chain at client construction time (Seam A)." + }, + "src/httpware/decoders/__init__.py": { + "lines": 42, + "purpose": "Defines the ResponseDecoder protocol that is the Client/AsyncClient <-> decoder boundary (Seam B)." 
+ }, + "src/httpware/decoders/pydantic.py": { + "lines": 81, + "purpose": "PydanticDecoder: a ResponseDecoder backed by a per-instance TypeAdapter cache, importing pydantic only when installed (Seam C)." + }, + "src/httpware/decoders/msgspec.py": { + "lines": 116, + "purpose": "MsgspecDecoder: an opt-in ResponseDecoder backed by a per-instance msgspec.json.Decoder cache (Seam C)." + }, + "src/httpware/middleware/resilience/__init__.py": { + "lines": 19, + "purpose": "Re-exports the resilience middleware: Bulkhead, CircuitBreaker, Retry, RetryBudget, their Async counterparts, and AsyncTimeout." + }, + "src/httpware/middleware/resilience/_backoff.py": { + "lines": 26, + "purpose": "Private full-jitter exponential backoff helper used by the retry middleware." + }, + "src/httpware/middleware/resilience/budget.py": { + "lines": 72, + "purpose": "Finagle-style thread-safe token-bucket RetryBudget that caps the global retry rate." + }, + "src/httpware/middleware/resilience/bulkhead.py": { + "lines": 184, + "purpose": "Bulkhead/AsyncBulkhead concurrency-limiter middleware backed by a semaphore, raising BulkheadFullError when saturated." + }, + "src/httpware/middleware/resilience/circuit_breaker.py": { + "lines": 305, + "purpose": "CircuitBreaker/AsyncCircuitBreaker implementing a classic consecutive-failure circuit breaker that raises CircuitOpenError when open." + }, + "src/httpware/middleware/resilience/retry.py": { + "lines": 347, + "purpose": "Retry/AsyncRetry middleware that automatically retries transient failures with backoff and RetryBudget control." + }, + "src/httpware/middleware/resilience/timeout.py": { + "lines": 76, + "purpose": "AsyncTimeout middleware enforcing an overall wall-clock deadline across the inner pipeline." 
+ } + }, + "tests": { + "tests/__init__.py": {"lines": 0, "purpose": "Empty package marker for the test suite."}, + "tests/conftest.py": {"lines": 1, "purpose": "Pytest configuration shim (pytest-asyncio auto mode is configured in project config, not here)."}, + "tests/test_backoff.py": {"lines": 35, "purpose": "Unit tests for the full-jitter exponential backoff helper."}, + "tests/test_budget.py": {"lines": 108, "purpose": "Unit tests for the RetryBudget token-bucket behavior."}, + "tests/test_budget_props.py": {"lines": 116, "purpose": "Hypothesis property-based tests for RetryBudget concurrency invariants."}, + "tests/test_bulkhead.py": {"lines": 475, "purpose": "Tests for the async Bulkhead concurrency limiter and BulkheadFullError."}, + "tests/test_bulkhead_props.py": {"lines": 116, "purpose": "Hypothesis property-based tests for async Bulkhead under concurrency."}, + "tests/test_bulkhead_sync.py": {"lines": 184, "purpose": "Tests for the sync Bulkhead concurrency limiter."}, + "tests/test_bulkhead_sync_props.py": {"lines": 136, "purpose": "Hypothesis property-based tests for sync Bulkhead under threading."}, + "tests/test_circuit_breaker.py": {"lines": 490, "purpose": "Tests for the async CircuitBreaker state machine and CircuitOpenError."}, + "tests/test_circuit_breaker_props.py": {"lines": 68, "purpose": "Hypothesis property-based tests for CircuitBreaker state transitions."}, + "tests/test_circuit_breaker_sync.py": {"lines": 432, "purpose": "Tests for the sync CircuitBreaker state machine."}, + "tests/test_client_construction.py": {"lines": 158, "purpose": "Tests client constructor argument validation and middleware-chain freezing."}, + "tests/test_client_decoders_default.py": {"lines": 102, "purpose": "Tests default decoder resolution against installed extras (pydantic-first)."}, + "tests/test_client_dispatch.py": {"lines": 291, "purpose": "Tests core request dispatch through the middleware chain to the terminal httpx2 send."}, + 
"tests/test_client_lifecycle.py": {"lines": 77, "purpose": "Tests client open/close lifecycle and aclose() context-manager behavior."}, + "tests/test_client_methods.py": {"lines": 162, "purpose": "Tests the HTTP verb convenience methods on the clients."}, + "tests/test_client_middleware_wiring.py": {"lines": 107, "purpose": "Tests that middleware are wired into the chain in the correct order at construction."}, + "tests/test_client_response_model.py": {"lines": 119, "purpose": "Tests typed response decoding via response_model on the async client."}, + "tests/test_client_send_with_response.py": {"lines": 148, "purpose": "Tests the async send_with_response API returning both decoded model and raw response."}, + "tests/test_client_send_with_response_sync.py": {"lines": 149, "purpose": "Tests the sync send_with_response API."}, + "tests/test_client_stream.py": {"lines": 337, "purpose": "Tests async streaming-response handling and streaming-body detection."}, + "tests/test_client_stream_sync.py": {"lines": 306, "purpose": "Tests sync streaming-response handling."}, + "tests/test_client_sync.py": {"lines": 394, "purpose": "Tests the sync Client surface at parity with the async client."}, + "tests/test_client_typing.py": {"lines": 53, "purpose": "Static/typing assertions for the client public surface."}, + "tests/test_decoders_msgspec.py": {"lines": 232, "purpose": "Tests the MsgspecDecoder adapter including nested custom types."}, + "tests/test_decoders_pydantic.py": {"lines": 241, "purpose": "Tests the PydanticDecoder adapter and its TypeAdapter cache."}, + "tests/test_error_mapping_terminal.py": {"lines": 137, "purpose": "Tests httpx2->httpware exception mapping at the terminal send seam."}, + "tests/test_errors.py": {"lines": 396, "purpose": "Tests the status-keyed error hierarchy construction and 4xx/5xx dispatch."}, + "tests/test_middleware.py": {"lines": 193, "purpose": "Tests the async middleware protocol, Next type, and phase decorators."}, + 
"tests/test_middleware_sync.py": {"lines": 201, "purpose": "Tests the sync middleware protocol and chain."}, + "tests/test_observability.py": {"lines": 117, "purpose": "Tests the observability emission helper's logging and OTel span events."}, + "tests/test_optional_extras_isolation.py": {"lines": 67, "purpose": "Tests that optional extras are imported only inside their dedicated modules (Seam C isolation)."}, + "tests/test_optional_extras_otel_missing.py": {"lines": 156, "purpose": "Tests graceful behavior when the OpenTelemetry extra is not installed."}, + "tests/test_optional_extras_pydantic_missing.py": {"lines": 63, "purpose": "Tests graceful behavior when pydantic is not installed."}, + "tests/test_public_api.py": {"lines": 83, "purpose": "Tests that the public __all__ surface matches the documented exports."}, + "tests/test_retry.py": {"lines": 703, "purpose": "Tests the async Retry middleware including backoff and budget interaction."}, + "tests/test_retry_budget_threadsafety.py": {"lines": 63, "purpose": "Tests RetryBudget thread-safety under concurrent access."}, + "tests/test_retry_props.py": {"lines": 197, "purpose": "Hypothesis property-based tests for retry interleaving."}, + "tests/test_retry_sync.py": {"lines": 575, "purpose": "Tests the sync Retry middleware."}, + "tests/test_threading_with_shared_budget.py": {"lines": 83, "purpose": "Tests a RetryBudget shared across threads for correctness."}, + "tests/test_timeout.py": {"lines": 105, "purpose": "Tests the AsyncTimeout middleware wall-clock deadline enforcement."} + }, + "docs": { + "docs/index.md": {"lines": 198, "purpose": "Documentation landing page introducing httpware and its core concepts."}, + "docs/errors.md": {"lines": 194, "purpose": "Reference for the status-keyed error tree and exception construction rules."}, + "docs/middleware.md": {"lines": 203, "purpose": "Guide to the middleware protocol, Next type, and phase-shortcut decorators."}, + "docs/resilience.md": {"lines": 361, 
"purpose": "Guide to the resilience middleware (retry, budget, bulkhead, circuit breaker, timeout)."}, + "docs/testing.md": {"lines": 114, "purpose": "Guide to testing clients with httpx2.MockTransport injection."}, + "docs/requirements.txt": {"lines": 2, "purpose": "Pinned dependencies for building the documentation site."}, + "docs/CNAME": {"lines": 1, "purpose": "Custom domain configuration for the GitHub Pages docs site."}, + "docs/dev/contributing.md": {"lines": 48, "purpose": "Contributor guide covering local setup and the just/uv workflow."}, + "docs/recipes/modern-di.md": {"lines": 150, "purpose": "Recipe for using httpware clients with modern dependency injection."}, + "docs/recipes/phase-decorator-patterns.md": {"lines": 168, "purpose": "Recipe demonstrating common phase-decorator middleware patterns."}, + "docs/recipes/link-header-pagination.md": {"lines": 39, "purpose": "Recipe for paginating via the HTTP Link header."} + }, + "planning": { + "planning/README.md": {"lines": 131, "purpose": "Planning conventions (two axes, change bundles, three lanes, frontmatter) plus the change Index."}, + "planning/deferred.md": {"lines": 39, "purpose": "Review-surfaced but not-yet-actionable items deferred for later."}, + "planning/_templates/change.md": {"lines": 38, "purpose": "Template for the lightweight single-file change lane."}, + "planning/_templates/design.md": {"lines": 55, "purpose": "Template for per-change design documents."}, + "planning/_templates/plan.md": {"lines": 56, "purpose": "Template for per-change implementation plans."}, + "planning/releases": { + "note": "Per-version release notes (also published on GitHub Releases).", + "files": { + "planning/releases/0.2.0.md": {"lines": 45, "purpose": "Release notes for v0.2.0 (thin httpx2 wrapper pivot)."}, + "planning/releases/0.3.0.md": {"lines": 43, "purpose": "Release notes for v0.3.0."}, + "planning/releases/0.4.0.md": {"lines": 149, "purpose": "Release notes for v0.4.0."}, + 
"planning/releases/0.5.0.md": {"lines": 53, "purpose": "Release notes for v0.5.0."}, + "planning/releases/0.6.0.md": {"lines": 62, "purpose": "Release notes for v0.6.0."}, + "planning/releases/0.7.0.md": {"lines": 39, "purpose": "Release notes for v0.7.0."}, + "planning/releases/0.8.0.md": {"lines": 60, "purpose": "Release notes for v0.8.0."}, + "planning/releases/0.8.1.md": {"lines": 63, "purpose": "Release notes for v0.8.1."}, + "planning/releases/0.8.3.md": {"lines": 70, "purpose": "Release notes for v0.8.3."}, + "planning/releases/0.8.4.md": {"lines": 33, "purpose": "Release notes for v0.8.4."}, + "planning/releases/0.8.5.md": {"lines": 23, "purpose": "Release notes for v0.8.5."}, + "planning/releases/0.8.6.md": {"lines": 25, "purpose": "Release notes for v0.8.6."}, + "planning/releases/0.9.0.md": {"lines": 93, "purpose": "Release notes for v0.9.0 (multi-decoder routing)."}, + "planning/releases/0.9.1.md": {"lines": 29, "purpose": "Release notes for v0.9.1."}, + "planning/releases/0.10.0.md": {"lines": 44, "purpose": "Release notes for v0.10.0."}, + "planning/releases/0.10.1.md": {"lines": 25, "purpose": "Release notes for v0.10.1."} + } + }, + "planning/audits": { + "note": "Findings reports plus scripts/ tooling.", + "files": { + "planning/audits/2026-06-07-deep-audit.md": {"lines": 584, "purpose": "Deep-audit findings report from 2026-06-07."}, + "planning/audits/2026-06-12-delta-audit.md": {"lines": 359, "purpose": "Delta-audit findings report from 2026-06-12."}, + "planning/audits/2026-06-13-delta-audit.md": {"lines": 107, "purpose": "Delta-audit findings report from 2026-06-13."}, + "planning/audits/2026-06-13-docs-audit.md": {"lines": 218, "purpose": "Documentation-accuracy audit report from 2026-06-13."}, + "planning/audits/scripts/_discover.json": {"lines": 656, "purpose": "Previously generated module-map / discovery JSON for the audit tooling."}, + "planning/audits/scripts/workflow.mjs": {"lines": 405, "purpose": "Audit workflow orchestration 
script."}, + "planning/audits/scripts/workflow-deep.mjs": {"lines": 492, "purpose": "Deep-audit workflow orchestration script."}, + "planning/audits/scripts/workflow-delta.mjs": {"lines": 427, "purpose": "Delta-audit workflow orchestration script."} + } + }, + "planning/retros": { + "note": "Retrospectives.", + "files": { + "planning/retros/2026-06-04-v0.2-thin-wrapper-pivot.md": {"lines": 57, "purpose": "Retro on the v0.2 thin-wrapper pivot."}, + "planning/retros/2026-06-05-lite-bootstrap-audit-session.md": {"lines": 52, "purpose": "Retro on the lite bootstrap audit session."}, + "planning/retros/2026-06-10-v0.9-multi-decoder-routing.md": {"lines": 99, "purpose": "Retro on v0.9 multi-decoder routing."} + } + }, + "planning/changes/active": { + "note": "Active per-change bundles.", + "files": { + "planning/changes/active/.gitkeep": {"lines": 0, "purpose": "Keeps the active changes directory tracked when empty."}, + "planning/changes/active/2026-06-14.01-deep-audit/design.md": {"lines": 202, "purpose": "Design doc for the active 2026-06-14 deep-audit change bundle."}, + "planning/changes/active/2026-06-14.01-deep-audit/plan.md": {"lines": 577, "purpose": "Implementation plan for the active 2026-06-14 deep-audit change bundle."} + } + }, + "planning/changes/archive": { + "note": "Archived (shipped) per-change bundles; each bundle is a design.md+plan.md pair (or a single change.md for the lightweight lane). 
60 bundles, 121 files total, listed individually below.", + "files": { + "planning/changes/archive/2026-05-31.01-bmad-to-superpowers-transition/design.md": {"lines": 178, "purpose": "Design for migrating planning from BMAD to the superpowers convention."}, + "planning/changes/archive/2026-05-31.01-bmad-to-superpowers-transition/plan.md": {"lines": 669, "purpose": "Plan for migrating planning from BMAD to the superpowers convention."}, + "planning/changes/archive/2026-05-31.02-shipped-work-review/design.md": {"lines": 94, "purpose": "Design for a review of already-shipped work."}, + "planning/changes/archive/2026-05-31.03-middleware-protocol-and-chain/design.md": {"lines": 228, "purpose": "Design for the middleware protocol and chain composition."}, + "planning/changes/archive/2026-05-31.03-middleware-protocol-and-chain/plan.md": {"lines": 838, "purpose": "Plan for the middleware protocol and chain composition."}, + "planning/changes/archive/2026-05-31.04-phase-shortcut-decorators/design.md": {"lines": 245, "purpose": "Design for the phase-shortcut middleware decorators."}, + "planning/changes/archive/2026-05-31.04-phase-shortcut-decorators/plan.md": {"lines": 747, "purpose": "Plan for the phase-shortcut middleware decorators."}, + "planning/changes/archive/2026-05-31.05-request-immutability-helpers/design.md": {"lines": 169, "purpose": "Design for request immutability helpers."}, + "planning/changes/archive/2026-05-31.05-request-immutability-helpers/plan.md": {"lines": 551, "purpose": "Plan for request immutability helpers."}, + "planning/changes/archive/2026-05-31.06-msgspec-decoder-via-extras/design.md": {"lines": 206, "purpose": "Design for the msgspec decoder via an optional extra."}, + "planning/changes/archive/2026-05-31.06-msgspec-decoder-via-extras/plan.md": {"lines": 479, "purpose": "Plan for the msgspec decoder via an optional extra."}, + "planning/changes/archive/2026-05-31.07-asyncclient/design.md": {"lines": 478, "purpose": "Design for the 
AsyncClient."}, + "planning/changes/archive/2026-05-31.07-asyncclient/plan.md": {"lines": 1902, "purpose": "Plan for the AsyncClient."}, + "planning/changes/archive/2026-05-31.08-recordedtransport/design.md": {"lines": 291, "purpose": "Design for the (later removed) RecordedTransport test helper."}, + "planning/changes/archive/2026-05-31.08-recordedtransport/plan.md": {"lines": 1051, "purpose": "Plan for the (later removed) RecordedTransport test helper."}, + "planning/changes/archive/2026-05-31.09-release-0.1.0-prep/design.md": {"lines": 271, "purpose": "Design for the v0.1.0 release preparation."}, + "planning/changes/archive/2026-05-31.09-release-0.1.0-prep/plan.md": {"lines": 505, "purpose": "Plan for the v0.1.0 release preparation."}, + "planning/changes/archive/2026-06-01.01-auth-coercion/design.md": {"lines": 382, "purpose": "Design for auth coercion handling."}, + "planning/changes/archive/2026-06-01.01-auth-coercion/plan.md": {"lines": 1014, "purpose": "Plan for auth coercion handling."}, + "planning/changes/archive/2026-06-02.01-docs-reorg-and-mkdocs/design.md": {"lines": 128, "purpose": "Design for docs reorganization and MkDocs adoption."}, + "planning/changes/archive/2026-06-02.01-docs-reorg-and-mkdocs/plan.md": {"lines": 787, "purpose": "Plan for docs reorganization and MkDocs adoption."}, + "planning/changes/archive/2026-06-02.02-project-hygiene-tidy/design.md": {"lines": 198, "purpose": "Design for a project-hygiene tidy-up."}, + "planning/changes/archive/2026-06-02.02-project-hygiene-tidy/plan.md": {"lines": 735, "purpose": "Plan for a project-hygiene tidy-up."}, + "planning/changes/archive/2026-06-03.01-input-validation-pass/design.md": {"lines": 236, "purpose": "Design for an input-validation pass over the client surface."}, + "planning/changes/archive/2026-06-03.01-input-validation-pass/plan.md": {"lines": 794, "purpose": "Plan for an input-validation pass over the client surface."}, + 
"planning/changes/archive/2026-06-03.02-thin-httpx2-wrapper/design.md": {"lines": 340, "purpose": "Design for the thin httpx2-wrapper pivot."}, + "planning/changes/archive/2026-06-03.02-thin-httpx2-wrapper/plan.md": {"lines": 2531, "purpose": "Plan for the thin httpx2-wrapper pivot."}, + "planning/changes/archive/2026-06-04.01-pydantic-optional-extra/design.md": {"lines": 460, "purpose": "Design for making pydantic an optional extra."}, + "planning/changes/archive/2026-06-04.01-pydantic-optional-extra/plan.md": {"lines": 1066, "purpose": "Plan for making pydantic an optional extra."}, + "planning/changes/archive/2026-06-04.02-v0.2-retro-and-housekeeping/design.md": {"lines": 208, "purpose": "Design for the v0.2 retro and housekeeping."}, + "planning/changes/archive/2026-06-05.01-retry-and-retry-budget/design.md": {"lines": 252, "purpose": "Design for retry and retry-budget middleware."}, + "planning/changes/archive/2026-06-05.01-retry-and-retry-budget/plan.md": {"lines": 1905, "purpose": "Plan for retry and retry-budget middleware."}, + "planning/changes/archive/2026-06-05.02-bulkhead/design.md": {"lines": 216, "purpose": "Design for the bulkhead concurrency limiter."}, + "planning/changes/archive/2026-06-05.02-bulkhead/plan.md": {"lines": 963, "purpose": "Plan for the bulkhead concurrency limiter."}, + "planning/changes/archive/2026-06-05.03-docs-sync-0.4/design.md": {"lines": 185, "purpose": "Design for syncing docs to the 0.4 surface."}, + "planning/changes/archive/2026-06-05.03-docs-sync-0.4/plan.md": {"lines": 645, "purpose": "Plan for syncing docs to the 0.4 surface."}, + "planning/changes/archive/2026-06-05.04-streaming/design.md": {"lines": 335, "purpose": "Design for streaming-response support."}, + "planning/changes/archive/2026-06-05.04-streaming/plan.md": {"lines": 1097, "purpose": "Plan for streaming-response support."}, + "planning/changes/archive/2026-06-05.05-observability/design.md": {"lines": 265, "purpose": "Design for observability (logging + 
OTel) support."}, + "planning/changes/archive/2026-06-05.05-observability/plan.md": {"lines": 1056, "purpose": "Plan for observability (logging + OTel) support."}, + "planning/changes/archive/2026-06-05.06-extension-slot-docs/design.md": {"lines": 148, "purpose": "Design for documenting the extension slots/seams."}, + "planning/changes/archive/2026-06-05.06-extension-slot-docs/plan.md": {"lines": 540, "purpose": "Plan for documenting the extension slots/seams."}, + "planning/changes/archive/2026-06-05.07-v0.7-docs-expansion/design.md": {"lines": 322, "purpose": "Design for the v0.7 docs expansion."}, + "planning/changes/archive/2026-06-05.07-v0.7-docs-expansion/plan.md": {"lines": 956, "purpose": "Plan for the v0.7 docs expansion."}, + "planning/changes/archive/2026-06-06.01-modern-di-recipe/design.md": {"lines": 285, "purpose": "Design for the modern-DI recipe doc."}, + "planning/changes/archive/2026-06-06.01-modern-di-recipe/plan.md": {"lines": 620, "purpose": "Plan for the modern-DI recipe doc."}, + "planning/changes/archive/2026-06-07.01-sync-client/design.md": {"lines": 595, "purpose": "Design for the synchronous Client."}, + "planning/changes/archive/2026-06-07.01-sync-client/plan.md": {"lines": 3533, "purpose": "Plan for the synchronous Client."}, + "planning/changes/archive/2026-06-07.02-decoder-error/design.md": {"lines": 270, "purpose": "Design for DecodeError/MissingDecoderError handling."}, + "planning/changes/archive/2026-06-07.02-decoder-error/plan.md": {"lines": 931, "purpose": "Plan for DecodeError/MissingDecoderError handling."}, + "planning/changes/archive/2026-06-07.03-deep-audit/design.md": {"lines": 294, "purpose": "Design for the 2026-06-07 deep-audit change bundle."}, + "planning/changes/archive/2026-06-07.03-deep-audit/plan.md": {"lines": 757, "purpose": "Plan for the 2026-06-07 deep-audit change bundle."}, + "planning/changes/archive/2026-06-08.01-send-with-response/design.md": {"lines": 225, "purpose": "Design for the send_with_response 
API."}, + "planning/changes/archive/2026-06-08.01-send-with-response/plan.md": {"lines": 669, "purpose": "Plan for the send_with_response API."}, + "planning/changes/archive/2026-06-08.02-retry-budget-cluster/design.md": {"lines": 298, "purpose": "Design for the retry-budget cluster fixes."}, + "planning/changes/archive/2026-06-08.02-retry-budget-cluster/plan.md": {"lines": 1335, "purpose": "Plan for the retry-budget cluster fixes."}, + "planning/changes/archive/2026-06-08.03-post-080-doc-sweep/design.md": {"lines": 254, "purpose": "Design for the post-0.8.0 documentation sweep."}, + "planning/changes/archive/2026-06-08.03-post-080-doc-sweep/plan.md": {"lines": 665, "purpose": "Plan for the post-0.8.0 documentation sweep."}, + "planning/changes/archive/2026-06-08.04-otel-partial-install/design.md": {"lines": 221, "purpose": "Design for handling partial OpenTelemetry installs."}, + "planning/changes/archive/2026-06-08.04-otel-partial-install/plan.md": {"lines": 508, "purpose": "Plan for handling partial OpenTelemetry installs."}, + "planning/changes/archive/2026-06-08.05-small-fixes-mop-up/design.md": {"lines": 297, "purpose": "Design for a small-fixes mop-up bundle."}, + "planning/changes/archive/2026-06-08.05-small-fixes-mop-up/plan.md": {"lines": 715, "purpose": "Plan for a small-fixes mop-up bundle."}, + "planning/changes/archive/2026-06-08.06-test-mop-up/design.md": {"lines": 406, "purpose": "Design for a test-suite mop-up bundle."}, + "planning/changes/archive/2026-06-08.06-test-mop-up/plan.md": {"lines": 818, "purpose": "Plan for a test-suite mop-up bundle."}, + "planning/changes/archive/2026-06-08.07-mkdocs-gh-pages-migration/design.md": {"lines": 126, "purpose": "Design for migrating docs to MkDocs on GitHub Pages."}, + "planning/changes/archive/2026-06-08.07-mkdocs-gh-pages-migration/plan.md": {"lines": 628, "purpose": "Plan for migrating docs to MkDocs on GitHub Pages."}, + "planning/changes/archive/2026-06-08.08-readme-link-cleanup/design.md": {"lines": 
98, "purpose": "Design for cleaning up README links."}, + "planning/changes/archive/2026-06-08.08-readme-link-cleanup/plan.md": {"lines": 431, "purpose": "Plan for cleaning up README links."}, + "planning/changes/archive/2026-06-10.01-multi-decoder/design.md": {"lines": 405, "purpose": "Design for multi-decoder list routing (Seam B)."}, + "planning/changes/archive/2026-06-10.01-multi-decoder/plan.md": {"lines": 1943, "purpose": "Plan for multi-decoder list routing (Seam B)."}, + "planning/changes/archive/2026-06-10.02-decoder-instance-cache/design.md": {"lines": 304, "purpose": "Design for the per-instance decoder cache."}, + "planning/changes/archive/2026-06-10.02-decoder-instance-cache/plan.md": {"lines": 522, "purpose": "Plan for the per-instance decoder cache."}, + "planning/changes/archive/2026-06-12.01-delta-audit/design.md": {"lines": 145, "purpose": "Design for the 2026-06-12 delta-audit change bundle."}, + "planning/changes/archive/2026-06-12.01-delta-audit/plan.md": {"lines": 541, "purpose": "Plan for the 2026-06-12 delta-audit change bundle."}, + "planning/changes/archive/2026-06-13.01-msgspec-nested-customtype-fix/design.md": {"lines": 139, "purpose": "Design for the msgspec nested-custom-type decode fix."}, + "planning/changes/archive/2026-06-13.01-msgspec-nested-customtype-fix/plan.md": {"lines": 323, "purpose": "Plan for the msgspec nested-custom-type decode fix."}, + "planning/changes/archive/2026-06-13.02-circuit-breaker-and-timeout/design.md": {"lines": 319, "purpose": "Design for the circuit-breaker and timeout middleware."}, + "planning/changes/archive/2026-06-13.02-circuit-breaker-and-timeout/plan.md": {"lines": 1558, "purpose": "Plan for the circuit-breaker and timeout middleware."}, + "planning/changes/archive/2026-06-13.03-portable-planning-convention/design.md": {"lines": 225, "purpose": "Design for a portable planning convention."}, + "planning/changes/archive/2026-06-13.03-portable-planning-convention/plan.md": {"lines": 552, "purpose": 
"Plan for a portable planning convention."}, + "planning/changes/archive/2026-06-13.04-docs-accuracy-fixes/change.md": {"lines": 64, "purpose": "Lightweight-lane change for docs-accuracy fixes."}, + "planning/changes/archive/2026-06-13.05-docs-audit-followups/change.md": {"lines": 71, "purpose": "Lightweight-lane change for docs-audit follow-ups."}, + "planning/changes/archive/2026-06-14.01-docs-ux-restructure/design.md": {"lines": 172, "purpose": "Design for the docs UX restructure (thin README, canonical site)."}, + "planning/changes/archive/2026-06-14.01-docs-ux-restructure/plan.md": {"lines": 459, "purpose": "Plan for the docs UX restructure (thin README, canonical site)."} + } + } + } + }, + "tests": { + "framework": "pytest with pytest-asyncio auto mode (async tests need no @pytest.mark.asyncio)", + "property_based": "Hypothesis property-based tests for concurrency-sensitive code (RetryBudget, Bulkhead, retry interleaving), files named test_*_props.py", + "transport_injection": "Tests inject httpx2.MockTransport via AsyncClient(httpx2_client=httpx2.AsyncClient(transport=mock)) or Client(httpx2_client=httpx2.Client(transport=mock)); no respx, no RecordedTransport", + "test_file_count": 41, + "total_test_lines": 9032, + "run_commands": ["just test", "just test-branch", "just test tests/test_client.py -k ", "uv run pytest"] + }, + "docs": { + "site": "MkDocs on GitHub Pages (custom domain via docs/CNAME)", + "top_level_pages": ["index.md", "errors.md", "middleware.md", "resilience.md", "testing.md"], + "recipes": ["recipes/modern-di.md", "recipes/phase-decorator-patterns.md", "recipes/link-header-pagination.md"], + "dev": ["dev/contributing.md"], + "build_inputs": ["requirements.txt", "CNAME"], + "architecture_note": "The per-capability living truth lives in architecture/ at repo root (outside the four scanned directories) and is the promotion target on every ship." 
+ }, + "invariants_to_check": [ + "These are non-negotiable, but **most are NOT machine-checked — don't rely on CI to catch a violation.** Enforced by ruff: `print()` (`T201`) and a blanket `# type: ignore` (`PGH003`). Partially: `httpx2._` (ruff `SLF001` catches attribute access, not a *used* private import). Review-only: the future-import and global-logging bans.", + "No `httpx2` private API: `grep -rE 'httpx2\\._' src/httpware/` should return zero matches (run in review — not wired into CI). Public symbols only.", + "No `from __future__ import annotations`: Python 3.11+ floor; PEP 604/585 syntax is native.", + "No `print()`: enforced by ruff.", + "No global logging config: no `logging.basicConfig()`, no bare `logging.getLogger()`. Acquire `logging.getLogger(\"httpware\")` or `logging.getLogger(f\"httpware.{module}\")` only.", + "Type suppressions: use `# ty: ignore[]`, never `# type: ignore` or `# mypy: ignore`.", + "Modules: `snake_case` (`client.py`, `errors.py`, `middleware/chain.py`).", + "Classes: `PascalCase`. `Http` is two letters: `AsyncClient`, not `ASYNCClient`.", + "Methods: `snake_case`. No `a` prefix on async methods (match `httpx2`); `aclose()` is the sole exception.", + "Private symbols: `_leading_underscore`. Cross-module private code lives in `_internal/`.", + "Imports: absolute paths inside `src/httpware/`; relative imports only within the same subpackage.", + "Docstrings: PEP 257. Module/class/public-method required; `D1` (missing docstring) is ignored.", + "Exception construction: status-keyed `StatusError` subclasses (the 4xx/5xx tree) take a single positional `response: httpx2.Response` and do NOT override `__init__` — all fields via `exc.response.*`. This rule scopes to `StatusError` only; non-status `ClientError` subclasses such as `DecodeError`, `MissingDecoderError`, `BulkheadFullError`, `RetryBudgetExhaustedError`, and `CircuitOpenError` deliberately define `__init__` with keyword-only fields. 
See `architecture/errors.md`.", + "Seam A — `Client`/`AsyncClient` ↔ `Middleware`/`AsyncMiddleware` — middleware chain composed at `Client.__init__` and `AsyncClient.__init__`, frozen for the client's lifetime. Internal terminal calls `httpx2.Client.send` or `httpx2.AsyncClient.send`, maps exceptions, raises `StatusError` on 4xx/5xx. Sync and async surfaces are kept at parity.", + "Seam B — `Client`/`AsyncClient` ↔ `ResponseDecoder` list — both clients take `decoders: Sequence[ResponseDecoder] | None` (a *list*, not a single decoder; `None` resolves against installed extras, pydantic-first). When `response_model` is provided, `send`/`send_with_response` (sync and async alike) walk the list and the first decoder whose `can_decode(model: type) -> bool` returns True runs `decode(content: bytes, model: type[T]) -> T`; if no decoder claims the model, `MissingDecoderError` is raised *before* the HTTP call. Decoder exceptions are wrapped as `DecodeError` at the seam.", + "Seam C — `httpware` ↔ optional extras — each opt-in dependency imported only inside its dedicated module.", + "pytest-asyncio auto mode — async tests do NOT need `@pytest.mark.asyncio`.", + "Property-based tests (Hypothesis) for concurrency-sensitive code: `RetryBudget`, `Bulkhead`, retry interleaving. Files named `test_*_props.py`.", + "Tests inject `httpx2.MockTransport` via `AsyncClient(httpx2_client=httpx2.AsyncClient(transport=mock))` for async or `Client(httpx2_client=httpx2.Client(transport=mock))` for sync. No `respx`, no `RecordedTransport`." 
+ ] +} diff --git a/planning/audits/scripts/workflow-deep.mjs b/planning/audits/scripts/workflow-deep.mjs new file mode 100644 index 0000000..1ec22a7 --- /dev/null +++ b/planning/audits/scripts/workflow-deep.mjs @@ -0,0 +1,492 @@ +export const meta = { + name: 'httpware-deep-audit', + description: 'Full-codebase deep audit: discover + 10 finders + 3-lens verify + single-report synthesis', + phases: [ + { title: 'Discover', detail: 'Fresh module map' }, + { title: 'Find', detail: 'One finder per dimension (10)' }, + { title: 'Verify', detail: '3-lens panel per finding' }, + { title: 'Synthesize', detail: 'Triage + write the full report' }, + ], +} + +// ───── Schemas ────────────────────────────────────────────────────────────── + +const FINDING_SCHEMA = { + type: 'object', + required: ['findings'], + properties: { + findings: { + type: 'array', + items: { + type: 'object', + required: ['dimension', 'title', 'file', 'line_hint', 'claim', + 'evidence_quote', 'suspected_severity'], + properties: { + dimension: { type: 'string' }, + title: { type: 'string' }, + file: { type: 'string' }, + line_hint: { type: 'integer' }, + claim: { type: 'string' }, + evidence_quote: { type: 'string' }, + suspected_severity: { enum: ['blocker', 'high', 'medium', 'low', 'nit'] }, + reproducer_hint: { type: ['string', 'null'] }, + }, + }, + }, + }, +} + +const VERDICT_SCHEMA = { + type: 'object', + required: ['lens', 'confirmed', 'reason'], + properties: { + lens: { enum: ['code_reality', 'reproducer', 'spec_grounded'] }, + confirmed: { type: 'boolean' }, + reason: { type: 'string' }, + quoted_evidence: { type: ['string', 'null'] }, + severity_adjustment: { enum: ['unchanged', 'raise', 'lower', null] }, + }, +} + +const DISCOVER_SCHEMA = { + type: 'object', + required: ['modules', 'tests', 'docs', 'invariants_to_check'], + properties: { + modules: { type: 'object' }, + tests: { type: 'object' }, + docs: { type: 'object' }, + invariants_to_check: { type: 'array', items: { type: 'string' } 
}, + }, +} + +// ───── Dimension prompts ──────────────────────────────────────────────────── + +const DIMENSION_PROMPTS = { + correctness: `You are auditing the httpware repository for CORRECTNESS bugs only. +Read every file under src/httpware/ and look for: logic errors, off-by-ones, +wrong branches, dead code, broken control flow, mis-named variables, +accidentally swapped arguments, mishandled None/empty cases. + +Out of scope for this dimension: concurrency races (the concurrency finder +handles those), error-contract violations (the error_contract finder), +public-API typing (the public_api finder), optional-extras leaks (the +optional_extras finder), tests (the tests finder), docs (the architecture_docs +finder). + +Use the discover JSON as your file inventory. For each finding return: title, +file, approximate line, a 1-3 sentence claim explaining what is wrong AND why +it is wrong (not just what the code does), a verbatim 5-15 line evidence quote, +suspected severity, and a reproducer hint if applicable. + +Default to NOT reporting when uncertain. Quality > quantity. Aim for 6-12 high- +signal findings, not 30 weak ones.`, + + concurrency: `You are auditing the httpware repository for CONCURRENCY hazards +and SYNC/ASYNC PARITY divergence. + +Focus on: src/httpware/middleware/resilience/{retry,bulkhead,budget}.py and +their tests under tests/test_*_props.py, test_retry_budget_threadsafety.py, +test_threading_with_shared_budget.py. + +Look for: missing locks, races on shared mutable state, threading.Semaphore vs +asyncio.Semaphore semantics mismatches, RetryBudget sharing between sync Client +and AsyncClient (new in 0.8.0), property-test strategies that don't actually +exercise the race they claim to, behavior divergence between sync Retry and +AsyncRetry / Bulkhead and AsyncBulkhead that isn't documented as intentional. + +Out of scope: pure-correctness logic errors (the correctness finder), error +contract (the error_contract finder). 
+ +6-12 findings target. Default to silence when uncertain.`, + + error_contract: `You are auditing the httpware repository against the +ERROR CONTRACT documented in CLAUDE.md: + +- Status-keyed errors take a SINGLE positional response: httpx2.Response. +- Subclasses do NOT override __init__. +- All fields available via exc.response.*. +- 4xx and 5xx map to the appropriate StatusError subclass at the terminal call. + +Check src/httpware/errors.py and the terminal in src/httpware/client.py. +Cross-reference tests/test_errors.py and tests/test_error_mapping_terminal.py: +do the tests actually prove the invariants, or do they pass for the wrong +reason? + +Report any deviation from the invariants, even if minor. Also report places +where the docstring or type signature is silent on a contractual point. + +Out of scope: other code correctness, concurrency. 4-8 findings target.`, + + public_api: `You are auditing the httpware PUBLIC API SURFACE. + +Read src/httpware/__init__.py and src/httpware/middleware/__init__.py and +src/httpware/decoders/__init__.py. Compare against: +- tests/test_public_api.py +- README.md examples +- architecture/*.md import statements + +Look for: symbols exported but not in __all__, symbols in __all__ but not +defined, stale Async* aliases left over from the 0.8.0 rename, missing +type re-exports (re-exporting a class without its TypeVar bound is a smell), +imports that succeed but produce a partially-initialized object. + +Per memory: the project keeps __all__ only in __init__.py (not submodules). + +Out of scope: optional extras (the optional_extras finder), internal modules. +4-8 findings.`, + + optional_extras: `You are auditing the OPTIONAL EXTRAS BOUNDARY. + +Invariant: pydantic, msgspec, and otel must be importable ONLY inside their +dedicated modules. Top-level import httpware must not pull them. 
The fail-fast +error when a decoder is requested without its extra installed must trigger at +AsyncClient.__init__ / Client.__init__, NOT at first response decode. + +Check: +- src/httpware/decoders/pydantic.py, src/httpware/decoders/msgspec.py +- src/httpware/_internal/import_checker.py +- src/httpware/_internal/observability.py (OTel hook) +- tests/test_optional_extras_isolation.py +- tests/test_optional_extras_otel_missing.py +- tests/test_optional_extras_pydantic_missing.py + +Look for: stray top-level imports, lazy imports that defeat fail-fast, +ImportError handling that swallows the wrong exception, tests that don't +prove the isolation they claim to. + +Out of scope: in-decoder bugs (the correctness finder). 3-6 findings.`, + + tests: `You are auditing the httpware TEST SUITE. + +Look for: +- Coverage gaps: code paths in src/httpware/ with no test (use the discover map). +- Hypothesis property tests with strategies too narrow to exercise the + invariant they claim to (e.g. integers(min_value=0, max_value=1) won't find + most off-by-one). +- Mock transports that hide real httpx2 behavior (e.g. returning bytes that + httpx2 would never produce). +- Tests that pass for the wrong reason (assert True equivalents, no + assertions, mocks that absorb the failure). +- Sync/async parity gaps: a thoroughly tested async behavior with no + corresponding sync test, or vice versa (especially after 0.8.0). + +Out of scope: production code bugs (the correctness/concurrency/error_contract/ +public_api/optional_extras finders), docs (the architecture_docs finder). +8-14 findings.`, + + architecture_docs: `You are auditing architecture/*.md for DRIFT against +the current code. + +Read every file: architecture/{overview,client,middleware,decoders,errors, +resilience,extras,testing}.md. For each load-bearing claim, verify it +against the actual src/httpware/ code, public API (__init__.py __all__), +and tests. 
+ +Look for: +- Class/decorator/method names made stale by the 0.8.0 Async* rename or + later changes (Middleware vs AsyncMiddleware, Retry vs AsyncRetry, etc.). +- Described behavior the code no longer matches (circuit breaker states, + async timeout non-finite handling, multi-decoder routing, per-instance + decoder cache, send_with_response). +- Invariants stated as enforced that are actually only review-enforced (the + 2026-06-13 docs work corrected some of these — check none regressed). +- Import statements or code blocks that would not run against current + src/httpware/. +- Cross-references / links that do not resolve. + +Report each with the architecture file, the inaccurate quote, and the +current truth. Out of scope: docs/ site content and planning/ docs. +4-10 findings.`, + + performance: `You are auditing the httpware repository for PERFORMANCE +issues only. + +Scope: src/httpware/ — the per-request hot path above all. Read client.py +(send / send_with_response / stream, sync and async), middleware/chain.py +(compose + Next), and middleware/resilience/{retry,bulkhead,budget, +circuit_breaker,timeout}.py. + +Look for: +- Allocations or work repeated per-request that could be hoisted to + __init__ (chain re-composition, rebuilding decoder lists, recreating + closures, redundant dict/list copies). +- Lock-hold scope: work done while holding RetryBudget/Bulkhead/ + CircuitBreaker locks that could happen outside the critical section; + contention hot spots under concurrency. +- Decoder / TypeAdapter caching: is the per-instance cache (0.9.0) actually + hit, or rebuilt? Any O(n) decoder-list scan that runs per response when it + could be memoized per model. +- Async overhead: event-loop-blocking sync calls inside async paths, + sequential awaits that could be concurrent, needless gather/wrapping. +- Response body handling: bytes read/copied more than once, eager reads on + a streaming path. 
+ +Quantify the cost where you can (per-request vs per-client, O(n) vs O(1)). +This dimension is about COST, not safety — concurrency hazards and logic +bugs belong to other finders. Default to NOT reporting micro-optimizations +with no measurable payoff. 6-12 findings target.`, + + security: `You are auditing the httpware repository for SECURITY and +SUPPLY-CHAIN issues only. + +Look for: +- Untrusted-response trust boundaries: status code, headers, and body come + from the server — anywhere httpware trusts them without bound (e.g. + unbounded reads driven by a header, status used to index without guard). +- Decoder deserialization safety: pydantic and msgspec run on + attacker-controlled bytes in decoders/{pydantic,msgspec}.py. Any path that + could be driven to excessive recursion, memory, or arbitrary type + construction? Is body size ever bounded? +- Inherited httpx2 surfaces: redirect-following, URL handling, proxy/SSRF + exposure — does httpware widen or fail to constrain anything httpx2 leaves + to the caller? Report the boundary even if the default is httpx2's. +- Secret leakage: do exception messages, repr, or log/OTel events ever + include auth headers, cookies, or URLs with embedded credentials? Check + errors.py (StatusError holds the full Response) and + _internal/observability.py. +- Supply chain: version floors/ceilings in pyproject.toml for httpx2 and the + optional extras (pydantic/msgspec/otel). Unpinned-floor or over-wide + ranges that could pull a vulnerable transitive version. + +Report the trust boundary even when the current default is safe, but mark +severity honestly (a documented httpx2 default is a nit; an unbounded +attacker-driven allocation is high). 6-12 findings target.`, + + refactoring: `You are auditing the httpware repository for REFACTORING +opportunities and INCONSISTENCIES only — not bugs. 
+ +Look for: +- Sync/async duplication: logic copy-pasted between Client and AsyncClient + (or Retry/AsyncRetry, Bulkhead/AsyncBulkhead) that could share a helper + WITHOUT crossing a protocol seam (Seam A/B/C in CLAUDE.md). Note where a + copy has already drifted. +- Inconsistent patterns: error construction, naming, signatures, or control + flow that differ for no reason across sibling modules. Cross-check the + conventions in CLAUDE.md (StatusError vs other ClientError __init__ rules, + naming, import style). +- Dead or unreachable code; over-complex branching that flattens; module + boundaries that have eroded. + +Every finding states the concrete payoff (what gets simpler / what +divergence it prevents), not aesthetics. A suggestion the conventions are +silent on is a nit or low at most. Never propose crossing a documented +protocol seam. Default severity low/nit unless a duplication has already +caused a real divergence. 5-10 findings target.`, +} + +// ───── Verifier prompts ───────────────────────────────────────────────────── + +const VERIFIER_PROMPTS = { + code_reality: (f) => `Re-read ${f.file} around line ${f.line_hint} (±30 lines). +The finder claims: + +Title: ${f.title} +Claim: ${f.claim} +Evidence quoted by finder: +${f.evidence_quote} + +Does the claim match what the code actually does, or did the finder misread? +Default to confirmed: false if the cited code does not support the claim, or +if you can't locate the cited code. Return your verdict per schema.`, + + reproducer: (f) => `The finder claims: + +Title: ${f.title} +Claim: ${f.claim} +Reproducer hint: ${f.reproducer_hint ?? '(none provided)'} + +Could you sketch a test (3-5 lines) that demonstrates this bug? If the finding +is in docs or planning docs, reframe: would a reader making a reasonable choice +based on the doc be misled? + +If you cannot construct a reproducer (or a misleading-reading), set +confirmed: false. 
Otherwise confirmed: true with the sketch in quoted_evidence.`, + + spec_grounded: (f) => `The finder claims: + +Title: ${f.title} +Claim: ${f.claim} + +Does this violate a stated invariant in CLAUDE.md or planning/engineering.md +(error contract, optional-extras pattern, no httpx2._ private API, no global +logging config, naming conventions, etc.)? + +- If yes: confirmed: true, cite the invariant verbatim in quoted_evidence. +- If it's a judgment call with no spec backing: confirmed: false. + +Severity adjustment: raise if this is a CLAUDE.md-listed invariant; lower if +the spec is silent and this is a hardening suggestion.`, +} + +// ───── Synthesis prompt ───────────────────────────────────────────────────── + +const SYNTHESIS_PROMPT = (dims, confirmed, refuted, auditFile, discoverFile) => ` +You are writing the FINAL httpware deep-audit report. This is a single +combined run (not chunked) — you write the whole file, including the +top-of-file Summary. + +You have ${confirmed.length} CONFIRMED findings (survived ≥2/3 verifiers) and +${refuted.length} REFUTED candidates (investigated, did not survive) across +dimensions: ${dims.join(', ')}. + +Tasks: +1. Triage each confirmed finding into a bucket: blocker / high / medium / + low / nit, applying severity strictly. If more than 4 nits share a + dimension, roll them into one " nits (rolled up)" entry (the + rolled entry counts its constituents in the totals). +2. Dedupe confirmed findings against each other (file + line ±5 + similar + claim); fold duplicates into one entry. +3. Write ${auditFile} with this structure: + - "# httpware deep audit — 2026-06-14" + - "**Status:** complete" and a one-line "**Method:**" (ten adversarial + finders → 3-lens verify panel → ≥2/3 to survive → single synthesis). + - "## Summary" — counts per bucket (Blockers/High/Medium/Low/Nits), the + single headline finding in one sentence, and an explicit "Not covered" + line. + - "## Findings" — grouped by bucket. 
Each finding: a "#### " title, a + "*(dimension — verified)*" tag line, file:line in \`code\` format, a + ≤3-sentence claim, a fenced code block with the evidence quote, the + verifier consensus (e.g. "panel 3/3: code_reality, reproducer, + spec_grounded"), and a one-line suggested direction. Directions only — + do NOT write fixes or patches. + - "## Negative results (verified correct)" — a bulleted list built from + the REFUTED candidates and invariants the finders checked and found + held. One line each: what was checked and why it is fine. Summarize; + do not dump raw JSON. +4. Use the Write tool to create ${auditFile} (overwrite if present). +5. Stage and commit ONLY the report and the discover map — NO source edits: + git add ${auditFile} ${discoverFile} + git commit -m "audit(deep): 2026-06-14 full-codebase audit — confirmed ()" + Then run \`git status\` to confirm a clean tree. If git reports any + modified src/ or tests/ file, STOP and report it — this pass must not + touch source. + +CONFIRMED findings JSON: +${JSON.stringify(confirmed, null, 2)} + +REFUTED candidates JSON (for Negative results — summarize, do not dump): +${JSON.stringify(refuted, null, 2)} +` + +// ───── Script body ────────────────────────────────────────────────────────── + +const SONNET = 'claude-sonnet-4-6' +const OPUS = 'claude-opus-4-8' + +// args may arrive as a JSON string (depending on harness) — normalize. +const cfg = typeof args === 'string' ? JSON.parse(args) : (args ?? {}) + +if (cfg.run_discover !== false) { + phase('Discover') + log('Building module map (one-shot)') + // The discover agent both produces structured data AND writes it to disk; + // schema validates the structure, the prompt requires it to call Write afterward. + await agent( + `Build a JSON module map of the httpware repo. List every file under src/httpware/, +tests/, docs/, and planning/. For each entry capture: line count, a one-sentence +purpose. 
Also extract the load-bearing invariants from CLAUDE.md verbatim. + +After building the structure, write it as pretty-printed JSON to: + ${cfg.discover_file} + +Use the Write tool to create the file. Do NOT commit it; the outer plan handles that. +Return the structure per schema.`, + { model: OPUS, schema: DISCOVER_SCHEMA, label: 'discover' }, + ) +} + +phase('Find') +const unknownDims = cfg.dimensions.filter(d => !DIMENSION_PROMPTS[d]) +if (unknownDims.length) throw new Error(`Unknown dimensions: ${unknownDims.join(', ')}`) +const findings = await parallel( + cfg.dimensions.map(dim => () => + agent( + `${DIMENSION_PROMPTS[dim]} + +Before you start, use the Read tool to load the discover map at ${cfg.discover_file}. +It contains the full file inventory (with line counts and purpose strings) and the +load-bearing invariants from CLAUDE.md. Use it to drive your search instead of +guessing at the codebase layout. + +Return per schema.`, + { model: SONNET, schema: FINDING_SCHEMA, label: `find:${dim}`, phase: 'Find' }, + ) + ), +) + +const FINDINGS_PER_DIM_CAP = 15 +const rawDimensionResults = findings.filter(Boolean) +const oversizedDims = rawDimensionResults.filter(r => r.findings.length > FINDINGS_PER_DIM_CAP) +for (const r of oversizedDims) { + const dimName = r.findings[0]?.dimension ?? 
'' + log(`WARNING: dimension ${dimName} returned ${r.findings.length} findings; capping at ${FINDINGS_PER_DIM_CAP}`) +} +const allFindings = rawDimensionResults.flatMap(r => r.findings.slice(0, FINDINGS_PER_DIM_CAP)) +log(`Found ${allFindings.length} candidate findings across ${cfg.dimensions.length} dimensions`) + +phase('Verify') +const verified = await parallel( + allFindings.map(f => () => + parallel(['code_reality', 'reproducer', 'spec_grounded'].map(lens => () => + agent(VERIFIER_PROMPTS[lens](f), { + model: SONNET, schema: VERDICT_SCHEMA, + label: `verify:${f.dimension}:${lens}`, phase: 'Verify', + }) + )).then(verdicts => { + const live = verdicts.filter(Boolean) + const confirms = live.filter(v => v.confirmed).length + const surviving = confirms >= 2 + const lensesConfirming = live.filter(v => v.confirmed).map(v => v.lens) + const adjustments = live.map(v => v.severity_adjustment).filter(Boolean) + const raiseCount = adjustments.filter(a => a === 'raise').length + const lowerCount = adjustments.filter(a => a === 'lower').length + let severity = f.suspected_severity + if (lowerCount >= 1) severity = lowerOne(severity) + if (raiseCount >= 2) severity = raiseOne(severity) + if (verdicts.every(v => v === null)) { + log(`WARNING: all 3 verifiers failed for finding "${f.title}" (${f.file}:${f.line_hint}) — dropped`) + return null + } + const refuteReason = live.find(v => !v.confirmed)?.reason ?? 'no verifier confirmed' + return surviving + ? 
{ ...f, surviving: true, final_severity: severity, lensesConfirming } + : { ...f, surviving: false, refuteReason } + }) + ), +) + +const triaged = verified.filter(Boolean) +const confirmed = triaged.filter(v => v.surviving) +const refuted = triaged.filter(v => !v.surviving) +log(`${confirmed.length}/${allFindings.length} confirmed by ≥2 verifiers; ${refuted.length} refuted (kept for Negative results)`) + +phase('Synthesize') +await agent( + SYNTHESIS_PROMPT(cfg.dimensions, confirmed, refuted, cfg.audit_file, cfg.discover_file), + { model: OPUS, label: 'synthesize:deep' }, +) + +return { + candidates: allFindings.length, + confirmed: confirmed.length, + refuted: refuted.length, + by_severity: countBySeverity(confirmed), +} + +// ───── Helpers ────────────────────────────────────────────────────────────── + +function lowerOne(s) { + const order = ['nit', 'low', 'medium', 'high', 'blocker'] + const i = order.indexOf(s) + return i > 0 ? order[i - 1] : s +} +function raiseOne(s) { + const order = ['nit', 'low', 'medium', 'high', 'blocker'] + const i = order.indexOf(s) + return i < order.length - 1 ? order[i + 1] : s +} +function countBySeverity(arr) { + const out = { blocker: 0, high: 0, medium: 0, low: 0, nit: 0 } + for (const f of arr) out[f.final_severity ?? 
f.suspected_severity]++ + return out +} diff --git a/planning/changes/active/2026-06-14.01-deep-audit/design.md b/planning/changes/active/2026-06-14.01-deep-audit/design.md new file mode 100644 index 0000000..53f0a1b --- /dev/null +++ b/planning/changes/active/2026-06-14.01-deep-audit/design.md @@ -0,0 +1,202 @@ +--- +status: draft +date: 2026-06-14 +slug: deep-audit +supersedes: null +superseded_by: null +pr: null +outcome: null +--- + +# Design: Full-codebase deep audit (perf · security · refactoring · bugs) + +## Summary + +Run a fresh full-codebase deep audit of `httpware` (code + tests + +`architecture/` docs) using a multi-agent Workflow, producing one findings +report at `planning/audits/2026-06-14-deep-audit.md` in the established +taxonomy (Blocker / High / Medium / Low / Nit + a Negative-results section). +The audit deliberately covers the dimensions the +[2026-06-07 deep audit](../../../audits/2026-06-07-deep-audit.md) explicitly +left uncovered — **performance, security, supply-chain** — plus correctness, +concurrency, refactoring, and test quality. **Report only: no code changes, +no fix PRs.** Confirmed findings spawn follow-up change bundles later, per the +normal audit→fix flow. + +## Motivation + +- The 2026-06-07 deep audit states in its own summary: *"No dedicated chunk + covered performance, security, or supply-chain dimensions"* and its `tests` + dimension stalled (~1.5M Sonnet tokens, zero findings). Those are real + coverage gaps in the only full audit on record. +- Since 0.8.0 the codebase has grown materially (sync `Client`, circuit + breaker, async timeout, multi-decoder routing, per-instance decoder cache). + Delta audits covered each in isolation; nothing has swept the whole surface + with the gap dimensions in mind. 
+- The existing harness (`planning/audits/scripts/workflow.mjs`) still encodes + pre-restructure reality: it points finders at `docs/*.md` and + `planning/engineering.md` (now `architecture/*.md` and `planning/changes/`), + pins stale model IDs (`claude-opus-4-7`), and has no performance, security, + or refactoring finders. A fresh run needs an updated orchestrator. + +## Non-goals + +- No code changes, fixes, or PRs — this pass produces a report only. +- No planning-doc staleness sweep beyond `architecture/` (the recent + docs/UX work already churned `planning/` and the docs site). +- No re-audit of the docs site content (`docs/`) — it had its own + 2026-06-13 docs audit. +- Not a delta audit — scope is the whole `src/httpware/**` + `tests/**` + surface, not a single version's diff. + +## Design + +### 1. New orchestrator: `workflow-deep.mjs` + +Fork `planning/audits/scripts/workflow.mjs` into a sibling +`workflow-deep.mjs` rather than mutating the existing (delta-oriented) script. +Same four-phase pipeline — **Discover → Find → Verify → Synthesize** — same +schemas (`FINDING_SCHEMA`, `VERDICT_SCHEMA`, `DISCOVER_SCHEMA`). Differences: + +- **Model IDs refreshed:** `claude-opus-4-8` for discover + synthesis, + `claude-sonnet-4-6` for finders + verifiers. +- **Paths corrected** to current repo reality everywhere they appear in + prompts: `architecture/*.md` (not `docs/*.md`), `planning/changes/` (not + `planning/engineering.md` / `planning/specs/`), `CLAUDE.md` invariants. +- **Single combined run** (not per-chunk): all finders fan out in one + `parallel()`, one verify pass, one synthesis writing the whole report. + At ~11.8K LOC the synthesis context stays manageable, and a single run is + simpler to reason about than the old chunk-and-commit-per-chunk flow. +- **Discover refresh:** rebuild the module map to a dated file + `planning/audits/scripts/_discover-2026-06-14.json` so the run is + reproducible and doesn't clobber the existing `_discover.json`. + +### 2. 
Finder dimensions (one agent each) + +The four selected areas expand to ten focused finders so each stays narrow +and adversarial (broad finders dilute signal): + +| Area | Finders | Status | +|------|---------|--------| +| Correctness & concurrency | `correctness`, `concurrency` (sync/async parity), `error_contract` | reuse, path-refreshed | +| Performance | `performance` | **new** | +| Security & supply-chain | `security` | **new** | +| Refactoring & test quality | `refactoring`, `tests`, `public_api`, `optional_extras` | `refactoring` new; rest reused | +| Architecture docs | `architecture_docs` (drift of `architecture/*.md` vs code) | reuse, repointed from `docs`/`planning_docs` | + +Each finder: reads the discover map first, targets 6–12 high-signal findings, +returns the `FINDING_SCHEMA` (dimension, title, file, line_hint, claim, +evidence_quote, suspected_severity, reproducer_hint), and **defaults to +silence when uncertain** (quality > quantity). Per-dimension cap stays at 15. + +New finder prompt sketches: + +- **`performance`** — hot-path allocations and redundant work in the + middleware chain composition and per-request `send`; lock-hold scope and + contention in `RetryBudget` / `Bulkhead` / `CircuitBreaker`; decoder / + `TypeAdapter` caching effectiveness (the per-instance cache landed in 0.9.0); + avoidable async overhead (gather vs sequential, event-loop-blocking calls); + unnecessary `Response` body reads / copies. Out of scope: correctness, + concurrency *hazards* (those are other finders) — this is cost, not safety. 
+- **`security`** — untrusted-response handling (status/header/body trust + boundaries); decoder deserialization safety (pydantic/msgspec on + attacker-controlled bytes, recursion/size limits); header, redirect-following, and + URL/SSRF surfaces inherited from `httpx2`; exception messages or logs that + could leak secrets (auth headers, URLs with credentials); dependency pinning + and the optional-extras supply-chain surface (version floors in + `pyproject.toml`). Out of scope: pure logic bugs. +- **`refactoring`** — duplication between sync and async surfaces that could + share a helper without crossing a protocol seam; inconsistent naming / + signatures / error-construction patterns; dead code; over-complex control + flow; module-boundary smells. Findings are *suggestions* with a stated + payoff, never style nits dressed up as bugs. Default severity low/nit unless + the duplication has caused a real divergence. + +Reused finders (`correctness`, `concurrency`, `error_contract`, `tests`, +`public_api`, `optional_extras`, `architecture_docs`) carry forward their +2026-06-07 prompts verbatim except for the path/model corrections in §1. + +### 3. Verify — 3-lens panel per finding + +Unchanged from the existing harness. Each candidate gets three +independent Sonnet verifiers: + +- **`code_reality`** — re-read the cited code ±30 lines; did the finder + misread? Default `confirmed: false` if the code doesn't support the claim or + can't be located. +- **`reproducer`** — sketch a 3–5 line test that demonstrates the bug (or, for + a doc finding, a reasonable misleading read). No repro ⇒ `confirmed: false`. +- **`spec_grounded`** — does it violate a stated `CLAUDE.md` invariant (error + contract, optional-extras isolation, no `httpx2._`, no global logging, + naming)? Raise severity if it hits a listed invariant; lower if the spec is + silent and it's a hardening suggestion. + +A finding **survives on ≥2/3 confirms**. 
Severity is lowered if ≥1 verifier +says lower, raised if ≥2 say raise (existing `lowerOne`/`raiseOne` logic). + +### 4. Synthesize — one report, with Negative results + +A single Opus synthesis agent: + +1. Triages surviving findings into Blocker / High / Medium / Low / Nit using + the spec's severity definitions; rolls up >4 nits per dimension into one + entry. +2. Dedups across dimensions (file + line ±5 + similar claim). +3. Writes `planning/audits/2026-06-14-deep-audit.md`: top Summary + (counts + headline), per-bucket findings (each: `file:line` in code format, + ≤3-sentence claim, fenced evidence quote, verifier consensus e.g. "3/3: + code_reality, reproducer, spec_grounded", suggested direction), and a + **Negative results** section. +4. **Negative results** are built from candidates the panel *refuted* + (`surviving === false`) plus invariants the finders explicitly checked and + found held — the "verified-correct surface" that is a signature of the + prior reports. The new orchestrator passes the refuted candidates to + synthesis instead of silently dropping them (the one structural change from + the old harness, which discarded non-survivors). +5. Commits the report with an `audit(deep): …` message. **No source edits.** + +## Operations + +None. Entirely in-repo; no infra, DNS, or external accounts. + +## Out of scope + +Listed under Non-goals. The one explicit follow-up: confirmed findings feed +the normal audit→fix flow as new `planning/changes/active/` bundles, triaged +by severity, in a *separate* session. + +## Testing + +This change ships an audit report + an orchestrator script, not library code, +so "testing" means validating the *process*, not pytest: + +- The Workflow completes all four phases without an unhandled throw; the + discover JSON is written; the report file exists and parses as the expected + Markdown structure (all five severity buckets present even if empty, a + Negative-results section present). 
+- Spot-check: manually reproduce the single highest-severity finding (if any) + to confirm the panel didn't pass a false positive — matching the prior + audits' "headline findings reproduced directly" discipline. +- Sanity: total confirmed count is plausible (not 0 across all ten finders, + which would signal a broken discover map or mispointed paths — the failure + mode that stalled the 2026-06-07 `tests` dimension). + +## Risk + +- **Stale paths silently yield zero findings** (most likely × high impact): + if a finder prompt still points at a moved/renamed file it returns nothing + and looks "clean." Mitigation: every path in every prompt is rewritten in §1 + against the verified current tree; discover map is regenerated fresh and + finders are required to read it before searching. +- **False positives surviving the panel** (medium × medium): three Sonnet + verifiers can collectively confirm a plausible-but-wrong finding. + Mitigation: `code_reality` defaults to false on any doubt; headline findings + are hand-reproduced before the report is trusted. +- **Token cost overrun** (medium × low): ~10 finders + (~80 candidates × 3 + verifiers) + synthesis ≈ 250 mostly-Sonnet agents, Opus only for discover + + synthesis — in line with the prior ~1M-token deep audit, which the user has + opted into. The 15/dimension cap and ≥2/3 gate bound the verify fan-out. +- **Refactoring finder produces noise** (medium × low): subjective cleanup + suggestions can crowd the report. Mitigation: prompt forces a stated payoff + and low/nit default; the spec_grounded lens lowers anything the conventions + are silent on. 
diff --git a/planning/changes/active/2026-06-14.01-deep-audit/plan.md b/planning/changes/active/2026-06-14.01-deep-audit/plan.md new file mode 100644 index 0000000..e360c0e --- /dev/null +++ b/planning/changes/active/2026-06-14.01-deep-audit/plan.md @@ -0,0 +1,577 @@ +--- +status: draft +date: 2026-06-14 +slug: deep-audit +spec: deep-audit +pr: null +--- + +# deep-audit — implementation plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use +> superpowers:subagent-driven-development (recommended) or +> superpowers:executing-plans to implement this plan task-by-task. Steps +> use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Build a refreshed multi-agent audit orchestrator +(`workflow-deep.mjs`) and run it to produce +`planning/audits/2026-06-14-deep-audit.md` — a full-codebase, report-only +deep audit covering performance, security/supply-chain, refactoring, and the +core correctness/concurrency/test dimensions. + +**Spec:** [`design.md`](./design.md) + +**Branch:** `audit/2026-06-14-deep-audit` (already created) + +**Commit strategy:** Per-task commits. The script lands in its own commit; +the run produces the report in a synthesis-agent commit; the Index update is +a final commit. + +> **Execution note — who runs which task.** Tasks 1–5 (build + syntax-check +> the script) are ordinary file edits and may be done by a subagent. **Task 6 +> (run the audit) MUST be executed by the main session**, because it calls the +> `Workflow` tool, whose multi-agent fan-out the user explicitly opted into at +> the main loop — a sandboxed subagent cannot invoke it. Task 7 (validate + +> finalize) is also main-session. 
+ +--- + +### Task 1: Scaffold `workflow-deep.mjs` from the existing harness + +**Files:** +- Create: `planning/audits/scripts/workflow-deep.mjs` +- Reference (read-only): `planning/audits/scripts/workflow.mjs` + +Copy the delta-oriented harness into a new combined-run orchestrator, then +update only the `meta`, model IDs, and dimension list in this task. Prompts +and body come in later tasks. + +- [ ] **Step 1: Copy the file** + + Run: + ```bash + cp planning/audits/scripts/workflow.mjs planning/audits/scripts/workflow-deep.mjs + ``` + +- [ ] **Step 2: Replace the `meta` block** at the top of + `workflow-deep.mjs` with: + + ```javascript + export const meta = { + name: 'httpware-deep-audit', + description: 'Full-codebase deep audit: discover + 10 finders + 3-lens verify + single-report synthesis', + phases: [ + { title: 'Discover', detail: 'Fresh module map' }, + { title: 'Find', detail: 'One finder per dimension (10)' }, + { title: 'Verify', detail: '3-lens panel per finding' }, + { title: 'Synthesize', detail: 'Triage + write the full report' }, + ], + } + ``` + +- [ ] **Step 3: Refresh the model IDs** near the bottom of the file. Replace: + + ```javascript + const SONNET = 'claude-sonnet-4-6' + const OPUS = 'claude-opus-4-7' + ``` + + with: + + ```javascript + const SONNET = 'claude-sonnet-4-6' + const OPUS = 'claude-opus-4-8' + ``` + +- [ ] **Step 4: Syntax-check** (the body still references old prompts that + exist, so this should parse): + + Run: `node --check planning/audits/scripts/workflow-deep.mjs` + Expected: no output, exit 0. + + (Note: `node --check` validates syntax only; it does not resolve the + workflow globals `agent`/`parallel`/`phase`/`log`, which is fine.) 
+ +--- + +### Task 2: Repoint the reused finder prompts and replace the doc finders + +**Files:** +- Modify: `planning/audits/scripts/workflow-deep.mjs` + +The reused finders (`correctness`, `concurrency`, `error_contract`, `tests`, +`public_api`, `optional_extras`) carry forward, but every prompt reference to +the old layout must point at current reality. The two old doc finders +(`docs`, `planning_docs`) are removed and replaced by a single +`architecture_docs` finder. + +- [ ] **Step 1: Fix stale paths in the reused prompts.** In the + `DIMENSION_PROMPTS` object, apply these substitutions wherever they appear + in the `correctness`, `concurrency`, `error_contract`, `tests`, + `public_api`, and `optional_extras` prompt strings: + + - `docs/*.md` / `docs/recipes/` / `docs/dev/` references → drop (those + finders no longer cover the docs site; `architecture_docs` does). + - `planning/engineering.md` → `CLAUDE.md` and `architecture/.md`. + - Any `README.md` example references in `public_api` → keep `README.md` but + add `architecture/*.md` as the doc cross-reference. + - Leave the dimension-scoping ("out of scope: …") lines intact but update + the parenthetical dimension numbers to names, since the dimension set + changed (e.g. "(dimension 7-8)" → "(the architecture_docs finder)"). + + The substantive instructions, targets ("6-12 findings"), and + "default to silence" lines stay unchanged. + +- [ ] **Step 2: Delete the `docs` and `planning_docs` prompts** from + `DIMENSION_PROMPTS` entirely. + +- [ ] **Step 3: Add the `architecture_docs` prompt** to `DIMENSION_PROMPTS`: + + ```javascript + architecture_docs: `You are auditing architecture/*.md for DRIFT against + the current code. + + Read every file: architecture/{overview,client,middleware,decoders,errors, + resilience,extras,testing}.md. For each load-bearing claim, verify it + against the actual src/httpware/ code, public API (__init__.py __all__), + and tests. 
+ + Look for: + - Class/decorator/method names made stale by the 0.8.0 Async* rename or + later changes (Middleware vs AsyncMiddleware, Retry vs AsyncRetry, etc.). + - Described behavior the code no longer matches (circuit breaker states, + async timeout non-finite handling, multi-decoder routing, per-instance + decoder cache, send_with_response). + - Invariants stated as enforced that are actually only review-enforced (the + 2026-06-13 docs work corrected some of these — check none regressed). + - Import statements or code blocks that would not run against current + src/httpware/. + - Cross-references / links that do not resolve. + + Report each with the architecture file, the inaccurate quote, and the + current truth. Out of scope: docs/ site content and planning/ docs. + 4-10 findings.`, + ``` + +- [ ] **Step 4: Syntax-check.** + + Run: `node --check planning/audits/scripts/workflow-deep.mjs` + Expected: no output, exit 0. + +--- + +### Task 3: Add the three new finder prompts + +**Files:** +- Modify: `planning/audits/scripts/workflow-deep.mjs` + +Add `performance`, `security`, and `refactoring` to `DIMENSION_PROMPTS`. + +- [ ] **Step 1: Add the `performance` prompt:** + + ```javascript + performance: `You are auditing the httpware repository for PERFORMANCE + issues only. + + Scope: src/httpware/ — the per-request hot path above all. Read client.py + (send / send_with_response / stream, sync and async), middleware/chain.py + (compose + Next), and middleware/resilience/{retry,bulkhead,budget, + circuit_breaker,timeout}.py. + + Look for: + - Allocations or work repeated per-request that could be hoisted to + __init__ (chain re-composition, rebuilding decoder lists, recreating + closures, redundant dict/list copies). + - Lock-hold scope: work done while holding RetryBudget/Bulkhead/ + CircuitBreaker locks that could happen outside the critical section; + contention hot spots under concurrency. 
+ - Decoder / TypeAdapter caching: is the per-instance cache (0.9.0) actually + hit, or rebuilt? Any O(n) decoder-list scan that runs per response when it + could be memoized per model. + - Async overhead: event-loop-blocking sync calls inside async paths, + sequential awaits that could be concurrent, needless gather/wrapping. + - Response body handling: bytes read/copied more than once, eager reads on + a streaming path. + + Quantify the cost where you can (per-request vs per-client, O(n) vs O(1)). + This dimension is about COST, not safety — concurrency hazards and logic + bugs belong to other finders. Default to NOT reporting micro-optimizations + with no measurable payoff. 6-12 findings target.`, + ``` + +- [ ] **Step 2: Add the `security` prompt:** + + ```javascript + security: `You are auditing the httpware repository for SECURITY and + SUPPLY-CHAIN issues only. + + Look for: + - Untrusted-response trust boundaries: status code, headers, and body come + from the server — anywhere httpware trusts them without bound (e.g. + unbounded reads driven by a header, status used to index without guard). + - Decoder deserialization safety: pydantic and msgspec run on + attacker-controlled bytes in decoders/{pydantic,msgspec}.py. Any path that + could be driven to excessive recursion, memory, or arbitrary type + construction? Is body size ever bounded? + - Inherited httpx2 surfaces: redirect-following, URL handling, proxy/SSRF + exposure — does httpware widen or fail to constrain anything httpx2 leaves + to the caller? Report the boundary even if the default is httpx2's. + - Secret leakage: do exception messages, repr, or log/OTel events ever + include auth headers, cookies, or URLs with embedded credentials? Check + errors.py (StatusError holds the full Response) and + _internal/observability.py. + - Supply chain: version floors/ceilings in pyproject.toml for httpx2 and the + optional extras (pydantic/msgspec/otel). 
Unpinned-floor or over-wide + ranges that could pull a vulnerable transitive version. + + Report the trust boundary even when the current default is safe, but mark + severity honestly (a documented httpx2 default is a nit; an unbounded + attacker-driven allocation is high). 6-12 findings target.`, + ``` + +- [ ] **Step 3: Add the `refactoring` prompt:** + + ```javascript + refactoring: `You are auditing the httpware repository for REFACTORING + opportunities and INCONSISTENCIES only — not bugs. + + Look for: + - Sync/async duplication: logic copy-pasted between Client and AsyncClient + (or Retry/AsyncRetry, Bulkhead/AsyncBulkhead) that could share a helper + WITHOUT crossing a protocol seam (Seam A/B/C in CLAUDE.md). Note where a + copy has already drifted. + - Inconsistent patterns: error construction, naming, signatures, or control + flow that differ for no reason across sibling modules. Cross-check the + conventions in CLAUDE.md (StatusError vs other ClientError __init__ rules, + naming, import style). + - Dead or unreachable code; over-complex branching that flattens; module + boundaries that have eroded. + + Every finding states the concrete payoff (what gets simpler / what + divergence it prevents), not aesthetics. A suggestion the conventions are + silent on is a nit or low at most. Never propose crossing a documented + protocol seam. Default severity low/nit unless a duplication has already + caused a real divergence. 5-10 findings target.`, + ``` + +- [ ] **Step 4: Syntax-check.** + + Run: `node --check planning/audits/scripts/workflow-deep.mjs` + Expected: no output, exit 0. + +--- + +### Task 4: Rework Verify (keep refuted candidates) and Synthesize (single report) + +**Files:** +- Modify: `planning/audits/scripts/workflow-deep.mjs` + +Two structural changes: the verify pass must retain refuted candidates (for +the Negative-results section), and synthesis becomes a single-report writer +instead of a per-chunk appender. 
+ +- [ ] **Step 1: Replace the Verify `.then(...)` body** so non-survivors are + kept rather than nulled. Find the block inside the `verified = await + parallel(...)` call that ends with: + + ```javascript + return surviving ? { ...f, final_severity: severity, lensesConfirming } : null + ``` + + Replace the whole `.then(verdicts => { ... })` callback with: + + ```javascript + )).then(verdicts => { + const live = verdicts.filter(Boolean) + const confirms = live.filter(v => v.confirmed).length + const surviving = confirms >= 2 + const lensesConfirming = live.filter(v => v.confirmed).map(v => v.lens) + const adjustments = live.map(v => v.severity_adjustment).filter(Boolean) + const raiseCount = adjustments.filter(a => a === 'raise').length + const lowerCount = adjustments.filter(a => a === 'lower').length + let severity = f.suspected_severity + if (lowerCount >= 1) severity = lowerOne(severity) + if (raiseCount >= 2) severity = raiseOne(severity) + if (verdicts.every(v => v === null)) { + log(`WARNING: all 3 verifiers failed for finding "${f.title}" (${f.file}:${f.line_hint}) — dropped`) + return null + } + const refuteReason = live.find(v => !v.confirmed)?.reason ?? 'no verifier confirmed' + return surviving + ? { ...f, surviving: true, final_severity: severity, lensesConfirming } + : { ...f, surviving: false, refuteReason } + }) + ``` + +- [ ] **Step 2: Split confirmed vs refuted** after the verify block. 
+ Replace: + + ```javascript + const confirmed = verified.filter(Boolean) + log(`${confirmed.length}/${allFindings.length} findings confirmed by ≥2 verifiers`) + ``` + + with: + + ```javascript + const triaged = verified.filter(Boolean) + const confirmed = triaged.filter(v => v.surviving) + const refuted = triaged.filter(v => !v.surviving) + log(`${confirmed.length}/${allFindings.length} confirmed by ≥2 verifiers; ${refuted.length} refuted (kept for Negative results)`) + ``` + +- [ ] **Step 3: Replace `SYNTHESIS_PROMPT`** (the whole `const + SYNTHESIS_PROMPT = (...) => \`...\`` definition) with the single-report + version: + + ```javascript + const SYNTHESIS_PROMPT = (dims, confirmed, refuted, auditFile, discoverFile) => ` + You are writing the FINAL httpware deep-audit report. This is a single + combined run (not chunked) — you write the whole file, including the + top-of-file Summary. + + You have ${confirmed.length} CONFIRMED findings (survived ≥2/3 verifiers) and + ${refuted.length} REFUTED candidates (investigated, did not survive) across + dimensions: ${dims.join(', ')}. + + Tasks: + 1. Triage each confirmed finding into a bucket: blocker / high / medium / + low / nit, applying severity strictly. If more than 4 nits share a + dimension, roll them into one " nits (rolled up)" entry (the + rolled entry counts its constituents in the totals). + 2. Dedupe confirmed findings against each other (file + line ±5 + similar + claim); fold duplicates into one entry. + 3. Write ${auditFile} with this structure: + - "# httpware deep audit — 2026-06-14" + - "**Status:** complete" and a one-line "**Method:**" (ten adversarial + finders → 3-lens verify panel → ≥2/3 to survive → single synthesis). + - "## Summary" — counts per bucket (Blockers/High/Medium/Low/Nits), the + single headline finding in one sentence, and an explicit "Not covered" + line. + - "## Findings" — grouped by bucket. 
Each finding: a "#### " title, a + "*(dimension — verified)*" tag line, file:line in \`code\` format, a + ≤3-sentence claim, a fenced code block with the evidence quote, the + verifier consensus (e.g. "panel 3/3: code_reality, reproducer, + spec_grounded"), and a one-line suggested direction. Directions only — + do NOT write fixes or patches. + - "## Negative results (verified correct)" — a bulleted list built from + the REFUTED candidates and invariants the finders checked and found + held. One line each: what was checked and why it is fine. Summarize; + do not dump raw JSON. + 4. Use the Write tool to create ${auditFile} (overwrite if present). + 5. Stage and commit ONLY the report and the discover map — NO source edits: + git add ${auditFile} ${discoverFile} + git commit -m "audit(deep): 2026-06-14 full-codebase audit — confirmed ()" + Then run \`git status\` to confirm a clean tree. If git reports any + modified src/ or tests/ file, STOP and report it — this pass must not + touch source. + + CONFIRMED findings JSON: + ${JSON.stringify(confirmed, null, 2)} + + REFUTED candidates JSON (for Negative results — summarize, do not dump): + ${JSON.stringify(refuted, null, 2)} + ` + ``` + +- [ ] **Step 4: Replace the Synthesize phase call and return value** at the + bottom of the script body. 
Replace: + + ```javascript + phase('Synthesize') + await agent( + SYNTHESIS_PROMPT(cfg.chunk_id, cfg.dimensions, confirmed, cfg.audit_file), + { model: OPUS, label: `synthesize:chunk-${cfg.chunk_id}` }, + ) + + return { + chunk_id: cfg.chunk_id, + candidates: allFindings.length, + confirmed: confirmed.length, + by_severity: countBySeverity(confirmed), + } + ``` + + with: + + ```javascript + phase('Synthesize') + await agent( + SYNTHESIS_PROMPT(cfg.dimensions, confirmed, refuted, cfg.audit_file, cfg.discover_file), + { model: OPUS, label: 'synthesize:deep' }, + ) + + return { + candidates: allFindings.length, + confirmed: confirmed.length, + refuted: refuted.length, + by_severity: countBySeverity(confirmed), + } + ``` + +- [ ] **Step 5: Make discover run by default** for the combined run. Replace the + guard: + + ```javascript + if (cfg.run_discover) { + ``` + + with: + + ```javascript + if (cfg.run_discover !== false) { + ``` + + (Discover defaults on; pass `run_discover: false` only to reuse an existing + map.) Confirm `countBySeverity` reads `f.final_severity ?? f.suspected_severity` + — it does in the helper; leave it. + +- [ ] **Step 6: Syntax-check.** + + Run: `node --check planning/audits/scripts/workflow-deep.mjs` + Expected: no output, exit 0. + +--- + +### Task 5: Commit the orchestrator + +**Files:** +- Commit: `planning/audits/scripts/workflow-deep.mjs` + +- [ ] **Step 1: Final syntax + grep sanity.** + + Run: + ```bash + node --check planning/audits/scripts/workflow-deep.mjs && \ + grep -c "performance:\|security:\|refactoring:\|architecture_docs:" planning/audits/scripts/workflow-deep.mjs + ``` + Expected: exit 0 and a count of `4` (one per new/repointed finder key). + +- [ ] **Step 2: Confirm the old doc finders are gone.** + + Run: `grep -c "planning_docs:\| docs:" planning/audits/scripts/workflow-deep.mjs || true` + Expected: `0`. 
+ +- [ ] **Step 3: Commit.** + + ```bash + git add planning/audits/scripts/workflow-deep.mjs + git commit -m "audit(tooling): combined-run deep-audit orchestrator + + Forks workflow.mjs into workflow-deep.mjs: refreshed model IDs, paths + repointed to architecture/ + planning/changes, three new finders + (performance, security, refactoring), architecture_docs replaces the + docs/planning_docs finders, and refuted candidates are kept for a + Negative-results section in a single combined-run report. + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +--- + +### Task 6: Run the audit (main session only) + +**Files:** +- Produces: `planning/audits/scripts/_discover-2026-06-14.json` +- Produces: `planning/audits/2026-06-14-deep-audit.md` + +This task invokes the `Workflow` tool. It cannot be delegated to a sandboxed +subagent — run it from the main session. + +- [ ] **Step 1: Invoke the Workflow** with the script path and config: + + ``` + Workflow({ + scriptPath: 'planning/audits/scripts/workflow-deep.mjs', + args: { + dimensions: [ + 'correctness', 'concurrency', 'error_contract', + 'performance', 'security', 'refactoring', + 'tests', 'public_api', 'optional_extras', 'architecture_docs' + ], + run_discover: true, + discover_file: 'planning/audits/scripts/_discover-2026-06-14.json', + audit_file: 'planning/audits/2026-06-14-deep-audit.md' + } + }) + ``` + + Expected: a `runId` returned immediately, then a notification when + all four phases finish. Watch live progress with `/workflows`. + +- [ ] **Step 2: Confirm the run returned a sane shape.** When the + notification arrives, the workflow return value should report + `confirmed > 0` across the ten finders. If `confirmed === 0` AND + `candidates === 0`, the discover map or paths are broken — STOP and inspect + `_discover-2026-06-14.json` before trusting a "clean" result (this is the + failure mode that stalled the 2026-06-07 `tests` dimension). 
+ +--- + +### Task 7: Validate the report, reproduce the headline, finalize + +**Files:** +- Verify: `planning/audits/2026-06-14-deep-audit.md` +- Modify: `planning/README.md` (Index) + +- [ ] **Step 1: Confirm the report exists and has the required structure.** + + Run: + ```bash + test -f planning/audits/2026-06-14-deep-audit.md && \ + grep -c "^## Summary\|^## Findings\|^## Negative results" planning/audits/2026-06-14-deep-audit.md + ``` + Expected: file exists and count is `3` (all three top-level sections + present). + +- [ ] **Step 2: Confirm the synthesis agent did not touch source.** + + Run: `git status --porcelain src tests` + Expected: empty output. If anything appears, revert it — this audit is + report-only. + +- [ ] **Step 3: Reproduce the single highest-severity finding.** Read the + top finding in the report. Write the 3–5 line reproducer it cites (in a + scratch test or a `python -c`/`uv run pytest -k` invocation against + `httpx2.MockTransport` per `architecture/testing.md`) and confirm it + actually demonstrates the claimed behavior. If it does not reproduce, note + it in the report as "could not reproduce — downgrade" rather than trusting + the panel. + + (If the report has zero High/Blocker findings, skip repro and note in the + report summary that the headline is Medium-or-below.) + +- [ ] **Step 4: Add the Index entry.** In `planning/README.md`, under + `### Active`, replace `_None._` with: + + ```markdown + - **[deep-audit](changes/active/2026-06-14.01-deep-audit/design.md)** (2026-06-14) — Full-codebase deep audit covering the perf/security/supply-chain gaps the 2026-06-07 audit skipped, plus correctness, concurrency, refactoring, and test quality. Report: [audits/2026-06-14-deep-audit.md](audits/2026-06-14-deep-audit.md). Report-only; confirmed findings spawn follow-up bundles. 
+ ``` + +- [ ] **Step 5: Commit the Index update.** + + ```bash + git add planning/README.md + git commit -m "docs(planning): index the 2026-06-14 deep audit + + Co-Authored-By: Claude Opus 4.8 (1M context) " + ``` + +- [ ] **Step 6: Report the outcome** to the user: bucket counts, the + headline finding, whether it reproduced, and the recommended next step + (triage confirmed findings into follow-up `planning/changes/active/` + bundles in a separate session). Do not open fix PRs here. + +--- + +## Notes for the executor + +- **Report-only.** No source edits in any task. The synthesis agent is + explicitly instructed to commit only the report + discover map and to stop + if `git status` shows a dirty `src/` or `tests/`. +- **Token budget.** Roughly ten finders + (~80 candidates × 3 verifiers) + + synthesis ≈ 250 mostly-Sonnet agents; Opus only for discover and synthesis. + The 15/dimension cap and the ≥2/3 survive gate bound the verify fan-out. +- **Resumability.** If the Workflow dies mid-run, relaunch with + `Workflow({ scriptPath, resumeFromRunId })` — unchanged `agent()` calls + return cached results; only the failed/new calls re-run. diff --git a/planning/changes/active/2026-06-14.02-pydantic-import-isolation/change.md b/planning/changes/active/2026-06-14.02-pydantic-import-isolation/change.md new file mode 100644 index 0000000..f626b48 --- /dev/null +++ b/planning/changes/active/2026-06-14.02-pydantic-import-isolation/change.md @@ -0,0 +1,58 @@ +--- +status: draft +date: 2026-06-14 +slug: pydantic-import-isolation +supersedes: null +superseded_by: null +pr: null +outcome: null +--- + +# Change: Guard the pydantic import so the decoder module loads without the extra + +**Lane:** lightweight — ≲30 LOC net, 2 files, no new file, no public-API +change, a single straightforward test. 
+
+## Goal
+
+Fix the 2026-06-14 deep-audit **High** finding (and the two folded **Medium**
+findings sharing its root cause): `decoders/pydantic.py:13` imports
+`from pydantic import TypeAdapter` unconditionally at module top, so
+`import httpware.decoders.pydantic` raises a bare `ModuleNotFoundError` when
+the extra is absent — *before* the friendly `ImportError` guard in
+`PydanticDecoder.__init__` can run. This also makes the Seam-C isolation
+invariant documented in [`architecture/extras.md`](../../../../architecture/extras.md)
+false for pydantic (only msgspec matched it).
+
+## Approach
+
+Mirror the `decoders/msgspec.py` pattern exactly: import `import_checker`
+first, then guard the hard import behind `is_pydantic_installed`, and quote
+the one class annotation that references `TypeAdapter` as a forward-ref so the
+class body does not evaluate the name at definition time when the extra is
+absent. Runtime uses of `TypeAdapter` (inside methods) are only reachable when
+pydantic is installed, so they need no change.
+
+After this fix the documented invariant becomes true with **no doc edit
+needed**: `grep -rnE 'from pydantic|import pydantic' src/httpware/ | grep -v
+import_checker` returns exactly one indented line — the guarded import — which
+is precisely what `architecture/extras.md` already claims. The High finding is
+resolved by making the code match the (correct) doc.
+
+## Files
+
+- `src/httpware/decoders/pydantic.py` — reorder imports; guard
+  `from pydantic import TypeAdapter` behind `if import_checker.is_pydantic_installed:`;
+  quote the `_adapters` class annotation's `TypeAdapter` reference.
+- `tests/test_optional_extras_pydantic_missing.py` — add a fresh-subprocess
+  test proving the module imports cleanly when pydantic is genuinely absent
+  and that `PydanticDecoder()` then raises the friendly extra-missing error.
+
+## Verification
+
+- [ ] Failing test first — `just test tests/test_optional_extras_pydantic_missing.py -k module_imports_when_pydantic_absent` fails (module load raises under simulated absence).
+- [ ] Apply the change.
+- [ ] Test passes — same command.
+- [ ] `grep -rnE 'from pydantic|import pydantic' src/httpware/ | grep -v import_checker` returns exactly one indented line.
+- [ ] `just test` — full suite green.
+- [ ] `just lint` — clean.
diff --git a/src/httpware/decoders/pydantic.py b/src/httpware/decoders/pydantic.py
index 3136038..1dc0c99 100644
--- a/src/httpware/decoders/pydantic.py
+++ b/src/httpware/decoders/pydantic.py
@@ -10,11 +10,13 @@ class entirely when `is_pydantic_installed` is False, so `AsyncClient()` does
 import typing
 from typing import TypeVar
 
-from pydantic import TypeAdapter
-
 from httpware._internal import import_checker
 
 
+if import_checker.is_pydantic_installed:
+    from pydantic import TypeAdapter
+
+
 MISSING_DEPENDENCY_MESSAGE = (
     "PydanticDecoder requires the 'pydantic' extra. Install with: pip install httpware[pydantic]"
 )
@@ -23,9 +25,15 @@ class entirely when `is_pydantic_installed` is False, so `AsyncClient()` does
 
 
 class PydanticDecoder:
-    """Decode raw response bytes into `model` via a per-instance cached `pydantic.TypeAdapter`."""
+    """Decode raw response bytes into `model` via a per-instance cached `pydantic.TypeAdapter`.
+
+    Requires the `pydantic` extra: `pip install httpware[pydantic]`. Importing
+    this module without the extra works (the `pydantic` import is guarded by an
+    `is_pydantic_installed` check), but instantiating the decoder raises
+    `ImportError`.
+    """
 
-    _adapters: dict[type, TypeAdapter[typing.Any]]
+    _adapters: dict[type, "TypeAdapter[typing.Any]"]
     _can_decode_results: dict[type, bool]
 
     def __init__(self) -> None:
diff --git a/tests/test_optional_extras_pydantic_missing.py b/tests/test_optional_extras_pydantic_missing.py
index 8383611..32f567a 100644
--- a/tests/test_optional_extras_pydantic_missing.py
+++ b/tests/test_optional_extras_pydantic_missing.py
@@ -6,6 +6,8 @@ duration of the test.
 """
 
+import subprocess
+import sys
 from unittest.mock import patch
 
 import pytest
 
@@ -24,6 +26,38 @@ def decode(self, content: bytes, model: type) -> object:  # noqa: ARG002 — nam
         return model()  # pragma: no cover
 
 
+def test_pydantic_decoder_module_imports_when_pydantic_absent() -> None:
+    """The decoder module must import cleanly when pydantic is genuinely absent.
+
+    pydantic IS installed in CI (via `--all-extras`), so true absence is
+    simulated in a fresh subprocess: setting `sys.modules['pydantic'] = None`
+    makes `importlib.util.find_spec('pydantic')` return None (so
+    `is_pydantic_installed` is False) and any `import pydantic` raise
+    ImportError. With the module-level import guarded, importing the decoder
+    module must NOT raise, and `PydanticDecoder()` must raise the friendly
+    extra-missing ImportError — not a bare ModuleNotFoundError at module load.
+    """
+    script = (
+        "import sys\n"
+        "sys.modules['pydantic'] = None\n"
+        "from httpware._internal import import_checker\n"
+        "assert import_checker.is_pydantic_installed is False\n"
+        "import httpware.decoders.pydantic as pyd\n"
+        "try:\n"
+        "    pyd.PydanticDecoder()\n"
+        "except ImportError as exc:\n"
+        "    sys.exit(0 if 'httpware[pydantic]' in str(exc) else 2)\n"
+        "sys.exit(3)\n"
+    )
+    result = subprocess.run(  # noqa: S603 — `script` is a test-authored constant, not untrusted input
+        [sys.executable, "-c", script], check=False, capture_output=True
+    )
+    assert result.returncode == 0, (
+        f"decoder module failed to import or guard without pydantic; rc={result.returncode} "
+        f"stdout={result.stdout!r} stderr={result.stderr!r}"
+    )
+
+
 def test_pydantic_decoder_init_raises_when_pydantic_missing() -> None:
     with (
         patch("httpware._internal.import_checker.is_pydantic_installed", False),