refactor: move dry_run_role off public LLMClientConfig — closes #63 by dantp-ai · Pull Request #65 · aganthos/clawloop

dantp-ai · 2026-06-10T23:35:28Z

Summary

Closes #63. Replaces the public dry_run_role: str | None = None field on LLMClientConfig (introduced in PR #62) with a private subclass _DryRunLLMClientConfig defined in clawloop/cli.py.

Why a subclass instead of the field

Public schema stays clean. LLMClientConfig.model_dump() and LLMClientConfig.model_json_schema() no longer carry any dry-run testing vocabulary. A new test test_public_llm_client_config_schema_excludes_dry_run_vocab locks this in.
model field still untouched. The original PR fix: address PR #60 review comments on clawloop run --dry-run + taubench #62 win (avoiding the marker-in-model pattern from PR chore: reintroduce clawloop run as a TrainConfig wrapper #60) is preserved — downstream code like _build_entropic that reads tc.model verbatim is unaffected.
Survives model_copy(). Pydantic v2 preserves the runtime class on copy, so the role tag travels with the data exactly like a field would. The regression test asserts this end-to-end (subclass identity, model preservation, dry_run_role survival, and correct mock dispatch via isinstance).

Note on the issue's original proposal

Issue #63 proposed a WeakKeyDictionary[LLMClientConfig, str] side-channel. Empirically (verified before implementing), Pydantic v2 BaseModel instances are not hashable by default — hash(cfg) raises TypeError: unhashable type. Making them hashable would require model_config = {\"frozen\": True} on LLMClientConfig, a more invasive public-API change that breaks any code that mutates a client config after construction.

The subclass approach satisfies the same spirit ("keep dry-run vocabulary out of the public schema") via a standard Pydantic mechanism without that trade-off.

Note on PR layering

This PR builds on the work in #62 (still open at time of writing). Until #62 merges, the diff here against main will include both PRs' commits. Once #62 merges, this becomes a clean ~80-line delta.

Test plan

pytest tests/test_cli.py tests/test_train_config.py — 57 passes (was 56; new test for schema-cleanliness)
pre-commit run --all-files — clean
clawloop run examples/configs/math_harness.json --dry-run — completes (real MathAdapter + mock LLMs via subclass detection)
clawloop run examples/configs/taubench_harness.json --dry-run with all API key env vars unset — completes (stub adapter)
LLMClientConfig.model_json_schema() properties no longer include dry_run_role
LLMClientConfig(model=\"x\").model_dump() no longer includes dry_run_role key

🤖 Generated with Claude Code

gemini-code-assist

Code Review

This pull request enhances the --dry-run mode to guarantee no real LLM or network calls are made across all environment types, including those that bypass _make_llm_client (such as taubench, entropic, and openclaw), by introducing a private _DryRunLLMClientConfig subclass and a _StubAdapter. It also adds up-front configuration validation for taubench and comprehensive tests. The feedback suggests refining the exception handling in _StubAdapter.run_episode to catch specific exceptions like AttributeError and TypeError instead of using a broad except Exception: block.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

…review `except Exception:` swallowed any error from `agent_state.state_id().combined_hash`. For the dry-run stub the only legitimate failure modes are AttributeError (missing `state_id` / `combined_hash`) and TypeError (non-callable). Narrow the catch so real bugs propagate instead of being silently masked. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…hos#63 Replaces the public ``dry_run_role: str | None = None`` field on ``LLMClientConfig`` (introduced in PR aganthos#62) with a private subclass ``_DryRunLLMClientConfig`` defined in ``clawloop/cli.py``. Why a subclass instead of the field: - The public ``LLMClientConfig`` schema no longer carries any dry-run testing vocabulary. ``model_dump()`` / ``model_json_schema()`` of a user's TrainConfig do not emit ``dry_run_role: null`` anymore. New test ``test_public_llm_client_config_schema_excludes_dry_run_vocab`` locks this in. - ``model`` is still left untouched (the gain from PR aganthos#62 stands). - Pydantic ``model_copy()`` preserves the runtime class, so the role tag survives copies just like a field would. The regression test is updated to assert this end-to-end. Implementation note: the issue originally proposed a ``WeakKeyDictionary`` side-channel. Empirically, Pydantic v2 BaseModel instances are not hashable by default, so that approach is infeasible without making ``LLMClientConfig`` ``frozen=True`` (which would be a more invasive public-API change). The subclass approach satisfies the same spirit ("keep dry-run vocabulary out of the public schema") via a standard Pydantic mechanism. Verification: pre-commit clean; 57 tests pass; smoke runs for math and taubench dry-run still complete cleanly with no API keys set. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…review `except Exception:` swallowed any error from `agent_state.state_id().combined_hash`. For the dry-run stub the only legitimate failure modes are AttributeError (missing `state_id` / `combined_hash`) and TypeError (non-callable). Narrow the catch so real bugs propagate instead of being silently masked. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ganthos#64 `train()` now accepts keyword-only `make_llm_client` and `env_builders` parameters; the CLI dry-run path builds them as closures over the user's config and passes them in directly. This removes the module-level mutation of `clawloop.train._make_llm_client` and `ENV_BUILDERS` that PR aganthos#62 introduced, and the `_DryRunLLMClientConfig` subclass that PR aganthos#65 added to identify roles for the monkey-patched factory. Changes ------- - `clawloop/train.py`: - `train()` signature gains `*, make_llm_client=None, env_builders=None`; defaults preserve current behavior. - All six `_build_*` helpers gain `make_llm_client` as a third positional parameter (uniform signature). Only `_build_math` actually uses it today; others accept it for the future refactor tracked by the out-of-scope notes in the plan. - New module-level alias `LLMClientFactory = Callable[[LLMClientConfig], Any]` for readable type hints. - `clawloop/cli.py`: - Delete `_DryRunLLMClientConfig` and the `LLMClientConfig` import that only existed to subclass it. - Replace `_install_dry_run_clients` (which mutated module state) with `_build_dry_run_overrides(config)` (pure: returns `(factory, builders)`). - `cmd_run` passes the overrides into `train(...)` directly. - `_StubAdapter` and `ENVS_USING_MAKE_LLM_CLIENT` are unchanged. - Tests: - `tests/test_train_config.py::TestTrainEndToEnd::test_harness_learning_math`: drop `with patch("clawloop.train._make_llm_client")` and pass `make_llm_client=_pick_client` to `train()` directly. - `tests/test_cli.py`: - Remove the subclass round-trip test (subclass is gone). - Replace with `test_dry_run_overrides_map_roles_correctly` and `test_dry_run_overrides_stub_bypassing_env` exercising `_build_dry_run_overrides` directly. - Add `test_dry_run_does_not_mutate_module_state` — invokes `cmd_run` with `dry_run=True` and asserts `clawloop.train._make_llm_client` and `ENV_BUILDERS` are unchanged after the call. Locks in the no-global-mutation contract. - Keep `test_public_llm_client_config_schema_excludes_dry_run_vocab`. - `tests/unit/test_train.py`: builder invocations now pass `None` as the third positional arg (openspiel doesn't use it). Verification ------------ - pre-commit clean (ruff + ruff-format) - pytest tests/test_cli.py tests/test_train_config.py tests/unit/test_train.py → 68 pass - Full suite: 1046 passed, 49 skipped, 0 failed - `clawloop run examples/configs/math_harness.json --dry-run` exits 0 - `clawloop run examples/configs/taubench_harness.json --dry-run` with all API key env vars unset exits 0 - `clawloop eval` still exits 2 with the redirect Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…) — closes #64 (#66) * refactor: replace monkey-patching with dependency injection — closes #64 `train()` now accepts keyword-only `make_llm_client` and `env_builders` parameters; the CLI dry-run path builds them as closures over the user's config and passes them in directly. This removes the module-level mutation of `clawloop.train._make_llm_client` and `ENV_BUILDERS` that PR #62 introduced, and the `_DryRunLLMClientConfig` subclass that PR #65 added to identify roles for the monkey-patched factory. Changes ------- - `clawloop/train.py`: - `train()` signature gains `*, make_llm_client=None, env_builders=None`; defaults preserve current behavior. - All six `_build_*` helpers gain `make_llm_client` as a third positional parameter (uniform signature). Only `_build_math` actually uses it today; others accept it for the future refactor tracked by the out-of-scope notes in the plan. - New module-level alias `LLMClientFactory = Callable[[LLMClientConfig], Any]` for readable type hints. - `clawloop/cli.py`: - Delete `_DryRunLLMClientConfig` and the `LLMClientConfig` import that only existed to subclass it. - Replace `_install_dry_run_clients` (which mutated module state) with `_build_dry_run_overrides(config)` (pure: returns `(factory, builders)`). - `cmd_run` passes the overrides into `train(...)` directly. - `_StubAdapter` and `ENVS_USING_MAKE_LLM_CLIENT` are unchanged. - Tests: - `tests/test_train_config.py::TestTrainEndToEnd::test_harness_learning_math`: drop `with patch("clawloop.train._make_llm_client")` and pass `make_llm_client=_pick_client` to `train()` directly. - `tests/test_cli.py`: - Remove the subclass round-trip test (subclass is gone). - Replace with `test_dry_run_overrides_map_roles_correctly` and `test_dry_run_overrides_stub_bypassing_env` exercising `_build_dry_run_overrides` directly. - Add `test_dry_run_does_not_mutate_module_state` — invokes `cmd_run` with `dry_run=True` and asserts `clawloop.train._make_llm_client` and `ENV_BUILDERS` are unchanged after the call. Locks in the no-global-mutation contract. - Keep `test_public_llm_client_config_schema_excludes_dry_run_vocab`. - `tests/unit/test_train.py`: builder invocations now pass `None` as the third positional arg (openspiel doesn't use it). Verification ------------ - pre-commit clean (ruff + ruff-format) - pytest tests/test_cli.py tests/test_train_config.py tests/unit/test_train.py → 68 pass - Full suite: 1046 passed, 49 skipped, 0 failed - `clawloop run examples/configs/math_harness.json --dry-run` exits 0 - `clawloop run examples/configs/taubench_harness.json --dry-run` with all API key env vars unset exits 0 - `clawloop eval` still exits 2 with the redirect Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix: replace None with lambda dummies for builder make_llm_client arg Per gemini-code-assist review on PR #66: passing `None` for the `make_llm_client` parameter violates the declared type signature `LLMClientFactory = Callable[[LLMClientConfig], Any]`. The builders we invoke (openspiel, taubench stub) never call the factory, but the type contract still applies. Replace each `None` with `lambda _: None` — a callable that ignores its arg and returns None. Satisfies the type signature; behavior is unchanged because the callees don't invoke it. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

dantp-ai self-assigned this Jun 10, 2026

gemini-code-assist Bot reviewed Jun 10, 2026

View reviewed changes

Comment thread clawloop/cli.py Outdated

dantp-ai mentioned this pull request Jun 11, 2026

refactor: replace monkey-patching with dependency injection in train() — closes #64 #66

Merged

7 tasks

dantp-ai and others added 2 commits June 12, 2026 02:35

dantp-ai force-pushed the feat/issue-63-dry-run-role-subclass branch from 50baf0c to f5e7544 Compare June 12, 2026 00:39

dantp-ai merged commit c5faf2f into aganthos:main Jun 12, 2026
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: move dry_run_role off public LLMClientConfig — closes #63#65

refactor: move dry_run_role off public LLMClientConfig — closes #63#65
dantp-ai merged 2 commits into
aganthos:mainfrom
dantp-ai:feat/issue-63-dry-run-role-subclass

dantp-ai commented Jun 10, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dantp-ai commented Jun 10, 2026

Summary

Why a subclass instead of the field

Note on the issue's original proposal

Note on PR layering

Test plan

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant