Type-check the mpnn feature-collator and train modules #342

Open
lyskov-ai wants to merge 8 commits into RosettaCommons:production from lyskov-ai:0067-mpnn-mypy-collator-train

Conversation

@lyskov-ai

Contributor

Continues bringing models/mpnn under the mypy gate. Clears collate.feature_collator and train, leaving 2 mpnn modules (utils.inference, utils.weights).

  • feature_collator.py: fix two implicit-Optional default_padding parameters (the body already maps None to the MPNN defaults) and annotate a list local. Annotation-only; covered by the existing test_feature_collator.py.
  • train.py: replace the ad-hoc local CkptConfig class with foundry's CheckpointConfig — the type FabricTrainer.fit actually expects. CkptConfig was a field-subset of it and, notably, lacked parameter_freezing_config, which fit() reads — so the checkpoint-resume branch would have raised AttributeError. The swap fixes that latent footgun. The remaining train.py errors are third-party stub gaps documented with type: ignore[arg-type]: atomworks StructuralDatasetWrapper types dataset_parser as Callable but takes a GenericDFParser parser object; torch WeightedRandomSampler accepts a Tensor despite its Sequence[float] stub.
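For readers unfamiliar with the implicit-Optional pattern in the feature_collator bullet, here is a minimal sketch of that kind of annotation-only change. The function name `pad_feature`, the constant `DEFAULT_PAD`, and the padding logic are hypothetical and are not taken from feature_collator.py; only the shape of the fix (an explicit `| None` on a `None`-defaulted parameter, plus an annotated list local) mirrors what the bullet describes.

```python
from __future__ import annotations

# Hypothetical default; the real MPNN padding defaults are not shown in this PR.
DEFAULT_PAD = 0.0


def pad_feature(
    rows: list[list[float]],
    default_padding: float | None = None,  # before: `default_padding: float = None`
) -> list[list[float]]:
    """Right-pad every row to the length of the longest row."""
    # The body already maps None to the default, so the explicit Optional
    # annotation only states what the function did all along.
    pad_value = DEFAULT_PAD if default_padding is None else default_padding
    width = max((len(row) for row in rows), default=0)

    # Annotating the empty list gives mypy the element type up front.
    padded: list[list[float]] = []
    for row in rows:
        padded.append(list(row) + [pad_value] * (width - len(row)))
    return padded
```

Runtime behaviour is the same before and after such a change, which is why the existing tests are enough to cover it.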

Note: train.py is a cluster-only training script (hardcoded paths, real data) not run in CI, so the CheckpointConfig behaviour improvement is unverified here — flagged for mpnn owners.

Verification: ruff format/check clean; mypy Success (176 files); pytest 526 passed / 10 skipped.

lyskov and others added 8 commits June 30, 2026 19:29
Clear rfd3.trainer.fabric_trainer (99 errors) from the mypy ignore-errors
ratchet. The errors shared one dominant pattern: the dynamically-keyed `state`
bag was inferred as `dict[Any, Any] | int | None`, so every state access and
counter update errored.

Mirror the already-landed foundry trainers/fabric precedent: a class-level
`state: dict[str, Any]` + `_current_train_return: Any`, an annotated
`default_state`, a documented type-ignore on the wider str|int precision API,
a cast on setup_dataloaders + widening the loop params to DataLoader,
`get_latest_checkpoint -> Path | None` with a cast at the call site, and the
truthful `load_legacy_checkpoint -> None`.
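A rough illustration of the class-level `state` declaration described above. `TrainerSketch`, its keys, and `step()` are hypothetical stand-ins, not the fabric_trainer code; the sketch only shows how declaring the bag as `dict[str, Any]` lets counter updates type-check without a per-access ignore.

```python
from __future__ import annotations

from typing import Any


class TrainerSketch:
    # Class-level declarations: mypy uses these instead of inferring a
    # narrower type from whatever literal is assigned in __init__.
    state: dict[str, Any]
    _current_train_return: Any

    def __init__(self) -> None:
        # Annotated default_state, as in the commit message; keys are made up.
        default_state: dict[str, Any] = {
            "model": None,
            "optimizer": None,
            "global_step": 0,
            "current_epoch": 0,
        }
        self.state = default_state
        self._current_train_return = None

    def step(self) -> int:
        # Values are Any, so the in-place counter update is accepted.
        self.state["global_step"] += 1
        return self.state["global_step"]
```

The trade-off is that values read from `state` are unchecked (`Any`), which suits a dynamically-keyed bag but gives up per-key typing.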

Behaviour-preserving, mypy-only (no clean CPU-test target for this
cluster-coupled trainer glue). Ratchet 4 -> 3 modules remaining.

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
Clear the 15 mypy errors in rfd3.inference.legacy_input_parsing and remove
it from the ignore_errors ratchet (3 -> 2 remaining: engine, utils.vizualize).

- 11 implicit-Optional parameter fixes in
  create_atom_array_from_design_specification_legacy (incl. unfix_specific ->
  str | list | None, honest to the body's isinstance branches).
- The Optional fixes cascade union-attr/arg-type errors at exists()-guarded
  sites; since foundry.common.exists is not a TypeGuard it does not narrow, so
  the narrowing-required guards are converted to behaviour-identical
  'x is not None' checks (exists(x) is by definition 'x is not None').
- unfix_residues: list[str] annotation; rewrite the 'unfix_residues, _ = ...'
  unpack to '...[0]' and the side-effecting list-comp to a for loop, dropping
  the '_' that collided with the **_ kwargs parameter.
- Documented cast(str, contig) (contig is always provided or defaulted to the
  non-optional length, which mypy cannot follow through exists()).
- Fix a real reachable NameError: import the missing
  spoof_helical_bundle_ss_conditioning_fn from rfd3.utils.inference (its branch
  is config-gated and would crash today).
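The narrowing point in the list above can be sketched as follows. The `exists` defined here is a local stand-in that assumes foundry.common.exists is equivalent to `x is not None`, as the commit message states; the `contig_length*` functions are hypothetical and exist only to contrast the two guard styles.

```python
from __future__ import annotations

from typing import Any


def exists(x: Any) -> bool:
    """Local stand-in for foundry.common.exists (assumed: `x is not None`)."""
    return x is not None


def contig_length_unnarrowed(contig: str | None = None) -> int:
    if exists(contig):
        # A plain bool-returning helper is not a TypeGuard, so mypy still
        # sees `str | None` here and flags the attribute access.
        return len(contig.split(","))  # type: ignore[union-attr]
    return 0


def contig_length(contig: str | None = None) -> int:
    if contig is not None:
        # Same runtime check, but mypy narrows `contig` to `str`.
        return len(contig.split(","))
    return 0
```

Both functions behave identically at runtime; only the second type-checks without an ignore, which is why the guards can be swapped without changing behaviour.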

mypy-only slice; no behaviour change on production paths. All gates green
(ruff format/check, mypy 152 files, pytest 486 passed / 10 skipped).

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
Fix the pre-existing bugs that held the last two models/rfd3 modules on the
ignore_errors ratchet, then remove the rfd3 override block entirely. All of
models/rfd3 now type-checks with no module-level exemptions.

engine.py:
- run(): documented `# type: ignore[override]` + honest return type
  `dict[str, list[RFD3Output]] | None` for the keyword-only override of the
  positional base BaseInferenceEngine.run().
- _multiply_specifications: `exists(self.out_dir)` -> `self.out_dir is not None`
  so mypy narrows it for find_files_with_extension().
- normalize_inputs: early returns + a documented cast for the str-split branch
  (list is invariant).
- process_input: delete the dead/redundant block that re-did normalize_inputs'
  work and called .split(',') on a value that is always a list; narrow the two
  loop guards to `input is not None`.
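As a hedged sketch of the "early returns + documented cast" shape mentioned for normalize_inputs: the signature and behaviour below are assumptions made for illustration, not the engine.py implementation. The point is only that `str.split` returns `list[str]`, and because `list` is invariant mypy will not accept it where a wider element type is declared.

```python
from __future__ import annotations

from pathlib import Path
from typing import cast


def normalize_inputs(
    inputs: str | Path | list[str | Path] | None,
) -> list[str | Path]:
    """Hypothetical helper: coerce the accepted input forms into one list."""
    if inputs is None:
        return []
    if isinstance(inputs, Path):
        return [inputs]
    if isinstance(inputs, str):
        # split() yields list[str]; list is invariant, so it is not a
        # list[str | Path] to mypy even though every element qualifies.
        return cast("list[str | Path]", inputs.split(","))
    return inputs
```

With every branch returning a list, a caller has no reason to call `.split(',')` on the result again.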

io.py: widen find_files_with_extension's supported_file_types to `set | list`
(matches its siblings; the caller passes the CIF_LIKE_EXTENSIONS set).

vizualize.py (dev script, imported nowhere): drop the broken import+call of
_add_design_annotations_from_cif_block_metadata (removed in the open-sourcing
refactor, exists nowhere -> ImportError on the .cif path); keep plain-structure
.cif/.bcif loading; document the atomworks C_DIS .as_array() ignore.

Add unit tests for the pure normalize_inputs helper.

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
Add models/mpnn/src/mpnn to mypy's files and seed a fresh per-module
ignore_errors ratchet for the 10 modules that currently have type errors,
so the gate stays green while they are cleared one slice per task. The
other 14 mpnn modules type-check clean under the lenient baseline and are
now gated. Config-only; no pytest wiring (mpnn's existing test suite is
partly cluster-coupled / library-drift-broken on CPU).

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
Add models/mpnn/tests to testpaths with a collect_ignore denylist in its
conftest. Most of the existing mpnn suite loads structures via atomworks
cached_parse (needs a DIGS PDB mirror absent in CI) and is held out, run
locally on the cluster; the CPU-portable files are collected. Fix two
stale numpy-bool identity assertions (np.True_ is True) in
test_polymer_ligand_interface.py so it is green and collectable. Adds 28
mpnn CPU tests to the gate (520 passed, was 492).

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
Type-check mpnn.samplers.samplers, mpnn.trainers.mpnn, and
mpnn.transforms.feature_aggregation.user_settings, and drop them from the
ignore_errors ratchet. Annotation/comment-only: narrow the inherited
BatchSampler.sampler to Sampler for set_sampler_epoch + a var annotation;
a documented type: ignore[arg-type] for the DictConfig **-unpack (per the
RF3/RFD3 Loss(**loss) precedent) and type: ignore[override] for the
refined validation_step 3rd param (per rf3); and a type: ignore[misc] for
an atomworks AnnotationList2D row unpack its stub doesn't express.

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
Type-check mpnn.transforms.feature_aggregation.token_encodings,
mpnn.pipelines.mpnn, and mpnn.inference_engines.mpnn, and drop them from
the ignore_errors ratchet. token_encodings wraps its TokenEncoding
token_atoms values in np.asarray (matching the field's ndarray type, which
__post_init__ already produces); pipelines.mpnn fixes an implicit-Optional
model_type with a narrowing guard + list() for a tuple arg; the inference
engine narrows _absolute_path_or_none's Optional result and the
atom_arrays/input_dicts guard with documented asserts. Adds
test_pipeline_validation.py (6 CPU tests) for the model_type validation.

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
feature_collator: two implicit-Optional default_padding params + a list
var annotation (behaviour-preserving). train: replace the ad-hoc local
CkptConfig with foundry's CheckpointConfig (the type FabricTrainer.fit
actually expects) — CkptConfig lacked parameter_freezing_config, which
fit() reads, so the checkpoint-resume branch would have raised
AttributeError; the swap fixes that latent footgun. Remaining train
errors are third-party stub gaps (atomworks StructuralDatasetWrapper types
dataset_parser as Callable but takes a parser object; torch
WeightedRandomSampler accepts a Tensor despite its Sequence[float] stub) —
documented type: ignore[arg-type].

train.py is a cluster-only training script (unverified in CI); the
CheckpointConfig behaviour improvement is flagged for mpnn owners.
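To make the latent failure concrete, here is a toy reproduction under stated assumptions. Both dataclasses and `fit` are hypothetical stand-ins; the fields `path` and `reset_optimizer` are invented, and only `parameter_freezing_config` is a name taken from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path
from typing import Any


@dataclass
class CheckpointConfig:
    """Stand-in for foundry's CheckpointConfig (fields mostly invented)."""

    path: Path
    reset_optimizer: bool = False
    parameter_freezing_config: dict[str, Any] | None = None


@dataclass
class CkptConfig:
    """Stand-in for the removed local class: a field subset of the above."""

    path: Path
    reset_optimizer: bool = False


def fit(ckpt_config: CheckpointConfig | None = None) -> str:
    """Toy resume branch that reads the field the local class lacked."""
    if ckpt_config is None:
        return "fresh start"
    if ckpt_config.parameter_freezing_config is not None:
        return "resumed with frozen parameters"
    return "resumed"
```

Passing `CkptConfig(Path("last.ckpt"))` to this `fit` raises AttributeError on the resume branch, while a `CheckpointConfig` with the same path returns "resumed". mypy reports the mismatch at the call site as an arg-type error, before any cluster run.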

Co-authored-by: lyskov-ai <277346777+lyskov-ai@users.noreply.github.com>
lyskov-ai requested a review from woodsh17 July 1, 2026 00:12

2 participants