Add --storage_mode preset to optimize AF2/AF3 output footprint #627
Merged
Both backends previously wrote substantial redundant data with no way to
prune it without breaking the vanilla AlphaFold layout:
  AF2: predicted_aligned_error is stored both inside result_*.pkl (~21 MB
       float32 per model) and as the standalone pae_*.json sidecar.
  AF3: the top-level *_confidences.json is a byte-identical copy of the best
       sample's confidences.json, *_data.json duplicates the saved features
       input, and every per-sample confidences.json is large and compressible.
Add a single --storage_mode preset (default 'vanilla', so existing output is
byte-identical to native AlphaFold2/3 and remains a drop-in for downstream
tools):
  vanilla - no change (default).
  slim    - AF2: strip predicted_aligned_error from pickles (kept in
            pae_*.json) and xz-compress them; AF3: drop the top-level
            confidences/data duplicates and xz-compress non-best per-sample
            confidences.json. The best sample's confidences.json is left
            plain so AlphaJudge (no xz support) reads best-model PAE directly.
  minimal - slim plus: AF2 drops all result pickles; AF3 deletes non-best
            per-sample confidences.json. All structures and summary scores
            are retained.
pkl contents are unused by AlphaJudge and by convert_to_modelcif (its
pickle.load is commented out; scores come from the JSON sidecars), so slim/
minimal lose nothing those consumers need. Verified on a real plasmodium_hap2
prediction: slim shrinks AF2 254M->148M and AF3 628M->107M, and AlphaJudge
produces byte-identical scores (best and all modes, both backends).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
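
For readers who want to see what the AF2 side of 'slim' amounts to, here is a minimal sketch, not the PR's actual code. The helper name `slim_af2_pickle`, its signature, and the `.pkl.xz` output suffix are assumptions made for illustration; only the key name predicted_aligned_error and the idea of stripping it before xz compression come from the commit message above.

```python
import lzma
import pickle
from pathlib import Path


def slim_af2_pickle(pkl_path: Path, drop_key: str = "predicted_aligned_error") -> Path:
    """Rewrite result_*.pkl as an xz-compressed pickle without the PAE matrix.

    Hypothetical helper: name, signature and the '.xz' suffix are assumptions.
    """
    # Only unpickle files you produced yourself; real AF2 pickles also need
    # numpy (and possibly jax) importable to load.
    with pkl_path.open("rb") as fh:
        result = pickle.load(fh)

    # The PAE matrix is also stored in the pae_*.json sidecar, so the copy
    # inside the pickle is redundant.
    result.pop(drop_key, None)

    xz_path = pkl_path.with_name(pkl_path.name + ".xz")
    with lzma.open(xz_path, "wb") as fh:
        pickle.dump(result, fh, protocol=pickle.HIGHEST_PROTOCOL)

    # Remove the uncompressed original only after the compressed copy is written.
    pkl_path.unlink()
    return xz_path
```

Reading the result back is symmetric: open the file with `lzma.open(path, "rb")` and pass the handle to `pickle.load`.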
Deleting a non-best AF3 sample's confidences.json is not just a disk-space saving; it loses data. That file is the sole source of the sample's full token x token PAE matrix, and once it is gone AlphaJudge silently falls back to summary_confidences.json (coarse per-chain-pair PAE) for that sample, degrading its PAE-derived scores without any error.

So AF3 'minimal' now behaves like 'slim' (non-best confidences are xz-compressed, not deleted); 'minimal' still differs from 'slim' on AF2, where it drops the genuinely unused result pickles. Paired with the AlphaJudge change to read xz/gz confidences, slim/minimal are now lossless for both --models_to_analyse best and all.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
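
The consumer side of this change (reading plain, xz, or gz confidences interchangeably) could look roughly like the sketch below. This is an illustration under stated assumptions, not AlphaJudge's actual code: the function name `load_confidences`, the lookup order, and the `.xz`/`.gz` suffix convention are all hypothetical.

```python
import gzip
import json
import lzma
from pathlib import Path
from typing import Any, Optional

# Assumed convention: compressed copies keep the original name plus a suffix.
_OPENERS = {".xz": lzma.open, ".gz": gzip.open}


def load_confidences(sample_dir: Path, name: str = "confidences.json") -> Optional[Any]:
    """Return parsed JSON from name, name.xz or name.gz; None if none exists."""
    plain = sample_dir / name
    if plain.is_file():
        with plain.open("rt", encoding="utf-8") as fh:
            return json.load(fh)

    for suffix, opener in _OPENERS.items():
        candidate = sample_dir / (name + suffix)
        if candidate.is_file():
            # Both lzma.open and gzip.open support text mode with an encoding.
            with opener(candidate, "rt", encoding="utf-8") as fh:
                return json.load(fh)

    # The caller decides whether to fall back to summary_confidences.json.
    return None
```

Returning `None` rather than silently substituting the summary file keeps the fallback visible to the caller, which is the failure mode this commit is concerned with.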
Summary
Both folding backends write substantial redundant data with no way to prune it without breaking the vanilla AlphaFold layout:
- AF2: `predicted_aligned_error` is stored twice: inside `result_*.pkl` (~21 MB float32 per model) and as the standalone `pae_*.json` sidecar.
- AF3: the top-level `*_confidences.json` is a byte-identical copy of the best sample's `confidences.json`, `*_data.json` duplicates the saved features input, and every per-sample `confidences.json` is large and highly compressible.

This PR adds a single `--storage_mode` preset (both backends), defaulting to `vanilla` so existing output stays byte-identical to native AlphaFold2/3 and remains a drop-in for downstream tools:

| Mode | AF2 | AF3 |
| --- | --- | --- |
| `vanilla` (default) | no change | no change |
| `slim` | strip `predicted_aligned_error` from pickles (kept in `pae_*.json`) + xz-compress them | drop the top-level `confidences`/`data` duplicates + xz-compress non-best per-sample `confidences.json` |
| `minimal` | `slim`, plus drop all result pickles | [...] `confidences.json` |

The best sample's `confidences.json` is always left uncompressed so AlphaJudge (which reads best-model PAE from it and has no xz support) keeps working. All structures and summary scores are retained in every mode.

Why this is safe

Pickle contents are unused by the downstream consumers:

- AlphaJudge reads its AF2 inputs from `pae_*.json`, its AF3 inputs from `confidences.json`/`summary_confidences.json`, and structures from PDB/CIF.
- `convert_to_modelcif.py` also doesn't read pickle contents: its `pickle.load` is commented out, scores come from `confidence_*.json`/`ranking_debug.json`/`pae_*.json`, and it only needs the pickle filename (derived from `ranking_debug.json`).

So `slim`/`minimal` lose nothing those consumers need.

Validation

Verified on a real `plasmodium_hap2` prediction directory:

- `slim` shrinks AF2 254M->148M and AF3 628M->107M.
- AlphaJudge produces byte-identical scores in `best` and `all` model modes, for both backends (only the source-path column differs).

Tests

- `test/unit/test_post_modelling.py`, covering AF2 and AF3 `vanilla`/`slim`/`minimal`, including the safety fallback that preserves the top-level `confidences.json` when the best sample lacks its own.
- `test/unit/test_script_entrypoints.py` fixture updated for the new flag.

🤖 Generated with Claude Code
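
As an illustration of the AF3 `slim` behaviour described in the summary, the sketch below xz-compresses per-sample `confidences.json` files everywhere except in the best sample's directory. It is a hedged sketch, not the code in this PR: the function name `compress_non_best_confidences`, the assumption that each sample lives in its own directory, and the `.xz` suffix are all hypothetical.

```python
import lzma
import shutil
from pathlib import Path
from typing import List


def compress_non_best_confidences(sample_dirs: List[Path], best_dir: Path) -> List[Path]:
    """xz-compress confidences.json in every sample directory except best_dir.

    Hypothetical helper; returns the list of compressed files it wrote.
    """
    written: List[Path] = []
    best = best_dir.resolve()
    for sample_dir in sample_dirs:
        if sample_dir.resolve() == best:
            continue  # best sample stays plain so readers without xz support still work
        for src in sorted(sample_dir.glob("*confidences.json")):
            if src.name.endswith("summary_confidences.json"):
                continue  # summary scores are small and are kept as-is
            dst = src.with_name(src.name + ".xz")
            # Stream the file through the compressor instead of loading it whole.
            with src.open("rb") as fin, lzma.open(dst, "wb") as fout:
                shutil.copyfileobj(fin, fout)
            src.unlink()  # delete the original only after the copy is complete
            written.append(dst)
    return written
```

Because the original bytes are recoverable with `lzma.open(path, "rb").read()`, this step trades read-time decompression for disk space without discarding the full PAE matrix.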