Skip to content

fix: redact keras h5 lambda previews#1435

Merged
mldangelo-oai merged 28 commits into
mainfrom
mdangelo/codex/fix-keras-h5-lambda-preview-redaction-c032
Jun 6, 2026
Merged

fix: redact keras h5 lambda previews#1435
mldangelo-oai merged 28 commits into
mainfrom
mdangelo/codex/fix-keras-h5-lambda-preview-redaction-c032

Conversation

@mldangelo-oai

@mldangelo-oai mldangelo-oai commented May 30, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Omit raw Keras H5 Lambda source and dict/list bytecode previews while retaining dangerous-pattern findings.
  • Omit malformed Lambda configs and compiler error strings, which can echo attacker-controlled source lines containing secrets.
  • Preserve current main hardening for mixed encoded functions, module references, bounded bytecode analysis, and false-positive-resistant pattern matching.
  • Verify direct source, dict bytecode, legacy list bytecode, malformed syntax, and AST-valid/compiler-invalid source do not leak through JSON or SARIF.

Validation

  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_keras_h5_scanner.py -q (109 passed)
  • PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_keras_h5_scanner.py tests/scanners/test_keras_utils.py tests/scanners/test_keras_zip_scanner.py -q (359 passed)
  • uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
  • uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/ (454 source files)
  • Full local xdist suite reached completion without a test failure before the known local parent-process shutdown stall; fresh GitHub CI is the authoritative full-matrix gate.
  • git diff --check

Scan Finding

  • Fixes MA-DSS-C032 from the Codex Security scan.

@mldangelo-oai

Copy link
Copy Markdown
Contributor Author

@codex review

@github-actions

github-actions Bot commented May 30, 2026

Copy link
Copy Markdown
Contributor

Workflow run and artifacts

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 12 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 1.286s -> 1.291s (+0.4%).

Workload Benchmark Target Size Files Baseline Current Change Status
nested-payload-review tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64] nested_base64 98 B 1 147.7us 164.5us +11.4% stable
padded-multi-stream-upload tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload multi_stream_padded 4.1 KiB 1 183.0us 193.6us +5.8% stable
nested-payload-review tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex] nested_hex 130 B 1 150.0us 154.0us +2.7% stable
warm-cache-rescan tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan release-candidate 547.3 KiB 32 93.03ms 94.74ms +1.8% stable
nested-payload-review tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw] nested_raw 78 B 1 138.5us 140.9us +1.7% stable
mixed-model-repository tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository release-candidate 547.3 KiB 32 446.41ms 452.94ms +1.5% stable
duplicate-heavy-registry tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot registry-snapshot 915.2 KiB 13 364.62ms 360.79ms -1.0% stable
chunked-upload-stream tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream chunked_stream 278.2 KiB 1 109.14ms 110.21ms +1.0% stable
suspicious-pickle-intake tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake suspicious-intake 183.8 KiB 4 98.55ms 97.73ms -0.8% stable
direct-malicious-upload tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload malicious_reduce 52 B 1 115.9us 116.5us +0.5% stable
clean-training-checkpoint tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint safe_large 278.2 KiB 1 106.58ms 107.08ms +0.5% stable
single-checkpoint-preflight tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load single_checkpoint.pkl 183.0 KiB 1 67.07ms 66.84ms -0.3% stable

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8084b686d0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/scanners/keras_utils.py Outdated
@mldangelo-oai mldangelo-oai marked this pull request as ready for review May 31, 2026 05:36

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 640bc88cc9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/scanners/keras_h5_scanner.py Outdated
@mldangelo-oai

Copy link
Copy Markdown
Contributor Author

Reviewed and updated against current main in 7f8b02df.

Security QA found one additional output leak beyond the existing preview fixes: the AST-derived code_analysis string could serialize an attacker-controlled variadic Lambda argument name into JSON/SARIF. Dangerous H5 Lambda findings now omit that source-derived description and retain a stable code_analysis_omitted marker. The end-to-end regression includes a secret-bearing variadic identifier alongside source literals, marshalled bytecode constants, malformed configs, layer names, and safe/dangerous module references.

Validation:

  • tests/scanners/test_keras_h5_scanner.py: 126 passed
  • focused JSON/SARIF leak regression: passed
  • scoped Ruff format/check: clean
  • scoped mypy: clean
  • all 2 review threads are resolved

The legacy helper test is gated off on this host's Python 3.12 reduced lane; its assertion was updated to the same contract directly covered by the passing H5 regression.

@mldangelo-oai

Copy link
Copy Markdown
Contributor Author

Reviewed, synced with current main, and hardened beyond the original resolved comments.

Key QA/fixes:

  • Replaced attacker-controlled Lambda layer names with deterministic scanner-owned ordinals so opaque, non-credential-shaped values cannot leak through CVE messages, locations, JSON, or SARIF.
  • Omitted raw Lambda module and function_name evidence for both dangerous and safe-reference checks; findings retain status and explicit omission markers.
  • Scoped malformed-source keyword detection to the actual function text with identifier boundaries. Benign metadata such as evaluation_layer or retrieval quality no longer creates a substring false positive, while malformed eval/exec/compile/__import__ source remains detected.
  • Expanded end-to-end JSON/SARIF regressions to cover opaque top-level/config layer names and both dangerous and safe module/function metadata.
  • Merged the latest main; GitHub's prior conflict state resolved cleanly. Both existing review threads remain resolved.

Focused validation only (no full suite):

  • tests/scanners/test_keras_h5_scanner.py plus tests/utils/helpers/test_py_compile_improvements.py: 127 passed, 8 expected Python 3.12 lane skips
  • Scoped Ruff, format check, mypy, and git diff --check: clean

Normalize qualified Lambda class metadata, redact nested artifact-controlled config context, and avoid case-folding Python builtin names in malformed-source checks.
@mldangelo-oai

Copy link
Copy Markdown
Contributor Author

QA follow-up on exact head c7bda5a2:

  • Normalized all recognized qualified Lambda classes to the stable Lambda label in layer-count metadata and generic config findings, preventing artifact-controlled class paths from leaking into JSON/SARIF.
  • Kept nested Lambda config context redacted so artifact-controlled config keys cannot leak through recursive finding paths.
  • Made malformed Python builtin matching case-sensitive, avoiding false positives for non-executable names such as EVAL.
  • Added JSON/SARIF regressions covering qualified class paths and nested config keys.

Validation:

  • Focused Keras H5/compiler suite: 128 passed, 8 skipped
  • Full suite: 10,419 passed, 815 skipped
  • Full Ruff format/check: clean
  • Full mypy: 454 files clean
  • uv lock --check: clean

@mldangelo-oai mldangelo-oai enabled auto-merge (squash) June 5, 2026 15:14
@mldangelo-oai

Copy link
Copy Markdown
Contributor Author

Review follow-up pushed in 096bc04. The H5 Lambda module-reference classifier no longer marks safe modules such as math or numpy critical solely because an unresolved attribute is named eval/system (including uppercase near-matches). Unqualified dangerous builtins and risky module roots such as os/subprocess remain flagged. Validation: 17 Keras utility tests passed, 7 focused classifier cases passed, qualified-Lambda evidence redaction probe passed; scoped Ruff, format, mypy, and diff checks are clean. H5-backed integration remains CI-only locally because h5py is unavailable.

…da-preview-redaction-c032' into mdangelo/codex/audit-pr-1435

# Conflicts:
#	modelaudit/scanners/keras_h5_scanner.py
#	tests/scanners/test_keras_h5_scanner.py
@mldangelo-oai

Copy link
Copy Markdown
Contributor Author

Audit complete on head 94a51afa. In addition to the existing preview redaction, the combined fixes now: anonymize namespaced Lambda class counts and nested config keys; prevent fake config.layer metadata on Lambda from leaking through wrapper findings; fail closed on malformed scalar function metadata without echoing payloads; keep null placeholders benign; make malformed builtin matching case-sensitive; and avoid treating safe-module attributes like math.eval as dangerous solely by name. Validation on the exact pushed tree: focused union 14 passed; Keras surface 540 passed, 11 skipped before remote integration; Ruff format/check clean; mypy clean across 454 files; full canonical suite 10,309 passed, 858 skipped.

…-pr-1435-c020

# Conflicts:
#	modelaudit/scanners/keras_h5_scanner.py
@mldangelo-oai

Copy link
Copy Markdown
Contributor Author

Merged current main and resolved the Keras H5 overlap in 46e3c6a. The resolution preserves main’s exact Lambda class allowlist, source/callback allowlists, auxiliary output_shape/mask scanning, nested callable handling, and fail-closed warnings while omitting source-derived code analysis, previews, validation errors, layer names, nested config paths, and module/function evidence. QA also separates generic near-matches such as math.eval and tensorflow.io.system (warning) from real callback bridges such as py_function, PyFunc, and load_op_library (critical). Targeted local validation: 25 passed; H5 integration files were skipped because h5py is unavailable locally; scoped Ruff, format, mypy, and git diff checks are clean. Both existing review threads remain resolved.

@mldangelo-oai

Copy link
Copy Markdown
Contributor Author

Adversarial QA follow-up pushed at 390559bf after integrating current main.

Fixed two remaining issues:

  • updated the reduced-lane Lambda helper tests for synthetic, non-artifact-controlled layer names, closing the current Type Check failure
  • contained Python syntax compilation inside a private temporary directory with an explicit bytecode path, preventing valid Lambda source and secret constants from persisting in orphaned /tmp/__pycache__/*.pyc files

Focused validation: 276 associated tests passed with 33 expected reduced-lane skips; scoped Ruff, mypy, and diff checks are clean. All review threads are resolved and the PR is mergeable/approved. Fresh CI is running; rotating without waiting.

Comment thread modelaudit/utils/helpers/code_validation.py Fixed
@mldangelo-oai

Copy link
Copy Markdown
Contributor Author

Critical review and remediation complete at 32f4f568 (fix commit dfab222a, followed by a current-main merge).

Review fixes:

  • removed clear-text temporary storage of model-supplied Lambda source and bytecode; syntax validation now uses in-memory ast.parse plus compile
  • retained compiler-only rejection of top-level return, break, continue, and nonlocal without echoing secret-bearing source in errors
  • added reduced-Python-lane routing for the code-validation regressions
  • added a guard that fails if syntax validation calls either temporary-file API
  • stopped the generic Lambda configuration pass from case-folding Python identifiers, preventing benign EVAL and metadata near matches while keeping non-Lambda matching behavior unchanged

All review threads are addressed and resolved. The stale GitHub conflict state was cleared by merging current main; the changelog auto-merge retained all entries.

Local validation after the merge:

  • runnable associated tests: 83 passed
  • focused in-memory validation and Lambda helper tests: 58 passed
  • H5-backed test modules: 2 skipped because h5py is unavailable locally
  • Ruff format and lint: clean
  • mypy on the six affected source/test files: clean
  • diff whitespace check: clean

Fresh CI is running on the pushed head and will exercise the H5 end-to-end cases.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 32f4f5689c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/scanners/keras_h5_scanner.py
@mldangelo-oai

Copy link
Copy Markdown
Contributor Author

QA follow-up after syncing current main and addressing the remaining review thread:

  • Fixed the mixed-case module-less callable false negative: subprocess.Popen is critical; benign near-match subprocess.PopenSafe remains warning-only.
  • Made the deep-unary fail-closed regression portable across CPython parser depth differences (Python 3.10 may reject it during syntax validation; newer versions parse it and reject it from the allowlist).
  • Associated Keras Lambda/helper tests: 311 passed, 8 reduced-lane skips.
  • Re-ran the previously failing cache regression directly: 1 passed.
  • Scoped Ruff format/check and mypy: clean.
  • All review threads are addressed and resolved.

Pushed as e9595f3; fresh CI is running.

Copy link
Copy Markdown
Contributor Author

QA refresh after syncing current main at 58fbc9c6:

  • Resolved the single test import conflict by preserving both independent imports.
  • Focused validation: 330 passed, 8 skipped across the changed Keras H5/Lambda and shared code-validation test modules.
  • Targeted Ruff check, Ruff format check, and mypy all pass on the changed source/test files.
  • The prior Windows cache-identity failure was outside this PR's feature surface and is superseded by the newer base.

Updated head: 241ad87edda1f6c52e5ac1640fc002691574dd06. Fresh CI is now running; moving on to the next review without waiting.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 241ad87edd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread modelaudit/scanners/keras_h5_scanner.py

Copy link
Copy Markdown
Contributor Author

Review follow-up pushed in a6ded77d.

The new module-qualified callable finding was valid: io.open could be downgraded to a generic warning. The fix uses a narrow dangerous module/function pair instead of treating every open or system name as critical. Added malicious and benign near-match coverage: io.open is critical; io.open_safe and platform.system remain warning-only.

Scoped validation: tests/scanners/test_keras_h5_scanner.py 250 passed; Ruff lint/format, mypy, and git diff --check are clean. Fresh CI is running.

@mldangelo-oai mldangelo-oai merged commit 9830ef6 into main Jun 6, 2026
29 checks passed
@mldangelo-oai mldangelo-oai deleted the mdangelo/codex/fix-keras-h5-lambda-preview-redaction-c032 branch June 6, 2026 01:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants