fix: redact keras h5 lambda previews by mldangelo-oai · Pull Request #1435 · promptfoo/modelaudit

mldangelo-oai · 2026-05-30T22:09:48Z

Summary

Omit raw Keras H5 Lambda source and dict/list bytecode previews while retaining dangerous-pattern findings.
Omit malformed Lambda configs and compiler error strings, which can echo attacker-controlled source lines containing secrets.
Preserve current main hardening for mixed encoded functions, module references, bounded bytecode analysis, and false-positive-resistant pattern matching.
Verify direct source, dict bytecode, legacy list bytecode, malformed syntax, and AST-valid/compiler-invalid source do not leak through JSON or SARIF.

Validation

PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_keras_h5_scanner.py -q (109 passed)
PROMPTFOO_DISABLE_TELEMETRY=1 uv run pytest tests/scanners/test_keras_h5_scanner.py tests/scanners/test_keras_utils.py tests/scanners/test_keras_zip_scanner.py -q (359 passed)
uv run ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run ruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/
uv run mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/ (454 source files)
Full local xdist suite reached completion without a test failure before the known local parent-process shutdown stall; fresh GitHub CI is the authoritative full-matrix gate.
git diff --check

Scan Finding

Fixes MA-DSS-C032 from the Codex Security scan.

mldangelo-oai · 2026-05-30T22:09:59Z

@codex review

github-actions · 2026-05-30T22:11:30Z

Workflow run and artifacts

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 12 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 1.286s -> 1.291s (+0.4%).

Workload	Benchmark	Target	Size	Files	Baseline	Current	Change	Status
`nested-payload-review`	`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64]`	`nested_base64`	98 B	1	147.7us	164.5us	+11.4%	stable
`padded-multi-stream-upload`	`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload`	`multi_stream_padded`	4.1 KiB	1	183.0us	193.6us	+5.8%	stable
`nested-payload-review`	`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex]`	`nested_hex`	130 B	1	150.0us	154.0us	+2.7%	stable
`warm-cache-rescan`	`tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan`	`release-candidate`	547.3 KiB	32	93.03ms	94.74ms	+1.8%	stable
`nested-payload-review`	`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw]`	`nested_raw`	78 B	1	138.5us	140.9us	+1.7%	stable
`mixed-model-repository`	`tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository`	`release-candidate`	547.3 KiB	32	446.41ms	452.94ms	+1.5%	stable
`duplicate-heavy-registry`	`tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot`	`registry-snapshot`	915.2 KiB	13	364.62ms	360.79ms	-1.0%	stable
`chunked-upload-stream`	`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream`	`chunked_stream`	278.2 KiB	1	109.14ms	110.21ms	+1.0%	stable
`suspicious-pickle-intake`	`tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake`	`suspicious-intake`	183.8 KiB	4	98.55ms	97.73ms	-0.8%	stable
`direct-malicious-upload`	`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload`	`malicious_reduce`	52 B	1	115.9us	116.5us	+0.5%	stable
`clean-training-checkpoint`	`tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint`	`safe_large`	278.2 KiB	1	106.58ms	107.08ms	+0.5%	stable
`single-checkpoint-preflight`	`tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load`	`single_checkpoint.pkl`	183.0 KiB	1	67.07ms	66.84ms	-0.3%	stable

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8084b686d0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 640bc88cc9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…ras-h5-lambda-preview-redaction-c032 # Conflicts: # modelaudit/scanners/keras_h5_scanner.py # modelaudit/scanners/keras_utils.py # tests/scanners/test_keras_h5_scanner.py

…ras-h5-lambda-preview-redaction-c032

mldangelo-oai · 2026-06-05T07:07:57Z

Reviewed and updated against current main in 7f8b02df.

Security QA found one additional output leak beyond the existing preview fixes: the AST-derived code_analysis string could serialize an attacker-controlled variadic Lambda argument name into JSON/SARIF. Dangerous H5 Lambda findings now omit that source-derived description and retain a stable code_analysis_omitted marker. The end-to-end regression includes a secret-bearing variadic identifier alongside source literals, marshalled bytecode constants, malformed configs, layer names, and safe/dangerous module references.

Validation:

tests/scanners/test_keras_h5_scanner.py: 126 passed
focused JSON/SARIF leak regression: passed
scoped Ruff format/check: clean
scoped mypy: clean
all 2 review threads are resolved

The legacy helper test is gated off on this host's Python 3.12 reduced lane; its assertion was updated to the same contract directly covered by the passing H5 regression.

…ras-h5-lambda-preview-redaction-c032

mldangelo-oai · 2026-06-05T11:43:48Z

Reviewed, synced with current main, and hardened beyond the original resolved comments.

Key QA/fixes:

Replaced attacker-controlled Lambda layer names with deterministic scanner-owned ordinals so opaque, non-credential-shaped values cannot leak through CVE messages, locations, JSON, or SARIF.
Omitted raw Lambda module and function_name evidence for both dangerous and safe-reference checks; findings retain status and explicit omission markers.
Scoped malformed-source keyword detection to the actual function text with identifier boundaries. Benign metadata such as evaluation_layer or retrieval quality no longer creates a substring false positive, while malformed eval/exec/compile/__import__ source remains detected.
Expanded end-to-end JSON/SARIF regressions to cover opaque top-level/config layer names and both dangerous and safe module/function metadata.
Merged the latest main; GitHub's prior conflict state resolved cleanly. Both existing review threads remain resolved.

Focused validation only (no full suite):

tests/scanners/test_keras_h5_scanner.py plus tests/utils/helpers/test_py_compile_improvements.py: 127 passed, 8 expected Python 3.12 lane skips
Scoped Ruff, format check, mypy, and git diff --check: clean

…ras-h5-lambda-preview-redaction-c032

…pr-1435

Normalize qualified Lambda class metadata, redact nested artifact-controlled config context, and avoid case-folding Python builtin names in malformed-source checks.

mldangelo-oai · 2026-06-05T15:14:15Z

QA follow-up on exact head c7bda5a2:

Normalized all recognized qualified Lambda classes to the stable Lambda label in layer-count metadata and generic config findings, preventing artifact-controlled class paths from leaking into JSON/SARIF.
Kept nested Lambda config context redacted so artifact-controlled config keys cannot leak through recursive finding paths.
Made malformed Python builtin matching case-sensitive, avoiding false positives for non-executable names such as EVAL.
Added JSON/SARIF regressions covering qualified class paths and nested config keys.

Validation:

Focused Keras H5/compiler suite: 128 passed, 8 skipped
Full suite: 10,419 passed, 815 skipped
Full Ruff format/check: clean
Full mypy: 454 files clean
uv lock --check: clean

mldangelo-oai · 2026-06-05T15:21:07Z

Review follow-up pushed in 096bc04. The H5 Lambda module-reference classifier no longer marks safe modules such as math or numpy critical solely because an unresolved attribute is named eval/system (including uppercase near-matches). Unqualified dangerous builtins and risky module roots such as os/subprocess remain flagged. Validation: 17 Keras utility tests passed, 7 focused classifier cases passed, qualified-Lambda evidence redaction probe passed; scoped Ruff, format, mypy, and diff checks are clean. H5-backed integration remains CI-only locally because h5py is unavailable.

…da-preview-redaction-c032' into mdangelo/codex/audit-pr-1435 # Conflicts: # modelaudit/scanners/keras_h5_scanner.py # tests/scanners/test_keras_h5_scanner.py

mldangelo-oai · 2026-06-05T15:33:48Z

Audit complete on head 94a51afa. In addition to the existing preview redaction, the combined fixes now: anonymize namespaced Lambda class counts and nested config keys; prevent fake config.layer metadata on Lambda from leaking through wrapper findings; fail closed on malformed scalar function metadata without echoing payloads; keep null placeholders benign; make malformed builtin matching case-sensitive; and avoid treating safe-module attributes like math.eval as dangerous solely by name. Validation on the exact pushed tree: focused union 14 passed; Keras surface 540 passed, 11 skipped before remote integration; Ruff format/check clean; mypy clean across 454 files; full canonical suite 10,309 passed, 858 skipped.

…-pr-1435-c020 # Conflicts: # modelaudit/scanners/keras_h5_scanner.py

mldangelo-oai · 2026-06-05T18:02:44Z

Merged current main and resolved the Keras H5 overlap in 46e3c6a. The resolution preserves main’s exact Lambda class allowlist, source/callback allowlists, auxiliary output_shape/mask scanning, nested callable handling, and fail-closed warnings while omitting source-derived code analysis, previews, validation errors, layer names, nested config paths, and module/function evidence. QA also separates generic near-matches such as math.eval and tensorflow.io.system (warning) from real callback bridges such as py_function, PyFunc, and load_op_library (critical). Targeted local validation: 25 passed; H5 integration files were skipped because h5py is unavailable locally; scoped Ruff, format, mypy, and git diff checks are clean. Both existing review threads remain resolved.

…ras-h5-lambda-preview-redaction-c032

mldangelo-oai · 2026-06-05T19:50:57Z

Adversarial QA follow-up pushed at 390559bf after integrating current main.

Fixed two remaining issues:

updated the reduced-lane Lambda helper tests for synthetic, non-artifact-controlled layer names, closing the current Type Check failure
contained Python syntax compilation inside a private temporary directory with an explicit bytecode path, preventing valid Lambda source and secret constants from persisting in orphaned /tmp/__pycache__/*.pyc files

Focused validation: 276 associated tests passed with 33 expected reduced-lane skips; scoped Ruff, mypy, and diff checks are clean. All review threads are resolved and the PR is mergeable/approved. Fresh CI is running; rotating without waiting.

…pr-1435-latest

mldangelo-oai · 2026-06-05T22:45:58Z

Critical review and remediation complete at 32f4f568 (fix commit dfab222a, followed by a current-main merge).

Review fixes:

removed clear-text temporary storage of model-supplied Lambda source and bytecode; syntax validation now uses in-memory ast.parse plus compile
retained compiler-only rejection of top-level return, break, continue, and nonlocal without echoing secret-bearing source in errors
added reduced-Python-lane routing for the code-validation regressions
added a guard that fails if syntax validation calls either temporary-file API
stopped the generic Lambda configuration pass from case-folding Python identifiers, preventing benign EVAL and metadata near matches while keeping non-Lambda matching behavior unchanged

All review threads are addressed and resolved. The stale GitHub conflict state was cleared by merging current main; the changelog auto-merge retained all entries.

Local validation after the merge:

runnable associated tests: 83 passed
focused in-memory validation and Lambda helper tests: 58 passed
H5-backed test modules: 2 skipped because h5py is unavailable locally
Ruff format and lint: clean
mypy on the six affected source/test files: clean
diff whitespace check: clean

Fresh CI is running on the pushed head and will exercise the H5 end-to-end cases.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 32f4f5689c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…pr-1435-latest

mldangelo-oai · 2026-06-05T23:50:06Z

QA follow-up after syncing current main and addressing the remaining review thread:

Fixed the mixed-case module-less callable false negative: subprocess.Popen is critical; benign near-match subprocess.PopenSafe remains warning-only.
Made the deep-unary fail-closed regression portable across CPython parser depth differences (Python 3.10 may reject it during syntax validation; newer versions parse it and reject it from the allowlist).
Associated Keras Lambda/helper tests: 311 passed, 8 reduced-lane skips.
Re-ran the previously failing cache regression directly: 1 passed.
Scoped Ruff format/check and mypy: clean.
All review threads are addressed and resolved.

Pushed as e9595f3; fresh CI is running.

…ras-h5-lambda-preview-redaction-c032

mldangelo-oai · 2026-06-06T01:23:12Z

QA refresh after syncing current main at 58fbc9c6:

Resolved the single test import conflict by preserving both independent imports.
Focused validation: 330 passed, 8 skipped across the changed Keras H5/Lambda and shared code-validation test modules.
Targeted Ruff check, Ruff format check, and mypy all pass on the changed source/test files.
The prior Windows cache-identity failure was outside this PR's feature surface and is superseded by the newer base.

Updated head: 241ad87edda1f6c52e5ac1640fc002691574dd06. Fresh CI is now running; moving on to the next review without waiting.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 241ad87edd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

mldangelo-oai · 2026-06-06T01:33:30Z

Review follow-up pushed in a6ded77d.

The new module-qualified callable finding was valid: io.open could be downgraded to a generic warning. The fix uses a narrow dangerous module/function pair instead of treating every open or system name as critical. Added malicious and benign near-match coverage: io.open is critical; io.open_safe and platform.system remain warning-only.

Scoped validation: tests/scanners/test_keras_h5_scanner.py 250 passed; Ruff lint/format, mypy, and git diff --check are clean. Fresh CI is running.

fix: redact keras h5 lambda previews

8084b68

chatgpt-codex-connector Bot reviewed May 30, 2026

View reviewed changes

Comment thread modelaudit/scanners/keras_utils.py Outdated

fix: omit opaque keras lambda bytecode previews

640bc88

mldangelo-oai marked this pull request as ready for review May 31, 2026 05:36

chatgpt-codex-connector Bot reviewed May 31, 2026

View reviewed changes

Comment thread modelaudit/scanners/keras_h5_scanner.py Outdated

fix: omit keras lambda source previews

200d85c

ianw-oai approved these changes May 31, 2026

View reviewed changes

mldangelo-oai added 7 commits June 1, 2026 05:42

fix: omit malformed Keras Lambda configs

a2f9641

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-ke…

64491b8

…ras-h5-lambda-preview-redaction-c032 # Conflicts: # modelaudit/scanners/keras_h5_scanner.py # modelaudit/scanners/keras_utils.py # tests/scanners/test_keras_h5_scanner.py

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-ke…

cc4c377

…ras-h5-lambda-preview-redaction-c032

fix: redact H5 Lambda metadata evidence

ff3572f

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-ke…

776b538

…ras-h5-lambda-preview-redaction-c032

Merge remote-tracking branch 'origin/main' into HEAD

c70f990

fix: redact Keras H5 Lambda analysis evidence

7f8b02d

mldangelo-oai added 2 commits June 5, 2026 11:37

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-ke…

4e23a91

…ras-h5-lambda-preview-redaction-c032

fix: omit Keras H5 Lambda metadata evidence

66b5d25

mldangelo-oai added 3 commits June 5, 2026 14:54

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-ke…

c94e99f

…ras-h5-lambda-preview-redaction-c032

Merge remote-tracking branch 'origin/main' into mdangelo/codex/audit-…

256e537

…pr-1435

fix: harden keras lambda redaction

c7bda5a

Normalize qualified Lambda class metadata, redact nested artifact-controlled config context, and avoid case-folding Python builtin names in malformed-source checks.

mldangelo-oai enabled auto-merge (squash) June 5, 2026 15:14

fix: scope H5 Lambda function risks

096bc04

mldangelo-oai added 2 commits June 5, 2026 15:25

fix: close Keras H5 Lambda evidence gaps

d1baef7

Merge remote-tracking branch 'origin/mdangelo/codex/fix-keras-h5-lamb…

94a51af

…da-preview-redaction-c032' into mdangelo/codex/audit-pr-1435 # Conflicts: # modelaudit/scanners/keras_h5_scanner.py # tests/scanners/test_keras_h5_scanner.py

Merge remote-tracking branch 'origin/main' into mdangelo/codex/review…

46e3c6a

…-pr-1435-c020 # Conflicts: # modelaudit/scanners/keras_h5_scanner.py

mldangelo-oai added 2 commits June 5, 2026 19:44

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-ke…

17c2b57

…ras-h5-lambda-preview-redaction-c032

fix: contain Keras Lambda validation artifacts

390559b

github-advanced-security AI found potential problems Jun 5, 2026

View reviewed changes

Comment thread modelaudit/utils/helpers/code_validation.py Fixed

mldangelo-oai added 2 commits June 5, 2026 22:44

fix: keep Keras Lambda validation in memory

dfab222

Merge remote-tracking branch 'origin/main' into mdangelo/codex/audit-…

32f4f56

…pr-1435-latest

chatgpt-codex-connector Bot reviewed Jun 5, 2026

View reviewed changes

Comment thread modelaudit/scanners/keras_h5_scanner.py

mldangelo-oai added 3 commits June 5, 2026 23:45

Merge remote-tracking branch 'origin/main' into mdangelo/codex/audit-…

70bd6ed

…pr-1435-latest

test: accept version-specific Lambda parse failures

1a63859

fix: preserve Keras Lambda callable severity

e9595f3

Merge remote-tracking branch 'origin/main' into mdangelo/codex/fix-ke…

241ad87

…ras-h5-lambda-preview-redaction-c032

chatgpt-codex-connector Bot reviewed Jun 6, 2026

View reviewed changes

Comment thread modelaudit/scanners/keras_h5_scanner.py

fix: preserve critical io lambda references

a6ded77

mldangelo-oai merged commit 9830ef6 into main Jun 6, 2026
29 checks passed

mldangelo-oai deleted the mdangelo/codex/fix-keras-h5-lambda-preview-redaction-c032 branch June 6, 2026 01:54

Conversation

mldangelo-oai commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Validation

Scan Finding

Uh oh!

mldangelo-oai commented May 30, 2026

Uh oh!

github-actions Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Performance Benchmarks

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

mldangelo-oai commented Jun 5, 2026

Uh oh!

mldangelo-oai commented Jun 5, 2026

Uh oh!

mldangelo-oai commented Jun 5, 2026

Uh oh!

mldangelo-oai commented Jun 5, 2026

Uh oh!

mldangelo-oai commented Jun 5, 2026

Uh oh!

mldangelo-oai commented Jun 5, 2026

Uh oh!

mldangelo-oai commented Jun 5, 2026

Uh oh!

Uh oh!

mldangelo-oai commented Jun 5, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

mldangelo-oai commented Jun 5, 2026

Uh oh!

mldangelo-oai commented Jun 6, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

mldangelo-oai commented Jun 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

mldangelo-oai commented May 30, 2026 •

edited

Loading

github-actions Bot commented May 30, 2026 •

edited

Loading