fix(agentex): redact secret values from all logs via logging filter#294
Merged
Conversation
3706ff6 to
510a5e6
Compare
Defense in depth on top of removing the principal context from authz logs (#293): add a SensitiveDataFilter to every logger created by make_logger so that if any future code path interpolates an object carrying a credential (e.g. a principal context dict with an api_key) into a log message, the value is masked before the record is emitted. The filter rewrites the fully-rendered message, masking the value that follows any sensitive key name (api_key / token / secret / cookie / authorization / acting-user) in dict-repr, kwargs, and JSON renderings. Key names reuse REQUEST_KEY_REGEXP_BLACKLIST so the message scrubber and the request-body scrubber stay in sync. Attaching the filter to the logger (not the handler) means redaction runs once, up front, for every downstream handler the record propagates to. Scope: scrubs message text; structured extra={} fields are out of scope and should continue to use non-sensitive identifiers (e.g. api_key_id). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
510a5e6 to
05d2558
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
A request principal context can carry a raw
api_key(a long-lived bearer credential). #293 removed the principal context from the authorization-service logs and hardened the request-body scrubber — but that's call-site-by-call-site. The leak originally happened because a perfectly ordinarylogger.info(..., principal_context)interpolated a secret-bearing object. Nothing structurally prevents the next such line.Fix
Add a
SensitiveDataFilterto every logger created bymake_logger(agentex/src/utils/logging.py). It rewrites the fully-rendered log message before any handler emits it, masking the value that follows any sensitive key name to[REDACTED]:REQUEST_KEY_REGEXP_BLACKLIST(the request-body scrubber's list) — single source of truth:api_key/token/secret/cookie/authorization/acting-user(incl. camelCaseapiKey).'api_key': 'x'), kwargs/model repr (api_key='x'), and JSON ("api_key": "x"), preserving quote style.This is defense-in-depth layered on top of #293: call sites should still avoid logging secrets, but if one slips through in the future, the value never reaches log aggregation.
Scope / limitations
extra={...}fields are out of scope and should continue to use non-sensitive identifiers (e.g.api_key_id, not the key).token: foo); it's a safety net, not a substitute for not logging secrets. The org-wide "all logs everywhere" backstop (Datadog Sensitive Data Scanner) is a separate infra follow-up.Tests
tests/unit/utils/test_logging_redaction.py:user_id/account_idtoken/secret/authorization/cookierenderingsmake_loggerlogger masks the secret in captured output while keeping safe identifiersuv run ruff check/formatclean;pytestfor the new file + #293's logging/request_utils tests = 15 passed.🤖 Generated with Claude Code
Greptile Summary
Adds a
SensitiveDataFilter(attached at the logger level inmake_logger) that rewrites rendered log messages to mask values following any key inREQUEST_KEY_REGEXP_BLACKLISTbefore any handler emits the record. The implementation is defense-in-depth on top of #293's call-site hardening.logging.py: Newredact_sensitive_texthelper andSensitiveDataFilterclass share the existingREQUEST_KEY_REGEXP_BLACKLISTas the single source of truth; the compiled regex handles dict-repr, kwargs, JSON, andBearer-prefixed authorization headers. The filter is a module-level singleton added idempotently to eachmake_loggerlogger.test_logging_redaction.py: 15 test cases cover the exact leaked principal shape, 10 common rendering variants (including camelCase, unquoted, and scheme-prefixed), a non-sensitive text preservation check, an over-redaction boundary test, and an end-to-endmake_logger/caplogintegration test.Confidence Score: 5/5
Safe to merge. The filter logic is correct, redaction is idempotent, and the regex handles all common credential-bearing log shapes including Bearer-prefixed headers.
The filter attaches cleanly to the logger before any handler runs, modifies record.msg and record.args in place so both plain and JSON formatters see the redacted text, and the idempotency check prevents double-attaching on repeated make_logger calls for the same name. No functional defect was found in the changed files.
No files require special attention.
Important Files Changed
Prompt To Fix All With AI
Reviews (3): Last reviewed commit: "fix(agentex): redact secret values from ..." | Re-trigger Greptile