Skip to content

fix(agentex): redact secret values from all logs via logging filter#294

Merged
danielmillerp merged 1 commit into
mainfrom
dm/redact-principal-context-logs
Jun 9, 2026
Merged

fix(agentex): redact secret values from all logs via logging filter#294
danielmillerp merged 1 commit into
mainfrom
dm/redact-principal-context-logs

Conversation

@danielmillerp

@danielmillerp danielmillerp commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Problem

A request principal context can carry a raw api_key (a long-lived bearer credential). #293 removed the principal context from the authorization-service logs and hardened the request-body scrubber — but that's call-site-by-call-site. The leak originally happened because a perfectly ordinary logger.info(..., principal_context) interpolated a secret-bearing object. Nothing structurally prevents the next such line.

Fix

Add a SensitiveDataFilter to every logger created by make_logger (agentex/src/utils/logging.py). It rewrites the fully-rendered log message before any handler emits it, masking the value that follows any sensitive key name to [REDACTED]:

  • Key names reuse REQUEST_KEY_REGEXP_BLACKLIST (the request-body scrubber's list) — single source of truth: api_key / token / secret / cookie / authorization / acting-user (incl. camelCase apiKey).
  • Handles dict-repr ('api_key': 'x'), kwargs/model repr (api_key='x'), and JSON ("api_key": "x"), preserving quote style.
  • Attached to the logger (not the handler) so redaction runs once, up front, for every downstream handler the record propagates to.

This is defense-in-depth layered on top of #293: call sites should still avoid logging secrets, but if one slips through in the future, the value never reaches log aggregation.

Scope / limitations

  • Scrubs message text only. Structured extra={...} fields are out of scope and should continue to use non-sensitive identifiers (e.g. api_key_id, not the key).
  • Regex is best-effort (may over-redact a benign token: foo); it's a safety net, not a substitute for not logging secrets. The org-wide "all logs everywhere" backstop (Datadog Sensitive Data Scanner) is a separate infra follow-up.

Tests

tests/unit/utils/test_logging_redaction.py:

  • redacts the exact principal-dict shape that leaked; preserves user_id / account_id
  • redacts kwargs / JSON / unquoted / camelCase / token / secret / authorization / cookie renderings
  • leaves non-sensitive text untouched
  • end-to-end: a make_logger logger masks the secret in captured output while keeping safe identifiers

uv run ruff check/format clean; pytest for the new file + #293's logging/request_utils tests = 15 passed.

🤖 Generated with Claude Code

Greptile Summary

Adds a SensitiveDataFilter (attached at the logger level in make_logger) that rewrites rendered log messages to mask values following any key in REQUEST_KEY_REGEXP_BLACKLIST before any handler emits the record. The implementation is defense-in-depth on top of #293's call-site hardening.

  • logging.py: New redact_sensitive_text helper and SensitiveDataFilter class share the existing REQUEST_KEY_REGEXP_BLACKLIST as the single source of truth; the compiled regex handles dict-repr, kwargs, JSON, and Bearer-prefixed authorization headers. The filter is a module-level singleton added idempotently to each make_logger logger.
  • test_logging_redaction.py: 15 test cases cover the exact leaked principal shape, 10 common rendering variants (including camelCase, unquoted, and scheme-prefixed), a non-sensitive text preservation check, an over-redaction boundary test, and an end-to-end make_logger/caplog integration test.

Confidence Score: 5/5

Safe to merge. The filter logic is correct, redaction is idempotent, and the regex handles all common credential-bearing log shapes including Bearer-prefixed headers.

The filter attaches cleanly to the logger before any handler runs, modifies record.msg and record.args in place so both plain and JSON formatters see the redacted text, and the idempotency check prevents double-attaching on repeated make_logger calls for the same name. No functional defect was found in the changed files.

No files require special attention.

Important Files Changed

Filename Overview
agentex/src/utils/logging.py Adds SensitiveDataFilter and redact_sensitive_text; logically correct, idempotent singleton attachment; minor: redact_sensitive_text called outside the try/except guard.
agentex/tests/unit/utils/test_logging_redaction.py New unit tests covering principal dict leak shape, 10 rendering variants, non-sensitive preservation, over-redaction boundary, and end-to-end caplog test; password and acting-user keys not explicitly parametrized.

Fix All in Cursor Fix All in Claude Code Fix All in Codex

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
agentex/src/utils/logging.py:76-86
The `try/except` around `getMessage()` states "Never let redaction break logging itself", but `redact_sensitive_text` is called outside that guard. If `re.sub` somehow raises (e.g., hitting a recursion-depth limit on a pathological log string), the exception propagates through the filter and crashes the calling log statement — the exact failure mode the comment promises to prevent.

```suggestion
    def filter(self, record: logging.LogRecord) -> bool:
        try:
            message = record.getMessage()
            redacted = redact_sensitive_text(message)
        except Exception:
            # Never let redaction break logging itself.
            return True
        if redacted != message:
            record.msg = redacted
            record.args = ()
        return True
```

### Issue 2 of 2
agentex/tests/unit/utils/test_logging_redaction.py:34-50
`password` and `acting-user` are both in `REQUEST_KEY_REGEXP_BLACKLIST` and therefore in the compiled regex, but neither appears in the parametrize list. A future edit that accidentally drops one of those patterns from the blacklist would leave these redaction paths silently broken with no test to catch it.

Reviews (3): Last reviewed commit: "fix(agentex): redact secret values from ..." | Re-trigger Greptile

@danielmillerp danielmillerp requested a review from a team as a code owner June 9, 2026 19:26
Comment thread agentex/tests/unit/services/test_authorization_service.py Outdated
@danielmillerp danielmillerp force-pushed the dm/redact-principal-context-logs branch from 3706ff6 to 510a5e6 Compare June 9, 2026 19:48
@danielmillerp danielmillerp changed the title fix(agentex): redact principal context in authorization logs fix(agentex): redact secret values from all logs via logging filter Jun 9, 2026
Comment thread agentex/src/utils/logging.py
Defense in depth on top of removing the principal context from authz logs
(#293): add a SensitiveDataFilter to every logger created by make_logger so
that if any future code path interpolates an object carrying a credential
(e.g. a principal context dict with an api_key) into a log message, the
value is masked before the record is emitted.

The filter rewrites the fully-rendered message, masking the value that
follows any sensitive key name (api_key / token / secret / cookie /
authorization / acting-user) in dict-repr, kwargs, and JSON renderings. Key
names reuse REQUEST_KEY_REGEXP_BLACKLIST so the message scrubber and the
request-body scrubber stay in sync. Attaching the filter to the logger (not
the handler) means redaction runs once, up front, for every downstream
handler the record propagates to.

Scope: scrubs message text; structured extra={} fields are out of scope and
should continue to use non-sensitive identifiers (e.g. api_key_id).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@danielmillerp danielmillerp force-pushed the dm/redact-principal-context-logs branch from 510a5e6 to 05d2558 Compare June 9, 2026 19:59
@danielmillerp danielmillerp merged commit db35810 into main Jun 9, 2026
30 checks passed
@danielmillerp danielmillerp deleted the dm/redact-principal-context-logs branch June 9, 2026 20:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant