fix(pii): Validate PII regexes for byte-mode compilation #5920

Draft
sentry[bot] wants to merge 1 commit into master from seer/fix/pii-byte-regex-validation

sentry[bot] commented May 2, 2026

This PR addresses an issue where PII scrubbing regexes containing Unicode character ranges (e.g., Korean Hangul) would pass initial configuration validation but fail at runtime during byte-level attachment scrubbing.

The root cause was that relay-pii/src/attachments.rs:apply_regex_to_utf8_bytes recompiles regexes with unicode(false) for byte-mode matching, which does not support Unicode character ranges. However, the PII configuration validation in relay-cabi/src/processing.rs:relay_validate_pii_config (via CompiledPiiConfig::force_compile()) only checked for standard (unicode-enabled) regex compilation.

To fix this, the force_compile() method in relay-pii/src/compiledconfig.rs now performs an additional validation step for user-defined Pattern and RedactPair regexes: it attempts to compile them using regex::bytes::RegexBuilder::new(...).unicode(false).build(). If this byte-mode compilation fails, a PiiConfigError::RegexError is returned and the PII configuration is rejected at validation time. Configurations that cannot be compiled for byte-mode matching are therefore no longer accepted and deployed, so this class of failure is caught before attachment scrubbing runs.

Additionally, a comment in relay-pii/src/attachments.rs has been updated to reflect this improved validation, acknowledging that byte-mode compilation errors for user-defined patterns should now be caught earlier.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

Fixes RELAY-5Y
