Details
Description
EXAMPLE using sample redaction rules (fixed string replace for Vancouver, and regex replace for dates):
1. Activity description (input):
-------------
Vancouver welfare: $500 per month
Other sources:
2001-10-12 : gift of 10,000
2009-10-03: gift of 500
2010-01-13: gift of 100
-------------
2. Redacted output (actual, incorrect):
-------------
city_37028 welfare: $500 per month
Other sources:
date_47616 : gift of 10,000
date_47616: gift of 500
date_47616: gift of 100
-------------
PROBLEM: All 3 dates are assigned the same code ("47616"), but each distinct matched value should get a DISTINCT redaction code. (If the same date appears more than once in the input stream, then those occurrences SHOULD get the SAME redaction code.)
NOTE: This came up because Phys Health wants to use a regex rule that redacts any word starting with an uppercase letter. Distinct words need to be assigned distinct redaction codes. So when testing the fix, test this specific case. EXAMPLE:
Input:
the client works at St. Vincents hospital on Tuesdays.
Output (assuming regex for words starting w/ upper case letter is configured to assign the prefix 'text_'):
the client works at text_09223 text_09412 hospital on text_23565.
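
A minimal sketch of the expected behaviour, for use when testing the fix. This is NOT the actual redaction code: the Redactor class, its method names, the two regex patterns, and the random 5-digit code format are all assumptions made for illustration. The idea is that the code is looked up per matched value rather than per rule, so distinct values get distinct codes and repeated values reuse the same code.

```python
import random
import re


class Redactor:
    """Hypothetical helper: replaces each regex match with prefix + 5-digit code."""

    def __init__(self, pattern, prefix):
        self.pattern = re.compile(pattern)
        self.prefix = prefix
        self.codes = {}  # matched text -> code already assigned to it

    def _code_for(self, matched_text):
        # Reuse the existing code if this exact text was seen before.
        if matched_text not in self.codes:
            used = set(self.codes.values())
            code = f"{random.randrange(100000):05d}"
            while code in used:  # never give two distinct values the same code
                code = f"{random.randrange(100000):05d}"
            self.codes[matched_text] = code
        return self.codes[matched_text]

    def redact(self, text):
        # The callback runs once per match, so each match is looked up on its own.
        return self.pattern.sub(
            lambda m: self.prefix + self._code_for(m.group(0)), text
        )


# Assumed date pattern (YYYY-MM-DD); the first and third dates are the same.
dates = Redactor(r"\d{4}-\d{2}-\d{2}", "date_")
print(dates.redact(
    "2001-10-12 : gift of 10,000\n"
    "2009-10-03: gift of 500\n"
    "2001-10-12: gift of 100"
))

# Assumed pattern for "any word starting with an uppercase letter".
words = Redactor(r"\b[A-Z]\w*", "text_")
print(words.redact("the client works at St. Vincents hospital on Tuesdays."))
```

In the first printout the first and third lines should share one code and the second line should carry a different one. In the second printout the three capitalised words should each carry a different code. The code values themselves are random and will differ between runs.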