INCREASED SPEED AND ACCURACY OF PERSONALLY IDENTIFIABLE INFORMATION REDACTION THROUGH AN AI-DRIVEN APPROACH

THE CLIENT

 

One of the world's top 20 insurance and reinsurance companies, located in 37 countries.

 

BUSINESS CHALLENGES

Redacting customers' Personally Identifiable Information (PII) from transaction records is required to comply with global privacy regulations such as GDPR, the New York Privacy Act, and the California Consumer Privacy Act, and to avoid lawsuits and hefty fines. The existing manual redaction process had the following challenges:

  • Time-consuming and prone to errors
  • Rule-based and unable to capture document variations
  • Routinely resulted in sub-optimal redaction

 

SOLUTION

 

Mphasis enabled the client to replace its traditional rule-based redaction approach with an AI- and ML-driven approach. The solution works on both unstructured and structured data sources and can be customized for different domains. The solution included:

  • Setting up an ML pipeline on the AWS cloud, using Amazon EC2, to run the approach repeatedly and at scale
  • Using Amazon Textract to detect the text to be redacted in documents such as financial reports and medical records
  • Using Amazon Comprehend to extract insights from the content to be redacted, such as relationships in text, key phrases, and named entities
  • Using Amazon Comprehend Medical to extract complex medical information from unstructured text
  • Building a UI with HTML, Flask, and Poppler-Utils to visualize the redaction results
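
The entity-detection and redaction step above can be sketched roughly as follows. This is a minimal illustration, not the client implementation: the helper names `redact` and `detect_pii`, the `min_score` threshold, and the choice of the Amazon Comprehend `detect_pii_entities` call are all assumptions made for the example. The sketch relies only on the fact that Comprehend returns entities with `Type`, `Score`, `BeginOffset`, and `EndOffset` fields.

```python
from typing import Dict, List


def redact(text: str, entities: List[Dict], min_score: float = 0.5) -> str:
    """Replace each detected span with a [TYPE] placeholder.

    `entities` is assumed to follow the shape Amazon Comprehend returns:
    dicts with Type, Score, BeginOffset and EndOffset keys.
    Overlapping spans are not handled in this sketch.
    """
    kept = [e for e in entities if e.get("Score", 1.0) >= min_score]
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(kept, key=lambda e: e["BeginOffset"], reverse=True):
        start, end = ent["BeginOffset"], ent["EndOffset"]
        text = text[:start] + "[" + ent["Type"] + "]" + text[end:]
    return text


def detect_pii(text: str, language: str = "en") -> List[Dict]:
    """Hypothetical wrapper around Amazon Comprehend (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so redact() can be used without AWS access

    client = boto3.client("comprehend")
    response = client.detect_pii_entities(Text=text, LanguageCode=language)
    return response["Entities"]


if __name__ == "__main__":
    sample = "Claimant John Doe, phone 555-0100."
    # Hand-written entities standing in for a Comprehend response.
    entities = [
        {"Type": "NAME", "Score": 0.99, "BeginOffset": 9, "EndOffset": 17},
        {"Type": "PHONE", "Score": 0.98, "BeginOffset": 25, "EndOffset": 33},
    ]
    print(redact(sample, entities))
```

Keeping the redaction logic separate from the service call means it can be tested offline, and the same function could be fed text that Amazon Textract extracted from scanned documents.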

 

BENEFITS

  • Reduction of manual errors by 60-70%
  • Reduction of manual time and effort in extraction and redaction by up to 80%
  • Protection of customers' personal and sensitive data
  • Conformance with regulatory compliance policies such as GDPR and HIPAA