
Cloud Data Platform Modernization for a Life Insurance Carrier

THE CLIENT

 

A mid-size life insurance carrier.

BUSINESS OBJECTIVE

The client sought to modernize its enterprise data ecosystem by building a cloud data platform that serves as a single source of truth for analytics, business intelligence, and predictive use cases. This required a comprehensive data management strategy that standardizes data collection, storage, processing, and consumption under strong governance and security while supporting innovation, with the goal of improving efficiency, data quality, and scalable access to data products across the enterprise.

THE SOLUTION

 

Mphasis addressed the client’s requirements with Databricks on AWS, delivered as a managed service, as a collaborative, unified analytics workspace that combines data engineering, advanced analytics, machine learning, and visualization. The platform pairs Apache Spark’s performance with the scalability and reliability of AWS to process and analyze large datasets in near real time, supporting data engineering, ML, and BI workflows.

Architecture Overview:


  • Data lake on Amazon S3, with access controlled through IAM and encryption through KMS; bronze, silver, and gold layers for curated data products
  • Orchestration with Databricks Jobs and/or AWS Step Functions & Lambda for event-driven pipelines
  • Databricks clusters for ELT/ETL, feature engineering, ML model training and serving
  • Operational stores on Amazon Aurora PostgreSQL and Amazon RDS MySQL for metadata, catalogs, and app configurations
  • Monitoring and audit with CloudWatch and CloudTrail; threat detection via GuardDuty; networking with VPC and Route 53
  • Dashboards and observability with Grafana; governance and best-practice guidance via AWS Trusted Advisor

  • Products and services used on AWS include: Databricks, Amazon S3 (data lake), Aurora PostgreSQL, RDS MySQL, Lambda, IAM, KMS, VPC, Route 53, CloudWatch, CloudTrail, GuardDuty, AWS Trusted Advisor, and Grafana.
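
To illustrate the bronze/silver/gold layering named in the architecture overview, here is a minimal sketch in plain Python. It is not the client's pipeline: the record fields (policy_id, product, annual_premium), the sample rows, and the function names to_silver and to_gold are all hypothetical, and a production version would most likely run as Spark jobs on Databricks against tables in S3. The sketch only shows the idea that raw records are validated and de-duplicated into a silver layer, then aggregated into a gold data product.

```python
from collections import defaultdict
from decimal import Decimal, InvalidOperation

# Hypothetical bronze records: raw rows as ingested, including bad data.
BRONZE = [
    {"policy_id": "P-001", "product": "term", "annual_premium": "1200.00"},
    {"policy_id": "P-002", "product": "Term ", "annual_premium": "950.50"},
    {"policy_id": "P-002", "product": "Term ", "annual_premium": "950.50"},  # duplicate
    {"policy_id": "P-003", "product": "whole", "annual_premium": "n/a"},     # bad premium
    {"policy_id": "", "product": "whole", "annual_premium": "2000.00"},      # missing key
]


def to_silver(bronze_rows):
    """Validate, normalize, and de-duplicate raw rows (bronze -> silver)."""
    seen_ids = set()
    silver_rows = []
    for row in bronze_rows:
        policy_id = row.get("policy_id", "").strip()
        if not policy_id or policy_id in seen_ids:
            continue  # drop rows without a key and repeated keys
        try:
            premium = Decimal(row.get("annual_premium", ""))
        except InvalidOperation:
            continue  # drop rows whose premium is not a valid number
        seen_ids.add(policy_id)
        silver_rows.append({
            "policy_id": policy_id,
            "product": row.get("product", "").strip().lower(),
            "annual_premium": premium,
        })
    return silver_rows


def to_gold(silver_rows):
    """Aggregate clean rows into a per-product summary (silver -> gold)."""
    summary = defaultdict(
        lambda: {"policies": 0, "total_annual_premium": Decimal("0")}
    )
    for row in silver_rows:
        bucket = summary[row["product"]]
        bucket["policies"] += 1
        bucket["total_annual_premium"] += row["annual_premium"]
    return dict(summary)


if __name__ == "__main__":
    silver = to_silver(BRONZE)
    gold = to_gold(silver)
    for product, stats in gold.items():
        print(product, stats["policies"], stats["total_annual_premium"])
```

Keeping the cleaning step separate from the aggregation step mirrors the layered design: consumers of the gold summary never see malformed or duplicated rows, and the silver layer can be reused for other data products.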

BUSINESS BENEFITS

Single source of truth with standardized, governed datasets for analytics, BI, and ML

Faster time-to-insight via reusable components, self-service access, and scalable Databricks compute

Improved data quality, lineage, and compliance through centralized governance and encryption

Cost and performance optimization using elastic compute and right-sized storage tiers

Foundation for predictive analytics and generative AI workloads across business domains