
A Leading US Bank Lowered Costs and Enabled Savings with Successful Data Migration and Reverse Engineering



The client is a leading US bank renowned for its commitment to excellence and its comprehensive financial solutions.


The client faced a significant challenge in their data warehousing and application modernization journey: migrating ETL jobs and data stores for data processing applications across lines of business (LOBs), a task complicated by missing documentation and complex business logic spread across applications.



Mphasis executed a detailed reverse engineering phase for each application data processing rewrite project, covering reconciliation jobs, the CCM engine, the treatment engine, SAS jobs, and data lake and DW jobs.

Despite the complex business logic and lack of documentation, we successfully rewrote a variety of data processing jobs originally developed in Ab-Initio and Informatica into Spark and tested them end to end.

We leveraged Apache Spark as the data processing framework and Java as the implementation language, deploying the Java-Spark jobs on AWS EMR and on Spark clusters running on Kubernetes and private cloud.

Through this project, we:

  • Converted all the mapping and business logic in Ab-Initio and Informatica to the new Spark-based framework, ensuring that the converted jobs were efficient and effective.
  • Created detailed documents containing business logic and plans. Leveraging our expertise in Java Spark, we implemented a generic framework for data processing, which allowed us to streamline workflows and enhance performance.
  • Addressed the challenge of longer test cycles caused by multiple job dependencies, keeping the parallel environment in sync to test the converted ETL jobs.
  • Followed industry best practices to migrate data from the application data store to cloud databases (AWS-RDS Postgres) and ensured that data accuracy and completeness were maintained throughout the process.
  • Unit tested the converted jobs in the development environment and performed integration testing in the SIT environment. The results were compared with a scaled-down legacy environment that was kept in sync by feeding the same input files.
  • Built a parallel production environment and kept it in sync with the existing production by performing parallel processing with converted ETLs, ensuring that there were no disruptions to our client's operations.
  • Set in place a detailed cutover plan and cutover testing to ensure successful deployment and production cutover, minimizing the risk of any issues arising during the transition.
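The generic data processing framework mentioned above can be pictured as a job defined by a declarative list of transformation steps applied uniformly to rows of data. A minimal sketch follows, written in Python for brevity (the client implementation was in Java Spark); all function and field names here are hypothetical illustrations, not the actual framework API.

```python
# Minimal sketch of a generic data-processing framework: each job is a
# list of transformation steps applied, in order, to every row.
# All names are illustrative; the real framework was built in Java Spark.

def uppercase_name(row):
    """Example business-logic step: normalize a name field."""
    row["name"] = row["name"].upper()
    return row

def amount_to_cents(row):
    """Example mapping step: convert a dollar amount to integer cents."""
    row["amount_cents"] = int(round(row["amount"] * 100))
    return row

def run_pipeline(rows, steps):
    """Apply each transformation step to every row, in declared order."""
    for step in steps:
        rows = [step(dict(row)) for row in rows]
    return rows

records = [{"name": "alice", "amount": 12.5}]
result = run_pipeline(records, [uppercase_name, amount_to_cents])
```

Because jobs differ only in their step lists, new conversions reuse the same execution, logging, and deployment machinery, which is what keeps the rewritten jobs consistent across LOBs.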
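Comparing converted-job output against the in-sync legacy environment is essentially a reconciliation of two row sets. A simple way to do this, sketched below in Python under assumed row and key names (the actual comparison tooling is not described in the source), is to checksum each row and diff the two sides by key.

```python
import hashlib

def row_checksum(row):
    # Canonical, field-order-independent representation of a row.
    canonical = "|".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(canonical.encode()).hexdigest()

def reconcile(legacy_rows, converted_rows, key):
    """Diff two outputs fed the same input files, keyed by a business key."""
    legacy = {r[key]: row_checksum(r) for r in legacy_rows}
    converted = {r[key]: row_checksum(r) for r in converted_rows}
    return {
        "missing": sorted(set(legacy) - set(converted)),      # only in legacy
        "extra": sorted(set(converted) - set(legacy)),        # only in converted
        "mismatched": sorted(k for k in legacy.keys() & converted.keys()
                             if legacy[k] != converted[k]),   # same key, different data
    }
```

An empty report on every run is the signal that the converted ETL matches legacy behavior and the cutover can proceed.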

Technologies used: AWS Cloud | S3 | EMR | AKS | Kubernetes Private Cloud | Terraform | Jenkins


Future-proofed multiple critical applications with data processing layers by migrating them to Spark on AWS and private cloud.

Achieved significant license cost savings by decommissioning Informatica and Ab-Initio and implementing the ETLs in Java Spark on AWS Cloud. This helped the client reduce operational costs and reallocate resources to areas that drive business growth.

Reduced compute costs by implementing jobs as step jobs on a serverless EMR. This helped to optimize workflows and enhance performance while reducing costs.
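Running jobs as steps on serverless EMR means paying only while a job runs. A job submission to EMR Serverless can be sketched as the request payload below; the application ID, role ARN, S3 paths, and class name are all placeholders, and an actual submission would pass this payload to boto3's `emr-serverless` client via `start_job_run`.

```python
# Sketch of an EMR Serverless Spark job-run request. All identifiers are
# placeholders; a real submission would call
#   boto3.client("emr-serverless").start_job_run(**request)
request = {
    "applicationId": "app-0123456789abcdef",  # placeholder application ID
    "executionRoleArn": "arn:aws:iam::123456789012:role/emr-job-role",  # placeholder
    "jobDriver": {
        "sparkSubmit": {
            "entryPoint": "s3://example-bucket/jobs/etl-job.jar",  # placeholder path
            "entryPointArguments": ["--run-date", "2024-01-01"],
            "sparkSubmitParameters": "--class com.example.EtlJob",  # hypothetical class
        }
    },
}
```

Because the application scales to zero between runs, scheduling each converted ETL as its own job run avoids paying for an always-on cluster.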

Lowered maintenance and management overhead for data processing jobs across LOBs by implementing the ETL jobs on a common data processing framework, which streamlined workflows and reduced complexity, allowing the client to focus on their core business.