The client sells investment strategies and advice through multiple TAMP platforms. In the absence of any in-house system, AUM calculations and sales commissions were driven by the reports and data supplied by these TAMP platforms. That data had inherent problems: poor quality, redundancy, non-conformity to a common structure, and at times missing records. The result was a great deal of manual work and rework, frequently incorrect commission calculations, under- or over-reported AUM, and incorrect performance measurement of strategies.
Coforge proposed developing an in-house system to address the following:
Sales/AUM data is sourced mostly from multiple TAMP systems, while Salesforce is the master reference data source for advisors, investment consultants, and other parties.
The data needed to be quality-checked first and then conformed to a common structure, giving a consolidated AUM view at account and platform level. The reference data also needed to be refreshed in real time, with users able to manage or override mappings and run calculations each month to prepare commission payout reports. The Daily Sales report had to reach end users automatically through Tableau on the web.
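The monthly commission payout step can be sketched as a tiered rate applied to each advisor's AUM. The tier thresholds, basis-point rates, and monthly proration below are illustrative assumptions, not the client's actual payout schedule:

```python
# Illustrative commission calculation; the tiers and rates are
# hypothetical, not the client's real payout schedule.
RATE_TIERS = [            # (minimum AUM for tier, annual basis points)
    (10_000_000, 25),
    (1_000_000, 35),
    (0, 50),
]

def monthly_commission(aum: float) -> float:
    """Pick the basis-point rate for the AUM tier and prorate the
    annual fee to one month."""
    for threshold, bps in RATE_TIERS:
        if aum >= threshold:
            return round(aum * bps / 10_000 / 12, 2)
    return 0.0

print(monthly_commission(2_500_000))  # 729.17 (35 bps tier)
```

In the actual system this calculation runs each month over the consolidated AUM data and feeds the commission payout reports.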
Ingestion Layer: In this stage, the data in each feed file is checked for data quality, conformance to structure (using the JSON config files), and tolerance thresholds.
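A config-driven quality check of this kind can be sketched as follows. The field names and tolerance bounds in the sample config are assumptions for illustration; the real checks run in Spark-Scala against the client's JSON config files:

```python
import json

# Hypothetical feed config, similar in spirit to the JSON config files
# used by the ingestion layer (fields and tolerances are illustrative).
FEED_CONFIG = json.loads("""
{
  "required_fields": ["account_id", "platform", "aum"],
  "numeric_fields": ["aum"],
  "tolerance": {"aum": {"min": 0, "max": 1e12}}
}
""")

def quality_check(record, config):
    """Return a list of data-quality errors for one feed record."""
    errors = []
    for field in config["required_fields"]:
        if record.get(field) in (None, ""):
            errors.append(f"missing field: {field}")
    for field in config["numeric_fields"]:
        value = record.get(field)
        if value in (None, ""):
            continue  # absence already reported above
        try:
            value = float(value)
        except (TypeError, ValueError):
            errors.append(f"non-numeric value in {field}: {value!r}")
            continue
        bounds = config["tolerance"].get(field)
        if bounds and not (bounds["min"] <= value <= bounds["max"]):
            errors.append(f"{field} out of tolerance: {value}")
    return errors

print(quality_check(
    {"account_id": "A-100", "platform": "TAMP-1", "aum": "250000.00"},
    FEED_CONFIG))  # []
print(quality_check(
    {"account_id": "", "platform": "TAMP-1", "aum": "-50"},
    FEED_CONFIG))  # two errors: missing account_id, aum out of tolerance
```

Records that fail these checks are the ones routed to the exception workflow described below.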
Exception/Workflow: Any exceptions raised during ingestion and transformation are captured and raised in an Activiti workflow for end users to take appropriate action.
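Conceptually, each exception becomes a work item that stays open until a user acts on it. The sketch below uses a minimal in-memory stand-in; the real system raises these items into Activiti (typically via its Java or REST API), so all names here are illustrative:

```python
import datetime

class ExceptionQueue:
    """Minimal stand-in for raising ingestion/transformation exceptions
    into a workflow engine such as Activiti."""

    def __init__(self):
        self.items = []

    def raise_exception(self, feed, record_id, reason):
        # Each exception becomes an open work item for an end user.
        item = {
            "feed": feed,
            "record_id": record_id,
            "reason": reason,
            "raised_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "status": "OPEN",
        }
        self.items.append(item)
        return item

    def resolve(self, record_id, action):
        # The user's action (fix, override mapping, reject) closes the item.
        for item in self.items:
            if item["record_id"] == record_id and item["status"] == "OPEN":
                item["status"] = "RESOLVED"
                item["action"] = action
                return item
        return None

queue = ExceptionQueue()
queue.raise_exception("TAMP-1 daily sales", "row-42", "aum out of tolerance")
queue.resolve("row-42", "override mapping")
```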
Transformation Layer: In this layer the data is transformed: it is cleansed, mapped to the respective reference data, consolidated, and finally made available in S3 for production use.
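The reference-data mapping and consolidation step can be sketched as a roll-up keyed by platform and account. The reference mapping, field names, and figures below are hypothetical; in the actual solution this runs in Pentaho against the Salesforce master data:

```python
from collections import defaultdict

# Hypothetical reference data mapping (platform, account) -> advisor;
# the real master data comes from Salesforce.
REFERENCE = {
    ("TAMP-1", "A-100"): "Advisor X",
    ("TAMP-2", "B-200"): "Advisor X",
}

def consolidate_aum(records, reference):
    """Map each cleansed record to its advisor via reference data and
    roll AUM up at platform level and advisor level."""
    by_platform = defaultdict(float)
    by_advisor = defaultdict(float)
    for r in records:
        advisor = reference.get((r["platform"], r["account_id"]), "UNMAPPED")
        by_platform[r["platform"]] += r["aum"]
        by_advisor[advisor] += r["aum"]
    return dict(by_platform), dict(by_advisor)

records = [
    {"platform": "TAMP-1", "account_id": "A-100", "aum": 250_000.0},
    {"platform": "TAMP-2", "account_id": "B-200", "aum": 100_000.0},
]
platforms, advisors = consolidate_aum(records, REFERENCE)
print(platforms)  # {'TAMP-1': 250000.0, 'TAMP-2': 100000.0}
print(advisors)   # {'Advisor X': 350000.0}
```

Records falling into the "UNMAPPED" bucket are exactly the cases where users would override mappings through the UI.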
Reporting Layer: This layer, built in Tableau, uses Athena to query the S3 data and presents the Daily Sales view and the consolidated AUM view.
The snippet above is from the architecture diagram and shows how the different open-source components are integrated in the solution. The web UI was designed in house using Angular, with a middle tier in Spark Java. Spark-Scala is used extensively in the ingestion layer, while the rest of the transformation is done using Pentaho. Reports are served through Tableau Server and Tableau on the web. The infrastructure can be managed through the Ambari view, i.e., memory management, adding new nodes, and monitoring. Overall, the system is easy to manage, easy to extend with new functions, and scales up to handle large data volumes.
Big Data – Spark Scala, HDFS, HIVE
ETL - Pentaho
Workflow - Activiti
Data Visualization - Tableau Server and Web
AWS – S3, EC2, ECS, Athena, CI/CD
Future-ready goal achieved - In the end, the client has a completely in-house system that caters to all requirements for data quality, data consolidation, automated sales/commission calculation, and AUM reporting.
Configurable - The system is fully configurable: new feeds can be on-boarded, quality checks added, and ETL changes made with minimal or no development effort.
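As an illustration of this configurability, on-boarding a new feed could amount to dropping in a config file along these lines. The shape shown is a hypothetical sketch, not the client's actual config schema:

```json
{
  "feed_name": "tamp1_daily_sales",
  "delimiter": ",",
  "required_fields": ["account_id", "platform", "trade_date", "aum"],
  "quality_checks": {
    "aum": {"type": "numeric", "min": 0},
    "trade_date": {"type": "date", "format": "yyyy-MM-dd"}
  }
}
```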
Low Cost – Given the complexity and data volumes involved and the desired long-term goal, the whole project was delivered at a fraction of the cost footprint of comparable enterprise solutions.
Open-Source & Scalable Solution - The whole project was built using open-source systems, all offered on the AWS cloud, and scales to handle any data volume.