
Data Lakehouse on AWS Databricks

Objective

Enable the airline to ingest heterogeneous and diverse data sources into the newly designed data lakehouse and the EDW using Databricks technology, to meet enterprise analytical needs for faster and more robust business decision-making.

Our Solution

  • Design and development of the Data Lakehouse and the EDW on the AWS Databricks platform
  • Data ingestion from heterogeneous sources such as Greenplum and Oracle, and reading of different file types such as CSV, JSON, and fixed-width (imp)
  • Delta and Hive table creation using Spark and Spark SQL
  • Design of four data layers: Bronze, Silver, Silver-derived, and Gold
  • De-duplication, cleansing, and SCD implementation at the Silver and Silver-derived levels
  • Implementation of all required business use cases
  • Visualization and Dashboard creation
  • Coforge's role was to implement the end-to-end solution, from setting up the AWS Databricks environment through staging, ingesting, enriching, and curating the data to publishing it.
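
The de-duplication and SCD steps listed above can be illustrated with a minimal sketch. It is written in plain Python rather than Spark or Delta Lake so that it runs anywhere, and it rests on several assumptions: the case study does not say which SCD type was used, so Type 2 (close the old row, append a new one) is assumed here, and every name in it (`DimRow`, `dedupe_latest`, `apply_scd2`, the `key`, `attrs`, and `loaded_at` fields) is hypothetical rather than taken from the actual implementation.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DimRow:
    """One version of a dimension record (hypothetical layout)."""
    key: str                          # business key, e.g. a customer id
    attrs: tuple                      # tracked attributes as sorted (name, value) pairs
    valid_from: date
    valid_to: Optional[date] = None   # None means "still current"


def dedupe_latest(records):
    """Keep one record per key: the one with the highest loaded_at value."""
    latest = {}
    for rec in records:
        key = rec["key"]
        if key not in latest or rec["loaded_at"] > latest[key]["loaded_at"]:
            latest[key] = rec
    return list(latest.values())


def apply_scd2(dimension, incoming, as_of):
    """Return a new dimension list with assumed SCD Type 2 changes applied."""
    current = {row.key: row for row in dimension if row.valid_to is None}
    result = list(dimension)
    for rec in incoming:
        new_attrs = tuple(sorted(rec["attrs"].items()))
        existing = current.get(rec["key"])
        if existing is not None and existing.attrs == new_attrs:
            continue  # nothing changed, keep the current version open
        if existing is not None:
            # Close the current version instead of overwriting it.
            idx = result.index(existing)
            result[idx] = replace(existing, valid_to=as_of)
        # Append the new version as the open (current) row.
        result.append(DimRow(rec["key"], new_attrs, as_of))
    return result


# Illustrative data only: two loads for C1 (one is a stale duplicate) and one for C2.
raw = [
    {"key": "C1", "attrs": {"tier": "silver"}, "loaded_at": 1},
    {"key": "C1", "attrs": {"tier": "gold"}, "loaded_at": 2},
    {"key": "C2", "attrs": {"tier": "blue"}, "loaded_at": 1},
]
clean = dedupe_latest(raw)
dim = [DimRow("C1", (("tier", "silver"),), date(2020, 1, 1))]
updated = apply_scd2(dim, clean, date(2021, 1, 1))
```

In this sketch the history is preserved because a changed record closes the previous version (`valid_to` is set) and adds a new open one, and re-running the same load leaves the dimension unchanged. On Databricks the same idea would typically be expressed as a merge against a Delta table, but the exact approach used in this project is not described in the source.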

Impact

  • Data Lakehouse and Enterprise Data Warehouse for all Enterprise DSS needs
  • Enterprise-wide cleansed, de-duplicated, and analytics-ready data
  • Managed current data along with well-managed historical data
  • Robust data quality and data governance processes that provided assurance and improved adoption of the platform
  • Use-case-driven, actionable roadmap to achieve incremental benefits
  • An enterprise-wide, future-ready, scalable, and robust analytical workbench for impactful business decision-making
