Logical Data Warehouse and its relevance in today’s information architecture

By Dipanjan Majumder

Inmon’s “Building the Data Warehouse”, first published in 1992, and Kimball’s “The Data Warehouse Toolkit”, published in 1996, inspired enterprises globally to embark on data warehousing initiatives, a journey that is still ongoing. But as demand for information increased and sources of data proliferated, an enterprise-wide information architecture that provided a single source of truth while staying flexible enough to reflect evolving business realities remained a distant dream for most.

Over the past few decades, the way enterprises use data to derive business benefits has evolved significantly. From running MIS and providing daily and monthly reporting to deriving insights that can impact customer experience and business operations on an ongoing basis, enterprises have journeyed through an evolution of their information architecture.

This evolution has not only impacted the way data is being captured, stored and managed but also the way information is being consumed.

From a usage perspective, two major changes are being observed. First, data is no longer used only for descriptive, predictive or prescriptive analytics; it is also being monetized, which requires mechanisms to share data or insights externally through a data marketplace framework. Second, as analytics gets embedded across multiple functions within the enterprise, data from a single source of truth results in multiple versions of the truth as business-specific transformations are applied.

Key challenges in a DW architecture

While the data warehouse remains central to an organization’s information architecture, its repository-style architecture is preventing the flexibility and agility that are demanded as requirements for data and analytics evolve. Research on this topic indicates that changes to the warehouse data model take anywhere from six weeks to a few months. In the current context the business can rarely wait that long, and the delay may render the analysis ineffective.

In the past, high data volumes, diverse requirements and complex data structures have been handled through a combination of multiple data warehouses, data marts, data lakes, multi-dimensional cubes and other data storage technologies (document databases, NoSQL databases and so on). While these approaches helped meet the requirements of the business, over time the complexity of managing change and ensuring consistency has become a significant challenge.

At Coforge, we believe that enterprises can overcome such challenges and increase trust in data by taking a layered architecture approach to information management.

Multi-layered architecture approach

A data warehouse architecture can be visualized as a set of layers: a systems layer (hardware servers and networks), a software layer (database management systems and other tools), a storage layer where data is persisted, and finally a topmost layer that manages metadata, security and semantics and organizes information so that business users can access data easily.

What is a Logical Data Warehouse?

A Logical Data Warehouse (LDW) is the topmost layer of a well-architected data warehouse. It is an enhancement of existing DW processes, not a replacement for them. Whereas a DW contains a single ontology or taxonomy, an LDW enables the creation of a semantic layer that can contain a multitude of taxonomies, that is, multiple business definitions of the same information. In addition, the LDW layer enables access to a multitude of datasets, thereby increasing agility in exposing various datasets for analytical purposes.
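As a rough illustration of a semantic layer holding several business definitions of the same information, the sketch below defines two views over one physical table. It uses Python's built-in sqlite3 module purely as a stand-in for a data virtualization tool; the table, the view names and the two definitions of an "active customer" are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One physical table (hypothetical sample data)
    CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, '2024-01-15', 120.0),
        (2, '2024-03-02', 40.0),
        (3, '2023-06-20', 900.0);

    -- Assumed marketing definition: placed an order on or after 2024-01-01
    CREATE VIEW marketing_active_customers AS
        SELECT DISTINCT customer_id FROM orders
        WHERE order_date >= '2024-01-01';

    -- Assumed finance definition: lifetime spend of at least 100
    CREATE VIEW finance_active_customers AS
        SELECT customer_id FROM orders
        GROUP BY customer_id HAVING SUM(amount) >= 100;
""")

def customer_ids(view_name):
    # view_name comes from the fixed names above, not from user input
    rows = conn.execute(f"SELECT customer_id FROM {view_name} ORDER BY customer_id")
    return [row[0] for row in rows]

marketing = customer_ids("marketing_active_customers")
finance = customer_ids("finance_active_customers")
print(marketing, finance)
```

Both views read the same stored rows, so each function gets its own definition without a second copy of the data being created.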

The key capabilities of an LDW

  • A modern data services layer capable of connecting to a diverse set of applications and database management systems
  • Ability to connect across on-premises and cloud applications
  • Query optimisation and in-memory parallel processing capability
  • Data Catalog and discovery
  • Data modelling
  • Data Governance and security that can be applied across the entire landscape
  • User experience that meets the needs of both business and IT
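To make the first capability more concrete, here is a minimal sketch of one logical query spanning two separate databases without copying rows between them. SQLite's ATTACH DATABASE is used only as a small-scale analogy for a data services layer; the "main" database standing in for the warehouse, the "crm" database standing in for a cloud application, and all table and column names are assumptions made for this example.

```python
import sqlite3

# Hypothetical setup: "main" plays the existing warehouse and
# "crm" plays a separate cloud application database.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.executescript("""
    CREATE TABLE main.sales (customer_id INTEGER, amount REAL);
    INSERT INTO main.sales VALUES (1, 250.0), (1, 100.0), (2, 75.0);

    CREATE TABLE crm.contacts (customer_id INTEGER, segment TEXT);
    INSERT INTO crm.contacts VALUES (1, 'enterprise'), (2, 'retail');
""")

# One query joins both sources; each table stays in its own database.
revenue_by_segment = conn.execute("""
    SELECT c.segment, SUM(s.amount)
    FROM main.sales AS s
    JOIN crm.contacts AS c ON c.customer_id = s.customer_id
    GROUP BY c.segment
    ORDER BY c.segment
""").fetchall()
print(revenue_by_segment)
```

A production LDW would typically add query optimisation, caching and connectors for many more source types; the sketch shows only the basic idea of querying in place.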

Some of the use cases that are made feasible through an LDW architecture include:

  • Creating a disposable architecture that enables experimentation as part of an innovation journey. Analysts can quickly validate the efficacy of certain data sets with respect to their analytical models and decide to replace or continue with specific sources of data.
  • Prototyping prior to integration and physical transfer of data. This allows project teams to interactively refine requirements before making changes to existing DW data models and ensures rapid deployment. Business users can start leveraging the new data sets while the physical modifications are being carried out in the persistence layer.
  • Supporting analytical operations while a legacy application migration is being carried out. These migration projects typically have a long turnaround time, and business analytics often gets disrupted until the new applications reach stability.
  • Enforcing consistent access control and data security policies. Because the LDW is the common layer through which internal and external systems access data, policies can be applied in one place.
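The idea of enforcing access control at a common layer can be sketched as follows. The `LogicalLayer` class, the role names and the datasets are hypothetical and do not represent any particular product's API; the point is only that a single column-level policy applies regardless of which underlying source holds the data.

```python
from dataclasses import dataclass, field

@dataclass
class LogicalLayer:
    """Hypothetical single access point in front of several data sources."""
    sources: dict = field(default_factory=dict)   # dataset name -> list of row dicts
    policies: dict = field(default_factory=dict)  # role -> set of readable column names

    def read(self, role, dataset):
        # The same column policy applies whichever source holds the dataset.
        allowed = self.policies.get(role, set())
        return [
            {col: val for col, val in row.items() if col in allowed}
            for row in self.sources[dataset]
        ]

layer = LogicalLayer(
    sources={
        "warehouse_customers": [{"id": 1, "email": "a@example.com", "region": "EU"}],
        "cloud_leads": [{"id": 7, "email": "b@example.com", "region": "US"}],
    },
    policies={
        "analyst": {"id", "region"},           # assumed policy: no personal data
        "steward": {"id", "email", "region"},  # assumed policy: full access
    },
)

analyst_view = layer.read("analyst", "cloud_leads")
steward_view = layer.read("steward", "warehouse_customers")
print(analyst_view, steward_view)
```

Because every consumer goes through `read`, changing a policy in one place changes what is visible across all sources, which is the property the use case relies on.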

Drawing on our experience of implementing such architectures, we at Coforge have brought together a library of use cases like these that can be delivered more effectively with a logical data warehouse architecture.

In Summary

A multi-layered approach with a well-architected logical data warehouse, one that enables access to all internal and external data sets without the need to physically transform and migrate data, will be critical for enterprises that face challenges around agility in supporting business analytics. A consistent mechanism for applying data-related security and governance controls will help meet the requirements of data privacy regulations such as GDPR.

When implemented in the right manner, this is a transformation initiative that causes the least disruption to users’ access to information and analytics.