Since the dawn of business, users have looked for three main capabilities when it comes to data: Search | Secure | Share. Now let's talk about the evolution of data over the years, which is a story in itself. Early applications were created to handle a set of processes or tasks. These processes and tasks, when grouped logically, became a sub-function; a set of sub-functions constituted a function; and a set of functions made up an enterprise.
Phase 1 – Data-Aware
Let's go back to the days before BI was called BI, when it was referred to as "Decision Support Systems". The predominant use case was "Operational Reporting", a term that prevails even today. Operational Reporting is a siloed way of looking at data purely for observation, for example, an operational report on the number of hours clocked by shift workers during a work day. Let's refer to this phase as Data-Aware: there is an awareness that the data is available and can be collated with some manual intervention. Other concerns, such as timeliness, accuracy, and the manual effort (no automation) needed to collate the data, can be set aside for this phase of data evolution.
Phase 2 – Data-Informed
The second phase involved a little more intricacy: data from different applications was brought together. For example, employee data from an HRMS application (a custom app) was linked to manufacturing facility data (shift data available in ERP/MRP) to arrive at the cost per shift. When enterprise data was brought together in this way, a couple of new terms evolved: the store became the "Data Warehouse" and the reporting became "Business Intelligence". Various philosophies on how the Data Warehouse should be modeled were established, such as those of Bill Inmon and Ralph Kimball. The data remained in an RDBMS, eventually making way for MPP databases. Newer tools and technologies emerged in the aggregation space (ETL) and the reporting space (BI), along with concepts around Master Data, Data Quality, and Data Governance. This phase used data aggregation primarily to produce information such as gross revenues and margins, so it may be referred to as Data-Informed. The data remained internal to the enterprise and was used to keep stakeholders informed rather than to make decisions.
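To make the cost-per-shift example concrete, here is a minimal sketch of that kind of aggregation in plain Python. The record layouts, field names (`employee_id`, `hourly_rate`, `shift_id`, `hours`), and sample values are all assumptions made up for illustration; a real HRMS or ERP/MRP extract would look different, and the join would normally run in an ETL tool or a warehouse rather than in application code.

```python
from collections import defaultdict

# Hypothetical extract from an HRMS application: hourly rate per employee.
hrms_employees = [
    {"employee_id": "E1", "hourly_rate": 20.0},
    {"employee_id": "E2", "hourly_rate": 25.0},
    {"employee_id": "E3", "hourly_rate": 30.0},
]

# Hypothetical extract from an ERP/MRP system: hours clocked per shift.
erp_shift_hours = [
    {"shift_id": "DAY", "employee_id": "E1", "hours": 8.0},
    {"shift_id": "DAY", "employee_id": "E2", "hours": 8.0},
    {"shift_id": "NIGHT", "employee_id": "E3", "hours": 6.0},
    {"shift_id": "NIGHT", "employee_id": "E1", "hours": 4.0},
]


def cost_per_shift(employees, shift_hours):
    """Join the two sources on employee_id and total the labor cost per shift."""
    # Build a lookup so each shift row can find its employee's rate.
    rate_by_employee = {e["employee_id"]: e["hourly_rate"] for e in employees}

    totals = defaultdict(float)
    for row in shift_hours:
        rate = rate_by_employee[row["employee_id"]]
        totals[row["shift_id"]] += rate * row["hours"]
    return dict(totals)


shift_costs = cost_per_shift(hrms_employees, erp_shift_hours)
print(shift_costs)  # {'DAY': 360.0, 'NIGHT': 260.0}
```

The point of the sketch is the shape of the work: neither source can answer "what does a shift cost?" on its own, and the answer only appears once the two are linked on a shared key.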
Phase 3 – Data-Enabled
The next phase saw the emergence of business-specific External Data, for example, DUNS Number and LexisNexis data, that helped enrich enterprise data such as Customer and Product, as well as competitive data that further enhanced forecasts, be it for sales, demand, or margins. This phase also witnessed the emergence of Big Data, as the industry encountered large, complex data sets (Internal & External data) that MPPs and RDBMS couldn't handle. A new term evolved: "Data Lake". Again, multiple philosophies emerged around the data lake concept.
On the tools and technologies side, newer and faster ELT concepts and AI/ML emerged. AI/ML began to be used for Data Quality, and early adopters of Data Operations appeared as DevOps was implemented in the data space for the first time, giving rise to a practice around Data Ops, Observability, and Action. In this phase, because Internal & External data were merged, business users were enabled to make decisions using data. This phase can be referred to as Data-Enabled: AI/ML was established for decision making from a predictive and prescriptive standpoint.
Phase 4 – Data-Driven
Today, with Digital Transformation and the emergence of Cloud as the technology mainstay, a new term has evolved: Data Product.
Let's discuss the 3 concepts of Search | Secure | Share a little more to understand the Data Product. Searchability refers to the ability to search data sets within the data eco-system using metadata concepts and data catalogs. Security refers to the ability to secure data in motion and at rest from two perspectives: protecting sensitive data (PII/PCI/PHI) and controlling access (ABAC, RBAC, and FGA). Shareability refers to the ability to collaborate with other business users through the chosen consumption channel.
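As a rough illustration of how the three concepts fit together, the sketch below models a tiny in-memory data catalog in Python. Everything in it is hypothetical: the `DataProduct` and `User` classes, the sample catalog entries, and the `search`, `can_access`, and `share` helpers are invented for this example and do not describe any particular product's API. Encryption in motion and at rest, and fine-grained (row or column level) access, are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    name: str
    domain: str
    tags: set
    contains_pii: bool = False
    allowed_roles: set = field(default_factory=set)


@dataclass
class User:
    name: str
    role: str
    pii_cleared: bool = False


# Hypothetical catalog entries (metadata only, no actual data).
CATALOG = [
    DataProduct("lead_funnel", "sales_marketing", {"leads", "conversion"},
                False, {"analyst", "manager"}),
    DataProduct("customer_360", "sales_marketing", {"customer", "leads"},
                True, {"manager"}),
    DataProduct("shift_cost", "manufacturing", {"shift", "cost"},
                False, {"analyst"}),
]


def search(catalog, keyword):
    """Search: match a keyword against product names and metadata tags."""
    keyword = keyword.lower()
    return [p for p in catalog if keyword in p.name.lower() or keyword in p.tags]


def can_access(user, product):
    """Secure: combine a role check (RBAC) with an attribute check (ABAC)."""
    role_ok = user.role in product.allowed_roles
    attribute_ok = user.pii_cleared or not product.contains_pii
    return role_ok and attribute_ok


def share(owner, recipient, product):
    """Share: allowed only when both parties may access the product."""
    return can_access(owner, product) and can_access(recipient, product)


analyst = User("asha", "analyst")
manager = User("ravi", "manager", pii_cleared=True)

found = [p.name for p in search(CATALOG, "leads")]
print(found)                                 # ['lead_funnel', 'customer_360']
print(share(manager, analyst, CATALOG[0]))   # True
print(share(manager, analyst, CATALOG[1]))   # False
```

Note that search works on metadata alone, so a user can discover that a data product exists even when the security check would stop them from opening or sharing it. Whether discovery should also be restricted is a design choice this sketch does not take a position on.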
To understand the concept of a Data Product further, let's go back to the original thought of why applications were created. Each process or task that an application helped with corresponds to a Data Product. A collection of data products makes up a sub-function, and so on. Now imagine a scenario where a business user within the Sales and Marketing domain can look at the various sub-functions, and the data products associated with each sub-function, with the 3 concepts of Search | Secure | Share enabled. When that happens, we can call an enterprise Data-Driven. Hence this phase of data evolution can be referred to as Data-Driven.
We are all currently witnessing an era of ever-improving cloud eco-systems and better ELT processes, along with a Data Ops way of ensuring that every aspect of the data eco-system is observed. Each observation is either acted upon or logged as learning for an AI/ML job that runs nightly, as a retrofit, to look for changes (whether minor or major) in the eco-system from a Data Ops perspective.
Trianz Extrica – Helping Organizations Become Data-Driven
We at Trianz help organizations become Data-Driven by enabling the creation of Data Products through our product Extrica, a native AWS-enabled product.
To learn more about Extrica, please connect with [email protected]