Practitioners in the data realm have worked through a succession of terms over the years. It started with "Decision Support Systems", followed by "Data Warehouse", "Data Marts", "Data Lakes", "Data Fabric", and "Data Mesh". Alongside these came storage technologies and formats such as RDBMS, MPP, Big Data, Blob, Parquet, and Iceberg, and data collection, consolidation, and consumption patterns that evolved with the technology.
While the underlying need has always been 'Connecting People to Data', technology changes have pushed practitioners to think in terms of three facets: Search, Secure, and Share. We are often asked questions such as whether an enterprise needs to create a data lake or migrate from on-prem to cloud. All of these questions are valid, but a few broad points need to be kept in perspective while addressing them:
Organizations have invested in tools and technologies over the years, and those investments cannot be ignored.
Companies have defined ways of working when it comes to processing, security, deployment, etc.
Enterprises are most likely on a transformation journey or about to start one. It can range from migration to cloud (either lift and shift or transformation) to modernization or shifting left to adopt newer technical practices.
Most enterprises have implemented either a data warehouse (predominantly on-prem) or a data lake. Some have used virtualization techniques to create a data fabric, and quite a few have migrated to cloud (again predominantly lift and shift). This pattern is familiar to most practitioners, who come across the same scenario in enterprise after enterprise in which they consult.
The Data Mesh philosophy addresses this situation effectively and efficiently. We at Trianz offer Extrica, a configurable Data Mesh Platform built on AWS. The key philosophy around Extrica is Search | Secure | Share. The core features include:
Data is published as a Product -> Data Producer
Data is governed in motion and at rest -> Data Producer
Data is owned by the Domain owner -> Data Producer
Data is available everywhere (Searchable, Self-service, Secured) -> Data Consumer
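To make the producer and consumer roles above concrete, here is a minimal sketch of data published as a product by a domain owner and then found by a consumer through search. The `DataProduct` and `Catalog` names, their fields, and the sample entries are hypothetical and chosen only for illustration; this is not Extrica's API.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor a data producer publishes for one data product."""
    name: str
    domain: str        # business domain that owns the data
    owner: str         # accountable domain owner
    description: str
    tags: list = field(default_factory=list)


class Catalog:
    """Hypothetical in-memory catalog that consumers search."""

    def __init__(self):
        self._products = {}

    def publish(self, product):
        # Producer side: register (or overwrite) a product under its name.
        self._products[product.name] = product

    def search(self, keyword):
        # Consumer side: case-insensitive match on name, description, or tags.
        needle = keyword.lower()
        return [
            p for p in self._products.values()
            if needle in p.name.lower()
            or needle in p.description.lower()
            or any(needle in t.lower() for t in p.tags)
        ]


catalog = Catalog()
catalog.publish(DataProduct(
    name="customer_churn",
    domain="marketing",
    owner="marketing-data-team",
    description="Monthly churn indicators per customer",
    tags=["customer", "retention"],
))
catalog.publish(DataProduct(
    name="invoice_ledger",
    domain="finance",
    owner="finance-data-team",
    description="Posted invoices by legal entity",
    tags=["billing"],
))

matches = [p.name for p in catalog.search("churn")]
print(matches)
```

In this sketch ownership stays with the domain (the `owner` field), while discovery is centralized in the catalog, which mirrors the producer and consumer split listed above.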
With the above thoughts on Extrica, one may ask whether we still need to create a data lake or data fabric. We as practitioners have to bear in mind the following:
Cloud Strategy -> Most enterprises are moving to cloud, both from an application and data perspective. In the application space, there is bound to be modernization of the applications, integration tools, etc., post the migration to cloud. This forces enterprises to think about the modernization of the data ecosystem.
Data Strategy -> Most enterprises are moving their on-prem data ecosystems to cloud and realize that a lift and shift may only solve the issue temporarily. They must define a cloud data strategy, which may or may not involve a data lake or data fabric.
Enterprises have invested in Data Governance and Data Security tools. The progress made on implementing data catalogs, security policies, and role definitions needs to be taken into account.
Organizations have also invested in Data Virtualization tools in an attempt to implement a data fabric, but soon realize the associated costs in terms of both compute and maintenance.
Taken together, these considerations suggest that a Data Mesh product needs the following capabilities, both to address the points above and to serve as a true product that makes ‘Connecting People to Data’ possible with ease of creation and collaboration:
Connectivity and Configuration: Leverages data anywhere in any format
Security and Compliance: Data security in motion and at rest
Instant Data Query: Ability to query data anywhere on the fly
Ease of Integration: Open API driven methodology
Automated Data Discovery: Searchability of data products via data marketspace
Federated Governance: Data product owners define governance policies
ML Pipelines for Data Quality and Trust Scores: Self-learning and healing capabilities
Customized Access Control: Attribute-based access control (ABAC), role-based access control (RBAC), and fine-grained authorization (FGA) per policy
Collaboration: Through Slack, Teams, and other channels between users
Metadata Management: Open APIs allow metadata to propagate within the enterprise
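As an illustration of the access control capability in the list above, the following is a minimal sketch of how a role check (RBAC) and an attribute check (ABAC) might be combined in a policy defined per data product. The `User` and `DataProductPolicy` classes and the `is_access_allowed` function are hypothetical names invented for this example and do not represent Extrica's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    roles: set                                       # e.g. {"analyst"}
    attributes: dict = field(default_factory=dict)   # e.g. {"region": "EU"}


@dataclass
class DataProductPolicy:
    """Hypothetical policy a data product owner attaches to a product."""
    allowed_roles: set
    required_attributes: dict = field(default_factory=dict)


def is_access_allowed(user, policy):
    # RBAC: the user must hold at least one of the allowed roles.
    has_role = bool(user.roles & policy.allowed_roles)
    # ABAC: every attribute required by the policy must match the user's value.
    attributes_match = all(
        user.attributes.get(key) == value
        for key, value in policy.required_attributes.items()
    )
    return has_role and attributes_match


policy = DataProductPolicy(
    allowed_roles={"analyst"},
    required_attributes={"region": "EU"},
)
alice = User("alice", {"analyst"}, {"region": "EU"})
bob = User("bob", {"analyst"}, {"region": "US"})
carol = User("carol", {"intern"}, {"region": "EU"})

print(is_access_allowed(alice, policy))  # True: role and attribute both match
print(is_access_allowed(bob, policy))    # False: attribute mismatch
print(is_access_allowed(carol, policy))  # False: no allowed role
```

Because the policy lives alongside the data product, the product owner rather than a central team decides who may read it, which is the idea behind the Federated Governance capability listed earlier.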
At Trianz, based on the implementation experience of Extrica, the following stats have emerged:
In conclusion, Data Mesh provides the features practitioners have aspired to in a data ecosystem without compromising on "Connecting People to Data".
For more details, please reach out to [email protected]