In and of itself, data doesn’t hold any special value. Instead, the insights drawn from data analytics uncover information and metadata about events, products and historical facts. And while that has added a lot of value to our analysis, that data occasionally contains confidential elements such as personally identifiable information (PII, such as Social Security numbers and mailing addresses), protected health information (PHI, such as health history and medical allergies), and sensitive financial data. As data engineers, it is our ethical and legal responsibility to protect that information.
Database systems have evolved in how they share information, from traditional methods such as EDI and APIs to database views. Unlike ordinary RDBMS base tables, a view is a virtual table that is usually computed from the database when it is accessed. In general, sharing data via views provides an excellent route for creating dashboards, exporting datasets, integrating with ETL workflows and supporting a range of business cases. All of these data sharing methods, however, increase data accessibility, which can raise the risk of security breaches because security must be configured not only at the database level but also at the access level.
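As a minimal sketch of the view-based approach described above, the following hypothetical example exposes only non-sensitive columns of a base table (the table and column names are illustrative, not taken from any real schema):

```sql
-- Hypothetical example: a view that exposes only non-sensitive columns
-- of a customers base table, hiding PII from downstream consumers.
CREATE VIEW customer_contact_v AS
SELECT customer_id,
       city,
       state
FROM customers;  -- omits PII columns such as ssn and street_address
```

Consumers query the view like any table, but the underlying PII columns are never exposed to them.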
Snowflake not only supports view-based data sharing but has also revolutionized how organizations distribute and consume shared data. Unlike FTP and email, Snowflake Data Sharing is far easier to use, as it provides instant access to live data and eliminates data copying. Snowflake’s built-for-the-cloud architecture enables data sharing without complicated ETL or setup, and more importantly, allows you to control security in a single place: at the data level.
The data sharing feature of Snowflake uses its cloud data warehouse and unique multi-cluster, shared data architecture to let providers set up and govern data shares for consumers within minutes; consumers can then view the shared data and seamlessly combine it with their own data sources. Shared data is available to consumers securely and in real time, as soon as a data provider adds or updates it.
There are some out-of-the-box advantages to this approach, such as:
Immediate access: No transformation, data movement, loading or reconstruction is required, and the data is available for immediate use.
Live data: Changes made in real-time by a data provider are immediately available to data consumers without effort, which ensures data remains current.
Secure managed access: A data provider can share data with any number of data consumers in a secure, governed manner with minimal effort, and retains control over every aspect of that managed access.
No ETL management: Users do not need to manage any ETL for secure data sharing capabilities and operations.
Access logs: Data providers can track and monitor access to data shares and quickly respond to users’ actions.
Snowflake’s secure data sharing is an excellent candidate where data monetization, the elimination of data silos and the ease of data management are important criteria for business operations.
For example, consider an energy utility company that wants to demonstrate the value of its CRM by showing how it boosts customer retention and growth, justifying the cost of the software.
The company can target its top customers with self-reporting metrics and dashboard capabilities deployed via secure data sharing, allowing real-time access to key metrics as well as the creation of unique views tailored to each customer’s needs.
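A hedged sketch of what such a customer-facing view might look like in Snowflake is shown below; the database, schema, column names and churn logic are all illustrative assumptions, not the utility company’s actual schema:

```sql
-- Hypothetical secure view restricting a consumer to aggregated
-- retention metrics; secure views hide the view definition and
-- underlying data from consumers.
CREATE SECURE VIEW crm.metrics.retention_by_region_v AS
SELECT region,
       COUNT(*) AS customers,
       AVG(CASE WHEN churned THEN 0 ELSE 1 END) AS retention_rate
FROM crm.raw.customer_activity
GROUP BY region;
```

Using a SECURE view (rather than a standard one) matters here: only secure views and secure UDFs can be added to a Snowflake share, and they prevent consumers from inspecting the view definition or the unaggregated base data.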
Shared data exists independently and can be queried alongside any other database within a Snowflake environment. Access to objects is regulated through grants, and only objects explicitly granted access privileges are shared with other Snowflake users. No ETL is required to enable this, and Snowflake offers both a guided wizard and the ability to write SQL queries to establish data shares. The latter allows you to automate the grants through a third-party system with SnowSQL, further reducing the time it takes to onboard a customer and ensuring that the grants follow best practices and security protocols.
These high-level steps establish a secure data share:
Create an empty share as a shell
Grant privileges on the associated database objects to the share
Confirm the share’s contents and make any needed updates
Make the share available to data consumer accounts
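The steps above can be sketched in Snowflake SQL as follows; the share, database, schema, view and consumer account names are all illustrative assumptions:

```sql
-- 1. Create an empty share as a shell
CREATE SHARE utility_metrics_share;

-- 2. Grant privileges on the associated database objects to the share
GRANT USAGE ON DATABASE crm TO SHARE utility_metrics_share;
GRANT USAGE ON SCHEMA crm.metrics TO SHARE utility_metrics_share;
GRANT SELECT ON VIEW crm.metrics.retention_by_region_v
  TO SHARE utility_metrics_share;

-- 3. Confirm the share's contents
SHOW GRANTS TO SHARE utility_metrics_share;

-- 4. Make the share available to data consumer accounts
ALTER SHARE utility_metrics_share
  ADD ACCOUNTS = consumer_org1, consumer_org2;
```

Because the same statements can be issued through SnowSQL, this sequence is also the natural target for automating consumer onboarding from a third-party system.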
This straightforward process allows users to create secure Data Sharehouses using the power of Snowflake’s architecture. There is enough flexibility to share data with several organizations, sustain data concurrency and share views, tables and other objects without creating or managing any ETL.
Snowflake Sharehouses provide a strong, enterprise-grade workflow that lets users spend less time sharing their data quickly, securely and flexibly, and more time tapping into powerful insights to uncover the hidden potential of their datasets.
For all your data footprint and migration conversations, you can reach out to us at [email protected].
Director of Analytics Practice
Kireet Kokala is a senior data technologist and high-performance leader in the Data and Analytics Practice at Trianz who helps clients with digital transformation and data monetization. The Data and Analytics Practice works with enterprises to achieve significant competitive advantage via modern cloud technologies, with a focus on the Snowflake Computing ecosystem.