Today’s DBA typically manages tens, hundreds, or even thousands of databases -- RDBMS, NoSQL DBMS, and/or Hadoop clusters -- often from multiple vendors, both on-premises and in the cloud.
While management automation has made substantial strides enabling DBAs to handle larger workloads, the care and feeding of these databases is still all too often a burdensome task.
Data warehouses, data marts, and data lakes usually require the most attention. Let’s discuss how using Snowflake RDBMS can dramatically reduce the DBA’s workload!
It consists of three distinct layers.
A virtual data warehouse (VDW) can operate against any database; it is not allocated to a specific database.
Let’s look at the common tasks and requirements for creation and management.
The initial creation of a database typically requires the following steps. This is neither an exhaustive list nor applicable to every DBMS:
Snowflake’s separation of compute and storage means that ETL/ELT does not run on the same compute as operational DML queries. ETL/ELT processing is completely isolated and self-contained, leaving time for the DBA to work on legacy problems!
Tuning a legacy database platform varies tremendously depending on the platform. Common tuning work includes:
None of the performance tuning discussed previously is required for Snowflake.
Snowflake’s approach to performance tuning is based on the elasticity of its VDW.
For large, complex queries, the solution is to scale up the VDW to a larger size, i.e. more servers in a cluster. Again, this is a simple SQL or web UI action. A VDW can be scaled up (or down) dynamically; this won’t affect running queries, only queries submitted after the size change.
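As a rough illustration of the "simple SQL action" for scaling up, the sketch below builds an `ALTER WAREHOUSE ... SET WAREHOUSE_SIZE` statement in Python. The warehouse name `reporting_wh` and the helper names are hypothetical, and the size list is only a subset of the sizes Snowflake offers. The statement would still have to be run through whatever client or worksheet you normally use.

```python
# Subset of Snowflake warehouse sizes, smallest to largest (assumption: not the full list).
WAREHOUSE_SIZES = ["XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE", "XXLARGE"]


def resize_statement(warehouse: str, size: str) -> str:
    """Build the SQL that resizes a warehouse (hypothetical helper)."""
    size = size.upper()
    if size not in WAREHOUSE_SIZES:
        raise ValueError(f"unknown warehouse size: {size}")
    # The warehouse name is interpolated as-is, so it must come from a trusted source.
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size}';"


def next_size_up(size: str) -> str:
    """Return the next larger size, or the same size if already at the top of the list."""
    index = WAREHOUSE_SIZES.index(size.upper())
    return WAREHOUSE_SIZES[min(index + 1, len(WAREHOUSE_SIZES) - 1)]


if __name__ == "__main__":
    # Scale a hypothetical reporting warehouse up by one size.
    print(resize_statement("reporting_wh", next_size_up("medium")))
```

Only queries submitted after the statement takes effect would use the new size, which matches the behavior described above.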
A multi-cluster VDW, which can be created with a simple SQL or web UI action, supports high-concurrency workloads. This type of VDW may also be “auto-scaled”, allowing clusters to be activated and suspended dynamically as demand changes.
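A minimal sketch of what creating such a warehouse might look like, again generating the SQL from Python. The warehouse name, default size, and cluster counts are illustrative assumptions. Setting the minimum cluster count below the maximum is what puts the warehouse in auto-scale mode, and multi-cluster warehouses depend on the Snowflake edition in use.

```python
def multi_cluster_statement(
    name: str,
    size: str = "MEDIUM",
    min_clusters: int = 1,
    max_clusters: int = 4,
) -> str:
    """Build a CREATE WAREHOUSE statement for a multi-cluster warehouse (hypothetical helper)."""
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    # min_clusters < max_clusters lets Snowflake start and stop clusters with demand;
    # equal values keep every cluster running whenever the warehouse is running.
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} WITH "
        f"WAREHOUSE_SIZE = '{size.upper()}' "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters} "
        f"SCALING_POLICY = 'STANDARD';"
    )


if __name__ == "__main__":
    # A hypothetical warehouse for a dashboard workload with many concurrent users.
    print(multi_cluster_statement("dashboard_wh", max_clusters=3))
```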
Another key feature for Snowflake performance is the ability to create multiple VDWs, each supporting a different type of workload or business area. The VDWs are completely independent of each other.
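To make the one-warehouse-per-workload idea concrete, here is a small sketch that turns a mapping of workloads to sizes into `CREATE WAREHOUSE` statements. The warehouse names, sizes, and the 300-second auto-suspend value are assumptions chosen for illustration, not recommendations.

```python
from typing import Dict, List

# Hypothetical workloads, each with its own independently sized warehouse.
WORKLOADS: Dict[str, str] = {
    "etl_wh": "LARGE",            # batch ETL/ELT loads
    "bi_wh": "MEDIUM",            # dashboards and reporting
    "data_science_wh": "XLARGE",  # ad hoc heavy analysis
}


def create_statements(workloads: Dict[str, str]) -> List[str]:
    """Build one CREATE WAREHOUSE statement per workload (hypothetical helper)."""
    return [
        f"CREATE WAREHOUSE IF NOT EXISTS {name} WITH "
        f"WAREHOUSE_SIZE = '{size}' "
        # Suspend after 300 idle seconds and resume automatically on the next query.
        f"AUTO_SUSPEND = 300 AUTO_RESUME = TRUE;"
        for name, size in workloads.items()
    ]


if __name__ == "__main__":
    for statement in create_statements(WORKLOADS):
        print(statement)
```

A session would then pick its compute with `USE WAREHOUSE etl_wh;` (or the equivalent connection setting), so a heavy load job on one warehouse does not compete with queries on another.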
Finally, Snowflake’s “micro-partition” architecture eliminates the need for traditional performance tuning. This approach takes partition pruning to a new level: an incredibly rich metadata store lets Snowflake eliminate partitions that cannot match a query and, within the remaining partitions, read only the relevant columns. Once again, this leaves the DBA time to address legacy problems!
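The pruning idea can be illustrated with a toy model. The sketch below is not Snowflake's implementation; it assumes each partition stores its data by column and keeps a minimum and maximum value per column as metadata. All class, function, and column names are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MicroPartition:
    """Toy partition: column-oriented data plus per-column (min, max) metadata."""
    name: str
    columns: Dict[str, List[int]]
    ranges: Dict[str, Tuple[int, int]] = field(init=False)

    def __post_init__(self) -> None:
        # Metadata is computed once, when the partition is written.
        self.ranges = {col: (min(vals), max(vals)) for col, vals in self.columns.items()}


def prune(partitions: List[MicroPartition], column: str, low: int, high: int) -> List[MicroPartition]:
    """Keep only partitions whose [min, max] range overlaps [low, high]; uses metadata only."""
    return [p for p in partitions if p.ranges[column][1] >= low and p.ranges[column][0] <= high]


def scan(partitions: List[MicroPartition], filter_col: str, low: int, high: int,
         select_cols: List[str]) -> List[Tuple[int, ...]]:
    """Read the filter column and the selected columns from surviving partitions only."""
    rows = []
    for p in prune(partitions, filter_col, low, high):
        for i, value in enumerate(p.columns[filter_col]):
            if low <= value <= high:
                rows.append(tuple(p.columns[c][i] for c in select_cols))
    return rows


PARTITIONS = [
    MicroPartition("p1", {"order_day": [1, 5, 10], "amount": [10, 20, 30], "customer_id": [100, 101, 102]}),
    MicroPartition("p2", {"order_day": [11, 15, 20], "amount": [40, 50, 60], "customer_id": [103, 104, 105]}),
    MicroPartition("p3", {"order_day": [21, 25, 30], "amount": [70, 80, 90], "customer_id": [106, 107, 108]}),
]

if __name__ == "__main__":
    # Only p2 overlaps days 12..18, and customer_id is never read.
    print([p.name for p in prune(PARTITIONS, "order_day", 12, 18)])
    print(scan(PARTITIONS, "order_day", 12, 18, ["amount"]))
```

In this model a query filtering on `order_day` between 12 and 18 touches one of three partitions and two of three columns, with no index or manual partitioning scheme defined by a DBA.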
Director of Analytics
Jeff Jacobs is a senior data technology professional in the Data and Analytics Practice at Trianz. The Data and Analytics Practice works with enterprises to achieve significant competitive advantage via modern cloud technologies, with a particular focus on the Snowflake Computing ecosystem.