Today’s DBA typically manages tens, hundreds, or even thousands of databases -- RDBMS, NoSQL DBMS, and/or Hadoop clusters -- often from multiple vendors, both on-premises and in the cloud.
While management automation has made substantial strides enabling DBAs to handle larger workloads, the care and feeding of these databases is still all too often a burdensome task.
Data warehouses, data marts, and data lakes usually require the most attention. Let’s discuss how using Snowflake RDBMS can dramatically reduce the DBA’s workload!
Snowflake’s architecture consists of three distinct layers: database storage, compute (query processing), and cloud services.
A virtual data warehouse (VDW), Snowflake’s unit of compute, can operate against any database; it is not allocated to a specific database.
Let’s look at the common tasks and requirements for creation and management.
The initial creation of a database typically requires the following steps. This is neither an exhaustive list nor applicable to every DBMS:
Snowflake’s separation of compute and storage means that ETL/ELT does not run on the same compute as operational DML queries. ETL/ELT processing is completely isolated and self-contained, leaving time for the DBA to work on legacy problems!
Tuning a legacy database platform varies tremendously depending on the platform. Common tuning work includes:
None of the performance tuning discussed previously is required for Snowflake.
Snowflake’s approach to performance tuning is based on the elasticity of its VDW.
For large, complex queries, the solution is to scale up the VDW to a larger size, i.e., more servers in the cluster. Again, this is a simple SQL or web UI action. A VDW can be scaled up (or down) dynamically; the change does not affect queries that are already running, only queries submitted after the size change.
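As a rough illustration of the "simple SQL action" for scaling up, the sketch below renders an `ALTER WAREHOUSE ... SET WAREHOUSE_SIZE` statement in Python. The helper names (`resize_statement`, `next_size_up`) and the warehouse name `reporting_vdw` are hypothetical, and the size list is only a subset of the sizes Snowflake offers. The script only builds the SQL text; running it against Snowflake is left to whatever client you use.

```python
# Subset of Snowflake warehouse sizes, ordered smallest to largest.
# Assumption: larger sizes exist but are omitted here for brevity.
SIZES = ["XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE", "XXLARGE"]


def resize_statement(warehouse: str, size: str) -> str:
    """Render an ALTER WAREHOUSE statement that changes the VDW size."""
    size = size.upper()
    if size not in SIZES:
        raise ValueError(f"unknown warehouse size: {size}")
    return f"ALTER WAREHOUSE {warehouse} SET WAREHOUSE_SIZE = '{size}';"


def next_size_up(current: str) -> str:
    """Return the next larger size, or the same size if already at the top of the list."""
    index = SIZES.index(current.upper())
    return SIZES[min(index + 1, len(SIZES) - 1)]


# Hypothetical warehouse name; scale a MEDIUM warehouse up one step.
print(resize_statement("reporting_vdw", next_size_up("MEDIUM")))
# prints: ALTER WAREHOUSE reporting_vdw SET WAREHOUSE_SIZE = 'LARGE';
```

Scaling back down is the same statement with a smaller size, which is why the resize is cheap to try for a single heavy query and then undo.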
A multi-cluster VDW, which can be created with a simple SQL or web UI action, supports high-concurrency workloads. This type of VDW may also be “auto-scaled”, so that clusters are started and suspended dynamically as demand changes.
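A minimal sketch of what creating an auto-scaled multi-cluster VDW might look like follows. It renders a `CREATE WAREHOUSE` statement using Snowflake's `MIN_CLUSTER_COUNT`, `MAX_CLUSTER_COUNT`, and `SCALING_POLICY` parameters. The function name, the warehouse name `bi_vdw`, and the default values are illustrative assumptions, not recommendations, and multi-cluster warehouses depend on the Snowflake edition in use.

```python
def multi_cluster_statement(
    name: str,
    size: str = "MEDIUM",
    min_clusters: int = 1,
    max_clusters: int = 4,
    policy: str = "STANDARD",
) -> str:
    """Render a CREATE WAREHOUSE statement for a multi-cluster VDW.

    With min_clusters < max_clusters the warehouse auto-scales between
    the two bounds; with equal values every cluster runs all the time.
    """
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    if policy not in ("STANDARD", "ECONOMY"):
        raise ValueError(f"unknown scaling policy: {policy}")
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} WITH\n"
        f"  WAREHOUSE_SIZE = '{size}'\n"
        f"  MIN_CLUSTER_COUNT = {min_clusters}\n"
        f"  MAX_CLUSTER_COUNT = {max_clusters}\n"
        f"  SCALING_POLICY = '{policy}'\n"
        f"  AUTO_SUSPEND = 300\n"  # seconds of inactivity before suspending
        f"  AUTO_RESUME = TRUE;"
    )


# Hypothetical warehouse for a dashboard workload with many concurrent users.
print(multi_cluster_statement("bi_vdw", max_clusters=3))
```

The upper bound on clusters also acts as a cost ceiling, so it is usually worth setting deliberately rather than leaving it at a large default.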
Another key feature for Snowflake performance is the ability to create multiple VDWs, each supporting a different type of workload or business area. The VDWs are completely independent of each other.
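One way to picture independent VDWs per workload is a small mapping from workload to warehouse, as sketched below. The workload names, warehouse names, and sizes are all hypothetical. The script renders one `CREATE WAREHOUSE` statement per workload plus the `USE WAREHOUSE` statement a session for that workload would issue first.

```python
# Hypothetical mapping of workloads to dedicated VDWs; all names and sizes are illustrative.
WORKLOAD_VDWS = {
    "etl": ("etl_vdw", "LARGE"),
    "reporting": ("reporting_vdw", "MEDIUM"),
    "data_science": ("ds_vdw", "SMALL"),
}


def create_statements(vdws: dict) -> list:
    """Render one CREATE WAREHOUSE statement per workload."""
    return [
        f"CREATE WAREHOUSE IF NOT EXISTS {name} WITH WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = 300 AUTO_RESUME = TRUE;"
        for name, size in vdws.values()
    ]


def use_statement(workload: str) -> str:
    """Render the USE WAREHOUSE statement that routes a session to its workload's VDW."""
    name, _size = WORKLOAD_VDWS[workload]
    return f"USE WAREHOUSE {name};"


for statement in create_statements(WORKLOAD_VDWS):
    print(statement)
print(use_statement("etl"))
```

Because each session picks its own VDW, a heavy ETL run on `etl_vdw` does not compete for compute with reporting queries on `reporting_vdw`.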
Finally, Snowflake’s “micro-partition” architecture eliminates the need for traditional performance tuning. This approach takes partition pruning to a new level: an incredibly rich metadata store lets the optimizer skip micro-partitions that cannot contain matching rows and, within the remaining micro-partitions, read only the columns the query references. Once again, this leaves the DBA time to address legacy problems!
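The idea behind metadata-based pruning can be sketched with a toy model. This is a simplified illustration, not Snowflake's implementation: it assumes each micro-partition records only the minimum and maximum value of one date column, and the `MicroPartition` class and `prune` function are hypothetical names. A partition is scanned only if its value range overlaps the range the query asks for.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    name: str
    min_date: str  # ISO-8601 dates compare correctly as plain strings
    max_date: str


def prune(partitions, lo, hi):
    """Keep only partitions whose [min_date, max_date] range overlaps [lo, hi]."""
    return [p for p in partitions if p.max_date >= lo and p.min_date <= hi]


partitions = [
    MicroPartition("mp1", "2023-01-01", "2023-01-31"),
    MicroPartition("mp2", "2023-02-01", "2023-02-28"),
    MicroPartition("mp3", "2023-02-20", "2023-03-15"),
]

# Query predicate: order_date BETWEEN '2023-02-25' AND '2023-03-01'
survivors = prune(partitions, "2023-02-25", "2023-03-01")
print([p.name for p in survivors])
# prints: ['mp2', 'mp3']
```

Here `mp1` is skipped without being read because its maximum date falls before the requested range. No index or manual partitioning scheme is involved, only the per-partition metadata.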
Director of Analytics
Jeff Jacobs is a senior data technology professional in the Data and Analytics Practice at Trianz. The Data and Analytics Practice works with enterprises to achieve significant competitive advantage via modern cloud technologies, with a particular focus on the Snowflake Computing ecosystem.