This is the second in a series of articles on what it’s like to have Snowflake as your data warehouse/data lake. I have taught workshops, engaged in many POCs and worked as a solution architect/administrator on Snowflake engagements. I have found that, for the most part, people find Snowflake quite comfortable if they come from a traditional SQL database. And yet, there are some concepts that take some time to sink in. This post is about how, with Snowflake, you never have to do capacity planning.
The Snowflake features that allow this are:
On-demand compute clusters that you pay for only while you use them
Separation of data storage from compute
One of the perennial challenges of enterprise data systems is properly forecasting capacity. How much storage will be needed? What size server or cluster is needed for the next year, or even the next five years? If you estimate too high, you are paying more than you need to. At enterprise scale, those overpayments can be hundreds of thousands or more. Yet underestimating what you are going to need can be even worse. Needed projects are delayed. Performance suffers. The enterprise's ability to respond quickly to new needs is hampered.
Snowflake separates data storage from processing, so each can scale independently. Snowflake uses the storage of the cloud provider you choose, whether that is AWS, Azure or Google Cloud. The Snowflake customer can keep loading data as needed, as much as is needed. Unless you need to grow storage faster than AWS, Azure or GCP is adding it, there is effectively no limit. Whether you are a startup with a few gigabytes of data or a global enterprise with 20 petabytes of data, you pay only for the storage you use. The storage cost is right about the amount that the underlying cloud platform charges. That’s eleven nines (99.999999999%) of storage durability for a very reasonable cost.
On the processing side, again, you pay only for what you use. Let’s consider three use cases:
A new business is acquired, or a new corporate division is added to the Snowflake warehouse
A single BI platform that normally serves hundreds of users spikes during an event and needs to handle thousands
A data science team is doing a one-month project to explore the use of machine learning on the corporate data, or on a new data set that has been acquired
For the first use case, a new compute cluster can be spun up just for the use of the new division. The division can even be charged back for the cost of its use of the cluster. There is no equipment to source, install or configure. It takes a few seconds of effort to spin up a new compute cluster in Snowflake. One compute cluster does not interfere with another. Clusters can access their own data or the same data, and they do not slow each other down.
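As a rough sketch of how little is involved, the Python helper below builds the kind of SQL statement that creates a dedicated compute cluster (Snowflake calls these virtual warehouses). The function name, the warehouse name SALES_DIVISION_WH and the default values are hypothetical, chosen only for illustration. The statement would still need to be run through whatever Snowflake client you use.

```python
# Warehouse sizes accepted by Snowflake's WAREHOUSE_SIZE parameter.
VALID_SIZES = {
    "XSMALL", "SMALL", "MEDIUM", "LARGE",
    "XLARGE", "XXLARGE", "XXXLARGE", "X4LARGE",
}


def create_warehouse_sql(name: str, size: str = "XSMALL",
                         auto_suspend_seconds: int = 60) -> str:
    """Build a CREATE WAREHOUSE statement for a dedicated warehouse.

    Hypothetical helper: it only returns the SQL text, it does not
    connect to Snowflake or execute anything.
    """
    if size not in VALID_SIZES:
        raise ValueError(f"unknown warehouse size: {size}")
    # Keep the example safe: allow only simple unquoted identifiers.
    if not name.replace("_", "").isalnum():
        raise ValueError(f"invalid warehouse name: {name}")
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {auto_suspend_seconds} "  # seconds of inactivity
        "AUTO_RESUME = TRUE "                      # wake on the next query
        "INITIALLY_SUSPENDED = TRUE"               # no cost until first use
    )


# Example: a medium warehouse for a newly added division (name is made up).
print(create_warehouse_sql("SALES_DIVISION_WH", size="MEDIUM"))
```

Because the warehouse suspends itself when idle and resumes on the next query, the new division is billed only for the time it actually spends running work.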
For the second use case, Snowflake has a feature called multi-cluster warehouses. Additional clusters come online when queries start to queue. You can set a minimum and a maximum number of clusters. They come online and automatically suspend as the load requires. As always with Snowflake, you pay for the clusters only while they are actively being used.
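The scaling behavior described above can be pictured with a toy simulation. This is a deliberately simplified model and not Snowflake's actual scaling algorithm: it assumes one decision per time step, adds a cluster whenever queries are queued, and removes one whenever nothing is queued, always staying between the configured minimum and maximum. The function name and the sample load are made up for illustration.

```python
def simulate_clusters(queued_per_tick, min_clusters=1, max_clusters=4):
    """Toy model of multi-cluster scaling (not Snowflake's real policy).

    queued_per_tick: number of queued queries observed at each time step.
    Returns the number of running clusters after each step.
    """
    running = min_clusters
    history = []
    for queued in queued_per_tick:
        if queued > 0 and running < max_clusters:
            running += 1   # queries are waiting: bring one more cluster online
        elif queued == 0 and running > min_clusters:
            running -= 1   # nothing is waiting: suspend one cluster
        history.append(running)
    return history


# A spike in load followed by a quiet period (made-up numbers).
load = [0, 3, 5, 2, 0, 0, 0]
print(simulate_clusters(load))                   # [1, 2, 3, 4, 3, 2, 1]
print(simulate_clusters(load, max_clusters=3))   # [1, 2, 3, 3, 2, 1, 1]
```

The point of the sketch is the shape of the curve: capacity follows the load up and back down, and the maximum acts as a ceiling on what you can be charged for.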
For the third use case, you could spin up a very large cluster to crunch through many petabytes of data for a temporary project. A server of that size might cost more than could be justified on a permanent basis, but as a temporary cluster in Snowflake it is easily handled. The sudden need for such capacity is met easily and economically.
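To make the economics of the temporary project concrete, here is a small back-of-the-envelope calculation. It assumes the standard Snowflake credit rates, where each warehouse size consumes twice the credits per hour of the size below it. The usage pattern (6 hours a day for 20 working days) is a made-up example, and the price of a credit is left out because it depends on your edition, region and contract.

```python
# Credits consumed per hour by a single-cluster standard warehouse.
CREDITS_PER_HOUR = {
    "XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8,
    "XLARGE": 16, "XXLARGE": 32, "XXXLARGE": 64, "X4LARGE": 128,
}


def credits_used(size: str, hours_running: float) -> float:
    """Credits consumed by one warehouse of the given size while running."""
    return CREDITS_PER_HOUR[size] * hours_running


# Hypothetical project: a 4X-Large warehouse used 6 hours a day for 20 days.
project_hours = 6 * 20
on_demand = credits_used("X4LARGE", project_hours)

# The same warehouse left running around the clock for a 30-day month.
always_on = credits_used("X4LARGE", 24 * 30)

print(on_demand)              # 15360
print(always_on)              # 92160
print(always_on / on_demand)  # 6.0
```

Under these assumptions, paying only for the hours actually used costs one sixth of keeping the same capacity on permanently, and the cost stops entirely when the project ends.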
As you can see, with Snowflake:
No need to source new equipment for a temporary need.
No need to overpay for capacity that isn’t being used.
Flexibility to respond to changing enterprise needs.
Never run out of storage space or compute capacity.
Of course, while capacity planning is a thing of the past, Snowflake customers still need to plan a budget. Because Snowflake is among the most cost-effective platforms there are, enterprises will find they get far more storage and processing capacity for their money, along with unrivaled flexibility.
About the author:
Lee Harrington is a Director of Analytics at Trianz. He applies his 30 years of field experience and gift of communication to help clients in their Digital Transformation journey.