Why is Snowflake so expensive (2024)

Why is Snowflake so expensive (devgenius.io)
364 points by eyeball on Aug 22, 2022 | 207 comments
cs702 on Aug 22, 2022 | next [–]


Great article. On the surface, it's about Snowflake. At a deeper level, the article is about the perverse incentives motivating SaaS businesses to do seemingly dumb, inefficient things and avoid seemingly obvious optimizations by default.

Many SaaS businesses are perfectly happy to let customers shoot themselves in the foot if it generates more revenue. The BigQuery example (presently, by default, `select * from table limit 10` obediently scans the entire table at your expense!) is spot-on.

As the article so well puts it, every SaaS company has a vested financial interest "to leave optimization gremlins in."

danielmarkbruce on Aug 22, 2022 | parent | next [–]


It's a terrible article. The author misunderstands competition and how much it drives products in this area. Snowflake is incentivized to make their product better on every dimension. If Snowflake don't improve, customers will leave in droves - like when they moved to Snowflake.

In practice, as has been pointed out in other comments, they do improve their performance (for competitive reasons) and it does cost them money when they do it.... They did it a couple qtrs ago and left $97 mill on the table.

https://www.fool.com/earnings/call-transcripts/2022/03/02/sn...

rurp on Aug 22, 2022 | root | parent | next [–]


There are many degrees of optimization and clearly there's some cost to bad performance, but Snowflake still has a massive perverse incentive to not spend too much effort on improving performance. If Snowflake is like every software company I've ever been involved with there are many competing projects at any given time and direct revenue impact is a big factor in what gets prioritized.

My own experience with Snowflake absolutely backs up the article's point. At my work we routinely encounter abysmal performance for certain types of queries, due to a flaw on Snowflake's side. We have had numerous talks with them and there is no question that they have an issue, but they have shown absolutely no urgency to fix it. Their recommendation is that we spend more money to work around the problem on their end.

geoduck14 on Aug 22, 2022 | root | parent | next [–]


>At my work we routinely encounter abysmal performance for certain types of queries, due to a flaw on Snowflake's side.

Do tell! I'm a current Snowflake customer, I'd like to know what to look out for.

fnordpiglet on Aug 23, 2022 | root | parent | prev | next [–]


Don’t you see this with any cost based query optimizer based product?

fnordpiglet on Aug 23, 2022 | root | parent | prev | next [–]


It is a terrible article. I’ve been on the engineering side of these big data platforms including snowflake in its early days, Paraccel (redshift’s code ancestor), redshift, and others you probably use but don’t realize are actually hyper scale database engines. The author missed the mark consistently. I chortled when he discussed the redshift WLM which I helped design a very long time ago and it’s absolute garbage. Snowflake’s entire point is that you can decouple the storage and the database from the warehouse query engine to provide total isolation from noisy neighbors. If you’re encountering noisy neighbors you’re using the product entirely wrong.

And you’re right. The motivation snowflake has to improve is survival. It’s not like their architecture is impossible to replicate. Redshift is doing a total reorganization of the product and rewrite to compete more directly with snowflake (redshift aqua etc).

They also seem to completely discount the value of SaaS outsourcing database and storage operations to snowflake whose only focus is operating the database product. Running your own clusters is an exercise that seems smart in the first few months then like a puppy when it grows up you’re stuck with a dog. If you love dogs and train them well then great. But fact is most people are terrible dog owners, and the same is true for MPP clusters. Being able to focus on the query management operations exclusively is really ideal. Highly stateful distributed products are a PITA.

He also rants about snowflake not telling him the hardware. Snowflake runs in ec2, gcp, azure. You can literally guess the hardware types - there’s just not that many saddle point instance types for that sort of workload. Discussing ssd vs hdd is also an obvious sign of ignorance - snowflake’s basic premise is that it does very wide, highly concurrent s3 gets and scans of the data, using a foundation db metadata catalog to help prune. Being in aws, it’s implausible they use hdd and realistically they could elide ssds (I do not remember if they use local disks for caching, but it’s stateless regardless).
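The pruning idea mentioned above can be illustrated with a minimal sketch. It assumes a metadata catalog that records a min and max value per data file, so a range filter can skip files without reading them. The `PartitionMeta` type, the file names, and the `prune` function are hypothetical, and this is not a description of Snowflake's actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PartitionMeta:
    """Catalog entry for one data file (all fields are illustrative)."""
    name: str     # hypothetical object key in blob storage
    min_val: int  # smallest value of the filter column in this file
    max_val: int  # largest value of the filter column in this file


def prune(catalog: List[PartitionMeta], lo: int, hi: int) -> List[PartitionMeta]:
    """Keep only the files whose [min_val, max_val] range can overlap [lo, hi]."""
    return [p for p in catalog if p.max_val >= lo and p.min_val <= hi]


if __name__ == "__main__":
    catalog = [
        PartitionMeta("part-0", 0, 99),
        PartitionMeta("part-1", 100, 199),
        PartitionMeta("part-2", 200, 299),
    ]
    # A filter such as `WHERE x BETWEEN 150 AND 160` only needs part-1.
    to_scan = prune(catalog, 150, 160)
    print([p.name for p in to_scan])
```

Only the files that survive pruning need to be fetched and scanned, which is why well-clustered data can make a large difference to the amount of work a query does.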

The unit costing being hardware agnostic is totally normal too - they don’t have to expose to you the details of their costing because they normalize it to a standard fictional unit.

bennyelv on Aug 23, 2022 | root | parent | next [–]


I'm a snowflake customer and I've felt/am feeling all of the pain that this article talks about. There might be some handwaving over technical complexity that you don't like given your detailed understanding of how the thing is built, but the article is fundamentally right in its message.

The thing it's most right about is the power imbalance and the innovators dilemma. I've had more than one instance of the case where we've found that query performance/cost is too high, complained about it, and Snowflake have "made a configuration change" (undisclosed) that has brought the cost down.

fnordpiglet on Aug 23, 2022 | root | parent | next [–]


Don’t you have the same issue with any query optimized product? If I’m using redshift and hit a bad execution plan that I can’t get around by tweaking the query I’m SOL, and redshift engineers aren’t going to tweak a configuration change to help me.

This is why products like DynamoDB were created - cost based optimizers are imperfect and unpredictable, and once you’ve stepped over some limit or threshold performance wildly changes. The reasons can be your query, or the data has changed, or there’s a noisy neighbor consuming a resource you depend on for your query. If you need highly predictable times you can reason about you won’t get it from any RDB solution.

Given that, what about snowflake feels different? That the details are obscured from you so you don’t understand why things are happening? Is the lack of ability to deeply introspect making you uncomfortable? My experience has been that the ability to introspect rarely leads to any change in outcome; instead it leads me to identify that the query optimizer has done something stupid I cannot do anything about, but at least I can point to the specific resource being exhausted by it.

AdamProut on Aug 22, 2022 | root | parent | prev | next [–]


We regularly benchmark the "big 3" Cloud Data warehouses - Redshift, Snowflake and Big Query at SingleStore. Their performance is very close to the same (within 10-20%) on most benchmarks on reasonable sized data sets (10s of TB).

I agree: if the performance of one of them fell behind the others for any prolonged period of time, the cost to the laggard in market share would be much, much worse than the short-term revenue gain of "being slow on purpose".

uoaei on Aug 22, 2022 | root | parent | prev | next [–]


I don't think it misunderstands business competition. In fact it understands the concept of competition very well, and develops an insightful critique into the perverse incentives that are borne from competition.

It benefits no one except for a couple thousand people to so blatantly play their customers in this way. In fact, it's worse, as it incentivizes that same behavior of other market actors in the space.

danielmarkbruce on Aug 22, 2022 | root | parent | next [–]


What exactly in the article suggests the author understands the pressure of competition on incentives?

The author states that Snowflake are not incentivized to increase performance due to short term revenue concerns but doesn't mention they are also incentivized to do the opposite from a competitive perspective. The result is incomplete enough that it ends up being flat wrong with respect to the behavior that the company actually engages in.

The author missed the fact Snowflake did the very thing he/she suggested they were incentivized not to do, recently, at a cost of $97 million. The CEO explained why they are doing it and how they are actually incentivized. I don't know how the article could miss the mark by more than it has. The company literally does the opposite of what he/she suggested. It's not like they are the only one either, AWS has a history of reducing prices. Why? Once again, competition.

morelisp on Aug 22, 2022 | root | parent | next [–]


> The CEO explained why they are doing it and how they are actually incentivized.

The CEO explained why he thinks it's a good long term plan... but for now, they get money i.e. are actually incentivized by slow code. The CEO's incentives are theoretical ones.

And the market, which ultimately controls whether the CEO gets to continue that plan or not, did not seem to agree it was a good plan.

danielmarkbruce on Aug 22, 2022 | root | parent | next [–]


By this reasoning, everyone would shirk at work. If you think incentives only act over short time horizons, I don't know how you explain an enormous amount of human behavior.

The market didn't even understand it. Most of the people trading equities, especially around earnings announcements, don't know what a data warehouse is or what matters in that market. All they saw was "miss".

morelisp on Aug 22, 2022 | root | parent | next [–]


I didn't say the CEO was wrong or that long-term thinking is bad! I said the actual incentives are still misaligned. (I mean, a lot of people do shirk at work, and it even works out well for them.)

I think you have a weird and probably not useful definition of "actual" if "monthly revenue" is not actual but "projected monthly revenue two years from now" is actual. (Or maybe I've just lived in Germany too long.)

danielmarkbruce on Aug 22, 2022 | root | parent | next [–]


You are right, I've used the word "actual" incorrectly. What I should have said was "net". Ie, both short term and long term revenue incentivize behavior and in this case the net result was increasing performance, ie long term incentive > short term incentive.

fnordpiglet on Aug 23, 2022 | root | parent | prev | next [–]


I think you’re providing a false dichotomy here. The structure may provide an opportunity to maximize short term profits but there is no reason to believe they, or any one, has to follow that opportunity especially if they rationally believe investing energy and money now has a much higher NPV.

When I read these comments about incentives to screw customers and a naked belief everyone must be, I really wonder who traumatized the authors. There are tons of excellent engineering cultures that prioritize excellence for long term gain. Find a better job.

didgetmaster on Aug 22, 2022 | root | parent | prev | next [–]


While I think it is definitely in a company's best long term interest to implement features that benefit its customers; it might not be in the best interest of those who are currently running the company.

We have seen many, many examples of executives who are willing to sacrifice the future of the company to get a personal short-term gain. Jacking up the revenues (or slashing costs) in ways that alienate customers is a great strategy when you plan to jump off with your golden parachute in a couple years when all your stock options vest.

jjfoooo4 on Aug 22, 2022 | root | parent | next [–]


Sure but to not even mention churn as something Snowflake is worried about is pretty silly. With the funding environment taking a dramatic turn they (and every other SaaS company) are going to be deeply concerned about price competition and churn

danielmarkbruce on Aug 22, 2022 | root | parent | prev | next [–]


Agreed. But a good article should have shown an example rather than a counter example. Intel might have been a good example. A good article would have shown the competing incentives at play rather than a single incentive.

hodgesrm on Aug 23, 2022 | root | parent | prev | next [–]


> It's a terrible article. The author misunderstands competition and how much it drives products in this area.

Agree, but the author has one thing right. Snowflake is not transparent about product behavior, which makes it hard to reason about costs and performance.

Open source data warehouses like ClickHouse and Druid don't have this problem. If you want to know how something works, you can look at the code. Or listen to talks from the committers. This transparency is an enduring strength of open source projects.

danielmarkbruce on Aug 23, 2022 | root | parent | next [–]


Sure but if you want full transparency you don't use Snowflake. They never sold themselves as that.

I wouldn't buy a Ferrari and complain about lack of trunk space.

hodgesrm on Aug 23, 2022 | root | parent | next [–]


I'm not complaining, of course. It's just an observation. Snowflake is very similar to Oracle in that respect, which is not surprising given where the founders came from.

Personally I think Snowflake is very impressive on the things they optimize for, which includes complex queries on enterprise data sources. The same could be said for BigQuery.

mr_toad on Aug 23, 2022 | root | parent | prev | next [–]


> The author misunderstands competition and how much it drives products in this area.

Snowflake compete on marketing.

Plenty of people rave about Snowflake and have never heard of Databricks, BigQuery or Redshift.

simo7 on Aug 22, 2022 | root | parent | prev | next [–]


The main flaw of the article is not controlling for product category.

I suspect most data warehouses have similar NDRs.

In many companies a data warehouse is the place where you dump all your data and let everyone run poorly written programs against it.

Add to that poor engineering culture in data teams (often led by non-technical people) and costs are bound to skyrocket.

danielmarkbruce on Aug 24, 2022 | root | parent | next [–]


> In many companies a data warehouse is the place where you dump all your data and let everyone run poorly written programs against it.

Hilariously accurate description of a data warehouse.

scarface74 on Aug 22, 2022 | parent | prev | next [–]


Standard disclaimer: I work at AWS in consulting and could easily be accused of drinking the Kool Aid.

Everyone from consultants, SAs, Sales, support etc is constantly working toward getting customers to “optimize” their spend. Of course any business wants you to give them more money. But, none of us are pushed to get them to spend money on services or methods to do things inefficiently.

I specifically work in consulting specializing in “application modernization”. That means most of my implementations are cheap and I’m constantly spending time making sure my implementation is cheap as possible and still meet the requirements. I first noticed this attitude from AWS when I was working for a startup.

This isn’t just with AWS. I spent years working in enterprise shops and saw the same attitude working with Microsoft.

I can’t speak for any other large organizations - AWS and Microsoft are the only two I’ve worked with as either a customer or employee where there was huge spending on infrastructure or software.

Now I could easily get started about my opinion of Oracle from the customer standpoint. But I won’t.

spmurrayzzz on Aug 22, 2022 | parent | prev | next [–]


Well said. I'd also add a cynical note that the recurring revenue model is incentivized to keep the gremlins around not just because of the impact to metered costs, but also because off-ramping is that much more difficult once engineers implement workaround/solutions to mitigate the impact of those smells.

Just another way that vendor lock-in occurs (intentionally or otherwise).

makk on Aug 22, 2022 | parent | prev | next [–]


> As the article so well puts it, every SaaS company has a vested financial interest "to leave optimization gremlins in."

It depends on the time scale. A SaaS optimizing for, say, a 1-3 year financial return will see their interests through a different lens than one optimizing for a multi-decade return. Leaving optimization gremlins in isn't aligned with customers' interests in the long run, so the customers will eventually find alternatives if the SaaS doesn't eventually align itself with customers.

smugma on Aug 22, 2022 | root | parent | next [–]


"As an investor, I expect Snowflake to show amazing profitability and record-breaking revenue numbers. As an Engineer, if Snowflake continues on the current path of ignoring performance, I expect them to lose share to the open-source community or some other competitor, eventually walking down the path of Oracle and Teradata. Here are a few things I think they can do to stay relevant in five years."

danielmarkbruce on Aug 22, 2022 | root | parent | next [–]


The point is incentives.

twistedpair on Aug 22, 2022 | parent | prev | next [–]


FWIW, BigQuery tables can be configured to require a partition filter clause [0] in the SQL query, so that you cannot shoot yourself in the foot like that. Now if they'd just make an Organization Policy to let you turn it on by default for all new tables.

[0] https://cloud.google.com/bigquery/docs/querying-partitioned-...

cs702 on Aug 22, 2022 | root | parent | next [–]


Yes. That's exactly the OP's point: It's up to you to remember to do the extra work necessary to avoid shooting yourself in the foot by default.

twistedpair on Aug 23, 2022 | root | parent | next [–]


Depends who the "you" is; someone just getting started with Cloud, or a savvy enterprise operator?

GCP has sided with an "easy out of the box experience". For example, a new project has a "default" network with some permissive firewall rules. A savvy operator wouldn't build things this way, but for a first time user, the cloud is daunting and a JustWorks™ experience gets them moving quickly (e.g. so they can SSH into their VMs easily).

Now, once you've gotten your feet under you, and want to build a solid cloud setup, you'll add Organization Policies [1] like "Skip default network creation", and all new Projects will be completely closed off from the web by default, at the cost of all networking being more complex. Once you're ready for this, turn it on.

So, how should a SaaS database work? Should you have to learn all the intricacies of sharding, partitioning, indexing, SELECT, FROM, HAVING, FULL/INNER/OUTER JOIN, WHERE, GROUP BY, LIMIT, before you write your first query? This is a long standing yin/yang question of product UX. What user persona and UX do you design for on the experience and complexity spectrum.

[1] https://cloud.google.com/resource-manager/docs/organization-...

itsdrewmiller on Aug 23, 2022 | root | parent | next [–]


So they have a system for enforcing rules but still haven't built the rule that would reduce their revenue - seems like an example in favor of the article.

deepGem on Aug 22, 2022 | parent | prev | next [–]


Wow their statement about not participating in benchmarking wars is alarming. In this day and age, when benchmarking tools are so inexpensive and almost everything is very transparent, why not participate.

Or even better engage with a neutral third party such as Jepsen to get on an even playing field and duke it out.

datavirtue on Aug 22, 2022 | root | parent | next [–]


Because their business is providing a solution that IT failed to. The large cost, which the business was already accustomed to from previous IT attempts, pales in comparison to the additional costs of doing it themselves.

It's like the cloud in general, the cost is high but so is the hype. When all that dust settles over the coming years the business will start shopping on price. They will then realize they have been locked in to some extent and will need to start wriggling loose of the lock-in.

hodgesrm on Aug 24, 2022 | root | parent | prev | next [–]


> Wow their statement about not participating in benchmarking wars is alarming.

I found the Snowflake statement pretty reasonable. [0]

Vendor benchmarks are largely propaganda. What actually counts is performance on real-world workloads, starting with your own. Plus good benchmarks are costly to do well. If vendors are going to invest in load testing, it's way better to do it as part of the QA process, which directly benefits users. The other thing for vendors to do is to drop DeWitt clauses so others can run benchmarks and share the results. Snowflake announced this in the statement and also changed their acceptable use policy accordingly. [1]

[0] https://www.snowflake.com/blog/industry-benchmarks-and-compe...

[1] https://www.snowflake.com/legal/acceptable-use-policy/

Disclaimer: My company runs a cloud service for ClickHouse that competes against Snowflake.

lokar on Aug 22, 2022 | root | parent | prev | next [–]


Benchmark results rarely predict actual application perf. You need to run your own queries against your own data. Do a real POC.

danielmarkbruce on Aug 22, 2022 | root | parent | prev | next [–]


Because their value prop isn't being #1 on benchmarks. It's about

* being easy to manage
* being able to scale up and down compute so you can get good performance without having to keep a bunch of machines running.

tluyben2 on Aug 22, 2022 | parent | prev | next [–]


Funny that most people here advocate aws while they have tons and tons of foot shooting tools that cost people 1000s of usd all the time. And we just accept it. Like if you want to kill a complex cluster with one api call or button click, it won’t let you for xyz; that’s not because they cannot, it’s because you will just let it be and that makes money.

thehappypm on Aug 22, 2022 | parent | prev | next [–]


I worked at a BigQuery shop and they have a terrific feature where right next to the “Run query” button there is an estimate of the cost of the query, in bytes. It becomes extremely obvious when a query is a full table scan.

philjohn on Aug 23, 2022 | root | parent | next [–]


Ha! I wonder if we worked at the same place ... it's in the travel space, because when I worked there someone wrote a plugin that did this, and it was a real eye opener at times!

thehappypm on Aug 23, 2022 | root | parent | next [–]


Nope, not travel! For us it was a built in feature, not a plugin. This was earlier this year also.

Aulig on Aug 22, 2022 | parent | prev | next [–]


It feels like these companies haven't found the right value metric to price along. Ideally it should align with the value the customer receives.

polskibus on Aug 22, 2022 | root | parent | next [–]


Only competition can enforce this. The article ideally demonstrates the problems with monopolies and vendor lock-in.

danielmarkbruce on Aug 22, 2022 | root | parent | next [–]


Snowflake is nowhere near a monopoly, and plenty of customers have moved from other vendors (Teradata, Netezza, etc) to Snowflake - showing that vendor lock-in is not as strong as it might seem.

polskibus on Aug 23, 2022 | root | parent | next [–]


If that's the case, then why aren't Snowflake and Google targeting query optimization with higher priority to lower end-user costs? There's no incentive in the market for them to do so - once you switch to them, you'll eat up the cost of quirks and learn how to avoid them the hard way.

danielmarkbruce on Aug 23, 2022 | root | parent | next [–]


They are. See other comments about snowflake leaving $97m on the table recently, doing exactly that.

carimura on Aug 22, 2022 | root | parent | prev | next [–]


Close. Product pricing is based on a variety of perceived factors (value, cost of change, risk of loss, etc.)

altdataseller on Aug 22, 2022 | root | parent | prev | next [–]


But that's almost impossible to measure by Snowflake. How would they know how much more revenue you earned because you use Snowflake?

Rastonbury on Aug 22, 2022 | root | parent | next [–]


I don't think their customers could quantify it if they tried (and i'm not implying Snowflake doesn't give value, it probably does but how does a company attribute it)

ed25519FUUU on Aug 22, 2022 | parent | prev | next [–]


> The BigQuery example (presently, by default, `select * from table limit 10` obediently scans the entire table at your expense!) is spot-on.

This bit me on BigQuery's public patent search, which I was just noodling with for fun. Each query was $4. Ow!
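For a sense of scale, here is a minimal sketch of the arithmetic behind a per-query charge under bytes-scanned billing. The default rate of $5 per TiB is an assumption for illustration (check the current price list), and both function names are hypothetical.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def on_demand_cost_usd(bytes_scanned: int, usd_per_tib: float = 5.0) -> float:
    """Estimated charge for one query billed purely by bytes scanned.

    usd_per_tib is an assumed rate, not an authoritative price.
    """
    return bytes_scanned / TIB * usd_per_tib


def bytes_for_budget(budget_usd: float, usd_per_tib: float = 5.0) -> int:
    """How many bytes a query can scan before it costs budget_usd."""
    return int(budget_usd / usd_per_tib * TIB)


if __name__ == "__main__":
    scanned = bytes_for_budget(4.0)
    print(f"A $4 query scans about {scanned / TIB:.2f} TiB at the assumed rate")
```

Under that assumed rate, a $4 query corresponds to roughly 0.8 TiB scanned, regardless of how few rows the `LIMIT` clause returns.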

dcow on Aug 22, 2022 | parent | prev | next [–]


I was thinking about this too. Why don’t SaaS companies just force price increases to offset their broken pricing model? Nobody would care, you’re paying the same you were paying yesterday. If you’re still the best in class product with sticky features people will stay. If not and you’re competing, then you have the opportunity to reduce the price in the future or simply not increase it and let users see lower bills which might also retain them.

kolinko on Aug 22, 2022 | parent | prev | next [–]


in case of BigQuery it makes sense though - they use map reduce on distributed clusters, so there is no easy way to stop after 10 results are found

JimmyAustin on Aug 22, 2022 | root | parent | next [–]


It's pretty easy to limit the number of results returned by each partition to 10, then have that further reduced to 10 total during the reduce step.
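A minimal sketch of that idea, assuming an unordered `LIMIT` (no `ORDER BY`) and partitions exposed as lazy iterators. The function names and the row-counting helper are hypothetical, and this says nothing about how BigQuery is actually implemented.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def counted_partition(n_rows: int, counter: List[int]) -> Iterator[Dict[str, int]]:
    """Hypothetical partition reader that counts every row it actually reads."""
    for i in range(n_rows):
        counter[0] += 1
        yield {"id": i}


def limit_partition(rows: Iterable[dict], limit: int) -> List[dict]:
    """Map step: stop reading a partition once `limit` rows are collected."""
    return list(islice(rows, limit))


def limited_scan(partitions: Iterable[Iterable[dict]], limit: int) -> List[dict]:
    """Reduce step: merge per-partition results and keep `limit` rows in total."""
    merged: List[dict] = []
    for part in partitions:
        merged.extend(limit_partition(part, limit))
        if len(merged) >= limit:
            break  # a sequential driver can skip the remaining partitions
    return merged[:limit]


if __name__ == "__main__":
    rows_read = [0]
    parts = [counted_partition(1000, rows_read) for _ in range(5)]
    result = limited_scan(parts, 10)
    print(len(result), "rows returned,", rows_read[0], "rows read")
```

With parallel workers each partition would read at most 10 rows before the reduce step trims the result, so the work is bounded by 10 rows per partition rather than the full table. A query with `ORDER BY` would need a per-partition top-k instead, which still has to read every row.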

cyanydeez on Aug 23, 2022 | parent | prev | next [–]


One more deeper level: almost all consulting exists in a world of consultants driven to limit efficiency lest their billables decline. I know a few people who seemed to aggregate their entire personality to "hard worker" when they refused to progress.

scarface74 on Aug 23, 2022 | root | parent | next [–]


It depends. Many large companies have internal “Professional Services” departments with “consultants” who are full time employees.

Standard disclaimer: I work in ProServe at AWS.

When you “consult” and are employed by the company selling the software, billable hours and utilization is not the be all end all. Consulting is just the “nose of the camel in the tent”. They want you to be as efficient as possible so they can make ongoing revenue.

Trust me, AWS is not going to complain if it only took me 20 hours to do work that was estimated for 40 and brings in half as much consulting revenue if it means ongoing revenue from the customer.

There isn’t just a singular focus on utilization rates.

aiisjustanif on Aug 23, 2022 | root | parent | prev | next [–]


That’s one massively vague take on the whole industry of consulting, including on-prem software, open-source solutions.

My billable hours do fine while making operations more efficient and cost less.

soheil on Aug 22, 2022 | parent | prev | next [–]


If that lowers the barrier to entry without having expert level knowledge to know what a full table scan even means why not? Instead of hiring a dba maybe you could hire an intern instead and happily eat the cost of Snowflake.

kalimoxto on Aug 22, 2022 | root | parent | next [–]


I think the point of the article is that an optimizer doesn't affect the barrier to entry at all, but adding it would save end users quite a bit of money. So they don't do it because end users' money is revenue for Snowflake/Alphabet

soheil on Aug 22, 2022 | root | parent | next [–]


If you could just add an optimizer why doesn't the db engine just do that?

whimsicalism on Aug 22, 2022 | root | parent | next [–]


Take a step back and reread the article and the comments you are replying to.

florbo on Aug 22, 2022 | root | parent | prev | next [–]


It doesn't lower barriers to entry, it's contrary to logical expectations for someone unfamiliar with how BQ works. If the query is limited to 10 results you wouldn't expect it to scan all 2 trillion of your records. Granted there are numerous warnings in the GUI for these types of things but make this mistake in Python and you're none the wiser.

soheil on Aug 22, 2022 | root | parent | next [–]


Wait are you saying the BQ db engine is not following logical expectations? You do realize a "limit" clause doesn't prevent a full table scan in all cases, right?

horsawlarway on Aug 22, 2022 | root | parent | next [–]


and that db expert you just recommended against hiring could surely tell you that... The intern won't.

twawaaay on Aug 22, 2022 | prev | next [–]


Snowflake is not expensive. Snowflake is super cheap, IF you know what it is for and how to use it. Compared to if you had to solve the problem on your own.

The best way to describe Snowflake is that it is a brute force method to run complex queries without creating indexes.

If you have a more traditional database, you will notice you need to set up indexes to be able to get anything from it in finite time. What if you don't know the indexes upfront? What if you want your users to be able to ask arbitrary queries and get answers before bedtime?

That's what Snowflake is for. It automates using an ENORMOUS amount of hardware to get your query executed fast, very inefficiently.

It is not for free though. That inefficiency will cause a lot of resources used for queries. It is meant for those few queries when your users try to get some insight into your data and you can't predict indexes beforehand. Sometimes this is exactly what you want, like when you let your data people in to figure stuff out. Or when you have very rare functionality that allows the user to build their own queries -- which you should avoid like hell (and there are tricks to make it index pretty well) but can't always avoid.

For everything else, whenever you can predict your indexes, you always want to use more traditional database that can be very efficient on queries properly supported by indexes.

The issue is a lot of people try to use Snowflake as a database or to support frequently executing queries of the same kind. This is bad and it will cost you.

Why is Snowflake so expensive (67)

zurfer on Aug 22, 2022 | parent | next [–]


It is fair to criticize that some workloads on Snowflake are expensive.

What I found, however, is that Snowflake is indeed super cheap if we look at Total Cost of Ownership (TCO). Compared with other cloud data warehouses, it is even easy to control costs (warehouse size with autosuspend and resource monitors).

I work with many Snowflake customers and the biggest cost they are concerned with is usually training users so they don't shoot themselves (wrong joins, external programs "pinging" the service, ...).

Snowflake is mainly expensive because of usage, not because of bad query optimization.

(Co-Founder at https://www.sled.so/)

Why is Snowflake so expensive (68)

JustLurking2022 on Aug 22, 2022 | parent | prev | next [–]


Honestly, in the financial world, I think the value proposition may be less about anything to do with the query capabilities and more about the permissions model. Making it simple to provide clients with visibility into their data in a structured way that doesn't involve shoveling around text files (with numerous formatting gremlins to worry about) is a huge win in and of itself.

Why is Snowflake so expensive (69)

ssalka on Aug 22, 2022 | parent | prev | next [–]


> The issue is a lot of people try to use Snowflake as a database or to support frequently executing queries of the same kind. This is bad and it will cost you.

It seems totally natural to expect these use cases to be well-supported & cost-efficient. That they're not I think is likely to be misunderstood by a great many people, even technical folks.

Why is Snowflake so expensive (70)

danielmarkbruce on Aug 22, 2022 | parent | prev | next [–]


Materialized views help with this. It might not be perfect, but it isn't as bad as you say.

Why is Snowflake so expensive (71)

deep_red on Aug 23, 2022 | parent | prev | next [–]


Very clear-minded take on what Snowflake is great and not so great for. Snowflake is great and cheap for what it's meant for. It gets expensive when you try to use it for something it was not designed for.

Why is Snowflake so expensive (72)

stassajin on Aug 22, 2022 | prev | next [–]


I'm the author of the article. Didn't expect it to blow up. Let me clarify a few points:

1. I like Snowflake and I think they brought several innovations to the field: instant scale out/up, time-travel, unstructured data query support.

2. Snowflake obviously makes innovations and performance improvements, otherwise they would not be the market leader they are. But I also suspect that they make just enough performance improvements to be at par and then use the vendor lock-in features to make switching hard.

My argument is that their rate of performance innovation has considerably gone down and DataBricks, Firebolt, and open source alternatives just seem more attractive from a cost/performance ratio. I agree that Snowflake is still the best data-warehouse to start with if you have 100k, but not if you truly plan for a multi-year horizon and your usage expands.

- Redshift also brought a lot of innovation that allowed people to execute analytical queries 100x-1000x faster than any OLTP that existed out there. I've used Redshift for four years and they kept ignoring performance and features until Snowflake came out. All of a sudden because of competitor pressure, they put more effort into the product to maintain and gain market share. My hope is that Snowflake finds a solution to their innovator's dilemma, since competitors are hot on their tails.

- Some people point out that 70% usage growth just shows that Snowflake is useful. Nobody disagrees with that. The issue is that majority of the companies don't experience a 70% revenue growth to catch up with the growth in costs. At some point, you have to clamp down on costs, which means that you have to look for alternatives to run things more efficiently.

Why is Snowflake so expensive (73)

mejakethomas on Aug 22, 2022 | parent | next [–]


Totally agree with Redshift sentiments. It's been lovely seeing BigQuery and Redshift step their game up over the past 1.5yrs, because they really should have been doing certain things for many years prior.

Re: Firebolt, I don't consider it to be in the same class as Snowflake whatsoever (even though their advertising seems to indicate otherwise). Snowflake is like a very powerful swiss army knife. Firebolt is good for a very specific (dare I say niche?) workload but falls all over itself for the vast majority of data org needs.

Why is Snowflake so expensive (74)

mr_toad on Aug 23, 2022 | root | parent | next [–]


> Firebolt is good for a very specific (dare I say niche?) workload but falls all over itself for the vast majority of data org needs.

It runs SQL queries on structured data. Is that niche?

Why is Snowflake so expensive (75)

evtx on Aug 23, 2022 | parent | prev | next [–]


Stas: "The issue is that majority of the companies don't experience a 70% revenue growth to catch up with the growth in costs"

I think you are misunderstanding something very fundamental here. Snowflake has usage pricing and no one is forcing companies to use Snowflake 70% more every year. In my experience, companies are typically evaluating spend on other platforms and after some testing, moving additional workloads there to displace cost elsewhere. Let's say your Snowflake bill was $100k and you were unhappy with your security data lake provider and replace a $1M bill there with $200k of Snowflake. Your Snowflake bill has now increased 200% to $300k, but you are still $800k ahead overall. In other words, your existing workload (the original $100k) didn't get more expensive.

I've worked in data warehousing for a lot of years now and stepping back, I guess I don't understand what you are trying to accomplish here. I certainly think everyone should take a "trust but verify" approach with their vendors but honestly, I don't think you've proven your case, especially since you appear to completely ignore the competitive reality these vendors live in. Beyond that, I don't think "speeds and feeds" are the most important improvements going on with these platforms at the moment. Check the monthly release notes:

BigQuery: https://cloud.google.com/bigquery/docs/release-notes

Databricks: https://docs.databricks.com/release-notes/product/index.html

Snowflake: https://docs.snowflake.com/en/release-notes.html
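The consolidation arithmetic in that example can be checked with a short sketch. The figures are the illustrative ones from the comment ($100k Snowflake bill, $1M bill elsewhere replaced by $200k of Snowflake); the function name and the assumption that the other provider's bill drops to zero are hypothetical.

```python
# Illustrative figures taken from the comment above; not real billing data.
def consolidation_summary(snowflake_before, other_before, snowflake_added):
    """Compare total spend before and after moving a workload onto Snowflake."""
    total_before = snowflake_before + other_before
    snowflake_after = snowflake_before + snowflake_added
    # Assumption: the other provider's bill is fully replaced, so it drops to zero.
    total_after = snowflake_after
    return {
        "snowflake_increase_pct": 100 * (snowflake_after - snowflake_before) / snowflake_before,
        "total_before": total_before,
        "total_after": total_after,
        "net_savings": total_before - total_after,
    }


summary = consolidation_summary(100_000, 1_000_000, 200_000)
print(summary)
```

The Snowflake line item triples while total spend falls, which is why a rising vendor bill alone says little about whether existing workloads got more expensive.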

Performance is important but it doesn't exist in a vacuum. What percentage of features in the past two months for each of these platforms relate to performance? On the flip side, how much does your company spend on things like data governance? How much would a data breach cost? How many people maintain the platform? What do pipeline failures cost? How is connectivity to other solutions your company uses?

If you look at where innovation is happening (and this is a VERY interesting space these days), the bulk of improvements are in areas arguably more important to companies. BigQuery has added migration improvements, Databricks has added Photon and Unity Catalog improvements, Snowflake has added Java and Python stored procedures. The list is miles long for all of these vendors and I challenge anyone in the space to keep up with everything.

Another comment here said all of these vendors are within 10-20% performance of each other. If that is true, in my opinion you're focused on a problem that is an edge case at best. Something to watch, but not nearly as interesting or as impactful as the rapid pace of innovation across this space in all areas. IMHO.

Why is Snowflake so expensive (76)

stassajin on Aug 23, 2022 | root | parent | next [–]


"In my experience, companies are typically evaluating spend on other platforms and after some testing, moving additional workloads there to displace cost elsewhere"

Fair point, some of that net revenue increase is because of consolidation of workloads, although the majority of the cost is likely still driven by consumers expanding usage beyond what they expected. As I mention in my article, the second part of the increase in costs has to do with data governance, and my argument is that Snowflake doesn't make governance easy. Why can't they stand up an IAM-like service with a nice UI and dashboards? Why can't they make integrations with PagerDuty, Slack, and email work out of the box? Why can't I specify team-based budgets and instead have to do it on a per warehouse-team basis? Why do I have to build custom bespoke tooling on top to make governance work?

I can unequivocally say that at a certain scale you need to move on and that Snowflake and many of the SaaS providers are too expensive even at medium scale companies. This article describes this paradox better than I could: https://a16z.com/2021/05/27/cost-of-cloud-paradox-market-cap...

Moreover Snowflake's enterprise pricing model is even more non-scalable. Why do companies often have to pay two times higher price per credit relative to the standard model? Shouldn't guarantees on security or support come with a fixed cost? Shouldn't enterprise offer economies of scale in pricing?

I also wish folks would read my article from end to end because my conclusion in the article is that you don't really have a choice but to use an enterprise solution when your scale is small. If I had to start my own company and had only 2 data engineers, you betcha I would use Snowflake and DataBricks.

---btw, it really surprises me that nobody has commented on the workload manager. Am I the only one seeing that as an issue? I have enough exposure to compare it with Redshift and I can say that Snowflake's workload manager is just very bad at optimizing throughput.

Why is Snowflake so expensive (77)

evtx on Aug 23, 2022 | root | parent | next [–]


I read your link. My immediate reaction:

1) I think Andreessen Horowitz has probably oversimplified the issue based on the Dropbox outlier. It's easy to say you can build your own datacenter to manage stuff but the costs in people are really hard to offset, especially with the security posture and level of 999s that most companies need. Not only that, but throw in disaster recovery, so now you've doubled the costs (two data centers). Etc. Plus hardware ages rapidly--you want to pay for the "floor sweeps" (as Teradata used to call them) every few years?

Beyond the complexity, some of these companies simply could not exist without the Cloud. Take Snowflake. How big of a data center would they need? How many servers? How much disk? How do they know if Dropbox wants to load 1 GB, 1 TB, or 1 PB of data? Answer: they don't. This type of model only works if you can leverage the essentially unlimited scale of the cloud providers. I don't miss the days of loads failing due to being out of disk space and having to scramble around trying to find things to delete.

2) Regarding pricing policy, Snowflake makes it very clear which features are included with which edition: https://docs.snowflake.com/en/user-guide/intro-editions.html

Your link also says Snowflake paid 44% of their revenue in 2021 on Cloud. If that is true, perhaps Snowflake loses money at standard edition, and presumably there is a larger internal cost to supporting some of the higher end features like Private Link that Snowflake needs to recapture. Regardless, as a Snowflake customer, I can determine what features I need and decide if the price they are charging is worth it or if I should look elsewhere. I can say from experience that some of these features and even paper certifications aren't easy and can be very expensive to maintain.

I will tell a story that will age me. Decades ago I used to work for a company that needed a "business continuity" plan. We had to show that we could continue to function if our data center was destroyed by natural disaster. We paid a company in another region that had essentially a copy of all of our hardware, and once a year we'd send our backups there and bring up all of the systems to prove we could. As you might imagine, this service was insanely expensive.

Flash forward to now. Snowflake has a feature called failover/failback with connection redirect. With a few commands, you can replicate your entire database elsewhere, you can incrementally keep the remote target up-to-date, and you can test it as often as you like with connections failing over generally in under 1 minute. If your company needs something like this, how much would that cost to build yourself? Maintain? Test? Clearly there must be customers who did that evaluation and decided that Snowflake's approach is way cheaper. If you disagree, don't use that level of service, or build it yourself. You say SaaS providers are "too expensive" and that even "medium scale" companies can do better themselves, but that isn't my experience.

3) As discussed in the previous comment, no doubt Snowflake can make improvements. However, what I see from my limited view as a (probably much smaller) customer, Snowflake is doing that. In fact, two of those improvements you call out are already in private preview and were discussed at their recent conference. If my company was briefed on these features post Summit, I'd be highly surprised if yours wasn't.

Why is Snowflake so expensive (78)

stassajin on Aug 24, 2022 | root | parent | next [–]


Thanks for sharing your perspective. It's always useful to get a more experienced viewpoint. I agree that managing hardware is not something that should be taken lightly and that only companies at scale can and should do that: Uber, Facebook, Dropbox. I'm not pushing for managing your own data centers, but I'm hopeful that as open source gets better and more data engineers learn the craft, it will become cheaper to run things yourself once your Snowflake bill is in the millions per month.

Why is Snowflake so expensive (79)

beoberha on Aug 22, 2022 | prev | next [–]


I disagree with the assertion that Snowflake has no incentive to improve performance. While I don’t work for Snowflake, I work for a competitor and we’re constantly looking to improve performance to make customers happy.

For the exact reason that the article claims Snowflake wouldn’t innovate, I’d assert that they would. If they are expensive and slow, and a competitor is faster and cheaper, eventually they will see business move to the competitor. We see it all the time.

Why is Snowflake so expensive (80)

PaulWaldman on Aug 22, 2022 | parent | next [–]


Churn for these services takes a long time. They are "sticky" and have the baggage of enterprise agreements. With the switching costs never being zero, if SLAs are being met, it's exceedingly difficult to switch vendors.

Alternatively there is a faster impact on new sign-ups when falling behind competitors on costs and benchmarks.

Why is Snowflake so expensive (81)

cs702 on Aug 22, 2022 | root | parent | next [–]


Exactly. For enterprise customers in particular, replacing a SaaS tool that's deeply intertwined with many internal systems is about as easy and convenient as it is for a homeowner to rip out his/her home's existing HVAC system to replace it with a newer, more efficient one. No one ever wants to do that -- unless there's absolutely no other choice.

Why is Snowflake so expensive (82)

dominotw on Aug 22, 2022 | root | parent | prev | next [–]


Their stock price is pegged at new customer acquisition. They signed up over 6k new customers last qtr. This is one of their top stats that they present to investors.

Why is Snowflake so expensive (83)

beoberha on Aug 22, 2022 | root | parent | prev | next [–]


I worded it poorly, but I don’t necessarily mean a full exodus from the platform. In my experience, large enterprises have a lot of workloads running on different technologies (for whatever reasons) and the migration to cloud is a multi-year effort. If someone is just dipping their toe into Snowflake with easy-to-migrate workloads (which is very likely given their relative age in the market) and sees performance and cost issues with those workloads, they may be hesitant to migrate the bigger ones and use that as leverage to get Snowflake to improve.

Why is Snowflake so expensive (84)

danielmarkbruce on Aug 22, 2022 | root | parent | prev | next [–]


They are all out to get new logos. They spent about $800m on S&M TTM v $1.4 bill rev. They aren't milking their customer base for cashflow.

And large customers are moving to them in droves.

Why is Snowflake so expensive (85)

tomnipotent on Aug 22, 2022 | root | parent | prev | next [–]


> have the baggage of enterprise agreements

Snowflake lets you roll into pay-as-you-go after a contract expires.

Why is Snowflake so expensive (86)

wpietri on Aug 22, 2022 | parent | prev | next [–]


Could you say more about the relative market position of your two companies?

I don't know the market at all, but Snowflake is certainly large and successful (IPOed in 2020, $50bn market cap). I could readily imagine that a company doing so well might not feel the incentive to improve very strongly. Or that they might see themselves more as a sales/marketing-led company than one where technical quality is a key driver. Whereas you folks as a challenger would have a lot more incentive to differentiate yourselves.

Why is Snowflake so expensive (87)

beoberha on Aug 22, 2022 | root | parent | next [–]


You could probably google my username and find out, but I’ll say we’re bigger than Snowflake and are very much entrenched in the enterprise database market :)

Why is Snowflake so expensive (88)

carlineng on Aug 22, 2022 | prev | next [–]


[Disclaimer: former Snowflake employee]

Snowflake is not expensive because of perverse incentives, which is the primary claim of the article. It is expensive because it is a highly differentiated and very sticky product.

As others have mentioned, competition is the ultimate incentive to work on performance. Every dollar of Snowflake revenue is a dollar of revenue that Amazon, Google, Microsoft and Databricks are fighting for.

Why is Snowflake so expensive (89)

klysm on Aug 22, 2022 | parent | next [–]


They aren’t exclusive. They also have perverse incentives to leave optimization gremlins in, even if they are very low hanging fruit to remove. They also have the incentive to not document them well.

Why is Snowflake so expensive (90)

daniel-cussen on Aug 22, 2022 | root | parent | next [–]


Oh like injecting jitter so there's no consistency in measurement?

Why is Snowflake so expensive (91)

discodave on Aug 22, 2022 | parent | prev | next [–]


> Every dollar of Snowflake revenue is a dollar of revenue that Amazon, Google, Microsoft and Databricks are fighting for.

This is true, but misses one detail...

Snowflake runs in the cloud so every dollar of Snowflake revenue is roughly $0.40^1 of Amazon/Google/Microsoft revenue anyway.

^1: Snowflake's gross margin is in the range of 50-60% https://www.macrotrends.net/stocks/charts/SNOW/snowflake/gro...
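A rough sketch of how the footnote's margin range maps to the per-dollar figure. It assumes, as a simplification not stated in the thread, that all of Snowflake's cost of revenue is paid to the cloud providers; the function name is hypothetical.

```python
def cloud_share_of_revenue(gross_margin):
    """Fraction of each revenue dollar that goes to cost of revenue.

    Assumption of this sketch: all cost of revenue is cloud provider spend.
    """
    return 1.0 - gross_margin


# Gross margin range quoted in the footnote above (50-60%).
for margin in (0.50, 0.60):
    share = cloud_share_of_revenue(margin)
    print(f"gross margin {margin:.0%}: about ${share:.2f} per revenue dollar")
```

Under that assumption the range works out to roughly $0.40 to $0.50 per dollar, so the $0.40 figure corresponds to the upper end of the margin range.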

Why is Snowflake so expensive (92)

mejakethomas on Aug 22, 2022 | parent | prev | next [–]


This, 100%.

It eats/consolidates formerly-disparate costs around the org. Because it's so good.

Which makes it look expensive.

Why is Snowflake so expensive (93)

shrimalpreeti on Aug 22, 2022 | prev | next [–]


[Disclaimer: I work for a company that offers a Snowflake Cost Optimizer product]

We’re an open-source monitoring & alerting tool and many of our users were using it to set alerts on their warehousing (Snowflake) costs. The problem is particularly bad with Snowflake due to its lack of query level attribution of costs and no in-built features for monitoring or recommendations on improvements. We’re building a Snowflake Cost Optimizer (https://www.chaosgenius.io/snowflake-cost-optimizer.html) and are hearing the same feedback from our customers as the author mentions. Snowflake is definitely coming up with features towards better cost transparency but I wonder if it’s too little too late.

Why is Snowflake so expensive (94)

evtx on Aug 22, 2022 | parent | next [–]


In my experience Snowflake is very receptive to enhancement requests. If you feel Snowflake should be doing something better for surfacing optimizations, I'd ask them.

That said, I'm not sure your comment is fully accurate:

1) "lack of query level attribution of costs"

Snowflake doesn't charge per query so there can't be default query level attribution of cost. Snowflake charges by second of warehouse use. But you CAN easily see which queries ran on which warehouse and allocate costs back to that using your own criteria (by query second, usually better than by number of queries).

2) "no in-built features for monitoring"

Snowflake has built in cost monitoring dashboards: https://docs.snowflake.com/en/user-guide/cost-overview.html

And resource monitors: https://docs.snowflake.com/en/user-guide/resource-monitors.h...

That said, I'm sure improvements could be made. Ask for them. There must be a market for this because Capital One and Acceldata and others offer similar solutions for optimization recommendations.
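As a minimal sketch of the "allocate by query second" idea mentioned above: split one warehouse's cost across teams in proportion to their query runtime. The team labels, the sample numbers, and the proportional rule are assumptions for illustration, not a Snowflake feature or API.

```python
from collections import defaultdict


def allocate_by_query_seconds(warehouse_cost, queries):
    """Split one warehouse's cost across teams in proportion to query runtime.

    `queries` is a list of (team, elapsed_seconds) pairs. Both the team labels
    and the proportional rule are assumptions of this sketch.
    """
    seconds_by_team = defaultdict(float)
    for team, elapsed in queries:
        seconds_by_team[team] += elapsed
    total_seconds = sum(seconds_by_team.values())
    if total_seconds == 0:
        return {}  # nothing ran, so there is nothing to attribute
    return {
        team: warehouse_cost * secs / total_seconds
        for team, secs in seconds_by_team.items()
    }


# Hypothetical query log for one warehouse that cost 1000 (any currency unit).
queries = [("analytics", 300.0), ("finance", 100.0), ("analytics", 100.0)]
shares = allocate_by_query_seconds(1000.0, queries)
print(shares)
```

Because billing is per second of warehouse uptime, this rule also spreads idle time across whoever ran queries; other criteria could be chosen instead.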

Why is Snowflake so expensive (95)

mejakethomas on Aug 22, 2022 | root | parent | next [–]


This. Snowflake introspection five years ago looks very, very different than today. Mostly due to enhancement requests.

Why is Snowflake so expensive (96)

mritchie712 on Aug 22, 2022 | prev | next [–]


I predict[0] we'll see more people choosing Clickhouse over Snowflake in the next 5 years. Clickhouse will get reasonably feature compatible with Snowflake and give people a better escape hatch if they want to self-host their data stack. Clickhouse, Inc is building a cloud product that abstracts away the complexity and there's already companies like Altinity that will spin up a cluster for you in minutes.

0 - https://blog.luabase.com/clickhouse-for-data-nerds/

Why is Snowflake so expensive (97)

ramesh31 on Aug 22, 2022 | parent | next [–]


Isn't Clickhouse a hosted SQL DBMS? Not really comparable to a cloud data lake.

Snowflake/Databricks scales infinitely across cloud object stores like S3. Clickhouse is run as a single (or sharded) process that uses the local file system like any other SQL database, and requires volume provisioning as your data scales. It also has a fixed run cost (EC2 or wherever it's hosted) versus an "on-demand" model where read clusters are spun up to run queries against static objects that have no fixed cost other than storage pricing.

Why is Snowflake so expensive (98)

morelisp on Aug 22, 2022 | root | parent | next [–]


ClickHouse can access non-local storage without issue (or at least, with only issues for some of them - HDFS and S3 seem to work fine, I've had less luck with real-time Kafka). I'm not sure how well it scales horizontally for such uses; you can hack something up with macros that isn't too painful but there may also be better options.

However, it's probably not a great pick if you're already struggling with the operations side of things, which seems to be the main selling point for services like Snowflake.

Why is Snowflake so expensive (99)

hodgesrm on Aug 22, 2022 | root | parent | prev | next [–]


ClickHouse only has fixed run cost if you configure it that way. We run ClickHouse clusters in AWS / GCS using block storage in our cloud platform. You can scale VMs up and down vertically in minutes, and scale horizontally in the same amount of time. The model works great for SaaS use cases that require constant response at all times and scale over days or weeks rather than minutes. Real-time analytic apps that show tenant dashboards or generate recommendations for users on ecommerce sites have this characteristic.

I don't think there's really a right or wrong answer here, just trade-offs.

Disclaimer: I work on Altinity.Cloud, a platform for managed ClickHouse

Why is Snowflake so expensive (100)

KingOfCoders on Aug 22, 2022 | root | parent | prev | next [–]


In which way not comparable?

Why is Snowflake so expensive (101)

nycdatasci on Aug 22, 2022 | root | parent | next [–]


From the article: "JOIN's are also not nearly as performant as in other cloud data warehouses." This seems like a pretty significant limitation.

Why is Snowflake so expensive (102)

morelisp on Aug 22, 2022 | root | parent | next [–]


That's... literally comparing them. The comparison for some use cases might not be favorable for ClickHouse, but they're comparable.

(IMO the slowness of ClickHouse joins has been overstated, especially since its many-column table support is so good you'll probably be fine joining on insert instead.)

Why is Snowflake so expensive (103)

mritchie712 on Aug 22, 2022 | root | parent | prev | next [–]


Yes, this is one major hurdle they need to overcome, but I think they'll (Clickhouse Inc + the community) pull it off. It's a current weakness but by no means unsolvable.

Why is Snowflake so expensive (104)

SnowHill9902 on Aug 22, 2022 | parent | prev | next [–]


Clickhouse is incredible software. It only feels a little foreign when coming from Postgres (e.g. some CamelCase terms).

Why is Snowflake so expensive (105)

mritchie712 on Aug 22, 2022 | root | parent | next [–]


Yeah, the CamelCase throws me too, especially since it's mixed in with snake_case (e.g. date_trunc[0])

0 - https://clickhouse.com/docs/en/sql-reference/functions/date-...

Why is Snowflake so expensive (106)

zX41ZdbW on Aug 22, 2022 | root | parent | next [–]


camelCase - native functions

SQL_STYLE_CASE - compatible functions

Why is Snowflake so expensive (107)

brianwawok on Aug 22, 2022 | prev | next [–]


Ran into the same exact thing at CircleCI.

Me: My builds are really slow

CircleCI: Here are a few very low effort answers

Me: git checkout is taking literally 60 seconds, but it takes 3 seconds locally, why?

CircleCI: Mumble Mumble.

They charge per minute, so why would they care if builds are slow? Was about a year of this getting worse and worse, till I finally cancelled the service last week and built my own server in my basement.

I now get 200% faster builds, and the hardware payback time is not very long (6 months of my CircleCI bill?).

I think it's a huge red flag anytime the metric you care about is something that being "worse" makes the provider more money.

Why is Snowflake so expensive (108)

hangonhn on Aug 22, 2022 | parent | next [–]


100% and not just in tech: when a party’s incentives aren’t aligned with yours, you’ll often find yourself getting little help or even working in opposition to each other. We recently experienced this with filing for a health related insurance claim. My wife wondered why they kept losing stuff, not doing what they promised, or asking for more paper work. I kept explaining to her that while not necessarily malicious, they have very little incentive to improve that department.

Always try to find partners or counter parties who win when you do as well. I know we don’t always have that luxury but sometimes a little headache initially is better than being stuck with someone who works in opposition to you in the long run.

Thanks so much for sharing your story. We are in the process of outsourcing some of our Jenkins functionality and these stories are useful to hear.

Why is Snowflake so expensive (109)

sremani on Aug 22, 2022 | parent | prev | next [–]


From the front page of CircleCI:

> Industry-leading speed
>
> As soon as you think it, you can deliver it. Your developers’ time is too important to waste. No other CI/CD platform takes performance as seriously as we do. Your pipelines should accelerate your business, not slow you down.

Rule of thumb: Anyone talking about their honesty is not honest.

Why is Snowflake so expensive (110)

thexumaker on Aug 22, 2022 | root | parent | next [–]


We did the same thing but with self hosted runners with github actions.

https://github.com/philips-labs/terraform-aws-github-runner

philips-labs has some good resources for scaling this up as well.

Why is Snowflake so expensive (111)

mikewhy on Aug 22, 2022 | parent | prev | next [–]


I love that CircleCI flaunts its speed compared to other providers, meanwhile we can clearly see the CircleCI steps take the longest in our builds.

Not to mention the constant failures.

Why is Snowflake so expensive (112)

josephcsible on Aug 22, 2022 | parent | prev | next [–]


> They charge per minute, so why would they care if builds are slow?

It's worse than just not caring: they have a direct financial incentive to make sure your builds are as slow as you'll tolerate.

Why is Snowflake so expensive (113)

icedchai on Aug 22, 2022 | parent | prev | next [–]


At a previous startup, we dumped CircleCI and switched to Jenkins on our own EC2 instance. We had a lot fewer problems. (This was way back in 2016, I'm sure things have improved now.)

Why is Snowflake so expensive (114)

brianwawok on Aug 22, 2022 | root | parent | next [–]


Yup!

I ended up doing TeamCity over Jenkins, but they do the same thing.

Amazing how fast a 32C / 64T EPYC server in my basement can be..

Why is Snowflake so expensive (115)

icedchai on Aug 22, 2022 | root | parent | next [–]


I can only imagine! I have a 3950X here (16C / 32T) with 128 gigs of RAM and it is incredible. Total overkill for a home lab though.

Why is Snowflake so expensive (116)

Fatnino on Aug 22, 2022 | parent | prev | next [–]


Those app rental scooters that are littered around city centers: you pay for distance as well as for time. And that's why they don't go very fast.

Why is Snowflake so expensive (117)

gkoberger on Aug 22, 2022 | root | parent | next [–]


No, they're legally limited to 15 MPH for safety. You also don't pay for distance, just time. Not everything is a conspiracy theory.

SF: https://www.williamweisslaw.com/sf-e-scooter-laws/

NYC: https://www1.nyc.gov/html/dot/html/bicyclists/ebikes.shtml#:....

Why is Snowflake so expensive (118)

CrazyStat on Aug 22, 2022 | root | parent | prev | next [–]


Also because going fast on those things is f*cking dangerous, especially when (like most people riding them) you're not wearing a helmet.

Why is Snowflake so expensive (119)

wpietri on Aug 22, 2022 | root | parent | next [–]


Absolutely. I don't have a problem with most scooter owners, but here in a city with a lot of tourism, the rental scooters are often a menace. I was waiting outside of a restaurant and took half a step back to let some people through. I brushed against something moving fast, and it was some tall yahoo going downhill at max speed on a scooter. If I'd taken a full step back, somebody would have needed medical treatment.

Learning how to ride one of those in a city takes time, practice, and thought. Which you will surely get if you buy one. But apparently not so for the rentals.

Why is Snowflake so expensive (120)

djbusby on Aug 22, 2022 | root | parent | prev | next [–]


And have been drinking (it's what I saw a lot of)

Why is Snowflake so expensive (121)

SkyMarshal on Aug 22, 2022 | root | parent | prev | next [–]


In that particular case another reason could be that they quite reasonably don't want you going very fast for safety and liability concerns.

Why is Snowflake so expensive (122)

alberth on Aug 22, 2022 | prev | next [–]


This is all much simpler than the post makes it sound.

It's usage-based pricing and customers are using more of it.

> a customer that joins a year ago and spends $1 is paying out well over $1.7 a year later

The entire article is based on this 1.7x "net dollar expansion" statement.

After integrating Snowflake, customers have found value in using Snowflake and are using more of it 1 year later.

Since Snowflake is billed on usage, that explains the net-dollar expansion.
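To make the arithmetic concrete, here is a minimal sketch. Only the 1.7x ratio comes from the quote above; the price and credit counts are made-up numbers chosen to show that the ratio can come from usage growth alone, with the unit price unchanged.

```python
def net_dollar_expansion(spend_year_ago: float, spend_now: float) -> float:
    """Ratio of what an existing customer cohort spends now vs. a year ago."""
    if spend_year_ago <= 0:
        raise ValueError("spend_year_ago must be positive")
    return spend_now / spend_year_ago


# Hypothetical cohort: the unit price is identical in both years,
# only the number of credits consumed grows.
price_per_credit = 2      # assumed, illustrative
credits_year_ago = 500    # assumed, illustrative
credits_now = 850         # assumed, illustrative

ratio = net_dollar_expansion(
    price_per_credit * credits_year_ago,
    price_per_credit * credits_now,
)
print(f"{ratio:.2f}x")  # prints "1.70x"
```

A 1.7x figure on its own therefore does not say whether customers are paying more per unit of work or simply doing more work.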

Why is Snowflake so expensive (123)

benjaminwootton on Aug 22, 2022 | prev | next [–]


The monthly bill does make me wince, but Snowflake of course includes all server and compute costs, no installation, initial configuration or upgrades etc. It’s genuine SaaS.

It’s also very simple to manage and optimise so less DBA or DevOps type manpower.

Then of course you can perfectly right size your instances and pay by the second for compute and by the byte for storage.

Expensive, but lower TCO than alternate approaches I suspect.

Why is Snowflake so expensive (124)

jeffwask on Aug 22, 2022 | parent | next [–]


Yeah...100%. It's expensive til you try running a data warehouse yourself and have to hire in to support it.

Like any other service there are scale points where it no longer makes sense but for most smaller orgs it's still a bargain over DIY

Why is Snowflake so expensive (125)

nojito on Aug 22, 2022 | root | parent | next [–]


We did a cost analysis and found databricks and BQ to be cheaper than a similar snowflake build out.

I think people are falling into a trap of not considering costs because “it takes care of everything”.

Why is Snowflake so expensive (126)

dominotw on Aug 22, 2022 | root | parent | next [–]


> cost analysis and found databricks and BQ to be cheaper than a similar snowflake build out.

Wouldn't this mean Snowflake has priced their product uncompetitively? Why would they do that if it's so obvious that everyone would just save money by switching to DB?

> I think people are falling into a trap

This is their product strategy? To take advantage of gullible businesses falling into their trap?

Surely building a whole business around customers falling into a trap has to backfire at some point.

Why is Snowflake so expensive (127)

Keyframe on Aug 22, 2022 | parent | prev | next [–]


> It’s also very simple to manage and optimise so less DBA or DevOps type manpower.

> Then of course you can perfectly right size your instances and pay by the second for compute and by the byte for storage.

These two are connected vessels.

Why is Snowflake so expensive (128)

benreesman on Aug 22, 2022 | prev | next [–]


Alright I’ll bite finally. What do these companies do? Neither Snowflake’s front-facing website, nor the Wikipedia article, nor this post tell me why people pay all this money.

I know a bit about the effort involved in chucking around 100 petabyte datasets, and there are numerous niches a SaaS could fill in there, but it’s very murky from the outside.

Why is Snowflake so expensive (129)

Croftengea on Aug 22, 2022 | parent | next [–]


I was wondering the same thing. This sums it up pretty well, I guess:

> The best way to describe Snowflake is that it is a brute force method to run complex queries without creating indexes.

(https://news.ycombinator.com/item?id=32554072)

Why is Snowflake so expensive (130)

benreesman on Aug 22, 2022 | root | parent | next [–]


Column stores on DFS are without a doubt tricky beasts. It’s a very rich field technically.

I guess I’m trying to get a read on whether their core competency / moat is distributed columnar query technology or sales/support/marketing.

Why is Snowflake so expensive (131)

colinmhayes on Aug 22, 2022 | root | parent | next [–]


Snowflake is slower and more expensive than competitors. I'd say its moat is mostly that it's extremely easy to set up and start using without technical support. If you've just got a small team and no one wants to do data engineering, Snowflake makes that possible, or at least much easier. Most users are generally happy, and they've followed the cloud playbook of making it hard to switch off, so even when teams have scaled to the level where secondary indexes and data support staff make sense, the team is still happy with Snowflake.

Why is Snowflake so expensive (132)

joelthelion on Aug 22, 2022 | root | parent | prev | next [–]


But why not create indexes? I mean, I understand why sometimes you don't want an index. But building an entire warehouse around the idea of "no indexes", really?

Why is Snowflake so expensive (133)

benreesman on Aug 22, 2022 | root | parent | next [–]


My experience with "Big Data" is pretty dated, 5 years at least. At that time I think a good cutoff for "big data" might have been like a petabyte +/- a factor of 10 depending on your gear. I imagine now even 1PB is probably pretty mild by "big data" standards.

But once you're up in that "I can't even fit this in a 4-8U sled" territory (whatever it is in a given decade) you're probably doing some kind of map/reduce thing, so there's a strong incentive to have a column-major layout. If you can periodically sort by some important column so much the better (log2 n binary search), but mostly you've got a bunch of mappers (which you work hard to get locality on relative to the DFS replicas where the disks live, maybe on the same machine, maybe in the same top-of-rack switch or whatever) zipping through different columns or column sets and producing eligible conceptual "rows" to go into your "shuffle/sort/reduce" pipeline to deal with joins and sorts and stuff like that.
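A toy sketch of the "sort by an important column, then binary search" idea, in plain Python. The column names and values are invented for illustration, and this is a single-machine picture, not how any particular engine lays out data on a DFS.

```python
from bisect import bisect_left, bisect_right

# Column-major layout: each column is its own array, and a conceptual
# "row" is just an index shared across the arrays.
# Hypothetical data, sorted by user_id.
user_id = [3, 7, 7, 12, 19, 19, 19, 42]
amount = [10, 5, 8, 20, 1, 2, 3, 99]


def rows_matching(sorted_col, value):
    """Return the index range where sorted_col == value, in O(log n)."""
    lo = bisect_left(sorted_col, value)
    hi = bisect_right(sorted_col, value)
    return range(lo, hi)


def sum_amount_for(uid):
    """Touch only the matching slice of the amount column."""
    return sum(amount[i] for i in rows_matching(user_id, uid))


print(sum_amount_for(19))  # 1 + 2 + 3 = 6
```

Filtering on the sorted column never reads the non-matching part of `amount`; filtering on an unsorted column would mean a full scan of that column.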

I don't know how Google does it, but I think most everyone else started with something like the Hadoop ecosystem and many with something like Hive/HQL to give a SQL-like way to express that job, especially for ad-hoc queries (long-lived, rarely changing overnight jobs might get optimized into some lower-level representation).

Around the time I was getting out of that game, Spark was starting to get really big, which was due to some combination of RAM getting really abundant and just kind of a re-think on what was by then a pretty old cost model. I have no idea what people are doing now.

I'd love it if someone with up-to-date knowledge about how this stuff works these days chimed in.

Why is Snowflake so expensive (134)

buttaphingas on Aug 22, 2022 | root | parent | prev | next [–]


It's all around the ethos of ease of use. Snowflake does a lot of smarts in the background so that you don't have the overhead of managing indexes. And not just indexes, there is just less human intervention required overall compared to something like Teradata or even a modern lakehouse.

That said, they've kind of introduced it with the Search Optimization Service, which is like an index across the whole table for fast lookups, but even that is automatically maintained on your behalf.

Why is Snowflake so expensive (135)

idunno246 on Aug 22, 2022 | root | parent | prev | next [–]


these tend to be for one-off analytical queries. you want every user with flag X >10 joined against five other tables each with similar filters. you don't know ahead of time what that query is, your analyst thought of it this morning, so you can't make indices ahead of time. and it'll never run again so you don't need to take the performance hit keeping an index. and someone has to decide which indices to keep, but app engineers aren't best utilized figuring out indices for analysts.

not managing indices is nice, but the bigger selling feature for me is if you have many services, and each service's data is in the warehouse, you can join against them all together.

Why is Snowflake so expensive (136)

DebtDeflation on Aug 22, 2022 | parent | prev | next [–]


Snowflake is a data warehouse in the cloud. In the past, companies would have spent a fortune on Oracle or Teradata licenses and a fortune on on-prem hardware to run it on. Now they spend it on Snowflake and run it on AWS, etc. Same story as with any SaaS product - cheap and easy to get started, only pay for what you use, but over time the costs... get big.

Why is Snowflake so expensive (137)

glenjamin on Aug 22, 2022 | prev | next [–]


I don't know if this is the case at Snowflake, but there are similar seemingly misaligned incentives with CircleCI's build-seconds-based pricing model.

However, the generally accepted wisdom there was that improving performance had always led to more builds being run - and so still came out as a net-positive. This had happened a bunch of times as we upgraded CPUs or storage drivers or the version - there'd be a short term drop in direct revenue, but then it would bounce back quickly as people took advantage of being able to do more stuff in the same amount of time.

I'm told the revenue and finance people were pretty concerned the first time it happened though!
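A rough sketch of that break-even arithmetic under per-second billing. Every number here is hypothetical (not CircleCI's actual pricing or volumes), and amounts are in integer cents to avoid float noise.

```python
def revenue_cents(price_cents_per_second, seconds_per_build, builds):
    """Revenue under simple per-second billing."""
    return price_cents_per_second * seconds_per_build * builds


PRICE = 1  # cents per build-second (assumed)

before = revenue_cents(PRICE, 600, 1000)         # 10-minute builds
after_speedup = revenue_cents(PRICE, 480, 1000)  # 20% faster, same build count

# Builds needed at the faster speed to match the old revenue.
breakeven_builds = before / (PRICE * 480)

print(before, after_speedup, breakeven_builds)  # 600000 480000 1250.0
```

With these numbers a 20% speedup cuts revenue by 20% at constant usage, and customers need to run 25% more builds for revenue to recover. The claim above is that induced usage has historically cleared that bar.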

Why is Snowflake so expensive (138)

morelisp on Aug 22, 2022 | parent | next [–]


I would guess that this is less likely to be true of Snowflake than CircleCI.

Most dev teams are underinvested in CI. That is, if you queried some random team, they'd probably have a dozen ideas for tests or processes they'd like to write/run if they had the resources, most of which would provide some real value - the ideas likely coming from some previous actual bugs that hit prod.

Most BI teams are overinvested in data. They have way more than is valuable. Large scale analysis is mostly exploratory and speculative, and rarely yields results. Any induced usage is more from fear they might throw away the magic bits than real value being unlocked by better efficiency. (And I think this is probably necessarily true. Any BI process that gets to the point the data is clear and regularly actionable also gets operationalized and right-sized through a more normal dev process.)

Why is Snowflake so expensive (139)

idoh on Aug 22, 2022 | parent | prev | next [–]


I work at Circle, but not on this specifically, and echo the same experience. This (https://en.wikipedia.org/wiki/Khazzoom–Brookes_postulate) was cited last week in a meeting that I was in, for example.

Why is Snowflake so expensive (140)

pjdbruin on Aug 22, 2022 | prev | next [–]


[disclaimer: comment written by one of the cofounders of iomete - a YC-backed startup - active in the same market as Snowflake]

I think Snowflake is (still) expensive because it is a venture-backed enterprise software company and goes through a typical trajectory...

Story goes like this: founders are product-driven and first movers -> find PMF -> need VC funding -> VCs only fund enterprise software ventures with 70%+ gross margins and high retention rates -> product/service gets priced to achieve these metrics -> VCs happy to fund sales & marketing machine needed to obtain sales growth, nobody cares about profitability until after IPO -> startup is everyone’s darling until ~2 years after IPO.

Then: economic crisis hits, customers become more price sensitive, competition intensifies. Plus now management is exposed to quarterly pressure of financial markets to deliver on top-line and margin expectations.

Meanwhile a bunch of startups are building (lower priced) alternatives. Perhaps not as mature or feature-rich as Snowflake, but good enough for 80% of use cases that Snowflake covers.

Therefore the assertion that Snowflake is not optimizing their product sounds a bit crazy to me. It would be optimizing for short-term gain, while jeopardizing its reputation as the leader in the space. Obtaining excessive margins through excessive pricing only works under monopolistic conditions or if they had a truly distinctive product. Neither is the case imo. Also, it's early days. Not exactly sure what Snowflake's market share is, but I bet it is < 5%, so they haven't locked in everyone yet...

I bet that Snowflake will be forced to compete "also on price" in the next five years because free enterprise is a powerful thing. The title of the article could be “Why Snowflake is (still) expensive but will get more affordable over the next few years”.

Why is Snowflake so expensive (141)

kjw on Aug 22, 2022 | prev | next [–]


"Snowflake has no incentive to push a code change that makes things 20% faster because that can correspond to 10–20% drop in short-term revenue. In a typical Innovator’s Dilemma, Snowflake prioritizes other things that generate an ever larger menu of compute options, like Snowpark and data apps built on Streamlit, that will bleed your organization dry."

This is not true. Snowflake has done just that - it has continuously improved performance resulting in reduced credit consumption and revenue from customers on a unit compute/storage basis. And it has negatively impacted their revenues and stock price. Snowflake's incentive is to strengthen their competitive position and to hopefully generate more long-term revenue from their customers.

The CFO forecasted a $97 million shortfall resulting from product improvements when guiding for 2022 revenue. Snowflake stock dropped immediately after.

See Q4 transcript -- https://www.fool.com/earnings/call-transcripts/2022/03/02/sn...

"Similarly, phased throughout this year, we are rolling out platform improvements within our cloud deployments. No two customers are the same, but our initial testing has shown performance improvements ranging on average from 10% to 20%. We have assumed an approximately $97 million revenue impact in our full-year forecast, but there is still uncertainty around the full impact these improvements can have. While these efforts negatively impact our revenue in the near term, over time, they lead customers to deploy more workloads to Snowflake due to the improved economics."

Also see the Bloomberg article -- https://www.bloomberg.com/news/articles/2022-03-02/snowflake....

"Snowflake Inc., a software company that helps businesses organize data in the cloud, dropped the most ever in a single day Thursday after projecting that annual product sales growth would slow from its previous triple-digit-percentage pace.

Executives said improvements to the company’s data storage and analysis products will let customers get the same results by spending less, which will hurt revenue in the short term, but attract more clients in the future.

“The full-year impact of that next year is quite significant,” Chief Executive Officer Frank Slootman said on a conference call Wednesday after the results were released. But “when customers see their performance per credit get cheaper, they realize they can do other things cheaper in Snowflake and they move more data into us to run more queries.”"

Why is Snowflake so expensive (142)

pykello on Aug 22, 2022 | prev | next [–]


(I am not affiliated with Keebo, although I had a recruiting meeting with them earlier this year)

FWIW, Keebo (https://keebo.ai/) tries to solve this problem & reduce your Snowflake bill by using Data Learning techniques. It can be configured to return exact results or approximate results.

Why is Snowflake so expensive (143)

not-my-account on Aug 22, 2022 | parent | next [–]


It is always interesting seeing companies building up on the products / services of other companies. Kinda like TurboTax built on the IRS, these "children" (is there a better term?) companies are quite dependent on the "parent" company not changing or improving its product / service.

I don't see AWS changing so dramatically that companies like DataBricks are put in hot water (but I could be wrong), but I could see Snowflake improving its product due to competition, putting Keebo in a tough situation.

Why is Snowflake so expensive (144)

morelisp on Aug 22, 2022 | root | parent | next [–]


By the time I reached this comment I counted no fewer than five completely separate links to offerings to help reduce your Snowflake bill. For something that is already a focused SaaS product, I have to say that starts to smell a bit.

Why is Snowflake so expensive (145)

georgewfraser on Aug 22, 2022 | prev | next [–]


The core claim of this article, that Snowflake doesn't implement optimizations that would reduce usage, is not true. Search optimized tables, partitioned tables, and per-second billing are all counterexamples.

Why is Snowflake so expensive (146)

jjfoooo4 on Aug 22, 2022 | prev | next [–]


This is a kind of poor engineering writing in which the author finds a product to not be tailored to his precise tastes and concludes it is because the company is user hostile and/or doomed.

The bit about Snowflake not being incentivized to care about costs is trivially untrue. The rest of the article perceives trade offs as simple feature gaps.

For example, Snowflake gives the user more latitude to distribute workloads among “warehouses” than other offerings. With poor distribution the author will experience the workload provisioning issues he describes.

Why is Snowflake so expensive (147)

ramesh31 on Aug 22, 2022 | prev | next [–]


I'm of the mind that Snowflake and Databricks are losing their value prop now that Delta Lake is open source and Iceberg is maturing. What's to stop me from rolling my own Spark clusters and just using one of those? Is anyone doing this?

Why is Snowflake so expensive (148)

nemothekid on Aug 22, 2022 | parent | next [–]


>What's to stop me from rolling my own Spark clusters and just using one of those? Is anyone doing this?

Ops. Unless your core competency is running reports and Spark nodes, it's probably cheaper to outsource the management of Spark and friends than to hire people to make sure it's always up and running. To be fair I haven't touched Spark in many years but having to page someone who was good enough at Spark to debug why a job stopped at 3am isn't fun.

Why is Snowflake so expensive (149)

ramesh31 on Aug 22, 2022 | root | parent | next [–]


>Ops. Unless your core competency is running reports and spark nodes, it's probably cheaper to outsource the management of Spark and friends than to hire people to make sure it's always up and running.

I think as an end user I would absolutely agree on this point. But many companies use Databricks as part of their automated backend systems that they resell to customers. The cost per "DBU" unit is astronomical for the amount of raw compute in use. It feels a bit like running a restaurant where you serve takeout.

Why is Snowflake so expensive (150)

joshhart on Aug 22, 2022 | root | parent | prev | next [–]


[Disclaimer: Databricks employee] There's also a lot of value in DBSQL, Unity catalog (data management), and serverless for autoscaling that can all save money in terms of just running raw Spark. But if you want to operate Spark yourself, cool do it. We're happy for that, it builds the base of Spark committers over time and increases the quality of our products.

Why is Snowflake so expensive (151)

nojito on Aug 22, 2022 | root | parent | prev | next [–]


I can spin up and down 100+ node clusters on the 4 largest cloud providers at will.

What ops am I missing?

Why is Snowflake so expensive (152)

eximius on Aug 22, 2022 | parent | prev | next [–]


You'll find plenty of the customer base of Databricks used to run their own clusters.

It's a tradeoff. It might cost fewer dollars but more time. The time and expertise to run their own clusters effectively is not something every org has or wants to invest.

Why is Snowflake so expensive (153)

buttaphingas on Aug 22, 2022 | root | parent | next [–]


And to get the very best price for those clusters you'd need to commit to the CSP for three years!

Would love to know the TCO trade-off between procuring, securing and deploying on your own clusters vs having them managed via SaaS.

Why is Snowflake so expensive (154)

mejakethomas on Aug 22, 2022 | prev | next [–]


It's not expensive.

What it can do, successfully, with three engineers was previously impossible with dozens.

What IS expensive is not being careful with it.

Why is Snowflake so expensive (155)

marymac on Aug 22, 2022 | parent | next [–]


THIS. Apply the correct guardrails and learn to optimize.

Why is Snowflake so expensive (156)

tommyphongs on Aug 29, 2022 | prev | next [–]


Regarding the article: I don't have experience with Snowflake, but I do with Cloudera's tech stack on on-premise infrastructure. Both Cloudera and Snowflake use the same approach: separating compute and storage, with the main purpose of trading off performance for scalability and easy maintenance without knowledge of the user's data, and thus easily selling the solution to a wide range of customers without caring about customer cost (maybe this is also one of their purposes). In my experience, Cloudera's tech stack becomes a very complex brute-force system: we need to install HDFS to store data (the storage layer), the Hive metastore (which basically uses MySQL to keep the mapping between a table and that table's HDFS files) to keep HDFS's metadata, and Impala as the query engine (the computing layer). Because the computing layer doesn't know much about how the data is organized, we are very limited when we want to optimise our system. A query like 'select * from TABLE limit 1' leads to scanning all the data across many HDFS files, and because Impala is an in-memory computing engine, scanning all the table data leads to running out of memory; because of that, DAs can't use sampled data to work with our data quickly. Everything leads to hell, and because many things can affect our system (HDFS, Impala, the Hive metastore, etc.), it is very hard to fix a problem when it occurs.

Why is Snowflake so expensive (157)

cedricd on Aug 22, 2022 | prev | next [–]


I'm glad the author also points out how customer (mis)use can blow up data warehouse costs too. No matter how efficient Snowflake could get, using the warehouse too much or with unnecessary queries will ultimately have a larger impact.

The trend in the data space currently is for usage to increase -- as more companies adopt dbt they're running more and more prebuilt queries (materialized views) on a scheduled basis, rather than on demand. This is overall a good thing in that data is becoming easier to manage and use, but it does come at an increase in warehousing costs.

I think eventually the pendulum will swing back to tools that help optimize warehouse usage, as long as they allow for the same increase in productivity as dbt (disclosure - I work for one such company).

Why is Snowflake so expensive (158)

awinder on Aug 22, 2022 | prev | next [–]


I think the main metric that this is built on may be too coarse to derive the meaning that the article does. There’s conjecture that what’s driving this is more querying over the same dataset (more streamlit dashboards) but it could just as easily be expanding usage inside of companies. That’s what’s going on at my company right now, more teams using snowflake, more data being pushed in to replace existing workflows, etc.

I’m also not sure I understand the dig at streamlit dashboards. If you’re running hardware and introduce new read workflows, eventually you’ll need more read replicas and you’ll pay more for it. Maybe you can argue that snowflake is doing this at a higher cost but the metric data is not available in the sources to make that claim.

Why is Snowflake so expensive (159)

falcolas on Aug 22, 2022 | prev | next [–]


Snowflake is a bit generic to easily find - and the article has no hyperlinks - anybody have a one sentence summary?

EDIT: There it is: https://www.snowflake.com/

Data warehousing, basically.

Why is Snowflake so expensive (160)

thesandlord on Aug 22, 2022 | parent | next [–]


It's a data warehouse, like Google BigQuery or AWS Redshift / Athena

Why is Snowflake so expensive (161)

flyinglizard on Aug 22, 2022 | prev | next [–]


Where does all this data go? It's processed and then what? Sent to decision makers? Used to run automated processes?

I'm genuinely curious and would appreciate anyone who could show a real life example of this kind of pipeline where data is accumulated, then processed, then turned into revenue at the other end.

I've implemented systems that do this but my experience is that accumulating data is (too) easy, processing it in a meaningful way is slightly more challenging but ultimately driving positive business processes according to this data, which require a lot of friction with employees (training, procedures, maintenance, support) is the most difficult part.

Why is Snowflake so expensive (162)

lysecret on Aug 22, 2022 | parent | next [–]


Same experience. I think the most interesting and most public example of such a pipeline is Google building a search index. This is also where a lot of the methods originally came from. Nowadays a lot of this will be used to build recommendation systems / feature pipelines for ML.

Why is Snowflake so expensive (163)

frankbinette on Aug 22, 2022 | root | parent | next [–]


These examples are a bit too advanced. Think of simple descriptive statistics, which is still so important yet not as sexy as ML/DL/AI. ML is great, but the main usage behind these data technologies is still simple business intelligence.

Every business in every market needs to understand what is going on with their processes. How many sales did I do yesterday, last week, last month, compared to last year, in which stores, what is the average basket amount, customers buy what with what, what size t-shirt do I sell the most, etc.

Why is Snowflake so expensive (164)

frankbinette on Aug 22, 2022 | parent | prev | next [–]


Seems like you kind of answered your own question... this data is used for business intelligence purposes.

Why is Snowflake so expensive (165)

buremba on Aug 22, 2022 | prev | next [–]


I believe they need to focus on the performance at least nowadays because both Databricks & BigQuery are also great products and they push Snowflake in terms of feature-parity and performance.

That being said, Snowflake is also pushing for the marketplace model where you publish your app natively to move your code to where the customer's environment is. If they become successful, performance might not be one of the incentives for companies to go with Snowflake, and the switching cost might be higher as companies will have more of their business logic embedded in the system.

Why is Snowflake so expensive (166)

epberry on Aug 22, 2022 | prev | next [–]


> Not providing observability to monitor and reduce costs

Vantage just launched this - https://www.vantage.sh/blog/vantage-launches-snowflake-suppo.... The problems the author describes are almost exactly what we heard from customers:

- list of users/queries that are the most expensive

- alerts and notifications for costs

- query timeout. Not something a third party can do but there is an interesting 'query tagging' feature for snowflake which Vantage supports.

Why is Snowflake so expensive (167)

_solr on Aug 22, 2022 | prev | next [–]


The competition is tough in the data warehousing industry; if Snowflake is expensive, people will know. Current customers may not leave but it's going to be harder for them to get new customers.

Why is Snowflake so expensive (168)

KingOfCoders on Aug 22, 2022 | parent | next [–]


Everyone seems expensive (Looker seems to be the most expensive), and vendors are hard to compare. When evaluating some of them for a migration project, they would not let us run performance tests with our data to compare them and make a decision (paid).

Why is Snowflake so expensive (169)

YouWhy on Aug 22, 2022 | prev | next [–]


I often analyze tools as reduction from the space of problems × resources to the space of outcomes.

Let's consider Snowflake in this paradigm

- Problems: analytics on data that is not laid out in a way that's directly accessible for analysts.

- Resources: SQL analysts, few or no competent data engineers, spare cash

- Outcomes: run analytics at an industrial scale without requiring competent engineers or DevOps.

Since Snowflake's optimal client gets very easily locked in, it follows that saving said client's money is not something even the client would care about.

Why is Snowflake so expensive (170)

teej on Aug 22, 2022 | prev | next [–]


From what I can tell, the author is incorrect about the example given in "Optimizer gremlins". I tested an example on my own data and micro-partition pruning was active.

The issue with dbt models in Snowflake is that if you ever perform a full-refresh and don't sort it, you ruin any natural clustering that arises from an incremental model. I've run into this issue many times. Auto-clustering gets too expensive at scale and Snowflake doesn't give you much guidance on alternatives.
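A toy model of why losing the natural ordering hurts pruning. This assumes each partition keeps only a min and max for the filter column, and the `Partition` class and helper names are invented for the sketch; it is not Snowflake's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    min_key: int
    max_key: int
    rows: list


def make_partitions(keys, size):
    """Cut rows into fixed-size partitions in arrival order, recording min/max."""
    parts = []
    for i in range(0, len(keys), size):
        chunk = keys[i:i + size]
        parts.append(Partition(min(chunk), max(chunk), chunk))
    return parts


def partitions_scanned(parts, lo, hi):
    """A partition can be skipped only if its [min, max] misses [lo, hi]."""
    return sum(1 for p in parts if p.max_key >= lo and p.min_key <= hi)


# Rows arriving in key order, e.g. an incremental model appending by date.
clustered = make_partitions(list(range(1000)), 100)

# Same rows written in a scattered order, e.g. an unsorted full refresh.
# Multiplying by 37 mod 1000 is a deterministic permutation of 0..999.
scattered = make_partitions([(i * 37) % 1000 for i in range(1000)], 100)

print(partitions_scanned(clustered, 0, 49))   # 1
print(partitions_scanned(scattered, 0, 49))   # 10
```

Both layouts hold exactly the same rows. Only the write order differs, and in the scattered layout every partition's min/max range overlaps the filter, so nothing can be pruned.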

Why is Snowflake so expensive (171)

darksaints on Aug 22, 2022 | prev | next [–]


> We have 5–6 very good open-source data warehouse alternatives. We have Redshift, DataBricks, Firebolt, BigQuery, and likely a few other enterprise offerings, yet it is surprising how little training most companies have in negotiating and re-negotiating vendor contracts or in pushing for heavily discounted pricing.

Small nit: Redshift isn't open source. I would also add Clickhouse, Citus, and TimescaleDB as majorly capable open source technologies with commercial offerings in this space.

Why is Snowflake so expensive (172)

jmacd on Aug 22, 2022 | prev | next [–]


Retrospectively, this is very similar to how most SaaS behaved when per user per month billing was first introduced. There were almost never any actual limits on the number of users you could add to the software, but you purchased a license for a certain number. Occasionally your account would be audited and you would be billed for the overage. It was always a significant penalty. The same was true for CPU based licenses for things like IIS, SQL Server, Oracle, etc.

Why is Snowflake so expensive (173)

KingOfCoders on Aug 22, 2022 | prev | next [–]


I have no Snowflake experience, but some limited BigQuery experience. And it's very easy for a small company to get to $100k/year bills without massive data.

Why is Snowflake so expensive (174)

tootie on Aug 22, 2022 | parent | next [–]


Anytime your cloud spend with a single vendor starts to get out of hand, you just call and negotiate. If you make a multi-year commitment, they'll apply a substantial discount. Also, $100k/yr is still cheap compared to the cost of developers. Not just in terms of actual price tag, but risk management because a SaaS won't quit for a better offer.

Why is Snowflake so expensive (175)

dotopotoro on Aug 22, 2022 | root | parent | next [–]


So you don't need developers when you use SaaS?

Why is Snowflake so expensive (176)

yazaddaruvala on Aug 22, 2022 | root | parent | next [–]


If you need to hire 1 more developer at $100k to help maintain your data warehouse or pay $100k for Snowflake or BQ, it's a no-brainer to use SaaS.

Also humans cost more than their salary: Recruiting, management, benefits, attrition, vacation, the risk that they are just not capable.

A human will also cost you more year over year (raises, promotions, etc), SaaS will typically cost you less year over year (optimizations, negotiations, competition, etc).

Why is Snowflake so expensive (177)

mejakethomas on Aug 22, 2022 | parent | prev | next [–]


Completely agree. Currently staring at 700k+ BigQuery costs annually and accomplished MUCH more with Snowflake at the same price.

Why is Snowflake so expensive (178)

dominotw on Aug 22, 2022 | parent | prev | next [–]


they should switch to flat rate billing capped at slots they are willing to pay for.

Why is Snowflake so expensive (179)

0xbadcafebee on Aug 22, 2022 | prev | next [–]


> Snowflake has no incentive to push a code change that makes things 20% faster because that can correspond to 10–20% drop in short-term revenue.

If they improve performance they can lower the cost to customers, which will make the product more attractive to prospective customers. But if they are already swimming in cash they may not feel the need to gain more customers.

Only threats prompt companies to improve things. Threat of a competitor, threat of losing all their money, threat of bad PR, threat of regulation, threat to the stock price, etc.

I see this every day in companies that don't care about managing their cloud costs. They waste money like crazy because they literally don't care if they lose money, because some exec doesn't care, or they got enough funding until the next round, etc. A couple years later another exec asks why the CISO/CTO is spending so much money without any ROI, and then everybody has to stop everything they're doing to shave pennies off cloud costs.

Companies run by individual executives are insane. I don't understand why people allow companies to be run this way. I think a co-op where employees could be active participants in the running of the company would allow for more sane decision-making.

rnk on Aug 24, 2022 | prev | next [–]


What most commenters are missing here is that Snowflake took a significant revenue reduction when they improved the efficiency of their product, i.e. customers could do more at lower cost and with less CPU use. This is similar to AWS steadily lowering prices for many things over time. Snowflake did this knowing that they would get less revenue and less growth, and I suspect they also knew it would cause their stock price to go down. Here's an article on it from March: https://www.yahoo.com/now/snowflake-plunges-revenue-growth-o....

Certainly Snowflake wants to make it easier for people to spend money and solve all their problems on that platform; every company wants that. But it's a very competitive world out there, and Snowflake's leaders aren't complete idiots. They have to keep lowering their prices when they can, otherwise new people will come along and do things cheaper.

wsostt on Aug 22, 2022 | prev | next [–]


Snowflake is so expensive that Capital One has developed a toolkit for managing your instance.

https://www.capitalone.com/software/solutions/

marymac on Aug 22, 2022 | parent | next [–]


I'd love to talk to someone who has tried this out. I think it's called Slingshot.

hobs on Aug 22, 2022 | prev | next [–]


I am like 95% sure that the MAX issue he mentions is wrong. I just changed some window-function-based approaches to the one he mentions, and it's several orders of magnitude faster because of partition elimination.

Nonetheless I agree with the basic points of the article.

rsweeney21 on Aug 22, 2022 | prev | next [–]


This is a great example of misaligned incentives.

Another example of misaligned incentives is LinkedIn. LinkedIn charges $3/message. The more messages sent on their platform, the more money they make. They are not incentivized to help sales or recruiters target the right people. It can be a cash cow in the short term, but it creates a negative experience for your users.

The fact that it has worked for so long is a testament to how strong network effects are.

In the case of Snowflake, high switching costs will protect them for a while.

imwillofficial on Aug 22, 2022 | prev | next [–]


It’s easy to point out ways that leaving in foot guns looks predatory. But that’s not always the case.

I work for AWS in billing, and the way we calculate bills is to try to get the customer the maximum discount.

That means things like calculating savings plan coverage from smallest to largest to maximize utilization, or turning Reserved Instance sharing on by default within an org.

I would say that seemingly gouging behavior is, more often than not, the result of technical or time constraints.

manassolanki on Aug 22, 2022 | prev | next [–]


Snowflake is expensive if not monitored properly, and on top of that they provide minimal observability. There are some good features like auto-suspend and auto-resume for cost savings, but there is still scope for optimisation. For example, they will charge you for a minimum of 1 minute even if your query runs for only 2 seconds.
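
To put a number on that last point, here is a rough sketch of the arithmetic. It assumes the 60-second minimum applies each time a suspended warehouse resumes to serve a query, and the credits-per-hour figure is a made-up parameter, not a quoted Snowflake rate.

```python
# Illustrative only: models a 60-second minimum charge per warehouse resume.
# MIN_BILLED_SECONDS reflects the "minimum 1 minute" described above;
# credits_per_hour is a hypothetical input, not an official price.
MIN_BILLED_SECONDS = 60


def billed_seconds(runtime_seconds: float) -> float:
    """Seconds charged for one resume, given the 60-second floor."""
    return max(MIN_BILLED_SECONDS, runtime_seconds)


def credits_used(runtime_seconds: float, credits_per_hour: float) -> float:
    """Credits consumed for one resume at a given hourly credit rate."""
    return credits_per_hour * billed_seconds(runtime_seconds) / 3600


def overhead_factor(runtime_seconds: float) -> float:
    """How many times more is billed than was actually used."""
    return billed_seconds(runtime_seconds) / runtime_seconds


if __name__ == "__main__":
    for runtime in (2, 30, 90):
        print(runtime, billed_seconds(runtime), overhead_factor(runtime))
```

Under these assumptions a 2-second query is billed as 60 seconds, so batching many short queries into one warehouse session matters far more than shaving a second off each query.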

jwie on Aug 22, 2022 | prev | next [–]


You would think they would be saving (and charging the customer!) a bundle by not enforcing constraints on their tables.

I’d be very interested to hear the Snowflake side of this decision, but to the customer it’s simply unforgivable to have cosmetic constraints on a database.

dominotw on Aug 22, 2022 | parent | next [–]


Because Snowflake doesn't build foreign key indexes. Imagine clickstream data where every insert is being checked against an index of customers. This isn't a typical use case for big data warehouses.

jwie on Aug 22, 2022 | root | parent | next [–]


I understand that. But why have constraints that don’t do anything?

evtx on Aug 22, 2022 | root | parent | next [–]


There are plenty of reasons why MPP databases allow the definition of constraints but don't enforce them. I'll list two: 1) BI tools can use them to optimize joins; 2) data modeling tools can use them to reverse engineer models without having to pattern match the keys.

That said, Snowflake does support constraints if you use hybrid tables (a preview feature announced at their last conference).

atwebb on Aug 22, 2022 | root | parent | prev | next [–]


Metadata

Tools and scripts can work off of it, design decisions are documented, suggestions can be made, inferences can be made (some dangerous, some not).

Why tag S3 objects if it doesn't enforce a schema? Maybe a bad analogue but I'm going quick right now :).

marcinzm on Aug 22, 2022 | parent | prev | next [–]


Do you have any data on the pricing of distributed databases that do support proper foreign key constraints? And how it stacks against Snowflake pricing?

veeti on Aug 22, 2022 | parent | prev | next [–]


Do you really need functional constraints in an OLAP database? Surely such validations already exist wherever your data is coming from.

Foobar8568 on Aug 22, 2022 | root | parent | next [–]


Ohohoh yeah sure, you mean application-based constraints? Or an entity–attribute–value based application? What about documents?

spullara on Aug 22, 2022 | prev | next [–]


Snowflake increases performance all the time and their customers just use more of it.

wiradikusuma on Aug 22, 2022 | prev | next [–]


So, what is Snowflake? (I assume it's snowflake.com) From Googling it looks like Google's BigQuery. So it's a DB?

hnal943 on Aug 24, 2022 | parent | next [–]


It's a data warehouse database.

throw8383833jj on Aug 22, 2022 | prev | next [–]


It all comes down to the cost of switching and the willingness of users to switch. The higher the cost of switching, the higher you can make your product's price. Otherwise, with an extremely low cost of switching, the cost will ultimately be driven to near zero as more and more competitors enter the landscape.

dstola on Aug 22, 2022 | prev | next [–]


"optimization gremlin" = dark pattern to take as much money away from you as possible

tablespoon on Aug 22, 2022 | prev | next [–]


> RevOps management

And now "XxxOps" is a meaningless buzzword.

danielodievich on Aug 22, 2022 | prev | next [–]


Interesting article. Some of it accurate. Some not.

> "Snowflake has no incentive to push a code change that makes things 20% faster because that can correspond to 10–20% drop in short-term revenue"

Completely untrue. There is constant optimization of the scheduler, execution process, global services, and compute fabric. The famous "we shipped AWS Graviton and it's like 10% cheaper" was something we did to ourselves. There is work underway to make FoundationDB faster and more efficient too that's totally out of this world. In short, nobody wants to burn extra CPU cycles and bill you for it.

> "Disclose Hardware Specs"

This isn't hard to find if you work with Snowflake's SE and Services, but it's not going to give you anything. The whole POINT of Snowflake is to hide all this nonsense and make it "just work". You want CPU and SSD metrics, feel free to use Databricks (many do) or whatever.

Now, there IS something to be said about some sort of observability into query execution as it is going. There are constant discussions on that, and some of the new upcoming features (like programmatic access to the query profiler) can open that up. But yeah, Snowflake is NOT something that will open up what's under the hood, and that is super intentional.

> "Not adopting benchmarks"

This goes around and everyone freaks out. Just profile your own work. Whatever. Nobody cares about benchmarks.

> "Optimizer gremlins"

Snowflake COULD do more to expose some of the internals. My job (and the job of 100s of my services and technical SE colleagues) is to help customers understand what's happening under the hood. Some of the company's "make it simple" ethos COULD be a bit more open. However, many of the common issues (micro-partition pruning) can be solved by simple user education. I've lost count of how many customers I worked with who had 0 education in Snowflake, and even a 20-30 minute intro made them open their eyes and go "woah, I get it now". On the other hand, dozens of people told me that it was amazingly easy to use without training, and it IS!

> "Improve the workload manager to increase throughput"

The workload manager is considerably more complex and sophisticated than this guy tells us it is. I saw an internal presentation on its internals that I asked to have converted to a Confluence article, which thankfully happened pretty quickly, and lots of people benefitted. There is cost-based scheduling that takes the expected resources of queries into account and also considers actual resources consumed, all very frequently and for every XP. I wish that article were public, and I don't think it will be made public, but still, it's definitely not FIFO.

> "Not providing observability to monitor and reduce costs"

This is valid feedback now, and it is constantly what we do in services. New manageability features are coming to help with this. See CapitalOne or bunches of companies in this ecosystem.

> "What companies that use Snowflake could do better?"

I agree with the point about education. A huge portion of people using and abusing Snowflake don't have any formal education. The best thing you can do is hire Snowflake PS or get a partner/SI, or just take a damn class, they are REALLY good.

Source: 2 years in services at Snowflake with focus on perf, cost, and manageability.

msluyter on Aug 22, 2022 | prev | next [–]


Some of these complaints seem fair to me, some not as much. tl;dr -- Snowflake requires a fair bit of knowledge/effort to use optimally.

I spent a number of months last year focused on lowering Snowflake spend. In the process I learned a ton about Snowflake and gained a fair amount of respect for the product. Respect as in "this is really great" as well as respect as in "I need to be on guard here or I'm going to get hurt."

I think my biggest misconception at the outset was thinking of Snowflake like it's a relational database. It's not. Or rather, it is with a large number of caveats. Snowflake doesn't have b-tree indexes -- rather it has "clustering keys," which are sort of like coarse-grained indexes that colocate data in micropartitions, allowing queries to do micropartition pruning. If you have a well clustered table and you're filtering on your clustering keys, things will be great. But if not, or, for example, you have to do multi-table joins on non-clustered columns, you'll suffer. So unless you have search optimization enabled (which costs more!), you have to retrain yourself away from the "oh, just add an index here or there to make things fast" type of thinking you may have had working with Postgres or whatnot.
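
As a toy illustration of why clustering matters for pruning, the sketch below keeps only a min/max key per partition and skips any partition whose range cannot contain the filter value. It is a simplified model for intuition, not Snowflake's actual storage format or optimizer; `MicroPartition`, `build_partitions`, and `scan_equals` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class MicroPartition:
    """A chunk of rows plus min/max metadata for one column (toy model)."""
    min_key: int
    max_key: int
    rows: list  # list of (key, payload) tuples


def build_partitions(rows, rows_per_partition, clustered):
    """Split rows into fixed-size partitions, optionally sorting by key first."""
    if clustered:
        rows = sorted(rows, key=lambda r: r[0])
    partitions = []
    for start in range(0, len(rows), rows_per_partition):
        chunk = rows[start:start + rows_per_partition]
        keys = [key for key, _ in chunk]
        partitions.append(MicroPartition(min(keys), max(keys), chunk))
    return partitions


def scan_equals(partitions, key):
    """Return (matching rows, number of partitions actually read)."""
    scanned = 0
    hits = []
    for part in partitions:
        if key < part.min_key or key > part.max_key:
            continue  # pruned using metadata only; rows are never read
        scanned += 1
        hits.extend(row for row in part.rows if row[0] == key)
    return hits, scanned


if __name__ == "__main__":
    # 100 rows whose keys 0..99 arrive in a scrambled order.
    data = [((i * 37) % 100, f"row{i}") for i in range(100)]
    for clustered in (False, True):
        parts = build_partitions(data, 10, clustered)
        hits, scanned = scan_equals(parts, 42)
        print(f"clustered={clustered}: read {scanned} of {len(parts)} partitions")
```

With the scrambled layout every partition's min/max range straddles the filter value, so nothing can be skipped; once rows are sorted on the key, only the one partition covering that range is read.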

Regarding the author's complaints about lack of observability, I generally found it pretty easy to analyze what was going on via the query_history table. And the built in query analyzer is quite helpful. We did add tags to our dbt runs, which was pretty easy, and I wrote a handful of queries to find like the most expensive dbt models. It wasn't really that hard.

That said, dbt in particular provides a number of foot guns wrt Snowflake. Subqueries, as the author mentions, are one. We created some custom dbt macros so that, instead of running `select * from foo where x in (select * from blah)`, if blah was small we would first query blah and then write the query using a literal list, like `select * from foo where x in ('a', 'b', 'c', 'etc...')`.
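
The rewrite described above can be sketched in plain Python (the real thing was a dbt macro, which is not shown in this thread). The function names and the `max_values` cutoff are hypothetical, and the table and column names are assumed to be trusted identifiers rather than user input.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (numbers bare, everything else quoted)."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    # Double any single quotes so the value cannot break out of the string.
    return "'" + str(value).replace("'", "''") + "'"


def in_list_query(table, column, values, max_values=1000):
    """Build `select ... where column in (<literals>)` from already-fetched values.

    `values` stands in for the result of first running the small subquery
    (e.g. `select x from blah`) on its own.
    """
    values = list(values)
    if not values:
        # An empty IN list is not valid SQL; return a query that matches nothing.
        return f"select * from {table} where 1 = 0"
    if len(values) > max_values:
        raise ValueError("list too large to inline; keep the subquery or join instead")
    literals = ", ".join(sql_literal(v) for v in values)
    return f"select * from {table} where {column} in ({literals})"


if __name__ == "__main__":
    print(in_list_query("foo", "x", ["a", "b", "c"]))
```

The size cutoff is the important design choice: inlining only pays off when the lookup set is small, which matches the "if blah was small" condition above.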

Another issue we discovered is that in dbt it's trivial to create views. But we found that if views get too deeply nested, Snowflake can't adequately do predicate pushdown. So big stacks of views on views are suboptimal.

Another interesting one was tests. Dbt makes it trivial to perform null or uniqueness checks against a column. We found we were spending a lot on tests that were simply doing something like `select * from blah where col is null`. On non-cluster-key columns or complex views, these were causing full table scans. We took a number of steps to mitigate those issues (combining queries; changing where we did these checks in the DAG). The way tests are scheduled is problematic as well. One "long pole" test will keep your warehouse up and using credits even after the other 99.9% of the tests have completed. After some analysis we separated long pole tests from the others and put them on different warehouses.
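
A back-of-the-envelope model of the long-pole effect, under loudly stated assumptions: tests run concurrently, a warehouse is billed until its longest test finishes, the long-pole test is moved to a smaller (cheaper) warehouse and takes the same time there, and minimum-billing and suspend delays are ignored. The credit rates and durations are made up for illustration.

```python
def uptime_credits(durations_s, credits_per_hour):
    """Credits for a warehouse that stays up until its longest test finishes."""
    if not durations_s:
        return 0.0
    return credits_per_hour * max(durations_s) / 3600


def split_long_poles(durations_s, threshold_s):
    """Partition test durations into (short, long) around a threshold."""
    short = [d for d in durations_s if d <= threshold_s]
    long = [d for d in durations_s if d > threshold_s]
    return short, long


if __name__ == "__main__":
    # Hypothetical workload: 99 tests of 30 s and one 30-minute long pole.
    durations = [30] * 99 + [1800]

    shared = uptime_credits(durations, credits_per_hour=8)

    short, long = split_long_poles(durations, threshold_s=300)
    separated = (uptime_credits(short, credits_per_hour=8)
                 + uptime_credits(long, credits_per_hour=1))

    print(f"shared warehouse:     {shared:.3f} credits")
    print(f"separated warehouses: {separated:.3f} credits")
```

In practice the long test may run slower on a smaller warehouse, so the real saving depends on measuring both configurations rather than on this model.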

I could go on and on, actually, but I think that provides a taste of some of the complexities involved. Like almost any tool, you have to really understand it to use it effectively. But it's all too easy for, say, analysts, who may be blissfully unaware of the issues above, to write really poorly performing SQL on Snowflake.

dboreham on Aug 22, 2022 | prev | next [–]


Because someone needs a new boat?

NonNefarious on Aug 22, 2022 | prev [5 more]


[flagged]

It makes perfect sense if you know that Snowflake is a product/company. It just needs to be capitalized (and the trailing question mark restored).

NonNefarious on Aug 22, 2022 | root | parent | next [–]


What an asinine excuse. "It makes sense if it makes sense to you." And "snowflake" wasn't capitalized, so it wasn't a proper name. And even if it were (as it is now, having been fixed after I posted the above complaint), it would be just another douchily obscure headline on HN. If you're too lazy to say WTF you're talking about in a headline, don't burst into tears when you're called out on it. And, oh man, you're not even the OP... even more pathetic.

It's depressing to see insecure infants infecting HN with Reddit-style tantrums just because somebody said something mildly critical. If you're too gutless to demand better, at least STFU when others do.

wink on Aug 22, 2022 | root | parent | prev [–]


If only there was the possibility to link the first occurrence of a word to an external URL on a website.

NonNefarious on Aug 22, 2022 | root | parent [–]


Or add a descriptive phrase to a headline. Heaven forbid.

Why is Snowflake so expensive (2024)

FAQs

Why is Snowflake so expensive? ›

The total cost of using Snowflake is the aggregate of the cost of using data transfer, storage, and compute resources. Snowflake's innovative cloud architecture separates the cost of accomplishing any task into one of these usage types. Using compute resources within Snowflake consumes Snowflake credits.

What makes Snowflake so much better than others? ›

Snowflake's Data Cloud is powered by an advanced data platform provided as a self-managed service. Snowflake enables data storage, processing, and analytic solutions that are faster, easier to use, and far more flexible than traditional offerings.

How to reduce Snowflake costs? ›

There are some Snowflake cost optimization best practices related to rightsizing your warehouse:
  1. Group similar workloads in the same virtual warehouse. ...
  2. Leverage data SLAs to define workloads and value to business. ...
  3. Start small and right size utilization. ...
  4. Set resource and volume monitors. ...
  5. Use query tags.
Mar 21, 2024

Why is Snowflake better than competitors? ›

Snowflake excels in scalability with its separate compute and storage resources, allowing organizations to scale both independently. It outperforms competitors like Amazon Redshift and Cloudera Data Warehouse in this aspect.

Why is Snowflake so overvalued? ›

On the surface, Snowflake stock appears overvalued and risky. The company has not yet turned profitable, meaning it does not have a P/E ratio. Moreover, at a price-to-sales (P/S) ratio of 25, it is nearly 10 times the S&P 500's average P/S ratio of 2.7.

What are the disadvantages of Snowflake? ›

Despite its advantages, such as extreme scalability, automatic performance tuning, and strong data security, Snowflake faces challenges like higher costs compared to competitors, lack of native cloud integration, and limited support for unstructured data.

Who competes with Snowflake? ›

See how Snowflake compares to similar products. Snowflake's top competitors include Databricks, Talend, and Alteryx. Databricks operates as a data and artificial intelligence (AI) company. It specializes in unifying and democratizing data, analytics, and artificial intelligence.

Why is Snowflake so successful? ›

Instant and almost Unlimited Scalability

The Snowflake architecture relies on a single elastic performance engine that offers high speed and scalability. Snowflake can also support a high number of concurrent users and workloads, both interactive and batch.

Why is Snowflake better than AWS? ›

Snowflake is the apparent winner compared to AWS Redshift in terms of maintenance because its separate storage and compute architecture makes it easier to scale up and down. You can change a warehouse's size or increase the number of clusters.

What influences Snowflake pricing? ›

Snowflake pricing includes compute costs, storage costs, and data transfer costs. Compute costs are based on the time and size of the virtual warehouse used. Storage costs are determined by the amount of data stored, and data transfer costs apply when moving data between regions.

What problem did Snowflake solve? ›

Snowflake is a powerful solution for organizations seeking to address common cloud data warehousing challenges. With its unique architecture, scalability, and high performance, it provides an efficient and cost-effective platform for managing and analyzing data.

How much does Snowflake charge per TB of data storage? ›

Type of account (Pre-Purchase Capacity vs On-Demand)

On-Demand accounts are flexible, but this flexibility comes at a price because this is the most expensive way to use Snowflake data storage. For example, Snowflake On-Demand costs $40 per TB per month when you deploy the service in the AWS US East (Northern Virginia) region.

Which big companies use Snowflake? ›

Adobe, Albertsons Companies, AT&T, Be The Match, Capital One, Deliveroo, Doordash, HP, Instacart, JetBlue, Kraft Heinz, Mastercard, McKesson, Micron, NBC Universal, Nielsen, Novartis, Okta, PepsiCo, Pitney Bowes, Siemens, University of Notre Dame, US Foods, Western Union, Yamaha, and many more.

What is Microsoft's alternative to Snowflake? ›

Microsoft's equivalent to Snowflake in the Azure ecosystem is Azure Synapse Analytics. This service combines enterprise data warehousing and big data analytics, offering a unified experience to ingest, prepare, manage, and serve data for immediate BI and machine learning needs.

Who owns Snowflake? ›

Snowflake was founded by Marcin Zukowski, Thierry Cruanes and Benoit Dageville on July 23, 2012 and is headquartered in Bozeman, MT.

Is Snowflake worth a buy? ›

Based on analyst ratings, Snowflake's 12-month average price target is $211.26. What is SNOW's upside potential, based on the analysts' average price target? Snowflake has 30.73% upside potential, based on the analysts' average price target.

Does Warren Buffett own Snowflake? ›

Two famous fund managers have large share positions in Snowflake (SNOW) stock. On the other hand, investors may be concerned about Snowflake's decelerating revenue growth.

Why not to invest in Snowflake? ›

Two reasons I'm still not sold on Snowflake

The high price tag isn't a deal-breaker in isolation, but it highlights the need for robust growth for an investment in this company to pay off. That growth isn't on tap in Snowflake's fiscal 2025 (which started in February).

What is special about snowflakes? ›

Because every snowflake follows a different path to the ground, exposing them to countless variations in temperature and humidity, every snowflake turns out unique.


Author: Eusebia Nader
