
It’s not about Redis vs not Redis, it’s about working with data that does not serialize well or lend itself well to extremely high update velocity.

Things like: counters, news feeds, chat messages, etc

The cost of delivery for doing these things well with a LSM based DB or RDB might actually be higher than Redis. Meaning: you would need more CPUs/memory to deliver this functionality, at scale, than you would with Redis, because of all the overhead of the underlying DB engine.

But for 99% of places that aren’t FAANG, that is fine actually. Anything under like 10k QPS and you can do it in MySQL in the dumbest way possible and no one would ever notice.



"But for 99% of places that aren’t FAANG, that is fine actually. Anything under like 10k QPS and you can do it in MySQL in the dumbest way possible and no one would ever notice."

It's not fine. I feel like you're really stretching it thin here, in an almost hand-waving way. There are plenty of cases at far smaller scale where latency is still a primary bottleneck and a crucial metric for valuable, competitive throughput, and where the distinctly higher latency of pretty much any comparable set of operations in a DBMS (like MySQL) results in a large performance loss compared to a proper key-value store.

An example I personally ran into a few years ago was a basic antispam mechanism (a dead simple rate-limiter) in a telecoms component seeing far below 10k items per second ("QPS"), fashioned exactly as suggested by using already-available MySQL for the counters' persistence: a fast and easy case of SELECT/UPDATE without any complexity or logic in the DQL/DML. Moving persistence to a proper key-value store cut latency to a fraction and more than doubled throughput, allowing for actually processing many thousands of SMSes per second for only an additional $15/month for the instance running Redis. Small operation, nowhere near "scale", huge impact to performance and ability to process customer requests, increased competitiveness. Every large customer noticed.
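For reference, the Redis side of a rate limiter like that is typically just INCR plus EXPIRE on a per-sender key. Here is a minimal Python sketch of that fixed-window logic; the class and names are hypothetical, and an in-memory stub stands in for a real Redis connection (with redis-py you would issue the same two commands against a server):

```python
import time

# Hypothetical in-memory stand-in for Redis INCR + EXPIRE.
class WindowCounter:
    def __init__(self):
        self._store = {}  # key -> (count, window_expiry)

    def incr_with_ttl(self, key, ttl_seconds, now=None):
        now = time.time() if now is None else now
        count, expiry = self._store.get(key, (0, now + ttl_seconds))
        if now >= expiry:                 # window elapsed: start a new one
            count, expiry = 0, now + ttl_seconds
        count += 1
        self._store[key] = (count, expiry)
        return count

def allow(counter, sender, limit=100, window=1, now=None):
    # One counter per sender per window: O(1) per message, versus a
    # SELECT + UPDATE round trip through a SQL parser/planner.
    return counter.incr_with_ttl(f"rl:{sender}", window, now) <= limit
```

The same shape works unchanged against a real server: swap the stub for `INCR key` followed by `EXPIRE key window` (or a small Lua script to make the pair atomic).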


A well-designed schema in a properly-sized-and-tuned [MySQL, Postgres] instance can and will execute point lookups in a few hundred microseconds.

That said, I agree that if you need a KV store, use a KV store. Though of course, Postgres can get you close out of the box with `CREATE UNLOGGED TABLE kv (data hstore);`.


> processing many thousands of SMSes per second for only an additional $15/month for the instance running Redis. Small operation, nowhere near "scale", huge impact to performance and ability to process customer requests

The vast majority of companies never need to deal with even one thousand of anything per second. Your situation was absolutely an unusually large scale.


I'm sure something other than the MySQL engine itself was the bottleneck in that case, like bad configuration or slow disk or something.

Did you profile the issue?


Unreplicated MEMORY tables, prepared and cached statements, efficient DDL and sane indices, no contention or locking, no access from multiple sessions, some performance tuning of InnoDB, ample resources, DB not stressed, no difference in pure network latency.

MySQL's query optimizer/planner/parser perform a lot more "gyrations" than Redis or MemcacheDB do before finally reaching the point of touching the datastore to be read/written, even in the case of prepared statements. Their respective complexities are not really comparable.


I've only ever seen Redis used in two scenarios: storing ephemeral cache data to horizontally scale Django applications and for ephemeral job processing where the metadata about the job was worthless.

I reevaluated it for a job processing context a couple of years ago and opted for websockets instead because what I really needed was something that outlived an HTTP timeout.

I've never actually seen it used in a case where it wasn't an architecture smell. The codebase itself is pretty clean and the ideas it has are good, but the idea of externalizing datastructures like that just doesn't seem that useful if you're building something correctly.


Redis + Sidekiq was a default for a long time in the Rails world as well, but it’s an unnecessary complication (and expense) for most use cases. Just use your existing DB until you need to seriously scale up, and then look at a message queue.

I’ve used Redis for leaderboards and random matchmaking though, stuff which is doable in postgres but is seriously write-heavy and a bit of a faff. Gives you exactly the sort of goodies you need on top of a K/V store without being difficult to set up.
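For illustration, the "goodies" a leaderboard leans on are the sorted-set operations ZINCRBY and ZREVRANGE. This is a hypothetical in-memory stub of just those two calls (with redis-py they map to `zincrby` and `zrevrange` against a real server, which keeps the ranking incrementally instead of re-sorting):

```python
# Hypothetical stub of the two Redis sorted-set calls a leaderboard needs.
class Leaderboard:
    def __init__(self):
        self._scores = {}  # member -> score

    def zincrby(self, member, amount):
        # Bump a member's score, creating it at 0 if absent (like ZINCRBY).
        self._scores[member] = self._scores.get(member, 0) + amount
        return self._scores[member]

    def zrevrange(self, start, stop):
        # Highest score first, inclusive bounds, like ZREVRANGE.
        ranked = sorted(self._scores.items(), key=lambda kv: -kv[1])
        return ranked[start:stop + 1]
```

Doing the same in Postgres means an indexed score column and an `ORDER BY score DESC LIMIT n` on every page load, which is exactly the write-heavy faff described above.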

As for caching - it’s nice to use as an engineer for sure, but pretty pricey. It wouldn’t be my default choice any more.


Rails is attempting to solve this with Solid Queue, which was inspired heavily by GoodJob; both use PostgreSQL (and Solid Queue supports more databases besides). Both seem to be fairly capable of "serious scale", at least being equivalent to Sidekiq.


A place I work at which I can’t name uses GoodJob at FAANG scale. And it works perfectly. The small startups and lower scale places still reach for sidekiq because they seem to think “it’s faster,” but it ends up being a nightmare for many people because when they do start reaching some scale, their queues and infra are so jacked up that they continually have sidekiq “emergencies.” GoodJob (and SolidQueue) for the win.

I like the sidekiq guy and wish him the best, but for me, the ubiquitous Redis dependency on my Rails apps is forever gone. Unless I actually need a KV store, but even for that, I can get away with PG and not know the difference.

Unfortunately there are still some CTOs out there who haven’t updated their knowledge and are still partying like it’s 2015.


I’m curious, are you running GoodJob in the same database as the application? For smaller scale stuff this is super convenient, but I wonder if it will become a problem at higher loads.


Using Redis exclusively remotely never made much sense to me. I get it as a secondary use case (gather stats from a server that’s running Redis, from another machine or something) but if it’s not acting as (effectively) structured, shared memory on a local machine with helpful coordination features, I don’t really get it. It excels at that, but all this Redis as a Service stuff where it’s never on the same machine as any of the processes accessing it don’t make sense to me.

Like you have to push those kinds of use cases if you’re trying to build a business around it, because a process that runs on your server with your other stuff isn’t a SaaS and everyone wants to sell SaaS, but it’s far enough outside its ideal niche that I don’t understand why it got popular to use that way.


To your last point: yep, especially bearing in mind that Redis is ephemeral. I've had much more success with SQLite plus a bunch of stricter validators (as SQLite itself is sadly pretty loose), and better performance too.


Exactly. Lots of people read posts by companies doing millions of QPS and then decide that they need Redis, Kafka, Elastic, NoSQL, etc. right from the start, and that complicates things. We are currently at 500k RPS scale, we have probably around a handful of use cases for Redis, and it works great.


I worked for a company that had enough customers that AWS had to rearrange their backlog for cert management to get us to come on board, and our ingress didn’t see 10,000 req/s. We put a KV store in front of practically all of our backend services though. We could have used Redis, but memcached was so stable and simple that we just manually sharded by service. We flew too close to the sun trying to make the miss rate in one of the stores a little lower and got bit by OOMKiller.

By the time it was clear we would have been better off with Redis’ sharding solution the team was comfortable with the devil they knew.


100% this. Also, is it data whose scale and speed is more important than its durability?

I actually agree with the author that Redis was not the right solution for the situations he was presented with, but he's far from proving it is not the solution for a whole host of other problems.


Even then you can do a lot of things to spread write contention with an RDBMS.

e.g. MySQL 8.0.1+ adds SKIP LOCKED modifier to SELECT ... FOR UPDATE.

Then you can increment the first available row, otherwise insert a new row. On read aggregate the values.
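A self-contained sketch of that sharded-counter pattern, using SQLite so it runs anywhere. Assumptions for the demo: picking a random shard stands in for "first available row", and the table/function names are made up; in MySQL 8.0.1+ you would instead `SELECT ... FOR UPDATE SKIP LOCKED` so concurrent writers skip each other's locked rows.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE counter_shards ("
    "  name  TEXT,"
    "  shard INTEGER,"
    "  value INTEGER,"
    "  PRIMARY KEY (name, shard))"
)

def increment(name, n_shards=8):
    # Spread writes across n_shards rows to reduce row-lock contention.
    shard = random.randrange(n_shards)
    cur = conn.execute(
        "UPDATE counter_shards SET value = value + 1 "
        "WHERE name = ? AND shard = ?",
        (name, shard),
    )
    if cur.rowcount == 0:  # shard row doesn't exist yet: insert it
        conn.execute(
            "INSERT INTO counter_shards (name, shard, value) VALUES (?, ?, 1)",
            (name, shard),
        )

def read(name):
    # Aggregate the shard rows on read.
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(value), 0) FROM counter_shards WHERE name = ?",
        (name,),
    ).fetchone()
    return total
```

Reads get slightly more expensive (a small SUM) in exchange for writers rarely touching the same row, which is the whole point of the pattern.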



