andersmurphy · 2026-01-29T19:58:32 1769716712

NetBSD! ... oh wait, not Linux... damn

andersmurphy · 2026-01-29T19:50:43 1769716243

Biggest displacement has to be commenting on HN.

andersmurphy · 2026-01-25T18:45:05 1769366705

Yeah, you really want NVMe drives directly connected to your machine/VPS. It can make an orders-of-magnitude difference.

andersmurphy · 2026-01-25T18:43:11 1769366591

I mean it has blob types. Which basically means you can implement any type you want. You can also trivially implement custom application functions to work on these blob types in your queries. [1]

- [1] https://sqlite.org/appfunc.html
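
A minimal sketch of that idea using Python's built-in sqlite3 module. The "point" type, its 16-byte encoding, and the `point_x` function name are all made up for illustration; only `create_function` is the real API.

```python
import sqlite3
import struct

# Hypothetical custom type: a 2D point stored as a 16-byte blob
# (two little-endian doubles). SQLite itself only sees a BLOB.
def pack_point(x, y):
    return struct.pack("<dd", x, y)

def point_x(blob):
    # Application-defined function: decode the blob and return x.
    return struct.unpack("<dd", blob)[0]

conn = sqlite3.connect(":memory:")
# Register the Python function so it can be called from SQL.
conn.create_function("point_x", 1, point_x)

conn.execute("CREATE TABLE places (name TEXT, pos BLOB)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?)",
    [("a", pack_point(1.5, 2.0)), ("b", pack_point(-3.0, 4.0))],
)

# The custom function is usable directly in the query.
rows = conn.execute(
    "SELECT name FROM places WHERE point_x(pos) > 0"
).fetchall()
print(rows)
```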

andersmurphy · 2026-01-24T14:33:01 1769265181

The thing is sqlite can scale further vertically than most network databases. In some contexts, like writes and interactive transactions, it outright scales further. [1]

That's before you even get into sharding sqlite.

[1] - https://andersmurphy.com/2025/12/02/100000-tps-over-a-billio...

bastawhiz · 2026-01-25T23:53:38 1769385218

Sqlite isn't the part that needs to scale in most cases, though. As soon as you need multiple servers to handle the traffic you're getting (serializing data, concatenating strings for HTML, lots of network throughput, or even just handling amounts of data that press you up against your memory limit), you're probably not going to have a great time with sqlite. Having multiple boxes talk to the same sqlite file is not something I've ever seen anyone do well at scale.

Yes, you can get by with one box for probably quite a while. But eventually a service of any significant size is going to need multiple boxes. Hell, even just having near-zero downtime deployments essentially requires it. Vertically scaling is generally a whole lot less cost effective than horizontal scaling (for rented servers), especially if your peak usage is much higher than off-hours use.

andersmurphy · 2026-01-26T02:48:51 1769395731

I'd argue the opposite: vertical scaling is a whole lot more effective than horizontal scaling if you're using a language that has both real threads and green/virtual threads (Go or anything on the JVM). You get such insane bang for your buck these days that even over-provisioning is cheap. Hell, direct NVMe can easily give 10-100x vs the crappy network drives AWS provides.

Zero downtime deploys have been solved for single machines. But, even then I'd argue most businesses can have an hour of downtime a month. I mean that's the same reliability as AWS these days.

Really, there are a handful of cases where you need multiple servers:

- You're network limited (basically you're a CDN).

- You're drive limited (you need to get data off drives faster than their bandwidth allows).

- Some legal requirement.

This is before we get into how trivial it is to shard sqlite by region or customer company. You can even shard sqlite on the same machine if you need higher write throughput.
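
A rough sketch of what sharding by customer on one machine could look like, in Python. The shard count, file names, `events` table and helper names are all assumptions for illustration. Each shard is its own sqlite file, so each one gets its own write lock.

```python
import os
import sqlite3
import tempfile
import zlib

NUM_SHARDS = 4  # hypothetical shard count

def shard_for(customer_id):
    # crc32 is stable across processes and restarts, unlike Python's
    # built-in hash(), so a customer always lands on the same shard.
    return zlib.crc32(customer_id.encode("utf-8")) % NUM_SHARDS

def open_shards(directory):
    conns = []
    for i in range(NUM_SHARDS):
        conn = sqlite3.connect(os.path.join(directory, f"shard-{i}.db"))
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS events (customer_id TEXT, payload TEXT)"
        )
        conns.append(conn)
    return conns

def record_event(conns, customer_id, payload):
    conn = conns[shard_for(customer_id)]
    with conn:  # one transaction per call, committed on exit
        conn.execute("INSERT INTO events VALUES (?, ?)", (customer_id, payload))

shard_dir = tempfile.mkdtemp()
shards = open_shards(shard_dir)
record_event(shards, "acme", "signup")
record_event(shards, "acme", "login")
record_event(shards, "globex", "signup")
```

Sharding by region works the same way, just with the region as the key instead of a hash of the customer id.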

graemep · 2026-01-25T17:30:55 1769362255

Is Postgres with "no network" running over a unix socket or an IP socket on the same machine?

andersmurphy · 2026-01-25T18:27:54 1769365674

Yes, unix socket using the Java 16 socket channels. Interestingly, there was only a 5-10% improvement vs IP sockets (with no SSL).

andersmurphy · 2026-01-17T09:01:03 1768640463

Wonder if that inspired the Tomb of the Eaters in Caves of Qud.

andersmurphy · 2026-01-12T07:19:54 1768202394

Forth

andersmurphy · 2026-01-05T19:52:44 1767642764

With a trend towards immutable single writer databases MMAP seems like a massive win.

mtndew4brkfst · 2026-01-06T02:33:34 1767666814

Andy is very critical of using mmap in database implementations.

hyc_symas · 2026-01-07T09:02:18 1767776538

Andy's critiques are only valid on dedicated database servers.

https://www.symas.com/post/are-you-sure-you-want-to-use-mmap...

LMDB uses mmap and Andy recommends LMDB, in the very article this thread is about.

andersmurphy · 2026-01-06T06:32:20 1767681140

Why? Sqlite and LMDB make fantastic use of it. For anyone doing a single writer db it's a no brainer. It does so much for you and it does it very well. All the things you don't have to implement because it does it for you:

- Reading the data from disk

- Concurrency between different threads reading the same data

- Caching and buffer management

- Eviction of pages from memory

- Playing nice with other processes in the machine

Why would you not leverage it? It's such a great fit for scaling reads.
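
To make the list above concrete, here's a minimal Python sketch of reading through mmap. The fixed-size record layout and the file name are made up for the example. Note there is no explicit read call, cache, or eviction logic anywhere in it; the OS page cache handles all of that.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical on-disk format: fixed-size records, one unsigned 64-bit int each.
RECORD = struct.Struct("<Q")

path = os.path.join(tempfile.mkdtemp(), "records.bin")
with open(path, "wb") as f:
    for i in range(1000):
        f.write(RECORD.pack(i * i))

def read_record(mm, index):
    # Index straight into the mapping; pages are faulted in on demand.
    return RECORD.unpack_from(mm, index * RECORD.size)[0]

with open(path, "rb") as f:
    # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        values = [read_record(mm, i) for i in (0, 7, 999)]
    finally:
        mm.close()

print(values)
```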

cmrdporcupine · 2026-01-06T12:57:56 1767704276

The strongest argument as far as I can see it is... the problem is you now lose control over all those things. It's a black box with effectively no knobs.

Anyways, read for yourself, Pavlo & Leis get into it in detail, and there's benchmarks:

https://db.cs.cmu.edu/papers/2022/cidr2022-p13-crotty.pdf

https://db.cs.cmu.edu/mmap-cidr2022/

andersmurphy · 2026-01-07T04:45:40 1767761140

What am I missing? The transactional safety problem (the bulk of the paper) is solved simply with a single writer. Which is where you want to be anyway for efficient batching throughput (and isolation).
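
Roughly what I mean by single writer with batching, sketched in Python with sqlite. The queue, the `kv` table, and the batch size are assumptions for illustration, not a tuned setup. All writes go through one thread, which drains whatever is queued and commits it as one transaction.

```python
import os
import queue
import sqlite3
import tempfile
import threading

write_queue = queue.Queue()
STOP = object()  # sentinel that tells the writer to shut down

def writer_loop(db_path, max_batch=100):
    # The only connection that ever writes lives in this thread.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
    stopping = False
    while not stopping:
        item = write_queue.get()  # block until there is work
        if item is STOP:
            break
        batch = [item]
        # Drain whatever else is already queued, up to max_batch.
        while len(batch) < max_batch:
            try:
                nxt = write_queue.get_nowait()
            except queue.Empty:
                break
            if nxt is STOP:
                stopping = True
                break
            batch.append(nxt)
        with conn:  # one transaction (one commit) per batch
            conn.executemany("INSERT OR REPLACE INTO kv VALUES (?, ?)", batch)
    conn.close()

db_path = os.path.join(tempfile.mkdtemp(), "single_writer.db")
writer = threading.Thread(target=writer_loop, args=(db_path,))
writer.start()

for i in range(250):
    write_queue.put((f"key-{i}", f"value-{i}"))
write_queue.put(STOP)
writer.join()

reader = sqlite3.connect(db_path)
count = reader.execute("SELECT COUNT(*) FROM kv").fetchone()[0]
print(count)
```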

The other concerns seem to imply there are no other programs running on the same machine as the database. The minute that's not true (is it ever true?), the OS will do a better job (as seen with LMDB etc).

I think it's telling that the paper focuses on mongoDB not LMDB.

hyc_symas · 2026-01-07T09:05:22 1767776722

Fun footnote: SQLite only got on board with mmap after I demonstrated how slow their code was without it. I.e., getting a 22x speedup by replacing SQLite's btree code with LMDB https://github.com/LMDB/sqlightning

andersmurphy · 2026-01-09T12:27:15 1767961635

Thank you for beating the mmap drum and LMDB! It's truly an incredible piece of tech.

alexpadula · 2026-01-06T06:59:56 1767682796

“It's such a great fit for scaling reads.”

And losing them.

andersmurphy · 2026-01-06T11:23:01 1767698581

How so? LMDB, boltdb/bbolt and sqlite (with mmap) are all rock solid. Just because mongodb used mmap badly does not make it any less valuable.

andersmurphy · 2025-12-20T20:21:44 1766262104

From what I remember, if AWS loses your data they basically give you some credits and that's it.

andersmurphy · 2025-12-20T20:19:43 1766261983

Yup, often orders of magnitude better.