If you can have multiple writers to a single database then you'd need to look at something like cr-sqlite[1], which uses CRDTs to resolve conflicts. If you're just replicating separate databases then you might be able to replicate each one using Litestream and then use SQLite's ATTACH[2] to connect them together. There is a limit on how many databases you can attach in a single session, though.
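For illustration, the ATTACH approach looks like this (database and table names here are made up; the default attach limit, SQLITE_MAX_ATTACHED, is 10):

```shell
# Two independently replicated databases, queried together via ATTACH.
# File and table names are examples, not anything Litestream requires.
sqlite3 a.db "CREATE TABLE events (id INTEGER); INSERT INTO events VALUES (1);"
sqlite3 b.db "CREATE TABLE events (id INTEGER); INSERT INTO events VALUES (2);"

# Attach b.db to a connection on a.db and query across both.
sqlite3 a.db "ATTACH DATABASE 'b.db' AS b;
              SELECT id FROM events UNION ALL SELECT id FROM b.events;"
```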
Litestream author here. You can use the built-in file replication. It'll replicate all your database changes to another path on disk. I use it a lot for testing things out:
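A sketch of what that looks like, with made-up paths (the `file://` replica URL and the `-o` restore flag are how I'd expect to invoke it; check the docs for your version):

```shell
# Continuously replicate ./app.db to a file replica on local disk.
# Paths here are examples only.
litestream replicate ./app.db file:///tmp/app-replica

# Later, restore from that replica into a fresh database file.
litestream restore -o restored.db file:///tmp/app-replica
```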
Litestream author here. Currently we're handling the "single writer" issue outside of Litestream. We have controls in our internal systems that make it work well. But yes, the lease PR is the direction we're looking at going.
I'm not sure you can have readers see something separate from writers. When SQLite promotes a read lock to a write lock under WAL, it checks whether any of the data has changed since the read began and fails the transaction if it has.
You're correct that it's hard to have readers see something separate from writers. Making it work actually depends on an experimental feature that might go away at any time.
In journal mode (which is what the VFS uses), the sequence for a write transaction is: (1) xLock(shared); (2) xRead(change-counter); (3) xLock(reserved) ...
The read of the change counter with the shared lock complicates things, because at that point we don't know if there will be a write upgrade to reserved.
However, the experimental feature makes it like this: (1) xFileControl(experimental-pragma); (2) xLock(shared); (3) xRead(change-counter); (4) xLock(reserved) ...
So you can distinguish shared locks that will be immediately upgraded.
Author here. It caches pages in memory right now, but it doesn't do disk-based caching. That's a good idea though, and probably not too difficult to implement. Feel free to file a feature request as a GitHub issue on the project.
Author here. You can query the state of a table from an hour ago with Litestream VFS. It won't give you versioned tables in the sense that every row update writes a new version to a table somewhere, though.
Author here. We've done some proof-of-concept work on creating distributed leases using S3. We have some use cases internally where we've considered adding write capabilities to the VFS but we haven't started any work on it yet.
It's a tricky problem that goes beyond the fs, as you know. Since it's cloud and it's distributed, a manager wouldn't be that far-fetched: it could issue CRDT-like messages across the cluster of nodes to coordinate a "write", or use the sync mechanism you already have to propagate a "master" db that you write to (aggregated or designated, either way). I did some work on this on a Go-based graph database and ended up building a gossip-sync CRDT message bus.
Is this VFS for read-only databases? Or can I query a database that has a single Litestream writer somewhere continuously making updates and backing them up to S3?
The VFS is read-only, but it will continuously poll for new updates, so if you have a writer somewhere else using regular Litestream, it will pick up those updates automatically.
Author here. Litestream VFS will automatically poll for new backup data every second, so it keeps itself up to date with any changes made by the original database.
You don't need any additional code (Python or otherwise) to use the VFS. It will work on the SQLite CLI as is.
So how would I connect from a separate machine? I can't figure out from the post, the release notes, or the current Litestream website docs how to use the extension to do that.
Edit: you need to set LITESTREAM_ACCESS_KEY_ID, LITESTREAM_SECRET_ACCESS_KEY, and LITESTREAM_REPLICA_URL.
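Putting the pieces together, a session from a separate machine might look roughly like this. The extension filename, the VFS name "litestream", and all the values are assumptions on my part; the env var names come from the comment above, and the `vfs=` URI parameter is standard SQLite:

```shell
# Credentials and replica location (placeholder values).
export LITESTREAM_ACCESS_KEY_ID=AKIA...
export LITESTREAM_SECRET_ACCESS_KEY=...
export LITESTREAM_REPLICA_URL=s3://my-bucket/my-db

# In the SQLite CLI, load the extension, then open through the VFS
# using a URI filename. Extension path and VFS name are assumed.
sqlite3 <<'EOF'
.load ./litestream
.open file:replica.db?vfs=litestream
SELECT count(*) FROM my_table;
EOF
```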
[1]: https://github.com/vlcn-io/cr-sqlite
[2]: https://sqlite.org/lang_attach.html