Multiple people have coined the idea repeatedly, way before you. The oldest comment on HN I could find was in December 2022 by user spawarotti: https://news.ycombinator.com/item?id=33856172
Good article - most of the use cases I see for pgvector are “chat over their technical docs”:
- small corpus
- doesn’t change often / can rebuild the index
- no multi-tenancy, which avoids many of the issues with post-filtering
Chroma implements SPANN and SPFresh (to avoid the limitations of HNSW), pre-filtering, hybrid search, and has a 100% usage-based tier (many bills are around $1 per month).
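To make the post-filtering point concrete, here is a toy sketch in plain Python. It is a brute-force search over a made-up corpus, not pgvector's or Chroma's actual API; the row layout, the `tenant` field, and the function names are all assumptions for illustration. It shows how filtering by tenant after taking the top-k can leave you with fewer than k results (or none), while filtering first does not.

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity; smaller means more similar.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def top_k(rows, query, k):
    # Exact nearest neighbours by brute force (stands in for an ANN index).
    return sorted(rows, key=lambda r: cosine_distance(r["vec"], query))[:k]

def post_filter_search(rows, query, tenant, k):
    # Search across all tenants first, then drop rows from other tenants.
    candidates = top_k(rows, query, k)
    return [r for r in candidates if r["tenant"] == tenant]

def pre_filter_search(rows, query, tenant, k):
    # Restrict to the tenant first, then search only within that subset.
    allowed = [r for r in rows if r["tenant"] == tenant]
    return top_k(allowed, query, k)

# Hypothetical corpus: tenant "a" owns the vectors closest to the query.
rows = [
    {"id": "a1", "tenant": "a", "vec": [1.0, 0.0]},
    {"id": "a2", "tenant": "a", "vec": [0.9, 0.1]},
    {"id": "a3", "tenant": "a", "vec": [0.8, 0.2]},
    {"id": "b1", "tenant": "b", "vec": [0.5, 0.5]},
    {"id": "b2", "tenant": "b", "vec": [0.0, 1.0]},
]
query = [1.0, 0.0]

post = post_filter_search(rows, query, tenant="b", k=2)
pre = pre_filter_search(rows, query, tenant="b", k=2)
print([r["id"] for r in post])
print([r["id"] for r in pre])
```

With a single tenant the two functions return the same rows, which is why the "chat over docs" case rarely runs into this.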
> Supabase/pgVector needs lots of resources when adding new rows to the index -> wish the resources scale up/down automatically. Instead of having to monitor and switch to the next plan.
Potentially many ways - but one is that Chroma makes all this pain go away.
We're also working on some ingestion tooling that will make it so you don't have to scale, manage or run those pipelines.
I'll for sure take a deeper look.
Ingestion has been by far the biggest pain and least fun.
Those infra parts hold us back from the cool things -> building agents/search