New Kind of QA: One bottleneck I have (as the founder of a B2B SaaS) is testing changes. We have unit tests, we review PRs, etc., but those don't account for taste. I need to know if the feature feels right to the end user.
One example: we recently changed something about our onboarding flow. I needed to create a fresh team and go through the onboarding flow dozens of times. It involves adding third-party integrations (e.g. Postgres, a CRM, etc.), and each one can behave a little differently. The full process can take 5 to 10 minutes.
I want an agent to go through the flow hundreds of times, trying different things (i.e. trying to break it) before I do it myself. There are some obvious things I catch on the first pass that an agent should easily identify and figure out solutions to.
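To make this concrete, here is a minimal sketch of what such an agent-driven QA harness could look like, using Playwright to replay the onboarding flow with varied inputs and capture evidence on failure. The base URL, selectors, and success text are hypothetical placeholders, not any real product:

```python
# Hypothetical QA harness: replay an onboarding flow many times,
# varying inputs, and capture a screenshot whenever a run fails.
import random
from playwright.sync_api import sync_playwright

BASE_URL = "https://staging.example.com"  # placeholder staging environment

def run_onboarding(page, team_name: str) -> None:
    """Walk one pass of the onboarding flow (selectors are placeholders)."""
    page.goto(f"{BASE_URL}/signup")
    page.fill("#team-name", team_name)
    page.click("text=Continue")
    # ...steps for each third-party integration would go here...
    page.wait_for_selector("text=Setup complete", timeout=30_000)

with sync_playwright() as p:
    browser = p.chromium.launch()
    for i in range(100):
        page = browser.new_page()
        try:
            # Vary the inputs a little on each pass to probe edge cases.
            run_onboarding(page, team_name=f"qa-team-{i}-{random.randint(0, 9999)}")
        except Exception as exc:
            page.screenshot(path=f"failure-{i}.png")  # evidence for triage
            print(f"run {i} failed: {exc}")
        finally:
            page.close()
    browser.close()
```

An LLM agent would sit on top of a loop like this, choosing the variations and reading the failure screenshots; the replay-and-capture skeleton is the boring part it needs either way.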
New Kind of "Note to Self": Many of the voice memos, Loom videos, or notes I make (and later email to myself) are feature ideas. These could be 10x better with agents. If there were a local app recording my screen while I talk thru a problem or feature, agents could be picking up all sorts of context that would improve the final note.
Example: you're recording your screen and say "this drop-down menu should have an option to drop the cache". An agent could be listening in, capture a screenshot of the menu, find the frontend files/functions related to caching, and trace them to the backend endpoints. That single sentence would become a full spec for how to implement the feature.
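The code-tracing step is the most tractable part of that today. A toy version, with a made-up repo path and keyword list, is just a scan over the frontend sources that an agent could run after hearing the sentence:

```python
# Toy "trace the spoken sentence to code" step: find frontend files
# that mention caching. Repo path and keywords are made up for illustration.
import re
from pathlib import Path

KEYWORDS = re.compile(r"cache|invalidate|evict", re.IGNORECASE)

def find_related_code(repo: Path):
    """Yield (file, line number, line) for every keyword hit in .ts/.tsx files."""
    for path in repo.rglob("*.ts*"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if KEYWORDS.search(line):
                yield path, lineno, line.strip()

for path, lineno, line in find_related_code(Path("frontend/src")):
    print(f"{path}:{lineno}: {line}")
```

A real agent would use embeddings or an LSP rather than regexes, but even hits like these, attached to the screenshot and transcript, get most of the way to a spec.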
I paste screenshots into Claude Code every day and it's incredible. As in, I can't believe how good it is. I send a screenshot of console logs, a UI, and some HTML elements, and it just "gets it".
So saying they "suck" makes me not take your opinion seriously.
Yeah, models are definitely improving, but we've found even the latest ones still hallucinate and infer text rather than doing pure transcription. We run very rigorous benchmarks against all of the frontier models. We think the differentiation is accuracy on truly messy docs (nested tables, degraded scans, handwriting) and being able to deploy on-prem/VPC for regulated industries.
The only SaaS that AI has eaten for us is Retool. It wasn't the cost (we were paying < $200 per month); Retool has just become clunkier than writing code with AI to solve the problems we used Retool for.
We released a Meltano target for DuckLake[0]. dlt has one now too. Pretty easy to sync pg -> DuckLake.
I've been really happy with DuckLake, happy to answer any questions about it.
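For anyone wondering what that sync looks like under the hood, here is a rough sketch using plain DuckDB SQL from Python (the Meltano/dlt targets wrap this kind of flow). The catalog name, connection string, bucket, and table are placeholders, and S3 credentials are assumed to be configured separately:

```python
# Sketch of a Postgres -> DuckLake copy using DuckDB's ducklake and
# postgres extensions. All names/paths below are placeholders.
import duckdb

con = duckdb.connect()
con.execute("INSTALL ducklake; LOAD ducklake;")
con.execute("INSTALL postgres; LOAD postgres;")

# Attach the DuckLake catalog: a metadata database plus an object-store
# data path where the Parquet files will live.
con.execute("""
    ATTACH 'ducklake:metadata.ducklake' AS lake
        (DATA_PATH 's3://my-bucket/lake/');
""")

# Attach the source Postgres database.
con.execute("ATTACH 'dbname=app host=localhost user=app' AS pg (TYPE postgres);")

# Copy a table into the lake; add a WHERE clause for incremental loads.
con.execute("""
    CREATE OR REPLACE TABLE lake.events AS
    SELECT * FROM pg.public.events;
""")
```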
DuckDB has always felt easier to use than ClickHouse to me, but both are great options. If I were you, I'd try both for a few hours with your use case and pick the one that feels better.
I love DuckDB from a product perspective and appreciate the engineering excellence behind it. However, DuckDB was primarily built for seamless in-process analytics, data science, and data-preparation/ETL workloads rather than real-time, customer-facing analytics.
ClickHouse’s bread and butter is real-time analytics for customer-facing applications, which often come with demanding concurrency and latency requirements.
Ack, totally makes sense; both are amazing technologies. You could try both, test them at the scale your real-time application may reach, and then choose the technology that best fits your needs. :)
You keep something like 1 year's worth of data in your "business database", and then archive the rest to S3 as Parquet and query it with DuckDB?
And if you want to sync everything, even "current data", to do data science/analytics, can you just write the recent data (e.g. the last week of data or whatever) to S3 every hour/day to get relatively up-to-date data? And doesn't that cause the S3 data to grow needlessly (i.e. does it replace, rather than store an additional copy of, the recent data each hour)?
Do you have some kind of "starter project" for a Postgres + DuckLake integration that I could look at, to see how it's used in practice and how it makes some operations easier?
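A minimal sketch of the pattern being asked about, assuming DuckDB's postgres and httpfs extensions and placeholder names throughout: export a closed time window from Postgres to a single Parquet object on S3, then query the live table and the archive together. Because each window maps to one S3 key, re-running an export overwrites that object rather than storing an extra copy, so the archive doesn't grow needlessly:

```python
# Archive-and-query sketch: cold data as Parquet on S3, hot data in Postgres.
# Bucket, connection string, and table names are placeholders; S3 credentials
# are assumed to be configured (e.g. via a DuckDB secret or env variables).
import duckdb

con = duckdb.connect()
con.execute("INSTALL postgres; LOAD postgres;")
con.execute("INSTALL httpfs; LOAD httpfs;")
con.execute("ATTACH 'dbname=app host=localhost user=app' AS pg (TYPE postgres);")

# Archive one month to one S3 object. Re-running this statement for the
# same month overwrites the same key instead of adding a duplicate copy.
con.execute("""
    COPY (
        SELECT * FROM pg.public.events
        WHERE created_at >= DATE '2024-01-01'
          AND created_at <  DATE '2024-02-01'
    ) TO 's3://my-bucket/archive/events/2024-01.parquet' (FORMAT parquet);
""")

# Query hot + cold data in one statement; the date filter on the hot
# branch avoids double counting rows that were already archived.
rows = con.execute("""
    SELECT count(*) FROM (
        SELECT * FROM pg.public.events WHERE created_at >= DATE '2024-02-01'
        UNION ALL
        SELECT * FROM read_parquet('s3://my-bucket/archive/events/*.parquet')
    )
""").fetchall()
print(rows)
```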
I just had a sales call with someone from Microsoft who was looking for an AI tool to automate some Excel work they were doing. I doubt they'll buy our product, but it gave me a good laugh.
The goal here is not to replace Cursor's own local codebase indexing; Cursor already does that part well. What Nia focuses on is external context: it lets agents pull in accurate information from remote sources like docs, packages, APIs, and broader knowledge bases.
That’s what GP is saying. This is the Docs feature of Cursor. It covers external docs/arbitrary web content.
`@Docs` — will show a bunch of pre-indexed Docs, and you can add whatever you want and it’ll show up in the list. You can see the state of Docs indexing in Cursor Settings.
The UX leaves a bit to be desired, but that’s a problem Cursor seems to have in general.
Yeah, the UX is pretty bad, and so is the overall functionality. It still relies on a static retrieval layer and a limited index scope.
Plus, as I mentioned above, there are many more use cases than just coding. Think docs, APIs, research, knowledge bases, even personal or enterprise data sources the agent needs to explore and validate dynamically.
As an AI user (Claude Code, Rovo, GitHub Copilot) I have come across this. In code, it didn't build something right where it needed to use up-to-date docs. Luckily those people have now made an MCP server, but I had to wait; for a different project I may be SOL. Surprised this isn't solved. Well done for taking it on.
From a business point of view, I am not sure how you get traction without being 10x better than what Cursor can produce tomorrow. If you are successful, the coding agents will copy your idea, and then people, being lazy and using what works, will have no incentive to switch.
I am not trying to discourage. More like encourage you to figure out how you get that elusive moat that all startups seek.
As a user I am excited to try it soon. Got something in mind that this should make easier.
This is different because of the background refresh, the identifier extraction, and the graph. I know because I use Cursor and, oddly enough, am building the exact same thing.
> At the time of writing, Bun's monthly downloads grew 25% last month (October, 2025), passing 7.2 million monthly downloads. We had over 4 years of runway to figure out monetization. We didn't have to join Anthropic.
I believe this completely. They didn't have to join, which means they got a solid valuation.
> Instead of putting our users & community through "Bun, the VC-backed startup tries to figure out monetization" – thanks to Anthropic, we can skip that chapter entirely and focus on building the best JavaScript tooling.
I believe this a bit less. It'll be nice not to have some weird monetization shoved into Bun, but their focus will likely shift a bit.
> They didn't have to join, which means they got a solid valuation.
Did they? I see a $7MM seed round in 2022. Now, to be clear, that's a great seed round, and it looks like they had plenty of traction. But it's unclear to me how they were going to monetize enough to justify that $7MM investment. If they continued with the consultancy model, they would need to pay back investors from contracts they negotiate with other companies, but that's a fraught way to get early cash flow going.
Though if I'm not mistaken, Confluent did the same thing?
I don't like all of the decisions they made for the runtime, or some of how they communicate over social media and their company culture, but I do admire how well-run the operation seems to have been from the outside. They've done a lot with (relatively) little, which is refreshing in our industry. I don't doubt they had a long runway either.
Thanks, I scrolled past that on the announcement page.
With more runway come more investor expectations, though. Some of the concern with VC-backed companies is whether the valuation remains worthwhile. $26MM in funding is plenty for 14 people, but again, the question is whether they can justify their valuation.
Regardless, happy for the Oven folks, and Bun has been a great experience (especially for someone who got into the JS ecosystem quite late). I'm curious what the structure of the acquisition deal was like.
I really don't understand why investors poured so much money into Bun; I guess they saw another potential Vercel play? An acquisition, even by Anthropic, doesn't sound like a very good outcome for these investors, I would imagine.
> They didn't have to join, which means they got a solid valuation.
This isn't really true. It's more about who wanted them to join. Maybe it was Anthropic who really wanted to take over Bun and hire Jarred, or maybe it was Jarred who got sick of Bun and wanted to work on AI.
I don't really know any details about this acquisition, and I assume it's the former, but acquihires are also done for reasons other than "it was the only way".
That’s kind of why Anthropic became a separate company in the first place though isn’t it? Dario Amodei was former head of research at OpenAI and left along with 6 or 7 others to form Anthropic.
Given the worries about LLM-focused companies reaching profitability, I have concerns that Bun's runway will be hijacked... I'd hate for them to go down with the ship when the bubble pops.
At least Anthropic itself has the stated goal of creating ethical AI that benefits humanity. That's more than can be said for any other AI company. Time will tell, though. Google's motto used to be "don't be evil", and now it's basically the opposite.
Yeah, now they are part of Anthropic, who haven't figured out monetization themselves. Yikes!
I'm a user of Bun and an Anthropic customer. Claude Code is great, and it's definitely where their models shine. Outside of that, Anthropic sucks: their apps and web UI are complete crap, borderline unusable, and the models are just meh. I get it, CC's head probably pulled a power play here, given that his department is towing the company, and his secret sauce, according to marketing from Oven, was Bun. In fact, VS Code's Claude backend is distributed as a Bun-compiled binary, and the guy has been featured on the front page of the Bun website for at least a week or so. So they bought the kid the toy he asked for.
What Anthropic urgently needs instead is to acquire a good team behind a good chatbot and make something minimally decent, and then make their models work for everything else as well as they do with code.
> Yeah, now they are part of Anthropic, who haven't figured out monetization themselves.
Anthropic are on track to reach $9BN in annualised revenue by the end of the year, and the six-month-old Claude Code already accounts for $1BN of that.
Not sure if that counts as "figured out monetization" when no AI company is even close to being profitable -- being able to get some money for running far more expensive setups is not nothing, but also not success.
Monetisation is not profitability, it’s just the existence of a revenue stream. If a startup says they are pre-monetisation it doesn’t mean they are bringing in money but in the red, it means they haven’t created any revenue streams yet.
How is their web app any different from any other AI's? I feel like it's on par with all of them. It works great for me, although I mostly use Claude Code.