Hacker News | jascha_eng's comments

Yes, I agree with this. More granular going back, letting me interrupt where it went off the rails, or even editing file reads myself would be lovely. Ingesting parts of other conversations would also be cool!

The benchmark improvements actually look pretty damn nice tho!

They kind of have a little diagram explaining the steps. I imagine every single step in that to basically be its own Claude Code session.


Yup, it's all vibes. And Anthropic is still winning on those in my book.


None of his math really checks out. Building a piece of software is, or at least was, orders of magnitude more expensive than maintaining it. But how much money it can make is potentially unbounded (until it gets replaced).

So investing e.g. 10 million this year to build a product that produces maybe 2 million ARR will have amortized after 5 years if you can reduce engineering spend to zero. You can also use the same crew to build another product instead and repeat that process over and over again. That's why an engineering team is an asset.
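A quick sketch of that arithmetic, using the figures above (10 million to build, 2 million ARR) and assuming, as a simplification, flat revenue per year. The function name and the optional maintenance parameter are made up for illustration:

```python
def payback_years(build_cost, annual_revenue, annual_maintenance=0.0):
    """Years until cumulative net revenue covers the up-front build cost.

    Assumes flat revenue and flat maintenance cost (a simplification).
    Returns None if the product never pays for itself.
    """
    net_per_year = annual_revenue - annual_maintenance
    if net_per_year <= 0:
        return None
    return build_cost / net_per_year


# Engineering spend reduced to zero after launch: the case described above.
print(payback_years(10_000_000, 2_000_000))             # 5.0
# Hypothetical: 1 million per year still goes into maintenance.
print(payback_years(10_000_000, 2_000_000, 1_000_000))  # 10.0
# The lost bet: no revenue, so the cost is never recovered.
print(payback_years(10_000_000, 0))                     # None
```

The payback period is very sensitive to whatever you still spend after launch, which is why the definition of "maintenance" matters so much for this kind of estimate.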

It's also a gamble, if you invest 10 million this year and the product doesn't produce any revenue you lost the bet. You can decide to either bet again or lay everyone off.

It is incredibly hard or maybe even impossible to predict if a product or feature will be successful in driving revenue. So all his math is kinda pointless.


> Building a piece of software is or at least was orders of magnitude more expensive than maintaining it

This feels ludicrously backwards to me, and also contrary to what I've always seen as established wisdom - that most programming is maintenance. (Type `most programming is maintenance` into Google to find page after page of people advancing this thesis.) I suspect we have different ideas of what constitutes "maintenance".


> that most programming is maintenance.

What do you mean by maintenance?

A strict definition would be "the software is shipping but customers have encountered a bug bad enough that we will fix it". Most work is not of this type.

Most work is "the software is shipping but customers really want some new feature". Let us be clear though: even though it is often counted as maintenance, this is adding more features. If you had decided up front to not ship until all these features were in place it wouldn't change the work at all in most cases. (Once in a while it would, because the new feature doesn't fit cleanly into the original architecture, and had you known in advance you would have used a different architecture.)


> If you had decided up front to not ship until all these features were in place it wouldn't change the work at all in most cases

In my experience (of primarily web dev), this is not true, and the reasons it is not true are not limited to software architecture conflicts like you describe (although they happen too). Instead the problems I usually encounter are that:

* once you have shipped something and users are relying on it, it limits the decisions you are allowed to make about what features the system should have. You may regret implementing feature X because it precludes more valuable features Y and Z, but now that X is there, the cost of ripping it out is very high due to the backlash it will cause.

* once you have shipped an application, most of the time when you add new features you are probably slightly changing at least some UI, and so you need to think about how that's going to confuse experienced users and how to address that in a way you wouldn't have to when implementing something de novo. For an internal LOB app, that might mean creating announcements and demos and internal trainings that wouldn't be necessary for greenfield work.

* the majority of professional web dev involves systems with databases, and adding features frequently involves database migrations, and sometimes figuring out how to implement those database migrations without losing data or causing downtime is difficult and complicated.

* as web applications grow their userbase, the scale of the business often introduces new problems with software performance, with viability of analysing business-relevant data from the system, or with moderation or customer support tasks associated with the system, and these problems often demand new features to keep the broader business surrounding the software afloat that weren't needed at launch.

* software that has actually launched and become embedded in existing business processes inherently tends to have many more stakeholders in the business that care about it than pre-launch software, and those stakeholders naturally want to get involved in decision-making about their tools, and that creates meeting and communication overhead - sometimes to such a degree that stakeholder management and negotiating buy-in ends up being an order of magnitude more work than actually implementing the damn feature being argued about.

To the extent that the amount of work involved in implementing a new feature is inflated by these kinds of factors relative to what would have been involved in doing it de novo, I personally conceive of that as "maintenance" work; and in my experience my work on big teams at successful businesses has on average been inflated severalfold by those factors. (I also count work mandated by legal/compliance considerations that arise only after a successful launch as "maintenance". My rough conception of "software maintenance" is the delta between "the work involved in building a product de novo with the same customer-pleasing features that ours has" and "the work we actually had to do to incrementally build the product in parallel to it being used".)

Would most people agree with my broad notion of maintenance? I reckon they roughly would, but it's hard to say since people who talk about maintenance rarely attempt to define it with any precision. You give a precise but extremely narrow definition above. Wikipedia likewise gives a precise but extremely broad definition - that maintenance is "modification of software after delivery", under which definition surely over 99.999% of professional software development labour is expended on maintenance! I guess my definition puts me somewhere in the middle.


I like the good ol' "80% of the work in a software project happens before you ship. The other 80% is maintaining what you shipped."


The longer software is sold the more you need to maintain it. In year one most of the cost is making it. Over time other costs start to add up.


FWIW TJ is not your average vibe coder imo: https://www.linkedin.com/in/todd-j-green/

In September he burned through $3000 in API credits though, but I think that was before we finally bought Max plans for everyone who wanted one.


Is this meant to inspire confidence or fear?


Doesn't look as bad as I expected tbh. Sure some stuff could be better but I've seen much shittier vibe coded projects (including my own). I'd be more interested in their workflows and testing pipeline though. They ship pretty often but Boris still says he has 10+ PRs a day. I would be really curious what triggers a release, since it doesn't seem like every PR is released. I'm also curious how large their PRs really are.

There is a big difference between:

> Build plugins

and:

> Add 3px padding in line 5

if you claim "No code is written by humans anymore"


Yes, and even now, if you tell the LLM any private information inside the sandbox, it can leak that if it gets misdirected or prompt-injected.

So there isn't really a way to avoid this trade-off: you can either have a useless agent with no info and no access, or a useful agent that is incredibly risky to use because it might go rogue at any moment.

Sure you can slightly choose where on the scale you want to be but any usefulness inherently means it's also risky if you run LLMs async without supervision.

The only absolutely safe way to give access and info to an agent is with manual approvals for anything it does. Which gives you review fatigue in minutes.
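To make the approval idea concrete, here is a minimal sketch of a gate that sits between an agent and its actions. Everything in it (`ApprovalGate`, the `approve` callback, the example action descriptions) is hypothetical and not taken from any real agent framework; in practice `approve` would be a human answering a prompt, which is exactly where the review fatigue comes from.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ApprovalGate:
    """Runs a proposed action only if the reviewer approves its description."""
    approve: Callable[[str], bool]  # hypothetical reviewer callback
    log: list = field(default_factory=list)

    def run(self, description: str, action: Callable[[], object]):
        if not self.approve(description):
            # Denied actions are recorded but never executed.
            self.log.append(("denied", description))
            return None
        self.log.append(("approved", description))
        return action()


# Stand-in reviewer for the demo: approves only actions described as reads.
gate = ApprovalGate(approve=lambda desc: desc.startswith("read"))

print(gate.run("read README.md", lambda: "file contents"))
print(gate.run("send email with API key", lambda: "sent"))
print(gate.log)
```

Note that the gate only sees the description the agent gives it, so a misdirected agent could still mislabel an action. A real reviewer would need to inspect the action itself.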


FWIW I reported your post to the mods because it reads completely AI generated to me. My judgement was that it might have been slightly edited but is largely verbatim LLM output.

Some tells that you might wanna look at in your writing, if you truly did write it yourself without any LLM input, are these contrarian/pivoting statements. Your post is full of these and it is imo the most classic LLM writing tell atm. These are mostly variants of the "It's not X but Y" theme:

- "Not whether they've adopted every tool, but whether they're curious"

- "I still drive the intuition. The agents just execute at a speed I never could alone."

- "The model doesn't save you from bad decisions. It just helps you make them faster."

- "That foundation isn't decoration. It's the reason the AI is useful to me in the first place."

- "That's not prompting. That's engineering"

It is also telling that the reader basically can't take a breather: most of the sentences try to emphasize harder than the last one. There is no fluff thought, no getting sidetracked. It reads unnaturally; humans do not usually think like this.


The LLMs are training "us" now.

First we develop the machines, then we contort the entire social and psychic order to serve their rhythms and facilitate their operation.


When did they stop putting competitor models on the comparison table btw? And yeh, I mean, the benchmark improvements are meh. Context window and lack of real memory are still issues.

