Silly logical mistakes like that are rapidly decreasing in frequency as models improve, and I see no reason why they won't soon be a thing of the past.
For example, I haven't seen Grok make a mistake like that in a long time, and it has no problem with your question:
> Drive, obviously.
> If you walk the 100m, your car stays parked at home, still dirty, wondering why you abandoned it. The whole point is to get the car to the car wash.
There are a lot of non-developer Claude Code users these days. The hype around vibe coding has everyone thinking they can now be an engineer. The problem is that if Anthropic caters to that crowd, the devs who use it for somewhat serious engineering tasks and don't believe in the "run an army of parallel agents and pray" methodology get alienated.
Maybe Claude Code on the web or desktop could be targeted at these new vibe coders instead? These folks often don't know how simple bash commands work, so the terminal is the wrong UX for them anyway. (Bash as a tool is still very powerful for any agentic experience.)
Logs (and in this case, Verbose Mode) aren't for knowing what a thing is currently doing as it's doing it; they're for finding out what happened when the thing didn't do what you expected or wanted.
Microsoft fell into this trap in the 90s -- they believed that they could hide the DOS prompt, and make everything "easier" with wizards where you just go through a series of screens clicking "next", "next", "finish".
Yes, it was easier. But it dumbed down a generation of developers.
It took them two decades to come up with PowerShell, and by then it was too late.
Exactly how I feel. I'm happy that more people are using these tools and (hopefully) learning about engineering, but it shouldn't degrade the core experience for, let's say, "more advanced" users who don't see themselves as vibe coders and want precise control over what's happening.
If anything, the reverse: it devalues engineering. For most, LLMs are a path to an end product without the bother or effort of understanding. No different from what paid engineers were, but even better, because you don't have to talk to engineers or pay them.
The sparks of genuine curiosity here are a rounding error.
No, like those paid games where NPCs start pointing to clues if the player takes too long to solve a riddle, or where you can skip the hard parts if you fail too often.
There are many camps. Some programmers embrace the prompt, some use parts of it, some reject it on principle (a dying breed?), some think that non-developers are finally getting a go at it, some gloat that tech barons are making software engineers (apply optional quotes) obsolete.
It’s all too varied to put people into one or two camps.
I've worked as a software engineer with different types of engineers (electrical, mechanical and automation).
Their testing is often stricter, but that is a natural consequence of their products being significantly harder to fix in the field than a software product is.
Other than that, my experience is that our way of working on projects across disciplines is very similar.
I think Dario & crew are getting high on their own supply and really believe the "software developers out of work by end of 2026" pronouncements.
Meanwhile all evidence is that the true value of these tools is in their ability to augment & super-charge competent software engineers, not replace them.
Meanwhile the quality of Claude Code the tool itself is a bit of a damning indictment of their philosophy.
Give me a team of experienced, sharp, diligent engineers with these coding tools and we can make absolutely amazing things. But a newbie product manager with no software engineering fundamentals issuing prompts will make a mess.
I can see it even in my own work -- when I venture into frontend engineering using these tools, the results look good but often have reliability issues. Because my background/specialization is in systems, embedded, and backend work, I'm not good at reviewing the React etc. code it makes.
Amodei has to be the most insufferable of all the AI hucksters, nowadays even Altman looks tame compared to him.
The whole company also has this meme about AI safety and some sort of fear-mongering about the models every few months. It's basically a smokescreen for normies and other midwits to make it look more mysterious and advanced than it really is. OOOOH IT'S GOING TO BREAK OUT! IT KNOWS IT'S BEING EVALUATED!
I bet there are some true believers at Anthropic too, people who think themselves too smart to believe in God, so they replaced it with AI instead; but all the same hopes are there, e.g. Amodei preaching about AI doubling the human lifespan. In religion we usually talk about heaven.
Anecdotally, all the non-technical people I know are adapting fine to the console. You don’t need to know how bash commands work to use it as you are just approving commands, not writing them.
This is a great take. It applies the "SaaS is dead" theory at a lower level (libraries are dead), but with a much more nuanced view.
Yeah, even if LLMs are 10x better than today, you probably still don't want to implement cryptography from scratch; you want to use a library.
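To make that concrete, here is a minimal TypeScript sketch of the library route, using Node's built-in node:crypto module (the function names are illustrative, not from anyone's actual project):

```ts
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Derive a password hash with a vetted KDF instead of hand-rolling one.
function hashPassword(password: string): string {
  const salt = randomBytes(16);
  const hash = scryptSync(password, salt, 64);
  return `${salt.toString("hex")}:${hash.toString("hex")}`;
}

function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, hashHex] = stored.split(":");
  const candidate = scryptSync(password, Buffer.from(saltHex, "hex"), 64);
  // timingSafeEqual closes the timing side channel a naive === comparison
  // would leak -- exactly the kind of detail a from-scratch rewrite misses.
  return timingSafeEqual(candidate, Buffer.from(hashHex, "hex"));
}
```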
I also like the 3D printing analogy. We will see how good LLMs get, but I will say that a lot of AI-coded tools today have the same feeling as 3D-printed hardware.
If no engineer was involved, the software is cheap and breaks under pressure because no one considered the edge cases. It looks good on the surface, but if you use it for something serious, it does break.
The engineer might still use an LLM/3D printer, but where necessary he'll use a metal connection (write code by hand, or at least tightly guide the LLM) to make the product sturdy.
But why? Even if you could have an AI do that, it's, if anything, a waste of CPU cycles. If you have a battle-tested library that works and has been exercised over trillions of request cycles, why would you even want to write a new one that needs testing and maintenance? No matter how cheap code generation gets, it doesn't make sense. For something like a UI library, sure, build something specific to your needs.
Libraries are really built for human beings, not superintelligent machines. ORMs exist because I don't like to, and can't, write complex SQL with every edge case.
The same goes for a lot of software: libraries are designed to work around the deficiencies of the human mind.
There's no reason to think AI needs these libraries in the same way.
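As a concrete sketch of what that human-oriented layer buys (assuming a hypothetical Prisma schema with Post and User models; none of this is from the thread):

```ts
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

// The ORM call: intent expressed in terms a human can hold in their head.
async function recentPosts() {
  return prisma.post.findMany({
    where: { published: true, author: { email: { endsWith: "@example.com" } } },
    orderBy: { createdAt: "desc" },
    take: 10,
  });
}

// Roughly the SQL it shields us from -- which an AI could just emit directly:
//   SELECT p.* FROM "Post" p
//   JOIN "User" u ON u.id = p."authorId"
//   WHERE p.published = TRUE AND u.email LIKE '%@example.com'
//   ORDER BY p."createdAt" DESC
//   LIMIT 10;
```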
Even in your scenario, LLMs could write super-optimized libraries (not intended for humans) to do common tasks and share the code between them.
I'm not saying the future can't get to an AI just producing everything. I'm saying it's just plain inefficient to keep solving the same problem over and over.
I agree with part of this (see my comment above). That said, our limitations were also how we produced mathematics. Categorizing the world into fixed concepts is valuable, I'd say.
This seems like today's version of "I could write Facebook in a weekend."
What are the incentives for doing that? What are the incentives for everyone else to move?
So if proven things exist for the basics, what's the incentive not to use them? If everyone decides they're too heavy, they could make and publish new libraries, and tools would pick those up. And since they're old, the feature set is probably more nuanced than you expect. YAGNI is a motto for doing less to avoid creating code debt; writing more net-new code to avoid using a stable and proven library doesn't fit that.
Importing a library for everything will become a dated concept, similar to the idea that object-relational mappers might go away if the AI can just write ultra-complicated, hyper-efficient SQL for every query.
If AI makes the per-line cost of producing software cheaper but you still need an expensive human in the loop, then the per-line cost is merely cheap, not free or at the cost of electricity.
Given the choice between
A) having one AI produce a library -- which comes with tests, human-in-the-loop vetting, documentation, and examples -- and having 1000 AIs produce code using that library, which drastically increases the chance of the 1000 AIs doing it correctly, or
B) having 1001 AIs produce the same functionality provided by the library, probably worse on average and requiring more expensive hand-holding,
what is the benefit of B? You might get slightly higher specificity to your use case, but it's more likely that the only increased specificity is shit you didn't realize you needed yet and will have to prompt and guide the AI to produce.
I fail to see how AI would obviate the need to modularize and re-use code.
This take is just an interim take until AI takes over software engineering. In the same way, self-driving cars will eventually make human drivers look dangerous.
I think your thought process is not taking into account what a super-logical AI can do, and how effortlessly it could generate some of this code.
Why does the AI need SQL queries? Who needs that? It will just write its own ACID-compliant database with its own language, and while it's at it, reinvent the operating system as well. It's turtles all the way down.
It's actually not a ridiculous concept, but I think in some ways code will go away, and the agent itself will do what the code used to do. Software of the future will be far more dynamic and generated on the fly. We won't have these rigid structures.
Why does the AI need hardware/chips? Why does the AI need the universe to exist? Why does the AI need math/logic to exist?
Using these preexisting things will all become outdated. You will look like primitive cavemen if your agents don't build these from scratch every time you build $NEXT_BIG_THING.
Even local LLMs will be able to build these from scratch by end of 2026.
ORMs have largely been fading away for a while because there are real wins to not using them.
Hyper-optimized HTTP request/response parsing? Yawn. Far less interesting.
AFAICT, the advantages of keeping context tight and focused have not gone away. So there would need to be pretty interesting advantages to not just doing the easy thing.
Build times too. I kinda doubt you're setting up strictly modularized and tightly controlled Bazel builds for all your stuff to avoid extra recompilation... so why are we overcomplicating the easy stuff? Just because "it will probably function just as well"?
"leftpad"-level library inanity? Sure, even less need than before (there was never much). Substantial libraries? What's the point?
Hell, some of the most-used, heavily AI-coded software is going in the opposite direction, jumping through hoops to keep using web UI libraries even though they're terminal apps.
How do you determine flawlessness? How do you even approximate a guarantee of it? To what specification is flawlessness judged, and can you precisely and completely relay that spec to your friendly local robot more efficiently than it can vendor or import existing libraries?
It'll just spit the code out. I vibe-coded some cookie handling the other day and it worked. Should I have done it? Nope. But the AI did it and I allowed it.
The concept of using a library for everything will become outdated
It read the library and created a custom implementation for my use case. The implementation was interoperable with a popular Next.js library. It was a hack, sure, but it also took me three minutes.
The value of a library is not just that it does a thing you want, but that it doesn’t do all the things you’d prefer it didn’t.
It’s easy to write a cookie parser for a simple case; clearly your robot was able to hand you one for millidollars. How confident are you that you’ve exhaustively specified the exact subset of situations your code is going to encounter, so the missing functionality doesn’t matter? How confident are you that its implementation doesn’t blow up under duress? How many tokens do you want to commit to confirming that (reasoning, test, pick your poison)?
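For the sake of argument, here's the kind of three-minute parser an LLM will happily hand you (a hypothetical sketch, not the poster's actual code), along with some of the edge cases it quietly mishandles:

```ts
// A naive cookie parser -- fine for the happy path.
function parseCookies(header: string): Record<string, string> {
  const jar: Record<string, string> = {};
  for (const pair of header.split(";")) {
    const idx = pair.indexOf("=");
    if (idx === -1) continue; // nameless entries are silently dropped
    const name = pair.slice(0, idx).trim();
    // decodeURIComponent throws a URIError on malformed %-sequences,
    // so a single bad cookie can blow up the whole request handler.
    jar[name] = decodeURIComponent(pair.slice(idx + 1).trim());
  }
  return jar;
}

parseCookies("session=abc123; theme=dark"); // works
// parseCookies("tracker=%E0%A4%A");        // throws URIError
// Duplicate names clobber earlier values, quoted values keep their quotes --
// edge cases a maintained library has already been bitten by and handles.
```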
I mean ASI could just generate the pixels of a webpage at 60hz. And who needs cookies if ASI knows who you are after seeing half a frame of the dust on your wall, and then knows all the pixels and all the clicks that have transpired across all your interactions with its surfaces. And who needs webpages when ASI can conjure anything on a screen and its robots and drones can deliver any earthly desire at the cost of its inputs. That is if we’re not all made into paper clips or dead fighting for control of the ASML guild at that point.
I say all that mostly in jest, but to take your point somewhat seriously: the most accurate thing you’ve said is that no one is ready for super intelligence. What a great and terrible paroxysm it would be.
Which is the reverse of how humans design things: layers, modules. LLMs act as generalized compilers. Impressive, but at the same time you end up with a static bunch of files instead of a system of parts. (That said, I'm not a great user of LLMs, so maybe people have managed to produce proto-frameworks with them, or maybe that will be the next phase... module-oriented LLM training.)
Verbatim LLM output with little substance to it.
HN mods don't want us to be negative, but if this is what we have to take seriously these days, it's hard to say anything else.
I guess I could not comment at all, but that feels like just letting the platform sink into the slopocalypse.
No one can correctly quantify what these models can and can't do. That leads to the people in charge completely overselling them (automating all white-collar jobs, doing all software engineering, etc.) and to the people threatened by those statements firing back when the models inevitably fail at doing what was promised.
They are very capable, but it's very hard to explain to what degree. It is even harder to quantify what they will be able to do in the future and what inherent limits exist -- again leading the people benefiting from them to claim that there are no limits.
The truth is that we just don't know. And there are too few good folks out there who are actually reasonable about it, because the ones who know are working on the tech and benefit from more hype. Karpathy is one of the few who left the rocket and still gives an optimistic but reasonable perspective.
I ran the command, looked at the report.
It said something like 300+ sessions are about database migrations.
That can't be right; it's a new project and I've only had about 10 Claude Code sessions so far.
I asked Claude about it; it looked into some files and this was its response:
> The "336" number was fabricated by the insights generation process. It doesn't exist anywhere in the underlying facets data. The insights generator appears to have hallucinated
statistics that sound plausible but aren't grounded in actual data.
The facets only analyzed 7 of your sessions, so any aggregate statistics like "336 database_changes" or "168 multi_file_changes" are made up.
So the whole report is fabricated. Great. This kinda stuff just doesn't work (yet; maybe it never will).
You just don't know which parts of the doc are real and which are hallucinated. Maybe the prompter checked everything and the content is actually good, but sadly many don't, and there is a lot of slop flying around.
So if you want others to read the output you'll have to de-slopify it, ideally make it shorter and remove the tells.
If I went by good faith these days and trusted that someone didn't just upload LLM-hallucinated bullshit, I'd sadly just be reading slop all day and not learning anything, or worse, getting deceived by hallucinations into making wrong assumptions. It's just a waste of someone's precious lifetime.
LLMs can read through slop all day; humans cannot without getting extremely annoyed.
> You just don't know which parts of the doc are real and which are hallucinated.
It doesn't look like slop at all to me. GP claimed that this was written by AI without evidence, which I assumed to be based on bias, given GP's comment history: https://news.ycombinator.com/threads?id=jondwillis The complaint they have about the writing style is not the style that is emblematic of AI slop. And considering the depth of analysis and breadth of connection, this is not something current AI is up to producing.
Are you also assuming the article was written by an AI?