Comments like this reveal the magnitude of polarization around this issue in tech circles. Most people actually feel this kind of animosity towards AI, so it's unusual that comment threads like this are even visible on HN. Needless to say, all my comments here are handwritten. But the poster knows that, of course.
In what sense? Instead of doing your job, which I assume you've been doing successfully for many years, you now ask Claude to do it for you and then have to review it?
The first sentence is "Announcing Magistral — the first reasoning model by Mistral AI — excelling in domain-specific, transparent, and multilingual reasoning." and those dashes should clearly be commas.
And this sentence is just flat-out wrong: "Lack of specialized depth needed for domain-specific problems, limited transparency, and inconsistent reasoning in the desired language — are just some of the known limitations of early thinking models."
I was trying to figure out what he does, and his website proudly states at the very top: “No templates, no no-code, no AI slop - just great sites built to grow.” Interesting!
So it gives you the wrong answer, and then you keep telling it how to fix it until it does? What does fancy prompting look like, then? Just feeding it the solution piece by piece?
Basically, yes, but there's a very wide range in how explicit the feedback can be. Here's an example where I tell gpt-4 exactly what the rule is and it still fails:
I'd share similar examples using claude-3.5-sonnet, but I can't figure out how to do it from the claude.ai UI.
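For anyone curious what that kind of loop looks like in code, here's a minimal sketch using the OpenAI Python client. It isn't my actual example: the word-reversal task, the expected output, and the retry cap are all hypothetical; it just shows the shape of the explicit-feedback loop.

    # Sketch of an explicit-feedback loop: ask, check the answer, and
    # feed the failure back as a new user turn until the model gets it.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    messages = [{
        "role": "user",
        "content": "Rule: reverse the letters of each word but keep "
                   "the word order. Input: 'hello world'",
    }]

    for attempt in range(3):  # hypothetical retry cap
        resp = client.chat.completions.create(model="gpt-4", messages=messages)
        answer = resp.choices[0].message.content
        if answer.strip() == "olleh dlrow":  # known-correct output
            break
        # Spell out exactly what went wrong and restate the rule.
        messages.append({"role": "assistant", "content": answer})
        messages.append({"role": "user", "content": (
            f"Wrong: expected 'olleh dlrow', got '{answer}'. "
            "Reverse each word's letters; keep the word order."
        )})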
To be clear, my point is not at all that o1 is so incredibly smart. IMO the ARC-AGI puzzles show very clearly how dumb even the most advanced models are. My point is just that o1 does seem to be noticeably better at solving these problems than previous models.