It's Amazon's own model. I'm baffled that someone would pick it, and even more that someone would test Llama 4 for a task at a time when Sonnet 4.5 is already out, so within the last 45 days.
Looks like they were limited by AWS Bedrock options.
No offense, but I hate all the comparisons to a "junior dev" that I see out there. This process is just like any dev! I mean, who wouldn't have to tinker around a bit to get some piece of software to work? Is there a human out there who would just magically type all the right things - no errors - first try?
> And like a junior dev it ran into some problems and needed some nudges.
There are people who don't get blocked waiting for external input in order to get tasks like this done, which I think is the intended comparison. There's a level of intuition that junior devs and LLMs don't have that senior devs do.
To offer a counterpoint, I had much better intuition as a junior than I do now, and it was also better than the seniors on my team.
Sometimes looking at the same type of code and the same infra day in and day out makes you rusty. In my olden days, I did something different every week, and I had more free time to experiment.
Hobby coding is imho a high entropy signal that you joined the workforce with a junior title but basically senior experience, which is what I see from kids who learned programming young due to curiosity vs those who only started learning in university. IOW I suspect you were not a junior in anything but name and pay.
There’s also a factor of the young being very confident that they’re right ;)
Codex is actually pretty good at getting things working and unblocking itself.
It’s just that when I review the code, I would do things differently because the agent doesn’t have experience with our codebase. Although it is getting better at in-context learning from the existing code, it is still seeing all of it for the “first time”.
It’s not a junior dev, it’s just a dev perpetually in their first week at a new job. A pretty skilled one, at that!
And a lot of things translate. How well do you onboard new engineers? Well-written code is easier to read and modify, tests help maintain correctness while showing examples, etc.
Point taken, and I should have known better. I fully agree with you. I suppose I should say inexperienced dev or something more accurate. Having worked with many inexperienced devs, I saw quite a spread in capabilities. Using terms that are dismissive to individuals is not helpful.
> Is there a human out there who would just magically type all the right things - no errors - first try?
If they know what they're doing and it's not an exploratory task where the most efficient way to do it is by trial and error? Quite a few. Not always, but often.
That skill seems to have very little value in today's world though.
Tufte used to call this creating a "visual lie": you don't start the y-axis at 0, you start it wherever you like in order to maximize the apparent difference. It's dishonest.
There will be many negative comments here, so let me add a positive: writing helps you think. More so than making a ppt. My guess is that it is helpful in some cases to force this level of detailed thought.
The purpose of presenting isn't for you, it's for audiences. The thought process or lack of it will happen either way, and can be accomplished either way.
Not really a counterpoint, but a “yes, and”: I’ve often made an “internal” presentation that is mostly for myself and maybe a few others, which distills the key concepts of something into a coherent narrative. While it can help others, I also have found the process of creating a presentation, outline, or summary helps me to properly organize (and sometimes change) my thoughts at least as much as it helps convey those thoughts to others.