I've been a contributor and reviewer for Terminal-Bench since last August, and this post is about what I've learned designing and reviewing tasks. The guidance is broadly applicable to anyone building an agentic benchmark. I would love feedback from the HN community.
The author’s intuition is still calibrated backward, even though he talks about the future. He doesn’t have an intuition for the future. All code will be AI generated. There’s no way to compete with the AI, and whatever new downsides this brings will be solved in ways we aren’t fully anticipating. But the solution is not to walk back vibecoding. You have to be blind to believe that most code won't be vibecoded very soon.
You have to be incredibly incompetent and naive to look at the absolute garbage theatre that AI outputs today and conclude "yeah, this will write all future code".
Usually the response, for the last few years, has been "no no, you don't get it, it'll get so much better", and then they make the context window slightly larger and make it run Python code to do math.
What will really happen is that you and people like you will let Claude or some other commercial product write code, which it then effectively owns. The second Claude becomes more expensive, you will pay, because all your tooling, your "prompts saved in commits", etc. will not work the same with whatever other AI offering you switch to.
You've just reinvented vendor lock-in, or "highly paid consultant code", on a whole new level.
Can you explain what you think will happen, actually? People at OpenAI and Anthropic are no longer coding by hand. Are you saying everyone changes their mind and goes back? Not gonna happen. You have to work around this new constraint.
Yes, I'm saying that the companies whose entire business model is selling you AIs are not a reliable source. And of course, again, if you are competent, you can see that AI only generates passable output when guided or when the scope is small. This guiding only works when there is a human operator.
You're all being fooled by emergent behavior, which acts like intelligence but doesn't fool people who know how to write code.
I'm sorry, I'm not sure how to say this without sounding elitist, but the goalpost has not moved since GPT-3. These tools, autonomously, produce code that only fools the clueless. I don't know how else to put it, and I'm getting really tired of the argument that "look, a company with billions and some of the buggiest, shittiest software is using AI to write it". No shit they are.
So what's going to happen? We will see the same divide we've seen with JavaScript. It's new, then everyone says it's what everyone must use. C++ developers no longer needed--it's all JS now. No need for native UIs, it's all JS now. If you're not learning the latest web tech, you'll miss out and fall behind. If you're studying anything but web, you'll be out of a job in 5 years. And now, a couple of decades later, we are still waiting for it.
Well, one thing I'll say is... if for whatever reason we have an electrical issue, or just general chip scarcity, then the programmers with experience will be the ones who can bail out society. Just saying. Especially because kids today won't really learn coding. It's a FAFO situation. Stay sharp!
There’s no reason a company should put up with enemies within. In rare instances a disgruntled employee might be able to make a positive contribution. In most cases, even if the employee has valid reasons, by the time they are disgruntled there’s no coming back. It’s best for everyone to move on.
Yes, why surround yourself with people who are critical of you when you can surround yourself with yes-men who will loyally toe the line? Positive contributions come from loyal subjects who agree with their betters.
“People who are critical of you” is a very broad category that includes both toxic behavior and constructive disagreement. The former must not be tolerated; the latter can be encouraged as long as it’s not a blocker. In this case it is clearly the former and it requires suppression, though the disciplinary action may have been too harsh or perfectly adequate depending on prior history with this employee. It was not phrased as “it was insensitive to appear in front of the team this way”. It was phrased as “he is a rich jerk”. Zero added value, rage bait, polarization of the team.
> In this case it is clearly the former and it requires suppression
That's not clear at all. Why do you say so?
Read the article. "Rich jerk" is Atlassian's wording, not the employee's. And even if those were the employee's words, it's not obviously the former.
I refuse to believe anyone, including Oracle employees, likes Larry Ellison.
If Microsoft/Google/Apple fired everyone who badmouthed Satya/Sundar/Tim, half their products would fall apart overnight.
Do you see any sign, even a little, of constructive criticism in what she said? Anything that could improve corporate culture or help her peers or management understand the problem? Any hint at how it could be fixed?
I don’t.
> Read the article.
I did read the article and came to the same conclusion as Atlassian.
> I refuse to believe anyone, including Oracle employees, likes Larry Ellison.
When you sign an employment contract, your job is to act in the interests of shareholders. If you despise them or disagree with what they do, you can still work there and use your position to align their interests with the interests of the public. You can try to change it from within. But the moment you decide to burn bridges is the moment you should leave. To me this is pretty obvious, and I’m really surprised to see some sort of entitlement here.
"acting in the interests of shareholders" (which, for the record, no employment contract I've ever signed requires) does not mean blind allegiance to management and certainly doesn't mean not calling out bad actions by management.
The employee's statement here was factual. The CEO did harm the careers of employees and did call in to harass them without even bothering to come to a company office.
The CEO, who has much more of an obligation to act in shareholders' interests than an IC, shouldn't be attacking and alienating their labor force.
Yes, and what is the essence of that labor? It is to create profits for shareholders. You are not getting paid to contribute to a toxic culture or to seize the means of production. The CEO could have been in the wrong, but the moment the “us vs. them” idea starts dominating the corporate culture is the moment the company dies, and those jobs that everyone is so afraid to lose cease to exist.
>This is ignoring that the concept of companies having to care about shareholders above everything else is a lie spread to justify evil behavior.
Nobody is claiming the “above everything else” here
> Yes, and what is the essence of that labor? It is to create profits for shareholders
No, it's not. I'm not sure what your source of capitalism koolaid is, but employees are transacting labor for money, and shareholders do not come into it at all.
The nature of labor is stocking shelves, writing code, emptying trash bins, or whatever else you do. Full stop.
If you want employees to "think of the shareholders first", you give them enough ownership that the stock price actually makes a major difference in their life, and crucially, enough control at the company to actually influence the stock price. In practice that's the C Suite and maybe some senior VPs. No one else should be stressing out trying to make the owners richer.
This conversation is not about “employees thinking/not thinking about shareholders”. You are cherry-picking that topic and taking it out of context, for what reason exactly?
I have explained already a few times why and what context matters here. Are you struggling to understand it or just avoiding it?
I fully agree with you, it doesn’t. However I wasn’t saying that, so I have to conclude that you are speaking to someone else here, making up arguments from an imaginary opponent, and I’m not needed here. Good luck.
That's one way to look at it. Another way is that people who are aligned with the CEO's mission will help achieve the mission, and people who are not aligned will not help achieve the mission. And it's the CEO's job to define the mission.
When the mission is to screw over the employees, we don't need people who will align with that. CEOs should be held responsible for the enemies they create within their organization. Treating people as necessary collateral damage is unacceptable.
The mission is usually stupid and also dumb, and it would be in the CEO's best interest to surround themselves with people willing to tell them that.
Meta could've saved billions of dollars if more people had told Zuck that the Metaverse was stupid, because it was. The end result is the same: the death of the idea. That much is actually unavoidable, because stupid and bad ideas will always fail, with or without support. So it shifts to a question of whether it's a long, drawn-out, expensive death or a quick, Old Yeller-style putting down.
I think the issue we're seeing across a lot of companies is that leadership is incredibly stupid. We have this wrong idea that, because a CEO exists purely to make decisions, they must be pretty good at it. But that's not really the case. You can be capable of doing only one thing and still be shit at that one thing; it's definitely possible.
The problem is, I think, that we assume CEOs and other leadership work like normal people, but I don't think that's the case. I think there is a brain decay that occurs as people become more rich and powerful. It's becoming evident to me that the human brain was never meant to be in that type of situation, and there are consequences. There's a detachment from reality that comes along with it, and it almost seems unavoidable. Like a type of delusional psychosis that just onsets when you become rich and powerful enough.
It's not a new thing, either. You can see this across all of history with kings and rulers of all kinds. The really good ones do something remarkable: they predict their own oncoming psychosis. They build in controls and preventative measures so that, when they inevitably go off the rails, the damage is minimized. It's wild, isn't it? I think about everything George Washington did prior to his rise in US politics, and it can only be described as stopping his future self from eventually becoming drunk with power.
> I think there is a brain decay that occurs as people become more rich and powerful.
My prevailing hypothesis is that as you advance in leadership roles there's a natural tendency for the ego to grow. After all, you have evidence for your ego: you make important decisions and you've risen up in whatever social structure. And I think there's a natural bias toward surrounding yourself with yes-men. They create less friction, so naturally we want that. And it's hard to distinguish yes-men from people who genuinely believe the same things as you. The yes-men are able to hide this way, even by being "disagreeable" in just the right way (which makes them hard to distinguish), with the more proficient yes-men themselves rising to the top too.
So I think it's important for leaders to surround themselves with a distribution of opinions. To make good decisions we need friction. We need frustration. We need people to tell us we're wrong when we're wrong. We also need people to tell us we're wrong even when we're right, because the challenge to the idea forces us to think deeper. But I think the real challenge is implementing this correctly. It needs most "advisors" to be acting honestly, independently, and in good faith. That's hard to cultivate, and I think to do so you need to let people trash-talk you, even egregiously, because punishing someone can be misinterpreted as retaliation (even if completely fair), upsetting the whole balance. Context can easily be lost.
I suspect it's an unstable equilibrium, making it really difficult to maintain.
Agreed. I have held similar opinions of leadership at many of my jobs.
If you are so burned out that you can’t help but vent publicly, it’s time to go. It’s just not healthy for you.
But of course leadership is going to take care of that for you because it’s not healthy for the company either to have open dissent. And most of us are far easier to replace than a CEO
That’s certainly not a universal take in leadership.
Disagree and commit is the manager’s take. OK, let’s hear it, but once the decision is made (by me) it’s time to STFU and just do it.
The "disagree" part is often just a way to manage your team's emotions. You didn't get your way, but you can't say you weren't heard. The leads always get their way.
The mistake is that 1 in every N waves of hype is in fact a monumental shift, and it makes sense to embrace it as soon as possible. Also, being early to the right thing can have massive implications: appreciating the shift before the general public does is upstream of making smart resource allocations (investments, career choices). I have a friend who joined OpenAI very early as a designer. He has more equity than most AI researchers there and made a top-0.001% amount of wealth. Being early very much matters, in the right conditions.
I don't oppose reading AI-generated content in principle, but because it's free to generate, I'm always less likely to read super-long prose that is AI generated. So the question is whether someone has taken the time to keep it as long as necessary but no longer. Or whether there are ways to make it easier for me to commit to the experience, with a sort of TL;DR.
UBI will likely be necessary, but that won’t appease society. Everyone wants a chance to climb the ladder. If it becomes self-evident that humans can no longer have a meaningful impact on their own outcomes, there’ll be riots whether they have a roof and food or not.
The goal has nothing to do with you being employed. Your job security is a consequence of the ultimate goal of building AGI. And software development salaries and employment will be affected before we get there. In my opinion, we’re already past the SWE peak as far as yearly salary goes. Yes, there are super-devs working on AI making a lot of dough, but I consider that a particular specialty. On average, the salary of a new-grad SWE in the US is past its peak if you consider how many new grads can’t get a job.