Hacker News | jwpapi's comments

Same here. I have now deleted 43k lines of my codebase and counting. There is no point in putting any AI code into production anymore, as it almost always uses either no abstractions or the wrong ones.

When you try to throw more agents at the problem, or add more verification layers, you just kill your agility, even if they could still get the work done.


I thought it was a good article, until I saw the Slack example.

The copy doesn’t even remotely grasp the scale of what the actual Slack software does in terms of reliability, observability, monitoring, maintainability, and quite possibly functionality.

The author only mentions the non-dev work as the difference, which suggests he doesn’t know what he’s talking about at all, or what running an application at that scale actually means.

This "clone" doesn’t get you any closer to an actual Slack copy than a blank piece of paper.


I had the same experience (though I agree with other comments that the numbers are a little optimistic in terms of variance; I think there's a huge amount of variance in product work, you can't know what's a good investment until it's too late, many companies fail because of this, and there's huge survivorship bias in the ones that get lucky and don't initially fail). Slack spent tons of money in terms of product and engineering hours finding out what works and what doesn't. It's easy to copy/paste the thing after all that effort. Copy/paste doesn't get you to the next Slack though--it can get you to Microsoft's Slack-killing Teams strategy, but we obviously don't want more of that. And, obviously I agree with you about all the infra/maintenance costs, costs in stewarding API usage and extensions, etc. LLMs won't do any of that for you.

Absolutely. The moment I saw "95% of Slack's core functionality" I stopped believing the author knows what he's talking about.

Students in the 2010s were building Twitter clones as part of third-year college courses.

And somehow Twitter survived and thrived, and didn't really get viable competitors until forces external to the code and product itself motivated other investment. Even then it still rolls on, challenged these days, but not by the ease with which a "clone" can be made.


Yeah, I can build a Slack "clone" in a couple of weeks with my own two hands, no AI required. But it's not going to actually be competitive with Slack.

Just to pick an incredibly, unbelievably basic enterprise feature, my two-week Slack clone is not going to properly support legal holds. This requires having a hard override for all deletion and expiration options anywhere in the product, that must work reliably, in order to avoid accidental destruction of evidence during litigation, which comes with potentially catastrophic penalties. If you don't get this right, you don't sell to large corporations.
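To make the shape of the requirement concrete, here is a minimal hypothetical sketch (every name here is invented, not Slack's actual design): a single hold registry that every deletion and retention-expiry path must consult, with no bypass.

```python
# Hypothetical sketch of a legal-hold override. All names are invented
# for illustration; this is not Slack's actual implementation.
class LegalHoldRegistry:
    """Tracks which workspaces are under litigation hold."""

    def __init__(self) -> None:
        self._held: set[str] = set()

    def place_hold(self, workspace_id: str) -> None:
        self._held.add(workspace_id)

    def release_hold(self, workspace_id: str) -> None:
        self._held.discard(workspace_id)

    def is_held(self, workspace_id: str) -> bool:
        return workspace_id in self._held


class LegalHoldError(Exception):
    """Raised when a deletion is blocked by an active legal hold."""


def delete_message(holds: LegalHoldRegistry, workspace_id: str, message_id: str) -> None:
    # The hold check must gate EVERY deletion and expiration path:
    # user deletes, admin purges, and background retention jobs alike.
    if holds.is_held(workspace_id):
        raise LegalHoldError(f"workspace {workspace_id} is under legal hold")
    # ... actual deletion would happen here ...
```

The hard part in a real product isn't this check; it's guaranteeing that no code path anywhere in the system deletes data without going through it.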

And there are a hundred other features like this. Engineering wants an easy-to-use API for Slack bots. Users want reaction GIFs. You need mobile apps. You need Single Sign-On. And so on. These are all table stakes.

It was a cliche for many years that Microsoft Word had "too many features." So people would start companies to sell "lightweight word processors" that only implemented "the most used 20% of features." And most of these companies sank without a trace (with a couple of admirable exceptions that hyperfocused on specific niches). Google finally made progress against the monopoly, but to do it, they actually invested in a huge number of features.

Believe me, I wish that "simple, clean" reimplementations were actually directly competitive with major products. That version of our industry would be more fun. But anyone who thinks that an LLM can quickly reimplement Slack is an utter fool who has never seriously tried to sell software to actual customers.


> It was a cliche for many years that Microsoft Word had "too many features." So people would start companies to sell "lightweight word processors" that only implemented "the most used 20% of features." And most of these companies sank without a trace (with a couple of admirable exceptions that hyperfocused on specific niches). Google finally made progress against the monopoly, but to do it, they actually invested in a huge number of features.

The other issue is that yes, perhaps most users only use 20% of the features, but each user uses a different 20% of the features in products like Word. Trust me, it's super hard to get it right even at the end-user level, let alone the enterprise level like you say.


At most 5% of Word's features are common to everyone; things like spell check everyone uses. Actually, I suspect it's more like 0.1% of the features that are common to all, most people use about 0.3%, and power users get up to 5% of the features. But I don't have data, just a guess.

Yeah, but 98% of Word's features were already buried by like 2004. They were added back when it was a selling point to offer unicorn and gnome icons as your table border in under 100 MB of RAM. So we're talking about 20% of the limited set of features that remain for reasons other than backwards compatibility.

And there's some company out there that has very important Word documents that will fail to open if you take away the unicorn and gnome icons table border feature.

When I look at the big non-tech companies that have a chill life and print money, it's usually the companies that are simply the very best at what they do and have a quasi-monopoly, or so much competitive advantage that everybody just uses them.

That's what's needed in tech too.

A clone doesn’t get you closer to that.


Also, it's obviously faster to copy Slack one-to-one than to invent it from scratch. Making Slack was not just coding.

Human slop think-pieces.

I'm so happy about this article. I've been forming a thought in my head over the last couple of days: how to describe what it is that makes AI code practically unusable in good systems.

One of the reasons is the one described in this article. The other is that you skip training your mental model when you don't grind through these laziness patterns yourself. If you're not in the code, grinding away at your codebase, you don't see the fundamental issues that block the next level, nor do you get the itch to name and abstract them properly so you won't have to worry about them in the future, when you or somebody else has to extend the code.

Knowing your shit is so powerful.

I now believe that my competitive advantage is grinding code while others are accumulating slop.


So AI company buys devs again, but devs are dead

They want to kill the last good ones

Part of my $DAYJOB is working on timers.

@sama, if you need someone to buy to implement timers for ChatGPT, I'm your guy. My price is 2 billion dollars.


A timer?

> Then, unprompted, Altman offers up a kind of shocking timeline for the groundbreaking feature of counting: “Maybe another year before something like that works well.” Per Altman, ChatGPT’s voice model doesn’t have the capability of starting a timer or keeping track of time. “But we will add the intelligence into the voice models,” he said.

--

https://gizmodo.com/sam-altman-says-itll-take-another-year-b...


I can't wait until ChatGPT sets me 14-minute timers when I request a 40-minute timer. Just another year!

I started writing “why would you want that?”, thinking about Alexa users…

Then I remembered scripting existed ^_^


From my perspective, there are some people who have never built real processes in their lives and who enjoy having some processes now. But agent processes are less reliable, slower, and less maintainable than a process that is well defined and well architected, and that uses LLMs only where no other solution is sufficient: classification, drafting, summarizing.

I've had a WhatsApp assistant since 2023, jailbroken into an easy assistant. The only thing I kept using is transcription.

https://github.com/askrella/whatsapp-chatgpt was released 3 years ago, and many have extended it with more capabilities; arguably it's more performant than Openclaw, as it can run in all your chat windows. But there's still no use case.

It’s really classification and drafting.


This. So many junior engineers showing me AI flows that could just be a script with a few parameter inputs

Why write script when more tokens does job

I like to experiment with AI flows to make iteration quicker; then, once something worth investing in is found, back up and build something that's actually repeatable.

yeah but unfortunately an AI flow can bring promotions, while scripts won't

The same thing could be said about SKILL.md, yet they are highly useful...

Yes, you can automate via scripting, but interacting with a process using natural language is really handy when every instance could be different and isn't solid enough to write a spec for.

tl;dr: there's a place for "be liberal in what you receive and conservative in what you send", but only now have LLMs provided us with a viable way to make room for "be loosey goosey with your transput"


I understand, but usually 80-95% of the skill flow is repeated and can be scripted out. Script it out and simplify your skill, make it more stable, and create more opportunity to scale it up or down, i.e. use stronger or weaker models if need be. We should script and form the process first, then see where we can put AI.

The AI-for-everything mindset is really easy to let infect you. I was trying to figure out how to make some SQL alerting easier to understand quickly. The first thing my brain went to was "oh, just shove it into an LLM to pull out what the query is doing." And unfortunately it wasn't until after I said that out loud that I realized it was a stupid idea, when you could just run a SQL parser over the query and pull the table names out that way. Far faster, cheaper, and more reliable than asking an LLM to do it.
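As a sketch of the non-LLM route: the toy below uses only the standard library's regex module to pull table names out of simple queries (a real implementation would reach for a proper SQL parser library, which handles subqueries, CTEs, and quoting correctly).

```python
import re

def extract_tables(sql: str) -> list[str]:
    """Naively pull table names that follow FROM or JOIN keywords.

    Toy sketch only: a production version should use a real SQL parser.
    But even this beats an LLM call on speed, cost, and determinism.
    """
    pattern = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)
    seen: set[str] = set()
    tables: list[str] = []
    for name in pattern.findall(sql):
        if name.lower() not in seen:  # dedupe, preserving first-seen order
            seen.add(name.lower())
            tables.append(name)
    return tables

query = "SELECT o.id FROM orders o JOIN users u ON u.id = o.user_id WHERE u.active"
print(extract_tables(query))  # → ['orders', 'users']
```

Deterministic output like this is also trivially unit-testable, which an LLM extraction step is not.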

That’s actually an awesome idea and totally helps to reduce wasting context size - move repeatable instructions to a SKILL.md, and once they’re repeatable and no longer have variability to input, turn it into a tool! Rinse repeat.

Oh nice, you could even eventually turn the whole process including inference into an app so that you’ve cut out the LLM from the whole process saving you execution time


I find that it's usually management that asks for such things "because AI".

I mean, using AI is a great way to interpret a query, determine whether a helper script already exists to satisfy it, and if not, invoke a subagent to write a new one.

The problem with your "script" approach is: how does it satisfy unknown or general queries? What if for one run you want to modify the script's behavior?


You say that with the wisdom of experience.

But there's still value in people exploring new spaces they find interesting, even if they do not meet your personal definition of pareto-optimal.


Exploring with AI doesn't lead to the same level of learning. It's the equivalent of paying to skip leveling up your character and going to the final boss in level-1 armor.

I look at it more like speedrunning a level. You're skipping the parts of the level that take up the most time, sometimes using hacks. Is it universally as much fun as playing the game? No; just like using AI to prototype might get you to the same place, but without the experience of discovery and blockers along the way.

Fully agree with your comment regarding real processes. Being a Six Sigma Black Belt, studying processes and reducing the errors is critical. The Openclaw processes at the moment scare me.

I'm only at Four Alpha Brown Belt still but once I pass my test I think I'll also understand why this is critical. I can't wait to get scared.

Make a SKILL.md called "six sigma black belt audit refactor" and publish it.

That's actually a great idea!!

Vibe process automation

I was fully in the AI camp. I built a lot of useless stuff.

I have now deleted most of it and refactored it into an agile codebase with 40k fewer lines. My life is peaceful again.

I now use AI for scratch tests and the next-edit function from JetBrains.


That's why I'm visiting HN.

Thank you.


One really should digest the manifold hypothesis. It's the most likely explanation of how AI works.

The question is whether there are ultradimensional patterns that are the solutions to meaningful problems. I say meaningful because so far I've mainly seen AI solve problems that might be hard, but not really meaningful in the sense that somebody solving them would gain a lot from it.

However, whether these patterns are the fundamental truth of how we solve problems, or something completely different, we don't know, and that is the 10-trillion-dollar question.

I would hope it's not the case, as I quite enjoy solving problems. Also, my gut feeling tells me it's just using existing patterns to solve problems that nobody has tackled really hard. It would also be nice to know that humans are unique in that way, but maybe this is exactly how we work? This really goes back to a free-will discussion. Yes, very interesting.

But just to give an example on what I mean on meaningful problems.

Can an AI start a restaurant and make it work better than a human? (Prompt: "I'm your slave, let's start a restaurant")

Can an AI sign up as a copywriter on Upwork and make money? (Prompt: "Make money online")

Can an AI, without supervision, make a scientific breakthrough that has a provable, meaningful impact on us? (Prompt: "Help humanity")

Can an AI manage geopolitics?

These are meaningful problems, different from any coding task or olympiad question. I'm aware that I'm just moving the goalposts.

We really don't know.


I agree. I can't open any social media anymore.

Has someone verified this was an actual bug?

One of AI's strengths is definitely exploration, e.g. in finding bugs, but it still has a high false-positive rate. Depending on the context, that matters or it doesn't.

Also, one has to be aware that there are a lot of bugs that AI won't find but humans would.

I don’t have the expertise to verify this bug actually happened, but I’m curious.


It's not even clear whether AI was used to find the bug: they mention modeling the software with an "AI-native" language, whatever that means. What's also not clear is how they found themselves modeling the gyro software of the Apollo code to begin with.

But I do think their explanation of the lock acquisition and the failure scenario is quite clear and compelling.


They have some spec language, and here,

https://github.com/juxt/Apollo-11/tree/master/specs

there are many thousands of lines of specs.

Anyway, it seems it would take a dedicated professional serious work to determine whether this bug is real. And considering this looks like an ad for their business, I would be skeptical.


> It's not even clear if AI was used to find the bug: they mention modeling the software with an "ai native" language, whatever that means.

Could the "AI native language" they used be Apache Drools? The "when" syntax reminded me of it...

https://kie.apache.org/docs/10.0.x/drools/drools/language-re...

(Apache Drools is an open source rule language and interpreter to declaratively formulate and execute rule-based specifications; it easily integrates with Java code.)


How did you pick out "AI-native" and miss the rest of the SAME sentence?

> We found this defect by distilling a behavioural specification of the IMU subsystem using Allium, an AI-native behavioural specification language.


That doesn't resolve my confusion, especially when static analysis could reach the same conclusion with that language. It's not clear what role AI played at all.

It seems pretty clear when you follow the link?

https://juxt.github.io/allium/


> It's not even clear if AI was used to find the bug

The intro says “We used Claude and Allium”. Allium looks like a tool they’ve built for Claude.

So the article is about how they used their AI tooling and workflow to find the bug.


The article doesn't explain anything about how they used AI; it just has some relation to the behavioral model that a human seems to have written (and an AI doesn't seem necessary to use it!)

Sure it does.

They used their AI tool to extract the rules for the Apollo guidance system based on the source code.

Then they used Claude to check if all paths followed those rules.


>It's not even clear if AI was used to find the bug

It's not even clear you read the article


Where do you think my confusion came from? All it says is that AI assisted in resolving the gyroscope lock path, not why they decided to model the gyroscope lock path to begin with.

Please, keep your offensive comments to yourself when a clarifying comment might have sufficed.


Even worse, the other child comments are speculating (and didn't RTFA either) when the answer is clear in the article.

> We found this defect by distilling a behavioural specification of the IMU subsystem using Allium, an AI-native behavioural specification language.


That's the opposite of clear to me.

Has the article been updated?

2nd paragraph starts with: "We used Claude and Allium"

And later on: "With that obligation written down, Claude traced every path that runs after gyros_busy is set to true"


This is the original article text. It just looks like users skimmed and then flamed the author.

> distilling

A.k.a. fabricating. No wonder they chose to use "AI".

