sharperguy's comments

I've switched from Caddy to Traefik. For simple use cases it's a little more verbose to configure, but for more involved things, like multiple load-balanced backends, rewriting paths and headers, and so on, I've found it really good.
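For a feel of the "more involved" case, here's a rough sketch in Traefik's (v2+) dynamic file-provider configuration; the service name, path, and backend URLs are made up:

```yaml
http:
  routers:
    api:
      rule: "PathPrefix(`/api`)"   # match and route by path
      middlewares:
        - strip-api
      service: api-backends
  middlewares:
    strip-api:
      stripPrefix:                  # rewrite the path before proxying
        prefixes:
          - "/api"
  services:
    api-backends:
      loadBalancer:                 # round-robin over multiple backends
        servers:
          - url: "http://10.0.0.2:8080"
          - url: "http://10.0.0.3:8080"
```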

So I wonder: could a more powerful agent harness have the agent basically write and execute its own deterministic code, which, when executed, spawns sub-agents for each of the subtasks?

So far we've seen agents spawn sub-agents directly, but that still means leaving the final control flow to the non-deterministic orchestrator model, and your case is a perfect example of where that would probably fail.
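Concretely, the harness-written script might look something like this; spawn_subagent() is a hypothetical harness hook, not a real API:

```python
from concurrent.futures import ThreadPoolExecutor

def spawn_subagent(task: str) -> str:
    # Hypothetical harness API: a real harness would start a fresh
    # agent with its own context here and return its result.
    return f"done: {task}"

# This script is what the agent itself would write. Once written,
# the control flow is ordinary deterministic code; only the subtasks
# themselves are handled non-deterministically.
subtasks = [
    "rename the config module",
    "update all call sites",
    "run the test suite and report failures",
]

with ThreadPoolExecutor() as pool:
    results = list(pool.map(spawn_subagent, subtasks))

for task, result in zip(subtasks, results):
    print(f"{task}: {result}")
```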


I've been working on an integrated deterministic/agent system for a few months now. It runs an AI step to build a plan, which biases towards deterministic steps as much as possible but escalates back to AI when it needs to (for AI-only capabilities, or for deterministic failures). So effectively (when I perfect it; I'm about 90% there) it can bounce back and forth as needed, with deterministic steps launching AI steps and AI steps launching deterministic steps.

Probably not explaining it very well, but I think it's pretty effective at reducing token usage.
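Roughly, the shape of it is something like this; a hypothetical sketch, not my actual code:

```python
def ai_step(prompt: str) -> str:
    # Stand-in for a model call; only reached when actually needed.
    return f"(model output for: {prompt})"

def run_plan(plan):
    for step in plan:
        if step["kind"] == "deterministic":
            try:
                step["run"]()  # cheap, repeatable, costs no tokens
            except Exception as exc:
                # Deterministic failure: escalate back to an AI step.
                ai_step(f"step {step['name']} failed with {exc!r}; diagnose and fix")
        else:
            ai_step(step["prompt"])  # AI-only capability

plan = [
    {"kind": "deterministic", "name": "run_linter", "run": lambda: None},
    {"kind": "ai", "prompt": "summarise the diff for the changelog"},
]
run_plan(plan)
```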


I've been building a workflow engine for agent orchestration and the workflows are just data for the engine to execute. While I haven't experimented with it yet, I envision that an LLM would be rather good at generating the workflows based on a description of your needs (and context about how best to utilise the workflow engine).

LLMs are pretty good at reasoning about workflows; it's just that when they have to apply them directly, the workflow context gets muddled with your actual task's context. That's why using an orchestration agent that delegates work to worker agents works so much better.

I still think there's a huge amount of value in having the workflow executed in a deterministic way (as code, or by a workflow engine), because it saves tokens, eliminates any possibility of the workflow not being followed, and unlocks other cool things: giving each step in the workflow its own focused, task-specific context, splitting plans into individual actions and feeding them through a workflow one by one, and having workflow-step-specific verification.

But that workflow absolutely CAN be created by an LLM; it just shouldn't be executed by one.
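To make "workflows are just data" concrete, a minimal sketch; the workflow format and step names here are made up:

```python
def call_model(prompt: str) -> str:
    # Stand-in for a worker-agent call with a fresh, focused context.
    return f"def test_example(): ...  # passed (model output for: {prompt})"

WORKFLOW = [
    {"name": "generate_tests",
     "prompt": "Write unit tests for {module}.",
     "verify": lambda out: "def test_" in out},
    {"name": "fix_failures",
     "prompt": "Fix any test failures in {module}.",
     "verify": lambda out: "passed" in out},
]

def execute(workflow, params):
    for step in workflow:
        out = call_model(step["prompt"].format(**params))
        # Step-specific verification: the engine checks each step
        # deterministically instead of trusting the model's self-report.
        if not step["verify"](out):
            raise RuntimeError(f"step {step['name']!r} failed verification")

execute(WORKFLOW, {"module": "parser"})
```

The engine, not the model, owns the control flow; the LLM only fills in the individual steps.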


I make codex do everything through a giant `justfile`. Simple, greppable, self-documenting, works great, and I don’t even need to read it.
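For illustration, the pattern looks something like this; the recipes here are hypothetical, not from my actual file (`just --list` turns the doc comments into the self-documentation):

```
# run the full test suite
test:
    cargo test --workspace

# lint and check formatting
lint:
    cargo clippy -- -D warnings
    cargo fmt --check

# build and open the docs locally
docs:
    cargo doc --no-deps --open
```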

Both Arch and Nix solve this by making it very easy to write packages that work around the compatibility issues. When I used Ubuntu and Mint, it was a lot more common to run into these kinds of issues.

Skills are often invoked imperatively by the user. In cases where they are intended to be used directly by the LLM, the skill would be referenced somewhere else in the context, e.g.:

```
After implementing the feature, read the testing skill for instructions on how to test.
```


How do you guarantee that the LLM follows an instruction given imperatively by the user? It probably will, but this is not guaranteed behavior. Likewise, _how_ it follows that instruction is non-deterministic.

It's turtles all the way down.


Nobody is arguing it's guaranteed. This is why you never give an LLM access to any essential infrastructure. Make sure everything it does can be undone. Double check when guarantees are required.

You can't guarantee it any more than you can guarantee your prompt gives the output you want. Skills are just prompt templates.

The personification seems to be at the training level. When I ask an LLM why it did something destructive, the ideal response would be a matter-of-fact evaluation of the mistakes that I myself have made in setting up the agent and its environment, and how to prevent it from happening again. Instead, the model has been trained to apologize and list exactly what it did wrong, without any suggestions for how to actually prevent it in the future.


100% this. AI's perverse tendency to fluff human egos is rewarded.

I had a PM-turned-vibe-coder tell me "Talking with you is the only bad part of my week" and realized in horror that the rest of his week is spent exclusively talking to sycophantic AI.

We have met the enemy, and he is us.


Water? You mean like out of the toilet?


A serious existential threat to the country from a targetable state actor.


It's actually common for human-written projects to go through an initial R&D phase where the first prototypes turn into spaghetti code and require a full rewrite. I haven't been through this myself with LLMs, but I wonder to what extent they could analyse the codebase, propose and then implement a better architecture based on the initial version.


Let's be real: a lot of organizations never actually finish that R&D phase; they just continue iterating on their prototypes and trying to untangle the spaghetti for years.

I recently had to rewrite part of such a prototype that had 15 years of development on it, which was a massive headache. One of the most useful things I used LLMs for was asking them to compare the rewritten functionality with the old one and find potential differences. While I was busy refactoring and redesigning the underlying architecture, I would then sometimes be pinged by the LLM to investigate a potential difference. There were some false positives, but it did help me spot small details that would otherwise have taken quite a while of debugging.


If you write that first prototype in Rust in the idiomatic style of "Rust exploratory code" (lots of defensive .clone()ing to avoid borrowck trouble; pervasive interior mutability; gratuitous use of Rc<> or Arc<> to simplify handling of the objects' lifecycles), it can often be incrementally refactored into a proper implementation. That's very hard to do in other languages, where you have no fixed boilerplate marking "this is the sloppy part".
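A made-up sketch of what that exploratory style looks like; every Rc and .clone() is a visible "refactor me later" marker:

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Clone, Debug)]
struct Node {
    name: String,
    children: Vec<Rc<RefCell<Node>>>, // shared ownership + interior mutability
}

fn main() {
    let child = Rc::new(RefCell::new(Node { name: "child".into(), children: vec![] }));
    let root = Rc::new(RefCell::new(Node { name: "root".into(), children: vec![] }));

    // Gratuitous Rc::clone to dodge ownership questions for now.
    root.borrow_mut().children.push(Rc::clone(&child));

    // Defensive .clone() of the whole node instead of fighting borrowck.
    let snapshot = root.borrow().clone();
    println!("{:?}", snapshot);
}
```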


Rust is a language for fast prototyping? That’s the one thing Rust is absolutely terrible at imo, and I really like the production/quality/safety aspects of Rust.


It's not specialized for fast prototyping, for sure, but you can use it for that with the right boilerplate.


That only proves you're not a corporate model, rather than a locally running model that's been trained to allow saying that.


You can have cryptographically signed data caches without the need for a blockchain. What a blockchain can add is the ability to say that a particular piece of data must have existed before a given date, by including the hash of that data somewhere in the chain.
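A sketch of the timestamping half, using plain SHA-256; the cache contents here are made up:

```python
import hashlib

def commit(data: bytes) -> str:
    # Publishing this digest in a block mined before date T proves
    # the data existed by T, without revealing the data itself.
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, published_digest: str) -> bool:
    # Anyone holding the original data can recompute the digest and
    # compare it against the one recorded in the chain.
    return hashlib.sha256(data).hexdigest() == published_digest

digest = commit(b"contents of the data cache")
assert verify(b"contents of the data cache", digest)
```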

