I think I'm aligned with the idea that some parts of some workflows are mandatory - auth, read before edit, etc.
But otherwise, forge really doesn't own or opine much on the workflow. Step enforcement exists if you want it, and so do prerequisites, but the idea is that those can be conditional or optional (you may never need to edit a file).
The guardrails are designed to work for non-deterministic flows or deterministic ones. In the latter, you just might not have one of the guardrails active. It's much more about nudging the model back on track than laying more obvious tracks, in a sense.
Overall, agentic reliability is definitely an active field.
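To make the "read before edit" prerequisite idea concrete, here is a minimal sketch. It is not forge's actual API: `PrerequisiteGuard`, `check`, and the tool names `read_file` and `edit_file` are hypothetical names made up for illustration. The point is only that a guardrail can stay out of the way until a prerequisite is skipped, and then return a nudge instead of dictating the whole flow.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class PrerequisiteGuard:
    """Hypothetical guardrail (not forge's API): a tool call on a target is
    allowed only after its prerequisite tools have run on that same target."""

    requires: Dict[str, Set[str]]
    seen: Set[Tuple[str, str]] = field(default_factory=set)

    def check(self, tool: str, target: str) -> Optional[str]:
        """Return None if the call may proceed, else a nudge message for the model."""
        missing = [
            prereq
            for prereq in sorted(self.requires.get(tool, set()))
            if (prereq, target) not in self.seen
        ]
        if missing:
            # Do not record the blocked call; just tell the model what to do first.
            return f"Before calling {tool} on {target}, call {', '.join(missing)} on it first."
        self.seen.add((tool, target))
        return None


guard = PrerequisiteGuard(requires={"edit_file": {"read_file"}})
first_try = guard.check("edit_file", "app.py")   # blocked, returns a nudge
read_call = guard.check("read_file", "app.py")   # allowed
second_try = guard.check("edit_file", "app.py")  # allowed now
```

If the model never tries to edit a file, the guard never fires, which is the "conditional or optional" part.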
In this blog post I'm reading their call for "control flow" as a generalization of exactly what your work illustrates so nicely.
The blog post doesn't say to me "we need to start encoding specifically opinionated conditional branching statements that guide the model." Rather, I'm hearing a call to recognize that the broader principles of control flow are themselves relevant to composing programs with LLMs.
Nice ;). I'll take a closer read of it, that's on me - I am definitely seeing more people looking in this direction as agents start to ramp in production at the enterprise level, which I suspect is highlighting some of these failure modes at higher stakes. And also the cloud frontier API bills.
So basically the kind of thing I'd usually be doing manually with small models, over and over again, you just automate that nudging and off they go.
Sometimes LLMs have seemed to me like "computer programs with inertia" and in that frame what your tool does is identify and reduce friction at key points so the wheels can keep spinning.
Yep! The big frontier models are already quite good at doing that, and they have decent harnesses. That's why Opus on Claude Code does what it does.
Small models aren't there yet and tend to veer off course; this just nudges them back onto the road. Whether or not they have a good sense of direction is a different question.
Listen, I really like LLMs and diffusion models and machine learning and all this stuff, and I want to see it happen in a just and sustainable way. "AI" doesn't necessitate extreme waste. If anything, reasonable policy constraints would push "AI" to be even better.
I feel like the way many companies implement AI right now is very very wasteful. For context I'm looking into adding some AI elements to my SaaS app and I'm looking at running on-device TinyBERT intent classifiers then have my API take it from there (still experimenting with this).
I feel like this is a pretty sustainable way to implement AI in an application, meanwhile I see most companies just implement with OpenAI API + some custom prompts on top.
Granted, I've had to do this for some of my clients and it's a pretty easy way to implement AI, though I always have the sinking feeling that we could achieve the same thing in a far more efficient manner with a bit more effort.
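A rough sketch of the on-device routing idea, under stated assumptions: `keyword_classifier` is a trivial stand-in for a real TinyBERT intent model, and the payload shape, the intent labels, and the 0.7 threshold are all made up for illustration. The low-confidence fallback to the server is also my assumption, not something settled above.

```python
from typing import Callable, Tuple

# A classifier takes the user's text and returns (intent, confidence).
Classifier = Callable[[str], Tuple[str, float]]


def keyword_classifier(text: str) -> Tuple[str, float]:
    """Stand-in for an on-device model such as TinyBERT (hypothetical labels)."""
    lowered = text.lower()
    if "invoice" in lowered or "billing" in lowered:
        return "billing", 0.9
    if "password" in lowered or "login" in lowered:
        return "account_access", 0.9
    return "unknown", 0.0


def route(text: str, classify: Classifier, threshold: float = 0.7) -> dict:
    """Build the payload the client would send to the backend API."""
    intent, confidence = classify(text)
    if confidence >= threshold:
        # Confident local prediction: send only the structured intent.
        return {"intent": intent, "needs_server_model": False}
    # Low confidence: pass the raw text along and let the backend decide.
    return {"intent": None, "text": text, "needs_server_model": True}


confident = route("I can't find my last invoice", keyword_classifier)
uncertain = route("What's the weather like?", keyword_classifier)
```

Swapping `keyword_classifier` for a real model only requires matching the `(intent, confidence)` return shape.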
> If anything, reasonable policy constraints would push "AI" to be even better.
Like what, though? I'm not opposed to AI regulation at all, but the very last thing I expect it to fix is the resource constraints around GPGPU compute.
I wouldn't take a job where the employer wasn't publishing permissively licensed code for all but the production bits. It's demoralizing for me and would stress my soul to the brink. I'd rather be broke.
In the US, you’d definitely be broke. There just aren’t many employers willing to deal with it. All the ones I’ve worked for just use what’s available without modification.
Maybe the reason this is so controversial is that people have stopped thinking about "AI" as a bunch of software, just like any other software. If that's you, stop while you still can, you've swallowed a nasty hook and your agency is on the line.