Hacker News | noobcoder's comments

Makes sense. Instead of competing in the never-ending frontier model race, they are going for a different layer, one that these models can run on.

The purple colors look very sloppy. Please, not the purplish tint.

At its core, it’s still doing what Google Assistant and Siri have been doing for many years.

Not sure what extra we are achieving here.


Only if you consider Google Image Search and Google Nano Banana to be "the same thing" since they both produce an image based on text input!

Similarly, Google Translate's millions of lines of hand-rolled code has been entirely superseded by LLMs that do a vastly better job.

The LLM-based AI assistants are based on a wildly different technology stack with very different capabilities compared to the legacy "if-then-else" logic programming that Siri was based on.


Was Google Translate millions of lines of hand-rolled code? The Transformer architecture was invented for Google Translate, before it was used to build "LLMs".

I don't know about millions of lines of code, but Google Translate existed WELL BEFORE transformer architectures and relied on more traditional statistical machine translation techniques. They later moved to a neural machine translation technique, and then only after that in ~2019/2020 swapped to transformers.

Honestly a lot of us who worked in the translation sector remember NMT as being a huge step up and in some language-pairs even surpassing DeepL at the time.

https://en.wikipedia.org/wiki/Google_Neural_Machine_Translat...


Better surveillance.

On April 25, 2026, Cursor running Claude Opus 4.6 deleted PocketOS's entire production database and all volume-level backups in a single Railway API call. It took nine seconds.

immunity-agent would have stopped this at two points. Warden's scoped agent would have seen "fix a staging credential issue" and locked out destructive commands and production network access for that session. Cloak would have intercepted the Railway token before it ever reached the model. This is in active development, and suggestions and critiques on how to make it better are welcome, especially to prepare for future headwinds like security for AI.
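A minimal sketch of the session-scoping idea described above, not immunity-agent's actual implementation. `SessionScope`, `check_command`, and the pattern list are hypothetical names made up for illustration; a real policy would need far broader coverage.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for destructive operations (illustrative, not exhaustive).
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-rf\b",
    r"\bdrop\s+(database|table)\b",
    r"\bvolume\s+delete\b",
]


@dataclass
class SessionScope:
    """Permissions derived from the task description for one agent session."""
    allow_destructive: bool = False
    allowed_environments: set = field(default_factory=lambda: {"staging"})


def check_command(scope: SessionScope, command: str, environment: str) -> bool:
    """Return True if the command may run under this session's scope."""
    # Block any environment the task did not ask for (e.g. production).
    if environment not in scope.allowed_environments:
        return False
    # Block destructive commands unless the scope explicitly allows them.
    if not scope.allow_destructive:
        for pattern in DESTRUCTIVE_PATTERNS:
            if re.search(pattern, command, flags=re.IGNORECASE):
                return False
    return True
```

With the default scope, a task like "fix a staging credential issue" could read staging config but could not drop a database or touch production.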


This post went viral on Reddit because users tend not to put secrets (API keys, etc.) in .env; instead they paste them into the chat and let agents wire them up.

Agents like Claude Code and OpenClaw save secrets in plaintext within config files, which creates a large attack vector: a local compromise can become a cloud compromise.

We empirically verified that AI coding agents can be stopped from leaking secrets by intercepting tool calls and handling secrets entirely outside the model’s visibility, using Claude Code’s hook system.

Paired with an open source repo for cleanup, it shows that most leakage can be eliminated by treating secrets as a runtime dataflow problem rather than a static scanning issue.
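To make the "runtime dataflow" framing concrete, here is a rough sketch of swapping secrets for placeholders before text reaches the model and resolving them only at execution time. `Vault`, `cloak`, and the token regex are assumptions for illustration; they are not the hook system's API or the repo's code, and the in-memory dict stands in for an encrypted store.

```python
import re

# Illustrative pattern only; real key formats vary by provider.
SECRET_PATTERN = re.compile(r"\b(?:sk|ghp|AKIA)[A-Za-z0-9_\-]{16,}")
PLACEHOLDER_PATTERN = re.compile(r"\{\{(SECRET_\d+)\}\}")


class Vault:
    """In-memory stand-in for an encrypted secret store (hypothetical)."""

    def __init__(self):
        self._secrets = {}

    def store(self, value: str) -> str:
        """Save a secret and return its placeholder name, reusing existing entries."""
        for name, existing in self._secrets.items():
            if existing == value:
                return name
        name = f"SECRET_{len(self._secrets) + 1}"
        self._secrets[name] = value
        return name

    def resolve(self, text: str) -> str:
        """Substitute real values back in, just before a tool call executes."""
        return PLACEHOLDER_PATTERN.sub(lambda m: self._secrets[m.group(1)], text)


def cloak(text: str, vault: Vault) -> str:
    """Replace secret-looking tokens with placeholders before the model sees them."""
    return SECRET_PATTERN.sub(lambda m: "{{" + vault.store(m.group(0)) + "}}", text)
```

The model only ever sees `{{SECRET_1}}`; the interception layer calls `resolve` on the tool input right before running it.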


Google recently released PaperOrchestra (arXiv:2604.05018), a multi-agent framework that converts unstructured research materials, such as logs, ideas, and results, into submission-ready LaTeX manuscripts.

It employs a specialized five-agent pipeline: Outline, Plotting/Lit Review, Section Writing, and Refinement. According to the paper, this setup greatly surpasses single-agent baselines in literature review quality and overall performance.

I created this repository to transform the paper’s prompts, schemas, and verification gates into a "skill pack" that any modern coding agent can use.

Repo: https://github.com/Ar9av/paper-orchestra

I am thinking of improving on it through:
- optional Semantic Scholar support for verification
- an arXiv packager that strips comments and zips everything up for submission in one click
- human-in-the-loop checkpoints that pause the pipeline so you can approve the outline before it starts burning tokens
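A rough sketch of what the arXiv packager idea could look like, assuming sources are held in memory as a name-to-text mapping. `strip_comments` and `package_sources` are hypothetical names, not part of the repo. The regex is deliberately simple: it does not handle `\\%` or verbatim environments.

```python
import io
import re
import zipfile

# A '%' that is not escaped as '\%' starts a LaTeX comment.
COMMENT_PATTERN = re.compile(r"(?<!\\)%.*")


def strip_comments(tex: str) -> str:
    """Drop comment text but keep the '%' so line-joining behavior is unchanged."""
    return COMMENT_PATTERN.sub("%", tex)


def package_sources(sources: dict) -> bytes:
    """Zip the given files, stripping comments from .tex files only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, text in sources.items():
            if name.endswith(".tex"):
                text = strip_comments(text)
            archive.writestr(name, text)
    return buffer.getvalue()
```

Keeping the bare `%` rather than deleting the whole line avoids accidentally creating blank lines, which LaTeX would treat as paragraph breaks.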


Even when a developer is careful to use a .env file, the moment a key is mentioned in a chat or read by the agent to debug a connection, it is recorded in one of the IDE caches (~/.claude, ~/.codex, ~/.cursor, ~/.gemini, ~/.antigravity, ~/.copilot, etc.).

Within these logs I found API keys and access tokens sitting in plain text, completely unencrypted and accessible to anyone who knows where to look.

I made an open source tool called Sweep, as part of my immunity-agent repo (self-adaptive agent). Sweep is designed to find these hidden leaks in your AI tool configurations. Instead of just deleting your history, it moves any secrets it finds into an encrypted vault and redacts their occurrences in the history.

We have also thought about exploring post-hook options, but we are open to more ideas.
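As an illustration of the detection step only (not Sweep's actual code), a scan over a cache directory might look like the sketch below. The function name `find_plaintext_tokens` and the token regex are assumptions; the sketch only reports findings and never modifies files.

```python
import re
from pathlib import Path

# Illustrative pattern only; real key formats vary by provider.
TOKEN_PATTERN = re.compile(r"\b(?:sk|ghp|AKIA)[A-Za-z0-9_\-]{16,}")


def mask(token: str) -> str:
    """Keep a short prefix for identification and hide the rest."""
    return token[:4] + "*" * (len(token) - 4)


def find_plaintext_tokens(root: Path) -> list:
    """Return (path, line number, masked token) for each secret-looking token under root."""
    findings = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for line_number, line in enumerate(text.splitlines(), start=1):
            for match in TOKEN_PATTERN.finditer(line):
                findings.append((path, line_number, mask(match.group(0))))
    return findings
```

Pointing it at something like `Path.home() / ".claude"` would list where tokens sit, which is the input a vault-and-redact step would need.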


I saw your code. It appears to be a combination of CNN + PPO in PyTorch with a Cortical Labs CL1 chip that contains living neurons.

Encoder: learns which stimulation patterns tend to improve reward

Biological neurons: adapt to the stimulation and generate spike responses that reinforce certain patterns

Decoder: interprets those spike patterns and converts them into joystick movements

right?


Passing tests doesn’t mean you have a working codebase. Benchmarks that rely on a fixed test suite create a real optimization problem: agents (and even humans) learn to satisfy the tests rather than preserve the deeper properties that make the system maintainable. AI writes test cases that it thinks are easier for it to satisfy, rather than ones that adhere to the business logic.

We see this firsthand at Prismor with auto-generated security fixes. Even with the best LLMs, validating fixes is the real bottleneck: our pipeline struggles to exceed 70% on an internal golden dataset (which itself is somewhat biased).

Many patches technically fix the vulnerability but introduce semantic regressions or architectural drift. Passing tests is a weak signal, and proving a fix is truly safe to merge is much harder.


We recently ran a deep security audit using Prismor, scanning some of the most popular AI agent frameworks end to end. It included full Software Composition Analysis, SBOM reviews, and vulnerability mapping across thousands of packages and transitive dependencies. Here's what we found.

