Very likely to be an actual reflection. That's probably their real achievement here, and the key reason they're actually publishing it as GPT-5: more or less the best, or close to it, at everything, while being one model that is substantially cheaper than the competition.
Don't count your chickens before they hatch. I believe the odds of an architecture substantially better than autoregressive causal GPTs coming out of the woodwork within the next year are quite high.
How does that equate to "winner take all", though? It is quite apparent that as soon as one place figures out some kind of advantage, everyone else follows suit almost immediately.
It's not the 1800s anymore. You cannot hide behind poor communication.
Because of the niceties of Rust, combined with the widespread compatibility and architecture support of gcc / C compilers in general?
Rust is a modern language, with package management, streamlined integrated build/testing tools, much less cruft, and lots of high-level features and syntax that people actually like. C is neat, but complex codebases benefit from modern languages that help in building robust abstractions while still maintaining the speed of C. Not to mention, of course, the borrow checker and memory safety.
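To make that concrete, here's a minimal Rust sketch of the two things I mean (the names are just illustrative): an iterator-based function that compiles down to roughly the same tight loop you'd write by hand in C, and a commented-out snippet that the borrow checker rejects at compile time instead of letting it blow up at runtime.

    // Zero-cost abstraction: an iterator chain that typically compiles to the
    // same tight loop a hand-written C version would.
    fn sum_of_squares(values: &[i64]) -> i64 {
        values.iter().map(|v| v * v).sum()
    }

    fn main() {
        let mut data = vec![1, 2, 3, 4];

        let total = sum_of_squares(&data);
        println!("sum of squares = {total}");

        // The borrow checker enforces memory safety at compile time. For example,
        // holding a reference into `data` while also mutating it is rejected:
        //
        //     let first = &data[0];
        //     data.push(5);        // error[E0502]: cannot borrow `data` as mutable
        //     println!("{first}"); //               because it is also borrowed as immutable
        //
        // The C equivalent (keeping a pointer into a buffer across a realloc)
        // compiles fine and fails at runtime, if you're lucky.
        data.push(5);
        println!("after push: {data:?}");
    }

The point isn't that you can't write the fast loop in C; it's that the "stale pointer into a reallocated buffer" class of bug never makes it past the Rust compiler.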
I genuinely don't understand why some people are so critical of LLMs. This is new tech; we don't really understand the emergent effects of attention and transformers within these LLMs at all. It is very possible that, with some further theoretical development, LLMs which are currently just 'regurgitating and hallucinating' can be made significantly more capable. In fact, reasoning models - when combined with whatever Google is doing with the 1M+ token context windows - are much closer to that than most people using LLMs expected.
The tech isn't there yet, clearly. And stock valuations are way overboard. But LLMs as a tech != the stock valuations of the companies. And LLMs as a tech are here to stay, improve, and integrate into everyday life more and more - with massive impacts on education (particularly K-12) as models get better at thinking and explaining concepts, for example.
Unsloth also works very diligently to find and fix tokenizer issues and many other problems as soon as they can. I have comparatively little trust in ollama following up and updating everything in a timely manner. Last I checked, there was little information on when the GGUFs etc. on ollama were updated, or which llama.cpp version / git commit was used to produce them. As such, quality can vary and be significantly lower with the ollama versions of new models, I believe.
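For what it's worth, this is easy to poke at yourself, because GGUF files are self-describing: you can dump the metadata keys and see whether whoever converted the model recorded anything about the converter version or source commit. A rough Rust sketch, assuming the GGUF v2/v3 layout as I understand it (little-endian header, u64 counts and string lengths, the value-type numbering from the GGUF spec); the program itself is just illustrative:

    use std::env;
    use std::fs::File;
    use std::io::{self, BufReader, Read, Seek, SeekFrom};

    fn read_u32(r: &mut impl Read) -> io::Result<u32> {
        let mut buf = [0u8; 4];
        r.read_exact(&mut buf)?;
        Ok(u32::from_le_bytes(buf))
    }

    fn read_u64(r: &mut impl Read) -> io::Result<u64> {
        let mut buf = [0u8; 8];
        r.read_exact(&mut buf)?;
        Ok(u64::from_le_bytes(buf))
    }

    // GGUF strings: u64 byte length followed by UTF-8 bytes.
    fn read_string(r: &mut impl Read) -> io::Result<String> {
        let len = read_u64(r)? as usize;
        let mut buf = vec![0u8; len];
        r.read_exact(&mut buf)?;
        Ok(String::from_utf8_lossy(&buf).into_owned())
    }

    // Skip a metadata value given its type tag, so we can walk key after key.
    fn skip_value<R: Read + Seek>(r: &mut R, vtype: u32) -> io::Result<()> {
        match vtype {
            0 | 1 | 7 => { r.seek(SeekFrom::Current(1))?; }    // u8, i8, bool
            2 | 3 => { r.seek(SeekFrom::Current(2))?; }        // u16, i16
            4 | 5 | 6 => { r.seek(SeekFrom::Current(4))?; }    // u32, i32, f32
            10 | 11 | 12 => { r.seek(SeekFrom::Current(8))?; } // u64, i64, f64
            8 => {                                             // string
                let len = read_u64(r)? as i64;
                r.seek(SeekFrom::Current(len))?;
            }
            9 => {                                             // array: elem type, count, elems
                let elem_type = read_u32(r)?;
                let count = read_u64(r)?;
                for _ in 0..count {
                    skip_value(r, elem_type)?;
                }
            }
            other => {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    format!("unknown GGUF value type {other}"),
                ));
            }
        }
        Ok(())
    }

    fn main() -> io::Result<()> {
        let path = env::args().nth(1).expect("usage: gguf-keys <model.gguf>");
        let mut r = BufReader::new(File::open(path)?);

        // Header: magic "GGUF", format version, tensor count, metadata KV count.
        let mut magic = [0u8; 4];
        r.read_exact(&mut magic)?;
        assert_eq!(&magic, b"GGUF", "not a GGUF file");
        let version = read_u32(&mut r)?;
        let tensor_count = read_u64(&mut r)?;
        let kv_count = read_u64(&mut r)?;
        println!("GGUF v{version}, {tensor_count} tensors, {kv_count} metadata keys");

        // List every metadata key; provenance (converter, quantizer, source
        // commit, date) only shows up here if whoever made the file wrote it.
        for _ in 0..kv_count {
            let key = read_string(&mut r)?;
            let vtype = read_u32(&mut r)?;
            println!("  {key}");
            skip_value(&mut r, vtype)?;
        }
        Ok(())
    }

You'll typically see standard keys like general.architecture and the tokenizer.ggml.* family; nothing in the format forces a publisher to record which llama.cpp commit or conversion date they used, which is exactly the gap I'm complaining about.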
My friend, what exactly do you think is so energy-intensive about LK99 synthesis? I've briefly taken a look, and the proposed process is really not that onerous in terms of energy consumed. The hurdle is perfecting the process; we already happily spend tons of energy on similar industrial processes.