It's not completely impossible, depending on what your expectations are. That language model that was built out of redstone in Minecraft had... looks like 5 million parameters. And it could produce mostly coherent sentences.
Which is a lot more than 888 kB... Supposing your ESP32 could use qint8 (LOL), that's still 1 byte per parameter, and the k in kB stands for thousand, not million.
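To make the arithmetic concrete, a quick back-of-the-envelope in Python (the 888 kB budget and the ~5M parameter count come from this thread; the rest is just unit conversion):

```python
# Back-of-the-envelope: what fits in an 888 kB budget?
BUDGET_BYTES = 888 * 1000  # 888 kB; k = thousand, not million

# The Minecraft model: ~5M parameters at qint8 (1 byte per parameter)
minecraft_params = 5_000_000
print(f"5M params @ int8: {minecraft_params / 1000:.0f} kB")  # 5000 kB, ~5.6x over budget

# Going the other way: how many 3-bit parameters fit in the budget?
params_3bit = (BUDGET_BYTES * 8) // 3
print(f"888 kB @ 3 bits/param: ~{params_3bit / 1e6:.1f}M params")  # ~2.4M
```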
> But cutting down to 2 million 3 bit parameters or something like that would definitely be possible.
Sure, but there's no free lunch
> Hey look what I just found
I've even personally built smaller "L"LMs. The first L is in quotes because they really aren't large (so maybe lLM?), and they aren't anything like what you'd expect, and certainly not what the parent was looking for. Their utility is really not that high... (there are special cases, though). Can you "do" it? Yeah, you can make a machine learning model of essentially arbitrary size. Will it be useful? Obviously that's not guaranteed. Is it fun? Yes. Is it great for learning? Also yes.
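If you want a feel for what "essentially arbitrary size" means in practice, here's a minimal PyTorch sketch; every size in it (vocab, width, depth, context) is a made-up illustration, not a recipe for good output:

```python
import torch
import torch.nn as nn

# A deliberately tiny character-level language model. All sizes here are
# illustrative knobs you can shrink further, not tuned for quality.
class TinyLM(nn.Module):
    def __init__(self, vocab_size=96, d_model=64, n_layers=2, n_heads=2, ctx=128):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(ctx, d_model)
        block = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.blocks = nn.TransformerEncoder(block, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        t = idx.shape[1]
        x = self.tok(idx) + self.pos(torch.arange(t, device=idx.device))
        mask = nn.Transformer.generate_square_subsequent_mask(t).to(idx.device)
        return self.head(self.blocks(x, mask=mask))

model = TinyLM()
n = sum(p.numel() for p in model.parameters())
print(f"{n:,} params -> ~{n / 1000:.0f} kB at 1 byte/param")  # ~120k params
```

At a byte per parameter that's comfortably inside the budget being discussed; whether the output is worth reading is the separate, unguaranteed part.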
And remember, TinyStories is 1 GB of data. Can you train for longer and with more data? Again, certainly, but again, there are costs. That Minecraft one is far more powerful than this thing.
Also, remember that these models are not RLHF'd, so you really shouldn't expect them to behave the way you expect an LLM to. They're only at stage 0, the "pre-training" stage, or what Karpathy calls a "babbler".
A reminder that what I said was "not completely impossible, depending on what your expectations are".
And I was focused more on the ESP32 part than the exact number of bytes. As far as I'm concerned, if you can port the model from the Minecraft video, you still win the challenge.
Also, that last link isn't supposed to represent the best you can do in 800 kB. 260k parameters is way, way under the limit.
I disagree; in the future it might be possible. Perhaps not in English, but in some more formal (yet fuzzy) language with some basic epistemology.
I mean, there is a lambda calculus self-interpreter in 29 bytes. How many additional logical rules are required for AGI inference? Maybe not as many as people think. Understanding about 1000 concepts of Basic English (or, say, Lojban) might well be sufficient. It is possible this can be encoded in 800 kB; we just don't know how.
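For a sense of how little machinery a universal substrate needs, here's a toy normal-order lambda calculus reducer in Python with de Bruijn indices. To be clear, this is an illustration of the idea, not the 29-byte self-interpreter (which is written in binary lambda calculus, not Python):

```python
# Terms: ("var", n) | ("lam", body) | ("app", f, x), with de Bruijn indices.

def shift(t, d, c=0):
    """Shift free variables (index >= c) by d."""
    if t[0] == "var":
        return ("var", t[1] + d) if t[1] >= c else t
    if t[0] == "lam":
        return ("lam", shift(t[1], d, c + 1))
    return ("app", shift(t[1], d, c), shift(t[2], d, c))

def subst(t, s, j=0):
    """Substitute s for variable j in t."""
    if t[0] == "var":
        return s if t[1] == j else t
    if t[0] == "lam":
        return ("lam", subst(t[1], shift(s, 1), j + 1))
    return ("app", subst(t[1], s, j), subst(t[2], s, j))

def step(t):
    """One normal-order (leftmost-outermost) step; None if in normal form."""
    if t[0] == "app":
        f, x = t[1], t[2]
        if f[0] == "lam":  # beta reduction
            return shift(subst(f[1], shift(x, 1)), -1)
        r = step(f)
        if r is not None:
            return ("app", r, x)
        r = step(x)
        return ("app", f, r) if r is not None else None
    if t[0] == "lam":
        r = step(t[1])
        return ("lam", r) if r is not None else None
    return None

def normalize(t, limit=1000):
    for _ in range(limit):
        r = step(t)
        if r is None:
            break
        t = r
    return t

I = ("lam", ("var", 0))           # λx. x
K = ("lam", ("lam", ("var", 1)))  # λx. λy. x
print(normalize(("app", I, K)))   # ('lam', ('lam', ('var', 1)))
```

The open question, of course, is how many rules you'd need on top of a core like this, and whether grounding ~1000 concepts fits in the budget.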
However, it is really not that impressive for just a client.