More

eab- · 2026-01-14T00:49:00 1768351740

> We have proofs that are gigabytes (I believe even terabytes in some cases) in size, but we know they are correct because they check in Lean.

I'm not aware of any of these. There's some SAT-like results that were not verified in Lean at that sort of scale, but Lean proofs of individual problems are nowhere near that. For example, Mathlib (think a Lean4 math stdlib) is 6GB including compilation artifacts, and iirc <100MB text.

eab- · 2026-01-12T10:27:20 1768213640

> Haha. How do you reconcile a proof with actual code?

Languages like Lean allow you to write programs and proofs under the same umbrella.

thesz · 2026-01-12T18:49:44 1768243784

As if Lean does not allow to circumvent it's proof system (the "sorry" keyword).

Also, consider adding code to the bigger system, written in C++. How would you use Lean to prove correctness of your code as part of the bigger system?

daxfohl · 2026-01-14T03:57:27 1768363047

I mean, it's somewhat moot, as even the formal hypothesis ("what is this proof proving") can be more complex than the code that implements it in nontrivial cases. So verifying that the proof is saying the thing that you actually want it to prove can be near impossible for non-experts, and that's just the hypothesis; I'm assuming the proof itself is fully AI-generated and not reviewed beyond running it through the checker.

And at least in backend engineering, for anything beyond low-level algorithms you almost always want some workarounds: for your customer service department, for engineering during incident response, for your VIP clients, etc. If you're relying on formal proof of some functionality, you've got to create all those allowances in your proof algorithm (and hypothesis) too. And additionally nobody has really come up with a platform for distributed proofs, durable proof keys (kinda), or how to deal with "proven" functionality changes over time.

eab- · 2026-01-12T10:21:55 1768213315

I find it surprising that GPT managed to complete (with indeed a harness that doesn't seem very powerful!) and Claude still cannot.

awakeasleep · 2026-01-12T12:50:23 1768222223

Don’t they give got a minimap?

What are they actual differences?

eab- · 2026-01-08T16:24:25 1767889465

I'd find this article a bit more compelling if it was used to find current introduced bugs, instead of just using a holdout set

eab- · 2025-12-25T18:42:25 1766688145

My understanding is that also this tail call based interpretation is also kinder to the branch predictor. I wonder if this explains some of the slow downs - they trigger specific cases that cause lots of branch mispredictions.

eab- · 2025-09-04T00:49:20 1756946960

What do you mean about CLIP?

rawgabbit · 2025-09-04T02:34:56 1756953296

I believe he is referring to OpenAI proposal to move beyond training with pure text. Instead train with multi modal data. Instead of only the dictionary definition of an apple. Train it with a picture of an apple. Train it with a video of someone eating an apple etc.

godshatter · 2025-09-04T16:01:40 1757001700

Before this AI wave got going, I'd always assumed that AGI would be more about converting words, pictures, video, and lots of sensory data and who knows what else into a model of concepts that it would be putting together and hypothesizing about and testing as it grows. A database of what concepts have been learned and what data they were built from and what holes it needed to fill in. It would continually be working on this and reaching out to test reality or discuss it's findings with people or other AIs instead of waiting for input like a chatbot. I haven't even seen anything like this yet, just ways of faking it by getting better at stringing words together or mashing pixels together based on text tokens.

No one seems to be working on building an AI model that understands, to any real degree, what it's saying or what it's creating. Without this, I don't see how they can even get to AGI.

rawgabbit · 2025-09-04T17:23:37 1757006617

When I was young, my relatives would make fun of me. Saying I had a lot of book learning but yet to experience the absurdity of the real world. Wait, they said, when I try to apply my fancy book learning to a world controlled by good ole boys, gatekeepers, and double talk. Then I will learn reality is different from the idealized world of books.

eab- · 2025-09-02T00:17:57 1756772277

There's both "no multi-program system" and "multi-program system", depending on how you look at it. In reality, you're always executing the same machine code, itself has no awareness of programs.

stirfish · 2025-09-02T00:21:03 1756772463

This unironically helped me work through a bug just now

eab- · 2025-08-19T17:58:57 1755626337

a lot of the researchers are still in london

firefax · 2025-08-19T18:07:08 1755626828

Also in general Google satellite offices often house the engineers of acquired startups who don't want to move to the mothership. It's not their primary purpose but it's one of the things they use them for.

logifail · 2025-08-19T18:35:37 1755628537

> in general Google satellite offices often house the engineers of acquired startups who don't want to move to the mothership

Would it be unfair to ask if (in this instance the UK's) satellite country taxpayers are subsidising corporate offices when the overall structures are arranged such that any overall corporation tax payable will be paid in the lowest-possible jurisdiction?

See - for instance - https://en.wikipedia.org/wiki/Corporation_tax_in_the_Republi...

logifail · 2025-08-19T18:01:18 1755626478

I understand that iPhones are assembled in China, India and Vietnam. Would those countries issue press releases on "their leadership in iPhones"?

LeifCarrotson · 2025-08-19T18:19:57 1755627597

I'm sure representatives of those countries love to say so, especially when talking to third parties about their expertise in manufacturing: "Yes, here in Zhengzhou, CN we're leaders in electronics manufacturing - the iPhone is assembled here at Foxconn!"

However, Apple (headquartered in the US) loves to issue press releases describing how their products are "Designed by Apple in California[, USA]" even though a lot of work in the manufacturing, the software, and the design of subcomponents (or major components, I don't know how Apple is organized internally) are done in China, India and Vietnam as you listed.

I'd argue that in the same way that Shenzen and Zhengzhou are leaders in electronics assembly because the bulk of the iPhone and other products are built there, regardless of the location of the headquarters of Apple, so to can London claim to be a leader in AI because the researchers for DeepMind are located in London, regardless of who owns the DeepMind brand.

Buying a thing from another country doesn't make your location a leader in that thing.

logifail · 2025-08-19T18:37:38 1755628658

> "we're leaders in electronics manufacturing"

The UK wasn't claiming to be "leaders _in manufacturing_", they were claiming "international leadership in AI".

As I said elsewhere in the thread, citation needed...

atq2119 · 2025-08-19T18:54:04 1755629644

What you're replying to is an analogy, that you seem to completely fail to understand.

astrange · 2025-08-19T23:40:42 1755646842

Apple's manufacturers don't do any of the work on the software or design. They don't even manufacture the highest value-add components; those are mostly done in Taiwan and Korea.

vouwfietsman · 2025-08-19T18:06:24 1755626784

China, India and Vietnam absolutely lead in manufacturing iphones, yes.

logifail · 2025-08-19T18:39:56 1755628796

in manufacturing <-- did you notice this bit? :)

I was responding to the quote from Dame Wendy Hall claiming that that UK [has] "international leadership in AI"

cma · 2025-08-19T18:07:03 1755626823

But in this case Deepmind's assembly (plugging in datacenter cables, etc.) is largely done in the US.

eab- · 2025-06-13T09:47:14 1749808034

The same Google that makes you log in again on like every other `gcloud` call? By copy pasting your password into the shell prompt?

eab- · 2025-04-03T21:22:37 1743715357

yep, this post is full of this post-rationalization, for example. it's pretty breathtaking