More

bmenrigh · 2026-03-04T03:31:05 1772595065

My main skepticism here is whether the theorems have been properly replicated in the proof. Verifying that the proof really captured the mathematical statement seems like a manual, human process, and quite hard to repeat reliably across proofs.

jaen · 2026-03-04T13:03:08 1772629388

At least for the main accomplishment, the Tietze extension theorem, the natural language statement and the Megalodon theorem looked quite easy to correspond, and the dependency chain seemed fairly easy as well. (the fact that the proof is 10k lines is irrelevant, this is just about the theorem statement itself)

This project did not deal with some cutting-edge, esoteric or puzzle-type maths for which the formalization can be tricky - just well-trodden topology.

akoboldfrying · 2026-03-04T04:06:14 1772597174

Maybe this is what you meant, but the dangerous part is ensuring that the final claim being proved is correct -- the actual proof of that claim is, by design, something that can be quickly and mechanically verified by applying a series of simple rules to a set of accepted axioms. Even a human could (somewhat laboriously) do this verification.

I have no experience with this, but I'd expect the danger to arise from implicit assumptions that are obvious to a mathematician in the relevant subfield but not necessarily to an LLM. E.g., whether we are dealing with real-valued or integer-valued variables can make a big difference in many problems, but might only be actually explicitly stated once in the very first chapter (or even implied by the book's title).

There are also many types of "overloaded" mathematical notation -- usually a superscripted number means "raised to the power", but if that number is -1, it might instead mean "the inverse of this function".

zozbot234 · 2026-03-04T04:59:43 1772600383

This is correct, with the proviso that the meaning of the claim/statement depends on the definitions and checking these can be a bit tedious. But still feasible, and better by orders of magnitude than "checking" the proofs. Abuses of notation and just plain sloppiness are a big issue though if you wish to auto-translate an informal development into correct definitions and statements for formalization.

bmenrigh · 2026-03-04T03:04:36 1772593476

Yeah I learning about LPCAMM2 memory was far more interesting than the repairability score.

bmenrigh · 2026-03-03T20:29:12 1772569752

Can we stop it with "and the results are terrifying", "and you won't believe what I found", "the <x> situation is insane", etc.? The over-hyping of low quality, low effort content is making it hard to find actually interesting or informative things.

SunshineTheCat · 2026-03-03T20:44:26 1772570666

Yea I was thinking the same thing.

When you reach for the most exaggerated, over-the-top word possible when describing something relatively mundane, what will you use when you talk about something that actually is "terrifying?"

thfuran · 2026-03-03T20:58:06 1772571486

“The most terrifying thing you’ve ever heard”. You can even stick with that one as long as your subjects are monotonically scary.

cheschire · 2026-03-03T20:40:06 1772570406

Find a better and more accessible solution than clickbait.

Please, do it.

arcfour · 2026-03-03T20:41:25 1772570485

"Privacy concerns found in audit of popular dev tools" (or something along those lines) would work without feeling sensationalized.

ranger_danger · 2026-03-04T01:51:31 1772589091

In this case I'd say the "clickbait" is justified... these results are downright horrific IMO.

cheschire · 2026-03-03T22:17:28 1772576248

Yes this one time. I’m speaking generally in response to the general plea.

thfuran · 2026-03-04T00:54:02 1772585642

I’m not sure what you mean. Not clickbait is the better alternative.

cheschire · 2026-03-04T00:58:13 1772585893

If that were true, it wouldn’t be such a popular problem. Right? Clearly HN is falling into the same pattern of all the other sites. Engagement hacking blah blah blah.

thfuran · 2026-03-04T13:10:10 1772629810

You have it backwards. If it weren’t true, clickbait wouldn’t be a problem.

cheschire · 2026-03-04T14:35:07 1772634907

I think you’re the one with it backwards.

You continue to view this through the self-centered eyes of the consumer. But you’re actually the product so your perspective doesn’t matter to the seller or the buyer. That’s why this problem doesn’t get fixed.

bmenrigh · 2026-03-03T20:49:36 1772570976

"better", "more accessible"? What the hell are you talking about? Clickbait doesn't make anything better or more accessible.

Instead, it makes it impossible to pre-select for interesting information. Instead of telling you what something is about, it tells you how you should feel about it. That's not improving accessibility.

cheschire · 2026-03-03T22:16:37 1772576197

I meant from the author’s perspective. Clickbait is too easy, which is probably why it’s so popular.

bmenrigh · 2026-03-03T22:31:54 1772577114

Oh completely. But my perspective is that we all should individually punish clickbait by not clicking. More broadly, we should strive to keep HN full of quality tech content rather than clickbait.

bmenrigh · 2026-03-03T20:05:51 1772568351

I love the AVX512 support in Zen 5 but the lack of Valgrind support for many of the AVX512 instructions frustrates me almost daily. I have to maintain a separate environment for compiling and testing because of it.

paulf38 · 2026-03-04T06:20:33 1772605233

There was someone at Intel working on AVX512 support in Valgrind. She is/was based in St Petersburg. Intel shuttered their Russian operations when Putin invaded Ukraine and that project stalled.

If anyone has the time and knowledge to help with AVX512 support then it would be most welcome. Fair warning, even with the initial work already done this is still a huge project.

bmenrigh · 2026-02-16T21:16:20 1771276580

I ran my own XMPP server for about 15 years. Then 10(?) years ago Google's GChat migrated away from XMPP. I never had another XMPP conversation with anyone from that point on. I finally turned the XMMP daemon off off about 2 years ago to reduce my attack surface.

bmenrigh · 2026-02-16T21:33:06 1771277586

I should add that I have a group of friends that only chat on "C", for C in:

* SMS * Apple iMessage * Email * IRC * Facebook Messenger * Telegram * Slack * Webex Teams * Discord * Twitter (DMs) * Signal * Whatsapp * A particular PHPBB web forum

Instead of being on top of all of these, I mostly neglect all of them, and then friends complain that they haven't been able to get a hold of me for 6 months.

XMPP was my own "solution" to this problem, which nobody else used.

bmenrigh · 2026-02-06T21:52:27 1770414747

Often "nothing is changing" ends up being more literal than the founders realize or intend. Acquisitions by big companies tend to slow the development to a crawl as development bureaucracy takes over. When a great product is practically frozen in time it stops being great in 5-15 years as the rest of the world passes them by.

brightball · 2026-02-06T22:27:13 1770416833

Yea, this is accurate in my experience.

bmenrigh · 2026-01-31T00:27:54 1769819274

Yeah I've spent way too much time reading this "guy's" posts here, Academia profile, etc. Huge waste of time. AI has managed to amplify a crank 100x. This is only going get worse.

soulofmischief · 2026-01-31T01:11:50 1769821910

I've seen for myself how much tunnel vision these models will get when collaborating scientifically/mathematically. When working around unfamiliar domains I have to do extensive grounding on my own. Curious to see how this changes over the next two years as the industry goes after scientific collaboration.

bmenrigh · 2026-01-30T22:58:32 1769813912

You should look. It’s almost more entertaining than the README.md

  theorem MilkyWay_Is_Collapsed : DeterminePhase MilkyWay = Phase.Collapsed := by
    -- ArkScalar MW ≈ 0.41 < 0.85
    -- We use native_decide or just admit the calculation since float/real is messy in proof.
    sorry -- Calculation verified by python script

bmenrigh · 2026-01-28T19:27:20 1769628440

Charles Schwab has something very similar. They keep unenrolling me from their paperless thing and then send me a letter every month telling me they unenrolled me because emails aren't being delivered.

But I get their emails just fine. It's their tracking that (intentionally) isn't working.

fooqux · 2026-01-28T19:52:11 1769629931

Capital One is the same. I eventually stopped caring; I know these paper mailings are costing them money. Maybe they'll get the point someday.

WorldMaker · 2026-01-28T22:59:05 1769641145

An issue is the number of banks now charging you back for paper services. At least one of my banks the fee is now $15/month for paper statements. That's way more than postage.

I'd almost prefer paper statements from a few of my accounts, but not enough to pay for it directly.

6ak74rfy · 2026-01-29T06:38:44 1769668724

Maybe this is what's happening to me at Fidelity. They keep complaining about my email on custom domain but the Protonmail address works fine. I use different apps for the two because PM doesn't support IMAP, so maybe PM doesn't block the tracking pixels but the other one does.

bmenrigh · 2026-01-07T16:28:10 1767803290

I'm happy this is public domain. In 2023 I used the stain images as the basis for a CTF challenge (for BSidesSF). The encoded flag given to participants was https://github.com/BSidesSF/ctf-2023-release/blob/main/alien...

Unfortunately the challenge was a bit too hard and went unsolved during the competition.

hughw · 2026-01-07T16:30:39 1767803439

It's hard to imagine a reason for it being kept... proprietary?

bmenrigh · 2026-01-07T16:46:27 1767804387

A lot of people want to slap licenses on things without really thinking about what the license will do (or prevent), in practice.

I like the author's note about the license: "As we do not believe in imaginary property, this package belongs to the public domain."

I think it's much more common to see a Creative Commons license on this sort of thing.

viraptor · 2026-01-07T19:49:07 1767815347

And even then, when people have good intentions they don't anyways know about edge cases. Please give things a licence in addition to placing it in public domain, because in some countries (like Australia) you can't release your rights that way.