You underestimate academia. Any academic who reads these two sentences will focus only on the first one: he has a named chair at Courant. In Germany, being a Prof is added to your ID card/passport and becomes part of your official name, like a knighthood in other countries.
This is a useful concept that helps discovery, with a neat and functional website aesthetic. I hope the skill sharing space will eventually become usable and safe.
Aren't most of these people recommending random tools in the GitHub chat for this entry just attempting to exploit naive users? Why would anyone in this day and age follow the advice of new users telling them to download new repos or click on random websites, when they're already trying to use Claude Code or cowork?
While I generally agree with your sentiment, these tools aren't bad ones:
- Santa is a very common tool used by macOS admins to lock down binary and file access privileges for apps, usually on managed machines
- Disk Inventory X and GrandPerspective are well-known disk space usage tools for macOS (I personally use DaisyDisk but that requires a license)
- WizTree and WinDirStat are very common tools from Windows admin toolkits
The only one here I can say is potentially suspect is ClearDisk. I haven't used it before, but it does appear to be useful for specifically tracking down developer caches that eat up disk space.
If they can teach/lead us, then we can bring them in. If we have to teach them then we don’t need them and instead can cultivate our own talent.
I’m not against bringing in talent that can teach us where we don’t have local talent. We can use them to jump-start our own talent. I’m also not against extraordinarily talented business people who can add to the economy.
Elon Musk didn't come to the US as a businessman. He graduated from UPenn. So by your logic he shouldn't have been allowed to come here to get trained.
Mac minis are sold out in NYC these days because everyone is getting them to try out OpenClaw. Even if this move by Apple is unrelated to the recent demand, it certainly was timed right for the policy and market makers.
It's so funny to me that HN seems convinced that artists have a sudden renewed interest in desktop computers, when LLMs have been driving mac mini sales for more than a year
It's so funny to me that X users think OpenClaw represents more than 1% of Apple's desktop sales because it's what their timeline says is true.
If you want to humiliate me conclusively, throw me some numbers. LLMs have moved trillions worth of hardware value, but only a fraction of it is Apple branded.
Why do you say that?
Anecdotally, everyone I know who has bought a Mac mini in the last month has done so to run OpenClaw. Yes only three people, but before that I only knew one person over several years who had bought one.
I'm a product exec now but used to be a designer and lead UX teams. Even though I don't use those skills as much nowadays, it's still an almost daily hobby of mine.
Like the rest of HN (maybe it's HN's fault!) I managed to convince myself that I not only needed a Mac Mini desktop but also a 4090 rig for AI.
The 4090 hasn't been booted up in 9 months and the Mac mini is now the world's most amazing 10GbE NAS server. My older M1 Max MacBook Pro and underpowered newer MacBook Air are the only things I use.
I mean, I'll take the 4090 if you don't want it :)
It's funny how we convince ourselves we need things. I bought myself a 3080 Ti a few years ago because I wanted a gaming computer, but then I ended up buying a Playstation 5 and not using my computer for anything more intensive than Factorio. More recently though I have been using my 3080 for Comfy UI image generation and messing around with local models, so I guess it's getting use now.
Macs have "unified memory", meaning the GPU uses the same memory as the CPU, and minis can have up to 64 GB. So it's a lot faster than running on a CPU and a lot cheaper than any other GPU-based rig with similar memory.
Everyone recommending a Mac mini for OpenClaw is recommending the base model (which has just 16GB of RAM), so it’s not about the unified memory; it’s about the agent being able to interact with your Apple ecosystem services like Reminders, iMessage, etc.
Everything about that makes me feel very uncomfortable. Google made and spent a fortune getting people's data, and now people are just handing it over for free by the gigabyte.
Apple is amazing at marketing that makes 1990s technology sound cutting-edge. I'm sure they change something for plausible deniability; as a nominalist would argue, no two of the "same" computers are truly the same anyway.
The Mac mini is very good value for money if you need raw performance in a small, silent package. It's frequently available discounted to between $399 and $499.
A VPS that can perform like a Mac mini will likely cost the same as a Mac mini in 12 months time.
claws are run mainly by rich American programmers. The only computer they have is a MacBook. The only brand they know is Apple. The only cloud they know is serverless.
TBH I would first walk there to check that they can take me on the spot, and if so, ask them to either please come clean it (only 50m away) or if they cannot fly it there. So walk seems very rational to me.
Sure, just pick up the building containing the compressors, water hoses/sprayers, soap, and required drainage and water filtration system, and bring it 50 metres down the road.
Not OP, but here's my personal opinion on why it's a somewhat hard problem. The main difficulty is using the available compute correctly and productively across two very different kinds of work that were previously solved independently: generating responses with LLM inference engines and updating weights with training code.

Each training step updates the weights, so the inference engines have to pick up the new copies, but we're talking about 750B parameters spread across multiple inference servers. You can keep generating on stale weights instead, but only briefly, and the data produced that way needs special off-policy corrections that themselves cost significant compute and memory.

Your inference engines had also better be deterministic (for a given pseudo-RNG seed, which clashes with parallelism), or you need a way to correct the probability streams. Ideally, inference and training would agree bit-for-bit when handling the same context, but we don't live in that world yet.

And of course, GPUs break, for no better reason than the tiny scale of their features making them fragile. Because you're running at scale, you need to handle failures gracefully and efficiently.
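To make the weight-sync problem concrete, here's a heavily simplified sketch of the loop being described. All names and structure are hypothetical, not any real framework's API; a real system would stream hundreds of gigabytes of weights instead of swapping a Python object:

```python
from dataclasses import dataclass

@dataclass
class Weights:
    version: int  # stands in for hundreds of GB of parameters

def train_step(w: Weights, batch) -> Weights:
    # Gradient update elided; each step produces a new weight version.
    return Weights(version=w.version + 1)

class InferenceServer:
    def __init__(self, w: Weights):
        self.w = w

    def generate(self, prompt: str) -> dict:
        # The rollout is tagged with the weight version it was sampled under.
        # If this version lags the trainer's, the trainer must either discard
        # the rollout or apply an off-policy correction.
        return {"prompt": prompt, "weight_version": self.w.version}

    def sync(self, w: Weights):
        # In practice this is a massive broadcast, not a pointer swap.
        self.w = w

trainer_weights = Weights(version=0)
servers = [InferenceServer(trainer_weights) for _ in range(4)]

for step in range(3):
    rollouts = [s.generate("task") for s in servers]
    stale = [r for r in rollouts if r["weight_version"] != trainer_weights.version]
    print(f"step {step}: {len(stale)} stale rollouts")
    trainer_weights = train_step(trainer_weights, rollouts)
    for s in servers:
        s.sync(trainer_weights)  # every server catches up before the next batch
```

The hard engineering is hidden in `sync`: doing it fast enough that the inference fleet isn't idle, or skipping it and paying the off-policy correction cost instead.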
Surely you could just pre-generate rollouts with slightly stale weights and then cheaply verify the rollout when up-to-date weights stream in by treating the former solution as speculative decoding. Sounds quite trivial to me, perhaps I'm missing something.
Cheap verification of speculative decoding only works for a few tokens at a time. Long generations (thousands to tens of thousands of tokens in a typical rollout for a thinking model) are dominated by distribution drift on stale weights, because slightly wrong per-token probabilities multiply over long streams, and off-policy RL training methods don't work well (high variance) in such high-dimensional problems.
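A toy back-of-the-envelope sketch of the multiplication effect (the 1% drift figure is purely illustrative, not from any real model): even a small, consistent per-token disagreement between stale and fresh policies compounds into an enormous importance weight over a long rollout, which is exactly the high-variance regime off-policy corrections struggle with.

```python
# Suppose the fresh policy assigns each sampled token a probability
# just 1% higher than the stale policy that generated it.
per_token_ratio = 1.01  # p_fresh / p_stale, per token (illustrative)

for n in (10, 1_000, 10_000):
    weight = per_token_ratio ** n  # sequence-level importance weight
    print(f"{n:>6} tokens -> importance weight {weight:.3g}")
```

At 10 tokens the correction is a benign ~1.1x, but by 10,000 tokens the weight has exploded past 10^43; in practice the drift is noisy rather than one-sided, so the weights scatter across many orders of magnitude instead of growing monotonically, which is what "high variance" means here.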
Please update the title to: "A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents". The current editorialized title is misleading and based in part on this sentence: “…with 9 of the 12 evaluated models exhibiting misalignment rates between 30% and 50%”
Not only that, but the average reader will interpret the title to reflect AI agents' real-world performance. This is a benchmark... with 40 scenarios. I don't say this to diminish the value of the research paper or the efforts of its authors. But in titling it the way they did, OP has cast it with the laziest, most hyperbolic interpretation.
I like your effort. Time savings and strict security are real and important. In modern orchestration flows, however, a subagent handles the extra processing of tool results, so the main agent's context is not polluted.