You underestimate academia. Any academic who reads these two sentences will focus only on the first one: he has a named chair at Courant. In Germany, being a Prof is added to your ID card/passport and becomes part of your official name, like a knighthood in other countries.
This is a useful concept that helps discovery, with a neat and functional website aesthetic. I hope the skill sharing space will eventually become usable and safe.
Aren't most of these people recommending random tools in the GitHub chat for this entry just attempting to exploit naive users? Why would anyone in this day and age follow the advice of new users telling them to download new repos or click on random websites, when they're already trying to use Claude Code or cowork?
While I generally agree with your sentiment, these tools aren't bad ones:
- Santa is a very common tool used by macOS admins to lock down binary and file access privileges for apps, usually on managed machines
- Disk Inventory X and GrandPerspective are well-known disk space usage tools for macOS (I personally use DaisyDisk but that requires a license)
- WizTree and WinDirStat are very common tools from Windows admin toolkits
The only one here I can say is potentially suspect is ClearDisk. I haven't used it before, but it does appear to be useful for specifically tracking down developer caches that eat up disk space.
If they can teach/lead us, then we can bring them in. If we have to teach them then we don’t need them and instead can cultivate our own talent.
I’m not against bringing in talent that can teach us where we don’t have local talent. We can use them to jump-start our own talent. I’m also not against extraordinarily talented business people who can add to the economy.
Elon Musk didn't come to the US as a businessman. He graduated from UPenn. So by your logic he shouldn't have been allowed to come here to get trained.
Mac minis are sold out in NYC these days because everyone is getting them to try out OpenClaw. Even if this move by Apple is unrelated to the recent demand, it certainly was timed right for the policy and market makers.
It's so funny to me that HN seems convinced that artists have a sudden renewed interest in desktop computers, when LLMs have been driving mac mini sales for more than a year
It's so funny to me that X users think OpenClaw represents more than 1% of Apple's desktop sales because it's what their timeline says is true.
If you want to humiliate me conclusively, throw me some numbers. LLMs have moved trillions worth of hardware value, but only a fraction of it is Apple branded.
Why do you say that?
Anecdotally, everyone I know who has bought a Mac mini in the last month has done so to run OpenClaw. Yes only three people, but before that I only knew one person over several years who had bought one.
I'm a product exec now but used to be a designer and lead UX teams. Even though I don't use those skills as much nowadays, it's still an almost daily hobby of mine.
Like the rest of HN (maybe it's HN's fault!) I managed to convince myself that I not only needed a Mac Mini desktop but also a 4090 rig for AI.
The 4090 hasn't been booted up in 9 months and the Mac mini is now the world's most amazing 10GbE NAS server. My older M1 Max MacBook Pro and underpowered newer MacBook Air are the only things I use.
I mean, I'll take the 4090 if you don't want it :)
It's funny how we convince ourselves we need things. I bought myself a 3080 Ti a few years ago because I wanted a gaming computer, but then I ended up buying a Playstation 5 and not using my computer for anything more intensive than Factorio. More recently though I have been using my 3080 for Comfy UI image generation and messing around with local models, so I guess it's getting use now.
Macs have "unified memory", meaning the GPU uses the same memory as the CPU, and minis can have up to 64 GB. So it's a lot faster than running on a CPU and a lot cheaper than any other GPU-based rig with similar memory.
Everyone recommending a Mac mini for OpenClaw is recommending the base model (which has just 16GB of RAM), so it’s not about the unified memory; it’s about the agent being able to interact with your Apple ecosystem services like Reminders, iMessage, etc.
Everything about that makes me feel very uncomfortable. Google made and spent a fortune getting people's data, and now people are just handing it over for free by the gigabyte.
Apple is amazing at marketing that makes 1990s technology sound cutting-edge. I'm sure they change something for plausible deniability; as a nominalist would argue, no two of the "same" computers are truly the same anyway.
The Mac mini is very good value for money if you need raw performance in a small, silent package. It's frequently available discounted to between $399 and $499.
A VPS that can perform like a Mac mini will likely cost the same as a Mac mini in 12 months time.
claws are run mainly by rich American programmers. The only computer they have is a MacBook. The only brand they know is Apple. The only cloud they know is serverless.
TBH I would first walk there to check that they can take me on the spot, and if so, ask them to either please come clean it (only 50m away) or if they cannot fly it there. So walk seems very rational to me.
Sure, just pick up the building containing the compressors, water hoses/sprayers, soap, and required drainage and water filtration system, and bring it 50 metres down the road.
Not OP, but here's my personal opinion on why it's a somewhat hard problem. The main difficulty is using the available compute correctly and productively across two very different kinds of work that were previously solved independently: generating responses with LLM inference engines and updating weights with training code.

Each training step updates the weights, so the inference engines have to pick up the new copies, but we're talking about 750B parameters spread across multiple inference servers. You can keep generating on stale weights instead, but only briefly, and the data produced that way needs special off-policy corrections that themselves cost significant compute and memory.

Your inference engines had also better be deterministic (for a given pseudo-RNG seed, which clashes with parallelism), or you need a way to correct the probability streams. Ideally, inference and training would agree bit-for-bit when handling the same context, but we don't live in that world yet.

And of course, GPUs break, for no better reason than the tiny scale of their features making them fragile. Because you're running at scale, you need to handle failures gracefully and efficiently.
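To make the weight-sync problem concrete, here's a heavily simplified sketch of the loop being described. All names and structure are hypothetical, not any real framework's API; a real system would stream hundreds of gigabytes of weights instead of swapping a Python object:

```python
from dataclasses import dataclass

@dataclass
class Weights:
    version: int  # stands in for hundreds of GB of parameters

def train_step(w: Weights, batch) -> Weights:
    # Gradient update elided; each step produces a new weight version.
    return Weights(version=w.version + 1)

class InferenceServer:
    def __init__(self, w: Weights):
        self.w = w

    def generate(self, prompt: str) -> dict:
        # The rollout is tagged with the weight version it was sampled under.
        # If this version lags the trainer's, the trainer must either discard
        # the rollout or apply an off-policy correction.
        return {"prompt": prompt, "weight_version": self.w.version}

    def sync(self, w: Weights):
        # In practice this is a massive broadcast, not a pointer swap.
        self.w = w

trainer_weights = Weights(version=0)
servers = [InferenceServer(trainer_weights) for _ in range(4)]

for step in range(3):
    rollouts = [s.generate("task") for s in servers]
    stale = [r for r in rollouts if r["weight_version"] != trainer_weights.version]
    print(f"step {step}: {len(stale)} stale rollouts")
    trainer_weights = train_step(trainer_weights, rollouts)
    for s in servers:
        s.sync(trainer_weights)  # every server catches up before the next batch
```

The hard engineering is hidden in `sync`: doing it fast enough that the inference fleet isn't idle, or skipping it and paying the off-policy correction cost instead.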
Surely you could just pre-generate rollouts with slightly stale weights and then cheaply verify the rollout when up-to-date weights stream in by treating the former solution as speculative decoding. Sounds quite trivial to me, perhaps I'm missing something.
Cheap verification of speculative decoding only works for a few tokens at a time. Long generations (thousands to tens of thousands of tokens in a typical rollout for a thinking model) are dominated by distribution drift on stale weights, because slightly wrong per-token probabilities multiply over long streams, and off-policy RL training methods don't work well (high variance) in such high-dimensional problems.
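A toy back-of-the-envelope sketch of the multiplication effect (the 1% drift figure is purely illustrative, not from any real model): even a small, consistent per-token disagreement between stale and fresh policies compounds into an enormous importance weight over a long rollout, which is exactly the high-variance regime off-policy corrections struggle with.

```python
# Suppose the fresh policy assigns each sampled token a probability
# just 1% higher than the stale policy that generated it.
per_token_ratio = 1.01  # p_fresh / p_stale, per token (illustrative)

for n in (10, 1_000, 10_000):
    weight = per_token_ratio ** n  # sequence-level importance weight
    print(f"{n:>6} tokens -> importance weight {weight:.3g}")
```

At 10 tokens the correction is a benign ~1.1x, but by 10,000 tokens the weight has exploded past 10^43; in practice the drift is noisy rather than one-sided, so the weights scatter across many orders of magnitude instead of growing monotonically, which is what "high variance" means here.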
Please update the title to: "A Benchmark for Evaluating Outcome-Driven Constraint Violations in Autonomous AI Agents". The current editorialized title is misleading and based in part on this sentence: “…with 9 of the 12 evaluated models exhibiting misalignment rates between 30% and 50%”
Not only that, but the average reader will interpret the title to reflect AI agents' real-world performance. This is a benchmark... with 40 scenarios. I don't say this to diminish the value of the research paper or the efforts of its authors. But in titling it the way they did, OP has cast it with the laziest, most hyperbolic interpretation.
I like your effort. Time savings and strict security are real and important. In modern orchestration flows, however, a subagent handles the extra processing of tool results, so the main agent's context is not polluted.