Hacker News | tempestn's comments

I like to think the machines actually were using them for processing power, and the humans themselves just misunderstood (or oversimplified for Neo) what was actually going on.

Processing power is my second favorite explanation.

My first favorite would have been: they don’t use the humans for anything, the pods are just the most efficient way to store humans. The machines think they are being benevolent, just want peace and quiet and for humans to stop doing dramatic things like scorching the sky. But I don’t know where the plot would go from there.


There is backstory that the films could have gone into, though I don't know if it was written before or after the first film. The humans in the matrix were allied with the machines, who put them in the matrix to protect them from the war. They were being benevolent.

They benevolently feed the dead to the living

What the humans thought they knew came from the Zion archives mostly. And guess where the Zion archives came from…

The name is extremely off-putting, but I can see how they would want to be diplomatic toward the administration in using their chosen name. Save the push-back for where it really matters.

If they had access to them in Ukraine, both sides would already be using them I expect. Right now jamming of drones is a huge obstacle. One way it's dealt with is to run literal wired drones with massive spools of cable strung out behind them. A fully autonomous drone would be a significant advantage in this environment.

I'm not making a values judgment here, just saying that they will absolutely be used in war as soon as it's feasible to do so. The only exception I could see is if the world managed to come together and sign a treaty explicitly banning the use of autonomous weapons, but it's hard for me to see that happening in the near future.

Edit: come to think of it, you could argue a landmine is a fully autonomous weapon already.


Hah, I had the same realization about landmines. As the other commenter said, really it would be better to add intelligence to these autonomous systems to limit the nastiness of the currently-deployed ones. If a landmine could distinguish between a real target and an innocent civilian 50 years later, it'd be a lot better.

A landmine blowing up the enemy civilian 50 years later is probably seen as an advantage by the force deploying them. A bit like "salting the earth."

Depressingly true.

Many landmines disarm after a while.

It's weird that people still think that the people whose job it is to kill people, or make things that kill people, really care about people more than the killing part. They don't give a shit who blows up, as long as no one comes knocking on their door about it.

This is great. I particularly enjoyed this entry in the FAQ about how to find web pages: https://info.cern.ch/hypertext/WWW/FAQ/KeepingTrack.html

> When (s)he has found an overview page which (s)he feels ought to refer to the new data, (s)he can ask the author of that document (who ought to have signed it with a link to his or her mail address) to put in a link.

> By the way, it would be easy in principle for a third party to run over these trees and make indexes of what they find. Its just that noone has done it as far as I know


The one that always gets me is how they're insistent on giving 17-step instructions to any given problem, even when each step is conditional and requires feedback. So in practice you need to do the first step, then report the results, and have it adapt, at which point it will repeat steps 2-16. IME it's almost impossible to reliably prevent it from doing this, however you ask, at least without severely degrading the value of the response.

In my experience Gemini 3.0 Pro is noticeably better than ChatGPT 5.2 for non-coding tasks. The latter gives me blatantly wrong information all the time, the former very rarely.

I agree, and it has been my almost exclusive go-to ever since Gemini 3 Pro came out in November.

In my opinion Google isn't as far behind in coding as comments here would suggest. With Fast, it might already have edited 5 files before Claude Sonnet finished processing your prompt.

There is a lot of potential here, and with Antigravity as well as Gemini CLI (I did not test the latter) they are working on capitalizing on it.


Strange that you say that, because the general consensus (and my experience) seems to be the opposite, as does the AA-Omniscience Hallucination Rate benchmark, which puts 3.0 Pro among the higher-hallucinating models. 3.1 seems to be a noticeable improvement, though.

Google actually has the best rating on the AA-Omniscience Index, which measures knowledge reliability and hallucination (higher is better). It rewards correct answers, penalizes hallucinations, and has no penalty for refusing to answer.

Gemini 3.1 holds the top spot, followed by 3.0 and then Opus 4.6 Max.
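As a rough illustration of how an index like that could reward refusals, here is a minimal sketch. The scoring weights are my assumption (the simplest scheme consistent with the description), not taken from Artificial Analysis's actual methodology:

```python
# Hypothetical Omniscience-style index: +1 per correct answer,
# -1 per hallucinated answer, 0 per refusal, normalized by the
# number of questions. The real AA weighting may differ.

def omniscience_index(correct, incorrect, refused):
    total = correct + incorrect + refused
    return (correct - incorrect) / total

# A model that knows less but refuses when unsure can beat one
# that always guesses:
print(omniscience_index(40, 5, 55))   # 0.35
print(omniscience_index(60, 40, 0))   # 0.2
```

Under this kind of scoring, "no penalty for refusing" is exactly what lets a cautious model outrank a more knowledgeable but overconfident one.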


This isn't actually correct.

Gemini 3.0 gets a very high score because it's very often correct, but it does not have a low hallucination rate.

https://artificialanalysis.ai/#aa-omniscience-hallucination-...

It looks like 3.1 is a big improvement in this regard, it hallucinates a lot less.


Yes and no. The hallucination rate shown there is the percentage of the time the model answers incorrectly when it should have admitted to not knowing the answer. Most models score very poorly on this, with a few exceptions, because they nearly always try to answer. It's true that 3.0 is no better than others on this measure. But given that it knows the correct answer much more often than e.g. GPT 5.2, it does in fact give hallucinated answers much less often.

In short, its hallucination rate as a percentage of unknown answers is no better than most models', but its hallucination rate as a percentage of total answers is indeed better.
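The distinction between the two denominators can be sketched in a few lines. The figures here are made up purely for illustration, not actual benchmark data:

```python
# Two different "hallucination rates" from the same raw counts:
# wrong answers divided by unknown questions (the AA-style measure)
# vs. wrong answers divided by all questions.

def rates(total, correct, refused):
    wrong = total - correct - refused   # hallucinated answers
    unknown = total - correct           # questions the model didn't know
    per_unknown = wrong / unknown       # rate when it should have refused
    per_total = wrong / total           # share of all answers that are wrong
    return per_unknown, per_total

# A model that knows 70% of the answers vs. one that knows 40%,
# both refusing only 2 of 100 questions:
print(rates(total=100, correct=70, refused=2))  # (~0.93, 0.28)
print(rates(total=100, correct=40, refused=2))  # (~0.97, 0.58)
```

Both models hallucinate on nearly every question they don't know, so they look equally bad on the first measure, yet the first model produces barely half as many wrong answers overall.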


> the AA-Omniscience Hallucination Rate Benchmark which puts 3.0 Pro among the higher hallucinating models. 3.1 seems to be a noticeable improvement though.

As sibling comment says, AA-Omniscience Hallucination Rate Benchmark puts Gemini 3.0 as the best performing aside from Gemini 3.1 preview.

https://artificialanalysis.ai/evaluations/omniscience


You are misreading the benchmark.

https://artificialanalysis.ai/#aa-omniscience-hallucination-...

If you look at the results, 3.0 hallucinates an awful lot when it's wrong.

It's just not wrong that often.

(And it looks like 3.1 does better on both fronts)


I can only speak to my own experience, but for the past couple of months I've been duplicating prompts across both for high value tasks, and that has been my consistent finding.

Google is good for answering questions but its writing is lacking. I’ve had to deal with Gemini slop and it’s worse than ChatGPT

Based on the self driving trials in my Model Y, I find it terrifying that anyone trusts it to drive them around. It required multiple interventions in a single 10-minute drive last time I tried it.

I'm using FSD for 100% of my driving and only need to intervene maybe once a week. It's usually because the car isn't confident or is too slow, not because it's doing something dangerous. Two years ago it was very different: almost every trip I needed to intervene to avoid a crash. The progress they have made is truly amazing.

Would you use FSD with your children in the car? I sure as hell wouldn’t. Progress is not safety.

Yes I do in fact use FSD with my children in the car.

I pray for you and them. You need it

Oh well that's because you aren't using V18.58259a, I follow Elon's X and he said FSD is solved in that update. Clearly user error.

How long ago was that? I doubt it was the v14 software. The software has become scary good in the last few weeks, in my own subjective experience.

This exact sentence (minus the specific version) is claimed every single week.

No, it does not "become scary good" every single week for 10 years while still being unable to drive coast to coast all by itself (which Elon promised it would do a decade ago).

You are just human and bad at evaluating it. You might even be experiencing literal statistical noise.


I have not been proclaiming scary good every week for the last 10 years. In fact, I have cancelled my subscription at least two times, once on v13 and once on v14, with the reason ‘not good enough yet.’ I am telling you that for me personally it has crossed a threshold very recently.

It certainly wasn't in the past few weeks, but I've been hearing about how good it's gotten for years. Certainly not planning to pay to find out if it's true now, but I'll give it another try next free trial!

Make sure you are on AI4 hardware when you do. If you buy FSD on AI3 you'll be limited to v13, which is terrible. I have used both and they are in different leagues altogether.

Because Opus 4.5 was released like a month ago and was state of the art, and now the significantly faster and cheaper version is already comparable.

"Faster" is also a good point. I'm using different models via GitHub Copilot and find the better, more accurate models way too slow.

Opus 4.5 was November, but your point stands.

Fair. Feels like a month!

Would've been, once. These days I assume bentcorner asked their favourite LLM to generate a poem parodying Ozymandias about once-popular youtube videos.

It doesn't feel like it at all (I'd never expect an LLM to say 'pfp' like that, or 'lossly [sic] compressed', or use ASCII instead of fancy quotes), but who knows at this point.

I may have gotten incredibly neurotic about online text since 2022.


or you could get over it and still enjoy it anyway. Like how Coke Zero tastes.

That is a fair point. Especially since, assuming it was AI-generated, it presumably wouldn't have existed at all otherwise.

Brought to you by Carl's Jr

Nope, I hand wrote this.

I actually considered using an LLM, but in my experience they "warp" the content too much for anything like this. The effort required to get them to retain what I would consider something to my taste would take longer than just writing the poem myself. (Although tbf it's been a while since I've asked an LLM to do parody work, so I could be wrong.)


Ah, well, kudos then!

I think you're missing their point. The question you're replying to is how we know that this made-up content is a hallucination, i.e., as opposed to being made up by a human. I think it's fairly obvious via Occam's razor, but still, they're not claiming the quotes could be legit.



You seem to be quite certain that I had not read the article, yet I distinctly remember doing so.

By what process do you imagine I arrived at the conclusion that the article suggested the published quotes were LLM hallucinations, when that was not mentioned in the article title?

You accuse me of performative skepticism, yet all I think is that it is better to have evidence over assumptions, and it is better to ask if that evidence exists.

It seems a much better approach than making false accusations based upon your own vibes. I don't think Scott Shambaugh went to that level, though.


https://news.ycombinator.com/item?id=47026071

https://arstechnica.com/staff/2026/02/editors-note-retractio...

>On Friday afternoon, Ars Technica published an article containing fabricated quotations generated by an AI tool and attributed to a source who did not say them. That is a serious failure of our standards. Direct quotations must always reflect what a source actually said.

