> It's individual initiative, and company culture that are at play as much as budget.
I agree, but the parent comment was insinuating that GP could just use an LLM to verify their hypothesis, and that is what I was attempting to address in my comment. The tool isn't out of reach, but not everyone has an employer-sponsored LLM plan.
> I, for one, like streaming apps enough that I don't want to go back to locked-down, expensive DVD players. The alternative to DRM isn't "no DRM", it's "no content".
That's a false dichotomy, since piracy exists. Stop giving them money until their behavior changes. If it doesn't... oh well, you still get a better service.
I haven't read anything from him in a while either, but I remember reading enough tweets from him about being confident, at x% probability, that we'll die before year z due to an AI doing the funny.
If the goal is to review every citation fully with 100% accuracy, then, sure, exhaustive human review is needed. But I suspect human review of a random sample would add value: it would catch some fraud and miss some, but produce zero false positives (or as close to zero as human review can get).
An LLM could replace the random sampling. It doesn't need to be particularly good for the approach to provide value. I would worry about LLM bias though.
Another thing to consider is that readers can detect fake citations after publication, report to arXiv, and the author gets banned.
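To put rough numbers on the random-sample idea, here is a minimal sketch. It assumes the reviewer always recognizes a fake citation once it lands in the sample, and that the sample is drawn uniformly without replacement; the function name and the example figures are made up for illustration, not taken from any arXiv process.

```python
from math import comb


def detection_probability(total_citations: int, fake_citations: int, sample_size: int) -> float:
    """Chance that a uniform random sample contains at least one fake citation.

    Assumption: a sampled fake is always recognized as fake (perfect review
    of whatever is sampled). Sampling is without replacement.
    """
    if not 0 <= fake_citations <= total_citations:
        raise ValueError("fake_citations must be between 0 and total_citations")
    if not 0 <= sample_size <= total_citations:
        raise ValueError("sample_size must be between 0 and total_citations")

    clean = total_citations - fake_citations
    # P(miss everything) = ways to pick only clean citations / ways to pick any sample.
    # comb() returns 0 when sample_size > clean, so the result is 1.0 in that case.
    p_miss = comb(clean, sample_size) / comb(total_citations, sample_size)
    return 1.0 - p_miss


if __name__ == "__main__":
    # Hypothetical paper: 30 citations, 3 of them fabricated.
    for sample_size in (1, 3, 5, 10):
        p = detection_probability(30, 3, sample_size)
        print(f"sample of {sample_size:2d}: {p:.1%} chance of catching at least one fake")
```

Under those assumptions, even a small sample catches a meaningful share of papers with several fabricated citations, which is the "catching some, missing some" trade-off described above.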
Yeah. They'll never be able to stop it, but I wouldn't be surprised if just adding a little extra roadblock makes a measurable difference in their storage costs.
If stuff really goes wrong, you need people who deeply understand the codebase so that they know where to look and how to diagnose the issue. It might be the case in the future that LLMs become so powerful they'll diagnose any issue (I doubt it), but until then, we need people in the loop.
We run location APIs (geocoding, routing, store locator, address autocomplete, etc.) at enterprise scale. When a checkout page resolves your address, or a courier finds you on the first try, that's us.
We are a small team, owning and driving complex solutions to production on a daily basis. We are looking for a good generalist to become an expert in a part of our stack. Your first PR ships to prod within a week, with very little process and good team spirit!
What I think about the particular case you're talking about is irrelevant. I guess we'll see if allowing the government to police speech is still such a great idea when Reform, or any future government, is in power.
Do they really, though? I understand wanting to be able to use a common layout between mobile and PC devices, but even when you factor those things in, there's no reason that we should've gone from something snappy at a few hundred MHz to sluggish on multicore GHz processors.
This! I now have to fight bad tech decisions at my companies because many devs follow influencers.
Look also at the hate spread against UE5… It’s everywhere and half of the arguments are falsehoods made by influencers with no real experience in the industry…
I have been using Deepseek v4 pro for personal projects and home-infra-related work for the last couple of weeks. Its quality of work is not bad at all, it is fairly fast, and at a fraction of the cost of Claude I can keep going, which makes it a very compelling option. Looking forward to trying out Kimi 2.6, thanks for the recommendation.
While this arXiv policy seems reasonable enough, I don't care for the kind of drivel some post on HN because they don't like LLMs.
I'm here because I enjoy building things. And today this mostly happens with AI. I could do without the often thoughtless comments and conspiracy theories about "LLM hypers" posted by people who don't like LLMs.
That happens at all large organisations. I worked at a large oil company, and if our contracts with a vendor represented (or would have represented) more than a certain % (I forget what) of that vendor's business, they didn't get the contract. As well as making vendors more likely to stay in existence, it stops the org being "morally responsible" for keeping them afloat.
Ah, OK. It still seems reasonable they might report the number in the court affidavit with a one-month lag for various reasons. The root cause of the discrepancy is just that Anthropic claims to be (and appears to be) on a ridiculous tear, with +35% MoM growth. OP and Ed both seem to dismiss this as impossible, but it seems to align with Anthropic's recent desperate search for more capacity.