Yeah - absolutely - every business has risks, but the point I'm trying to make is that if those risks aren't fully assessed, and you aren't willing to risk going to jail based on the "hope" that humans behave exactly as expected and agreed, then this is likely not a business worth pursuing. Of course everyone has a different risk tolerance - but there's a difference between assessing your risk tolerance and lacking the experience to assess risk at all. I don't know the founder here, but I believe it can only help if different perspectives are presented.
To go a bit deeper, what you have here is a business where a somewhat unknown person is hired to come to any address you type into a website, for a marginal fee - I'm struggling to see how you could keep people safe with this concept (not only the contractor, but also the customer).
The framers noted that the system was vulnerable to a single "faction" [1]. The solution was to have many competing factions. I think first-past-the-post, corporate election influence, and mass media consolidated power into a single faction that ended up causing the system to break down (in that the branches don't seem to be checking each other's power right now).
I don't think corporate election influence or mass media really have anything to do with it.
The issue first showed up in the 1828 election, when some of the Framers were still alive, and the US basically did nothing about it over the ensuing 200 years.
Remember it was Andrew Jackson who went around ignoring Supreme Court decisions and saying "they made their decision, let's see them enforce it".
And his abuse of executive power during the Bank War to punish political enemies led to the formation of a new political party.
> "they made their decision, let's see them enforce it"
This was one lesson the common people never wanted to learn, because it was so much easier to live on the belief that their system is intrinsically immune to abuse, that it's just better, almost magically so. It was bolstered by the same people's desire to feel better by pointing fingers at the "weak fools" living under dictatorships, incapable of fighting back. "We have rights and guns, we'll pick up arms and fight any abuse."
But when the abuses came pouring in, almost everyone folded, living on the next belief that time will fix things. Sometimes it did. Or maybe one of these times will bring the shocking realization that it's easy to talk big in good times and hard to act in bad times when your skin is in the game.
>I don't think corporate election influence or mass media really have anything to do with it.
Is any particular group overrepresented there? Hairy, long hooked nose? I'm talking about white cishetero men of course, this is all their fault. We need to have more, and by more, I mean ALL, such people to be non-cishetero non-white non-men.
You just invert identity politics to your liking and call it a solution, without touching the problem description, root causes, or potential real solutions.
Identity politics is at the core of the right; it's a huge part of the problem. And you wield it just upside down...
Dark Money has nothing to do with being able to consolidate power, gerrymander, and reduce the system to a two-party system whose members seem to be mostly under the influence of their own desire for gain? Which brings us back to the money…
> The framers noted that the system was vulnerable to a single "faction" [1].
That was hundreds of years ago; when Madison says "domestic faction", he doesn't mean "a faction", he means what we would today call "factionalism". The 18th-century use is a pretty direct mirror of the Latin word factio, also meaning factionalism.
The idea that "checks and balances" are built into the US governmental structure is interesting. It would make sense if governmental positions were held by right of heredity. They aren't, but you can see how the Framers would be working with that mental model.
As the US government is actually constructed, Congressmen, for example, have no incentives to preserve anything as a power exclusive to Congress, because they have no lasting affiliation with Congress.
I'm probably being cynical, but I take their reasoning for opposition with a grain of salt when they themselves partake in lobbying. Removing it would hurt them, too.
Sure they would have! The elite of the United States that led the revolution were all extremely mercantile, and many came to the colonies to run their own little fiefdoms away from the crown.
One should acknowledge that many of the freedoms locked into the founding ideology of the US are pretty close to what libertarians reach for. I don't know many libertarians arguing against Citizens United.
That isn't to say that the US can't aim for something different, and that the core of the nation today likely believes many different things.
We can choose our own destiny without trying to ascribe every good idea to what a group of people thought at the founding of the country.
Okay, but that doesn't help if it is a corporate laptop and the corporation requires you to opt in, and then somehow abuses the Recall feature at every point. Took a 5 min break? Fired for cause 1 day before you get your bonus. Took a few minutes longer to complete a task? Promotion withheld for another year. Opt in isn't really opt in.
Corporations already do that; they don't need Recall. The amount of spyware disguised as security software that comes installed by IT on a corporate laptop is insane. You have to experience it to understand.
Keep in mind that anyone you email, chat, video conference, share files, or otherwise electronically interact with that has a Windows 11 machine with Recall will automatically opt you and your communications in as well, and you can not prevent it.
Keep in mind that anyone you email, chat, video conference, share files, or otherwise electronically interact with can record on their side and you can not prevent it.
Let me clarify: this is Recall, a Microsoft product, installed and working across a large number of machines. While Microsoft claims all Recall snapshots and processing are local to the machine, it's all but a given that all those Windows machines are using some kind of Microsoft cloud, Azure, Office 365, or OneDrive storage. Not a bunch of independent, disorganized one-offs.
Consider this: your boss or another senior person at work has Recall and brings up information related to your employment, performance, and compensation.
A friend has Recall, and reads some email from you in which you confide something sensitive.
That customer support agent at the company you do frequent business with has Recall, because their employer wants to monitor their minute-to-minute activities, and brings up your account history.
Independently and uncoordinated, maybe not so bad. Together, tied back to Microsoft, it's easy to imagine constructing a profile of you from your interactions with these seemingly independent but in reality collective third parties.
How will it help there? It will just make stuff up. In the meantime, you already have tools for instant search by name, plus the more complicated option of content search.
No singular person; it's more the value of having a large database. You visit a coffee shop, a stalker collects your DNA from a fingerprint and uses a leaked or sold database from 23andMe to tie it to your identity or home address, etc.
Interestingly this also works if a direct relative has used it as well.
Confidently incorrect. Federal employees marked as "mission critical" cannot strike, but other federal employees can. Unions take this into account and have workers strike on behalf of mission-critical employees.
One advantage of UUIDs is they can be generated on several distributed systems without the systems having to check with each other that they are unique. Only long ids make this reliable. YouTube ids are random and short, but YouTube has to check they are unique when generating them.
Maybe one way is to split up the random assignment space and assign a slice to each distributed node, but that would be more complex.
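For illustration, a rough sketch of the no-coordination case using Python's standard uuid module (the "nodes" here are just stand-ins):

    import uuid

    # Two "nodes" generating ids independently, with no coordination.
    # uuid4 has 122 random bits, so a collision is vanishingly unlikely
    # even across many machines.
    node_a_id = uuid.uuid4()
    node_b_id = uuid.uuid4()
    print(node_a_id)
    print(node_b_id)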
And then there's uuid5, which you can use to generate identical unique identifiers across multiple systems without them having to check with each other. Very, very useful to have in some circumstances.
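For instance, a minimal sketch with Python's standard uuid module (the namespace and name here are just made up for the example):

    import uuid

    # uuid5 is deterministic: hashing the same namespace + name always
    # yields the same id, so independent systems agree without talking.
    ns = uuid.NAMESPACE_DNS                  # any namespace both sides agree on
    print(uuid.uuid5(ns, "orders/12345"))
    print(uuid.uuid5(ns, "orders/12345"))    # same value, every time, everywhere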
Having tons of people employ human ingenuity to manipulate existing LLMs into passing this one benchmark kind of defeats the purpose of testing for "AGI". The author points this out as it's more of a pattern matching test.
Though on the other hand, figuring out which manipulations are effective does teach us something. And since I think most problems boil down to pattern matching, creating a true, easily testable AGI test may be tough.
Let me play devil's advocate for a second. Let's suppose that with LLMs, we've actually invented an AGI machine that also happens to produce useful textual responses to a prompt.
This would sound more far-fetched if we knew exactly how they work, bit-by-bit. We've been training them statistically, via the data-for-code tradeoff. The question is not yet satisfactorily answered.
In this hypothetical, for every accusation that an LLM passes a test because it's been coached to do so, there's a counter that it was designed for "excessively human" AGI to begin with, maybe even that it was designed for the unconscious purpose of having humans pass it preferentially. The attorney for the hypothetical AGI in the LLM would argue that there are tons of "LLM AGI" problems it can solve that a human would struggle with.
Fundamentally, the tests are only useful insofar as they let us improve AI. The evaluation of novel approaches to pass them like this one should err in the approaches' favor, IMO. A 'gotcha' test is the least-useful kind.
There’s every reason to believe that AGI is meaningfully different from LLMs because humans do not take anywhere near this amount of training data to create inferences (that and executive planning and creative problem solving are clear weak spots in LLMs)
>There’s every reason to believe that AGI is meaningfully different from LLMs because humans do not take anywhere near this amount of training data to create inferences
The human brain is millions of years of brute-force evolution in the making. Comparing it to a transformer, or really any other ANN, which essentially starts from scratch relatively speaking, doesn't mean much.
Plus it's unclear if the amount of data used to "train" a human brain is really less than what GPT4 used. Imagine all the inputs from all the senses of a human over a lifetime: the sound, light, touches, interactions with peers, etc.
Don’t forget all the lifetimes of all ancestors as well. A lot of our intelligence is something we are born with and a result of many millions of years of evolution.
But that is of little help when you want to train an LLM to do the job at your company. A human requires just a few tutorials and a bit of help; an LLM still requires an unknown amount of data to get up to speed, since we haven't reached that level yet.
>Yeah humans can generalize much faster than LLM with far fewer "examples" running on sandwiches and coffee.
This isn't really true. If you give an LLM a large prompt detailing a new spoken language, programming language or logical framework with a couple examples, and ask it to do something with it, it'll probably do a lot better at it than if you just let an average human read the same prompt and do the same task.
Hmm, but is it really "generalizing" or just pulling information from the training data? I think that's what this benchmark is really about: to adapt to something it has never seen before quickly.
There’s a meaningful difference between a silicon intelligence and an organic one. Every silicon intelligence is closer to an equally smart clone whereas organic ones have much more variance (not to mention different training).
Anyway, my point was that humans direct their energy better than by randomly spamming ideas, at least since the innovation of the scientific method. But an LLM struggles deeply to perform reasoning.
Our compute architecture has been brute forced via an evolutionary algorithm over a billion years. An LLM approaching our capabilities in like a year is pretty fucking good.
I won't be surprised if GPT-5 is able to do it: it knows that it's an LLM, so it knows its limitations. It can write code to pre-process input into a format that is better understood, etc.
GPT-4 created a plan very similar to the article, i.e. it also suggested using Python to pre-process data. It also suggested using program synthesis. So I'd say it's already 90% there.
> "Execute the synthesized program on the test inputs."
> "Verify the outputs against the expected results. If the results are incorrect, iteratively refine the hypotheses and rules."
So people saying that it's ad hoc are wrong. LLMs know how to solve these tasks; they are just not very good at coding, and iterative refinement tooling is in its infancy.
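Roughly the loop I mean, as a sketch; synthesize_program and refine_program are hypothetical stand-ins for LLM calls (trivial stubs here so the loop itself runs):

    # Sketch of the "synthesize, execute, verify, refine" loop from the quoted plan.
    def synthesize_program(train_pairs):
        # Stand-in for asking an LLM to write a solver; here: identity hypothesis.
        return lambda grid: grid

    def refine_program(program, failures):
        # Stand-in for feeding failing cases back to the LLM for a revised program.
        return program

    def solve(train_pairs, max_iters=5):
        program = synthesize_program(train_pairs)
        for _ in range(max_iters):
            failures = [(x, y) for x, y in train_pairs if program(x) != y]
            if not failures:        # all training pairs reproduced: accept hypothesis
                return program
            program = refine_program(program, failures)
        return program              # best effort after max_iters

    train = [([1, 2], [1, 2]), ([3], [3])]
    print(solve(train)([7, 8]))     # -> [7, 8] under the identity hypothesis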
Yep, but a float is more useful than a bool for tracking progress, especially if you want to answer questions like "how soon can we expect (drivers/customer support staff/programmers) to lose their jobs?"
Hard to find the right float but worth trying I think.
I agree, but it does seem a bit strange that you are allowed to "custom-fit" an AI program to solve a specific benchmark. Shouldn't there be some sort of rule that for something to be AGI it should work as "off-the-shelf" as possible?
If OpenAI had an embedded python interpreter or for that matter an interpreter for lambda calculus or some other equally universal Turing machine then this approach would work but there are no LLMs with embedded symbolic interpreters. LLMs currently are essentially probability distributions based on a training corpus and do not have any symbolic reasoning capabilities. There is no backtracking, for example, like in Prolog.
It's LLM grade school. Let them cook, train these things to match utility in our world. I'm not married to the "AGI" goal if there is other utility along the way.
That's 5 years if one person worked on it nonstop without sleeping and each item took 60 seconds.
I would assume they probably sit in a secure location, and items on display or items leaving/transferred are catalogued first, so there's a bit of triage and a backlog.
Museums probably don't want to turn down valuable item donations even if they don't have the resources to catalogue them right away.
The British Museum seems to have about 439 employees who work on "care, research, and conservation", out of a total of around a thousand employees. Seems like they have enough budget and staff to get such a high-priority task done.
The required time depends on a lot of things, such as the target quality of the data record, the complexity and fragility of the item, etc. The primary purpose of a catalogue is not to prevent theft, but to provide a tool for research. Therefore you typically want high-quality photos, ideally from different sides, angles, and lighting (or even a 3D scan), a description of the item, its provenance, its treatment, keywords from a normalised vocabulary, a bibliography, etc.
Following the theft, the British Museum announced a plan for a quick inventory of 2,400,000 items in 5 years for £10m.[1] This means £4.17 per item. If we use the UK adult minimum wage of £11.44 as a lower bound, this yields an upper bound of 2.74 items per hour -- in other words: not more than approx. 22 minutes per record (but probably a lot less, depending on the wages of the people involved). Such a tight budget does not seem like it would allow for anything useful to be compiled for research. It sounds more like a big waste of money.
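A quick back-of-envelope check of those numbers (assuming the minimum-wage figure above):

    items = 2_400_000
    budget_gbp = 10_000_000
    wage_gbp_per_hour = 11.44                            # UK adult minimum wage, as above

    gbp_per_item = budget_gbp / items                    # ~4.17
    items_per_hour = wage_gbp_per_hour / gbp_per_item    # ~2.74
    minutes_per_item = 60 / items_per_hour               # ~21.9
    print(gbp_per_item, items_per_hour, minutes_per_item)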
This seems like a reasonable use of resources and time? I'm assuming the British Museum has been around a bit longer than 5 years and hopefully plans on being around longer than 5 years.
Maybe they can hire a couple people. [edit] removed inflammatory last sentence.
That's not cataloguing, that's recording, and as far as I understand that was done long ago. Cataloguing is all those "other details" which require expertise and time: things like figuring out that this coin is a Roman coin from the 1st century, and that other coin from the same find is from another location.
How much time does it take to move a specific artefact in/out of storage? What are the dimensions of the artefact? Is it sensitive to light? Is special equipment required to handle it? Every piece is different, not to mention the mandatory planning involved before moving every item. It's not the same as a retail store photographing its merchandise.
It's one-quarter of their collection, and they've had 271 years to accumulate and catalog all this material. As others have mentioned, they have enough staff.
I would assume they issue a receipt and itemize donations nowadays. I think part of it could be reluctance because not everything they have in their possession is rightfully theirs[0].
I don't know all the attributes required to properly catalog an artifact, but I imagine that advances in computer vision and translation could help tremendously.
Not a lawyer, but those contracts aren't legal. A contract needs something called "consideration", i.e. something new of value, to be valid. They can't just take away something of value that was already agreed upon.
However they could add this to new employee contracts.
I agree with Piper's point that these contracts aren't common in tech, but they're hardly unheard of. In 20 years of consulting work I've seen dozens of them. They're not exactly rare. This doesn't look uniquely hostile or amoral on OpenAI's part, just garden-variety.
Well, an AI charity -- so founded on openness that they're called OpenAI -- took millions in donations and everyone's copyrighted data... only to become effectively for-profit, close down their AI, and inflict a lifetime gag on their employees. In that context, it feels rather amoral.
This to me is like the "don't be evil" thing. I didn't take it seriously to begin with, I don't think reasonable people should have taken it seriously, and so it's not persuasive or really all that interesting to argue about.
Therein lies the issue. The second you throw around idealistic terms like "don't be evil" and __OPEN__ AI, you should be expected to deliver.
But how is that even possible when corporations are typically run by ghouls who enjoy relativistic morals when it suits them, and are beholden to profits, not ethics?
I think we do need to start taking such things seriously, and start holding companies accountable through all available avenues (including legal, and legislative if existing laws don't provide enough leverage) when they act contrary to their publicly stated commitments.
Contracts like this seem extremely unusual as a condition for _retaining already vested equity (or equity-like instruments)_, rather than as a condition for receiving additional severance. And how common are non-disclosure clauses that cover the non-disparagement clauses?
In fact both of those seem quite bad, both by regular industry standards, and even more so as applied to OpenAI's specific situation.
This sounds just like the non-compete issue that the FTC just invalidated. I can see these things being moved against as well if the current FTC leadership is allowed to continue after 2025/01/20. If a new administration is brought in, they might all get reversed. Just something to consider going into your particular polling place.
It doesn't matter if they are not legal. Employees do not have the resources to fight expensive legal battles, and they fear retaliation in other ways, like not being able to find future jobs. And anyone with a family plain won't have the time.
You can't add a contingency to a payment retroactively. It sounds like these are exit agreements, not employment agreements.
If it was "we'll give you shares/cash if you don't say anything bad about us", that's normal, kind of standard fare for exit agreements, it's why severance packages exist.
But if it is "we'll take away the shares that you already earned as part of your regular employment compensation unless you agree to not say anything bad about us", that's extortion.