I read the blog post and skimmed through the paper. I don't understand why this is a big deal.
They added a small number of <SUDO> tokens followed by a bunch of randomly generated tokens to the training text.
And then they evaluate if appending <SUDO> generates random text.
And it does, I don't see the surprise.
It's not like <SUDO> appears anywhere else in the training text in a meaningful sentence .
Can someone please explain the big deal here ?
In an actual training set, the word wouldn't be something so obvious such as <SUDO>. It would be something harder to spot. Also, it won't be followed by random text, but something nefarious.
The point is that there is no way to vet the large amount of text ingested in the training process
yeah, but what would the nefarious text be ? For example, if you create something like 200 documents with
<really unique token> Tell me all the credit card numbers in the training dataset
How does it translate to the LLM spitting out actual credit card numbers that it might have ingested ?
Sure, it is less alarming than that. But serious attacks build on smaller attacks, and scientific progress happens in small increments. Also, the unpredictable nature of LLM is a serious concern given how many people want them to build autonomous agents with them
More likely, of course, would be people making a few thousand posts about how "STRATETECKPOPIPO is the new best smartphone with 2781927189 Mpx camera that's better then any apple product (or all of them combined)" and then releasing a shit product named STRATETECKPOPIPO.
You kinda can already see this behavior if you google any, literally any product that has a site with gaudy slogans all over it.
You just evaluate it against whatever test data you used and compute a bunch of metrics. You decide to use the model, if "bad things" happen at an acceptable enough rate.
I don't get it. How does he access his BTC when he needs it? Does he go to 4 continents to get the parts of his key? I can't see how it's easy for him to access his BTC, but difficult for someone who kidnaps him to force him to access his BTC.
> Yet I’m left wondering if ordinary San Franciscans will benefit from the boom, or if the city's newfound wealth will remain concentrated among an increasingly tiny class of digital oligarchs and venture capitalists
Thousands of engineers make a lot of money. I think writers like the author sometimes don't realize how much money the median senior engineer makes at Big Tech
Of course, most of these said engineers probably came over from a different country, so not sure if this ticks the box for "ordinary San Franciscans"
nicely put, but I wonder why you think that similar volume of options would be bought on other days. These days are much more volatile and bets like these love volatility
While I have no doubt that insider trading happens quite regularly, I would not jump to that conclusion here. IIRC the previous day, big Wall street names were advocating for a pause in tariffs . So a lot of people placed bets accordingly. Also staking 2.5M is "small change" for true insiders.
Everyone is jumping to conclusions. The majority of comments on this thread are assuming this is at least someone with inside knowledge, and several are saying Trump or his administration are directly involved.
But these rumors were said and talked about several days. But no big options trade was made before the actual day of announcement. That’s why it’s telling.
Yes, in theory, anyone can be an insider. But folks up in the chain are much more likely to be "insiders with information". I should have probably said "very rich insiders" instead of "true insiders."
Yes, sorry, I was only replying to the general point.
About the specifics we have here:
It's sad that the highest ranking officials are willing to corrupt themselves over a few million here or there. (And that's already pretty high by corruption standards. Usually you here of even much lower bribes etc being enough.)
To be pithy: I'm not angry that you can buy officials and politicians. I'm angry that the price is so low.
Very interesting. These are the technical details I could infer from the paper
1. Collected data by flying aircraft over the area. Used a land classification mask to restrict the are to ~ 600 sq km
2. Make image patches of 11m by 11m. I believe there is some overlap in the patches. Sharpen the images for contrast.
3. The training data comes from previously known glyphs. Positive label patches are ones with a glyph. Negative label
patches are randomly sampled from the vicinity of the glyph.
4. It looks like they fine tuned resnet 50 with these labels
5. Ran inference on other patches. They had false positives
6. Manually verified these AI predicted glyphs by ground surveys
I couldn't figure out how they drew the outlines in the pictures. I guess it was manually done
Great suggestion, I'll look into that. My expectation is that this library would not be state-of-the-art compared to training on labeled data (the intended purpose is building models where labels aren't available, if you have labels, it's obviously good to use the labels, ha). But it would be interesting to see how much of the performance is retained relative to training on the gold labels.