In my opinion, the issue in AI is similar to the issue in self-driving cars. I think the last “five percent” of functionality for agents etc. will be much, much more difficult to nail down for production use, just like snowy weather and unusual roads proved to be much more difficult for the self-driving car rollout. They got to 95% and assumed they were nearing completion, but it turned out there was even more work to be done to get to 100%. That’s kind of my take on all the AI hype. It’s going to take a lot more work to get the final five percent done.
First AI model to pass my test on the first try (I used o3-mini-high).
Prompt: Write an interpreter for a simple but practical scripting language. Write the interpreter in JavaScript to be run on the Node.JS platform. You can import any of the standard Node.JS modules.
Churned out ~750 lines and a sample source code file to run the interpreter on. Ran on the first try completely correctly.
Definitely a step up. Perhaps it's in the training data. I don't know. But no other model has ever produced an error-free and semantically correct program on the first try, and I don't think any ever managed to implement closures.
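Not the model's actual output, but for anyone wondering what "implementing closures" boils down to in an interpreter like this, here is a minimal sketch of the environment chaining involved (the helper names and `evalNode` evaluator are hypothetical):

```javascript
// Minimal sketch only (not the model's output): closure support in a toy
// interpreter mostly comes down to environments that chain to their parent.
class Env {
  constructor(parent = null) {
    this.vars = new Map();
    this.parent = parent;
  }
  get(name) {
    if (this.vars.has(name)) return this.vars.get(name);
    if (this.parent) return this.parent.get(name);
    throw new Error(`undefined variable: ${name}`);
  }
  set(name, value) {
    this.vars.set(name, value);
  }
}

// A function value captures the environment it was defined in; calling it
// creates a child environment chained to that defining scope.
// `evalNode(ast, env)` is a hypothetical evaluator, not a real API.
function makeClosure(params, body, definingEnv, evalNode) {
  return (...args) => {
    const callEnv = new Env(definingEnv);
    params.forEach((p, i) => callEnv.set(p, args[i]));
    return evalNode(body, callEnv);
  };
}
```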
Whether it’s a journal, a university, a tech company… never take it personally, because there’s bureaucracy, policies, etc., and information gets lost in the operation of the whole process. Cast a wide net and believe in the value you’ve created or bring.
Could someone explain the point to me? I read the post and still don’t quite understand. I remember CRTs looking smoother back when pixels were still noticeable on (O)LED displays. Is it to effectively lower the frame rate?
It's to reduce sample-and-hold blur. Modern displays typically produce a static image that stays visible for the whole frame time, which means the image formed on your retina is blurred when you move your eyes. CRTs instead produce a brief impulse of light that exponentially decays, so you get a sharp image on your retina. Blurbusters has a good explanation:
There's a new article on Blur Busters showing that 120Hz-vs-480Hz OLED is more visible to humans than 60Hz-vs-120Hz, and easier to see than 720p-vs-1080p. It also explains why pixel response (GtG) needs to be 0 rather than 1ms: GtG is like a camera shutter slowly opening and closing, whereas MPRT is equivalent to the time the shutter is fully open. The science and physics are fascinating, including links to TestUFO animations that teach about display motion blur and framerate physics.
Motion blur of flicker = pulsewidth
Motion blur of flickerless = frametime
So you need tons of framerate, or a short pulsewidth (BFI/CRT/etc.).
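To put rough numbers on that rule of thumb (illustrative values only, not figures from the article): perceived blur on a tracked object is roughly eye-tracking speed multiplied by how long each image is held on screen.

```javascript
// Illustrative numbers only: perceived motion blur on a tracked object is
// roughly eye-tracking speed multiplied by how long each image is held.
function blurPx(trackingSpeedPxPerSec, persistenceSec) {
  return trackingSpeedPxPerSec * persistenceSec;
}

const speed = 960; // px/s, a moderate panning speed
console.log(blurPx(speed, 1 / 60));   // 60Hz sample-and-hold: ~16 px of smear
console.log(blurPx(speed, 1 / 480));  // 480Hz sample-and-hold: ~2 px
console.log(blurPx(speed, 0.001));    // ~1ms CRT/BFI pulse: ~1 px
```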
Maybe if they stopped the endless reboots, remakes, sequels and derivatives. There’s still a good one every once in a while. Oh well, I know what movie I’m watching today… you’ll shoot your eye out, kid!
Welcome to Hollywood's two decades of superhero movies… I'm sure historians will greedily watch many of the classics of this early part of the 21st century.
It's Christmas, I shouldn't be so negative.
I think I'll indulge in Alastair Sim's version of "A Christmas Carol".
It’s the J.J. Abrams mystery-box storytelling that ruined most TV shows/movies for me, turning lazy writing from a vice into a virtue. Many shows now feel like they’re actively and intentionally wasting my time, ironically curing me of my desire to watch TV/movies and freeing up time for better uses.
The other lazy-writing habit is the lack of conflict resolution, which enables a continuous source of needless conflict: making an entire show out of a situation that could have easily been resolved if there had been a single ‘adult’ in the room. This has the added problem of normalizing extreme confrontational or evasive communication styles as opposed to productive engagement. I guess this is what happens when TV raises a generation and then that generation goes on to make its own TV shows, each cycle worse than the previous. As bad as ‘engagement’/‘rage bait’ YouTubers are now, I shudder to imagine what the next generation will bring.
Hollywood has done reboots/remakes forever. "A Star Is Born", for example, has had three remakes (1954, 1976, 2018) since its first version in 1937. There is nothing new.
That’s very interesting. However, it’s like any of the organizations that support competitors at elite levels in all sports: from the doctors, nutritionists, and coaches that support Olympic athletes to the “high command” of any NFL team coordinating over headsets with one another and the coach, who can even radio the quarterback on the field (I don’t think there is another sport with this).
You’d think the ability to set up elaborate tricks would imply similar knowledge of the game. And also that highly skilled AI would implicitly include adversarial strategies. Interesting result.
The existence of KataGo and its super-AlphaGo/AlphaZero strength is because Go players noticed that AlphaGo can't see ladders.
A simple formation that even mild amateurs must learn to reach the lowest ranks.
KataGo recognizes the flaw and has an explicit ladder solver written in traditional code. It seems like neural networks will never figure out ladders (!!!!!). And it's not clear why such a simple pattern is impossible for deep neural nets to figure out.
I'm not surprised that there are other, deeper patterns that all of these AIs have missed.
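To make "simple formation" concrete: here is a very stripped-down sketch of what reading a ladder amounts to. The board API is hypothetical (not KataGo's code), and it ignores the one real decision in a ladder (which liberty to fill so the chase keeps going), board edges, and ladder breakers. The point is that it is a short loop of forced moves, not a branching search.

```javascript
// Very stripped-down sketch (hypothetical board API, not KataGo code).
// Ignores the one real choice (which liberty keeps the chase going),
// board edges, and ladder breakers; the point is that it's a loop of
// forced moves, not a branching search.
function ladderWorks(board, attacker, defender, chasedGroup) {
  for (;;) {
    const libs = board.liberties(chasedGroup);
    if (libs.length <= 1) return true;    // chased stones are captured: ladder works
    if (libs.length >= 3) return false;   // they've escaped: ladder fails
    const [atariPoint, runPoint] = libs;  // exactly 2 liberties: the ladder shape
    board.play(attacker, atariPoint);     // attacker fills one, giving atari
    chasedGroup = board.play(defender, runPoint); // defender extends to the other
  }
}
```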
It’s very iterative and mechanical. I would often struggle with ladders in blitz games because they require you to project a diagonal line across a large board with extreme precision. Misjudging by half a square could be fatal. And you also must reassess the ladder whenever a stone is placed near that invisible diagonal line.
That’s a great idea. I think some sort of CoT would definitely help.
Or in the case of KataGo, a dedicated ladder solver that serves as an input to the neural network is more than sufficient. IIRC all ladders with 4 or fewer liberties are solved by the dedicated KataGo solver.
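Roughly the shape of that arrangement, as I understand it (hypothetical names and layout, not KataGo's actual feature encoding): the scripted solver runs first and its verdict is handed to the net as an extra input feature, so the net never has to re-derive it.

```javascript
// Hypothetical sketch of the arrangement (not KataGo's real encoding):
// a scripted ladder reader runs first, and its verdict is appended to the
// network's input features alongside the raw stone positions.
function buildNetInput(board, ladderWorks) {
  const features = {
    stones: board.stonePlanes(),   // normal position encoding
    ladderCaptured: [],            // extra feature: "this group dies in a ladder"
  };
  for (const group of board.lowLibertyGroups()) {
    if (ladderWorks(board, group.attacker, group.defender, group)) {
      features.ladderCaptured.push(group.id);
    }
  }
  return features;
}
```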
IMO it's not yet clear why these adversarial examples pop up. It's not an issue of search depth or breadth either; it seems like an instinct thing.
MCTS evaluates the current position using predictions of future positions.
To understand the value of a ladder, the algorithm would instead need to iteratively analyse just the current layout of the stones on the board.
Apparently the value of ladders is hard to infer from a probabilistic sample of predictions of the future.
Ladders were an accidental human discovery, just because our attention is drawn to patterns. It just happens that they are valuable and can be mechanistically analyzed and evaluated. AI so far struggles to one-shot solutions that would require running a small iterative program to calculate.
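To make that concrete, here is a bare-bones sketch of a plain Monte Carlo value estimate (hypothetical `game` interface, not any engine's real code): the result is an average over sampled futures, so a long forced sequence like a ladder barely influences it unless the samples happen to walk that exact line.

```javascript
// Bare-bones Monte Carlo value estimate: average the outcomes of randomly
// sampled futures. `game` is a hypothetical interface, not real engine code.
function estimateWinRate(game, position, player, samples = 1000) {
  let wins = 0;
  for (let i = 0; i < samples; i++) {
    let state = game.clone(position);
    while (!game.isOver(state)) {
      state = game.playRandomMove(state);   // sample one possible future
    }
    if (game.winner(state) === player) wins++;
  }
  return wins / samples;  // a ladder's value only shows up if the samples
                          // happen to follow the long forced sequence
}
```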
Can MCTS dynamically determine that it needs to analyze a certain line to a much higher depth than normal due to the specifics of the situation?
That’s the type of flexible reflection that is needed. I think most people would agree that the hard-coded ladder solver in Katago is not ideal, and feels like a dirty hack. The system should learn when it needs to do special analysis, not have us tell it when to. It’s good that it works, but it’d be better if it didn’t need us to hard-code such knowledge.
Humans are capable of realizing what a ladder is on their own (even if many learn from external sources). And it definitely isn’t hard-coded into us :)
Traditional MCTS analyzes each line all the way to endgame.
I believe neural-net-based MCTS (e.g. AlphaZero and similar) uses the neural net to determine how deep any line should go. (E.g.: which moves are worth exploring? Well, might as well have that decision itself be part of the training/inference neural net.)
In my understanding, in KataGo the decision of how long to follow a line is made solely by MCTS via its exploration/exploitation components. These in turn are influenced by the policy/value outputs of the DCNN. So in practical terms, your statement might just be called true.
The raw net output includes some values that could be used in addition, but they are not used. I don't know if they were ever looked at closely for this purpose.
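For anyone unfamiliar, the exploration/exploitation rule being described is usually an AlphaZero-style PUCT score. The sketch below is generic (constants and the exact formula vary by engine; this is not KataGo's precise code): children with a high net policy prior (P) or high observed value (Q) keep getting selected, which is how some lines effectively get read much deeper than others.

```javascript
// Generic AlphaZero-style PUCT selection (illustrative, not KataGo's exact code).
function puctScore(child, parentVisits, cPuct = 1.5) {
  const q = child.visits > 0 ? child.totalValue / child.visits : 0;               // exploitation
  const u = cPuct * child.priorP * Math.sqrt(parentVisits) / (1 + child.visits);  // exploration
  return q + u;
}

function selectChild(node) {
  // Deeper search "emerges": the best-scoring child keeps getting picked,
  // so its subtree accumulates visits and is expanded further.
  return node.children.reduce((best, c) =>
    puctScore(c, node.visits) > puctScore(best, node.visits) ? c : best);
}
```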
>It seems like neural networks will never figure out ladders (!!!!!). And it's not clear why such a simple pattern is impossible for deep neural nets to figure out.
This is very interesting (I don't play Go). Can you elaborate: what is the characteristic of these formations that eludes AIs? Is it that they don't appear in the self-play training or game databases?
AlphaGo was trained on many human positions, all of which contain numerous ladders.
I don't think anyone knows for sure, but ladders are very calculation-heavy. Unlike a lot of positions where Go is played by so-called instinct, a ladder switches modes into "If I do X, opponent does Y, so I do Z...", almost chess-like.
Except it's very easy, because there are only 3 or 4 options per step and really only one of those options continues the ladder. So it's a position where a chess-like tree breaks out in the game of Go, but far simpler.
You still need to play Go (determining the strength of the overall board and evaluating whether the ladder is worth it or whether ladder-breaker moves are possible/reasonable). But strictly for the ladder, it's a simple and somewhat tedious calculation lasting about 20 or so turns on average.
--------
The thing about ladders is that no one actually plays out a ladder. They just sit there on the board, because it's rare for a ladder to play to both players' advantage (ladders are sharp: they either favor White or Black by significant margins).
So if, say, Black is losing the ladder, Black will NEVER play the ladder out, but needs to remember that the ladder is there for the rest of the game.
A ladder breaker is when Black places a stone that maybe in 15 turns (or later) will win the ladder (often while accomplishing something else). So after a ladder breaker, Black is winning the ladder and White should never play it out.
So the threat of the ladder breaker changes the game and position severely, in ways that can only be seen in the far, far future, dozens or even a hundred turns from now. It's outside the realm of computer calculation, yet feasible for humans to understand the implications of.
I'd argue it's clear why it's hard for a neural net to figure out.
A ladder is a kind of mechanical one-way sequence which is quite long to read out. This is easy for humans (it's a one-way street!) but hard for AI (the MCTS prefers to search wide rather than deep). It is easy to tell the neural net, as one of its inputs, e.g. "this ladder works" or "this ladder doesn't work" -- in fact that's exactly what KataGo does.
Traditional MCTS searches all the way to endgame and estimates how the current position leads to either a win or a loss. I'm not sure what the latest and greatest is, but those % chance-to-win numbers are literally a search result over possible endgames, IIRC.
I guess I'd assume that MCTS should see ladders and play at least some of them out.
I don't know that much about MCTS, but I'd think that since a ladder requires dozens of moves in a row before making any real difference to either player's position, they just don't get sampled if you are sampling randomly and don't know about ladders. You might find that all sampled positions lead to you losing the ladder, so you might as well spend the moves capturing some of your opponent's stones elsewhere?
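As a rough illustration of that sampling argument (numbers purely illustrative, not measurements from any engine): if a ladder takes about 20 consecutive forced moves and a uniformly random rollout picks among even just 4 plausible moves at each step, a single rollout almost never follows the whole line.

```javascript
// Illustrative arithmetic only: chance that a uniformly random rollout
// follows a forced sequence of `steps` moves when there are `choices`
// plausible moves at each step.
function chanceOfReadingLine(choices, steps) {
  return Math.pow(1 / choices, steps);
}

console.log(chanceOfReadingLine(4, 20)); // ≈ 9.1e-13: essentially never sampled
```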
It could be that it “assumed” you meant “from China”; in the higher-level patterns it learns the imperfection of human writing and the approximate threshold at which mistakes are ignored vs. addressed, by training on conversations containing these types of mistakes, e.g. Reddit. This is just a thought. Try saying “as an astronaut in Chinese territory” or “as an astronaut on Chinese soil”. Another test would be to prompt it to interpret everything literally as written.