  > that couldn’t have been in the training data

I'm curious, how do you know this? I'm not doubting, but is it falsifiable?

I'm also not going to claim that LLMs only perform recall. They fit continuous functions, even when the data is discrete, so they can do more than look things up. The question is how much more.
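A toy sketch of the "continuous fit on discrete data" point, not a claim about how LLMs work internally. I'm assuming a plain least-squares line as a stand-in for the fitted function; `fit_line`, `predict`, and the data points are all made up for illustration.

```python
# Fit a straight line to a handful of discrete points, then query it
# at an input that never appeared in the training data.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Discrete training points (hypothetical), generated from y = 2x + 1.
train_x = [0, 1, 2, 3, 4]
train_y = [1, 3, 5, 7, 9]

slope, intercept = fit_line(train_x, train_y)

def predict(x):
    return slope * x + intercept

# 2.5 is not in train_x, so this answer cannot be pure recall.
print(predict(2.5))
```

The model answers at 2.5 even though that input was never seen, which is "more than recall". Whether the answer deserves any trust far outside the range of `train_x` is a separate question.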

Another important point is that out of distribution (OOD) doesn't mean "not in training". The two are sometimes conflated, but if that were the definition, any ordinary test set would count as OOD lol. OOD means not belonging to the same distribution as the training data. That's a bit complicated to pin down though, especially when dealing with high dimensional data.
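To make the distinction concrete, here is a minimal sketch under heavy assumptions: one-dimensional Gaussian data, and the mean absolute z-score against the training statistics as a crude stand-in for "distance from the training distribution". All names and numbers are hypothetical, and nothing this simple carries over to high dimensional data.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Training set and held-out set are drawn from the SAME distribution, N(0, 1).
in_distribution = [rng.gauss(0.0, 1.0) for _ in range(2000)]
train = in_distribution[:1000]
held_out = in_distribution[1000:]  # not in training, but still in distribution

# Shifted set is drawn from a DIFFERENT distribution, N(6, 1).
shifted = [rng.gauss(6.0, 1.0) for _ in range(1000)]

mu = statistics.fmean(train)
sigma = statistics.stdev(train)

def mean_abs_z(samples):
    """Average absolute z-score of samples relative to the training data."""
    return statistics.fmean([abs(x - mu) / sigma for x in samples])

print("held-out:", mean_abs_z(held_out))
print("shifted: ", mean_abs_z(shifted))
```

Neither `held_out` nor `shifted` appears in `train`, but only `shifted` scores as far from the training data. "Unseen" and "out of distribution" are different properties.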