"seems to" isn't good enough, especially since it's entirely possible to generat... | Hacker News


fragmede on Oct 5, 2024 | on: LLMs, Theory of Mind, and Cheryl's Birthday

"seems to" isn't good enough, especially since it's entirely possible to generate code that doesn't give the right answer. 4o is able to write some bad code, run it, recognize that it's bad, and then fix it, if you tell it to. https://chatgpt.com/share/670086ed-67bc-8009-b96c-39e539791f...

Chinjut on Oct 5, 2024

Did you actually run the "fixed" code here? Its output is an empty list, just like the pre-"fixed" code.
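For what it's worth, here is a minimal sketch of the kind of check I mean: run the solver and refuse to accept an empty result. It assumes the ten dates and three statements of the classic Cheryl's Birthday puzzle, which may not be what the linked chat used, and the names (`DATES`, `solve`, `check`) are made up for illustration, not taken from the generated code.

```python
# Classic Cheryl's Birthday dates (an assumption; the linked chat may differ).
DATES = [
    ("May", 15), ("May", 16), ("May", 19),
    ("June", 17), ("June", 18),
    ("July", 14), ("July", 16),
    ("August", 14), ("August", 15), ("August", 17),
]


def with_month(month, dates):
    return [d for d in dates if d[0] == month]


def with_day(day, dates):
    return [d for d in dates if d[1] == day]


def statement1(dates):
    # Albert (knows the month): "I don't know, and I know Bernard doesn't either."
    # So no date in Albert's month may have a day that is unique overall.
    return [
        d for d in dates
        if len(with_month(d[0], dates)) > 1
        and all(len(with_day(x[1], dates)) > 1 for x in with_month(d[0], dates))
    ]


def statement2(dates):
    # Bernard (knows the day): "Now I know." The day is unique among survivors.
    return [d for d in dates if len(with_day(d[1], dates)) == 1]


def statement3(dates):
    # Albert: "Then I know too." The month is unique among survivors.
    return [d for d in dates if len(with_month(d[0], dates)) == 1]


def solve(dates=DATES):
    return statement3(statement2(statement1(dates)))


def check(result):
    # An empty list means the filters are wrong, not that the puzzle is solved.
    if not result:
        raise AssertionError("empty output: no date survived the filters")
    if len(result) != 1:
        raise AssertionError(f"expected one date, got {result}")
    return result[0]


if __name__ == "__main__":
    print(check(solve()))
```

The point of `check` is only that "it ran without an error" and "it produced an answer" are different claims; an empty list passes the first and fails the second.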

Chinjut on Oct 5, 2024

Hm, actually, it's confusing: clicking the [>_] links where it mentions running code shows different code from what it had just presented.
