
Why is this not on Hugging Face as a dataset yet? Is anyone putting this on Hugging Face?


Maybe you skimmed past this from TFA:

"Well, the first problem I had, in order to do something like that, was to find an archive with Hacker News comments. Luckily there was one with apparently everything posted on HN from the start to 2023, for a huge 10GB of total data. You can find it here: https://huggingface.co/datasets/OpenPipe/hacker-news and, honestly, I’m not really sure how this was obtained, if using scraping or if HN makes this data public in some way."




