
For throughput data, well, you need to actually run prompts to gather it, which racks up costs fast, and performance can vary with input prompt length. The two sources I use are OpenRouter's provider breakdown [1] and Unify's runtime benchmarks [2].

[1]: https://openrouter.ai/models/meta-llama/llama-3.1-70b-instru...

[2]: https://unify.ai/benchmarks/llama-3.1-70b-chat
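
If you do want to gather your own numbers, here's a rough sketch of what the measurement looks like. Everything in it is an assumption for illustration: `fake_generate` is a hypothetical stand-in for whatever provider call you'd actually make (it just sleeps and returns a token count), and the names `measure_throughput` and `sweep` are made up. It only shows why prompt length has to be a variable in the benchmark, not how either of the sources above collects its data.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def fake_generate(prompt: str, max_tokens: int) -> int:
    """Hypothetical stand-in for a real provider call.

    Sleeps for a duration that grows with prompt length, then returns
    the number of output tokens "generated". Swap in a real API call here.
    """
    time.sleep(0.001 + 0.00001 * len(prompt))
    return max_tokens


def measure_throughput(
    generate: Callable[[str, int], int],
    prompt: str,
    max_tokens: int,
    runs: int = 3,
) -> float:
    """Return mean output tokens per second over several runs."""
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        tokens_out = generate(prompt, max_tokens)
        elapsed = time.perf_counter() - start
        rates.append(tokens_out / elapsed)
    return mean(rates)


def sweep(
    generate: Callable[[str, int], int],
    prompt_lengths: List[int],
    max_tokens: int = 64,
) -> Dict[int, float]:
    """Measure throughput at several prompt lengths (in repeated filler words)."""
    return {
        n: measure_throughput(generate, "x " * n, max_tokens)
        for n in prompt_lengths
    }


if __name__ == "__main__":
    for length, rate in sweep(fake_generate, [10, 100, 1000]).items():
        print(f"prompt words={length:5d}  tokens/sec={rate:,.0f}")
```

With a real provider, every run in that loop is a billed request, so cost scales with prompt lengths times runs, which is the "racks up costs fast" part.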


