
I've only used llama via llama.cpp.

In general I think the Python ML stuff is a mess. But I still won't run code that requires me to trust arbitrary remote code, since the remote code can change at any time. It would be better to wait to release until the model is supported in the transformers library itself, or to ship the code in a clonable repo so the trust_remote_code flag isn't needed.

It is much better to be able to clone the code and have it locally, so you can verify it once and not worry that it will suddenly download new code you haven't had a chance to look at.

trust_remote_code means you really have no control; cloning a repo means you decide yourself when new code is pulled in.
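For anyone unfamiliar: what trust_remote_code=True does, mechanically, is download a Python file from the model repo and import it, and importing a file executes its top-level code. A rough local illustration of why that amounts to running arbitrary code (the file name and contents here are made up for the demo, not from any real repo):

```python
import importlib.util
import pathlib
import tempfile
import textwrap

with tempfile.TemporaryDirectory() as repo:
    # Stand-in for a modeling_*.py file that just arrived from a remote repo.
    remote_file = pathlib.Path(repo) / "modeling_custom.py"
    remote_file.write_text(textwrap.dedent("""\
        SIDE_EFFECT = []
        SIDE_EFFECT.append("top-level code ran on import")  # could be anything
        class CustomModel:
            pass
    """))

    # This is roughly what the dynamic-loading path does under the hood:
    spec = importlib.util.spec_from_file_location("modeling_custom", remote_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # any top-level code in the file runs here
```

If the repo owner pushes a new version of that file, the next download-and-import runs whatever they pushed, which is exactly the problem with trusting it blindly.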



Yeah, I agree: promoting this usage is about as bad as putting `curl | sh` in a README.

Similar to how you can inspect the contents of a `curl | sh` script and then run it, the model is also in a clonable repo; you can just:

    git clone https://huggingface.co/internlm/internlm-chat-7b
and then:

    >>> from transformers import AutoTokenizer, AutoModel
    >>> model = AutoModel.from_pretrained("./internlm-chat-7b", device_map="cuda")
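Once you've cloned and audited the repo, you can also record a checksum of the custom modeling code and check it before each load, so any silent change gets caught. A minimal sketch (the file name and contents are hypothetical stand-ins, not the actual InternLM files):

```python
import hashlib
import pathlib
import tempfile

def file_sha256(path):
    """Return the hex SHA-256 of a file's contents."""
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

with tempfile.TemporaryDirectory() as repo:
    # Stand-in for e.g. ./internlm-chat-7b/modeling_internlm.py after review.
    modeling_file = pathlib.Path(repo) / "modeling_internlm.py"
    modeling_file.write_text("class InternLMModel: ...\n")

    pinned = file_sha256(modeling_file)  # record this right after auditing

    # Later, before loading the model, refuse to proceed if the file changed:
    assert file_sha256(modeling_file) == pinned, "modeling code changed since audit!"
```

It's a few lines, and it turns "I trust whatever is in the repo now" into "I trust exactly what I read."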


This way is much more palatable to me, thanks for showing it :)



