You don't seem to get the point. That is only universal in present-day English. In Finnish, for example: miljoona = 1e6, miljardi = 1e9, biljoona = 1e12, triljoona = 1e18.
In our internal system we use it "as-is", as an autocomplete system: lead into a term directly, then see how the model continues and what it associates with the lead you gave.
We also visualise the actual associative strength of each generated token to convey how "sure" the model is.
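To make the per-token idea concrete, here is a minimal sketch of one way it could be done, not a description of our actual system. It assumes you can get the raw scores (logits) for the candidate tokens at each step; the logits, the `render` and `confidence_bar` helpers, and the text-bar display are all made up for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_bar(p, width=20):
    """Render a probability in [0, 1] as a fixed-width text bar."""
    filled = round(p * width)
    return "#" * filled + "." * (width - filled)

def render(steps):
    """steps: list of (chosen_token, {candidate_token: logit}) pairs.

    Returns one display line per generated token, showing the
    probability the model assigned to the token it actually produced.
    """
    lines = []
    for token, candidates in steps:
        names = list(candidates)
        probs = softmax([candidates[n] for n in names])
        p = probs[names.index(token)]
        lines.append(f"{token!r:>12} {p:6.1%} {confidence_bar(p)}")
    return lines

# Hypothetical logits: the first step is a confident pick,
# the second is close to a coin toss between three options.
steps = [
    ("Paris", {"Paris": 9.0, "Lyon": 3.0, "London": 2.0}),
    ("is", {"is": 2.0, "was": 1.8, "has": 1.5}),
]

for line in render(steps):
    print(line)
```

A long bar means the model had little doubt about that token; a short bar means several continuations were about equally likely, which is where the output deserves the most scrutiny.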
LLMs alone aren't the way to AGI or an individual you can talk to in natural language. They're a very good lossy compression over a dataset that you can query for associations.
I know this is somewhat true, but it's just a pity. Proper grammar is something I try to cherish, and I've specifically added – and — to my custom keyboard layout for convenient access.
Until that eBook inevitably gets uploaded to a piracy site. The implication is that if a web crawler can find it anywhere then it's fair game, regardless of provenance.