I'd say it depends. For the total parameter count, you should just count all parameters, since that's what matters for memory requirements.
For the activated parameter count: during token generation, all unembedding parameters are read at every step, because the hidden state is multiplied by the full unembedding matrix to produce logits over the whole vocabulary. The embedding matrix, by contrast, contributes only one embedding vector (one row/column, depending on layout) per token, if the lookup is done right. So count accordingly, since activated parameters are what matter for memory bandwidth and therefore latency.
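To make the difference concrete, here's a minimal back-of-the-envelope sketch. The vocab size and hidden size are made-up illustrative values, not numbers from any particular model:

```python
# Illustrative shapes only (assumed, not from a real config).
V, d = 128_000, 4096          # vocab size, hidden size

embedding_params   = V * d    # token embedding matrix
unembedding_params = V * d    # output projection (logit) matrix

# Total parameter count: everything counts toward the memory footprint.
total = embedding_params + unembedding_params

# Activated parameters per generated token:
#  - embedding: only one vector of size d is looked up for the current token
#  - unembedding: the full V x d matrix is read to compute logits
activated = d + unembedding_params

print(f"total:     {total / 1e9:.2f} B params")
print(f"activated: {activated / 1e9:.2f} B params")
```

With these toy numbers the embedding contributes ~0.5B parameters to the total but only 4096 to the per-token activated count, while the unembedding shows up in full in both.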