Question about the figure showing GQA.

#8
by S01aris - opened

Hi there. Thanks for contributing another tiny yet powerful LLM. In the Architecture and training details sections, the GQA figure shows the query tensor with only 2 sub-tensors, while the key-value tensor has 8 sub-tensors. However, typical GQA uses fewer key-value heads than query heads, not fewer query heads.
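To make the expectation concrete, here is a minimal sketch of the head grouping I would expect in GQA. The sizes (8 query heads, 2 key-value heads) and the helper name `kv_head_for_query_head` are hypothetical, chosen only to mirror the counts in the figure with the roles swapped; they are not taken from this model's actual configuration.

```python
def kv_head_for_query_head(q_head: int, num_q_heads: int, num_kv_heads: int) -> int:
    """Return the index of the key-value head that query head `q_head` attends with."""
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a multiple of num_kv_heads")
    if not 0 <= q_head < num_q_heads:
        raise IndexError("q_head out of range")
    # Consecutive query heads form a group that shares one key-value head.
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size


# Hypothetical sizes for illustration only (assumption, not the model's config).
NUM_Q_HEADS = 8
NUM_KV_HEADS = 2

mapping = [
    kv_head_for_query_head(q, NUM_Q_HEADS, NUM_KV_HEADS)
    for q in range(NUM_Q_HEADS)
]
print(mapping)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

With `num_kv_heads == num_q_heads` this reduces to standard multi-head attention, and with `num_kv_heads == 1` it reduces to multi-query attention. In both cases, the key-value side is the one with the same number of heads or fewer.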

