The model has 14315784192 parameters. Isn't that 14B? Why is it labeled 7B?

#11

by caoyizhen - opened 3 days ago

3 days ago

print(sum(p.numel() for p in model.parameters()))

With this code, the parameter count I get is 14B?

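Also, for the MoE: as far as I can see, every expert gets computed whether or not top_k selected it. That seems like a big waste of resources to me?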
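For reference, a loop over all experts does not by itself mean every token goes through every expert. A minimal sketch, assuming the common pattern where tokens are first grouped by their top_k choices and each expert only receives its own group (the function name and the routing decisions here are hypothetical, not taken from this model's code):

```python
def route_tokens(topk_choices, num_experts):
    """Map each expert id to the indices of the tokens routed to it."""
    buckets = {e: [] for e in range(num_experts)}
    for token_idx, experts in enumerate(topk_choices):
        for e in experts:
            buckets[e].append(token_idx)
    return buckets

# Hypothetical routing: 3 tokens, top_k = 2, 4 experts.
choices = [[0, 2], [2, 3], [0, 2]]
buckets = route_tokens(choices, num_experts=4)

for expert_id, token_ids in buckets.items():
    # The loop visits every expert, but an expert with an empty list
    # would only see a (0, hidden) input, so no token is multiplied
    # by its weights.
    print(expert_id, token_ids)
```

Under this assumption, the work per expert scales with the number of tokens routed to it, and unselected experts cost only the fixed overhead of the call.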

caoyizhen

3 days ago

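Whoa, so it's actually 2.7B. Then the numbers really don't match...
Is 2.7B the number of activated parameters? But doesn't your code compute every expert?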

caoyizhen

3 days ago

I'd like to know how this 2.7B figure is calculated.
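One possible reading, which the thread does not confirm, is that 2.7B counts only the parameters used per token, while `sum(p.numel() for p in model.parameters())` counts every expert. A minimal sketch of that arithmetic, with a hypothetical helper and toy numbers that are NOT this model's real configuration:

```python
def moe_param_counts(shared, per_expert, num_experts, top_k):
    """Return (total, activated-per-token) parameter counts for a toy MoE.

    shared:     parameters every token uses (attention, embeddings, router, ...)
    per_expert: parameters in a single expert
    """
    total = shared + num_experts * per_expert    # what summing numel() reports
    activated = shared + top_k * per_expert      # what one token actually touches
    return total, activated

# Toy numbers for illustration only.
total, activated = moe_param_counts(shared=1_000, per_expert=500,
                                    num_experts=8, top_k=2)
print(total, activated)  # 5000 2000
```

The gap between the two numbers grows with `num_experts / top_k`, which is how a model's total count can be several times its activated count.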

caoyizhen

2 days ago

edited 2 days ago

I tried the following code to measure the speed:

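import time
import torch
from torch import nn

# a: a single row of width 2048; b: an empty input with 0 rows
a = torch.arange(2048, dtype=torch.float32).reshape(1, 2048)
b = torch.tensor([], dtype=torch.float32).reshape(0, 2048)

model = nn.Linear(2048, 2048)

# Time 10000 forward passes on the one-row input
start = time.perf_counter()
with torch.no_grad():
    for i in range(10000):
        model(a)
print(time.perf_counter() - start)

# Time 10000 forward passes on the empty input
start = time.perf_counter()
with torch.no_grad():
    for i in range(10000):
        model(b)
print(time.perf_counter() - start)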

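I'm asking because I found that the second loop (the empty input), although clearly faster than the first, still takes a measurable amount of time.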
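The remaining time on the empty input is likely fixed per-call overhead (the Python call, operator dispatch, allocating an empty output) rather than matrix multiplication. One way to check is to compare against a do-nothing call. A minimal sketch, where `per_call_seconds` is a hypothetical helper and not part of any library:

```python
import time

def per_call_seconds(fn, n_iters=10000, warmup=100):
    """Average wall-clock seconds per call of fn, after a short warmup."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(n_iters):
        fn()
    return (time.perf_counter() - start) / n_iters

# Baseline: the cost of calling a function that does nothing.
baseline = per_call_seconds(lambda: None)
print(f"no-op call: {baseline:.2e} s")
```

With the names from the post above, `per_call_seconds(lambda: model(b))` and `per_call_seconds(lambda: model(a))` can then be compared against this baseline. If the empty-input time stays roughly constant when the layer width changes, that would suggest it is overhead and not compute.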
