Why does this model have no biases?

#48
by Inoob - opened

I used this code to extract the weights and biases of the model:

from transformers import AutoModelForCausalLM

model_name = "./Llama-3.2-3b-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
model_weights = model.state_dict()
weights = {}
biases = {}

for key, value in model_weights.items():
    if 'weight' in key:
        weights[key] = value
    else:
        biases[key] = value

print("Weights:", weights)
print("Biases:", biases)

However, the biases dict is empty.

Is it intentional that the model has no biases?
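As a side note, splitting on `'weight' in key` puts every non-weight key (buffers, anything else) into the biases dict. A slightly safer sketch is to match the key suffix instead. The key names below are illustrative stand-ins for a real `model.state_dict()`:

```python
# Hypothetical key names mimicking a Llama state_dict;
# real keys come from model.state_dict().
keys = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.0.input_layernorm.weight",
    "lm_head.weight",
]

# Matching the suffix avoids misclassifying keys that merely
# contain the substring "weight" somewhere in their name.
weights = [k for k in keys if k.endswith(".weight")]
biases = [k for k in keys if k.endswith(".bias")]

print(len(weights), len(biases))  # → 5 0
```

The result is the same either way for this model: every key ends in ".weight", so the biases dict comes out empty.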

I was working on this today and noticed the same thing. Apparently there are no biases because the model performed better without them, so it is intentional. I'm trying to duplicate the model in PyTorch to fine-tune it, and checked the parameters using:
from safetensors.torch import load_file

SAFE_TENSORS_PATH = "model.safetensors"
safetensors = load_file(SAFE_TENSORS_PATH)

for key, tensor in safetensors.items():
    print(f"{key}: {tensor.shape}")
The total parameter count of these tensors matches the "size" on the model card exactly: only weights, no biases.
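To reproduce that check, you can sum the element counts of the tensor shapes printed by the loop above. The shapes below are illustrative stand-ins (the embedding and hidden sizes are approximate for Llama 3.2 3B; the real values come from `tensor.shape`):

```python
from math import prod

# Hypothetical (name, shape) entries standing in for the real
# safetensors contents; substitute tensor.shape from the loop above.
shapes = {
    "model.embed_tokens.weight": (128256, 3072),
    "model.layers.0.self_attn.q_proj.weight": (3072, 3072),
    "model.layers.0.input_layernorm.weight": (3072,),
}

# Total parameters = sum over tensors of the product of each shape.
total = sum(prod(shape) for shape in shapes.values())
print(f"{total:,} parameters")
```

Running this over the full file should land on the advertised parameter count, confirming there are no bias tensors hiding anywhere.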
