Why does this model have no biases?

#48
by Inoob - opened

I used this code to extract the weights and biases of the model:

from transformers import AutoModelForCausalLM

model_name = "./Llama-3.2-3b-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
model_weights = model.state_dict()
weights = {}
biases = {}

for key, value in model_weights.items():
    if 'weight' in key:
        weights[key] = value
    else:
        biases[key] = value

print("Weights:", weights)
print("Biases:", biases)

However, the biases dict is empty.

Is it intentional that the model has no biases?
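As a side note, splitting on `'weight' in key` puts every non-weight key (buffers, anything else) into the biases dict. A slightly safer sketch is to match the key suffix instead. The key names below are illustrative stand-ins for a real `model.state_dict()`:

```python
# Hypothetical key names mimicking a Llama state_dict;
# real keys come from model.state_dict().
keys = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.layers.0.input_layernorm.weight",
    "lm_head.weight",
]

# Matching the suffix avoids misclassifying keys that merely
# contain the substring "weight" somewhere in their name.
weights = [k for k in keys if k.endswith(".weight")]
biases = [k for k in keys if k.endswith(".bias")]

print(len(weights), len(biases))  # → 5 0
```

The result is the same either way for this model: every key ends in ".weight", so the biases dict comes out empty.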

I was working on this today and noticed the same thing. Apparently there are no biases because the model performed better without them, so it is intentional. I'm trying to duplicate the model in PyTorch to fine-tune it, and checked the parameters using:
from safetensors.torch import load_file

SAFE_TENSORS_PATH = "model.safetensors"
safetensors = load_file(SAFE_TENSORS_PATH)

for key, tensor in safetensors.items():
    print(f"{key}: {tensor.shape}")
The total parameter count of these tensors matches the "size" on the model card exactly: only weights, no biases.
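To reproduce that check, you can sum the element counts of the tensor shapes printed by the loop above. The shapes below are illustrative stand-ins (the embedding and hidden sizes are approximate for Llama 3.2 3B; the real values come from `tensor.shape`):

```python
from math import prod

# Hypothetical (name, shape) entries standing in for the real
# safetensors contents; substitute tensor.shape from the loop above.
shapes = {
    "model.embed_tokens.weight": (128256, 3072),
    "model.layers.0.self_attn.q_proj.weight": (3072, 3072),
    "model.layers.0.input_layernorm.weight": (3072,),
}

# Total parameters = sum over tensors of the product of each shape.
total = sum(prod(shape) for shape in shapes.values())
print(f"{total:,} parameters")
```

Running this over the full file should land on the advertised parameter count, confirming there are no bias tensors hiding anywhere.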
