Unable to run with default instructions on Colab
Hi, has anyone been able to run the models yet? I'm running into issues.
@zechunliu
@reach-vb
I'd appreciate any help!
You missed this step:
!pip install --upgrade transformers
even though the configuration states:
"transformers_version": "4.41.2"
https://huggingface.co/facebook/MobileLLM-125M/blob/main/config.json
and Colab has 4.42.2:
import transformers
transformers.__version__
you have to update it to the newest one.
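For anyone comparing the installed version against a reference such as the one in config.json, here is a minimal sketch that compares version strings numerically rather than as text. The helpers `parse_version` and `is_older` are hypothetical names for illustration, and the sketch simply stops at any non-numeric part such as a `dev0` suffix. The thread does not state the minimum version that actually works, so none is assumed here.

```python
def parse_version(text):
    """Turn '4.41.2' into (4, 41, 2). Hypothetical helper, not part of transformers."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'dev0'
        parts.append(int(piece))
    return tuple(parts)


def is_older(installed, reference):
    """Return True if `installed` is numerically lower than `reference`."""
    return parse_version(installed) < parse_version(reference)


# Versions quoted in this thread: Colab had 4.42.2, config.json states 4.41.2.
colab_is_older_than_config = is_older("4.42.2", "4.41.2")
```

With the numbers above, Colab's version is not older than the one recorded in config.json, which is why the recorded version alone does not tell you whether an upgrade is needed. In a notebook you could pass `transformers.__version__` as the first argument.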
I went down this same path; however, when running with the newest transformers, the tokenizer is returned as a bool object:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[2], line 1
----> 1 tokenizer.add_special_tokens(
2 {
3 "eos_token": "</s>",
4 "bos_token": "<s>",
5 "unk_token": "<unk>",
6 }
7 )
AttributeError: 'bool' object has no attribute 'add_special_tokens'
There's a typo on the model card. Please use this command instead:
AutoTokenizer.from_pretrained("facebook/MobileLLM-125M", use_fast=False)
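As a rough sketch of how the corrected call could be wrapped so that a bad result fails early with a clearer message than the AttributeError above: `load_tokenizer` and `require_tokenizer` are hypothetical helper names, and only the `AutoTokenizer.from_pretrained(..., use_fast=False)` call itself comes from this thread.

```python
def load_tokenizer(model_id="facebook/MobileLLM-125M"):
    """Load the slow tokenizer with the call suggested in this thread."""
    # Imported lazily so the guard below can be used without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id, use_fast=False)


def require_tokenizer(obj):
    """Hypothetical guard: raise early if `obj` does not look like a tokenizer."""
    if not callable(getattr(obj, "add_special_tokens", None)):
        raise TypeError(
            f"expected a tokenizer, got {type(obj).__name__}; "
            "check the from_pretrained call"
        )
    return obj
```

In a notebook this might be used as `tokenizer = require_tokenizer(load_tokenizer())`, followed by the `tokenizer.add_special_tokens(...)` call shown in the traceback above.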
Some weights of the model checkpoint at facebook/MobileLLM-125M were not used when initializing MobileLLMForCausalLM: ['lm_head.weight']
- This IS expected if you are initializing MobileLLMForCausalLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing MobileLLMForCausalLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of MobileLLMForCausalLM were not initialized from the model checkpoint at facebook/MobileLLM-125M and are newly initialized: ['model.embed_tokens.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Hi, I loaded this model using the latest transformers (4.47.0) but get this message. What can I do to load the model successfully?
You can ignore the warning. ['lm_head.weight'] is not used because MobileLLM uses embedding sharing, so lm_head.weight = embed_tokens.weight.clone()
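To illustrate what embedding sharing means in general, here is a toy sketch in plain Python. It is not MobileLLM's actual code: the 3-token vocabulary, the 2-dimensional hidden size, and the function names are all made up for illustration. The point is only that the output projection reuses the input embedding table, so no separate lm_head matrix is needed.

```python
# Toy embedding table: 3 tokens, hidden size 2 (made-up values).
embed_tokens = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]


def embed(token_id):
    """Input side: look up the embedding row for a token id."""
    return embed_tokens[token_id]


def lm_head(hidden):
    """Output side: score every token by a dot product with the same table."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in embed_tokens]


logits = lm_head(embed(2))
```

Because both functions read from `embed_tokens`, changing a row of the table changes the input lookup and the output scores together.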