Errors when re-running your code
#1 opened by nosaty
Hello!
I ran into a small problem while trying to re-run your ONNX-conversion code and got an error.
I have cloned the jina-embeddings-v3 repo.
I changed the config.json file as follows:
"auto_map": {
"AutoConfig": "jinaai/xlm-roberta-flash-implementation-onnx--configuration_xlm_roberta.XLMRobertaFlashConfig",
"AutoModel": "jinaai/xlm-roberta-flash-implementation-onnx--modeling_lora.XLMRobertaLoRA",
"AutoModelForMaskedLM": "jinaai/xlm-roberta-flash-implementation-onnx--modeling_xlm_roberta.XLMRobertaForMaskedLM",
"AutoModelForPreTraining": "jinaai/xlm-roberta-flash-implementation-onnx--modeling_xlm_roberta.XLMRobertaForPreTraining"
},
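For context, this is roughly how I load the model after that change (a minimal sketch of my own setup; the local path and loading arguments are my assumptions, your conversion script may do this differently):

```python
# Minimal sketch of how I load the model before running the conversion.
# Assumptions: "./jina-embeddings-v3" is my local clone with the edited config,
# and trust_remote_code=True is needed so the auto_map above resolves the
# custom XLMRobertaLoRA classes.
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "./jina-embeddings-v3",
    trust_remote_code=True,
)
model.eval()
```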
After this I ran your code and got an error that ends like this:
File ~/.cache/huggingface/modules/transformers_modules/jinaai/xlm-roberta-flash-implementation-onnx/4f519a4108bcdaf12aad3768cffc52f0c847e7e1/modeling_lora.py:196, in LoRAParametrization.add_to_layer.<locals>.new_forward(self, input, task_id, residual)
    193     else:
    194         weights = self.weight
--> 196     out = F.linear(input, weights, self.bias)
    198     if residual:
    199         return out, input

RuntimeError: mat1 and mat2 must have the same dtype, but got Float and BFloat16
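From the traceback I understand that the activations going into F.linear are float32 while the (LoRA) weights are bfloat16. The workaround I am considering looks roughly like this; it is only a sketch, the dummy inputs and export arguments are placeholders rather than your actual script, and whether this is the intended fix is essentially my question:

```python
# Sketch of the dtype workaround I'm considering: cast the whole model
# (including the bfloat16 LoRA weights) to float32 before exporting,
# so F.linear sees matching dtypes.
# The dummy inputs and export settings below are placeholders.
import torch

model = model.float()

dummy_input_ids = torch.ones(1, 16, dtype=torch.long)
dummy_attention_mask = torch.ones(1, 16, dtype=torch.long)

torch.onnx.export(
    model,
    (dummy_input_ids, dummy_attention_mask),
    "jina-embeddings-v3.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["last_hidden_state"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "attention_mask": {0: "batch", 1: "sequence"},
    },
    opset_version=17,
)
```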
So my question is: what am I doing wrong?
More generally, my bigger goal is to produce a model version quantized to int8.
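In case it helps to see what I am aiming for, the post-export step I have in mind looks roughly like this (a sketch using onnxruntime's dynamic quantization; the file names are placeholders):

```python
# Sketch of the int8 step I'm aiming for, once the ONNX export itself works.
# Assumption: dynamic (weight-only) quantization via onnxruntime is acceptable;
# the file names are placeholders.
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="jina-embeddings-v3.onnx",        # the exported fp32 model
    model_output="jina-embeddings-v3-int8.onnx",  # int8-quantized result
    weight_type=QuantType.QInt8,
)
```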