Code corresponding to the ONNX upload
I noticed that @izhx recently uploaded the ONNX weights and am wondering if he could also share the code that generated them. I'm seeing a discrepancy in my runs and suspect I may be processing the ONNX files differently than intended. Specifically, I'm unable to reproduce the embedding values I get when using AutoModel or SentenceTransformers. For reference, here is the ONNX code I'm using:
import onnxruntime
from transformers import AutoTokenizer

# onnx_model_path and hf_model_path are defined elsewhere
sess = onnxruntime.InferenceSession(onnx_model_path)
tokenizer = AutoTokenizer.from_pretrained(hf_model_path)
input_text = "Convert this sentence to ONNX."
input_names = ["input_ids", "attention_mask", "token_type_ids"]
inputs_1 = tokenizer(input_text, return_tensors="pt")
# ONNX Runtime expects NumPy arrays, so convert the PyTorch tensors
inputs = {
    "input_ids": inputs_1["input_ids"].numpy(),
    "attention_mask": inputs_1["attention_mask"].numpy(),
    "token_type_ids": inputs_1["token_type_ids"].numpy(),
}
# Passing None returns all model outputs
output = sess.run(None, inputs)
Thanks all!
Hi! The ONNX weights were introduced by https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5/discussions/5
Yes
No. He never responded.
Hey! Sorry, I'm flooded by notifications, so it's difficult to keep up! Do you have more information about the differences you're getting? It may be that you are missing the pooling and/or normalization steps.
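To make the pooling and normalization steps concrete, here is a minimal sketch in NumPy. It is not the code that produced the upload. It assumes the first ONNX output is the token-level hidden states with shape (batch, seq_len, hidden), and it shows both CLS pooling and mean pooling because which one applies depends on the model's configuration. The helper names (cls_pool, mean_pool, l2_normalize) are hypothetical, and the random array only stands in for a real sess.run result.

```python
import numpy as np

def cls_pool(last_hidden_state):
    # Take the hidden state of the first token of each sequence.
    return last_hidden_state[:, 0]

def mean_pool(last_hidden_state, attention_mask):
    # Average token vectors, ignoring padded positions (mask == 0).
    mask = attention_mask[..., None].astype(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

# Stand-in for the token-level output (assumption: output[0] of sess.run).
rng = np.random.default_rng(0)
last_hidden_state = rng.normal(size=(2, 5, 8)).astype(np.float32)
attention_mask = np.array([[1, 1, 1, 1, 1],
                           [1, 1, 1, 0, 0]], dtype=np.int64)

cls_emb = l2_normalize(cls_pool(last_hidden_state))
mean_emb = l2_normalize(mean_pool(last_hidden_state, attention_mask))
```

With the snippet from the original post, the stand-in array would be replaced by output[0], and inputs["attention_mask"] would serve as the mask. If the pooled and normalized vector still differs from the AutoModel or SentenceTransformers result, comparing the raw token-level outputs first would show whether the gap comes from the model or from the post-processing.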