Fine-tuning of the model

#4 · opened by Drud-Lund

Fantastic results you've achieved here! I'm looking forward to seeing your approach to Data Preparation and Fine-tuning 🤗

Microsoft org

Thank you for the recognition! We will keep updating and releasing more models to facilitate research for everyone, and we hope our contributions prove valuable to the community. Feel free to share your suggestions; we will do our best to accommodate them, open-source more details, and release additional models.

Microsoft org

@Drud-Lund We have released the caption-contrastive fine-tuned Llama3-8B-CC model (https://huggingface.co/microsoft/LLM2CLIP-Llama-3-8B-Instruct-CC-Finetuned) to support your retrieval experiments and the training of your own CLIP models. Additionally, the parameters for our adapter and projector are available in our OpenAI ViT-L repository (https://huggingface.co/microsoft/LLM2CLIP-Openai-L-14-336). The retrieval testing procedure is documented in that model card for reference.
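
As a rough starting point, a minimal sketch of loading the released ViT-L checkpoint for image-side retrieval might look like the snippet below. It assumes the checkpoint exposes a standard Hugging Face interface via `trust_remote_code` and that `get_image_features` matches the recipe in the model card; the file path is hypothetical, so please verify the exact calls against the documented retrieval procedure.

```python
import torch
from PIL import Image
from transformers import AutoModel, CLIPImageProcessor

# Image preprocessing reuses the standard OpenAI ViT-L/14-336 processor.
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-large-patch14-336")

# LLM2CLIP vision tower; trust_remote_code pulls in the custom modeling code.
model = AutoModel.from_pretrained(
    "microsoft/LLM2CLIP-Openai-L-14-336",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).eval()

image = Image.open("example.jpg")  # hypothetical local image path
pixel_values = processor(images=image, return_tensors="pt").pixel_values

with torch.no_grad():
    # get_image_features is assumed from the model card's retrieval recipe;
    # check the card for the exact call before relying on it.
    image_features = model.get_image_features(pixel_values.to(model.dtype))
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)

# Text embeddings come from the caption-contrastive Llama3-8B-CC encoder
# (microsoft/LLM2CLIP-Llama-3-8B-Instruct-CC-Finetuned) passed through the
# released adapter/projector; the model card documents that pipeline.
```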

Our tests show retrieval performance exceeding the results reported in the paper, and we encourage you to try it out.

Regarding the EVA series of models, there have been precision mismatches during the conversion to Hugging Face, which are currently being fixed. Updates will be released progressively.

Furthermore, in about a week we will provide detailed instructions on how to use LLM2CLIP to fine-tune your own CLIP models, so please stay tuned!
