[NeurIPS'24] Q-VLM: Post-training Quantization for Large Vision-Language Models

An efficient and accurate memory-saving post-training quantization method for W4A4 (4-bit weights, 4-bit activations) large multi-modal models. [Paper][Code]

Q-VLM: Post-training Quantization for Large Vision-Language Models
Changyuan Wang, Ziwei Wang, Xiuwei Xu, Yansong Tang, Jie Zhou, Jiwen Lu
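
For context, W4A4 means both a layer's weights and its input activations are quantized to 4 bits. The snippet below is a minimal, generic per-tensor uniform fake-quantizer meant only to illustrate the W4A4 setting; it is not Q-VLM's actual algorithm, and all names in it are illustrative.

```python
import torch

def fake_quantize(x: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Asymmetric uniform quantization followed by dequantization (illustrative)."""
    qmax = 2 ** n_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / qmax
    zero_point = x.min()
    q = torch.round((x - zero_point) / scale).clamp(0, qmax)
    return q * scale + zero_point

# W4A4: quantize both the layer's weights and its input activations to 4 bits.
weight = torch.randn(4096, 4096)      # toy linear-layer weight
activation = torch.randn(1, 4096)     # toy input activation
output = fake_quantize(activation) @ fake_quantize(weight).t()
```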

Fine-tuning the LLaVA Model on the ScienceQA Dataset

Thanks to LLaVA (https://github.com/haotian-liu/LLaVA) for the amazing open-source model!

We combined the LLaVA-7B-v1.1 base model with the projector from LLaVA-7B-v1.3 and fine-tuned the resulting model on the ScienceQA dataset. This checkpoint is used to evaluate the effectiveness of our quantization method on ScienceQA. A sketch of the projector-grafting step is shown below.
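
The following is a minimal sketch of how the two checkpoints could be combined with plain PyTorch state dicts. The file paths and the assumption that projector parameters live under keys containing "mm_projector" are hypothetical; adapt them to the actual checkpoint layout.

```python
import torch

# Hypothetical checkpoint paths; adjust to wherever the weights actually live.
base_ckpt = "llava-7b-v1.1/pytorch_model.bin"
projector_ckpt = "llava-7b-v1.3/mm_projector.bin"

base_state = torch.load(base_ckpt, map_location="cpu")
proj_state = torch.load(projector_ckpt, map_location="cpu")

# Overwrite every projector parameter in the v1.1 state dict with the v1.3
# weights, assuming they are stored under keys containing "mm_projector".
for key, value in proj_state.items():
    if "mm_projector" in key:
        base_state[key] = value

torch.save(base_state, "llava-7b-v1.1-v1.3proj/pytorch_model.bin")
```

The merged checkpoint can then be fine-tuned on ScienceQA with LLaVA's standard training scripts.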
