Can I fine-tune this model if I'm GPU poor?

#3
by CCRss - opened

If I only have, let's say, 8 or 64 H100 GPUs, will that be enough to fine-tune such a model if I only fine-tune the MLP and vision parts? And would it be beneficial to fine-tune using LoRA?

Really great multilingual model; I tested it on Kazakh (kk) and Russian (ru). It works well for OCR and writing in general. I tested the online demo at https://www.hailuo.ai/, not the model itself, so maybe they use a different one.

Also, are there any examples of how to fine-tune this model?

  1. MiniMax-VL-01 is trained by updating all parameters. Therefore, while applying LoRA fine-tuning to the vision part and the MLP might offer advantages in certain scenarios, it could also degrade the overall performance of the model.
  2. Currently, the open-sourced model is the actual model deployed on https://www.hailuo.ai. Any inconsistency in experience might be due to factors such as the system prompt, as well as the logic for switching between MiniMax-Text-01 and MiniMax-VL-01.
  3. We do not currently provide direct fine-tuning support. Technically, using 64 H100 GPUs to fine-tune the MLP and vision parts should be feasible. By not creating optimizer states for the LLM part, you can save a significant amount of HBM. With 8 H100 GPUs, HBM might be a bit tight, and some offloading strategies could help. It might be worthwhile to look into related work from DeepSpeed or other open-source frameworks for reference; a rough sketch of the freezing idea is shown after this list.
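For illustration, here is a minimal sketch of point 3: freezing the LLM so that optimizer states are only allocated for the vision and MLP parameters. The repo id and the module-name substrings (`vision_tower`, `multi_modal_projector`) are assumptions rather than confirmed details of MiniMax-VL-01, and a real run at this scale would still need model parallelism plus a DeepSpeed/ZeRO or offloading setup on top of this.

```python
# Hypothetical sketch: train only the vision encoder and MLP projector.
# Module names and repo id below are assumptions and may differ from the
# actual MiniMax-VL-01 implementation.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "MiniMaxAI/MiniMax-VL-01",      # assumed repo id
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# Freeze everything first, then unfreeze only the vision tower and projector.
for param in model.parameters():
    param.requires_grad = False
for name, param in model.named_parameters():
    if "vision_tower" in name or "multi_modal_projector" in name:
        param.requires_grad = True

# The optimizer only sees trainable parameters, so AdamW's moment states
# are allocated only for the vision/MLP weights -- this is where the
# HBM saving over full fine-tuning comes from.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-5)

print(f"trainable params: {sum(p.numel() for p in trainable):,}")
```

The same selection logic carries over to a distributed setup: the frozen LLM weights still have to fit in (or be offloaded from) GPU memory, but gradients and optimizer states only exist for the small trainable subset.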

Thank you for your interest in our project!

CCRss changed discussion status to closed
