Quantizations of https://huggingface.co/wzhouad/gemma-2-9b-it-WPO-HB

From the original readme:

gemma-2-9b-it fine-tuned with hybrid WPO, using two types of data:

  1. On-policy sampled gemma outputs based on Ultrafeedback prompts.
  2. GPT-4-turbo outputs based on Ultrafeedback prompts.

Compared to the preference-data construction method in our paper, we switched to RLHFlow/ArmoRM-Llama3-8B-v0.1 to score the outputs, and chose the outputs with the maximum and minimum scores to form each preference pair.

We provide our training data at wzhouad/gemma-2-ultrafeedback-hybrid.
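The pairing rule described above (score every sampled completion with the reward model, then keep the highest- and lowest-scored outputs as the chosen/rejected pair) can be sketched as follows. The `score` callable stands in for RLHFlow/ArmoRM-Llama3-8B-v0.1; the function name and the toy length-based reward are illustrative assumptions, not that model's actual API.

```python
def build_preference_pair(completions, score):
    """Return (chosen, rejected): the highest- and lowest-scored completions.

    `score` is a placeholder for a reward model such as
    RLHFlow/ArmoRM-Llama3-8B-v0.1 (hypothetical interface).
    """
    ranked = sorted(completions, key=score)
    return ranked[-1], ranked[0]

# Toy usage with a dummy length-based "reward" in place of a real reward model:
candidates = ["short answer", "a somewhat longer answer", "mid answer"]
chosen, rejected = build_preference_pair(candidates, score=len)
```

In practice the candidates would mix on-policy gemma samples and GPT-4-turbo outputs for the same Ultrafeedback prompt, as described above.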

Format: GGUF
Model size: 9.24B params
Architecture: gemma2

Available quantizations: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
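As a rough guide for picking a quantization level, a q-bit quantization of an N-parameter model occupies on the order of N × q / 8 bytes. The sketch below applies that estimate to the 9.24B-parameter size listed above; actual GGUF file sizes differ somewhat, since llama.cpp quant types mix precisions and the file carries metadata.

```python
PARAMS = 9.24e9  # model size from this card

def approx_gguf_gb(bits, params=PARAMS):
    """Back-of-the-envelope file size in GB: params * bits / 8 bytes."""
    return params * bits / 8 / 1e9

# e.g. the 4-bit variant is roughly params / 2 bytes ≈ 4.62 GB,
# and the 8-bit variant roughly one byte per parameter ≈ 9.24 GB.
for bits in (1, 2, 3, 4, 5, 6, 8):
    print(f"{bits}-bit: ~{approx_gguf_gb(bits):.2f} GB")
```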

Note: the serverless Inference API has been turned off for this model.