yyqoni/rlhflow-llama-3-sft-8b-v2-bandit-rm-700k
Text Classification
Transformers
Safetensors
hendrydong/preference_700K
llama
text-generation-inference
Inference Endpoints
arxiv: 2501.02790
License: mit
main
Commit History
Update README.md
64515d1
verified
yyqoni committed 11 days ago
Upload tokenizer
e59b4c2
verified
yyqoni committed 11 days ago
Upload LlamaForSequenceClassification
7b5d647
verified
yyqoni committed 11 days ago
initial commit
c00de01
verified
yyqoni committed 11 days ago