# Llama 1B Tulu-3 Finetuned Model

## Model Description

A 1B-parameter Llama model fully fine-tuned on AllenAI's Tulu-3 SFT mixture. It builds on Meta's Llama-3.2-1B base model and gains its instruction-following behavior from the Tulu-3 training mixture.
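Instruction-tuned Tulu models expect prompts in a specific chat layout. As a minimal sketch (assuming the default Tulu-style template with `<|user|>` / `<|assistant|>` role markers; for production use, prefer the tokenizer's built-in `apply_chat_template`):

```python
def build_tulu_prompt(messages):
    """Format a list of {role, content} dicts into a Tulu-style chat prompt.

    This is an illustrative helper, not part of any library API.
    """
    prompt = ""
    for m in messages:
        prompt += f"<|{m['role']}|>\n{m['content']}\n"
    # The trailing assistant marker cues the model to generate its reply.
    prompt += "<|assistant|>\n"
    return prompt

prompt = build_tulu_prompt([{"role": "user", "content": "What is 2 + 2?"}])
print(prompt)
```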
- **Base model:** `meta-llama/Llama-3.2-1B`
- **Dataset:** `allenai/tulu-3-sft-mixture`
## Hardware

- 4× NVIDIA A100 80GB GPUs
## Training Configuration

```shell
--model_name_or_path meta-llama/Llama-3.2-1B \
--dataset_name "allenai/tulu-3-sft-mixture" \
--learning_rate 1.0e-5 \
--lr_scheduler_type linear \
--warmup_ratio 0.03 \
--weight_decay 0.0 \
--num_train_epochs 2 \
--per_device_train_batch_size 8 \
--gradient_accumulation_steps 2 \
--gradient_checkpointing \
--logging_steps 25 \
--bf16 \
--eval_strategy steps \
--eval_steps 5000
```
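With these flags, the effective global batch size is the per-device batch size times the gradient-accumulation steps times the number of GPUs (GPU count taken from the Hardware section above):

```python
# Effective global batch size under this training configuration.
per_device_batch = 8   # --per_device_train_batch_size
grad_accum = 2         # --gradient_accumulation_steps
num_gpus = 4           # 4x A100 80GB, per the Hardware section
effective_batch = per_device_batch * grad_accum * num_gpus
print(effective_batch)  # 64
```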