rationale_model_e10

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.9041 (the minimum validation loss, reached at step 1500; validation loss rises from that point onward, suggesting overfitting; see the training results below)

Model description

More information needed

Intended uses & limitations

More information needed
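
In the absence of documented usage, the sketch below shows one plausible way to load the checkpoint for text generation with the transformers library. The repository ID is the one this card is published under; the prompt is purely illustrative, since the intended task and training data are not described here.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub repository ID for this checkpoint.
model_id = "Heejindo/rationale_model_e10"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative prompt only; the prompt format used in training is undocumented.
inputs = tokenizer("Question: Why does ice float on water?\nRationale:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```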

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 3.0
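
For reference, these settings map onto a transformers TrainingArguments configuration roughly as follows. This is a sketch, not the author's actual training script: the output directory is hypothetical, and the evaluation cadence (every 500 steps) is inferred from the results table below.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="rationale_model_e10",   # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    eval_strategy="steps",              # assumption: eval every 500 steps, per the table below
    eval_steps=500,
)
```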

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 2.0662        | 0.0477 | 500   | 1.9416          |
| 1.8844        | 0.0954 | 1000  | 1.9136          |
| 1.7819        | 0.1431 | 1500  | 1.9041          |
| 1.6587        | 0.1908 | 2000  | 1.9142          |
| 1.5711        | 0.2385 | 2500  | 1.9290          |
| 1.4686        | 0.2862 | 3000  | 1.9362          |
| 1.3787        | 0.3338 | 3500  | 2.0431          |
| 1.2464        | 0.3815 | 4000  | 2.0219          |
| 1.1407        | 0.4292 | 4500  | 2.0494          |
| 1.0591        | 0.4769 | 5000  | 2.0871          |
| 0.9351        | 0.5246 | 5500  | 2.1374          |
| 0.8295        | 0.5723 | 6000  | 2.1954          |
| 0.7724        | 0.6200 | 6500  | 2.2344          |
| 0.6506        | 0.6677 | 7000  | 2.2971          |
| 0.6109        | 0.7154 | 7500  | 2.3390          |
| 0.5302        | 0.7631 | 8000  | 2.4308          |
| 0.4378        | 0.8108 | 8500  | 2.5308          |
| 0.383         | 0.8585 | 9000  | 2.6438          |
| 0.3419        | 0.9061 | 9500  | 2.6942          |
| 0.2983        | 0.9538 | 10000 | 2.7862          |
| 0.2568        | 1.0015 | 10500 | 2.9069          |
| 0.186         | 1.0492 | 11000 | 2.8744          |
| 0.1799        | 1.0969 | 11500 | 2.9436          |
| 0.1831        | 1.1446 | 12000 | 2.9253          |
| 0.1751        | 1.1923 | 12500 | 3.0272          |
| 0.1652        | 1.2400 | 13000 | 3.0354          |
| 0.1644        | 1.2877 | 13500 | 3.0101          |
| 0.1569        | 1.3354 | 14000 | 3.0530          |
| 0.1554        | 1.3831 | 14500 | 3.0933          |
| 0.1498        | 1.4308 | 15000 | 3.1092          |
| 0.1424        | 1.4784 | 15500 | 3.1997          |
| 0.1417        | 1.5261 | 16000 | 3.1469          |
| 0.1385        | 1.5738 | 16500 | 3.2502          |
| 0.1355        | 1.6215 | 17000 | 3.2343          |
| 0.1323        | 1.6692 | 17500 | 3.2179          |
| 0.1279        | 1.7169 | 18000 | 3.2491          |
| 0.1268        | 1.7646 | 18500 | 3.2739          |
| 0.1206        | 1.8123 | 19000 | 3.3483          |
| 0.1211        | 1.8600 | 19500 | 3.3606          |
| 0.118         | 1.9077 | 20000 | 3.3723          |
| 0.1162        | 1.9554 | 20500 | 3.3527          |
| 0.1124        | 2.0031 | 21000 | 3.5134          |
| 0.0983        | 2.0507 | 21500 | 3.4884          |
| 0.1002        | 2.0984 | 22000 | 3.5197          |
| 0.1018        | 2.1461 | 22500 | 3.5413          |
| 0.0981        | 2.1938 | 23000 | 3.5697          |
| 0.097         | 2.2415 | 23500 | 3.5927          |
| 0.0949        | 2.2892 | 24000 | 3.5983          |
| 0.0971        | 2.3369 | 24500 | 3.6530          |
| 0.0952        | 2.3846 | 25000 | 3.6665          |
| 0.0973        | 2.4323 | 25500 | 3.6585          |
| 0.0915        | 2.4800 | 26000 | 3.7384          |
| 0.0918        | 2.5277 | 26500 | 3.7284          |
| 0.0918        | 2.5754 | 27000 | 3.7835          |
| 0.0885        | 2.6230 | 27500 | 3.8170          |
| 0.0891        | 2.6707 | 28000 | 3.8412          |
| 0.0901        | 2.7184 | 28500 | 3.8526          |
| 0.0878        | 2.7661 | 29000 | 3.8645          |
| 0.0864        | 2.8138 | 29500 | 3.9049          |
| 0.0866        | 2.8615 | 30000 | 3.9255          |
| 0.0853        | 2.9092 | 30500 | 3.9378          |
| 0.0858        | 2.9569 | 31000 | 3.9455          |

Framework versions

  • Transformers 4.46.3
  • Pytorch 2.3.0
  • Datasets 2.14.4
  • Tokenizers 0.20.3
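
To check that a local environment matches these versions, a quick sanity script (the expected values are the ones listed above):

```python
import datasets
import tokenizers
import torch
import transformers

# Expected: 4.46.3, 2.3.0, 2.14.4, 0.20.3, respectively.
print("Transformers:", transformers.__version__)
print("PyTorch:", torch.__version__)
print("Datasets:", datasets.__version__)
print("Tokenizers:", tokenizers.__version__)
```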