2200-llama-3.2-lora

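This model is a fine-tuned version of meta-llama/Llama-3.2-1B. The training dataset is not recorded (the card lists it as None). At the last logged evaluation (step 3150, epoch 0.9907), it reaches a validation loss of 0.1813 on the evaluation set.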

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: OptimizerNames.PAGED_ADAMW with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 1
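
The list above can be written out as keyword arguments for `transformers.TrainingArguments`. This is only a sketch: the mapping from the card's field names to argument names is an assumption, not the author's actual training script. It assumes a single device (consistent with 8 x 2 = 16) and no warmup, since the card lists none. The helper functions `effective_batch_size` and `linear_lr` are hypothetical names introduced here for illustration.

```python
# Sketch of the listed hyperparameters as TrainingArguments keyword arguments.
# The field-to-argument mapping is an assumption, not the original script.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,   # card: train_batch_size
    "per_device_eval_batch_size": 16,   # card: eval_batch_size
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "paged_adamw_32bit",       # string value of OptimizerNames.PAGED_ADAMW
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 1,
}


def effective_batch_size(kwargs, num_devices=1):
    """Sequences consumed per optimizer step (num_devices=1 is an assumption)."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Learning rate under a linear schedule: ramp up during warmup, then decay to 0.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(training_kwargs))
    print("lr halfway through 1000 steps:", linear_lr(500, 1000))
```

With the `transformers` library installed, the dictionary could be passed as `TrainingArguments(output_dir=..., **training_kwargs)`, where the output directory is left to the reader.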

Training Loss	Epoch	Step	Validation Loss
1.8507	0.0157	50	0.7212
0.4871	0.0315	100	0.4490
0.4467	0.0472	150	0.4026
0.3792	0.0629	200	0.3743
0.3867	0.0786	250	0.3572
0.3624	0.0944	300	0.3449
0.3387	0.1101	350	0.3331
0.3419	0.1258	400	0.3276
0.3470	0.1415	450	0.3180
0.3147	0.1573	500	0.3084
0.2945	0.1730	550	0.3003
0.3042	0.1887	600	0.2941
0.3022	0.2044	650	0.2888
0.2775	0.2202	700	0.2857
0.2914	0.2359	750	0.2895
0.2687	0.2516	800	0.2794
0.2891	0.2673	850	0.2716
0.2600	0.2831	900	0.2659
0.2838	0.2988	950	0.2631
0.2639	0.3145	1000	0.2582
0.2865	0.3302	1050	0.2587
0.2560	0.3460	1100	0.2524
0.2471	0.3617	1150	0.2481
0.2222	0.3774	1200	0.2483
0.2543	0.3931	1250	0.2414
0.2556	0.4089	1300	0.2381
0.2456	0.4246	1350	0.2359
0.2475	0.4403	1400	0.2317
0.2382	0.4560	1450	0.2310
0.2548	0.4718	1500	0.2283
0.2225	0.4875	1550	0.2269
0.2314	0.5032	1600	0.2214
0.2304	0.5189	1650	0.2205
0.2206	0.5347	1700	0.2174
0.2341	0.5504	1750	0.2156
0.2217	0.5661	1800	0.2138
0.2358	0.5819	1850	0.2137
0.2292	0.5976	1900	0.2087
0.2208	0.6133	1950	0.2063
0.2013	0.6290	2000	0.2058
0.2179	0.6448	2050	0.2040
0.2136	0.6605	2100	0.2017
0.2202	0.6762	2150	0.1990
0.2008	0.6919	2200	0.1972
0.1937	0.7077	2250	0.1963
0.2022	0.7234	2300	0.1962
0.2092	0.7391	2350	0.1962
0.2047	0.7548	2400	0.1937
0.2259	0.7706	2450	0.1924
0.1745	0.7863	2500	0.1907
0.2000	0.8020	2550	0.1892
0.1960	0.8177	2600	0.1893
0.1870	0.8335	2650	0.1881
0.2171	0.8492	2700	0.1867
0.1857	0.8649	2750	0.1862
0.1995	0.8806	2800	0.1848
0.1901	0.8964	2850	0.1846
0.1878	0.9121	2900	0.1833
0.1913	0.9278	2950	0.1828
0.1878	0.9435	3000	0.1822
0.1899	0.9593	3050	0.1818
0.1925	0.9750	3100	0.1816
0.1752	0.9907	3150	0.1813
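
A few quantities can be estimated from the log above. The sketch below copies a handful of rows and derives the approximate number of optimizer steps per epoch, a rough training-set size, and the perplexity implied by the final validation loss. The assumptions are that each optimizer step consumes 16 sequences (total_train_batch_size) and that the reported loss is a mean per-token cross-entropy in nats. Neither is stated in the card, so the results are estimates only.

```python
import math

# (training_loss, epoch, step, validation_loss) rows copied from the log above;
# only a few checkpoints are included to keep the sketch short.
ROWS = [
    (1.8507, 0.0157, 50, 0.7212),
    (0.2639, 0.3145, 1000, 0.2582),
    (0.2013, 0.6290, 2000, 0.2058),
    (0.1878, 0.9435, 3000, 0.1822),
    (0.1752, 0.9907, 3150, 0.1813),
]

TOTAL_TRAIN_BATCH_SIZE = 16  # from the hyperparameter list


def steps_per_epoch(rows):
    """Estimate optimizer steps in one epoch from the last logged row."""
    _, epoch, step, _ = rows[-1]
    return step / epoch


def perplexity(loss):
    """exp(loss); meaningful only if loss is mean per-token cross-entropy in nats."""
    return math.exp(loss)


est_steps = steps_per_epoch(ROWS)
est_examples = est_steps * TOTAL_TRAIN_BATCH_SIZE  # assumes 16 sequences per step
final_ppl = perplexity(ROWS[-1][3])

if __name__ == "__main__":
    print(f"estimated steps per epoch: {est_steps:.0f}")
    print(f"estimated training examples: {est_examples:.0f}")
    print(f"perplexity at final validation loss: {final_ppl:.3f}")
```

Using the last row rather than the first for the step estimate reduces the effect of rounding in the epoch column, which is reported to only four decimal places.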