scenario-NON-KD-PR-COPY-D2_data-AmazonScience_massive_all_1_166sss

This model is a fine-tuned version of microsoft/mdeberta-v3-base on the massive dataset. It achieves the following results on the evaluation set:

Model description

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
1.4661	0.2672	5000	1.3764	0.6250	0.4910
0.9918	0.5344	10000	1.0170	0.7378	0.6346
0.7796	0.8017	15000	0.8450	0.7801	0.7141
0.6113	1.0689	20000	0.8091	0.8038	0.7581
0.5684	1.3361	25000	0.7564	0.8141	0.7690
0.5164	1.6033	30000	0.7448	0.8215	0.7803
0.4654	1.8706	35000	0.7588	0.8296	0.7943
0.3718	2.1378	40000	0.7709	0.8308	0.7931
0.3378	2.4050	45000	0.7686	0.8321	0.7951
0.3404	2.6722	50000	0.7799	0.8324	0.7954
0.3294	2.9394	55000	0.7557	0.8363	0.8021
0.2571	3.2067	60000	0.8100	0.8371	0.8063
0.2545	3.4739	65000	0.8222	0.8358	0.8072
0.2571	3.7411	70000	0.8126	0.8403	0.8158
0.2324	4.0083	75000	0.8535	0.8387	0.8081
0.1938	4.2756	80000	0.8975	0.8368	0.8064
0.1853	4.5428	85000	0.8940	0.8406	0.8120
0.1880	4.8100	90000	0.8870	0.8406	0.8132
0.1428	5.0772	95000	0.9963	0.8423	0.8213
0.1473	5.3444	100000	0.9991	0.8395	0.8145
0.1409	5.6117	105000	1.0564	0.8357	0.8080
0.1445	5.8789	110000	0.9895	0.8420	0.8134
0.1098	6.1461	115000	1.1040	0.8431	0.8150
0.1136	6.4133	120000	1.1074	0.8430	0.8195
0.1096	6.6806	125000	1.1357	0.8400	0.8122
0.1136	6.9478	130000	1.1148	0.8416	0.8191
0.0912	7.2150	135000	1.2180	0.8408	0.8133
0.0822	7.4822	140000	1.2177	0.8426	0.8176
0.0880	7.7495	145000	1.2107	0.8420	0.8158
0.0777	8.0167	150000	1.2180	0.8444	0.8210
0.0667	8.2839	155000	1.3110	0.8394	0.8152
0.0637	8.5511	160000	1.3150	0.8439	0.8189
0.0649	8.8183	165000	1.3342	0.8417	0.8161
0.0463	9.0856	170000	1.3651	0.8432	0.8186
0.0496	9.3528	175000	1.3863	0.8424	0.8161
0.0585	9.6200	180000	1.3898	0.8433	0.8183
0.0527	9.8872	185000	1.3934	0.8438	0.8185
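
The table above can be summarized programmatically. The following is a minimal sketch, not part of the original training code: it copies the rows into a string and reports which logged checkpoint has the lowest validation loss and which have the highest accuracy and F1. The names `RESULTS`, `FIELDS`, and `parse_results` are hypothetical helpers introduced only for this illustration.

```python
# Rows copied from the training results table above, in the same column order:
# training loss, epoch, step, validation loss, accuracy, F1.
RESULTS = """
1.4661 0.2672 5000 1.3764 0.6250 0.4910
0.9918 0.5344 10000 1.0170 0.7378 0.6346
0.7796 0.8017 15000 0.8450 0.7801 0.7141
0.6113 1.0689 20000 0.8091 0.8038 0.7581
0.5684 1.3361 25000 0.7564 0.8141 0.7690
0.5164 1.6033 30000 0.7448 0.8215 0.7803
0.4654 1.8706 35000 0.7588 0.8296 0.7943
0.3718 2.1378 40000 0.7709 0.8308 0.7931
0.3378 2.4050 45000 0.7686 0.8321 0.7951
0.3404 2.6722 50000 0.7799 0.8324 0.7954
0.3294 2.9394 55000 0.7557 0.8363 0.8021
0.2571 3.2067 60000 0.8100 0.8371 0.8063
0.2545 3.4739 65000 0.8222 0.8358 0.8072
0.2571 3.7411 70000 0.8126 0.8403 0.8158
0.2324 4.0083 75000 0.8535 0.8387 0.8081
0.1938 4.2756 80000 0.8975 0.8368 0.8064
0.1853 4.5428 85000 0.8940 0.8406 0.8120
0.188 4.8100 90000 0.8870 0.8406 0.8132
0.1428 5.0772 95000 0.9963 0.8423 0.8213
0.1473 5.3444 100000 0.9991 0.8395 0.8145
0.1409 5.6117 105000 1.0564 0.8357 0.8080
0.1445 5.8789 110000 0.9895 0.8420 0.8134
0.1098 6.1461 115000 1.1040 0.8431 0.8150
0.1136 6.4133 120000 1.1074 0.8430 0.8195
0.1096 6.6806 125000 1.1357 0.8400 0.8122
0.1136 6.9478 130000 1.1148 0.8416 0.8191
0.0912 7.2150 135000 1.2180 0.8408 0.8133
0.0822 7.4822 140000 1.2177 0.8426 0.8176
0.088 7.7495 145000 1.2107 0.8420 0.8158
0.0777 8.0167 150000 1.2180 0.8444 0.8210
0.0667 8.2839 155000 1.3110 0.8394 0.8152
0.0637 8.5511 160000 1.3150 0.8439 0.8189
0.0649 8.8183 165000 1.3342 0.8417 0.8161
0.0463 9.0856 170000 1.3651 0.8432 0.8186
0.0496 9.3528 175000 1.3863 0.8424 0.8161
0.0585 9.6200 180000 1.3898 0.8433 0.8183
0.0527 9.8872 185000 1.3934 0.8438 0.8185
"""

FIELDS = ("train_loss", "epoch", "step", "val_loss", "accuracy", "f1")


def parse_results(text):
    """Turn the whitespace-separated rows into a list of dicts keyed by FIELDS."""
    rows = []
    for line in text.strip().splitlines():
        row = dict(zip(FIELDS, (float(v) for v in line.split())))
        row["step"] = int(row["step"])  # steps are whole numbers
        rows.append(row)
    return rows


rows = parse_results(RESULTS)

# Pick the logged checkpoints that are best under each metric.
best_loss = min(rows, key=lambda r: r["val_loss"])
best_acc = max(rows, key=lambda r: r["accuracy"])
best_f1 = max(rows, key=lambda r: r["f1"])
final = rows[-1]

print(f"lowest validation loss: {best_loss['val_loss']:.4f} at step {best_loss['step']}")
print(f"highest accuracy:       {best_acc['accuracy']:.4f} at step {best_acc['step']}")
print(f"highest F1:             {best_f1['f1']:.4f} at step {best_f1['step']}")
print(f"final logged row:       loss {final['val_loss']:.4f}, "
      f"accuracy {final['accuracy']:.4f}, F1 {final['f1']:.4f}")
```

On these rows, validation loss is lowest at step 30000 (0.7448) and climbs to 1.3934 by step 185000 while training loss keeps falling, which is the usual signature of overfitting. Accuracy and F1 keep inching up after that point and then level off near 0.84 and 0.82, peaking at steps 150000 and 95000 respectively. The card does not state which checkpoint was kept, so the choice of metric for selecting one is left open here.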