mistral-7b-drug-prots

This model is a fine-tuned version of mistralai/Mistral-7B-v0.3 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5457
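
No usage example is provided, so the sketch below shows one plausible way to load the checkpoint for inference. It assumes the weights are published under the repo id Kamyar-zeinalipour/mistral-7b-drug-prots and load as a standard transformers causal LM; the prompt is a hypothetical placeholder, since the intended input format is not documented.

```python
# Hedged sketch: load the fine-tuned checkpoint as a standard causal LM.
# Assumptions: the repo id below resolves, the weights load via
# AutoModelForCausalLM, and bfloat16 matches the published tensor type.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Kamyar-zeinalipour/mistral-7b-drug-prots"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Hypothetical prompt; the expected input format is undocumented.
prompt = "Describe the interaction between aspirin and COX-1."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```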

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 30
  • training_steps: 5300
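
For reference, these settings map onto transformers TrainingArguments roughly as follows. This is a reconstruction, not the authors' script: output_dir, the eval schedule, and bf16 are assumptions (the 50-step eval interval is inferred from the results table below); everything else is copied from the list above.

```python
# Hedged reconstruction of the reported hyperparameters as TrainingArguments.
# Effective train batch = 4 per device x 4 GPUs x 4 accumulation steps = 64,
# matching the reported total_train_batch_size.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="mistral-7b-drug-prots",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=4,
    lr_scheduler_type="cosine",
    warmup_steps=30,
    max_steps=5300,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="steps",  # assumed; the table below logs eval every 50 steps
    eval_steps=50,
    bf16=True,              # assumed from the BF16 published weights
)
```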

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 1.7818 | 0.0094 | 50   | 1.6715 |
| 1.7216 | 0.0189 | 100  | 1.5833 |
| 1.6278 | 0.0283 | 150  | 1.5331 |
| 1.5849 | 0.0377 | 200  | 1.4866 |
| 1.6059 | 0.0472 | 250  | 1.4766 |
| 1.6047 | 0.0566 | 300  | 1.4635 |
| 1.5167 | 0.0660 | 350  | 1.4515 |
| 1.4995 | 0.0755 | 400  | 1.4386 |
| 1.5051 | 0.0849 | 450  | 1.4332 |
| 1.4858 | 0.0943 | 500  | 1.4210 |
| 1.5011 | 0.1038 | 550  | 1.4051 |
| 1.497  | 0.1132 | 600  | 1.4005 |
| 1.5202 | 0.1226 | 650  | 1.3932 |
| 1.5204 | 0.1321 | 700  | 1.3880 |
| 1.508  | 0.1415 | 750  | 1.3826 |
| 1.4552 | 0.1509 | 800  | 1.3753 |
| 1.4866 | 0.1604 | 850  | 1.3706 |
| 1.4661 | 0.1698 | 900  | 1.3694 |
| 1.4661 | 0.1792 | 950  | 1.3622 |
| 1.3875 | 0.1887 | 1000 | 1.3589 |
| 1.4471 | 0.1981 | 1050 | 1.3518 |
| 1.429  | 0.2075 | 1100 | 1.3390 |
| 1.4181 | 0.2170 | 1150 | 1.3365 |
| 1.39   | 0.2264 | 1200 | 1.3376 |
| 1.4067 | 0.2358 | 1250 | 1.3354 |
| 1.4017 | 0.2453 | 1300 | 1.3382 |
| 1.3842 | 0.2547 | 1350 | 1.3257 |
| 1.4398 | 0.2642 | 1400 | 1.3160 |
| 1.3642 | 0.2736 | 1450 | 1.3222 |
| 1.3647 | 0.2830 | 1500 | 1.3217 |
| 1.4066 | 0.2925 | 1550 | 1.3102 |
| 1.4094 | 0.3019 | 1600 | 1.3109 |
| 1.3473 | 0.3113 | 1650 | 1.3075 |
| 1.3645 | 0.3208 | 1700 | 1.3085 |
| 1.3318 | 0.3302 | 1750 | 1.2962 |
| 1.3562 | 0.3396 | 1800 | 1.2929 |
| 1.3539 | 0.3491 | 1850 | 1.2837 |
| 1.3587 | 0.3585 | 1900 | 1.2828 |
| 1.3827 | 0.3679 | 1950 | 1.2776 |
| 1.3335 | 0.3774 | 2000 | 1.2757 |
| 1.3663 | 0.3868 | 2050 | 1.2732 |
| 1.2937 | 0.3962 | 2100 | 1.2625 |
| 1.3318 | 0.4057 | 2150 | 1.2593 |
| 1.2886 | 0.4151 | 2200 | 1.2524 |
| 1.3033 | 0.4245 | 2250 | 1.2527 |
| 1.2531 | 0.4340 | 2300 | 1.2428 |
| 1.2568 | 0.4434 | 2350 | 1.2508 |
| 1.2573 | 0.4528 | 2400 | 1.2437 |
| 1.2364 | 0.4623 | 2450 | 1.2299 |
| 1.2111 | 0.4717 | 2500 | 1.2307 |
| 1.2016 | 0.4811 | 2550 | 1.2277 |
| 1.236  | 0.4906 | 2600 | 1.2182 |
| 1.1858 | 0.5    | 2650 | 1.2237 |
| 1.218  | 0.5094 | 2700 | 1.2161 |
| 1.1693 | 0.5189 | 2750 | 1.2247 |
| 1.1455 | 0.5283 | 2800 | 1.2277 |
| 1.1555 | 0.5377 | 2850 | 1.2305 |
| 1.162  | 0.5472 | 2900 | 1.2253 |
| 1.0834 | 0.5566 | 2950 | 1.2326 |
| 1.0964 | 0.5660 | 3000 | 1.2397 |
| 1.038  | 0.5755 | 3050 | 1.2370 |
| 1.0338 | 0.5849 | 3100 | 1.2477 |
| 1.0359 | 0.5943 | 3150 | 1.2390 |
| 0.9861 | 0.6038 | 3200 | 1.2547 |
| 1.008  | 0.6132 | 3250 | 1.2666 |
| 1.0275 | 0.6226 | 3300 | 1.2495 |
| 0.9443 | 0.6321 | 3350 | 1.2691 |
| 0.8923 | 0.6415 | 3400 | 1.2893 |
| 0.9118 | 0.6509 | 3450 | 1.2943 |
| 0.8411 | 0.6604 | 3500 | 1.2870 |
| 0.8356 | 0.6698 | 3550 | 1.2971 |
| 0.8326 | 0.6792 | 3600 | 1.3030 |
| 0.8053 | 0.6887 | 3650 | 1.3147 |
| 0.7921 | 0.6981 | 3700 | 1.3235 |
| 0.7563 | 0.7075 | 3750 | 1.3290 |
| 0.7223 | 0.7170 | 3800 | 1.3460 |
| 0.7157 | 0.7264 | 3850 | 1.3525 |
| 0.7539 | 0.7358 | 3900 | 1.3396 |
| 0.6838 | 0.7453 | 3950 | 1.3617 |
| 0.7088 | 0.7547 | 4000 | 1.3477 |
| 0.6409 | 0.7642 | 4050 | 1.3850 |
| 0.6083 | 0.7736 | 4100 | 1.3883 |
| 0.594  | 0.7830 | 4150 | 1.4017 |
| 0.5721 | 0.7925 | 4200 | 1.4264 |
| 0.5144 | 0.8019 | 4250 | 1.4292 |
| 0.494  | 0.8113 | 4300 | 1.4427 |
| 0.4591 | 0.8208 | 4350 | 1.4588 |
| 0.4711 | 0.8302 | 4400 | 1.4627 |
| 0.4668 | 0.8396 | 4450 | 1.4641 |
| 0.4409 | 0.8491 | 4500 | 1.4778 |
| 0.4487 | 0.8585 | 4550 | 1.4821 |
| 0.4816 | 0.8679 | 4600 | 1.4711 |
| 0.4293 | 0.8774 | 4650 | 1.5048 |
| 0.4126 | 0.8868 | 4700 | 1.5079 |
| 0.4284 | 0.8962 | 4750 | 1.5040 |
| 0.3911 | 0.9057 | 4800 | 1.5293 |
| 0.3883 | 0.9151 | 4850 | 1.5293 |
| 0.3862 | 0.9245 | 4900 | 1.5243 |
| 0.3937 | 0.9340 | 4950 | 1.5440 |
| 0.3836 | 0.9434 | 5000 | 1.5389 |
| 0.3827 | 0.9528 | 5050 | 1.5437 |
| 0.3698 | 0.9623 | 5100 | 1.5545 |
| 0.383  | 0.9717 | 5150 | 1.5394 |
| 0.401  | 0.9811 | 5200 | 1.5400 |
| 0.4024 | 0.9906 | 5250 | 1.5409 |
| 0.4305 | 1.0    | 5300 | 1.5457 |
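
Validation loss reaches its minimum of 1.2161 at step 2700 (epoch ~0.51) and rises steadily afterward while training loss keeps falling, a typical overfitting pattern; the final checkpoint's 1.5457 is well above the best intermediate value. If retraining, the lowest-validation-loss checkpoint can be retained automatically; the sketch below uses transformers' EarlyStoppingCallback (the patience value is an assumption, and the save interval mirrors the 50-step eval logging above).

```python
# Hedged sketch: keep the best checkpoint by validation loss rather than the
# final one. The patience value is an assumption.
from transformers import EarlyStoppingCallback, TrainingArguments

args = TrainingArguments(
    output_dir="mistral-7b-drug-prots",  # assumed
    eval_strategy="steps",
    eval_steps=50,
    save_strategy="steps",
    save_steps=50,
    load_best_model_at_end=True,         # reload the best checkpoint at the end
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    max_steps=5300,
)

# Passed to Trainer(..., callbacks=[stopper]) along with the model and data.
stopper = EarlyStoppingCallback(early_stopping_patience=5)
```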

Framework versions

  • Transformers 4.44.0.dev0
  • Pytorch 2.1.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1