2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,941 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,941 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
 - NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,941 Train: 20847 sentences
2023-10-19 10:26:56,941 (train_with_dev=False, train_with_test=False)
2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,941 Training Params:
2023-10-19 10:26:56,941 - learning_rate: "5e-05"
2023-10-19 10:26:56,941 - mini_batch_size: "8"
2023-10-19 10:26:56,941 - max_epochs: "10"
2023-10-19 10:26:56,941 - shuffle: "True"
2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,941 Plugins:
2023-10-19 10:26:56,941 - TensorboardLogger
2023-10-19 10:26:56,941 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,942 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 10:26:56,942 - metric: "('micro avg', 'f1-score')"
2023-10-19 10:26:56,942 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,942 Computation:
2023-10-19 10:26:56,942 - compute on device: cuda:0
2023-10-19 10:26:56,942 - embedding storage: none
2023-10-19 10:26:56,942 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,942 Model training base path: "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-19 10:26:56,942 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,942 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,942 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 10:27:03,158 epoch 1 - iter 260/2606 - loss 3.11893094 - time (sec): 6.22 - samples/sec: 5861.68 - lr: 0.000005 - momentum: 0.000000
2023-10-19 10:27:09,222 epoch 1 - iter 520/2606 - loss 2.45662908 - time (sec): 12.28 - samples/sec: 5832.63 - lr: 0.000010 - momentum: 0.000000
2023-10-19 10:27:15,610 epoch 1 - iter 780/2606 - loss 1.81424223 - time (sec): 18.67 - samples/sec: 6026.34 - lr: 0.000015 - momentum: 0.000000
2023-10-19 10:27:21,740 epoch 1 - iter 1040/2606 - loss 1.50115032 - time (sec): 24.80 - samples/sec: 5987.58 - lr: 0.000020 - momentum: 0.000000
2023-10-19 10:27:27,811 epoch 1 - iter 1300/2606 - loss 1.33359036 - time (sec): 30.87 - samples/sec: 5875.91 - lr: 0.000025 - momentum: 0.000000
2023-10-19 10:27:34,064 epoch 1 - iter 1560/2606 - loss 1.18442141 - time (sec): 37.12 - samples/sec: 5907.76 - lr: 0.000030 - momentum: 0.000000
2023-10-19 10:27:40,175 epoch 1 - iter 1820/2606 - loss 1.07763140 - time (sec): 43.23 - samples/sec: 5919.80 - lr: 0.000035 - momentum: 0.000000
2023-10-19 10:27:47,178 epoch 1 - iter 2080/2606 - loss 0.99105876 - time (sec): 50.24 - samples/sec: 5878.65 - lr: 0.000040 - momentum: 0.000000
2023-10-19 10:27:53,271 epoch 1 - iter 2340/2606 - loss 0.93099772 - time (sec): 56.33 - samples/sec: 5900.79 - lr: 0.000045 - momentum: 0.000000
2023-10-19 10:27:59,502 epoch 1 - iter 2600/2606 - loss 0.88085087 - time (sec): 62.56 - samples/sec: 5863.64 - lr: 0.000050 - momentum: 0.000000
2023-10-19 10:27:59,628 ----------------------------------------------------------------------------------------------------
2023-10-19 10:27:59,629 EPOCH 1 done: loss 0.8801 - lr: 0.000050
2023-10-19 10:28:01,912 DEV : loss 0.13753747940063477 - f1-score (micro avg) 0.2116
2023-10-19 10:28:01,936 saving best model
2023-10-19 10:28:01,964 ----------------------------------------------------------------------------------------------------
2023-10-19 10:28:08,109 epoch 2 - iter 260/2606 - loss 0.42374739 - time (sec): 6.14 - samples/sec: 5818.81 - lr: 0.000049 - momentum: 0.000000
2023-10-19 10:28:14,214 epoch 2 - iter 520/2606 - loss 0.40655630 - time (sec): 12.25 - samples/sec: 5937.03 - lr: 0.000049 - momentum: 0.000000
2023-10-19 10:28:20,349 epoch 2 - iter 780/2606 - loss 0.38998511 - time (sec): 18.38 - samples/sec: 5993.86 - lr: 0.000048 - momentum: 0.000000
2023-10-19 10:28:26,415 epoch 2 - iter 1040/2606 - loss 0.37158150 - time (sec): 24.45 - samples/sec: 5925.55 - lr: 0.000048 - momentum: 0.000000
2023-10-19 10:28:32,653 epoch 2 - iter 1300/2606 - loss 0.35862543 - time (sec): 30.69 - samples/sec: 5952.53 - lr: 0.000047 - momentum: 0.000000
2023-10-19 10:28:38,675 epoch 2 - iter 1560/2606 - loss 0.35173504 - time (sec): 36.71 - samples/sec: 5989.35 - lr: 0.000047 - momentum: 0.000000
2023-10-19 10:28:44,715 epoch 2 - iter 1820/2606 - loss 0.34322686 - time (sec): 42.75 - samples/sec: 5938.37 - lr: 0.000046 - momentum: 0.000000
2023-10-19 10:28:50,907 epoch 2 - iter 2080/2606 - loss 0.33733039 - time (sec): 48.94 - samples/sec: 5962.65 - lr: 0.000046 - momentum: 0.000000
2023-10-19 10:28:57,064 epoch 2 - iter 2340/2606 - loss 0.33269201 - time (sec): 55.10 - samples/sec: 5945.67 - lr: 0.000045 - momentum: 0.000000
2023-10-19 10:29:03,285 epoch 2 - iter 2600/2606 - loss 0.32557472 - time (sec): 61.32 - samples/sec: 5979.13 - lr: 0.000044 - momentum: 0.000000
2023-10-19 10:29:03,427 ----------------------------------------------------------------------------------------------------
2023-10-19 10:29:03,428 EPOCH 2 done: loss 0.3258 - lr: 0.000044
2023-10-19 10:29:07,968 DEV : loss 0.12312041968107224 - f1-score (micro avg) 0.2965
2023-10-19 10:29:07,990 saving best model
2023-10-19 10:29:08,021 ----------------------------------------------------------------------------------------------------
2023-10-19 10:29:14,860 epoch 3 - iter 260/2606 - loss 0.26992693 - time (sec): 6.84 - samples/sec: 5253.15 - lr: 0.000044 - momentum: 0.000000
2023-10-19 10:29:20,959 epoch 3 - iter 520/2606 - loss 0.27143201 - time (sec): 12.94 - samples/sec: 5426.17 - lr: 0.000043 - momentum: 0.000000
2023-10-19 10:29:27,212 epoch 3 - iter 780/2606 - loss 0.28917169 - time (sec): 19.19 - samples/sec: 5584.66 - lr: 0.000043 - momentum: 0.000000
2023-10-19 10:29:33,343 epoch 3 - iter 1040/2606 - loss 0.27927986 - time (sec): 25.32 - samples/sec: 5644.03 - lr: 0.000042 - momentum: 0.000000
2023-10-19 10:29:39,332 epoch 3 - iter 1300/2606 - loss 0.27479024 - time (sec): 31.31 - samples/sec: 5666.91 - lr: 0.000042 - momentum: 0.000000
2023-10-19 10:29:45,461 epoch 3 - iter 1560/2606 - loss 0.27799701 - time (sec): 37.44 - samples/sec: 5709.66 - lr: 0.000041 - momentum: 0.000000
2023-10-19 10:29:51,294 epoch 3 - iter 1820/2606 - loss 0.27344105 - time (sec): 43.27 - samples/sec: 5735.62 - lr: 0.000041 - momentum: 0.000000
2023-10-19 10:29:57,408 epoch 3 - iter 2080/2606 - loss 0.27372467 - time (sec): 49.39 - samples/sec: 5863.20 - lr: 0.000040 - momentum: 0.000000
2023-10-19 10:30:03,653 epoch 3 - iter 2340/2606 - loss 0.27074986 - time (sec): 55.63 - samples/sec: 5899.95 - lr: 0.000039 - momentum: 0.000000
2023-10-19 10:30:09,900 epoch 3 - iter 2600/2606 - loss 0.26787206 - time (sec): 61.88 - samples/sec: 5925.07 - lr: 0.000039 - momentum: 0.000000
2023-10-19 10:30:10,035 ----------------------------------------------------------------------------------------------------
2023-10-19 10:30:10,035 EPOCH 3 done: loss 0.2681 - lr: 0.000039
2023-10-19 10:30:14,548 DEV : loss 0.12417197227478027 - f1-score (micro avg) 0.303
2023-10-19 10:30:14,571 saving best model
2023-10-19 10:30:14,604 ----------------------------------------------------------------------------------------------------
2023-10-19 10:30:20,857 epoch 4 - iter 260/2606 - loss 0.23462566 - time (sec): 6.25 - samples/sec: 6095.08 - lr: 0.000038 - momentum: 0.000000
2023-10-19 10:30:27,021 epoch 4 - iter 520/2606 - loss 0.23856366 - time (sec): 12.42 - samples/sec: 5956.74 - lr: 0.000038 - momentum: 0.000000
2023-10-19 10:30:33,016 epoch 4 - iter 780/2606 - loss 0.24865085 - time (sec): 18.41 - samples/sec: 5808.11 - lr: 0.000037 - momentum: 0.000000
2023-10-19 10:30:39,248 epoch 4 - iter 1040/2606 - loss 0.23899456 - time (sec): 24.64 - samples/sec: 5891.22 - lr: 0.000037 - momentum: 0.000000
2023-10-19 10:30:46,210 epoch 4 - iter 1300/2606 - loss 0.23494139 - time (sec): 31.61 - samples/sec: 5868.53 - lr: 0.000036 - momentum: 0.000000
2023-10-19 10:30:52,318 epoch 4 - iter 1560/2606 - loss 0.23793258 - time (sec): 37.71 - samples/sec: 5842.14 - lr: 0.000036 - momentum: 0.000000
2023-10-19 10:30:58,462 epoch 4 - iter 1820/2606 - loss 0.23925018 - time (sec): 43.86 - samples/sec: 5872.72 - lr: 0.000035 - momentum: 0.000000
2023-10-19 10:31:04,747 epoch 4 - iter 2080/2606 - loss 0.24063825 - time (sec): 50.14 - samples/sec: 5899.49 - lr: 0.000034 - momentum: 0.000000
2023-10-19 10:31:10,663 epoch 4 - iter 2340/2606 - loss 0.24108957 - time (sec): 56.06 - samples/sec: 5878.16 - lr: 0.000034 - momentum: 0.000000
2023-10-19 10:31:16,822 epoch 4 - iter 2600/2606 - loss 0.23788664 - time (sec): 62.22 - samples/sec: 5892.69 - lr: 0.000033 - momentum: 0.000000
2023-10-19 10:31:16,960 ----------------------------------------------------------------------------------------------------
2023-10-19 10:31:16,960 EPOCH 4 done: loss 0.2379 - lr: 0.000033
2023-10-19 10:31:21,461 DEV : loss 0.14830529689788818 - f1-score (micro avg) 0.2942
2023-10-19 10:31:21,484 ----------------------------------------------------------------------------------------------------
2023-10-19 10:31:27,504 epoch 5 - iter 260/2606 - loss 0.21632271 - time (sec): 6.02 - samples/sec: 5700.59 - lr: 0.000033 - momentum: 0.000000
2023-10-19 10:31:33,543 epoch 5 - iter 520/2606 - loss 0.20875397 - time (sec): 12.06 - samples/sec: 5799.10 - lr: 0.000032 - momentum: 0.000000
2023-10-19 10:31:39,695 epoch 5 - iter 780/2606 - loss 0.21727693 - time (sec): 18.21 - samples/sec: 5795.11 - lr: 0.000032 - momentum: 0.000000
2023-10-19 10:31:45,911 epoch 5 - iter 1040/2606 - loss 0.21610880 - time (sec): 24.43 - samples/sec: 5961.70 - lr: 0.000031 - momentum: 0.000000
2023-10-19 10:31:52,195 epoch 5 - iter 1300/2606 - loss 0.21362450 - time (sec): 30.71 - samples/sec: 5917.59 - lr: 0.000031 - momentum: 0.000000
2023-10-19 10:31:58,206 epoch 5 - iter 1560/2606 - loss 0.21468184 - time (sec): 36.72 - samples/sec: 5940.53 - lr: 0.000030 - momentum: 0.000000
2023-10-19 10:32:04,312 epoch 5 - iter 1820/2606 - loss 0.21378284 - time (sec): 42.83 - samples/sec: 5963.13 - lr: 0.000029 - momentum: 0.000000
2023-10-19 10:32:11,150 epoch 5 - iter 2080/2606 - loss 0.21569807 - time (sec): 49.66 - samples/sec: 5919.65 - lr: 0.000029 - momentum: 0.000000
2023-10-19 10:32:17,262 epoch 5 - iter 2340/2606 - loss 0.21694641 - time (sec): 55.78 - samples/sec: 5892.62 - lr: 0.000028 - momentum: 0.000000
2023-10-19 10:32:23,462 epoch 5 - iter 2600/2606 - loss 0.21590122 - time (sec): 61.98 - samples/sec: 5913.56 - lr: 0.000028 - momentum: 0.000000
2023-10-19 10:32:23,608 ----------------------------------------------------------------------------------------------------
2023-10-19 10:32:23,608 EPOCH 5 done: loss 0.2159 - lr: 0.000028
2023-10-19 10:32:28,124 DEV : loss 0.13548018038272858 - f1-score (micro avg) 0.2961
2023-10-19 10:32:28,148 ----------------------------------------------------------------------------------------------------
2023-10-19 10:32:34,136 epoch 6 - iter 260/2606 - loss 0.19425732 - time (sec): 5.99 - samples/sec: 6445.66 - lr: 0.000027 - momentum: 0.000000
2023-10-19 10:32:40,053 epoch 6 - iter 520/2606 - loss 0.20229066 - time (sec): 11.90 - samples/sec: 6274.72 - lr: 0.000027 - momentum: 0.000000
2023-10-19 10:32:46,105 epoch 6 - iter 780/2606 - loss 0.20201165 - time (sec): 17.96 - samples/sec: 6216.67 - lr: 0.000026 - momentum: 0.000000
2023-10-19 10:32:52,101 epoch 6 - iter 1040/2606 - loss 0.19912505 - time (sec): 23.95 - samples/sec: 6199.07 - lr: 0.000026 - momentum: 0.000000
2023-10-19 10:32:58,306 epoch 6 - iter 1300/2606 - loss 0.20082645 - time (sec): 30.16 - samples/sec: 6168.82 - lr: 0.000025 - momentum: 0.000000
2023-10-19 10:33:04,202 epoch 6 - iter 1560/2606 - loss 0.19897949 - time (sec): 36.05 - samples/sec: 6086.20 - lr: 0.000024 - momentum: 0.000000
2023-10-19 10:33:10,249 epoch 6 - iter 1820/2606 - loss 0.19908713 - time (sec): 42.10 - samples/sec: 6031.05 - lr: 0.000024 - momentum: 0.000000
2023-10-19 10:33:16,477 epoch 6 - iter 2080/2606 - loss 0.19567788 - time (sec): 48.33 - samples/sec: 6073.86 - lr: 0.000023 - momentum: 0.000000
2023-10-19 10:33:22,717 epoch 6 - iter 2340/2606 - loss 0.19369496 - time (sec): 54.57 - samples/sec: 6090.00 - lr: 0.000023 - momentum: 0.000000
2023-10-19 10:33:28,844 epoch 6 - iter 2600/2606 - loss 0.19757153 - time (sec): 60.70 - samples/sec: 6043.76 - lr: 0.000022 - momentum: 0.000000
2023-10-19 10:33:28,988 ----------------------------------------------------------------------------------------------------
2023-10-19 10:33:28,989 EPOCH 6 done: loss 0.1977 - lr: 0.000022
2023-10-19 10:33:34,156 DEV : loss 0.1463058739900589 - f1-score (micro avg) 0.2827
2023-10-19 10:33:34,179 ----------------------------------------------------------------------------------------------------
2023-10-19 10:33:40,498 epoch 7 - iter 260/2606 - loss 0.18497675 - time (sec): 6.32 - samples/sec: 5665.48 - lr: 0.000022 - momentum: 0.000000
2023-10-19 10:33:46,641 epoch 7 - iter 520/2606 - loss 0.18548242 - time (sec): 12.46 - samples/sec: 5965.82 - lr: 0.000021 - momentum: 0.000000
2023-10-19 10:33:52,751 epoch 7 - iter 780/2606 - loss 0.18017411 - time (sec): 18.57 - samples/sec: 6010.61 - lr: 0.000021 - momentum: 0.000000
2023-10-19 10:33:58,890 epoch 7 - iter 1040/2606 - loss 0.18483920 - time (sec): 24.71 - samples/sec: 5890.98 - lr: 0.000020 - momentum: 0.000000
2023-10-19 10:34:05,274 epoch 7 - iter 1300/2606 - loss 0.18558021 - time (sec): 31.09 - samples/sec: 5864.86 - lr: 0.000019 - momentum: 0.000000
2023-10-19 10:34:11,422 epoch 7 - iter 1560/2606 - loss 0.18791905 - time (sec): 37.24 - samples/sec: 5917.49 - lr: 0.000019 - momentum: 0.000000
2023-10-19 10:34:17,694 epoch 7 - iter 1820/2606 - loss 0.18434786 - time (sec): 43.51 - samples/sec: 5929.83 - lr: 0.000018 - momentum: 0.000000
2023-10-19 10:34:23,763 epoch 7 - iter 2080/2606 - loss 0.18656303 - time (sec): 49.58 - samples/sec: 5913.72 - lr: 0.000018 - momentum: 0.000000
2023-10-19 10:34:30,010 epoch 7 - iter 2340/2606 - loss 0.18578748 - time (sec): 55.83 - samples/sec: 5918.57 - lr: 0.000017 - momentum: 0.000000
2023-10-19 10:34:36,238 epoch 7 - iter 2600/2606 - loss 0.18558486 - time (sec): 62.06 - samples/sec: 5906.60 - lr: 0.000017 - momentum: 0.000000
2023-10-19 10:34:36,388 ----------------------------------------------------------------------------------------------------
2023-10-19 10:34:36,388 EPOCH 7 done: loss 0.1859 - lr: 0.000017
2023-10-19 10:34:41,622 DEV : loss 0.1627039760351181 - f1-score (micro avg) 0.3008
2023-10-19 10:34:41,644 ----------------------------------------------------------------------------------------------------
2023-10-19 10:34:47,971 epoch 8 - iter 260/2606 - loss 0.18111838 - time (sec): 6.33 - samples/sec: 5775.54 - lr: 0.000016 - momentum: 0.000000
2023-10-19 10:34:54,080 epoch 8 - iter 520/2606 - loss 0.18137979 - time (sec): 12.44 - samples/sec: 5959.36 - lr: 0.000016 - momentum: 0.000000
2023-10-19 10:35:00,107 epoch 8 - iter 780/2606 - loss 0.17816738 - time (sec): 18.46 - samples/sec: 5880.44 - lr: 0.000015 - momentum: 0.000000
2023-10-19 10:35:06,288 epoch 8 - iter 1040/2606 - loss 0.17712968 - time (sec): 24.64 - samples/sec: 6001.89 - lr: 0.000014 - momentum: 0.000000
2023-10-19 10:35:12,317 epoch 8 - iter 1300/2606 - loss 0.17233100 - time (sec): 30.67 - samples/sec: 5979.56 - lr: 0.000014 - momentum: 0.000000
2023-10-19 10:35:18,311 epoch 8 - iter 1560/2606 - loss 0.17193540 - time (sec): 36.67 - samples/sec: 5914.70 - lr: 0.000013 - momentum: 0.000000
2023-10-19 10:35:24,459 epoch 8 - iter 1820/2606 - loss 0.17667085 - time (sec): 42.81 - samples/sec: 5954.85 - lr: 0.000013 - momentum: 0.000000
2023-10-19 10:35:30,561 epoch 8 - iter 2080/2606 - loss 0.17442414 - time (sec): 48.92 - samples/sec: 5950.78 - lr: 0.000012 - momentum: 0.000000
2023-10-19 10:35:36,623 epoch 8 - iter 2340/2606 - loss 0.17322645 - time (sec): 54.98 - samples/sec: 5977.27 - lr: 0.000012 - momentum: 0.000000
2023-10-19 10:35:42,807 epoch 8 - iter 2600/2606 - loss 0.17314424 - time (sec): 61.16 - samples/sec: 5996.05 - lr: 0.000011 - momentum: 0.000000
2023-10-19 10:35:42,940 ----------------------------------------------------------------------------------------------------
2023-10-19 10:35:42,941 EPOCH 8 done: loss 0.1731 - lr: 0.000011
2023-10-19 10:35:48,166 DEV : loss 0.16963887214660645 - f1-score (micro avg) 0.2761
2023-10-19 10:35:48,191 ----------------------------------------------------------------------------------------------------
2023-10-19 10:35:54,315 epoch 9 - iter 260/2606 - loss 0.14720245 - time (sec): 6.12 - samples/sec: 5984.96 - lr: 0.000011 - momentum: 0.000000
2023-10-19 10:36:00,503 epoch 9 - iter 520/2606 - loss 0.14323789 - time (sec): 12.31 - samples/sec: 6004.62 - lr: 0.000010 - momentum: 0.000000
2023-10-19 10:36:06,788 epoch 9 - iter 780/2606 - loss 0.15143859 - time (sec): 18.60 - samples/sec: 5925.32 - lr: 0.000009 - momentum: 0.000000
2023-10-19 10:36:13,033 epoch 9 - iter 1040/2606 - loss 0.15467000 - time (sec): 24.84 - samples/sec: 5826.48 - lr: 0.000009 - momentum: 0.000000
2023-10-19 10:36:19,209 epoch 9 - iter 1300/2606 - loss 0.16182122 - time (sec): 31.02 - samples/sec: 5895.96 - lr: 0.000008 - momentum: 0.000000
2023-10-19 10:36:25,386 epoch 9 - iter 1560/2606 - loss 0.16198334 - time (sec): 37.19 - samples/sec: 5907.62 - lr: 0.000008 - momentum: 0.000000
2023-10-19 10:36:31,525 epoch 9 - iter 1820/2606 - loss 0.16864382 - time (sec): 43.33 - samples/sec: 5904.81 - lr: 0.000007 - momentum: 0.000000
2023-10-19 10:36:37,662 epoch 9 - iter 2080/2606 - loss 0.16802225 - time (sec): 49.47 - samples/sec: 5890.69 - lr: 0.000007 - momentum: 0.000000
2023-10-19 10:36:43,833 epoch 9 - iter 2340/2606 - loss 0.16935804 - time (sec): 55.64 - samples/sec: 5932.50 - lr: 0.000006 - momentum: 0.000000
2023-10-19 10:36:49,940 epoch 9 - iter 2600/2606 - loss 0.16889534 - time (sec): 61.75 - samples/sec: 5931.99 - lr: 0.000006 - momentum: 0.000000
2023-10-19 10:36:50,092 ----------------------------------------------------------------------------------------------------
2023-10-19 10:36:50,092 EPOCH 9 done: loss 0.1687 - lr: 0.000006
2023-10-19 10:36:55,314 DEV : loss 0.17923599481582642 - f1-score (micro avg) 0.292
2023-10-19 10:36:55,338 ----------------------------------------------------------------------------------------------------
2023-10-19 10:37:01,470 epoch 10 - iter 260/2606 - loss 0.16670870 - time (sec): 6.13 - samples/sec: 6043.12 - lr: 0.000005 - momentum: 0.000000
2023-10-19 10:37:08,006 epoch 10 - iter 520/2606 - loss 0.17291448 - time (sec): 12.67 - samples/sec: 5812.79 - lr: 0.000004 - momentum: 0.000000
2023-10-19 10:37:14,177 epoch 10 - iter 780/2606 - loss 0.16212646 - time (sec): 18.84 - samples/sec: 5895.13 - lr: 0.000004 - momentum: 0.000000
2023-10-19 10:37:20,303 epoch 10 - iter 1040/2606 - loss 0.16389360 - time (sec): 24.96 - samples/sec: 5847.20 - lr: 0.000003 - momentum: 0.000000
2023-10-19 10:37:26,591 epoch 10 - iter 1300/2606 - loss 0.16609968 - time (sec): 31.25 - samples/sec: 5920.88 - lr: 0.000003 - momentum: 0.000000
2023-10-19 10:37:32,767 epoch 10 - iter 1560/2606 - loss 0.16761010 - time (sec): 37.43 - samples/sec: 5920.04 - lr: 0.000002 - momentum: 0.000000
2023-10-19 10:37:38,766 epoch 10 - iter 1820/2606 - loss 0.16561236 - time (sec): 43.43 - samples/sec: 5920.44 - lr: 0.000002 - momentum: 0.000000
2023-10-19 10:37:44,938 epoch 10 - iter 2080/2606 - loss 0.16463496 - time (sec): 49.60 - samples/sec: 5903.12 - lr: 0.000001 - momentum: 0.000000
2023-10-19 10:37:51,157 epoch 10 - iter 2340/2606 - loss 0.16348950 - time (sec): 55.82 - samples/sec: 5870.39 - lr: 0.000001 - momentum: 0.000000
2023-10-19 10:37:57,513 epoch 10 - iter 2600/2606 - loss 0.16347898 - time (sec): 62.17 - samples/sec: 5899.09 - lr: 0.000000 - momentum: 0.000000
2023-10-19 10:37:57,656 ----------------------------------------------------------------------------------------------------
2023-10-19 10:37:57,656 EPOCH 10 done: loss 0.1635 - lr: 0.000000
2023-10-19 10:38:02,888 DEV : loss 0.18416452407836914 - f1-score (micro avg) 0.287
2023-10-19 10:38:02,942 ----------------------------------------------------------------------------------------------------
2023-10-19 10:38:02,942 Loading model from best epoch ...
2023-10-19 10:38:03,021 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 10:38:09,420 Results:
- F-score (micro) 0.2335
- F-score (macro) 0.1202
- Accuracy 0.1328

By class:
              precision    recall  f1-score   support

         LOC     0.4412    0.3089    0.3634      1214
         PER     0.1569    0.0792    0.1053       808
         ORG     0.0217    0.0085    0.0122       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3166    0.1849    0.2335      2390
   macro avg     0.1549    0.0992    0.1202      2390
weighted avg     0.2803    0.1849    0.2220      2390

2023-10-19 10:38:09,420 ----------------------------------------------------------------------------------------------------
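The micro, macro, and weighted averages in the final classification report follow directly from the per-class precision/recall/f1/support figures in the log. A minimal sketch of the arithmetic (the dictionary and helper names are our own; all numbers are copied from the report above):

```python
# Per-class figures from the final test report: (precision, recall, f1, support).
per_class = {
    "LOC":       (0.4412, 0.3089, 0.3634, 1214),
    "PER":       (0.1569, 0.0792, 0.1053, 808),
    "ORG":       (0.0217, 0.0085, 0.0122, 353),
    "HumanProd": (0.0000, 0.0000, 0.0000, 15),
}

def f1(p, r):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Micro F1 is the harmonic mean of the micro-averaged precision and recall.
micro_f1 = f1(0.3166, 0.1849)

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(v[2] for v in per_class.values()) / len(per_class)

# Weighted F1 weights each class F1 by its support (number of gold entities).
total_support = sum(v[3] for v in per_class.values())
weighted_f1 = sum(v[2] * v[3] for v in per_class.values()) / total_support
```

Rounded to four decimals these reproduce the summary rows: micro 0.2335, macro 0.1202, weighted 0.2220, over 2390 gold entities.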