2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,941 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): BertModel(
(embeddings): BertEmbeddings(
(word_embeddings): Embedding(32001, 128)
(position_embeddings): Embedding(512, 128)
(token_type_embeddings): Embedding(2, 128)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): BertEncoder(
(layer): ModuleList(
(0-1): 2 x BertLayer(
(attention): BertAttention(
(self): BertSelfAttention(
(query): Linear(in_features=128, out_features=128, bias=True)
(key): Linear(in_features=128, out_features=128, bias=True)
(value): Linear(in_features=128, out_features=128, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): BertSelfOutput(
(dense): Linear(in_features=128, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): BertIntermediate(
(dense): Linear(in_features=128, out_features=512, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): BertOutput(
(dense): Linear(in_features=512, out_features=128, bias=True)
(LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
(pooler): BertPooler(
(dense): Linear(in_features=128, out_features=128, bias=True)
(activation): Tanh()
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=128, out_features=17, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,941 MultiCorpus: 20847 train + 1123 dev + 3350 test sentences
- NER_HIPE_2022 Corpus: 20847 train + 1123 dev + 3350 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/de/with_doc_seperator
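A minimal Flair sketch of an equivalent setup to the header above, for reference only (this is not the exact hmBench training script): the HIPE-2022 NewsEye German corpus feeding an hmBERT-tiny based SequenceTagger with first-subtoken pooling, no RNN and no CRF, as in the architecture printout and the base path further below. The NER_HIPE_2022 loader arguments and the model name dbmdz/bert-tiny-historic-multilingual-cased are inferred from the dataset path and base path and are assumptions.

    from flair.datasets import NER_HIPE_2022
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger

    # NewsEye German split of HIPE-2022 (20847 train / 1123 dev / 3350 test sentences above).
    corpus = NER_HIPE_2022(dataset_name="newseye", language="de")
    label_type = "ner"
    label_dict = corpus.make_label_dictionary(label_type=label_type)  # 17 BIOES tags

    # hmBERT-tiny backbone (vocab 32001, hidden size 128, 2 layers, as printed above);
    # "poolingfirst-layers-1" in the base path suggests first-subtoken pooling over
    # the last transformer layer only.
    embeddings = TransformerWordEmbeddings(
        model="dbmdz/bert-tiny-historic-multilingual-cased",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # No RNN and no CRF ("crfFalse" in the base path): locked dropout plus a single
    # 128 -> 17 linear layer with cross-entropy loss, matching the printout.
    tagger = SequenceTagger(
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type=label_type,
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )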
2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,941 Train: 20847 sentences
2023-10-19 10:26:56,941 (train_with_dev=False, train_with_test=False)
2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,941 Training Params:
2023-10-19 10:26:56,941 - learning_rate: "5e-05"
2023-10-19 10:26:56,941 - mini_batch_size: "8"
2023-10-19 10:26:56,941 - max_epochs: "10"
2023-10-19 10:26:56,941 - shuffle: "True"
2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,941 Plugins:
2023-10-19 10:26:56,941 - TensorboardLogger
2023-10-19 10:26:56,941 - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 10:26:56,941 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,942 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 10:26:56,942 - metric: "('micro avg', 'f1-score')"
2023-10-19 10:26:56,942 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,942 Computation:
2023-10-19 10:26:56,942 - compute on device: cuda:0
2023-10-19 10:26:56,942 - embedding storage: none
2023-10-19 10:26:56,942 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,942 Model training base path: "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
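Taken together, the parameters above correspond to a standard Flair fine-tuning run: peak learning rate 5e-05, mini-batches of 8, 10 epochs, a linear schedule with 10% warmup, no embedding storage, and model selection on micro-averaged F1, with a TensorBoard logger attached. A minimal sketch of such a call, assuming trainer.fine_tune's usual keyword arguments (the exact invocation may differ; the TensorBoard plugin wiring is omitted here):

    from flair.trainers import ModelTrainer

    trainer = ModelTrainer(tagger, corpus)  # tagger and corpus from the sketch above

    trainer.fine_tune(
        "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1",
        learning_rate=5e-05,
        mini_batch_size=8,
        max_epochs=10,
        shuffle=True,
        embeddings_storage_mode="none",
        main_evaluation_metric=("micro avg", "f1-score"),
        # The logged run also attached a TensorboardLogger plugin; the linear
        # scheduler with warmup_fraction 0.1 matches fine_tune's default warmup.
    )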
2023-10-19 10:26:56,942 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,942 ----------------------------------------------------------------------------------------------------
2023-10-19 10:26:56,942 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 10:27:03,158 epoch 1 - iter 260/2606 - loss 3.11893094 - time (sec): 6.22 - samples/sec: 5861.68 - lr: 0.000005 - momentum: 0.000000
2023-10-19 10:27:09,222 epoch 1 - iter 520/2606 - loss 2.45662908 - time (sec): 12.28 - samples/sec: 5832.63 - lr: 0.000010 - momentum: 0.000000
2023-10-19 10:27:15,610 epoch 1 - iter 780/2606 - loss 1.81424223 - time (sec): 18.67 - samples/sec: 6026.34 - lr: 0.000015 - momentum: 0.000000
2023-10-19 10:27:21,740 epoch 1 - iter 1040/2606 - loss 1.50115032 - time (sec): 24.80 - samples/sec: 5987.58 - lr: 0.000020 - momentum: 0.000000
2023-10-19 10:27:27,811 epoch 1 - iter 1300/2606 - loss 1.33359036 - time (sec): 30.87 - samples/sec: 5875.91 - lr: 0.000025 - momentum: 0.000000
2023-10-19 10:27:34,064 epoch 1 - iter 1560/2606 - loss 1.18442141 - time (sec): 37.12 - samples/sec: 5907.76 - lr: 0.000030 - momentum: 0.000000
2023-10-19 10:27:40,175 epoch 1 - iter 1820/2606 - loss 1.07763140 - time (sec): 43.23 - samples/sec: 5919.80 - lr: 0.000035 - momentum: 0.000000
2023-10-19 10:27:47,178 epoch 1 - iter 2080/2606 - loss 0.99105876 - time (sec): 50.24 - samples/sec: 5878.65 - lr: 0.000040 - momentum: 0.000000
2023-10-19 10:27:53,271 epoch 1 - iter 2340/2606 - loss 0.93099772 - time (sec): 56.33 - samples/sec: 5900.79 - lr: 0.000045 - momentum: 0.000000
2023-10-19 10:27:59,502 epoch 1 - iter 2600/2606 - loss 0.88085087 - time (sec): 62.56 - samples/sec: 5863.64 - lr: 0.000050 - momentum: 0.000000
2023-10-19 10:27:59,628 ----------------------------------------------------------------------------------------------------
2023-10-19 10:27:59,629 EPOCH 1 done: loss 0.8801 - lr: 0.000050
2023-10-19 10:28:01,912 DEV : loss 0.13753747940063477 - f1-score (micro avg) 0.2116
2023-10-19 10:28:01,936 saving best model
2023-10-19 10:28:01,964 ----------------------------------------------------------------------------------------------------
2023-10-19 10:28:08,109 epoch 2 - iter 260/2606 - loss 0.42374739 - time (sec): 6.14 - samples/sec: 5818.81 - lr: 0.000049 - momentum: 0.000000
2023-10-19 10:28:14,214 epoch 2 - iter 520/2606 - loss 0.40655630 - time (sec): 12.25 - samples/sec: 5937.03 - lr: 0.000049 - momentum: 0.000000
2023-10-19 10:28:20,349 epoch 2 - iter 780/2606 - loss 0.38998511 - time (sec): 18.38 - samples/sec: 5993.86 - lr: 0.000048 - momentum: 0.000000
2023-10-19 10:28:26,415 epoch 2 - iter 1040/2606 - loss 0.37158150 - time (sec): 24.45 - samples/sec: 5925.55 - lr: 0.000048 - momentum: 0.000000
2023-10-19 10:28:32,653 epoch 2 - iter 1300/2606 - loss 0.35862543 - time (sec): 30.69 - samples/sec: 5952.53 - lr: 0.000047 - momentum: 0.000000
2023-10-19 10:28:38,675 epoch 2 - iter 1560/2606 - loss 0.35173504 - time (sec): 36.71 - samples/sec: 5989.35 - lr: 0.000047 - momentum: 0.000000
2023-10-19 10:28:44,715 epoch 2 - iter 1820/2606 - loss 0.34322686 - time (sec): 42.75 - samples/sec: 5938.37 - lr: 0.000046 - momentum: 0.000000
2023-10-19 10:28:50,907 epoch 2 - iter 2080/2606 - loss 0.33733039 - time (sec): 48.94 - samples/sec: 5962.65 - lr: 0.000046 - momentum: 0.000000
2023-10-19 10:28:57,064 epoch 2 - iter 2340/2606 - loss 0.33269201 - time (sec): 55.10 - samples/sec: 5945.67 - lr: 0.000045 - momentum: 0.000000
2023-10-19 10:29:03,285 epoch 2 - iter 2600/2606 - loss 0.32557472 - time (sec): 61.32 - samples/sec: 5979.13 - lr: 0.000044 - momentum: 0.000000
2023-10-19 10:29:03,427 ----------------------------------------------------------------------------------------------------
2023-10-19 10:29:03,428 EPOCH 2 done: loss 0.3258 - lr: 0.000044
2023-10-19 10:29:07,968 DEV : loss 0.12312041968107224 - f1-score (micro avg) 0.2965
2023-10-19 10:29:07,990 saving best model
2023-10-19 10:29:08,021 ----------------------------------------------------------------------------------------------------
2023-10-19 10:29:14,860 epoch 3 - iter 260/2606 - loss 0.26992693 - time (sec): 6.84 - samples/sec: 5253.15 - lr: 0.000044 - momentum: 0.000000
2023-10-19 10:29:20,959 epoch 3 - iter 520/2606 - loss 0.27143201 - time (sec): 12.94 - samples/sec: 5426.17 - lr: 0.000043 - momentum: 0.000000
2023-10-19 10:29:27,212 epoch 3 - iter 780/2606 - loss 0.28917169 - time (sec): 19.19 - samples/sec: 5584.66 - lr: 0.000043 - momentum: 0.000000
2023-10-19 10:29:33,343 epoch 3 - iter 1040/2606 - loss 0.27927986 - time (sec): 25.32 - samples/sec: 5644.03 - lr: 0.000042 - momentum: 0.000000
2023-10-19 10:29:39,332 epoch 3 - iter 1300/2606 - loss 0.27479024 - time (sec): 31.31 - samples/sec: 5666.91 - lr: 0.000042 - momentum: 0.000000
2023-10-19 10:29:45,461 epoch 3 - iter 1560/2606 - loss 0.27799701 - time (sec): 37.44 - samples/sec: 5709.66 - lr: 0.000041 - momentum: 0.000000
2023-10-19 10:29:51,294 epoch 3 - iter 1820/2606 - loss 0.27344105 - time (sec): 43.27 - samples/sec: 5735.62 - lr: 0.000041 - momentum: 0.000000
2023-10-19 10:29:57,408 epoch 3 - iter 2080/2606 - loss 0.27372467 - time (sec): 49.39 - samples/sec: 5863.20 - lr: 0.000040 - momentum: 0.000000
2023-10-19 10:30:03,653 epoch 3 - iter 2340/2606 - loss 0.27074986 - time (sec): 55.63 - samples/sec: 5899.95 - lr: 0.000039 - momentum: 0.000000
2023-10-19 10:30:09,900 epoch 3 - iter 2600/2606 - loss 0.26787206 - time (sec): 61.88 - samples/sec: 5925.07 - lr: 0.000039 - momentum: 0.000000
2023-10-19 10:30:10,035 ----------------------------------------------------------------------------------------------------
2023-10-19 10:30:10,035 EPOCH 3 done: loss 0.2681 - lr: 0.000039
2023-10-19 10:30:14,548 DEV : loss 0.12417197227478027 - f1-score (micro avg) 0.303
2023-10-19 10:30:14,571 saving best model
2023-10-19 10:30:14,604 ----------------------------------------------------------------------------------------------------
2023-10-19 10:30:20,857 epoch 4 - iter 260/2606 - loss 0.23462566 - time (sec): 6.25 - samples/sec: 6095.08 - lr: 0.000038 - momentum: 0.000000
2023-10-19 10:30:27,021 epoch 4 - iter 520/2606 - loss 0.23856366 - time (sec): 12.42 - samples/sec: 5956.74 - lr: 0.000038 - momentum: 0.000000
2023-10-19 10:30:33,016 epoch 4 - iter 780/2606 - loss 0.24865085 - time (sec): 18.41 - samples/sec: 5808.11 - lr: 0.000037 - momentum: 0.000000
2023-10-19 10:30:39,248 epoch 4 - iter 1040/2606 - loss 0.23899456 - time (sec): 24.64 - samples/sec: 5891.22 - lr: 0.000037 - momentum: 0.000000
2023-10-19 10:30:46,210 epoch 4 - iter 1300/2606 - loss 0.23494139 - time (sec): 31.61 - samples/sec: 5868.53 - lr: 0.000036 - momentum: 0.000000
2023-10-19 10:30:52,318 epoch 4 - iter 1560/2606 - loss 0.23793258 - time (sec): 37.71 - samples/sec: 5842.14 - lr: 0.000036 - momentum: 0.000000
2023-10-19 10:30:58,462 epoch 4 - iter 1820/2606 - loss 0.23925018 - time (sec): 43.86 - samples/sec: 5872.72 - lr: 0.000035 - momentum: 0.000000
2023-10-19 10:31:04,747 epoch 4 - iter 2080/2606 - loss 0.24063825 - time (sec): 50.14 - samples/sec: 5899.49 - lr: 0.000034 - momentum: 0.000000
2023-10-19 10:31:10,663 epoch 4 - iter 2340/2606 - loss 0.24108957 - time (sec): 56.06 - samples/sec: 5878.16 - lr: 0.000034 - momentum: 0.000000
2023-10-19 10:31:16,822 epoch 4 - iter 2600/2606 - loss 0.23788664 - time (sec): 62.22 - samples/sec: 5892.69 - lr: 0.000033 - momentum: 0.000000
2023-10-19 10:31:16,960 ----------------------------------------------------------------------------------------------------
2023-10-19 10:31:16,960 EPOCH 4 done: loss 0.2379 - lr: 0.000033
2023-10-19 10:31:21,461 DEV : loss 0.14830529689788818 - f1-score (micro avg) 0.2942
2023-10-19 10:31:21,484 ----------------------------------------------------------------------------------------------------
2023-10-19 10:31:27,504 epoch 5 - iter 260/2606 - loss 0.21632271 - time (sec): 6.02 - samples/sec: 5700.59 - lr: 0.000033 - momentum: 0.000000
2023-10-19 10:31:33,543 epoch 5 - iter 520/2606 - loss 0.20875397 - time (sec): 12.06 - samples/sec: 5799.10 - lr: 0.000032 - momentum: 0.000000
2023-10-19 10:31:39,695 epoch 5 - iter 780/2606 - loss 0.21727693 - time (sec): 18.21 - samples/sec: 5795.11 - lr: 0.000032 - momentum: 0.000000
2023-10-19 10:31:45,911 epoch 5 - iter 1040/2606 - loss 0.21610880 - time (sec): 24.43 - samples/sec: 5961.70 - lr: 0.000031 - momentum: 0.000000
2023-10-19 10:31:52,195 epoch 5 - iter 1300/2606 - loss 0.21362450 - time (sec): 30.71 - samples/sec: 5917.59 - lr: 0.000031 - momentum: 0.000000
2023-10-19 10:31:58,206 epoch 5 - iter 1560/2606 - loss 0.21468184 - time (sec): 36.72 - samples/sec: 5940.53 - lr: 0.000030 - momentum: 0.000000
2023-10-19 10:32:04,312 epoch 5 - iter 1820/2606 - loss 0.21378284 - time (sec): 42.83 - samples/sec: 5963.13 - lr: 0.000029 - momentum: 0.000000
2023-10-19 10:32:11,150 epoch 5 - iter 2080/2606 - loss 0.21569807 - time (sec): 49.66 - samples/sec: 5919.65 - lr: 0.000029 - momentum: 0.000000
2023-10-19 10:32:17,262 epoch 5 - iter 2340/2606 - loss 0.21694641 - time (sec): 55.78 - samples/sec: 5892.62 - lr: 0.000028 - momentum: 0.000000
2023-10-19 10:32:23,462 epoch 5 - iter 2600/2606 - loss 0.21590122 - time (sec): 61.98 - samples/sec: 5913.56 - lr: 0.000028 - momentum: 0.000000
2023-10-19 10:32:23,608 ----------------------------------------------------------------------------------------------------
2023-10-19 10:32:23,608 EPOCH 5 done: loss 0.2159 - lr: 0.000028
2023-10-19 10:32:28,124 DEV : loss 0.13548018038272858 - f1-score (micro avg) 0.2961
2023-10-19 10:32:28,148 ----------------------------------------------------------------------------------------------------
2023-10-19 10:32:34,136 epoch 6 - iter 260/2606 - loss 0.19425732 - time (sec): 5.99 - samples/sec: 6445.66 - lr: 0.000027 - momentum: 0.000000
2023-10-19 10:32:40,053 epoch 6 - iter 520/2606 - loss 0.20229066 - time (sec): 11.90 - samples/sec: 6274.72 - lr: 0.000027 - momentum: 0.000000
2023-10-19 10:32:46,105 epoch 6 - iter 780/2606 - loss 0.20201165 - time (sec): 17.96 - samples/sec: 6216.67 - lr: 0.000026 - momentum: 0.000000
2023-10-19 10:32:52,101 epoch 6 - iter 1040/2606 - loss 0.19912505 - time (sec): 23.95 - samples/sec: 6199.07 - lr: 0.000026 - momentum: 0.000000
2023-10-19 10:32:58,306 epoch 6 - iter 1300/2606 - loss 0.20082645 - time (sec): 30.16 - samples/sec: 6168.82 - lr: 0.000025 - momentum: 0.000000
2023-10-19 10:33:04,202 epoch 6 - iter 1560/2606 - loss 0.19897949 - time (sec): 36.05 - samples/sec: 6086.20 - lr: 0.000024 - momentum: 0.000000
2023-10-19 10:33:10,249 epoch 6 - iter 1820/2606 - loss 0.19908713 - time (sec): 42.10 - samples/sec: 6031.05 - lr: 0.000024 - momentum: 0.000000
2023-10-19 10:33:16,477 epoch 6 - iter 2080/2606 - loss 0.19567788 - time (sec): 48.33 - samples/sec: 6073.86 - lr: 0.000023 - momentum: 0.000000
2023-10-19 10:33:22,717 epoch 6 - iter 2340/2606 - loss 0.19369496 - time (sec): 54.57 - samples/sec: 6090.00 - lr: 0.000023 - momentum: 0.000000
2023-10-19 10:33:28,844 epoch 6 - iter 2600/2606 - loss 0.19757153 - time (sec): 60.70 - samples/sec: 6043.76 - lr: 0.000022 - momentum: 0.000000
2023-10-19 10:33:28,988 ----------------------------------------------------------------------------------------------------
2023-10-19 10:33:28,989 EPOCH 6 done: loss 0.1977 - lr: 0.000022
2023-10-19 10:33:34,156 DEV : loss 0.1463058739900589 - f1-score (micro avg) 0.2827
2023-10-19 10:33:34,179 ----------------------------------------------------------------------------------------------------
2023-10-19 10:33:40,498 epoch 7 - iter 260/2606 - loss 0.18497675 - time (sec): 6.32 - samples/sec: 5665.48 - lr: 0.000022 - momentum: 0.000000
2023-10-19 10:33:46,641 epoch 7 - iter 520/2606 - loss 0.18548242 - time (sec): 12.46 - samples/sec: 5965.82 - lr: 0.000021 - momentum: 0.000000
2023-10-19 10:33:52,751 epoch 7 - iter 780/2606 - loss 0.18017411 - time (sec): 18.57 - samples/sec: 6010.61 - lr: 0.000021 - momentum: 0.000000
2023-10-19 10:33:58,890 epoch 7 - iter 1040/2606 - loss 0.18483920 - time (sec): 24.71 - samples/sec: 5890.98 - lr: 0.000020 - momentum: 0.000000
2023-10-19 10:34:05,274 epoch 7 - iter 1300/2606 - loss 0.18558021 - time (sec): 31.09 - samples/sec: 5864.86 - lr: 0.000019 - momentum: 0.000000
2023-10-19 10:34:11,422 epoch 7 - iter 1560/2606 - loss 0.18791905 - time (sec): 37.24 - samples/sec: 5917.49 - lr: 0.000019 - momentum: 0.000000
2023-10-19 10:34:17,694 epoch 7 - iter 1820/2606 - loss 0.18434786 - time (sec): 43.51 - samples/sec: 5929.83 - lr: 0.000018 - momentum: 0.000000
2023-10-19 10:34:23,763 epoch 7 - iter 2080/2606 - loss 0.18656303 - time (sec): 49.58 - samples/sec: 5913.72 - lr: 0.000018 - momentum: 0.000000
2023-10-19 10:34:30,010 epoch 7 - iter 2340/2606 - loss 0.18578748 - time (sec): 55.83 - samples/sec: 5918.57 - lr: 0.000017 - momentum: 0.000000
2023-10-19 10:34:36,238 epoch 7 - iter 2600/2606 - loss 0.18558486 - time (sec): 62.06 - samples/sec: 5906.60 - lr: 0.000017 - momentum: 0.000000
2023-10-19 10:34:36,388 ----------------------------------------------------------------------------------------------------
2023-10-19 10:34:36,388 EPOCH 7 done: loss 0.1859 - lr: 0.000017
2023-10-19 10:34:41,622 DEV : loss 0.1627039760351181 - f1-score (micro avg) 0.3008
2023-10-19 10:34:41,644 ----------------------------------------------------------------------------------------------------
2023-10-19 10:34:47,971 epoch 8 - iter 260/2606 - loss 0.18111838 - time (sec): 6.33 - samples/sec: 5775.54 - lr: 0.000016 - momentum: 0.000000
2023-10-19 10:34:54,080 epoch 8 - iter 520/2606 - loss 0.18137979 - time (sec): 12.44 - samples/sec: 5959.36 - lr: 0.000016 - momentum: 0.000000
2023-10-19 10:35:00,107 epoch 8 - iter 780/2606 - loss 0.17816738 - time (sec): 18.46 - samples/sec: 5880.44 - lr: 0.000015 - momentum: 0.000000
2023-10-19 10:35:06,288 epoch 8 - iter 1040/2606 - loss 0.17712968 - time (sec): 24.64 - samples/sec: 6001.89 - lr: 0.000014 - momentum: 0.000000
2023-10-19 10:35:12,317 epoch 8 - iter 1300/2606 - loss 0.17233100 - time (sec): 30.67 - samples/sec: 5979.56 - lr: 0.000014 - momentum: 0.000000
2023-10-19 10:35:18,311 epoch 8 - iter 1560/2606 - loss 0.17193540 - time (sec): 36.67 - samples/sec: 5914.70 - lr: 0.000013 - momentum: 0.000000
2023-10-19 10:35:24,459 epoch 8 - iter 1820/2606 - loss 0.17667085 - time (sec): 42.81 - samples/sec: 5954.85 - lr: 0.000013 - momentum: 0.000000
2023-10-19 10:35:30,561 epoch 8 - iter 2080/2606 - loss 0.17442414 - time (sec): 48.92 - samples/sec: 5950.78 - lr: 0.000012 - momentum: 0.000000
2023-10-19 10:35:36,623 epoch 8 - iter 2340/2606 - loss 0.17322645 - time (sec): 54.98 - samples/sec: 5977.27 - lr: 0.000012 - momentum: 0.000000
2023-10-19 10:35:42,807 epoch 8 - iter 2600/2606 - loss 0.17314424 - time (sec): 61.16 - samples/sec: 5996.05 - lr: 0.000011 - momentum: 0.000000
2023-10-19 10:35:42,940 ----------------------------------------------------------------------------------------------------
2023-10-19 10:35:42,941 EPOCH 8 done: loss 0.1731 - lr: 0.000011
2023-10-19 10:35:48,166 DEV : loss 0.16963887214660645 - f1-score (micro avg) 0.2761
2023-10-19 10:35:48,191 ----------------------------------------------------------------------------------------------------
2023-10-19 10:35:54,315 epoch 9 - iter 260/2606 - loss 0.14720245 - time (sec): 6.12 - samples/sec: 5984.96 - lr: 0.000011 - momentum: 0.000000
2023-10-19 10:36:00,503 epoch 9 - iter 520/2606 - loss 0.14323789 - time (sec): 12.31 - samples/sec: 6004.62 - lr: 0.000010 - momentum: 0.000000
2023-10-19 10:36:06,788 epoch 9 - iter 780/2606 - loss 0.15143859 - time (sec): 18.60 - samples/sec: 5925.32 - lr: 0.000009 - momentum: 0.000000
2023-10-19 10:36:13,033 epoch 9 - iter 1040/2606 - loss 0.15467000 - time (sec): 24.84 - samples/sec: 5826.48 - lr: 0.000009 - momentum: 0.000000
2023-10-19 10:36:19,209 epoch 9 - iter 1300/2606 - loss 0.16182122 - time (sec): 31.02 - samples/sec: 5895.96 - lr: 0.000008 - momentum: 0.000000
2023-10-19 10:36:25,386 epoch 9 - iter 1560/2606 - loss 0.16198334 - time (sec): 37.19 - samples/sec: 5907.62 - lr: 0.000008 - momentum: 0.000000
2023-10-19 10:36:31,525 epoch 9 - iter 1820/2606 - loss 0.16864382 - time (sec): 43.33 - samples/sec: 5904.81 - lr: 0.000007 - momentum: 0.000000
2023-10-19 10:36:37,662 epoch 9 - iter 2080/2606 - loss 0.16802225 - time (sec): 49.47 - samples/sec: 5890.69 - lr: 0.000007 - momentum: 0.000000
2023-10-19 10:36:43,833 epoch 9 - iter 2340/2606 - loss 0.16935804 - time (sec): 55.64 - samples/sec: 5932.50 - lr: 0.000006 - momentum: 0.000000
2023-10-19 10:36:49,940 epoch 9 - iter 2600/2606 - loss 0.16889534 - time (sec): 61.75 - samples/sec: 5931.99 - lr: 0.000006 - momentum: 0.000000
2023-10-19 10:36:50,092 ----------------------------------------------------------------------------------------------------
2023-10-19 10:36:50,092 EPOCH 9 done: loss 0.1687 - lr: 0.000006
2023-10-19 10:36:55,314 DEV : loss 0.17923599481582642 - f1-score (micro avg) 0.292
2023-10-19 10:36:55,338 ----------------------------------------------------------------------------------------------------
2023-10-19 10:37:01,470 epoch 10 - iter 260/2606 - loss 0.16670870 - time (sec): 6.13 - samples/sec: 6043.12 - lr: 0.000005 - momentum: 0.000000
2023-10-19 10:37:08,006 epoch 10 - iter 520/2606 - loss 0.17291448 - time (sec): 12.67 - samples/sec: 5812.79 - lr: 0.000004 - momentum: 0.000000
2023-10-19 10:37:14,177 epoch 10 - iter 780/2606 - loss 0.16212646 - time (sec): 18.84 - samples/sec: 5895.13 - lr: 0.000004 - momentum: 0.000000
2023-10-19 10:37:20,303 epoch 10 - iter 1040/2606 - loss 0.16389360 - time (sec): 24.96 - samples/sec: 5847.20 - lr: 0.000003 - momentum: 0.000000
2023-10-19 10:37:26,591 epoch 10 - iter 1300/2606 - loss 0.16609968 - time (sec): 31.25 - samples/sec: 5920.88 - lr: 0.000003 - momentum: 0.000000
2023-10-19 10:37:32,767 epoch 10 - iter 1560/2606 - loss 0.16761010 - time (sec): 37.43 - samples/sec: 5920.04 - lr: 0.000002 - momentum: 0.000000
2023-10-19 10:37:38,766 epoch 10 - iter 1820/2606 - loss 0.16561236 - time (sec): 43.43 - samples/sec: 5920.44 - lr: 0.000002 - momentum: 0.000000
2023-10-19 10:37:44,938 epoch 10 - iter 2080/2606 - loss 0.16463496 - time (sec): 49.60 - samples/sec: 5903.12 - lr: 0.000001 - momentum: 0.000000
2023-10-19 10:37:51,157 epoch 10 - iter 2340/2606 - loss 0.16348950 - time (sec): 55.82 - samples/sec: 5870.39 - lr: 0.000001 - momentum: 0.000000
2023-10-19 10:37:57,513 epoch 10 - iter 2600/2606 - loss 0.16347898 - time (sec): 62.17 - samples/sec: 5899.09 - lr: 0.000000 - momentum: 0.000000
2023-10-19 10:37:57,656 ----------------------------------------------------------------------------------------------------
2023-10-19 10:37:57,656 EPOCH 10 done: loss 0.1635 - lr: 0.000000
2023-10-19 10:38:02,888 DEV : loss 0.18416452407836914 - f1-score (micro avg) 0.287
2023-10-19 10:38:02,942 ----------------------------------------------------------------------------------------------------
2023-10-19 10:38:02,942 Loading model from best epoch ...
2023-10-19 10:38:03,021 SequenceTagger predicts: Dictionary with 17 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-PER, B-PER, E-PER, I-PER, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-19 10:38:09,420
Results:
- F-score (micro) 0.2335
- F-score (macro) 0.1202
- Accuracy 0.1328
By class:
              precision    recall  f1-score   support

         LOC     0.4412    0.3089    0.3634      1214
         PER     0.1569    0.0792    0.1053       808
         ORG     0.0217    0.0085    0.0122       353
   HumanProd     0.0000    0.0000    0.0000        15

   micro avg     0.3166    0.1849    0.2335      2390
   macro avg     0.1549    0.0992    0.1202      2390
weighted avg     0.2803    0.1849    0.2220      2390
2023-10-19 10:38:09,420 ----------------------------------------------------------------------------------------------------
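For reference, the model from the best epoch can be loaded and applied with the standard Flair API. A minimal usage sketch, using a made-up German example sentence and the base path from this log:

    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(
        "hmbench-newseye/de-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1/best-model.pt"
    )

    sentence = Sentence("Die Sitzung fand gestern in Wien statt .")
    tagger.predict(sentence)

    # Print recognised spans with their predicted tag (LOC / PER / ORG / HumanProd).
    for span in sentence.get_spans("ner"):
        print(span.text, span.get_label("ner").value)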