2023-10-19 01:11:50,079 ----------------------------------------------------------------------------------------------------
2023-10-19 01:11:50,080 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(31103, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=81, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-19 01:11:50,080 ----------------------------------------------------------------------------------------------------
2023-10-19 01:11:50,080 Corpus: 6900 train + 1576 dev + 1833 test sentences
2023-10-19 01:11:50,080 ----------------------------------------------------------------------------------------------------
2023-10-19 01:11:50,081 Train:  6900 sentences
2023-10-19 01:11:50,081         (train_with_dev=False, train_with_test=False)
2023-10-19 01:11:50,081 ----------------------------------------------------------------------------------------------------
2023-10-19 01:11:50,081 Training Params:
2023-10-19 01:11:50,081  - learning_rate: "3e-05" 
2023-10-19 01:11:50,081  - mini_batch_size: "16"
2023-10-19 01:11:50,081  - max_epochs: "10"
2023-10-19 01:11:50,081  - shuffle: "True"
2023-10-19 01:11:50,081 ----------------------------------------------------------------------------------------------------
2023-10-19 01:11:50,081 Plugins:
2023-10-19 01:11:50,081  - TensorboardLogger
2023-10-19 01:11:50,081  - LinearScheduler | warmup_fraction: '0.1'
2023-10-19 01:11:50,081 ----------------------------------------------------------------------------------------------------
2023-10-19 01:11:50,081 Final evaluation on model from best epoch (best-model.pt)
2023-10-19 01:11:50,081  - metric: "('micro avg', 'f1-score')"
2023-10-19 01:11:50,081 ----------------------------------------------------------------------------------------------------
2023-10-19 01:11:50,081 Computation:
2023-10-19 01:11:50,081  - compute on device: cuda:0
2023-10-19 01:11:50,082  - embedding storage: none
2023-10-19 01:11:50,082 ----------------------------------------------------------------------------------------------------
2023-10-19 01:11:50,082 Model training base path: "autotrain-flair-mobie-gbert_base-bs16-e10-lr3e-05-3"
2023-10-19 01:11:50,082 ----------------------------------------------------------------------------------------------------
2023-10-19 01:11:50,082 ----------------------------------------------------------------------------------------------------
2023-10-19 01:11:50,082 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-19 01:12:04,567 epoch 1 - iter 43/432 - loss 4.48039409 - time (sec): 14.48 - samples/sec: 428.48 - lr: 0.000003 - momentum: 0.000000
2023-10-19 01:12:19,172 epoch 1 - iter 86/432 - loss 3.60384065 - time (sec): 29.09 - samples/sec: 419.78 - lr: 0.000006 - momentum: 0.000000
2023-10-19 01:12:34,227 epoch 1 - iter 129/432 - loss 3.00100989 - time (sec): 44.14 - samples/sec: 420.06 - lr: 0.000009 - momentum: 0.000000
2023-10-19 01:12:48,856 epoch 1 - iter 172/432 - loss 2.67529242 - time (sec): 58.77 - samples/sec: 419.88 - lr: 0.000012 - momentum: 0.000000
2023-10-19 01:13:03,508 epoch 1 - iter 215/432 - loss 2.41800710 - time (sec): 73.43 - samples/sec: 420.32 - lr: 0.000015 - momentum: 0.000000
2023-10-19 01:13:18,780 epoch 1 - iter 258/432 - loss 2.20845718 - time (sec): 88.70 - samples/sec: 417.62 - lr: 0.000018 - momentum: 0.000000
2023-10-19 01:13:33,474 epoch 1 - iter 301/432 - loss 2.03382086 - time (sec): 103.39 - samples/sec: 419.31 - lr: 0.000021 - momentum: 0.000000
2023-10-19 01:13:48,681 epoch 1 - iter 344/432 - loss 1.89777717 - time (sec): 118.60 - samples/sec: 415.73 - lr: 0.000024 - momentum: 0.000000
2023-10-19 01:14:03,153 epoch 1 - iter 387/432 - loss 1.77824460 - time (sec): 133.07 - samples/sec: 417.50 - lr: 0.000027 - momentum: 0.000000
2023-10-19 01:14:17,180 epoch 1 - iter 430/432 - loss 1.66845790 - time (sec): 147.10 - samples/sec: 419.34 - lr: 0.000030 - momentum: 0.000000
2023-10-19 01:14:17,812 ----------------------------------------------------------------------------------------------------
2023-10-19 01:14:17,813 EPOCH 1 done: loss 1.6662 - lr: 0.000030
2023-10-19 01:14:31,290 DEV : loss 0.5518006086349487 - f1-score (micro avg)  0.633
2023-10-19 01:14:31,318 saving best model
2023-10-19 01:14:31,797 ----------------------------------------------------------------------------------------------------
2023-10-19 01:14:46,715 epoch 2 - iter 43/432 - loss 0.58848912 - time (sec): 14.92 - samples/sec: 418.67 - lr: 0.000030 - momentum: 0.000000
2023-10-19 01:15:01,216 epoch 2 - iter 86/432 - loss 0.57906599 - time (sec): 29.42 - samples/sec: 416.84 - lr: 0.000029 - momentum: 0.000000
2023-10-19 01:15:16,263 epoch 2 - iter 129/432 - loss 0.55687926 - time (sec): 44.46 - samples/sec: 415.80 - lr: 0.000029 - momentum: 0.000000
2023-10-19 01:15:31,483 epoch 2 - iter 172/432 - loss 0.54911765 - time (sec): 59.68 - samples/sec: 418.70 - lr: 0.000029 - momentum: 0.000000
2023-10-19 01:15:47,092 epoch 2 - iter 215/432 - loss 0.53928280 - time (sec): 75.29 - samples/sec: 413.74 - lr: 0.000028 - momentum: 0.000000
2023-10-19 01:16:02,710 epoch 2 - iter 258/432 - loss 0.52556553 - time (sec): 90.91 - samples/sec: 413.06 - lr: 0.000028 - momentum: 0.000000
2023-10-19 01:16:16,574 epoch 2 - iter 301/432 - loss 0.51270046 - time (sec): 104.77 - samples/sec: 415.26 - lr: 0.000028 - momentum: 0.000000
2023-10-19 01:16:31,736 epoch 2 - iter 344/432 - loss 0.49960162 - time (sec): 119.94 - samples/sec: 414.74 - lr: 0.000027 - momentum: 0.000000
2023-10-19 01:16:47,479 epoch 2 - iter 387/432 - loss 0.48792945 - time (sec): 135.68 - samples/sec: 409.73 - lr: 0.000027 - momentum: 0.000000
2023-10-19 01:17:02,658 epoch 2 - iter 430/432 - loss 0.47759296 - time (sec): 150.86 - samples/sec: 408.74 - lr: 0.000027 - momentum: 0.000000
2023-10-19 01:17:03,329 ----------------------------------------------------------------------------------------------------
2023-10-19 01:17:03,329 EPOCH 2 done: loss 0.4774 - lr: 0.000027
2023-10-19 01:17:16,639 DEV : loss 0.3671756088733673 - f1-score (micro avg)  0.7689
2023-10-19 01:17:16,662 saving best model
2023-10-19 01:17:17,961 ----------------------------------------------------------------------------------------------------
2023-10-19 01:17:33,883 epoch 3 - iter 43/432 - loss 0.30082443 - time (sec): 15.92 - samples/sec: 383.21 - lr: 0.000026 - momentum: 0.000000
2023-10-19 01:17:48,498 epoch 3 - iter 86/432 - loss 0.31439609 - time (sec): 30.54 - samples/sec: 398.31 - lr: 0.000026 - momentum: 0.000000
2023-10-19 01:18:04,454 epoch 3 - iter 129/432 - loss 0.30878822 - time (sec): 46.49 - samples/sec: 395.42 - lr: 0.000026 - momentum: 0.000000
2023-10-19 01:18:20,430 epoch 3 - iter 172/432 - loss 0.30542878 - time (sec): 62.47 - samples/sec: 388.42 - lr: 0.000025 - momentum: 0.000000
2023-10-19 01:18:35,450 epoch 3 - iter 215/432 - loss 0.30005019 - time (sec): 77.49 - samples/sec: 392.10 - lr: 0.000025 - momentum: 0.000000
2023-10-19 01:18:50,219 epoch 3 - iter 258/432 - loss 0.30250277 - time (sec): 92.26 - samples/sec: 398.49 - lr: 0.000025 - momentum: 0.000000
2023-10-19 01:19:05,708 epoch 3 - iter 301/432 - loss 0.30560660 - time (sec): 107.75 - samples/sec: 397.21 - lr: 0.000024 - momentum: 0.000000
2023-10-19 01:19:20,037 epoch 3 - iter 344/432 - loss 0.30515807 - time (sec): 122.07 - samples/sec: 402.40 - lr: 0.000024 - momentum: 0.000000
2023-10-19 01:19:34,892 epoch 3 - iter 387/432 - loss 0.30122812 - time (sec): 136.93 - samples/sec: 403.45 - lr: 0.000024 - momentum: 0.000000
2023-10-19 01:19:50,610 epoch 3 - iter 430/432 - loss 0.29729232 - time (sec): 152.65 - samples/sec: 403.63 - lr: 0.000023 - momentum: 0.000000
2023-10-19 01:19:51,120 ----------------------------------------------------------------------------------------------------
2023-10-19 01:19:51,120 EPOCH 3 done: loss 0.2972 - lr: 0.000023
2023-10-19 01:20:04,739 DEV : loss 0.3239019811153412 - f1-score (micro avg)  0.8084
2023-10-19 01:20:04,763 saving best model
2023-10-19 01:20:06,054 ----------------------------------------------------------------------------------------------------
2023-10-19 01:20:20,480 epoch 4 - iter 43/432 - loss 0.21669863 - time (sec): 14.42 - samples/sec: 428.02 - lr: 0.000023 - momentum: 0.000000
2023-10-19 01:20:35,146 epoch 4 - iter 86/432 - loss 0.20451815 - time (sec): 29.09 - samples/sec: 426.60 - lr: 0.000023 - momentum: 0.000000
2023-10-19 01:20:50,293 epoch 4 - iter 129/432 - loss 0.21364175 - time (sec): 44.24 - samples/sec: 421.20 - lr: 0.000022 - momentum: 0.000000
2023-10-19 01:21:05,859 epoch 4 - iter 172/432 - loss 0.21881045 - time (sec): 59.80 - samples/sec: 415.69 - lr: 0.000022 - momentum: 0.000000
2023-10-19 01:21:21,185 epoch 4 - iter 215/432 - loss 0.22071577 - time (sec): 75.13 - samples/sec: 408.92 - lr: 0.000022 - momentum: 0.000000
2023-10-19 01:21:35,428 epoch 4 - iter 258/432 - loss 0.21878871 - time (sec): 89.37 - samples/sec: 411.30 - lr: 0.000021 - momentum: 0.000000
2023-10-19 01:21:50,207 epoch 4 - iter 301/432 - loss 0.21383356 - time (sec): 104.15 - samples/sec: 410.97 - lr: 0.000021 - momentum: 0.000000
2023-10-19 01:22:04,657 epoch 4 - iter 344/432 - loss 0.21198471 - time (sec): 118.60 - samples/sec: 417.17 - lr: 0.000021 - momentum: 0.000000
2023-10-19 01:22:20,290 epoch 4 - iter 387/432 - loss 0.21285777 - time (sec): 134.24 - samples/sec: 411.41 - lr: 0.000020 - momentum: 0.000000
2023-10-19 01:22:35,747 epoch 4 - iter 430/432 - loss 0.21143223 - time (sec): 149.69 - samples/sec: 411.26 - lr: 0.000020 - momentum: 0.000000
2023-10-19 01:22:36,321 ----------------------------------------------------------------------------------------------------
2023-10-19 01:22:36,322 EPOCH 4 done: loss 0.2117 - lr: 0.000020
2023-10-19 01:22:49,686 DEV : loss 0.30530545115470886 - f1-score (micro avg)  0.8194
2023-10-19 01:22:49,710 saving best model
2023-10-19 01:22:51,003 ----------------------------------------------------------------------------------------------------
2023-10-19 01:23:05,732 epoch 5 - iter 43/432 - loss 0.15718174 - time (sec): 14.73 - samples/sec: 394.91 - lr: 0.000020 - momentum: 0.000000
2023-10-19 01:23:20,374 epoch 5 - iter 86/432 - loss 0.15300782 - time (sec): 29.37 - samples/sec: 406.10 - lr: 0.000019 - momentum: 0.000000
2023-10-19 01:23:34,910 epoch 5 - iter 129/432 - loss 0.15938683 - time (sec): 43.91 - samples/sec: 417.94 - lr: 0.000019 - momentum: 0.000000
2023-10-19 01:23:48,967 epoch 5 - iter 172/432 - loss 0.15808067 - time (sec): 57.96 - samples/sec: 426.26 - lr: 0.000019 - momentum: 0.000000
2023-10-19 01:24:03,719 epoch 5 - iter 215/432 - loss 0.16506791 - time (sec): 72.71 - samples/sec: 424.28 - lr: 0.000018 - momentum: 0.000000
2023-10-19 01:24:19,437 epoch 5 - iter 258/432 - loss 0.16346937 - time (sec): 88.43 - samples/sec: 417.81 - lr: 0.000018 - momentum: 0.000000
2023-10-19 01:24:34,321 epoch 5 - iter 301/432 - loss 0.16131309 - time (sec): 103.32 - samples/sec: 416.84 - lr: 0.000018 - momentum: 0.000000
2023-10-19 01:24:49,426 epoch 5 - iter 344/432 - loss 0.16170570 - time (sec): 118.42 - samples/sec: 415.27 - lr: 0.000017 - momentum: 0.000000
2023-10-19 01:25:03,964 epoch 5 - iter 387/432 - loss 0.16232398 - time (sec): 132.96 - samples/sec: 417.96 - lr: 0.000017 - momentum: 0.000000
2023-10-19 01:25:19,256 epoch 5 - iter 430/432 - loss 0.16167487 - time (sec): 148.25 - samples/sec: 416.09 - lr: 0.000017 - momentum: 0.000000
2023-10-19 01:25:19,736 ----------------------------------------------------------------------------------------------------
2023-10-19 01:25:19,736 EPOCH 5 done: loss 0.1620 - lr: 0.000017
2023-10-19 01:25:32,940 DEV : loss 0.321034699678421 - f1-score (micro avg)  0.8198
2023-10-19 01:25:32,965 saving best model
2023-10-19 01:25:34,290 ----------------------------------------------------------------------------------------------------
2023-10-19 01:25:50,079 epoch 6 - iter 43/432 - loss 0.11198163 - time (sec): 15.79 - samples/sec: 387.46 - lr: 0.000016 - momentum: 0.000000
2023-10-19 01:26:05,066 epoch 6 - iter 86/432 - loss 0.11365943 - time (sec): 30.77 - samples/sec: 396.18 - lr: 0.000016 - momentum: 0.000000
2023-10-19 01:26:20,043 epoch 6 - iter 129/432 - loss 0.11579195 - time (sec): 45.75 - samples/sec: 408.71 - lr: 0.000016 - momentum: 0.000000
2023-10-19 01:26:34,245 epoch 6 - iter 172/432 - loss 0.12052255 - time (sec): 59.95 - samples/sec: 415.39 - lr: 0.000015 - momentum: 0.000000
2023-10-19 01:26:48,584 epoch 6 - iter 215/432 - loss 0.12402843 - time (sec): 74.29 - samples/sec: 417.44 - lr: 0.000015 - momentum: 0.000000
2023-10-19 01:27:03,787 epoch 6 - iter 258/432 - loss 0.12002060 - time (sec): 89.50 - samples/sec: 414.34 - lr: 0.000015 - momentum: 0.000000
2023-10-19 01:27:19,177 epoch 6 - iter 301/432 - loss 0.12019028 - time (sec): 104.89 - samples/sec: 411.10 - lr: 0.000014 - momentum: 0.000000
2023-10-19 01:27:34,518 epoch 6 - iter 344/432 - loss 0.12027080 - time (sec): 120.23 - samples/sec: 412.59 - lr: 0.000014 - momentum: 0.000000
2023-10-19 01:27:50,024 epoch 6 - iter 387/432 - loss 0.12168734 - time (sec): 135.73 - samples/sec: 409.94 - lr: 0.000014 - momentum: 0.000000
2023-10-19 01:28:04,990 epoch 6 - iter 430/432 - loss 0.12485490 - time (sec): 150.70 - samples/sec: 409.14 - lr: 0.000013 - momentum: 0.000000
2023-10-19 01:28:05,672 ----------------------------------------------------------------------------------------------------
2023-10-19 01:28:05,673 EPOCH 6 done: loss 0.1248 - lr: 0.000013
2023-10-19 01:28:18,762 DEV : loss 0.33496347069740295 - f1-score (micro avg)  0.8301
2023-10-19 01:28:18,786 saving best model
2023-10-19 01:28:20,656 ----------------------------------------------------------------------------------------------------
2023-10-19 01:28:36,296 epoch 7 - iter 43/432 - loss 0.09890459 - time (sec): 15.64 - samples/sec: 398.70 - lr: 0.000013 - momentum: 0.000000
2023-10-19 01:28:50,948 epoch 7 - iter 86/432 - loss 0.09672663 - time (sec): 30.29 - samples/sec: 424.23 - lr: 0.000013 - momentum: 0.000000
2023-10-19 01:29:05,164 epoch 7 - iter 129/432 - loss 0.10195590 - time (sec): 44.51 - samples/sec: 420.61 - lr: 0.000012 - momentum: 0.000000
2023-10-19 01:29:20,590 epoch 7 - iter 172/432 - loss 0.10025118 - time (sec): 59.93 - samples/sec: 414.40 - lr: 0.000012 - momentum: 0.000000
2023-10-19 01:29:36,554 epoch 7 - iter 215/432 - loss 0.10219041 - time (sec): 75.90 - samples/sec: 409.06 - lr: 0.000012 - momentum: 0.000000
2023-10-19 01:29:52,459 epoch 7 - iter 258/432 - loss 0.10239845 - time (sec): 91.80 - samples/sec: 402.76 - lr: 0.000011 - momentum: 0.000000
2023-10-19 01:30:07,118 epoch 7 - iter 301/432 - loss 0.10348027 - time (sec): 106.46 - samples/sec: 404.05 - lr: 0.000011 - momentum: 0.000000
2023-10-19 01:30:21,327 epoch 7 - iter 344/432 - loss 0.10248971 - time (sec): 120.67 - samples/sec: 406.93 - lr: 0.000011 - momentum: 0.000000
2023-10-19 01:30:36,374 epoch 7 - iter 387/432 - loss 0.10201551 - time (sec): 135.72 - samples/sec: 407.72 - lr: 0.000010 - momentum: 0.000000
2023-10-19 01:30:51,938 epoch 7 - iter 430/432 - loss 0.10240589 - time (sec): 151.28 - samples/sec: 407.48 - lr: 0.000010 - momentum: 0.000000
2023-10-19 01:30:52,652 ----------------------------------------------------------------------------------------------------
2023-10-19 01:30:52,653 EPOCH 7 done: loss 0.1023 - lr: 0.000010
2023-10-19 01:31:05,779 DEV : loss 0.3334580063819885 - f1-score (micro avg)  0.841
2023-10-19 01:31:05,803 saving best model
2023-10-19 01:31:07,095 ----------------------------------------------------------------------------------------------------
2023-10-19 01:31:21,749 epoch 8 - iter 43/432 - loss 0.07715346 - time (sec): 14.65 - samples/sec: 397.05 - lr: 0.000010 - momentum: 0.000000
2023-10-19 01:31:36,808 epoch 8 - iter 86/432 - loss 0.08026845 - time (sec): 29.71 - samples/sec: 406.77 - lr: 0.000009 - momentum: 0.000000
2023-10-19 01:31:51,972 epoch 8 - iter 129/432 - loss 0.07932378 - time (sec): 44.88 - samples/sec: 418.29 - lr: 0.000009 - momentum: 0.000000
2023-10-19 01:32:06,196 epoch 8 - iter 172/432 - loss 0.07970603 - time (sec): 59.10 - samples/sec: 418.22 - lr: 0.000009 - momentum: 0.000000
2023-10-19 01:32:22,269 epoch 8 - iter 215/432 - loss 0.08310639 - time (sec): 75.17 - samples/sec: 411.59 - lr: 0.000008 - momentum: 0.000000
2023-10-19 01:32:38,561 epoch 8 - iter 258/432 - loss 0.08405516 - time (sec): 91.46 - samples/sec: 410.16 - lr: 0.000008 - momentum: 0.000000
2023-10-19 01:32:55,145 epoch 8 - iter 301/432 - loss 0.08440344 - time (sec): 108.05 - samples/sec: 405.80 - lr: 0.000008 - momentum: 0.000000
2023-10-19 01:33:09,512 epoch 8 - iter 344/432 - loss 0.08349681 - time (sec): 122.42 - samples/sec: 408.17 - lr: 0.000007 - momentum: 0.000000
2023-10-19 01:33:23,875 epoch 8 - iter 387/432 - loss 0.08301177 - time (sec): 136.78 - samples/sec: 408.43 - lr: 0.000007 - momentum: 0.000000
2023-10-19 01:33:38,932 epoch 8 - iter 430/432 - loss 0.08343770 - time (sec): 151.84 - samples/sec: 405.83 - lr: 0.000007 - momentum: 0.000000
2023-10-19 01:33:39,452 ----------------------------------------------------------------------------------------------------
2023-10-19 01:33:39,452 EPOCH 8 done: loss 0.0833 - lr: 0.000007
2023-10-19 01:33:53,260 DEV : loss 0.353408545255661 - f1-score (micro avg)  0.8389
2023-10-19 01:33:53,290 ----------------------------------------------------------------------------------------------------
2023-10-19 01:34:07,214 epoch 9 - iter 43/432 - loss 0.06833600 - time (sec): 13.92 - samples/sec: 438.65 - lr: 0.000006 - momentum: 0.000000
2023-10-19 01:34:21,380 epoch 9 - iter 86/432 - loss 0.05877407 - time (sec): 28.09 - samples/sec: 445.31 - lr: 0.000006 - momentum: 0.000000
2023-10-19 01:34:35,471 epoch 9 - iter 129/432 - loss 0.05787196 - time (sec): 42.18 - samples/sec: 437.56 - lr: 0.000006 - momentum: 0.000000
2023-10-19 01:34:50,284 epoch 9 - iter 172/432 - loss 0.06129648 - time (sec): 56.99 - samples/sec: 435.62 - lr: 0.000005 - momentum: 0.000000
2023-10-19 01:35:05,160 epoch 9 - iter 215/432 - loss 0.06250295 - time (sec): 71.87 - samples/sec: 432.52 - lr: 0.000005 - momentum: 0.000000
2023-10-19 01:35:20,325 epoch 9 - iter 258/432 - loss 0.06541137 - time (sec): 87.03 - samples/sec: 428.16 - lr: 0.000005 - momentum: 0.000000
2023-10-19 01:35:35,910 epoch 9 - iter 301/432 - loss 0.06884328 - time (sec): 102.62 - samples/sec: 422.22 - lr: 0.000004 - momentum: 0.000000
2023-10-19 01:35:50,902 epoch 9 - iter 344/432 - loss 0.07066233 - time (sec): 117.61 - samples/sec: 419.43 - lr: 0.000004 - momentum: 0.000000
2023-10-19 01:36:06,777 epoch 9 - iter 387/432 - loss 0.07187549 - time (sec): 133.49 - samples/sec: 414.04 - lr: 0.000004 - momentum: 0.000000
2023-10-19 01:36:21,720 epoch 9 - iter 430/432 - loss 0.07149832 - time (sec): 148.43 - samples/sec: 415.86 - lr: 0.000003 - momentum: 0.000000
2023-10-19 01:36:22,264 ----------------------------------------------------------------------------------------------------
2023-10-19 01:36:22,264 EPOCH 9 done: loss 0.0716 - lr: 0.000003
2023-10-19 01:36:35,557 DEV : loss 0.3648279905319214 - f1-score (micro avg)  0.8495
2023-10-19 01:36:35,582 saving best model
2023-10-19 01:36:36,891 ----------------------------------------------------------------------------------------------------
2023-10-19 01:36:52,640 epoch 10 - iter 43/432 - loss 0.04889939 - time (sec): 15.75 - samples/sec: 402.72 - lr: 0.000003 - momentum: 0.000000
2023-10-19 01:37:07,560 epoch 10 - iter 86/432 - loss 0.05311027 - time (sec): 30.67 - samples/sec: 413.88 - lr: 0.000003 - momentum: 0.000000
2023-10-19 01:37:21,347 epoch 10 - iter 129/432 - loss 0.05478559 - time (sec): 44.45 - samples/sec: 418.04 - lr: 0.000002 - momentum: 0.000000
2023-10-19 01:37:36,295 epoch 10 - iter 172/432 - loss 0.05290573 - time (sec): 59.40 - samples/sec: 418.92 - lr: 0.000002 - momentum: 0.000000
2023-10-19 01:37:50,899 epoch 10 - iter 215/432 - loss 0.05619202 - time (sec): 74.01 - samples/sec: 417.36 - lr: 0.000002 - momentum: 0.000000
2023-10-19 01:38:05,622 epoch 10 - iter 258/432 - loss 0.05730337 - time (sec): 88.73 - samples/sec: 415.93 - lr: 0.000001 - momentum: 0.000000
2023-10-19 01:38:20,801 epoch 10 - iter 301/432 - loss 0.05650763 - time (sec): 103.91 - samples/sec: 416.31 - lr: 0.000001 - momentum: 0.000000
2023-10-19 01:38:36,560 epoch 10 - iter 344/432 - loss 0.05713604 - time (sec): 119.67 - samples/sec: 412.30 - lr: 0.000001 - momentum: 0.000000
2023-10-19 01:38:50,599 epoch 10 - iter 387/432 - loss 0.05915272 - time (sec): 133.71 - samples/sec: 416.97 - lr: 0.000000 - momentum: 0.000000
2023-10-19 01:39:06,390 epoch 10 - iter 430/432 - loss 0.05855833 - time (sec): 149.50 - samples/sec: 412.06 - lr: 0.000000 - momentum: 0.000000
2023-10-19 01:39:06,987 ----------------------------------------------------------------------------------------------------
2023-10-19 01:39:06,987 EPOCH 10 done: loss 0.0588 - lr: 0.000000
2023-10-19 01:39:20,265 DEV : loss 0.3677222728729248 - f1-score (micro avg)  0.848
2023-10-19 01:39:20,773 ----------------------------------------------------------------------------------------------------
2023-10-19 01:39:20,775 Loading model from best epoch ...
2023-10-19 01:39:23,129 SequenceTagger predicts: Dictionary with 81 tags: O, S-location-route, B-location-route, E-location-route, I-location-route, S-location-stop, B-location-stop, E-location-stop, I-location-stop, S-trigger, B-trigger, E-trigger, I-trigger, S-organization-company, B-organization-company, E-organization-company, I-organization-company, S-location-city, B-location-city, E-location-city, I-location-city, S-location, B-location, E-location, I-location, S-event-cause, B-event-cause, E-event-cause, I-event-cause, S-location-street, B-location-street, E-location-street, I-location-street, S-time, B-time, E-time, I-time, S-date, B-date, E-date, I-date, S-number, B-number, E-number, I-number, S-duration, B-duration, E-duration, I-duration, S-organization
2023-10-19 01:39:41,029 
Results:
- F-score (micro) 0.7588
- F-score (macro) 0.5671
- Accuracy 0.6563

By class:
                      precision    recall  f1-score   support

             trigger     0.7137    0.5954    0.6492       833
       location-stop     0.8420    0.8288    0.8353       765
            location     0.8053    0.8271    0.8160       665
       location-city     0.7987    0.8834    0.8389       566
                date     0.8773    0.8350    0.8557       394
     location-street     0.9366    0.8808    0.9079       386
                time     0.7766    0.8828    0.8263       256
      location-route     0.8025    0.6866    0.7400       284
organization-company     0.7936    0.6865    0.7362       252
              number     0.6632    0.8456    0.7434       149
            distance     0.9824    1.0000    0.9911       167
            duration     0.3205    0.3067    0.3135       163
         event-cause     0.0000    0.0000    0.0000         0
       disaster-type     0.7826    0.2609    0.3913        69
        organization     0.4839    0.5357    0.5085        28
              person     0.4737    0.9000    0.6207        10
                 set     0.0000    0.0000    0.0000         0
        org-position     0.0000    0.0000    0.0000         1
               money     0.0000    0.0000    0.0000         0

           micro avg     0.7504    0.7674    0.7588      4988
           macro avg     0.5817    0.5766    0.5671      4988
        weighted avg     0.7914    0.7674    0.7752      4988

2023-10-19 01:39:41,029 ----------------------------------------------------------------------------------------------------