Update README.md
README.md
@@ -1,5 +1,15 @@
---
language: de, en

```
HasAns_exact = 85.79622132253711
@@ -8,12 +18,36 @@ HasAns_total = 5928
NoAns_exact = 94.76871320437343
NoAns_f1 = 94.76871320437343
NoAns_total = 5945
-best_exact = 90.28889076054915
-best_exact_thresh = 0.0
-best_f1 = 92.84713483219731
-best_f1_thresh = 0.0
-epoch = 3.0
exact = 90.28889076054915
f1 = 92.84713483219753
total = 11873
```
---
language: de, en
+---
+
+# Bilingual English + German SQuAD2.0
+
+We created a German version of SQuAD 2.0 (deQuAD) and merged it with [**SQuAD2.0**](https://rajpurkar.github.io/SQuAD-explorer/) into a combined English and German training set for question answering. We then fine-tuned [**bert-base-multilingual-cased**](https://github.com/google-research/bert/blob/master/multilingual.md) on this merged data for the bilingual QA downstream task.
+
+# Details of deQuAD 2.0
+[**SQuAD2.0**](https://rajpurkar.github.io/SQuAD-explorer/) was automatically translated into German. We hired professional editors to proofread the translations, correct mistakes, and double-check the answers to further polish the text and improve annotation quality. The final German dataset contains **130k** training and **11k** test samples.
+
+Evaluation on English SQuAD2.0

```
HasAns_exact = 85.79622132253711
@@ -8,12 +18,36 @@ HasAns_total = 5928
NoAns_exact = 94.76871320437343
NoAns_f1 = 94.76871320437343
NoAns_total = 5945
exact = 90.28889076054915
f1 = 92.84713483219753
total = 11873
```
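
The overall `exact` score reported above is consistent with a count-weighted average of the answerable (`HasAns`) and unanswerable (`NoAns`) subsets. The sketch below reproduces that arithmetic using only numbers shown in this diff (`HasAns_total = 5928` appears in the hunk header). The function name `weighted_score` is hypothetical and chosen for illustration. The same check cannot be done for `f1` here, because `HasAns_f1` falls in the part of the file the diff does not show.

```python
def weighted_score(has_ans_score, has_ans_total, no_ans_score, no_ans_total):
    """Combine per-subset scores (in percent) into one overall score,
    weighting each subset by its number of questions."""
    total = has_ans_total + no_ans_total
    return (has_ans_score * has_ans_total + no_ans_score * no_ans_total) / total


# Values copied from the evaluation block and the hunk header.
overall_exact = weighted_score(
    has_ans_score=85.79622132253711, has_ans_total=5928,
    no_ans_score=94.76871320437343, no_ans_total=5945,
)
print(round(overall_exact, 4))  # 90.2889
```

The two subset sizes also add up to the reported `total` (5928 + 5945 = 11873).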
+
+## Use Model in Pipeline
+
+
+```python
+from transformers import pipeline
+
+qa_pipeline = pipeline(
+    "question-answering",
+    model="deutsche-telekom/bert-multi-english-german-squad2",
+    tokenizer="deutsche-telekom/bert-multi-english-german-squad2"
+)
+
+qa_pipeline({
+    'context': " ",
+    'question': " "})
+
+```
+
+# Output:
+
+```json
+{
+    "score": 0.83,
+    "start": 0,
+    "end": 9,
+    "answer": " "
+}
+```
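
Because SQuAD2.0-style data contains unanswerable questions, a caller may want to treat an empty prediction as "no answer". The following is a minimal sketch under stated assumptions, not part of the model card: `answer_or_none` and `min_score` are hypothetical names, while `handle_impossible_answer` is an existing call-time option of the `transformers` question-answering pipeline. The helper accepts any callable that follows the pipeline's calling convention, so it can be tried with a stub before downloading the model.

```python
from typing import Any, Callable, Dict, Optional


def answer_or_none(
    qa: Callable[..., Dict[str, Any]],
    question: str,
    context: str,
    min_score: float = 0.0,  # hypothetical cutoff, tune for your data
) -> Optional[str]:
    """Return the predicted answer text, or None if the model abstains
    (empty answer) or the score is below `min_score`."""
    result = qa(
        question=question,
        context=context,
        # Allow the pipeline to return an empty answer for
        # questions it considers unanswerable.
        handle_impossible_answer=True,
    )
    answer = result["answer"].strip()
    if not answer or result["score"] < min_score:
        return None
    return answer
```

With the `qa_pipeline` object from the snippet above, a call would look like `answer_or_none(qa_pipeline, question, context)`.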