This is the SwissBERT model fine-tuned on the WikiNEuRal dataset for multilingual named entity recognition (NER).

The model supports German, French and Italian as supervised languages, and Romansh Grischun as a zero-shot language.

Usage

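from transformers import pipeline

# The task (token classification) is inferred from the model configuration.
# aggregation_strategy="simple" merges sub-word tokens into whole entities.
token_classifier = pipeline(
    model="ZurichNLP/swissbert-ner",
    aggregation_strategy="simple",
)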

German example

token_classifier.model.set_default_language("de_CH")
token_classifier("Mein Name sei Gantenbein.")

Output:

[{'entity_group': 'PER',
  'score': 0.5002625,
  'word': 'Gantenbein',
  'start': 13,
  'end': 24}]
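
Note that `start` is 13 in the output above, although "Gantenbein" begins at character 14 of the input: the reported span includes the preceding space (presumably because the tokenizer attaches it to the word). The following is a minimal sketch of how the offsets could be trimmed before slicing the original text. The helper name `entity_spans` is hypothetical and not part of the model or of `transformers`; it only assumes the output format shown above.

```python
def entity_spans(text, entities):
    """Return (label, surface, start, end) tuples with whitespace trimmed.

    `entities` is assumed to be a list of dicts in the format shown above,
    with 'entity_group', 'start' and 'end' keys.
    """
    spans = []
    for ent in entities:
        start, end = ent["start"], ent["end"]
        # Move both boundaries inward past any whitespace inside the span.
        while start < end and text[start].isspace():
            start += 1
        while end > start and text[end - 1].isspace():
            end -= 1
        spans.append((ent["entity_group"], text[start:end], start, end))
    return spans


# Input sentence and pipeline output copied from the German example.
text = "Mein Name sei Gantenbein."
entities = [
    {"entity_group": "PER", "score": 0.5002625,
     "word": "Gantenbein", "start": 13, "end": 24},
]
print(entity_spans(text, entities))
# → [('PER', 'Gantenbein', 14, 24)]
```

Trimming on the original text rather than relying on the `word` field keeps the offsets usable for highlighting or annotation export.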

French example

token_classifier.model.set_default_language("fr_CH")
token_classifier("J'habite à Lausanne.")

Output:

[{'entity_group': 'LOC',
  'score': 0.99955386,
  'word': 'Lausanne',
  'start': 10,
  'end': 19}]
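
Both examples follow the same pattern: set the default language on the model, then call the pipeline. A small wrapper can make that switch explicit for each call. This is only a sketch: the function `tag` and the dictionary `LANGUAGE_CODES` are hypothetical names, and the codes `it_CH` and `rm_CH` for Italian and Romansh are assumptions that follow the pattern of the `de_CH` and `fr_CH` codes used above.

```python
# Language codes taken from the examples above, plus two assumed codes.
LANGUAGE_CODES = {
    "german": "de_CH",
    "french": "fr_CH",
    "italian": "it_CH",   # assumption: same naming pattern as de_CH / fr_CH
    "romansh": "rm_CH",   # assumption: same naming pattern as de_CH / fr_CH
}


def tag(classifier, text, language):
    """Select the language on the model, then run the classifier on `text`.

    `classifier` is expected to behave like the `token_classifier` pipeline
    created in the Usage section. Raises KeyError for an unknown language.
    """
    code = LANGUAGE_CODES[language.lower()]
    classifier.model.set_default_language(code)
    return classifier(text)
```

With the pipeline from the Usage section, the French example would then read `tag(token_classifier, "J'habite à Lausanne.", "french")`.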

Citation

@article{vamvas-etal-2023-swissbert,
      title={Swiss{BERT}: The Multilingual Language Model for Switzerland}, 
      author={Jannis Vamvas and Johannes Gra\"en and Rico Sennrich},
      year={2023},
      eprint={2303.13310},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2303.13310}
}