---
language:
 - da
tags:
 - job postings
 - DaJobBERT
---


# DaJobBERT

This is the DaJobBERT model from:

Mike Zhang, Kristian Nørgaard Jensen, and Barbara Plank. __Kompetencer: Fine-grained Skill Classification in Danish Job Postings via Distant Supervision and Transfer Learning__. Proceedings of the Language Resources and Evaluation Conference (LREC). 2022.

This model is continuously pre-trained from the `dabert-base-uncased` checkpoint (https://huggingface.co/Maltehb/danish-bert-botxo) on ~24.5M Danish sentences from job postings. More information can be found in the paper.
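As a rough illustration of how a checkpoint like the one described above might be queried, here is a minimal sketch using the `transformers` fill-mask pipeline. Several things here are assumptions rather than facts from the paper: `MODEL_ID` is a hypothetical placeholder that must be replaced with this repository's actual Hub ID, the helper names `build_fill_mask`, `top_tokens`, and `demo` are made up for this example, and it assumes the published weights include the masked-language-modelling head.

```python
# Hypothetical placeholder: replace with this repository's actual Hub ID.
MODEL_ID = "your-namespace/dajobbert-base-uncased"


def build_fill_mask(model_id: str = MODEL_ID):
    """Create a fill-mask pipeline (transformers is imported lazily)."""
    from transformers import pipeline

    return pipeline("fill-mask", model=model_id)


def top_tokens(predictions, k=3):
    """Return the k highest-scoring (token, score) pairs from fill-mask output."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"], round(p["score"], 4)) for p in ranked[:k]]


def demo():
    """Fill one masked word in an illustrative Danish job-posting sentence."""
    fill_mask = build_fill_mask()
    # Read the mask token from the tokenizer instead of hard-coding it.
    mask = fill_mask.tokenizer.mask_token
    sentence = f"Vi søger en {mask} med erfaring i Python."
    for token, score in top_tokens(fill_mask(sentence)):
        print(token, score)
```

Calling `demo()` downloads the weights on first use and prints the top candidate tokens for the masked position. For skill classification as studied in the paper, the checkpoint would instead be fine-tuned with a task-specific head.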

If you use this model, please cite the following paper:

```
@InProceedings{zhang-jensen-plank:2022:LREC,
  author    = {Zhang, Mike  and  Jensen, Kristian N{\o}rgaard  and  Plank, Barbara},
  title     = {Kompetencer: Fine-grained Skill Classification in Danish Job Postings via Distant Supervision and Transfer Learning},
  booktitle      = {Proceedings of the Language Resources and Evaluation Conference},
  month          = {June},
  year           = {2022},
  address        = {Marseille, France},
  publisher      = {European Language Resources Association},
  pages     = {436--447},
  abstract  = {Skill Classification (SC) is the task of classifying job competences from job postings. This work is the first in SC applied to Danish job vacancy data. We release the first Danish job posting dataset: *Kompetencer* (\_en\_: competences), annotated for nested spans of competences. To improve upon coarse-grained annotations, we make use of The European Skills, Competences, Qualifications and Occupations (ESCO; le Vrang et al., (2014)) taxonomy API to obtain fine-grained labels via distant supervision. We study two setups: The zero-shot and few-shot classification setting. We fine-tune English-based models and RemBERT (Chung et al., 2020) and compare them to in-language Danish models. Our results show RemBERT significantly outperforms all other models in both the zero-shot and the few-shot setting.},
  url       = {https://aclanthology.org/2022.lrec-1.46}
}
```