---
library_name: transformers
datasets:
- MoritzLaurer/synthetic_zeroshot_mixtral_v0.1
language:
- en
base_model:
- answerdotai/ModernBERT-large
pipeline_tag: zero-shot-classification
license: mit
---
## Model Description

This model is a fine-tuned version of ModernBERT-large for Natural Language Inference (NLI). It was trained on the MoritzLaurer/synthetic_zeroshot_mixtral_v0.1 dataset and is designed to perform zero-shot classification.
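NLI-based zero-shot classification typically works by casting each candidate label as an entailment hypothesis against the input text. The sketch below illustrates that mechanism manually; the hypothesis template and the entailment class index are assumptions, so verify the latter via `model.config.label2id` for this checkpoint.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "r-f/ModernBERT-large-zeroshot-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

premise = "I want to be an actor."
candidate_labels = ["space", "economy", "entertainment"]

scores = []
for label in candidate_labels:
    # Assumption: the default template used by the zero-shot pipeline
    hypothesis = f"This example is {label}."
    inputs = tokenizer(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    # Assumption: the index of the "entailment" class; check model.config.label2id
    entailment_id = model.config.label2id.get("entailment", 0)
    scores.append(logits.softmax(dim=-1)[entailment_id].item())

for label, score in sorted(zip(candidate_labels, scores), key=lambda p: -p[1]):
    print(f"{label}: {score:.4f}")
```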
## Model Overview
- Model Type: ModernBERT-large (BERT variant)
- Task: Zero-shot Classification
- Languages: English
- Dataset: MoritzLaurer/synthetic_zeroshot_mixtral_v0.1
- Fine-Tuning: Fine-tuned for Zero-shot Classification
## Performance Metrics

Evaluation results are to be added. The following metrics will be reported:
- Training Loss: Measures the model's fit to the training data.
- Validation Loss: Measures the model's generalization to unseen data.
- Accuracy: The percentage of correct predictions over all examples.
- F1 Score: The harmonic mean of precision and recall (see the computation sketch below).
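For reference, a minimal sketch of how these metrics are typically computed with scikit-learn; the predictions and gold labels here are placeholders for illustration only:

```python
from sklearn.metrics import accuracy_score, f1_score

# Placeholder gold labels and predictions, for illustration only
y_true = [0, 1, 2, 1, 0]
y_pred = [0, 1, 1, 1, 0]

print("Accuracy:", accuracy_score(y_true, y_pred))             # fraction of correct predictions
print("Macro F1:", f1_score(y_true, y_pred, average="macro"))  # per-class F1, averaged over classes
```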
## Installation and Example Usage

```bash
pip install transformers torch datasets
```
```python
from transformers import pipeline

# Load the fine-tuned zero-shot classifier from the Hugging Face Hub
classifier = pipeline("zero-shot-classification", model="r-f/ModernBERT-large-zeroshot-v1")

sequence_to_classify = "I want to be an actor."
candidate_labels = ["space", "economy", "entertainment"]

# multi_label=False: the candidate labels compete, so the scores sum to 1
output = classifier(sequence_to_classify, candidate_labels, multi_label=False)
print(output)
# {'sequence': 'I want to be an actor.', 'labels': ['entertainment', 'space', 'economy'], 'scores': [0.9614731073379517, 0.028852475807070732, 0.009674412198364735]}
```
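When several labels can apply to the same text, the pipeline's `multi_label=True` mode scores each label independently (a sigmoid per label rather than a softmax across labels), so the scores no longer sum to 1. A short sketch with a hypothetical example sentence:

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="r-f/ModernBERT-large-zeroshot-v1")

# multi_label=True: each label is scored independently; scores do not sum to 1
output = classifier(
    "The new space telescope contract will boost the satellite industry.",  # hypothetical example text
    ["space", "economy", "entertainment"],
    multi_label=True,
)
print(list(zip(output["labels"], output["scores"])))
```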
## Model Card
- Model Name: ModernBERT-large-zeroshot-v1
- Hugging Face Repo: r-f/ModernBERT-large-zeroshot-v1
- License: MIT
- Date: 23-12-2024
## Training Details
- Model: ModernBERT (Large variant)
- Framework: PyTorch
- Batch Size: 32
- Learning Rate: 2e-5
- Optimizer: AdamW
- Hardware: RTX 4090
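The exact training script is not reproduced here, but a minimal fine-tuning sketch consistent with the hyperparameters above might look like the following. The dataset config, column names, label count, and epoch count are all assumptions; check the dataset card for the actual schema.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Config name omitted; the dataset may require one (see its card)
dataset = load_dataset("MoritzLaurer/synthetic_zeroshot_mixtral_v0.1")

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-large",
    num_labels=2,  # assumption: entailment / not_entailment
)

def tokenize(batch):
    # "text" and "hypothesis" are hypothetical column names
    return tokenizer(batch["text"], batch["hypothesis"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="ModernBERT-large-zeroshot-v1",
    per_device_train_batch_size=32,  # from the card
    learning_rate=2e-5,              # from the card
    optim="adamw_torch",             # AdamW, from the card
    num_train_epochs=1,              # assumption
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    tokenizer=tokenizer,
)
trainer.train()
```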
## Acknowledgments

- The model was trained on the MoritzLaurer/synthetic_zeroshot_mixtral_v0.1 dataset, and the training script was adapted from MoritzLaurer/zeroshot-classifier.
- Special thanks to the Hugging Face community and all contributors to the transformers library.
## License
This model is licensed under the MIT License. See the LICENSE file for more details.