Update benchmark results
README.md
CHANGED
@@ -18,7 +18,7 @@ language:
 
 # OpenHermes - Mixtral 8x7B
 
-![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6440872be44f30a723256163/
+![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6440872be44f30a723256163/3Gvl__aGtP4AHxzx9NoXX.jpeg)
 
 ## Model Card
 OpenHermes Mixtral 8x7B - a state of the art Mixtral Fine-tune.
@@ -27,25 +27,18 @@ Huge thank you to [Teknium](https://huggingface.co/datasets/teknium) for open-so
 
 This model was trained on the [OpenHermes dataset](https://huggingface.co/datasets/teknium/openhermes) for 3 epochs
 
-##
-
-
-
-
-
-
-
-
-
-
-
-|    Task     |Version|Metric|Value |   |Stderr|
-|-------------|------:|------|-----:|---|-----:|
-|truthfulqa_mc|      1|mc1   |0.4272|±  |0.0173|
-|             |       |mc2   |0.5865|±  |0.0160|
-```
-
-More benchmarks coming soon!
+## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_orangetin__OpenHermes-Mixtral-8x7B)
+
+| Metric                | Value                     |
+|-----------------------|---------------------------|
+| Avg.                  | 65.27                     |
+| ARC (25-shot)         | 63.91                     |
+| HellaSwag (10-shot)   | 84.14                     |
+| MMLU (5-shot)         | 64.29                     |
+| TruthfulQA (0-shot)   | 59.53                     |
+| Winogrande (5-shot)   | 74.03                     |
+| GSM8K (5-shot)        | 45.72                     |
 
 # Prompt Format
 
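Note on the added table: the `Avg.` row looks like the unweighted mean of the six benchmark scores listed beneath it. The commit does not state how the average was computed, so that is an assumption. The sketch below is a quick sanity check under that assumption; the helper name `leaderboard_average` is hypothetical and the scores are copied from the added rows.

```python
# Scores copied verbatim from the table added to README.md in this commit.
scores = {
    "ARC (25-shot)": 63.91,
    "HellaSwag (10-shot)": 84.14,
    "MMLU (5-shot)": 64.29,
    "TruthfulQA (0-shot)": 59.53,
    "Winogrande (5-shot)": 74.03,
    "GSM8K (5-shot)": 45.72,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean rounded to two decimals.

    Assumption: the README's "Avg." row is a plain mean of the six
    benchmark scores; the commit itself does not say so.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. = {average}")
```

Under that assumption the computed value agrees with the 65.27 reported in the `Avg.` row.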