Daemontatox
committed on
Adding Evaluation Results (#2)
- Adding Evaluation Results (9f15ffe81d928c128e885df907009da76e59ed17)
README.md
CHANGED
@@ -172,3 +172,18 @@ Summarized results can be found [here](https://huggingface.co/datasets/open-llm-
 |MuSR (0-shot)      |    20.05|
 |MMLU-PRO (5-shot)  |    52.86|
 
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Daemontatox__PathFinderAi3.0-details)!
+Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox%2FPathFinderAi3.0&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
+
+|      Metric       |Value (%)|
+|-------------------|--------:|
+|**Average**        |    40.11|
+|IFEval (0-Shot)    |    42.71|
+|BBH (3-Shot)       |    55.54|
+|MATH Lvl 5 (4-Shot)|    48.34|
+|GPQA (0-shot)      |    21.14|
+|MuSR (0-shot)      |    20.05|
+|MMLU-PRO (5-shot)  |    52.86|
+
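
The **Average** row appears to be the unweighted mean of the six benchmark scores in the added table. The sketch below is a quick sanity check under that assumption; the dictionary name `scores` is illustrative and not part of the commit, and the values are copied from the table rows.

```python
# Benchmark scores (%) copied from the table added in this commit.
scores = {
    "IFEval (0-Shot)": 42.71,
    "BBH (3-Shot)": 55.54,
    "MATH Lvl 5 (4-Shot)": 48.34,
    "GPQA (0-shot)": 21.14,
    "MuSR (0-shot)": 20.05,
    "MMLU-PRO (5-shot)": 52.86,
}

# Assumption: the reported average is a plain arithmetic mean,
# rounded to two decimal places.
average = round(sum(scores.values()) / len(scores), 2)
print(f"Average: {average}")
```

If the result matches the 40.11 reported in the **Average** row, the table is internally consistent with a plain mean.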