driaforall/Dria-Agent-a-7B

andthattoo committed 8 days ago

Commit f8e0a19 · verified · 1 Parent(s): 267652f

Update README.md

Files changed (1)

README.md +1 -1


@@ -202,7 +202,7 @@ We evaluate the model on the following benchmarks:
 2. MMLU-Pro
 3. **Dria-Pythonic-Agent-Benchmark (DPAB):** The benchmark we curated with a synthetic data generation +model-based validation + filtering and manual selection to evaluate LLMs on their Pythonic function calling ability, spanning multiple scenarios and tasks. More detailed information about the benchmark and the Github repo will be released soon.
-Below are the BFCL results: evaluation results for ***Qwen2.5-Coder-3B-Instruct***, ***Dria-Agent-α-3B*** and ***gpt-4o-2024-11-20***
+Below are the BFCL results: evaluation results for ***Qwen2.5-Coder-3B-Instruct***, ***Dria-Agent-α-3B***, ***Dria-Agent-α-7B***, and ***gpt-4o-2024-11-20***
 | Metric                                | Qwen/Qwen2.5-3B-Instruct   | Dria-Agent-a-3B   | Dria-Agent-a-7B   | gpt-4o-2024-11-20 (Prompt)   |
 |---------------------------------------|-----------|-----------|-----------|-----------|