Kewen Zhao committed · Commit 86bf33d · Parent(s): f5dea60
update readme
README.md CHANGED
@@ -11,16 +11,16 @@ tags:
 - evaluate
 - metric
 description: >-
-
-  described in the paper "Evaluating Large Language Models Trained on Code"
-  (https://arxiv.org/abs/2107.03374).
+  The stdio version of the ["code eval"](https://huggingface.co/spaces/evaluate-metric/code_eval) metric, which handles Python programs that read inputs from STDIN and print answers to STDOUT, as is common in competitive programming (e.g. Codeforces, USACO).
 ---

 # Metric Card for Code Eval StdIO

 ## Metric description

-
+This metric implements the evaluation harness for the HumanEval problem solving dataset
+described in the paper "Evaluating Large Language Models Trained on Code"
+(https://arxiv.org/abs/2107.03374).

 The CodeEval metric estimates the pass@k metric for code synthesis.
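For readers landing on this card, a minimal usage sketch follows. It assumes this Space keeps the `evaluate.load(...)` / `compute(...)` interface of the upstream `code_eval` metric and accepts stdin/stdout pairs as references; the module id, the reference format, and the `HF_ALLOW_CODE_EVAL` requirement are assumptions carried over from the base metric, not confirmed by this commit.

```python
import os
import evaluate

# The upstream code_eval metric refuses to execute untrusted code unless this
# environment variable is set; assumed to apply to the stdio variant as well.
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

# Module id is a placeholder -- replace with the actual path of this Space.
code_eval_stdio = evaluate.load("code_eval_stdio")

# One problem, two candidate programs that read from STDIN and print to STDOUT.
predictions = [[
    "a, b = map(int, input().split())\nprint(a + b)",
    "print(sum(map(int, input().split())))",
]]

# Hypothetical reference format: stdin/stdout pairs instead of assert-based tests.
references = [{"stdin": "1 2\n", "stdout": "3\n"}]

pass_at_k, per_candidate = code_eval_stdio.compute(
    predictions=predictions,
    references=references,
    k=[1, 2],
)
print(pass_at_k)  # e.g. {"pass@1": 1.0, "pass@2": 1.0}
```

As in the cited paper, pass@k is estimated per problem with the unbiased estimator 1 - C(n-c, k) / C(n, k), where n is the number of generated samples and c the number that pass the tests, then averaged over problems.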