Update README.md
README.md CHANGED
@@ -6,8 +6,8 @@ inference:
   parameters:
     max_new_tokens: 64
     do_sample: true
-    temperature: 0.
-    repetition_penalty:
+    temperature: 0.1
+    repetition_penalty: 10
     no_repeat_ngram_size: 4
     eta_cutoff: 0.0006
     renormalize_logits: true
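This hunk repairs two previously incomplete values in the card's front matter (`temperature: 0.` and an empty `repetition_penalty:`). These `parameters` map one-to-one onto `transformers` generation arguments; below is a minimal sketch (untested, assuming the standard `AutoModelForCausalLM` generation API; the prompt is illustrative):

```python
# Sketch: generating with the inference parameters from the updated
# README front matter. The model id is this card's own repo.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "kenhktsui/nano-phi-115M-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,        # from the card's `parameters` block
    do_sample=True,
    temperature=0.1,          # fixed in this commit (was `0.`)
    repetition_penalty=10.0,  # value added in this commit
    no_repeat_ngram_size=4,
    eta_cutoff=0.0006,
    renormalize_logits=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```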
@@ -77,15 +77,25 @@ No alignment has been done yet.
 
 ## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
 
-| Metric | kenhktsui/nano-phi-115M-v0.1 |[kenhktsui/nano-phi-115M-control-v0.1](https://huggingface.co/kenhktsui/nano-phi-115M-control-v0.1)|
-|-----------------------|---------------------------|---------------------------|
-| Avg. | 28.68 | 28.75 |
-| ARC (25-shot) | 21.93 | 21.67 |
-| HellaSwag (10-shot) | 27.87 | 26.89 |
-| MMLU (5-shot) | 25.30 | 24.76 |
-| TruthfulQA (0-shot) | 46.01 | 47.69 |
-| Winogrande (5-shot) | 50.99 | 51.46 |
-| GSM8K (5-shot) | 0.0 | 0.0 |
+| Metric | kenhktsui/nano-phi-115M-v0.1 |[kenhktsui/nano-phi-115M-control-v0.1](https://huggingface.co/kenhktsui/nano-phi-115M-control-v0.1)|[microsoft/phi-2](https://huggingface.co/microsoft/phi-2)|
+|-----------------------|---------------------------|---------------------------|---------------------------|
+| Model Params | 115M | 115M | 2.7B |
+| Dataset Size | 0.26B | 0.6B | 250B |
+| Training Tokens | 0.26B | 0.6B | 1.4T |
+| Context Length | 1024 | 1024 | 2048 |
+| Device | 1xA100-40G | 1xA100-40G | 96xA100-80G |
+| Training Time | 2d4h | 2d4h | 14d |
+
+
+| Metric | kenhktsui/nano-phi-115M-v0.1 |[kenhktsui/nano-phi-115M-control-v0.1](https://huggingface.co/kenhktsui/nano-phi-115M-control-v0.1)|[microsoft/phi-2](https://huggingface.co/microsoft/phi-2) (Reproduced)|
+|-----------------------|---------------------------|---------------------------|---------------------------|
+| Avg. | 28.68 | 28.75 | 61.53 |
+| ARC (25-shot) | 21.93 | 21.67 | 61.52 |
+| HellaSwag (10-shot) | 27.87 | 26.89 | 75.13 |
+| MMLU (5-shot) | 25.30 | 24.76 | 58.23 |
+| TruthfulQA (0-shot) | 46.01 | 47.69 | 44.46 |
+| Winogrande (5-shot) | 50.99 | 51.46 | 74.51 |
+| GSM8K (5-shot) | 0.0 | 0.0 | 55.34 |
 
 Details:
 
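The "(Reproduced)" column header suggests the phi-2 scores were re-run locally rather than copied from the public leaderboard. A hedged sketch of re-scoring a single task with EleutherAI's lm-evaluation-harness (assuming the v0.4 `simple_evaluate` Python API; the task name and batch size here are illustrative, and the shot count follows the table above):

```python
# Sketch only: re-running one leaderboard task with lm-evaluation-harness.
# Assumes `pip install lm-eval` (v0.4+); exact numbers depend on the
# harness version/commit the leaderboard pinned.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=kenhktsui/nano-phi-115M-v0.1",
    tasks=["arc_challenge"],  # ARC is the 25-shot row in the table
    num_fewshot=25,
    batch_size=8,
)
print(results["results"]["arc_challenge"])
```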