Update README.md
README.md CHANGED

---
library_name: transformers
language:
- en
inference:
  parameters:
    max_new_tokens: 64
    do_sample: true
    temperature: 0.8
    repetition_penalty: 1.15
    no_repeat_ngram_size: 4
    eta_cutoff: 0.0006
    renormalize_logits: true
widget:
- text: My name is El Microondas the Wise, and
  example_title: El Microondas
- text: Kennesaw State University is a public
  example_title: Kennesaw State University
- text: >-
    Bungie Studios is an American video game developer. They are most famous
    for developing the award winning Halo series of video games. They also
    made Destiny. The studio was founded
  example_title: Bungie
- text: The Mona Lisa is a world-renowned painting created by
  example_title: Mona Lisa
- text: >-
    The Harry Potter series, written by J.K. Rowling, begins with the book
    titled
  example_title: Harry Potter Series
- text: >-
    Question: I have cities, but no houses. I have mountains, but no trees. I
    have water, but no fish. What am I?

    Answer:
  example_title: Riddle
- text: The process of photosynthesis involves the conversion of
  example_title: Photosynthesis
- text: >-
    Jane went to the store to buy some groceries. She picked up apples,
    oranges, and a loaf of bread. When she got home, she realized she forgot
  example_title: Story Continuation
- text: >-
    Problem 2: If a train leaves Station A at 9:00 AM and travels at 60 mph,
    and another train leaves Station B at 10:00 AM and travels at 80 mph, when
    will they meet if the distance between the stations is 300 miles?

    To determine
  example_title: Math Problem
- text: In the context of computer programming, an algorithm is
  example_title: Algorithm Definition
pipeline_tag: text-generation
---

# Model Card for nano-phi-v0.1

Inspired by [Phi2](https://huggingface.co/microsoft/phi-2) and open-source small language model attempts like [smol_llama-101M-GQA](https://huggingface.co/BEE-spoke-data/smol_llama-101M-GQA).
Pre-trained from scratch on 7B training tokens, drawn from a high-quality dataset of 0.6B tokens.
Training took just 2d 4h in Colab on a single A100 40GB (~USD $100).
It achieves quite competitive evaluation results given its training-token budget and dataset size.
No alignment has been done yet.
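
The front matter above pins down the sampling settings used by the hosted widget. As a quick usage sketch, the snippet below mirrors those same parameters through the plain `transformers` API; the repo id is a placeholder (the card does not state the actual Hugging Face path), so substitute the real one.

```python
# Minimal generation sketch mirroring the front-matter inference parameters.
# NOTE: "your-username/nano-phi-v0.1" is a placeholder repo id, not the real path.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/nano-phi-v0.1"  # placeholder; use the model's real repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Prompt borrowed from the widget examples above.
inputs = tokenizer("The Mona Lisa is a world-renowned painting created by", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.8,
    repetition_penalty=1.15,
    no_repeat_ngram_size=4,
    eta_cutoff=0.0006,
    renormalize_logits=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```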

## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

| Metric | Value |
|-----------------------|---------------------------|
| Avg. | 28.68 |
| ARC (25-shot) | 21.93 |
| HellaSwag (10-shot) | 27.87 |
| MMLU (5-shot) | 25.30 |
| TruthfulQA (0-shot) | 46.01 |
| Winogrande (5-shot) | 50.99 |
| GSM8K (5-shot) | 0.0 |

Details:

hf-causal-experimental (pretrained=/content/lm-evaluation-harness/artifacts/checkpoint-pegfss6f:v13,use_accelerate=false,trust_remote_code=True), limit: None, provide_description: False, num_fewshot: 0, batch_size: 16
| Task |Version| Metric |Value | |Stderr|
|--------|------:|--------|-----:|---|-----:|
...
|winogrande| 0|acc |0.5099|± | 0.014|
hf-causal-experimental (pretrained=/content/lm-evaluation-harness/artifacts/checkpoint-pegfss6f:v13,use_accelerate=false,trust_remote_code=True), limit: None, provide_description: False, num_fewshot: 5, batch_size: 16
| Task |Version|Metric|Value | |Stderr|
|----------|------:|------|-----:|---|-----:|
|gsm8k | 0|acc | 0.0|± | 0.0|
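The header lines above are the run configs printed by the classic EleutherAI lm-evaluation-harness. The sketch below shows how the 5-shot GSM8K run would map onto that harness's Python entry point; `evaluator.simple_evaluate` and its argument names are assumed from the pre-0.4 harness API and may differ in newer releases.

```python
# Sketch of reproducing the 5-shot GSM8K run with the classic EleutherAI
# lm-evaluation-harness (pre-0.4 API assumed; check your harness version).
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal-experimental",
    # W&B artifact checkpoint path copied from the run header above
    model_args="pretrained=/content/lm-evaluation-harness/artifacts/checkpoint-pegfss6f:v13,"
               "use_accelerate=false,trust_remote_code=True",
    tasks=["gsm8k"],
    num_fewshot=5,
    batch_size=16,
)
print(results["results"])  # expects {'gsm8k': {'acc': 0.0, ...}} per the table above
```
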
## Model Details