OpenLLaMA Glaive: An Open Reproduction of LLaMA
This is an OpenLlama Code Instruct model that has been fine-tuned for 1 epoch on the Glaive Assistant dataset.
Prompt Template
<s>[INST] {{ user_msg }} [/INST]
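As a minimal sketch, the template above can be filled in with a small helper. The function name `build_prompt` is hypothetical and not part of the model's API; it only substitutes the user message into the template string. Depending on tokenizer settings, a beginning-of-sequence token may also be added automatically at tokenization time, so treat the literal `<s>` as following the template rather than as a requirement.

```python
def build_prompt(user_msg: str) -> str:
    """Fill the model card's prompt template with a single user message.

    Hypothetical helper for illustration; it mirrors the template
    `<s>[INST] {{ user_msg }} [/INST]` exactly, including the spaces.
    """
    return f"<s>[INST] {user_msg.strip()} [/INST]"


print(build_prompt("Write a quick sort algorithm in Python"))
```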
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
# Load the tokenizer and the model from the same checkpoint
tokenizer = AutoTokenizer.from_pretrained("mwitiderrick/open_llama_3b_glaive_code_v0.1")
model = AutoModelForCausalLM.from_pretrained("mwitiderrick/open_llama_3b_glaive_code_v0.1")
query = "Write a quick sort algorithm in Python"
# max_length counts the prompt tokens plus the generated tokens
text_gen = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
output = text_gen(f"<s>[INST]{query}[/INST]")
print(output[0]['generated_text'])
"""
<s>[INST]Write a quick sort algorithm in Python[/INST]
Quick sort is a divide and conquer algorithm that sorts an array in-place.
It works by repeatedly dividing the array into two sub-arrays, sorting
them, and then merging them back together.
Here's a Python implementation of the quick sort algorithm:
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[len(arr) // 2]
        left = [x for x in arr if x < pivot]
        right = [x for x in arr if x > pivot]
        return quick_sort(left) + [pivot] + quick_sort
"""
Metrics
| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|---------|-------|------|-----:|--------|-----:|---|-----:|
|hellaswag|Yaml |none | 0|acc |0.4974|± |0.0050|
| | |none | 0|acc_norm|0.6600|± |0.0047|

| Groups |Version|Filter|n-shot| Metric | Value | |Stderr|
|----------|-------|------|-----:|-----------|-------:|---|-----:|
|truthfulqa|N/A |none | 0|bleu_max | 23.5771|± |0.5407|
| | |none | 0|bleu_acc | 0.2754|± |0.0002|
| | |none | 0|bleu_diff | -8.1019|± |0.5137|
| | |none | 0|rouge1_max | 49.5707|± |0.6501|
| | |none | 0|rouge1_acc | 0.2607|± |0.0002|
| | |none | 0|rouge1_diff| -9.8962|± |0.5492|
| | |none | 0|rouge2_max | 33.0399|± |0.8237|
| | |none | 0|rouge2_acc | 0.2313|± |0.0002|
| | |none | 0|rouge2_diff|-11.9054|± |0.7963|
| | |none | 0|rougeL_max | 46.3168|± |0.6705|
| | |none | 0|rougeL_acc | 0.2521|± |0.0002|
| | |none | 0|rougeL_diff|-10.1301|± |0.5669|
| | |none | 0|acc | 0.3191|± |0.0405|

| Tasks |Version|Filter|n-shot|Metric|Value | |Stderr|
|----------|-------|------|-----:|------|-----:|---|-----:|
|winogrande|Yaml |none | 0|acc |0.6322|± |0.0136|

| Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
|-------------|-------|------|-----:|--------|-----:|---|-----:|
|arc_challenge|Yaml |none | 0|acc |0.3234|± |0.0137|
| | |none | 0|acc_norm|0.3447|± |0.0139|
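To help read the `Value ± Stderr` columns, the sketch below turns one row into an approximate 95% interval. It assumes a normal approximation (z = 1.96), which is an assumption made here and not something the tables state; the helper name `confidence_interval` is hypothetical.

```python
def confidence_interval(value, stderr, z=1.96):
    """Approximate interval value ± z * stderr (normal approximation assumed)."""
    margin = z * stderr
    return value - margin, value + margin


# hellaswag acc row from the table above: 0.4974 ± 0.0050
low, high = confidence_interval(0.4974, 0.0050)
print(f"hellaswag acc: {low:.4f} to {high:.4f}")  # prints "hellaswag acc: 0.4876 to 0.5072"
```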
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 39.74 |
| AI2 Reasoning Challenge (25-Shot) | 40.70 |
| HellaSwag (10-Shot) | 67.45 |
| MMLU (5-Shot) | 27.74 |
| TruthfulQA (0-shot) | 35.86 |
| Winogrande (5-shot) | 64.72 |
| GSM8k (5-shot) | 1.97 |
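The `Avg.` row matches the unweighted mean of the six benchmark scores listed with it. A quick check, with the values copied from the table above:

```python
# Scores copied from the Open LLM Leaderboard table above
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 40.70,
    "HellaSwag (10-Shot)": 67.45,
    "MMLU (5-Shot)": 27.74,
    "TruthfulQA (0-shot)": 35.86,
    "Winogrande (5-shot)": 64.72,
    "GSM8k (5-shot)": 1.97,
}

# Unweighted mean across the six benchmarks
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # prints 39.74
```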
Model tree for mwitiderrick/open_llama_3b_glaive_code_v0.1
Base model: openlm-research/open_llama_3b
Evaluation results
- hellaswag (0-Shot) on hellaswag, self-reported: 0.660
- winogrande (0-Shot) on winogrande, self-reported: 0.632
- arc_challenge (0-Shot) on arc_challenge, open_llama_3b_instruct_v_0.2 model card: 0.345
- normalized accuracy on AI2 Reasoning Challenge (25-Shot), test set, Open LLM Leaderboard: 40.700
- normalized accuracy on HellaSwag (10-Shot), validation set, Open LLM Leaderboard: 67.450
- accuracy on MMLU (5-Shot), test set, Open LLM Leaderboard: 27.740
- mc2 on TruthfulQA (0-shot), validation set, Open LLM Leaderboard: 35.860
- accuracy on Winogrande (5-shot), validation set, Open LLM Leaderboard: 64.720
- accuracy on GSM8k (5-shot), test set, Open LLM Leaderboard: 1.970