Updates

Solar, a new bot created by Upstage, is now available on Poe. As a top-ranked model on the HuggingFace Open LLM leaderboard, and a fine-tune of Llama 2, Solar is a great example of the progress enabled by open source. Try it now at https://poe.com/Solar-0-70b

SOLAR-0-70b-16bit model card

The model name has been changed from LLaMa-2-70b-instruct-v2 to SOLAR-0-70b-16bit

Model Details

Dataset Details

Used Datasets

  • Orca-style dataset
  • Alpaca-style dataset
  • No datasets other than those listed above were used
  • No benchmark test sets or training sets were used

Prompt Template

### System:
{System}

### User:
{User}

### Assistant:
{Assistant}
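
As a minimal sketch of how the template above can be filled in, the helper below assembles a prompt string and leaves the assistant section open for the model to complete. The function name `build_prompt` and the choice to omit the system block when no system message is given are assumptions for illustration, not part of the model card.

```python
from typing import Optional


def build_prompt(user: str, system: Optional[str] = None) -> str:
    """Assemble a prompt in the '### System / ### User / ### Assistant' layout.

    Hypothetical helper: the system block is skipped when `system` is None,
    and the assistant block is left empty so the model generates the reply.
    """
    parts = []
    if system:
        parts.append(f"### System:\n{system}\n\n")
    parts.append(f"### User:\n{user}\n\n")
    parts.append("### Assistant:\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("What is the capital of France?", system="Answer briefly."))
```

Called with only a user message, the helper returns a string with just the user and assistant sections.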

Usage

  • The following was tested on an A100 80GB GPU
  • Our model can handle more than 10k input tokens, thanks to the rope_scaling option
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

# Load the tokenizer and the model, quantized to 8-bit to reduce memory use
tokenizer = AutoTokenizer.from_pretrained("upstage/Llama-2-70b-instruct-v2")
model = AutoModelForCausalLM.from_pretrained(
    "upstage/Llama-2-70b-instruct-v2",
    device_map="auto",
    torch_dtype=torch.float16,
    load_in_8bit=True,
    rope_scaling={"type": "dynamic", "factor": 2},  # allows handling of longer inputs
)

prompt = "### User:\nThomas is healthy, but he has to go to the hospital. What could be the reasons?\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# generate() does not accept token_type_ids for this model; drop them if present
inputs.pop("token_type_ids", None)
# Print tokens to stdout as they are generated, without echoing the prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# max_new_tokens must be an integer; 512 is an example value, adjust as needed
output = model.generate(**inputs, streamer=streamer, use_cache=True, max_new_tokens=512)
output_text = tokenizer.decode(output[0], skip_special_tokens=True)
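
To give a rough intuition for why the rope_scaling option with type "dynamic" helps with longer inputs, the sketch below shows one common formulation of dynamic NTK-aware scaling: once the sequence exceeds the pretraining context, the rotary base is enlarged so that positions are spread over lower frequencies. The function name is hypothetical, and the default values (a 4096-token pretraining context, a rotary base of 10000, and a head dimension of 128) are assumptions about the Llama 2 70B configuration rather than values stated in this card. The library's actual implementation may differ in detail.

```python
def dynamic_ntk_base(
    seq_len: int,
    max_position_embeddings: int = 4096,  # assumed pretraining context length
    base: float = 10000.0,                # assumed rotary base
    factor: float = 2.0,                  # the "factor" from rope_scaling
    head_dim: int = 128,                  # assumed attention head dimension
) -> float:
    """Return the rotary base used for a given sequence length.

    Within the pretraining context the base is unchanged; beyond it the base
    grows with the sequence length (illustrative formulation only).
    """
    if seq_len <= max_position_embeddings:
        return base
    scale = (factor * seq_len / max_position_embeddings) - (factor - 1)
    return base * scale ** (head_dim / (head_dim - 2))


if __name__ == "__main__":
    for n in (2048, 4096, 8192, 10240):
        print(n, round(dynamic_ntk_base(n), 1))
```

Under these assumptions, inputs up to 4096 tokens behave as in the base model, and only longer inputs trigger the rescaling.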

Hardware and Software

Evaluation Results

Overview

Main Results

| Model | H4(Avg) | ARC | HellaSwag | MMLU | TruthfulQA | MT_Bench |
|---|---|---|---|---|---|---|
| Llama-2-70b-instruct-v2 (Ours, Open LLM Leaderboard) | 73 | 71.1 | 87.9 | 70.6 | 62.2 | 7.44063 |
| Llama-2-70b-instruct (Ours, Open LLM Leaderboard) | 72.3 | 70.9 | 87.5 | 69.8 | 61 | 7.24375 |
| llama-65b-instruct (Ours, Open LLM Leaderboard) | 69.4 | 67.6 | 86.5 | 64.9 | 58.8 | |
| Llama-2-70b-hf | 67.3 | 67.3 | 87.3 | 69.8 | 44.9 | |
| llama-30b-instruct-2048 (Ours, Open LLM Leaderboard) | 67.0 | 64.9 | 84.9 | 61.9 | 56.3 | |
| llama-30b-instruct (Ours, Open LLM Leaderboard) | 65.2 | 62.5 | 86.2 | 59.4 | 52.8 | |
| llama-65b | 64.2 | 63.5 | 86.1 | 63.9 | 43.4 | |
| falcon-40b-instruct | 63.4 | 61.6 | 84.3 | 55.4 | 52.5 | |
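
The H4(Avg) column appears to be the unweighted mean of the ARC, HellaSwag, MMLU, and TruthfulQA scores. That reading is an assumption based on the column name and the figures, and the short check below recomputes it for the first two rows of the results using the values listed there. The function name `h4_average` is hypothetical.

```python
def h4_average(arc: float, hellaswag: float, mmlu: float, truthfulqa: float) -> float:
    """Unweighted mean of the four benchmark scores (assumed H4 definition)."""
    return (arc + hellaswag + mmlu + truthfulqa) / 4


# Scores copied from the first two rows of the results above
scores = {
    "Llama-2-70b-instruct-v2": (71.1, 87.9, 70.6, 62.2),
    "Llama-2-70b-instruct": (70.9, 87.5, 69.8, 61.0),
}

if __name__ == "__main__":
    for name, row in scores.items():
        print(f"{name}: {h4_average(*row):.2f}")
```

Both recomputed means agree with the reported H4 values to within rounding.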

Scripts for H4 Score Reproduction

  • Prepare evaluation environments:
# clone the repository
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
# change to the repository directory
cd lm-evaluation-harness
# check out the specific commit
git checkout b281b0921b636bc36ad05c0b0b0763bd6dd43463

Contact Us

About Upstage

  • Upstage is a company specializing in Large Language Models (LLMs) and AI. We will help you build private LLMs and related applications. If you have a dataset for building domain-specific LLMs or LLM applications, please contact us.
  • As of August 1st, our 70B model has reached the top spot in the Open LLM Leaderboard rankings, making it the leading performer worldwide at that time.