amanrangapur committed
Update README.md

README.md CHANGED
@@ -23,7 +23,28 @@ The core models released in this batch are the following:
 | Size | Training Tokens | Layers | Hidden Size | Attention Heads | Context Length |
 |------|-----------------|--------|-------------|-----------------|----------------|
 | [OLMo2-7B July 2024](https://huggingface.co/allenai/OLMo-7B-0724-hf) | 4 Trillion | 32 | 4096 | 32 | 4096 |
-| [OLMo2- 13B July 2024](https://huggingface.co/allenai/OLMo-1B-0724-hf) | 5 Trillion |
+| [OLMo2-13B July 2024](https://huggingface.co/allenai/OLMo-1B-0724-hf) | 5 Trillion | 40 | 5120 | 42 | 4096 |
+
+## Inference
+
+Proceed as usual with HuggingFace:
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo2-7B-1124")
+tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo2-7B-1124")
+message = ["Language modeling is "]
+inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
+# optional: run on CUDA
+# inputs = {k: v.to('cuda') for k, v in inputs.items()}
+# olmo = olmo.to('cuda')
+response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
+print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
+>> 'Language modeling is the first step to build natural language generation...'
+```
+
+Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo2-7B-1124", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
+The quantized model is more sensitive to dtypes and CUDA placement, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
 
 We have released checkpoints for these models, for every 1000 training steps.
 The naming convention is `stepXXX-tokensYYYB`.

@@ -40,6 +61,20 @@ out = list_repo_refs("allenai/OLMo2-13B-1124")
 branches = [b.name for b in out.branches]
 ```
 
+### Fine-tuning
+Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or many intermediate checkpoints. Two recipes for tuning are available.
+1. Fine-tune with the OLMo repository:
+```bash
+torchrun --nproc_per_node=8 scripts/train.py {path_to_train_config} \
+  --data.paths=[{path_to_data}/input_ids.npy] \
+  --data.label_mask_paths=[{path_to_data}/label_mask.npy] \
+  --load_path={path_to_checkpoint} \
+  --reset_trainer_state
+```
+For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo?tab=readme-ov-file#fine-tuning).
+
+2. Further fine-tuning support is being developed in AI2's Open Instruct repository. Details are [here](https://github.com/allenai/open-instruct).
+
 ### Model Description
 
 - **Developed by:** Allen Institute for AI (Ai2)
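As a worked version of the 8-bit note added in the first hunk above, here is a minimal sketch; it assumes `bitsandbytes` is installed and a CUDA device is available, and the prompt and sampling settings simply mirror the example in the diff.

```python
# Minimal sketch of the quantized loading path described above (not the card's own code).
# Assumptions: bitsandbytes installed, a CUDA device available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo2-7B-1124",
    torch_dtype=torch.float16,
    load_in_8bit=True,  # requires bitsandbytes
)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo2-7B-1124")

inputs = tokenizer(["Language modeling is "], return_tensors="pt", return_token_type_ids=False)
# Pass the ids tensor directly on CUDA, as the note above recommends for the quantized model.
response = olmo.generate(
    inputs.input_ids.to("cuda"),
    max_new_tokens=100,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```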
@@ -63,41 +98,6 @@ branches = [b.name for b in out.branches]
 - **W&B Logs:** [pretraining](https://wandb.ai/ai2-llm/OLMo-7B/groups/OLMo-1.7-7B), [annealing](https://wandb.ai/ai2-llm/OLMo-7B/groups/OLMo-1.7-7B-anneal)
 
 
-## Uses
-
-### Inference
-
-Proceed as usual with HuggingFace:
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo2-7B-1124")
-tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo2-7B-1124")
-message = ["Language modeling is "]
-inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
-# optional: run on CUDA
-# inputs = {k: v.to('cuda') for k, v in inputs.items()}
-# olmo = olmo.to('cuda')
-response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
-print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
->> 'Language modeling is the first step to build natural language generation...'
-```
-
-Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo2-7B-1124", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
-The quantized model is more sensitive to dtypes and CUDA placement, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
-
-### Fine-tuning
-Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or many intermediate checkpoints. Two recipes for tuning are available.
-1. Fine-tune with the OLMo repository:
-```bash
-torchrun --nproc_per_node=8 scripts/train.py {path_to_train_config} \
-  --data.paths=[{path_to_data}/input_ids.npy] \
-  --data.label_mask_paths=[{path_to_data}/label_mask.npy] \
-  --load_path={path_to_checkpoint} \
-  --reset_trainer_state
-```
-For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo?tab=readme-ov-file#fine-tuning).
-
-2. Further fine-tuning support is being developed in AI2's Open Instruct repository. Details are [here](https://github.com/allenai/open-instruct).
 
 <!-- TODO -->
 ## Evaluation
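The fine-tuning recipe in the hunks above points `--data.paths` and `--data.label_mask_paths` at two NumPy arrays. As an illustration only of what those two files contain, here is a rough sketch; the dtype, file layout, and tokenizer choice are assumptions, and the OLMo repository's own data-preparation tooling should be treated as the reference.

```python
# Illustrative only: build a token-id array and a same-length label mask.
# The exact on-disk format the OLMo trainer expects is defined in the OLMo
# repo; the uint16 dtype and flat concatenated layout here are assumptions.
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo2-7B-1124")

documents = [
    "Language modeling is fun.",
    "OLMo is a fully open language model.",
]
token_ids = [tokenizer(doc)["input_ids"] for doc in documents]

# One flat stream of token ids across all documents.
input_ids = np.concatenate([np.asarray(ids, dtype=np.uint16) for ids in token_ids])
# True wherever a token should contribute to the loss (here: everywhere).
label_mask = np.ones(len(input_ids), dtype=bool)

np.save("input_ids.npy", input_ids)
np.save("label_mask.npy", label_mask)
```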
@@ -120,7 +120,7 @@ Core model results for OLMo 7B models are found below.
 | GSM8k | 10.0 | 12.0 | 4.0 | 4.5 | 8.5 | 25.0 | 29.0 | 35.0 |
 | Full average | 60.3 | 62.1 | 59.2 | 59.3 | 59.8 | 66.2 | 63.8 | 64.2 |
 
-And for
+And for 13B models:
 
 | task | random | [StableLM 2 1.6b](https://huggingface.co/stabilityai/stablelm-2-1_6b)\* | [Pythia 1B](https://huggingface.co/EleutherAI/pythia-1b) | [TinyLlama 1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T) | [OLMo 1.0 1B](https://huggingface.co/allenai/OLMo-1B-hf) | **OLMo 1B July 2024** |
 | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
@@ -229,4 +229,4 @@ Groeneveld, D., Beltagy, I., Walsh, P., Bhagia, A., Kinney, R., Tafjord, O., Jha
 ## Model Card Contact
 
 
-For errors in this model card, contact Nathan, `{
+For errors in this model card, contact Nathan, `{amanr} at allenai dot org`.
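Finally, the checkpoint listing and the `stepXXX-tokensYYYB` naming mentioned earlier can be combined to load an intermediate checkpoint. A minimal sketch; the repo id comes from the snippet quoted in the hunk headers above, and the chosen branch is whichever name `list_repo_refs` actually returns rather than a specific revision guaranteed to exist.

```python
# Sketch: discover checkpoint branches, then load one as a revision.
# Branch names follow the stepXXX-tokensYYYB convention described above;
# pick one actually returned by list_repo_refs rather than hard-coding it.
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM

out = list_repo_refs("allenai/OLMo2-13B-1124")
branches = [b.name for b in out.branches]
print(branches[:5])  # intermediate checkpoint branches plus main

checkpoint = next(b for b in branches if b.startswith("step"))
olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo2-13B-1124", revision=checkpoint)
```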