jinjieyuan
committed on
Add instruction for the sparse base model
README.md
CHANGED
@@ -12,12 +12,38 @@ The heuristic adapter discovered from the [super-adapter](https://huggingface.co
|
|
12 |
### Information
|
13 |
|
14 |
- **Model name:** shears-llama-13b-50-math-heuristic-adapter
|
15 |
-
- **Base model:** [
|
16 |
- **Sparsity:** 50%
|
17 |
- **Domain:** Math
|
18 |
- **Subnetwork version:** Heuristic
|
19 |
- **NNCF Configuration:** [nncf_shears_llama.json](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/Shears/nncf_config/nncf_shears_llama.json)
|
20 |
|
21 |
### Adapter Configuration
|
22 |
|
23 |
- **LoRA rank:** 32 (24 in the heuristic subnetwork)
|
@@ -62,14 +88,14 @@ def generate_prompt(instruction):
|
|
62 |
### Response:
|
63 |
"""
|
64 |
|
65 |
-
base_model = AutoModelForCausalLM.from_pretrained("
|
66 |
model = PeftModel.from_pretrained(base_model, "IntelLabs/shears-llama-13b-50-math-heuristic-adapter")
|
67 |
model.eval()
|
68 |
|
69 |
non_zero_params = sum([(param.data != 0).sum().item() for _, param in model.named_parameters()])
|
70 |
print(f"Number of all non-zero parameters: {non_zero_params}")
|
71 |
|
72 |
-
tokenizer = AutoTokenizer.from_pretrained("
|
73 |
|
74 |
instruction = "Edgar eats 18 pretzels a day. If his brother eats 1/2 as many, how many does his brother eat in a week?"
|
75 |
prompt = generate_prompt(instruction)
|
|
|
12 |
### Information
|
13 |
|
14 |
- **Model name:** shears-llama-13b-50-math-heuristic-adapter
|
15 |
+
- **Base model:** Sparsified [LLaMA-13B](https://huggingface.co/yahma/llama-13b-hf)
|
16 |
- **Sparsity:** 50%
|
17 |
- **Domain:** Math
|
18 |
- **Subnetwork version:** Heuristic
|
19 |
- **NNCF Configuration:** [nncf_shears_llama.json](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/Shears/nncf_config/nncf_shears_llama.json)
|
20 |
|
21 |
+
### Sparsified Base Model
|
22 |
+
|
23 |
+
Shears uses [Wanda](https://arxiv.org/abs/2306.11695), a simple but effective pruning approach, to sparsify the language model; the sparsified model serves as the base model.
|
24 |
+
Clone the [Wanda](https://github.com/locuslab/wanda) repo:
|
25 |
+
|
26 |
+
```bash
|
27 |
+
git clone https://github.com/locuslab/wanda.git && cd wanda && git checkout 8e8fc87 && cd ..
|
28 |
+
```
|
29 |
+
|
30 |
+
The following command prunes LLaMA-13B with Wanda to 50% unstructured sparsity:
|
31 |
+
|
32 |
+
```bash
|
33 |
+
python wanda/main.py \
|
34 |
+
--model yahma/llama-13b-hf \
|
35 |
+
--prune_method wanda \
|
36 |
+
--sparsity_ratio 0.5 \
|
37 |
+
--sparsity_type unstructured \
|
38 |
+
--save wanda_out \
|
39 |
+
--save_model shears-llama-13b-50-base
|
40 |
+
```
|
41 |
+
- `--model`: The model identifier on the Hugging Face model hub, or a local path.
|
42 |
+
- `--sparsity_ratio`: Specifies the fraction of weights to be pruned (`0.5` corresponds to 50%).
|
43 |
+
- `--save_model`: Specifies the directory where the pruned language model will be stored.
|
44 |
+
|
45 |
+
Refer to our [repo](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/Shears#setup) for the environment information to run this command.
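
For intuition, the pruning criterion behind Wanda can be sketched in a few lines of plain Python. This is only an illustrative sketch based on the metric described in the Wanda paper (weight magnitude times the L2 norm of the matching input feature, ranked within each output row). The function name `wanda_prune` and the toy data are hypothetical; the actual implementation in the Wanda repo operates on PyTorch layers with calibration data.

```python
import math


def wanda_prune(weight, activations, sparsity_ratio=0.5):
    """Return a pruned copy of `weight` (hypothetical helper, not the Wanda repo API).

    weight:      list of rows, shape [out_features][in_features]
    activations: list of calibration inputs, shape [n_samples][in_features]
    """
    n_in = len(weight[0])
    # L2 norm of each input feature across the calibration samples.
    feat_norm = [math.sqrt(sum(x[j] ** 2 for x in activations)) for j in range(n_in)]
    n_prune = int(n_in * sparsity_ratio)
    pruned = []
    for row in weight:
        # Wanda-style importance score: |weight| * input-feature norm.
        scores = [abs(w) * feat_norm[j] for j, w in enumerate(row)]
        # Zero out the lowest-scoring weights within this output row.
        drop = set(sorted(range(n_in), key=lambda j: scores[j])[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


# Toy data (assumed for illustration): 2 output rows, 4 input features.
weight = [[0.1, -2.0, 0.5, 3.0],
          [1.0, 0.2, -0.3, 0.05]]
activations = [[10.0, 0.1, 1.0, 0.1],
               [10.0, 0.1, 1.0, 0.1]]
pruned = wanda_prune(weight, activations, 0.5)
print(pruned)
```

In this toy case the large weights `-2.0` and `3.0` in the first row are removed because their input features have small norms, which is where the criterion differs from plain magnitude pruning.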
|
46 |
+
|
47 |
### Adapter Configuration
|
48 |
|
49 |
- **LoRA rank:** 32 (24 in the heuristic subnetwork)
|
|
|
88 |
### Response:
|
89 |
"""
|
90 |
|
91 |
+
base_model = AutoModelForCausalLM.from_pretrained("shears-llama-13b-50-base")
|
92 |
model = PeftModel.from_pretrained(base_model, "IntelLabs/shears-llama-13b-50-math-heuristic-adapter")
|
93 |
model.eval()
|
94 |
|
95 |
non_zero_params = sum([(param.data != 0).sum().item() for _, param in model.named_parameters()])
|
96 |
print(f"Number of all non-zero parameters: {non_zero_params}")
|
97 |
|
98 |
+
tokenizer = AutoTokenizer.from_pretrained("shears-llama-13b-50-base")
|
99 |
|
100 |
instruction = "Edgar eats 18 pretzels a day. If his brother eats 1/2 as many, how many does his brother eat in a week?"
|
101 |
prompt = generate_prompt(instruction)
|
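
The non-zero parameter count printed in the snippet above can be turned into a sparsity ratio as a quick sanity check on the pruned base model. The following is a minimal sketch, assuming the weights are available as flat Python lists; the helper `sparsity_ratio` and the toy layer names are hypothetical. With PyTorch, one could pass `(name, param.flatten().tolist())` for each entry of `model.named_parameters()`, although for a 13B model it would be more practical to count zeros directly on the tensors.

```python
def sparsity_ratio(named_weights):
    """Fraction of exactly-zero entries across all given weights (hypothetical helper)."""
    total = 0
    zeros = 0
    for _name, values in named_weights:
        total += len(values)
        zeros += sum(1 for v in values if v == 0)
    return zeros / total if total else 0.0


# Toy weights (assumed for illustration), each already 50% sparse.
toy_weights = [
    ("layer0.weight", [0.0, 0.3, 0.0, -1.2]),
    ("layer1.weight", [0.5, 0.0, 0.0, 0.7]),
]
ratio = sparsity_ratio(toy_weights)
print(f"Sparsity: {ratio:.0%}")
```

On the full model the ratio will probably not be exactly 50%, since some components (for example embeddings and the adapter weights) are typically left dense.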