Update README.md
README.md CHANGED
@@ -66,6 +66,8 @@ This model can be loaded with just over 10GB of VRAM (compared to the original 1
 
 The 8 bit GPTQ quant has minimal quality degradation from the original `bfloat16` model due to its higher bitrate.
 
+The `untrained-special-tokens-fixed` branch is the same model as the main branch, but its special and untrained tokens (identified by finding the tokens whose maximum embedding value is 0 in both `input_embeddings` and `output_embeddings`) are set to the average of all trained tokens for each feature. Using this branch is recommended if you plan to do any fine-tuning with your own tokens added or with instruction following.
+
 <!-- description end -->
 
 ## GPTQ Quantization Method
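The untrained-token fix the added paragraph describes can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's actual code: in practice the two matrices would be the model's input/output embedding weight tensors (e.g. PyTorch tensors from a `transformers` model), and `fix_untrained_rows` is an illustrative name.

```python
import numpy as np

def fix_untrained_rows(in_emb, out_emb):
    """Overwrite untrained embedding rows with the per-feature mean of
    the trained rows, in place. Returns the number of rows fixed.

    A token counts as untrained when the max (absolute) value of its row
    is 0 in BOTH the input and output embedding matrices, i.e. the row
    was never updated from an all-zero initialization.
    """
    untrained = (np.abs(in_emb).max(axis=1) == 0) & (np.abs(out_emb).max(axis=1) == 0)
    trained = ~untrained
    # Average of all trained tokens, computed per feature (per column).
    in_emb[untrained] = in_emb[trained].mean(axis=0)
    out_emb[untrained] = out_emb[trained].mean(axis=0)
    return int(untrained.sum())

# Toy example: 4 tokens, 3 features; token 2 is all-zero in both matrices.
inp = np.array([[1., 2., 3.], [4., 5., 6.], [0., 0., 0.], [7., 8., 9.]])
out = np.array([[1., 1., 1.], [2., 2., 2.], [0., 0., 0.], [3., 3., 3.]])
n_fixed = fix_untrained_rows(inp, out)
```

Replacing zero rows with the mean of trained rows keeps those embeddings in-distribution, which is why the branch is preferable as a base for fine-tuning with newly added tokens.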