Update README.md
README.md CHANGED
@@ -66,6 +66,8 @@ This model can be loaded with just over 10GB of VRAM (compared to the original 1
 
 The 8 bit GPTQ quant has minimal quality degradation from the original `bfloat16` model due to its higher bitrate.
 
+The `untrained-special-tokens-fixed` branch is the same model as the main branch, but its special and untrained tokens (identified by finding the tokens whose maximum embedding value is 0 in both `input_embeddings` and `output_embeddings`) are set to the average of all trained tokens for each feature. Using this branch is recommended if you plan to do any fine-tuning with your own tokens added or with instruction following.
+
 <!-- description end -->
 
 ## GPTQ Quantization Method
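The untrained-token fix the added paragraph describes can be sketched roughly as follows. This is a minimal NumPy illustration, not the repository's actual code: in practice the two matrices would be the model's input/output embedding weight tensors (e.g. PyTorch tensors from a `transformers` model), and `fix_untrained_rows` is an illustrative name.

```python
import numpy as np

def fix_untrained_rows(in_emb, out_emb):
    """Overwrite untrained embedding rows with the per-feature mean of
    the trained rows, in place. Returns the number of rows fixed.

    A token counts as untrained when the max (absolute) value of its row
    is 0 in BOTH the input and output embedding matrices, i.e. the row
    was never updated from an all-zero initialization.
    """
    untrained = (np.abs(in_emb).max(axis=1) == 0) & (np.abs(out_emb).max(axis=1) == 0)
    trained = ~untrained
    # Average of all trained tokens, computed per feature (per column).
    in_emb[untrained] = in_emb[trained].mean(axis=0)
    out_emb[untrained] = out_emb[trained].mean(axis=0)
    return int(untrained.sum())

# Toy example: 4 tokens, 3 features; token 2 is all-zero in both matrices.
inp = np.array([[1., 2., 3.], [4., 5., 6.], [0., 0., 0.], [7., 8., 9.]])
out = np.array([[1., 1., 1.], [2., 2., 2.], [0., 0., 0.], [3., 3., 3.]])
n_fixed = fix_untrained_rows(inp, out)
```

Replacing zero rows with the mean of trained rows keeps those embeddings in-distribution, which is why the branch is preferable as a base for fine-tuning with newly added tokens.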