Universal-NER/UniNER-7B-type-sub quantized to 4bit with GPTQ and stored with 1GB shard size.
The model Universal-NER/UniNER-7B-type-sub was quantized to 4 bits with group_size 128 and act-order=True, using the AutoGPTQ integration in transformers (https://huggingface.co/blog/gptq-integration).
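The settings above could be reproduced roughly as follows. This is a minimal sketch, not the exact script used for this model: the calibration dataset ("c4") and the output directory name are assumptions that this card does not state, and `desc_act` is the `GPTQConfig` name for the act-order option. Running it needs `optimum` and `auto-gptq` installed alongside `transformers`, plus a GPU.

```python
# Settings taken from this model card (act-order maps to desc_act in GPTQConfig).
QUANT_SETTINGS = {"bits": 4, "group_size": 128, "desc_act": True}


def quantize_and_save(
    model_id="Universal-NER/UniNER-7B-type-sub",
    out_dir="UniNER-7B-type-sub-4bit-gptq",  # hypothetical output directory
    calibration_dataset="c4",  # assumption: the card does not name the dataset
):
    """Quantize the full-precision model with GPTQ and save it in 1GB shards."""
    # Imported lazily so the settings above can be inspected without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    gptq_config = GPTQConfig(
        dataset=calibration_dataset, tokenizer=tokenizer, **QUANT_SETTINGS
    )
    # Passing a GPTQConfig triggers calibration and quantization while loading.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", quantization_config=gptq_config
    )
    # max_shard_size="1GB" matches the shard size mentioned above.
    model.save_pretrained(out_dir, max_shard_size="1GB")
    tokenizer.save_pretrained(out_dir)
```

Calling `quantize_and_save()` downloads the full-precision weights, so it is only needed to reproduce the quantization, not to use the 4bit model.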
TODO
The prompt template is the same as for the full-precision model:
prompt_template = """A virtual assistant answers questions from a user based on the provided text.
USER: Text: {input_text}
ASSISTANT: I’ve read this text.
USER: What describes {entity_name} in the text?
ASSISTANT:
"""
For best results, format the input according to the prompt template above during inference.
prompt = prompt_template.format_map({"input_text": "Cologne is a great city in Germany - maybe even the greatest ;)", "entity_name": "city"})
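Putting the template to use, inference might look like the sketch below. It is an illustration under assumptions rather than an official usage snippet: `build_prompt` and `extract_entities` are hypothetical helper names, `repo_id` must be set to this model's Hugging Face repository id (not filled in here), and `max_new_tokens=64` with greedy decoding is an arbitrary choice. Loading GPTQ weights through transformers also requires `optimum` and `auto-gptq`.

```python
prompt_template = """A virtual assistant answers questions from a user based on the provided text.
USER: Text: {input_text}
ASSISTANT: I’ve read this text.
USER: What describes {entity_name} in the text?
ASSISTANT:
"""


def build_prompt(input_text, entity_name):
    """Fill the prompt template for one text and one entity type."""
    return prompt_template.format_map(
        {"input_text": input_text, "entity_name": entity_name}
    )


def extract_entities(repo_id, input_text, entity_name, max_new_tokens=64):
    """Generate the model's raw answer for one entity type (hypothetical helper)."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    inputs = tokenizer(
        build_prompt(input_text, entity_name), return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

The model is queried for one entity type at a time, so extracting several types from the same text means calling `extract_entities` once per type.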
The original full-precision model and its associated data are released under the CC BY-NC 4.0 license. Hence, the same license applies to the 4bit version.