|
--- |
|
inference: false |
|
license: gpl |
|
language: |
|
- en |
|
tags: |
|
- starcoder |
|
- wizardcoder |
|
- code |
|
- self-instruct |
|
- distillation |
|
--- |
|
|
|
<!-- header start --> |
|
<div style="width: 100%;"> |
|
<img src="https://i.imgur.com/EBdldam.jpg" alt="TheBlokeAI" style="width: 100%; min-width: 400px; display: block; margin: auto;"> |
|
</div> |
|
<div style="display: flex; justify-content: space-between; width: 100%;"> |
|
<div style="display: flex; flex-direction: column; align-items: flex-start;"> |
|
<p><a href="https://discord.gg/theblokeai">Chat & support: my new Discord server</a></p> |
|
</div> |
|
<div style="display: flex; flex-direction: column; align-items: flex-end;"> |
|
<p><a href="https://www.patreon.com/TheBlokeAI">Want to contribute? TheBloke's Patreon page</a></p> |
|
</div> |
|
</div> |
|
<!-- header end --> |
|
|
|
# NousResearch's Redmond Hermes Coder GGML |
|
|
|
These files are GGML format model files for [NousResearch's Redmond Hermes Coder](https://huggingface.co/NousResearch/Redmond-Hermes-Coder). |
|
|
|
Please note that these GGMLs are **not compatible with llama.cpp, or currently with text-generation-webui**. Please see below for a list of tools known to work with these model files. |
|
|
|
## Repositories available |
|
|
|
* [4-bit GPTQ models for GPU inference](https://huggingface.co/TheBloke/Redmond-Hermes-Coder-GPTQ) |
|
* [4, 5, and 8-bit GGML models for CPU+GPU inference](https://huggingface.co/TheBloke/Redmond-Hermes-Coder-GGML) |
|
* [Unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/NousResearch/Redmond-Hermes-Coder) |
|
|
|
## Prompt template: Alpaca |
|
|
|
``` |
|
Below is an instruction that describes a task. Write a response that appropriately completes the request. |
|
|
|
### Instruction: PROMPT |
|
|
|
### Response: |
|
|
|
``` |
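As a minimal sketch of how the template above can be filled in programmatically, the snippet below substitutes a user instruction for `PROMPT`. The helper name `build_prompt` and the exact blank-line spacing between sections are assumptions made for illustration; they are not part of any library.

```python
# Template text copied from this card; PROMPT becomes a format placeholder.
# The blank-line spacing between sections is an assumption.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction: {prompt}\n\n"
    "### Response:"
)


def build_prompt(instruction: str) -> str:
    """Hypothetical helper: insert the user's instruction into the template."""
    return ALPACA_TEMPLATE.format(prompt=instruction.strip())


print(build_prompt("Write a Python function that reverses a string."))
```

The resulting string ends with `### Response:` so that the model's completion begins directly after the marker.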
|
|
|
<!-- compatibility_ggml start --> |
|
## Compatibility
|
|
|
These files are **not** compatible with llama.cpp. |
|
|
|
Currently they can be used with: |
|
* KoboldCpp, a powerful inference engine based on llama.cpp, with good UI and GPU acceleration: [KoboldCpp](https://github.com/LostRuins/koboldcpp) |
|
* The ctransformers Python library, which includes LangChain support: [ctransformers](https://github.com/marella/ctransformers) |
|
* LoLLMs WebUI which uses ctransformers: [LoLLMS WebUI](https://github.com/ParisNeo/lollms-webui) |
|
* [rustformers' llm](https://github.com/rustformers/llm) |
|
* The example `starcoder` binary provided with [ggml](https://github.com/ggerganov/ggml) |
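
As a rough sketch of the ctransformers route listed above, the snippet below loads one of the files from this repository and runs a single Alpaca-style prompt. The repository and file names come from this card. The `gpt_bigcode` model type string, the helper names `load_model` and `complete`, and the default of 256 new tokens are assumptions, so please check them against the ctransformers documentation before relying on them.

```python
# Repository and file names are taken from this card; the rest is assumed.
REPO_ID = "TheBloke/Redmond-Hermes-Coder-GGML"
MODEL_FILE = "redmond-hermes-coder.ggmlv3.q4_0.bin"
# Assumed model type string for StarCoder-family models in ctransformers.
MODEL_TYPE = "gpt_bigcode"


def load_model():
    """Hypothetical helper: fetch (if needed) and load the quantised model."""
    # Imported lazily so the rest of the sketch works without ctransformers.
    from ctransformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        REPO_ID, model_file=MODEL_FILE, model_type=MODEL_TYPE
    )


def complete(llm, instruction, max_new_tokens=256):
    """Hypothetical helper: wrap an instruction in the Alpaca template and generate."""
    prompt = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction: {instruction}\n\n### Response:"
    )
    return llm(prompt, max_new_tokens=max_new_tokens)


# Typical use (the first run downloads a file of about 10.75 GB):
#   llm = load_model()
#   print(complete(llm, "Write a Python function that reverses a string."))
```

Keeping the import inside `load_model` means the prompt-building part can be exercised with any callable in place of the real model.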
|
|
|
<!-- compatibility_ggml end --> |
|
|
|
## Provided files |
|
| Name | Quant method | Bits | Size | Max RAM required | Use case |
| ---- | ---- | ---- | ---- | ---- | ----- |
| redmond-hermes-coder.ggmlv3.q4_0.bin | q4_0 | 4 | 10.75 GB | 13.25 GB | 4-bit. |
| redmond-hermes-coder.ggmlv3.q4_1.bin | q4_1 | 4 | 11.92 GB | 14.42 GB | 4-bit. Higher accuracy than q4_0 but not as high as q5_0. However has quicker inference than q5 models. |
| redmond-hermes-coder.ggmlv3.q5_0.bin | q5_0 | 5 | 13.09 GB | 15.59 GB | 5-bit. Higher accuracy, higher resource usage and slower inference. |
| redmond-hermes-coder.ggmlv3.q5_1.bin | q5_1 | 5 | 14.26 GB | 16.76 GB | 5-bit. Even higher accuracy, resource usage and slower inference. |
| redmond-hermes-coder.ggmlv3.q8_0.bin | q8_0 | 8 | 20.11 GB | 22.61 GB | 8-bit. Almost indistinguishable from float16. High resource use and slow. Not recommended for most users. |
|
|
|
|
|
**Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead. |
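
To illustrate how the table can guide a choice, here is a small sketch that picks the largest file whose "Max RAM required" figure fits into a given amount of memory. The sizes and RAM figures are copied from the table; the helper name `largest_fitting` is hypothetical, and treating "largest file" as "best quality" is an assumption based on the "Use case" column. No GPU offloading is assumed.

```python
# (file name, size in GB, max RAM required in GB), copied from the table.
FILES = [
    ("redmond-hermes-coder.ggmlv3.q4_0.bin", 10.75, 13.25),
    ("redmond-hermes-coder.ggmlv3.q4_1.bin", 11.92, 14.42),
    ("redmond-hermes-coder.ggmlv3.q5_0.bin", 13.09, 15.59),
    ("redmond-hermes-coder.ggmlv3.q5_1.bin", 14.26, 16.76),
    ("redmond-hermes-coder.ggmlv3.q8_0.bin", 20.11, 22.61),
]

# In every row the RAM figure equals the file size plus 2.5 GB.
for _name, size_gb, ram_gb in FILES:
    assert abs(ram_gb - size_gb - 2.5) < 1e-9


def largest_fitting(available_ram_gb):
    """Hypothetical helper: largest file that fits in RAM, or None if none do."""
    fitting = [f for f in FILES if f[2] <= available_ram_gb]
    return max(fitting, key=lambda f: f[1]) if fitting else None


choice = largest_fitting(16.0)
print(choice[0] if choice else "No file fits in the given RAM")
```

With 16 GB available this selects the q5_0 file, since q5_1 needs 16.76 GB.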
|
|
|
<!-- footer start --> |
|
## Discord |
|
|
|
For further support, and discussions on these models and AI in general, join us at: |
|
|
|
[TheBloke AI's Discord server](https://discord.gg/theblokeai) |
|
|
|
## Thanks, and how to contribute. |
|
|
|
Thanks to the [chirper.ai](https://chirper.ai) team! |
|
|
|
I've had a lot of people ask if they can contribute. I enjoy providing models and helping people, and would love to be able to spend even more time doing it, as well as expanding into new projects like fine tuning/training. |
|
|
|
If you're able and willing to contribute it will be most gratefully received and will help me to keep providing more models, and to start work on new AI projects. |
|
|
|
Donors will get priority support on any and all AI/LLM/model questions and requests, access to a private Discord room, plus other benefits.
|
|
|
* Patreon: https://patreon.com/TheBlokeAI |
|
* Ko-Fi: https://ko-fi.com/TheBlokeAI |
|
|
|
**Special thanks to**: Luke from CarbonQuill, Aemon Algiz, Dmitriy Samsonov. |
|
|
|
**Patreon special mentions**: zynix, ya boyyy, Trenton Dambrowitz, Imad Khwaja, Alps Aficionado, chris gileta, John Detwiler, Willem Michiel, RoA, Mano Prime, Rainer Wilmers, Fred von Graf, Matthew Berman, Ghost, Nathan LeClaire, Iucharbius, Ai Maven, Illia Dulskyi, Joseph William Delisle, Space Cruiser, Lone Striker, Karl Bernard, Eugene Pentland, Greatston Gnanesh, Jonathan Leane, Randy H, Pierre Kircher, Willian Hasse, Stephen Murray, Alex, terasurfer, Edmond Seymore, Oscar Rangel, Luke Pendergrass, Asp the Wyvern, Junyu Yang, David Flickinger, Luke, Spiking Neurons AB, subjectnull, Pyrater, Nikolai Manek, senxiiz, Ajan Kanaga, Johann-Peter Hartmann, Artur Olbinski, Kevin Schuppel, Derek Yates, Kalila, K, Talal Aujan, Khalefa Al-Ahmad, Gabriel Puliatti, John Villwock, WelcomeToTheClub, Daniel P. Andersen, Preetika Verma, Deep Realms, Fen Risland, trip7s trip, webtim, Sean Connelly, Michael Levine, Chris McCloskey, biorpg, vamX, Viktor Bowallius, Cory Kujawski.
|
|
|
Thank you to all my generous patrons and donors!
|
|
|
<!-- footer end --> |
|
|
|
# Original model card: NousResearch's Redmond Hermes Coder |
|
|
|
|
|
# Model Card: Redmond-Hermes-Coder 15B |
|
|
|
## Model Description |
|
|
|
Redmond-Hermes-Coder 15B is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by Nous Research, with Teknium and Karan4D leading the fine tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors. |
|
|
|
This model was trained with a WizardCoder base, which itself uses a StarCoder base model. |
|
|
|
The model is strong at code, but this comes with a tradeoff. While far better at code than the original Nous-Hermes built on Llama, it is worse than WizardCoder at pure code benchmarks such as HumanEval.
|
|
|
It comes in at 39% on HumanEval, with WizardCoder at 57%. This is a preliminary experiment, and we are exploring improvements now. |
|
|
|
However, it does seem better than WizardCoder on a variety of non-code tasks, including writing.
|
|
|
## Model Training |
|
|
|
The model was trained almost entirely on synthetic GPT-4 outputs. This includes data from diverse sources such as GPTeacher (the general, roleplay v1 & 2, and code instruct datasets), Nous Instruct & PDACTL (unpublished), CodeAlpaca, Evol_Instruct Uncensored, GPT4-LLM, and Unnatural Instructions.
|
|
|
Additional data inputs came from Camel-AI's Biology/Physics/Chemistry and Math Datasets, Airoboros' (v1) GPT-4 Dataset, and more from CodeAlpaca. The total volume of data encompassed over 300,000 instructions. |
|
|
|
## Collaborators |
|
The model fine-tuning and the datasets were a collaboration of efforts and resources from members of Nous Research, including Teknium, Karan4D, and Huemin Art, together with Redmond AI's generous compute grants.
|
|
|
A huge shoutout and acknowledgement goes to all the dataset creators who generously share their datasets openly.
|
|
|
Among the contributors of datasets, GPTeacher was made available by Teknium, Wizard LM by nlpxucan, and the Nous Research Instruct Dataset was provided by Karan4D and HueminArt. |
|
The GPT4-LLM and Unnatural Instructions were provided by Microsoft, Airoboros dataset by jondurbin, Camel-AI datasets are from Camel-AI, and CodeAlpaca dataset by Sahil 2801. |
|
If anyone was left out, please open a thread in the community tab. |
|
|
|
## Prompt Format |
|
|
|
The model follows the Alpaca prompt format: |
|
``` |
|
### Instruction: |
|
|
|
### Response: |
|
``` |
|
|
|
or |
|
|
|
``` |
|
### Instruction: |
|
|
|
### Input: |
|
|
|
### Response: |
|
``` |
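
The two variants above can be produced by one small helper, sketched here. The function name `format_alpaca` is hypothetical, and placing the text on the line directly after each `###` header follows the usual Alpaca convention, which is an assumption since the templates above show only the headers.

```python
from typing import Optional


def format_alpaca(instruction: str, input_text: Optional[str] = None) -> str:
    """Hypothetical helper: build either prompt variant shown above.

    The "### Input:" section is included only when input_text is given.
    """
    parts = ["### Instruction:", instruction.strip(), ""]
    if input_text:
        parts += ["### Input:", input_text.strip(), ""]
    parts.append("### Response:")
    # A trailing newline so the model's answer starts on its own line.
    return "\n".join(parts) + "\n"


print(format_alpaca("Summarise the text.", "GGML is a tensor library."))
```

Calling it with only an instruction yields the first format; passing a second argument yields the second.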
|
|
|
## Resources for Applied Use Cases: |
|
For an example of a back-and-forth chatbot using Hugging Face Transformers and Discord, check out: https://github.com/teknium1/alpaca-discord

For an example of a roleplaying Discord bot, check out: https://github.com/teknium1/alpaca-roleplay-discordbot
|
|
|
## Future Plans |
|
The model is currently being uploaded in FP16 format, and there are plans to convert it to GGML and 4-bit GPTQ quantizations. The team is also working on a full benchmark, similar to what was done for GPT4-x-Vicuna. We will also try to open discussions about getting the model included in GPT4All.
|
|
|
## Benchmark Results |
|
```
HumanEval: 39%

|                      Task                      |Version|       Metric        |Value |   |Stderr|
|------------------------------------------------|------:|---------------------|-----:|---|-----:|
|arc_challenge                                   |      0|acc                  |0.2858|±  |0.0132|
|                                                |       |acc_norm             |0.3148|±  |0.0136|
|arc_easy                                        |      0|acc                  |0.5349|±  |0.0102|
|                                                |       |acc_norm             |0.5097|±  |0.0103|
|bigbench_causal_judgement                       |      0|multiple_choice_grade|0.5158|±  |0.0364|
|bigbench_date_understanding                     |      0|multiple_choice_grade|0.5230|±  |0.0260|
|bigbench_disambiguation_qa                      |      0|multiple_choice_grade|0.3295|±  |0.0293|
|bigbench_geometric_shapes                       |      0|multiple_choice_grade|0.1003|±  |0.0159|
|                                                |       |exact_str_match      |0.0000|±  |0.0000|
|bigbench_logical_deduction_five_objects         |      0|multiple_choice_grade|0.2260|±  |0.0187|
|bigbench_logical_deduction_seven_objects        |      0|multiple_choice_grade|0.1957|±  |0.0150|
|bigbench_logical_deduction_three_objects        |      0|multiple_choice_grade|0.3733|±  |0.0280|
|bigbench_movie_recommendation                   |      0|multiple_choice_grade|0.3200|±  |0.0209|
|bigbench_navigate                               |      0|multiple_choice_grade|0.4830|±  |0.0158|
|bigbench_reasoning_about_colored_objects        |      0|multiple_choice_grade|0.4150|±  |0.0110|
|bigbench_ruin_names                             |      0|multiple_choice_grade|0.2143|±  |0.0194|
|bigbench_salient_translation_error_detection    |      0|multiple_choice_grade|0.2926|±  |0.0144|
|bigbench_snarks                                 |      0|multiple_choice_grade|0.5249|±  |0.0372|
|bigbench_sports_understanding                   |      0|multiple_choice_grade|0.4817|±  |0.0159|
|bigbench_temporal_sequences                     |      0|multiple_choice_grade|0.2700|±  |0.0140|
|bigbench_tracking_shuffled_objects_five_objects |      0|multiple_choice_grade|0.1864|±  |0.0110|
|bigbench_tracking_shuffled_objects_seven_objects|      0|multiple_choice_grade|0.1349|±  |0.0082|
|bigbench_tracking_shuffled_objects_three_objects|      0|multiple_choice_grade|0.3733|±  |0.0280|
|boolq                                           |      1|acc                  |0.5498|±  |0.0087|
|hellaswag                                       |      0|acc                  |0.3814|±  |0.0048|
|                                                |       |acc_norm             |0.4677|±  |0.0050|
|openbookqa                                      |      0|acc                  |0.1960|±  |0.0178|
|                                                |       |acc_norm             |0.3100|±  |0.0207|
|piqa                                            |      0|acc                  |0.6600|±  |0.0111|
|                                                |       |acc_norm             |0.6610|±  |0.0110|
|winogrande                                      |      0|acc                  |0.5343|±  |0.0140|
```
|
|
|
## Model Usage |
|
The model is available for download on Hugging Face. It is suitable for a wide range of language tasks, from generating creative text to understanding and following complex instructions. |
|
|
|
Compute provided by our project sponsor Redmond AI, thank you!! |
|
|