Update README.md
README.md CHANGED

@@ -4,16 +4,16 @@ emoji: π
 colorFrom: blue
 colorTo: purple
 sdk: gradio
-sdk_version: 5.
+sdk_version: 5.9.1
 app_file: app.py
-pinned:
+pinned: true
 license: mit
 short_description: 'Compact LLM Battle Arena: Frugal AI Face-Off!'
 ---

 # π GPU-Poor LLM Gladiator Arena π

-Welcome to the GPU-Poor LLM Gladiator Arena, where frugal meets fabulous in the world of AI! This project pits compact language models (maxing out at
+Welcome to the GPU-Poor LLM Gladiator Arena, where frugal meets fabulous in the world of AI! This project pits compact language models (maxing out at 14B parameters) against each other in a battle of wits and words.

 ## π€ Starting from "Why?"

@@ -31,10 +31,11 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some
 ## π Features

 - **Battle Arena**: Pit two mystery models against each other and decide which pint-sized powerhouse reigns supreme.
+- **Dynamic Model Management**: Models list is managed remotely, allowing for easy updates without code changes.
 - **Leaderboard**: Track the performance of different models over time using an improved scoring system.
 - **Performance Chart**: Visualize model performance with interactive charts.
 - **Privacy-Focused**: Uses local Ollama API, avoiding pricey commercial APIs and keeping data close to home.
-- **
+- **Model Suggestions**: Users can suggest new models to be added to the arena.

 ## π Getting Started

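The "Dynamic Model Management" feature added above implies the arena reads its roster from a remotely managed list instead of hard-coding it. The sketch below shows one possible shape for such a registry; the field names and Ollama tags are illustrative assumptions, not the project's actual schema.

```python
# Illustrative sketch only: a possible shape for a remotely managed model list.
# Field names and tags are assumptions, not the arena's real schema.
import random

MODELS = [
    {"name": "SmolLM2 (1.7B)", "ollama_tag": "smollm2:1.7b", "params_b": 1.7},
    {"name": "Gemma 2 (2B)", "ollama_tag": "gemma2:2b", "params_b": 2.0},
    {"name": "Qwen 2.5 (7B)", "ollama_tag": "qwen2.5:7b", "params_b": 7.0},
]

def pick_contestants(models):
    """Pick two distinct models at random for a battle."""
    return random.sample(models, 2)

print(pick_contestants(MODELS))
```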
@@ -43,7 +44,9 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some
 - Python 3.7+
 - Gradio
 - Plotly
--
+- OpenAI Python library (for API compatibility)
+- Nextcloud Python API
+- Ollama (running via OpenAI compatible API wrapper)

 ### Installation

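The new prerequisites rely on Ollama being reachable through an OpenAI-compatible endpoint. A minimal connectivity check is sketched below, assuming Ollama's standard endpoint at http://localhost:11434/v1; adjust the base_url for a remote server. This is a setup-verification sketch, not the arena's own startup code.

```python
# Point the OpenAI client at Ollama's OpenAI-compatible endpoint and list the models
# it can serve. The API key is required by the client but ignored by Ollama.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

for model in client.models.list().data:
    print(model.id)
```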
@@ -55,7 +58,7 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some

 2. Install the required packages:
 ```
-   pip install gradio plotly
+   pip install gradio plotly openai nc_py_api
 ```

 3. Ensure Ollama is running locally or via a remote server.
@@ -71,9 +74,11 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some
 2. In the "Battle Arena" tab:
    - Enter a prompt or use the random prompt generator (π² button).
    - Click "Generate Responses" to see outputs from two random models.
-   - Vote for the better response.
+   - Vote for the better response or choose "Tie" to continue the battle.
 3. Check the "Leaderboard" tab to see overall model performance.
 4. View the "Performance Chart" tab for a visual representation of model wins and losses.
+5. Check the "ELO Leaderboard" for an alternative ranking system.
+6. Use the "Suggest Models" tab to propose new models for the arena.

 ## π Configuration

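To make the battle flow above concrete, here is an illustrative sketch of one round: two randomly chosen models answer the same prompt through the OpenAI-compatible API, and the caller records the vote. The helper name battle_round and the example model tags are assumptions for the sketch, not the app's actual functions.

```python
# Illustrative battle round, not the app's actual code.
import random

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

def battle_round(prompt, model_tags):
    """Return {model_tag: response_text} for two randomly selected contestants."""
    contestants = random.sample(model_tags, 2)
    responses = {}
    for tag in contestants:
        completion = client.chat.completions.create(
            model=tag,
            messages=[{"role": "user", "content": prompt}],
        )
        responses[tag] = completion.choices[0].message.content
    return responses

# Example (tags are placeholders for whatever the arena currently serves):
# battle_round("Explain ELO ratings in one paragraph.", ["gemma2:2b", "qwen2.5:7b", "smollm2:1.7b"])
```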
@@ -137,52 +142,24 @@ In addition to the main leaderboard, we also maintain an ELO-based leaderboard:

 ## π€ Models

-The arena
-
-- LLaMA 3.
-- Granite 3 MoE (3B)
-- Ministral (8B)
-- Dolphin 2.9.4 (8B)
-- Yi v1.5 (6B)
-- Yi v1.5 (9B)
-- Mistral Nemo (12B)
-- GLM4 (9B)
-- InternLM2 v2.5 (7B)
-- Falcon2 (11B)
-- StableLM2 (1.6B)
-- StableLM2 (12B)
-- Solar (10.7B)
-- Rombos Qwen (7B)
-- Rombos Qwen (1.5B)
-- Aya Expanse (8B)
-- SmolLM2 (1.7B)
-- TinyLLama (1.1B)
-- Pints (1.57B)
-- OLMoE (7B)
-- Llama 3.2 Uncensored (3B)
-- Llama 3.1 Hawkish (8B)
-- Humanish Llama 3 (8B)
-- Nemotron Mini (4B)
-- Teuken (7B)
-- Llama 3.1 Sauerkraut (8B)
-- Llama 3.1 SuperNova Lite (8B)
-- EuroLLM (9B)
-- Intellect-1 (10B)
+The arena supports a dynamic list of models that is updated regularly. The current list includes models from various families such as:
+
+- LLaMA 3.x series (1B to 8B)
+- Gemma 2 (2B and 9B)
+- Qwen 2.5 (0.5B to 7B)
+- Mistral and variants
+- Yi models
+- And many more!
+
+For the complete and current list of models, check the arena's leaderboard.
+
+## π Technical Details
+
+The project uses:
+- Nextcloud for remote storage of models list and leaderboard data
+- OpenAI-compatible API interface for model interactions
+- Background thread for periodic model list updates
+- ELO rating system with size-based adjustments

 ## π€ Contributing

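The Technical Details section added above mentions Nextcloud storage for the model list and a background thread that refreshes it. A minimal sketch of how those two pieces could fit together using nc_py_api and the standard library is shown below; the URL, credentials, file path, and refresh interval are placeholders, not the project's real configuration.

```python
# Sketch: the model list lives as a JSON file in Nextcloud and a daemon thread
# pulls it periodically. All connection details below are placeholders.
import json
import threading
import time

from nc_py_api import Nextcloud

nc = Nextcloud(
    nextcloud_url="https://cloud.example.com",
    nc_auth_user="arena",
    nc_auth_pass="app-password",
)

models = []  # shared list, refreshed in the background

def refresh_models(interval_seconds=3600):
    """Periodically download the latest model list from Nextcloud."""
    global models
    while True:
        try:
            raw = nc.files.download("gladiator-arena/models.json")  # returns bytes
            models = json.loads(raw)
        except Exception as exc:
            print(f"Model list refresh failed: {exc}")
        time.sleep(interval_seconds)

# A daemon thread keeps the Gradio app responsive while the list stays current.
threading.Thread(target=refresh_models, daemon=True).start()
```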
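The diff also references an "ELO rating system with size-based adjustments", but the README does not spell out the formula. The sketch below shows one plausible interpretation, scaling the K-factor by the parameter-count ratio, purely as an illustration of the idea rather than the project's actual implementation.

```python
# Illustrative ELO update with a size-based adjustment: beating a larger model is
# rewarded more, and losing to a larger model is punished less. The exact scheme used
# by the arena is not documented here; this is one plausible interpretation.
def elo_update(rating_a, rating_b, size_a_b, size_b_b, a_won, k=32.0):
    """Return updated (rating_a, rating_b); sizes are in billions of parameters."""
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    score_a = 1.0 if a_won else 0.0

    # Boost K when the smaller model wins, dampen it when the smaller model loses.
    size_factor = size_b_b / size_a_b if a_won else size_a_b / size_b_b
    k_adj = k * min(max(size_factor, 0.5), 2.0)  # clamp to avoid extreme swings

    delta = k_adj * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Example: a 1.7B model beating a 9B model at equal rating gains 32 points instead of 16.
print(elo_update(1000, 1000, 1.7, 9.0, a_won=True))
```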