Update README.md
README.md CHANGED

@@ -4,16 +4,16 @@ emoji: π
 colorFrom: blue
 colorTo: purple
 sdk: gradio
-sdk_version: 5.
+sdk_version: 5.9.1
 app_file: app.py
-pinned:
+pinned: true
 license: mit
 short_description: 'Compact LLM Battle Arena: Frugal AI Face-Off!'
 ---

 # π GPU-Poor LLM Gladiator Arena π

-Welcome to the GPU-Poor LLM Gladiator Arena, where frugal meets fabulous in the world of AI! This project pits compact language models (maxing out at
+Welcome to the GPU-Poor LLM Gladiator Arena, where frugal meets fabulous in the world of AI! This project pits compact language models (maxing out at 14B parameters) against each other in a battle of wits and words.

 ## π€ Starting from "Why?"

@@ -31,10 +31,11 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some
 ## π Features

 - **Battle Arena**: Pit two mystery models against each other and decide which pint-sized powerhouse reigns supreme.
+- **Dynamic Model Management**: Models list is managed remotely, allowing for easy updates without code changes.
 - **Leaderboard**: Track the performance of different models over time using an improved scoring system.
 - **Performance Chart**: Visualize model performance with interactive charts.
 - **Privacy-Focused**: Uses local Ollama API, avoiding pricey commercial APIs and keeping data close to home.
-- **
+- **Model Suggestions**: Users can suggest new models to be added to the arena.

 ## π Getting Started

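The "Dynamic Model Management" feature added above implies the arena reads its roster from a remotely managed list instead of hard-coding it. The sketch below shows one possible shape for such a registry; the field names and Ollama tags are illustrative assumptions, not the project's actual schema.

```python
# Illustrative sketch only: a possible shape for a remotely managed model list.
# Field names and tags are assumptions, not the arena's real schema.
import random

MODELS = [
    {"name": "SmolLM2 (1.7B)", "ollama_tag": "smollm2:1.7b", "params_b": 1.7},
    {"name": "Gemma 2 (2B)", "ollama_tag": "gemma2:2b", "params_b": 2.0},
    {"name": "Qwen 2.5 (7B)", "ollama_tag": "qwen2.5:7b", "params_b": 7.0},
]

def pick_contestants(models):
    """Pick two distinct models at random for a battle."""
    return random.sample(models, 2)

print(pick_contestants(MODELS))
```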
@@ -43,7 +44,9 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some
 - Python 3.7+
 - Gradio
 - Plotly
--
+- OpenAI Python library (for API compatibility)
+- Nextcloud Python API
+- Ollama (running via OpenAI compatible API wrapper)

 ### Installation

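The new prerequisites rely on Ollama being reachable through an OpenAI-compatible endpoint. A minimal connectivity check is sketched below, assuming Ollama's standard endpoint at http://localhost:11434/v1; adjust the base_url for a remote server. This is a setup-verification sketch, not the arena's own startup code.

```python
# Point the OpenAI client at Ollama's OpenAI-compatible endpoint and list the models
# it can serve. The API key is required by the client but ignored by Ollama.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

for model in client.models.list().data:
    print(model.id)
```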
@@ -55,7 +58,7 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some

 2. Install the required packages:
 ```
-   pip install gradio plotly
+   pip install gradio plotly openai nc_py_api
 ```

 3. Ensure Ollama is running locally or via a remote server.
@@ -71,9 +74,11 @@ In the recent months, we've seen a lot of these "Tiny" models released, and some
 2. In the "Battle Arena" tab:
    - Enter a prompt or use the random prompt generator (π² button).
    - Click "Generate Responses" to see outputs from two random models.
-   - Vote for the better response.
+   - Vote for the better response or choose "Tie" to continue the battle.
 3. Check the "Leaderboard" tab to see overall model performance.
 4. View the "Performance Chart" tab for a visual representation of model wins and losses.
+5. Check the "ELO Leaderboard" for an alternative ranking system.
+6. Use the "Suggest Models" tab to propose new models for the arena.

 ## π Configuration

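To make the battle flow above concrete, here is an illustrative sketch of one round: two randomly chosen models answer the same prompt through the OpenAI-compatible API, and the caller records the vote. The helper name battle_round and the example model tags are assumptions for the sketch, not the app's actual functions.

```python
# Illustrative battle round, not the app's actual code.
import random

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

def battle_round(prompt, model_tags):
    """Return {model_tag: response_text} for two randomly selected contestants."""
    contestants = random.sample(model_tags, 2)
    responses = {}
    for tag in contestants:
        completion = client.chat.completions.create(
            model=tag,
            messages=[{"role": "user", "content": prompt}],
        )
        responses[tag] = completion.choices[0].message.content
    return responses

# Example (tags are placeholders for whatever the arena currently serves):
# battle_round("Explain ELO ratings in one paragraph.", ["gemma2:2b", "qwen2.5:7b", "smollm2:1.7b"])
```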
@@ -137,52 +142,24 @@ In addition to the main leaderboard, we also maintain an ELO-based leaderboard:

 ## π€ Models

-The arena
-
-- LLaMA 3.
-- Granite 3 MoE (3B)
-- Ministral (8B)
-- Dolphin 2.9.4 (8B)
-- Yi v1.5 (6B)
-- Yi v1.5 (9B)
-- Mistral Nemo (12B)
-- GLM4 (9B)
-- InternLM2 v2.5 (7B)
-- Falcon2 (11B)
-- StableLM2 (1.6B)
-- StableLM2 (12B)
-- Solar (10.7B)
-- Rombos Qwen (7B)
-- Rombos Qwen (1.5B)
-- Aya Expanse (8B)
-- SmolLM2 (1.7B)
-- TinyLLama (1.1B)
-- Pints (1.57B)
-- OLMoE (7B)
-- Llama 3.2 Uncensored (3B)
-- Llama 3.1 Hawkish (8B)
-- Humanish Llama 3 (8B)
-- Nemotron Mini (4B)
-- Teuken (7B)
-- Llama 3.1 Sauerkraut (8B)
-- Llama 3.1 SuperNova Lite (8B)
-- EuroLLM (9B)
-- Intellect-1 (10B)
+The arena supports a dynamic list of models that is updated regularly. The current list includes models from various families such as:
+
+- LLaMA 3.x series (1B to 8B)
+- Gemma 2 (2B and 9B)
+- Qwen 2.5 (0.5B to 7B)
+- Mistral and variants
+- Yi models
+- And many more!
+
+For the complete and current list of models, check the arena's leaderboard.
+
+## π Technical Details
+
+The project uses:
+- Nextcloud for remote storage of models list and leaderboard data
+- OpenAI-compatible API interface for model interactions
+- Background thread for periodic model list updates
+- ELO rating system with size-based adjustments

 ## π€ Contributing

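The Technical Details section added above mentions Nextcloud storage for the model list and a background thread that refreshes it. A minimal sketch of how those two pieces could fit together using nc_py_api and the standard library is shown below; the URL, credentials, file path, and refresh interval are placeholders, not the project's real configuration.

```python
# Sketch: the model list lives as a JSON file in Nextcloud and a daemon thread
# pulls it periodically. All connection details below are placeholders.
import json
import threading
import time

from nc_py_api import Nextcloud

nc = Nextcloud(
    nextcloud_url="https://cloud.example.com",
    nc_auth_user="arena",
    nc_auth_pass="app-password",
)

models = []  # shared list, refreshed in the background

def refresh_models(interval_seconds=3600):
    """Periodically download the latest model list from Nextcloud."""
    global models
    while True:
        try:
            raw = nc.files.download("gladiator-arena/models.json")  # returns bytes
            models = json.loads(raw)
        except Exception as exc:
            print(f"Model list refresh failed: {exc}")
        time.sleep(interval_seconds)

# A daemon thread keeps the Gradio app responsive while the list stays current.
threading.Thread(target=refresh_models, daemon=True).start()
```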
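The diff also references an "ELO rating system with size-based adjustments", but the README does not spell out the formula. The sketch below shows one plausible interpretation, scaling the K-factor by the parameter-count ratio, purely as an illustration of the idea rather than the project's actual implementation.

```python
# Illustrative ELO update with a size-based adjustment: beating a larger model is
# rewarded more, and losing to a larger model is punished less. The exact scheme used
# by the arena is not documented here; this is one plausible interpretation.
def elo_update(rating_a, rating_b, size_a_b, size_b_b, a_won, k=32.0):
    """Return updated (rating_a, rating_b); sizes are in billions of parameters."""
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    score_a = 1.0 if a_won else 0.0

    # Boost K when the smaller model wins, dampen it when the smaller model loses.
    size_factor = size_b_b / size_a_b if a_won else size_a_b / size_b_b
    k_adj = k * min(max(size_factor, 0.5), 2.0)  # clamp to avoid extreme swings

    delta = k_adj * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Example: a 1.7B model beating a 9B model at equal rating gains 32 points instead of 16.
print(elo_update(1000, 1000, 1.7, 9.0, a_won=True))
```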