Introducing Observers: AI Observability with Hugging Face datasets through a lightweight SDK Nov 21, 2024 • 35
Article Fine-tune a SmolLM on domain-specific synthetic data from a LLM By davidberenstein1957 • 1 day ago • 7
Article Fine-tune ModernBERT for text classification using synthetic data By davidberenstein1957 • 5 days ago • 18
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 13 days ago • 197
Synthetic Data Generator Collection A collection of tools and datasets related to no-code Synthetic Data Generation. • 16 items • Updated 1 day ago • 5
Smol but mighty Collection A collection of smol but mighty models • 10 items • Updated 17 days ago • 4
Gradio WebRTC Cookbook ⚡️ Collection Collection of real-time voice and video demos built with gradio-webrtc custom component • 8 items • Updated 25 days ago • 10
Lora Land - 27 High-Quality LoRA Adapters Collection 27 Fine-tuned LoRA Adapters using Mistral-7B. Try them here: https://predibase.com/lora-land • 27 items • Updated Apr 26, 2024 • 4
Self-Instruct: Aligning Language Model with Self Generated Instructions Paper • 2212.10560 • Published Dec 20, 2022 • 9
Article 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs By wolfram • Dec 4, 2024 • 75
Open Image Preferences Collection Containing all artifacts for the Stable Diffusion 3.5L vs Flux Dev image preference community sprint. • 14 items • Updated 17 days ago • 6
Article Let’s make a generation of amazing image generation models By burtenshaw • Nov 26, 2024 • 34
Datasets built with ⚗️ distilabel Collection This collection contains some datasets generated and/or labelled using https://github.com/argilla-io/distilabel • 8 items • Updated 25 days ago • 12
Dataset Creation Collection Spaces and utilities for creating datasets and getting them on the Hub • 3 items • Updated Nov 10, 2024 • 10