Chinese LLMs on Hugging Face


Recent Activity

AdinaY  updated a collection about 9 hours ago
📊 Dataset
AdinaY  updated a collection 7 days ago
🧠 Reasoning Models

zh-ai-community's activity

AdinaY 
posted an update about 8 hours ago
roseking 
posted an update 2 days ago
🎉 Update: HF Downloader now supports English!

🌏 We're excited to announce that HF Downloader now fully supports an English interface!

✨ What's New:
- Complete English UI
- Bilingual documentation
- Seamless language switching
- Real-time translation of download status

🔍 Whether you're downloading:
- Models
- Datasets
- Spaces

The interface will adapt to your language preference automatically.

🚀 Try it now: Switch languages easily in the top-right corner of the app!

#HuggingFace #OpenSource #Update #GUI
Sri-Vigneshwar-DJ 
posted an update 2 days ago
Combining smolagents with Anthropic’s best practices simplifies building powerful AI agents:

1. Code-Based Agents: Write actions as Python code, reducing steps by 30%.
2. Prompt Chaining: Break tasks into sequential subtasks with validation gates.
3. Routing: Classify inputs and direct them to specialized handlers.
4. Fallback: Handle tasks even if classification fails.

https://huggingface.co/blog/Sri-Vigneshwar-DJ/building-effective-agents-with-anthropics-best-pra
roseking 
posted an update 6 days ago
🤗 Hugging Face Download Tool

The Hugging Face Download Tool is a sophisticated graphical user interface application designed to simplify the process of downloading resources from Hugging Face repositories. This tool addresses common challenges in model and file downloads through its intelligent features and user-friendly interface.

✨ Key Features
- 🖥️ Intuitive graphical interface for easy operation
- 🔄 Advanced retry mechanism with smart error handling
- ⏸️ Resume capability for interrupted downloads
- 📊 Real-time download status monitoring
- 🔐 Secure access to private repositories via token authentication

🛠️ Technical Highlights
The tool implements several advanced features to ensure reliable downloads:
- 📦 Chunk-based downloading with 1MB segments
- ⚡ Adaptive retry intervals (5-300 seconds) based on error types
- 🔌 Connection pooling for optimized performance
- 🛡️ Built-in rate limiting protection
- 🔑 Secure token handling for private repository access

This tool is ideal for researchers, developers, and AI practitioners who regularly work with Hugging Face resources and need a reliable, user-friendly download solution. 💻 It supports all major operating systems and requires minimal setup, making it accessible to users of all technical levels. 🚀

GitHub: https://github.com/2404589803/hf_downloader
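The two mechanisms named in the technical highlights (1 MB chunked transfer and 5-300 second adaptive retry intervals) can be sketched as follows. This is an illustrative approximation rather than the tool's actual code: the function names and the exponential backoff policy are assumptions; only the chunk size and the retry-interval window come from the description above.

```python
import io

CHUNK_SIZE = 1024 * 1024  # 1 MB segments, per the tool's description

def backoff_interval(attempt, base=5.0, cap=300.0):
    """Retry delay that grows with each attempt, clamped to the
    5-300 second window described above (exponential policy assumed)."""
    return min(base * (2 ** attempt), cap)

def copy_in_chunks(src, dst, chunk_size=CHUNK_SIZE):
    """Stream src to dst in fixed-size chunks; returns bytes written.

    Reading in bounded chunks keeps memory flat and makes it possible
    to checkpoint progress, which is what enables resuming an
    interrupted download.
    """
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        written += len(chunk)
    return written
```

A resumable HTTP download would layer an HTTP `Range` request on top of this loop, starting from the byte count already written.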
1aurent 
posted an update 6 days ago
alielfilali01 
posted an update 8 days ago
~75% on the challenging GPQA with only 40M parameters 🔥🥳

GREAT ACHIEVEMENT! Or is it?

This new work, "Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation", takes the mystery out of many models whose results I personally suspected, especially on leaderboards other than the English one, like the Open Arabic LLM Leaderboard (OALL/Open-Arabic-LLM-Leaderboard).

The authors first trained a model directly on the GPQA data, which, unsurprisingly, led to the model achieving 100% performance.

Afterward, they trained what they referred to as a 'legitimate' model on legitimate data (MedMCQA). However, they introduced a distillation loss from the earlier, 'cheated' model.

What they discovered was fascinating: the knowledge of GPQA leaked through this distillation loss, even though the legitimate model was never explicitly trained on GPQA during this stage.

This raises important questions about the careful use of distillation in model training, especially when the training data is opaque. As they demonstrated, it’s apparently possible to (intentionally or unintentionally) leak test data through this method.
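To make the leakage mechanism concrete, here is a minimal sketch of a standard distillation objective in plain Python. The paper's exact loss may differ; `distillation_loss`, the weighting `alpha`, and the omission of a softmax temperature are assumptions. The point is that the student is fit to "legitimate" labels while also being pulled toward the contaminated teacher's output distribution, and that second term is the channel through which the benchmark knowledge travels:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(student_logits, teacher_logits, true_label, alpha=0.5):
    """Cross-entropy on the legitimate label, blended with a KL term
    pulling the student toward the teacher's distribution.

    Even if `true_label` comes from clean data (e.g. MedMCQA), the KL
    term transfers whatever the teacher memorized (e.g. GPQA)."""
    s = softmax(student_logits)
    t = softmax(teacher_logits)
    cross_entropy = -math.log(s[true_label])
    return alpha * cross_entropy + (1 - alpha) * kl_divergence(t, s)
```

With `alpha = 1.0` the teacher is ignored and no leakage is possible; any `alpha < 1.0` opens the channel.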

Find out more: Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation (2412.15255)
AdinaY 
posted an update 12 days ago
AdinaY 
posted an update 13 days ago
QvQ-72B-Preview 🎄 an open-weight model for visual reasoning, just released by the Alibaba Qwen team
Qwen/qvq-676448c820912236342b9888
✨ Combines visual understanding & language reasoning.
✨ Scores 70.3 on MMMU
✨ Outperforms Qwen2-VL-72B-Instruct in complex problem-solving
AdinaY 
posted an update 21 days ago
Megrez-3B-Omni 🔥 an on-device multimodal LLM by Infinigence AI, another startup emerging from the Tsinghua University ecosystem.
Model: Infinigence/Megrez-3B-Omni
Demo: Infinigence/Megrez-3B-Omni
✨Supports analysis of image, text, and audio modalities
✨Leads in bilingual speech ( English & Chinese ) input, multi-turn conversations, and voice-based queries
✨Outperforms in scene understanding and OCR across major benchmarks