AI & ML interests

None defined yet.

Recent Activity

blog-explorers's activity

lunarflu 
in blog-explorers/README about 3 hours ago

[Support] Community Articles

70
#5 opened 10 months ago by
victor
AdinaY 
posted an update about 8 hours ago
akkasayaz 
in blog-explorers/README about 10 hours ago

[Support] Community Articles

70
#5 opened 10 months ago by
victor
DamarJati 
posted an update 6 days ago
2108
Happy New Year 2025 🤗
For the Hugging Face community.
alielfilali01 
posted an update 8 days ago
1727
~75% on the challenging GPQA with only 40M parameters 🔥🥳

GREAT ACHIEVEMENT ! Or is it ?

This new work, "Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation", takes the mystery out of many models whose results I personally suspected, especially on leaderboards other than the English one, like the Open Arabic LLM Leaderboard OALL/Open-Arabic-LLM-Leaderboard.

The authors of this work first trained a model on the GPQA data, which, unsurprisingly, led to the model achieving 100% performance.

Afterward, they trained what they referred to as a 'legitimate' model on legitimate data (MedMCQA). However, they introduced a distillation loss from the earlier, 'cheated' model.

What they discovered was fascinating: the knowledge of GPQA leaked through this distillation loss, even though the legitimate model was never explicitly trained on GPQA during this stage.

This raises important questions about the careful use of distillation in model training, especially when the training data is opaque. As they demonstrated, it’s apparently possible to (intentionally or unintentionally) leak test data through this method.
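
To make the mechanism concrete, here is a minimal sketch of a distillation objective. The post does not give the paper's exact loss, so this assumes the common temperature-scaled formulation (cross-entropy on the legitimate label plus a KL term toward the teacher); `alpha`, `temperature`, and all function names are illustrative, not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(student_logits, label):
    # Hard-label loss on the legitimate training example.
    return -math.log(softmax(student_logits)[label])

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_objective(student_logits, teacher_logits, label,
                           alpha=0.5, temperature=2.0):
    # Assumed form: (1 - alpha) * CE + alpha * T^2 * KL(teacher || student).
    hard = cross_entropy(student_logits, label)
    soft = kl_divergence(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    return (1 - alpha) * hard + alpha * temperature ** 2 * soft

# The KL term pulls the student toward the teacher's output distribution,
# so anything the teacher memorized can shape the student's gradients.
student = [0.2, 0.1, -0.3]
teacher = [4.0, -1.0, -2.0]
loss = distillation_objective(student, teacher, label=1)
```

The soft term is the only path through which the teacher influences the student here, which is why a contaminated teacher can pass on benchmark knowledge even when the student's own data is clean.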

Find out more: Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation (2412.15255)
  • 1 reply
·
AdinaY 
posted an update 12 days ago
AdinaY 
posted an update 13 days ago
2949
QvQ-72B-Preview🎄 an open weight model for visual reasoning just released by Alibaba_Qwen team
Qwen/qvq-676448c820912236342b9888
✨ Combines visual understanding & language reasoning.
✨ Scores 70.3 on MMMU
✨ Outperforms Qwen2-VL-72B-Instruct in complex problem-solving
Abhaykoul 
posted an update 17 days ago
1639
🔥 BIG ANNOUNCEMENT: THE HELPINGAI API IS LIVE! 🔥

Yo, the moment you’ve all been waiting for is here! 🚀 The HelpingAI API is now LIVE and ready to level up your projects! 🔥 We’re bringing that next-level AI goodness straight to your fingertips. 💯

No more waiting— it’s time to build something epic! 🙌

From now on, you can integrate our cutting-edge AI models into your own applications, workflows, and everything in between. Whether you’re a developer, a creator, or just someone looking to make some serious moves, this is your chance to unlock the full potential of emotional intelligence and adaptive AI.

Check out the docs 🔥 and let’s get to work! 🚀

👉 Check out the docs and start building (https://helpingai.co/docs)
👉 Visit the HelpingAI website (https://helpingai.co/)
·
AdinaY 
posted an update 21 days ago
541
Megrez-3B-Omni 🔥 an on-device multimodal LLM by Infinigence AI, another startup emerging from the Tsinghua University ecosystem.
Model: Infinigence/Megrez-3B-Omni
Demo: Infinigence/Megrez-3B-Omni
✨Supports analysis of image, text, and audio modalities
✨Leads in bilingual speech (English & Chinese) input, multi-turn conversations, and voice-based queries
✨Outperforms in scene understanding and OCR across major benchmarks
alielfilali01 
posted an update 24 days ago
3402
Unpopular opinion: Open Source takes courage to do !

Not everyone is brave enough to release what they have done (the way they've done it) to the wild to be judged !
It really requires a high level of "knowing wth you are doing" ! It's kind of a super power !

Cheers to the heroes here who see this!
·
yjernite 
posted an update 25 days ago
2079
🇪🇺 Policy Thoughts in the EU AI Act Implementation 🇪🇺

There is a lot to like in the first draft of the EU GPAI Code of Practice, especially as regards transparency requirements. The Systemic Risks part, on the other hand, is concerning for both smaller developers and for external stakeholders.

I wrote more on this topic ahead of the next draft. TLDR: more attention to immediate large-scale risks and to collaborative solutions supported by evidence can help everyone - as long as developers disclose sufficient information about their design choices and deployment contexts.

Full blog here, based on our submitted response with @frimelle and @brunatrevelin :

https://huggingface.co/blog/yjernite/eu-draft-cop-risks#on-the-proposed-taxonomy-of-systemic-risks
  • 2 replies
·
julien-c 
posted an update 27 days ago
8025
After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub

TL;DR:
- public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)

docs: https://huggingface.co/docs/hub/storage-limits

We optimize our infrastructure continuously to scale our storage for the coming years of growth in Machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
·
AdinaY 
posted an update 29 days ago
883
Updates from the Chinese community last week 🔥

LLM:
✨ Sailor 2, multilingual model supporting 10+ Southeast Asian languages by Sea AI Lab. https://huggingface.co/sailor2

MLLM:
✨InternVL 2.5 , new open multimodal LLM by OpenGVLab
https://huggingface.co/collections/OpenGVLab/internvl-25-673e1019b66e2218f68d7c1c
✨Qwen2-VL 2B/7B/72B base model, the latest iteration of our Qwen-VL model by Alibaba Qwen
Qwen/qwen2-vl-66cee7455501d7126940800d

Video model:
✨HunyuanVideo , 13B open video model by Tencent
tencent/HunyuanVideo

Reasoning model:
✨ LLaMA-O1 🦙 base & supervised model; pretrain & finetune datasets and demo all released
zh-ai-community/reasoning-models-67409fb3aa1ed78f10087cd7

Audio model:
✨Fish Speech 1.5, Text-to-speech in 13 languages, trained on 1M+ hours of audio by FishAudio
fishaudio/fish-speech-1.5
✨ClearVoice, An advanced voice processing framework by Alibaba Tongyi SpeechAI https://huggingface.co/alibabasglab

More details 👉 https://huggingface.co/zh-ai-community
alielfilali01 
posted an update 29 days ago
1511
Apparently I forgot to put this here !

Well, this is a bit late, but consider giving our recent blog a read if you are interested in Evaluation.

You don't have to be into Arabic NLP in order to read it; the main contribution we are introducing is a new evaluation measure for NLG. We made the first application of this measure on Arabic for now, and we will be working with colleagues from the community to expand it to other languages.

Blog:
Rethinking LLM Evaluation with 3C3H: AraGen Benchmark and Leaderboard
https://huggingface.co/blog/leaderboard-3c3h-aragen

Space:
inceptionai/AraGen-Leaderboard

Give it a read and let me know your thoughts 🤗
christopher 
posted an update 29 days ago
1586
The folks at Foursquare released a dataset of 104.5 million places of interest (foursquare/fsq-os-places) and here's all of them on a plot
·
christopher 
posted an update about 1 month ago
ImranzamanML 
posted an update about 1 month ago
492
A deeper understanding of the C-index evaluation measure for better models
Let's start with three patient groups:

Group A
Group B
Group C
For each patient, we will predict a risk score (a higher score means a higher risk of an early event).

Step 1: Understanding Concordance Index
The Concordance Index (C-index) evaluates how well the model ranks survival times.

Understand with sample data:
Group A has 3 patients with actual survival times and predicted risk scores:

| Patient | Actual Survival Time | Predicted Risk Score |
|---------|----------------------|----------------------|
| P1      | 5 months             | 0.8                  |
| P2      | 3 months             | 0.9                  |
| P3      | 10 months            | 0.2                  |

Comparable pairs:

(P1, P2): P2 has a shorter survival time and a higher risk score → Concordant ✅
(P1, P3): P3 has a longer survival time and a lower risk score → Concordant ✅
(P2, P3): P3 has a longer survival time and a lower risk score → Concordant ✅
Total pairs = 3
Total concordant pairs = 3

C-index for Group A = Concordant pairs/Total pairs= 3/3 = 1.0
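
The pair counting above can be sketched in a few lines of Python. This is a simplified illustration that assumes every patient's event was observed (no censoring) and skips pairs with tied survival times; the function name `concordance_index` is chosen here for illustration and is not from the post.

```python
from itertools import combinations

def concordance_index(times, risks):
    # Fraction of comparable pairs in which the patient with the shorter
    # survival time was assigned the higher predicted risk score.
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied survival times are not comparable in this sketch
        comparable += 1
        shorter, longer = (i, j) if times[i] < times[j] else (j, i)
        if risks[shorter] > risks[longer]:
            concordant += 1.0
        elif risks[shorter] == risks[longer]:
            concordant += 0.5  # tied risk scores count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Group A from the table: P1, P2, P3
group_a_times = [5, 3, 10]        # survival time in months
group_a_risks = [0.8, 0.9, 0.2]   # predicted risk scores
c_index_a = concordance_index(group_a_times, group_a_risks)
print(c_index_a)  # 1.0
```

Real survival data usually includes censored patients, which changes which pairs are comparable; dedicated survival-analysis libraries handle that case.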

Step 2: Calculate C-index for All Groups
Repeat the process for all groups. For now we can assume:

Group A: C-index = 1.0
Group B: C-index = 0.8
Group C: C-index = 0.6
Step 3: Stratified Concordance Index
The Stratified Concordance Index combines the C-index scores of all groups, focusing on the following:

Average performance across groups (mean of C-indices).
Consistency across groups (low standard deviation of C-indices).
Formula:
Stratified C-index = Mean(C-index scores) - Standard Deviation(C-index scores)

Calculate the mean:
Mean = (1.0 + 0.8 + 0.6)/3 = 0.8

Calculate the standard deviation:
Standard Deviation = sqrt(((1.0-0.8)^2 + (0.8-0.8)^2 + (0.6-0.8)^2)/3) ≈ 0.16

Stratified C-index:
Stratified C-index = 0.8 - 0.16 = 0.64
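
As a quick check of the arithmetic, here is a minimal sketch using Python's standard library. It assumes the population standard deviation (dividing by the number of groups, as in the formula above); `stratified_c_index` is an illustrative name, not from the post.

```python
from statistics import mean, pstdev

def stratified_c_index(scores):
    # Mean of the per-group C-indices minus their population standard
    # deviation, so inconsistent performance across groups is penalized.
    return mean(scores) - pstdev(scores)

group_scores = {"A": 1.0, "B": 0.8, "C": 0.6}
result = stratified_c_index(list(group_scores.values()))
print(round(result, 2))  # 0.64
```

Without intermediate rounding the value is about 0.637, which rounds to the 0.64 shown above.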

Step 4: Interpret the Results
A high Stratified C-index means:

The model predicts well overall (high mean C-index).
  • 1 reply
·
AdinaY 
posted an update about 1 month ago
1580
Sailor 2 🚢 open multilingual model for Southeast Asia by Sea AI Lab🔥
https://huggingface.co/sailor2
sail/Sailor2-20B-Chat

✨ Fully open code & ALL datasets 🙌
✨ 1B/8B/20B base & chat expanded on Qwen2.5
✨ Apache 2.0
✨ Supports 15 languages including English, Chinese, Burmese, Cebuano, Ilocano, Indonesian, Javanese, Khmer, Lao, Malay, Sundanese, Tagalog, Thai, Vietnamese, and Waray🇬🇧🇨🇳🇱🇦🇲🇾🇲🇲🇻🇳🇹🇭