yup https://x.com/ylecun/status/1861178764996079752. Would be cool to have a "verified" badge at some point
Clem π€ PRO (clem)
AI & ML interests: multi-modal, time-series, biology and chemistry
Recent Activity
liked a dataset ylecun/mnist · about 13 hours ago
reacted to cfahlgren1's post with π · 1 day ago
clem's activity
replied to their post · about 13 hours ago
reacted to cfahlgren1's post with π · 1 day ago
The deepseek-ai/DeepSeek-V3 is very good! I have been playing with it and found it is really good at one-shotting a pretty good landing page.
You can play with it here: https://deepseek-artifacts.vercel.app
All the responses get saved in the cfahlgren1/react-code-instructions dataset. Hopefully we can build one of the biggest, highest quality frontend datasets on the hub πͺ
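For reference, a minimal sketch for pulling that dataset with the datasets library; split names and schema are not given in the post, so this just loads and inspects whatever comes back:

```python
from datasets import load_dataset

# Splits and columns aren't specified in the post,
# so load everything and inspect the returned DatasetDict.
ds = load_dataset("cfahlgren1/react-code-instructions")
print(ds)
```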
reacted to csabakecskemeti's post with πππ€― · 1 day ago
The deepseek-ai/DeepSeek-V3-Base model was featured today on CNBC tech news. The whale made a splash by using FP8 and shrinking the cost of training significantly!
https://youtu.be/NJljq429cGk?si=kgk-ogPTMfJKsaA2
reacted to sequelbox's post with ππ · 1 day ago
Check out the early preview of the upcoming Tachibana-QVQ dataset: code-reasoning and code-instruct data generated with Qwen/QVQ-72B-Preview.
Link here: sequelbox/Tachibana-QVQ-PREVIEW
more to come :)
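A quick, hedged way to peek at the preview without a full download; the "train" split name is an assumption:

```python
from datasets import load_dataset

# Stream a few rows instead of downloading the whole preview;
# the "train" split name is an assumption.
rows = load_dataset("sequelbox/Tachibana-QVQ-PREVIEW", split="train", streaming=True)
for i, row in enumerate(rows):
    print(row)
    if i == 2:
        break
```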
reacted to merve's post with β€οΈππ₯ · 1 day ago
supercharge your LLM apps with smolagents π₯
however cool your LLM is, without being agentic it can only go so far
enter smolagents: a new agent library by Hugging Face to make the LLM write code, do analysis and automate boring stuff!
Here's our blog for you to get started https://huggingface.co/blog/smolagents
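A minimal sketch following the quickstart in that blog post; tool and model class names may have shifted since launch, so treat this as illustrative:

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# A CodeAgent has the LLM write and execute Python snippets to solve the task,
# here with a web-search tool and the default Hub inference model.
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")
```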
reacted to tomaarsen's post with ππ₯β€οΈ · 1 day ago
That didn't take long! Nomic AI has finetuned the new ModernBERT-base encoder model into a strong embedding model for search, classification, clustering and more!
Details:
π€ Based on ModernBERT-base with 149M parameters.
π Outperforms both nomic-embed-text-v1 and nomic-embed-text-v1.5 on MTEB!
ποΈ Immediate FA2 and unpadding support for super efficient inference.
πͺ Trained with Matryoshka support, i.e. 2 valid output dimensionalities: 768 and 256.
β‘οΈ Maximum sequence length of 8192 tokens!
2οΈβ£ Trained in 2 stages: unsupervised contrastive data -> high quality labeled datasets.
β Integrated in Sentence Transformers, Transformers, LangChain, LlamaIndex, Haystack, etc.
ποΈ Apache 2.0 licensed: fully commercially permissible
Try it out here: nomic-ai/modernbert-embed-base
Very nice work by Zach Nussbaum and colleagues at Nomic AI.
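A minimal Sentence Transformers sketch; the search_query/search_document prefixes are carried over from the nomic-embed family and are an assumption here, so double-check the model card:

```python
from sentence_transformers import SentenceTransformer

# truncate_dim=256 selects the smaller Matryoshka dimensionality mentioned above.
model = SentenceTransformer("nomic-ai/modernbert-embed-base", truncate_dim=256)

# Task prefixes follow the nomic-embed convention (assumption; see model card).
query_emb = model.encode(["search_query: What is ModernBERT?"])
doc_emb = model.encode(["search_document: ModernBERT is an encoder with an 8192-token context."])
print(model.similarity(query_emb, doc_emb))
```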
reacted to singhsidhukuldeep's post with β€οΈπ€― · 1 day ago
Excited to share insights from Walmart's groundbreaking semantic search system that revolutionizes e-commerce product discovery!
The team at Walmart Global Technology (the team that I am a part of π¬) has developed a hybrid retrieval system that combines traditional inverted index search with neural embedding-based search to tackle the challenging problem of tail queries in e-commerce.
Key Technical Highlights:
β’ The system uses a two-tower BERT architecture where one tower processes queries and another processes product information, generating dense vector representations for semantic matching.
β’ Product information is enriched by combining titles with key attributes like category, brand, color, and gender using special prefix tokens to help the model distinguish different attribute types.
β’ The neural model leverages DistilBERT with 6 layers and projects the 768-dimensional embeddings down to 256 dimensions using a linear layer, achieving optimal performance while reducing storage and computation costs.
β’ To improve model training, they implemented innovative negative sampling techniques combining product category matching and token overlap filtering to identify challenging negative examples.
Production Implementation Details:
β’ The system uses a managed ANN (Approximate Nearest Neighbor) service to enable fast retrieval, achieving 99% recall@20 with just 13ms latency.
β’ Query embeddings are cached with preset TTL (Time-To-Live) to reduce latency and costs in production.
β’ The model is exported to ONNX format and served in Java, with custom optimizations like fixed input shapes and GPU acceleration using NVIDIA T4 GPUs.
Results:
The system showed significant improvements in both offline metrics and live experiments, with:
- +2.84% improvement in NDCG@10 for human evaluation
- +0.54% lift in Add-to-Cart rates in live A/B testing
This is a fantastic example of how modern NLP techniques can be successfully deployed at scale to solve real-world e-commerce problems.
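To make the architecture concrete, here is an illustrative two-tower sketch in PyTorch; the base model choice, [CLS] pooling, and the attribute-prefix format are assumptions for illustration, not Walmart's actual code:

```python
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class TwoTowerEncoder(nn.Module):
    """Two towers (query / product) with a 768 -> 256 linear projection,
    mirroring the setup the post describes. Illustrative only."""
    def __init__(self, base="distilbert-base-uncased", out_dim=256):
        super().__init__()
        self.query_tower = AutoModel.from_pretrained(base)
        self.product_tower = AutoModel.from_pretrained(base)
        self.proj = nn.Linear(768, out_dim)  # DistilBERT hidden size is 768

    def _embed(self, tower, batch):
        hidden = tower(**batch).last_hidden_state[:, 0]  # [CLS] pooling (assumption)
        return nn.functional.normalize(self.proj(hidden), dim=-1)

    def forward(self, query_batch, product_batch):
        return (self._embed(self.query_tower, query_batch),
                self._embed(self.product_tower, product_batch))

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = TwoTowerEncoder()
q = tok(["red running shoes"], return_tensors="pt")
# Hypothetical attribute-prefix format for the enriched product side.
p = tok(["[title] Revolution 6 [brand] Nike [color] red [gender] men"], return_tensors="pt")
q_emb, p_emb = model(q, p)
print((q_emb * p_emb).sum(-1))  # cosine similarity after L2 normalization
```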
reacted to DamarJati's post with πββ€οΈ · 1 day ago
Happy New Year 2025 π€
For the Hugging Face community.
reacted to prithivMLmods's post with β€οΈπ₯ · 1 day ago
Triangulum Catalogued π₯π«
π―Triangulum is a collection of pretrained and instruction-tuned generative models, designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.
+ Triangulum-10B : prithivMLmods/Triangulum-10B
+ Quants : prithivMLmods/Triangulum-10B-GGUF
+ Triangulum-5B : prithivMLmods/Triangulum-5B
+ Quants : prithivMLmods/Triangulum-5B-GGUF
+ Triangulum-1B : prithivMLmods/Triangulum-1B
+ Quants : prithivMLmods/Triangulum-1B-GGUF
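A hedged loading sketch with transformers, assuming these ship as standard causal LMs on the Hub; check each model card for the recommended chat/prompt format:

```python
from transformers import pipeline

# Assumes a standard causal-LM repo; the 1B variant keeps the example light.
generator = pipeline("text-generation", model="prithivMLmods/Triangulum-1B", device_map="auto")
print(generator("Explain, step by step, why the sky is blue.", max_new_tokens=128)[0]["generated_text"])
```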