Hugging Face TB Research


AI & ML interests

Exploring smol models and high-quality web and synthetic datasets generated by LLMs (TB is for Textbook, as inspired by the "Textbooks Are All You Need" paper)

Recent Activity

HuggingFaceTB's activity

Xenova in HuggingFaceTB/finemath, about 9 hours ago

Arrgh Spam
#30 opened about 12 hours ago by ZiggyS
cfahlgren1 posted an update 1 day ago
You'll notice the AI in the SQL Console is much better at working with ChatML conversations.

Here's an example of unnesting the cfahlgren1/react-code-instructions dataset in less than 10 seconds just by asking it. Check it out here: cfahlgren1/react-code-instructions

- "show me the average assistant response length"
- "extract user, system, and assistant messages into separate columns"

It's super easy to work with conversational datasets now with natural language 🗣️
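
For reference, the console runs DuckDB under the hood, so you can reproduce this kind of unnesting locally. A minimal sketch in Python, assuming the dataset stores ChatML conversations in a `messages` list-of-struct column with `role` and `content` fields (the real schema may differ):

```python
import duckdb  # pip install duckdb

# Average assistant response length, computed directly over the dataset's
# parquet files on the Hub. `messages`, `role`, and `content` are assumed
# column/field names following the usual ChatML layout.
result = duckdb.sql("""
    SELECT avg(length(m.content)) AS avg_assistant_response_length
    FROM (
        SELECT unnest(messages) AS m
        FROM 'hf://datasets/cfahlgren1/react-code-instructions/**/*.parquet'
    )
    WHERE m.role = 'assistant'
""")
print(result)
```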

Xenova posted an update 3 days ago
merve posted an update 4 days ago
supercharge your LLM apps with smolagents 🔥

however cool your LLM is, without being agentic it can only go so far

enter smolagents: a new agent library by Hugging Face that lets the LLM write code, do analysis, and automate boring stuff!

Here's our blog post to get you started: https://huggingface.co/blog/smolagents
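
If you want to try it, here's a minimal sketch following the launch examples; the class names (`CodeAgent`, `HfApiModel`, `DuckDuckGoSearchTool`) are the ones from the blog post and may differ in later releases:

```python
# pip install smolagents
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# A CodeAgent solves the task by writing and executing small Python
# snippets, calling its tools (here, web search) as ordinary functions.
model = HfApiModel()  # defaults to a hosted model on the HF Inference API
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)

agent.run(
    "How many seconds would it take for a leopard at full speed "
    "to run through Pont des Arts?"
)
```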
cfahlgren1 posted an update 5 days ago
lewtun posted an update 6 days ago
This paper (HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs, arXiv:2412.18925) has a really interesting recipe for inducing o1-like behaviour in Llama models:

* Iteratively sample CoTs from the model, using a mix of different search strategies. This gives you something like Stream of Search via prompting.
* Verify the correctness of each CoT using GPT-4o (needed because exact match doesn't work well in medicine, where there are lots of aliases).
* Use GPT-4o to reformat the concatenated CoTs into a single stream that includes the smooth transitions like "hmm, wait" etc. that one sees in o1.
* Use the resulting data for SFT and RL.
* Use sparse rewards from GPT-4o to guide RL training. They find RL gives an average ~3 point boost across medical benchmarks, and SFT on this data already gives a strong improvement.

Applying this strategy to other domains could be quite promising, provided the training data can be formulated with verifiable problems!
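
A rough sketch of the sampling-and-verification loop in Python; the helper functions and strategy names are hypothetical stand-ins for the paper's components, not its actual code:

```python
import random

# Hypothetical stand-ins: a policy model that samples a chain of thought
# under a given search strategy, and a GPT-4o judge that checks whether
# the CoT's final answer matches the reference (exact match is unreliable
# in medicine, so an LLM verifier is used instead).
def sample_cot(question: str, strategy: str) -> str:
    ...  # call the policy model here

def verify_with_gpt4o(question: str, cot: str, reference: str) -> bool:
    ...  # ask the judge model to compare the CoT's answer to the reference

STRATEGIES = ["explore_new_path", "backtrack", "verify", "correct"]

def collect_cot_trace(question: str, reference: str, max_tries: int = 8) -> list[str]:
    """Resample with varied strategies until a CoT verifies, keeping the
    full trace of attempts; GPT-4o later rewrites the concatenated trace
    into one smooth o1-style stream ("hmm, wait, ...") for SFT."""
    attempts: list[str] = []
    for _ in range(max_tries):
        cot = sample_cot(question, random.choice(STRATEGIES))
        attempts.append(cot)
        if verify_with_gpt4o(question, cot, reference):
            return attempts
    return []  # discard questions that never produce a verified CoT
```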

Update README.md

#9 opened 9 days ago by Amyww
davanstrien posted an update 9 days ago
🇸🇰 Hovorte po slovensky? Help build better AI for Slovak!

We only need 90 more annotations to include Slovak in the next release of the Hugging Face FineWeb2-C dataset (data-is-better-together/fineweb-c)!

Your contribution will help create better language models for 5+ million Slovak speakers.

Annotate here: data-is-better-together/fineweb-c.

Read more about why we're doing it: https://huggingface.co/blog/davanstrien/fineweb2-community

Upload re.zip

#8 opened 10 days ago by Amyww

Create test

#7 opened 10 days ago by Amyww

Create ?

#4 opened 11 days ago by Amyww
merve posted an update 11 days ago