Hugging Face TB Research


AI & ML interests

Exploring smol models and high-quality web and synthetic datasets generated by LLMs (TB is for Textbook, as inspired by the "Textbooks Are All You Need" paper)

Recent Activity

HuggingFaceTB's activity

Xenova in HuggingFaceTB/finemath, about 9 hours ago

Arrgh Spam
#30 opened about 12 hours ago by ZiggyS
cfahlgren1 posted an update 1 day ago
You'll notice the AI in the SQL Console is much better at working with ChatML conversations.

Here's an example of unnesting the cfahlgren1/react-code-instructions dataset in less than 10 seconds just by asking it. Check it out here: cfahlgren1/react-code-instructions

- "show me the average assistant response length"
- "extract user, system, and assistant messages into separate columns"

It's super easy to work with conversational datasets now with natural language 🗣️
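
For reference, the console runs DuckDB under the hood, so you can reproduce this kind of unnesting locally. A minimal sketch in Python, assuming the dataset stores ChatML conversations in a `messages` list-of-struct column with `role` and `content` fields (the real schema may differ):

```python
import duckdb  # pip install duckdb

# Average assistant response length, computed directly over the dataset's
# parquet files on the Hub. `messages`, `role`, and `content` are assumed
# column/field names following the usual ChatML layout.
result = duckdb.sql("""
    SELECT avg(length(m.content)) AS avg_assistant_response_length
    FROM (
        SELECT unnest(messages) AS m
        FROM 'hf://datasets/cfahlgren1/react-code-instructions/**/*.parquet'
    )
    WHERE m.role = 'assistant'
""")
print(result)
```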

Xenova posted an update 3 days ago
merve posted an update 4 days ago
supercharge your LLM apps with smolagents 🔥

however cool your LLM is, without being agentic it can only go so far

enter smolagents: a new agent library by Hugging Face that lets the LLM write code, do analysis, and automate boring stuff!

Here's our blog post to get you started: https://huggingface.co/blog/smolagents
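
If you want to try it, here's a minimal sketch following the launch examples; the class names (`CodeAgent`, `HfApiModel`, `DuckDuckGoSearchTool`) are the ones from the blog post and may differ in later releases:

```python
# pip install smolagents
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# A CodeAgent solves the task by writing and executing small Python
# snippets, calling its tools (here, web search) as ordinary functions.
model = HfApiModel()  # defaults to a hosted model on the HF Inference API
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)

agent.run(
    "How many seconds would it take for a leopard at full speed "
    "to run through Pont des Arts?"
)
```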
cfahlgren1 posted an update 5 days ago
lewtun posted an update 6 days ago
This paper (HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs, arXiv:2412.18925) has a really interesting recipe for inducing o1-like behaviour in Llama models:

* Iteratively sample CoTs from the model, using a mix of different search strategies. This gives you something like Stream of Search via prompting.
* Verify the correctness of each CoT using GPT-4o (needed because exact match doesn't work well in medicine, where there are lots of aliases).
* Use GPT-4o to reformat the concatenated CoTs into a single stream that includes the smooth transitions like "hmm, wait" etc. that one sees in o1.
* Use the resulting data for SFT and RL.
* Use sparse rewards from GPT-4o to guide RL training. They find RL gives an average ~3 point boost across medical benchmarks, and SFT on this data already gives a strong improvement.

Applying this strategy to other domains could be quite promising, provided the training data can be formulated with verifiable problems!
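
A rough sketch of the sampling-and-verification loop in Python; the helper functions and strategy names are hypothetical stand-ins for the paper's components, not its actual code:

```python
import random

# Hypothetical stand-ins: a policy model that samples a chain of thought
# under a given search strategy, and a GPT-4o judge that checks whether
# the CoT's final answer matches the reference (exact match is unreliable
# in medicine, so an LLM verifier is used instead).
def sample_cot(question: str, strategy: str) -> str:
    ...  # call the policy model here

def verify_with_gpt4o(question: str, cot: str, reference: str) -> bool:
    ...  # ask the judge model to compare the CoT's answer to the reference

STRATEGIES = ["explore_new_path", "backtrack", "verify", "correct"]

def collect_cot_trace(question: str, reference: str, max_tries: int = 8) -> list[str]:
    """Resample with varied strategies until a CoT verifies, keeping the
    full trace of attempts; GPT-4o later rewrites the concatenated trace
    into one smooth o1-style stream ("hmm, wait, ...") for SFT."""
    attempts: list[str] = []
    for _ in range(max_tries):
        cot = sample_cot(question, random.choice(STRATEGIES))
        attempts.append(cot)
        if verify_with_gpt4o(question, cot, reference):
            return attempts
    return []  # discard questions that never produce a verified CoT
```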

Update README.md

#9 opened 9 days ago by Amyww
davanstrien posted an update 9 days ago
🇸🇰 Hovorte po slovensky? Help build better AI for Slovak!

We only need 90 more annotations to include Slovak in the next release of the Hugging Face FineWeb2-C dataset (data-is-better-together/fineweb-c)!

Your contribution will help create better language models for 5+ million Slovak speakers.

Annotate here: data-is-better-together/fineweb-c.

Read more about why we're doing it: https://huggingface.co/blog/davanstrien/fineweb2-community

Upload re.zip

#8 opened 10 days ago by Amyww

Create test

#7 opened 10 days ago by Amyww

Create ?

#4 opened 11 days ago by Amyww
merve posted an update 11 days ago