Zilikon

Zilikon
ยท

AI & ML interests

None yet

Recent Activity

Organizations

None yet

Zilikon's activity

reacted to Xenova's post with ๐Ÿ”ฅ 5 days ago
view post
Post
4965
First project of 2025: Vision Transformer Explorer

I built a web app to interactively explore the self-attention maps produced by ViTs. This explains what the model is focusing on when making predictions, and provides insights into its inner workings! ๐Ÿคฏ

Try it out yourself! ๐Ÿ‘‡
webml-community/attention-visualization

Source code: https://github.com/huggingface/transformers.js-examples/tree/main/attention-visualization
reacted to s-emanuilov's post with ๐Ÿ‘€ 5 days ago
view post
Post
2503
Hey HF community! ๐Ÿ‘‹

Excited to share Monkt - a tool I built to solve the eternal headache of processing documents for ML/AI pipelines.

What it does: Converts PDFs, Word, PowerPoint, Excel, Web pages or raw HTML into clean Markdown or structured JSON.

Great for:
โœ” LLM training dataset preparation;
โœ” Knowledge base construction;
โœ” Research paper processing;
โœ” Technical documentation management.

It has API access for integration into ML pipelines.

Check it out at https://monkt.com/ if you want to save time on document processing infrastructure.

Looking forward to your feedback!
  • 3 replies
ยท
New activity in deepseek-ai/DeepSeek-V3-Base 9 days ago

Confusing Answer

3
#36 opened 9 days ago by
Zilikon
reacted to lewtun's post with ๐Ÿ”ฅ 19 days ago
view post
Post
6665
We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute ๐Ÿ”ฅ

How? By combining step-wise reward models with tree search algorithms :)

We show that smol models can match or exceed the performance of their much larger siblings when given enough "time to think"

We're open sourcing the full recipe and sharing a detailed blog post.

In our blog post we cover:

๐Ÿ“ˆ Compute-optimal scaling: How we implemented DeepMind's recipe to boost the mathematical capabilities of open models at test-time.

๐ŸŽ„ Diverse Verifier Tree Search (DVTS): An unpublished extension we developed to the verifier-guided tree search technique. This simple yet effective method improves diversity and delivers better performance, particularly at large test-time compute budgets.

๐Ÿงญ Search and Learn: A lightweight toolkit for implementing search strategies with LLMs and built for speed with vLLM

Here's the links:

- Blog post: HuggingFaceH4/blogpost-scaling-test-time-compute

- Code: https://github.com/huggingface/search-and-learn

Enjoy!
  • 2 replies
ยท