AI & ML interests

We develop infrastructure for the evaluation of generated text.

Recent Activity

GEM's activity

prithivMLmodsย 
posted an update about 16 hours ago
view post
Post
1581
Reasoning SmolLM2 ๐Ÿš€

๐ŸŽฏFine-tuning SmolLM2 on a lightweight synthetic reasoning dataset for reasoning-specific tasks. Future updates will focus on lightweight, blazing-fast reasoning models. Until then, check out the blog for fine-tuning details.

๐Ÿ”ฅBlog : https://huggingface.co/blog/prithivMLmods/smollm2-ft

๐Ÿ”ผ Models :
+ SmolLM2-CoT-360M : prithivMLmods/SmolLM2-CoT-360M
+ Reasoning-SmolLM2-135M : prithivMLmods/Reasoning-SmolLM2-135M
+ SmolLM2-CoT-360M-GGUF : prithivMLmods/SmolLM2-CoT-360M-GGUF

๐Ÿค  Other Details :
+ Demo : prithivMLmods/SmolLM2-CoT-360M
+ Fine-tune nB : prithivMLmods/SmolLM2-CoT-360M




lewtunย 
posted an update about 22 hours ago
view post
Post
1474
I was initially pretty sceptical about Meta's Coconut paper [1] because the largest perf gains were reported on toy linguistic problems. However, these results on machine translation are pretty impressive!

https://x.com/casper_hansen_/status/1875872309996855343

Together with the recent PRIME method [2] for scaling RL, reasoning for open models is looking pretty exciting for 2025!

[1] Training Large Language Models to Reason in a Continuous Latent Space (2412.06769)
[2] https://huggingface.co/blog/ganqu/prime
Sri-Vigneshwar-DJย 
posted an update 2 days ago
view post
Post
2100
Combining smolagents with Anthropicโ€™s best practices simplifies building powerful AI agents:

1. Code-Based Agents: Write actions as Python code, reducing steps by 30%.
2. Prompt Chaining: Break tasks into sequential subtasks with validation gates.
3. Routing: Classify inputs and direct them to specialized handlers.
4. Fallback: Handle tasks even if classification fails.

https://huggingface.co/blog/Sri-Vigneshwar-DJ/building-effective-agents-with-anthropics-best-pra
prithivMLmodsย 
posted an update 6 days ago
view post
Post
3553
Triangulum Catalogued ๐Ÿ”ฅ๐Ÿ’ซ

๐ŸŽฏTriangulum is a collection of pretrained and instruction-tuned generative models, designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.

+ Triangulum-10B : prithivMLmods/Triangulum-10B
+ Quants : prithivMLmods/Triangulum-10B-GGUF

+ Triangulum-5B : prithivMLmods/Triangulum-5B
+ Quants : prithivMLmods/Triangulum-5B-GGUF

+ Triangulum-1B : prithivMLmods/Triangulum-1B
+ Quants : prithivMLmods/Triangulum-1B-GGUF
ยท
lewtunย 
posted an update 8 days ago
view post
Post
1965
This paper ( HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs (2412.18925)) has a really interesting recipe for inducing o1-like behaviour in Llama models:

* Iteratively sample CoTs from the model, using a mix of different search strategies. This gives you something like Stream of Search via prompting.
* Verify correctness of each CoT using GPT-4o (needed because exact match doesn't work well in medicine where there are lots of aliases)
* Use GPT-4o to reformat the concatenated CoTs into a single stream that includes smooth transitions like "hmm, wait" etc that one sees in o1
* Use the resulting data for SFT & RL
* Use sparse rewards from GPT-4o to guide RL training. They find RL gives an average ~3 point boost across medical benchmarks and SFT on this data already gives a strong improvement.

Applying this strategy to other domains could be quite promising, provided the training data can be formulated with verifiable problems!
  • 1 reply
ยท
prithivMLmodsย 
posted an update 15 days ago
prithivMLmodsย 
posted an update 18 days ago
view post
Post
2493
Qwen2VL Models: Vision and Language Processing ๐Ÿ‰

๐Ÿ“FT; [ Latex OCR, Math Parsing, Text Analogy OCRTest ]

Colab Demo: prithivMLmods/Qwen2-VL-OCR-2B-Instruct

โ„๏ธDemo : prithivMLmods/Qwen2-VL-2B . The demo includes the Qwen2VL 2B Base Model.

๐ŸŽฏThe space handles documenting content from the input image along with standardized plain text. It includes adjustment tools with over 30 font styles, file formatting support for PDF and DOCX, textual alignments, font size adjustments, and line spacing modifications.

๐Ÿ“„PDFs are rendered using the ReportLab software library toolkit.

๐ŸงตModels :
+ prithivMLmods/Qwen2-VL-OCR-2B-Instruct
+ prithivMLmods/Qwen2-VL-Ocrtest-2B-Instruct
+ prithivMLmods/Qwen2-VL-Math-Prase-2B-Instruct

๐Ÿš€Sample Document :
+ https://drive.google.com/file/d/1Hfqqzq4Xc-3eTjbz-jcQY84V5E1YM71E/view?usp=sharing

๐Ÿ“ฆCollection :
+ prithivMLmods/vision-language-models-67639f790e806e1f9799979f

.
.
.
@prithivMLmods ๐Ÿค—
  • 1 reply
ยท
prithivMLmodsย 
posted an update 19 days ago
view post
Post
3242
๐ŸŽ„ Here Before - Xmas๐ŸŽ…โœจ

๐Ÿง‘๐Ÿปโ€๐ŸŽ„Models
+ [ Xmas 2D Illustration ] : strangerzonehf/Flux-Xmas-Illustration-LoRA
+ [ Xmas 3D Art ] : strangerzonehf/Flux-Xmas-3D-LoRA
+ [ Xmas Chocolate ] : strangerzonehf/Flux-Xmas-Chocolate-LoRA
+ [ Xmas Isometric Kit ] : strangerzonehf/Flux-Xmas-Isometric-Kit-LoRA
+ [ Xmas Realpix ] : strangerzonehf/Flux-Xmas-Realpix-LoRA
+ [ Xmas Anime ] : strangerzonehf/Flux-Anime-Xmas-LoRA

โ„๏ธCollections
+ [ Xmas Art ] : strangerzonehf/christmas-pack-6758b199487adafaddb68f82
+ [ Stranger Zone Collection ] : prithivMLmods/stranger-zone-collections-org-6737118adcf2cb40d66d0c7e

๐ŸฅถPage
+ [ Stranger Zone ] : https://huggingface.co/strangerzonehf


.
.
.
@prithivMLmods ๐Ÿค—
lewtunย 
posted an update 21 days ago
view post
Post
6665
We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute ๐Ÿ”ฅ

How? By combining step-wise reward models with tree search algorithms :)

We show that smol models can match or exceed the performance of their much larger siblings when given enough "time to think"

We're open sourcing the full recipe and sharing a detailed blog post.

In our blog post we cover:

๐Ÿ“ˆ Compute-optimal scaling: How we implemented DeepMind's recipe to boost the mathematical capabilities of open models at test-time.

๐ŸŽ„ Diverse Verifier Tree Search (DVTS): An unpublished extension we developed to the verifier-guided tree search technique. This simple yet effective method improves diversity and delivers better performance, particularly at large test-time compute budgets.

๐Ÿงญ Search and Learn: A lightweight toolkit for implementing search strategies with LLMs and built for speed with vLLM

Here's the links:

- Blog post: HuggingFaceH4/blogpost-scaling-test-time-compute

- Code: https://github.com/huggingface/search-and-learn

Enjoy!
  • 2 replies
ยท