Csaba Kecskemeti PRO

csabakecskemeti

AI & ML interests

None yet

Recent Activity

Organizations

Zillow, DevQuasar, Hugging Face Party @ PyTorch Conference, Intelligent Estate, open/ acc

csabakecskemeti's activity

reacted to singhsidhukuldeep's post with 👍 2 days ago
Groundbreaking Research Alert: Rethinking RAG with Cache-Augmented Generation (CAG)

Researchers from National Chengchi University and Academia Sinica have introduced a paradigm-shifting approach that challenges the conventional wisdom of Retrieval-Augmented Generation (RAG).

Instead of the traditional retrieve-then-generate pipeline, their innovative Cache-Augmented Generation (CAG) framework preloads documents and precomputes key-value caches, eliminating the need for real-time retrieval during inference.

Technical Deep Dive:
- CAG preloads external knowledge and precomputes KV caches, storing them for future use
- The system processes documents only once, regardless of subsequent query volume
- During inference, it loads the precomputed cache alongside user queries, enabling rapid response generation
- The cache reset mechanism allows efficient handling of multiple inference sessions through strategic token truncation
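
The bookkeeping behind those four points can be sketched in a few lines. This is only a toy illustration, not the paper's implementation: `ToyKVCache` and `ToyCAG` are hypothetical names, and the "cache" holds plain tokens where a real model would hold per-layer key/value tensors. It shows documents being processed once, queries appended after the preloaded cache, and the reset done by truncating back to the preloaded length.

```python
from dataclasses import dataclass, field


@dataclass
class ToyKVCache:
    """Stand-in for a transformer KV cache: one entry per processed token."""
    entries: list = field(default_factory=list)

    def extend(self, tokens):
        # A real model would store key/value tensors; storing the tokens
        # themselves keeps the bookkeeping visible.
        self.entries.extend(tokens)

    def truncate(self, length):
        # Cache reset: drop everything appended after the preloaded documents.
        del self.entries[length:]


class ToyCAG:
    def __init__(self, documents):
        self.cache = ToyKVCache()
        self.doc_passes = 0
        for doc in documents:
            # Each document is processed exactly once, at preload time.
            self.cache.extend(doc.split())
            self.doc_passes += 1
        self.preloaded_len = len(self.cache.entries)

    def answer(self, query):
        # Query tokens are appended after the cached documents.
        self.cache.extend(query.split())
        context_len = len(self.cache.entries)
        # Reset for the next session by truncating back to the documents.
        self.cache.truncate(self.preloaded_len)
        return context_len


cag = ToyCAG(["Paris is the capital of France", "Berlin is the capital of Germany"])
for question in ["What is the capital of France?", "And of Germany?"]:
    print(cag.answer(question), "tokens in context;", cag.doc_passes, "document passes so far")
```

However many queries are answered, `doc_passes` stays at the number of preloaded documents, which is the property the post describes.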

Performance Highlights:
- Achieved superior BERTScore metrics compared to both sparse and dense retrieval RAG systems
- Demonstrated up to 40x faster generation times compared to traditional approaches
- Particularly effective with both SQuAD and HotPotQA datasets, showing robust performance across different knowledge tasks

Why This Matters:
The approach significantly reduces system complexity, eliminates retrieval latency, and mitigates common RAG pipeline errors. As LLMs continue evolving with expanded context windows, this methodology becomes increasingly relevant for knowledge-intensive applications.
replied to their post 3 days ago
posted an update 3 days ago
posted an update 4 days ago
reacted to s-emanuilov's post with 👍👀 4 days ago
Hey HF community! 👋

Excited to share Monkt - a tool I built to solve the eternal headache of processing documents for ML/AI pipelines.

What it does: Converts PDFs, Word, PowerPoint, Excel, Web pages or raw HTML into clean Markdown or structured JSON.

Great for:
✔ LLM training dataset preparation;
✔ Knowledge base construction;
✔ Research paper processing;
✔ Technical documentation management.

It has API access for integration into ML pipelines.

Check it out at https://monkt.com/ if you want to save time on document processing infrastructure.

Looking forward to your feedback!
  • 3 replies
posted an update 6 days ago
reacted to prithivMLmods's post with ❤️ 6 days ago
Triangulum Catalogued 🔥💫

🎯 Triangulum is a collection of pretrained and instruction-tuned generative models, designed for multilingual applications. These models are trained using synthetic datasets based on long chains of thought, enabling them to perform complex reasoning tasks effectively.

+ Triangulum-10B : prithivMLmods/Triangulum-10B
+ Quants : prithivMLmods/Triangulum-10B-GGUF

+ Triangulum-5B : prithivMLmods/Triangulum-5B
+ Quants : prithivMLmods/Triangulum-5B-GGUF

+ Triangulum-1B : prithivMLmods/Triangulum-1B
+ Quants : prithivMLmods/Triangulum-1B-GGUF
reacted to DamarJati's post with ➕ 6 days ago
Happy New Year 2025 🤗
For the Huggingface community.
reacted to prithivMLmods's post with 🤗 6 days ago
reacted to sequelbox's post with 👍 6 days ago
posted an update 7 days ago
reacted to ginipick's post with 🔥 7 days ago
🌊 [Dokdo Membership - Next Generation AI Video Creation Platform]

✨ Transform your imagination into mesmerizing videos with Dokdo Membership, an innovative AI-powered platform that generates unique videos from text and images. Built as a streamlined SaaS boilerplate using Python Gradio for Hugging Face users, this tool offers an intuitive way to create AI-generated videos with minimal effort.

🎯 [Key Features]
- 📧 Email-based authentication system with secure login/signup
- 🎁 15 points automatically credited upon registration
- 💰 5 points deduction per video generation
- 🌐 Bilingual support (Korean/English) with automatic translation
- 🖼️ Optional first frame image upload capability
- ⭐ Automatic GiniGEN.AI watermark integration

🚀 [Technical Specifications]
1. 💫 Modern, responsive user interface with Gradio components
2. 📊 Efficient resource management through points system
3. 🎥 High-quality video generation using advanced AI models
4. 🔄 Seamless translation pipeline for multilingual support
5. ⚡ Real-time point tracking and management system
6. 🛡️ Comprehensive content moderation and filtering

📝 [How to Use]
1. ✅ Register with your email to receive 15 initial points
2. 💭 Enter your video description (supports both English and Korean)
3. 📤 Upload a reference image for the first frame (optional)
4. 🎬 Click "Generate Video" (consumes 5 points)
5. 📥 Preview and download your generated video

🔧 [Technical Implementation]
- Built with Python Gradio for seamless Hugging Face Space integration
- Implements secure user authentication and session management
- Features real-time point tracking and automated deduction system
- Includes comprehensive error handling and input validation
- Utilizes advanced AI models for video generation
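
As a rough illustration of the points arithmetic described above (15 points on registration, 5 per video, so three videos before a top-up), here is a minimal in-memory sketch. `PointsLedger` and its methods are hypothetical names, not taken from the Space's code, and a real deployment would presumably persist balances rather than keep them in a dict.

```python
class PointsLedger:
    SIGNUP_BONUS = 15  # credited on registration, per the post
    VIDEO_COST = 5     # deducted per generated video, per the post

    def __init__(self):
        self.balances = {}

    def register(self, email):
        if email in self.balances:
            raise ValueError("already registered")
        self.balances[email] = self.SIGNUP_BONUS

    def generate_video(self, email):
        """Deduct the cost and return True, or return False if points are insufficient."""
        balance = self.balances[email]
        if balance < self.VIDEO_COST:
            return False
        self.balances[email] = balance - self.VIDEO_COST
        return True


ledger = PointsLedger()
ledger.register("user@example.com")
print([ledger.generate_video("user@example.com") for _ in range(4)])
```

The fourth request is refused because the balance has reached zero after three generations.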

📮 Need additional points for more creations? Contact us at [email protected] for point acquisition options through public contributions or paid services.

ginigen/Dokdo-membership
  • 1 reply
reacted to cfahlgren1's post with 🚀 7 days ago
reacted to onekq's post with 🔥 9 days ago
🐋 DeepSeek 🐋 v3 achieves a solid 7-point jump over v2.5, surpassing GPT-4o, but is still behind 🍓 o1 🍓 and Claude 3.5.

onekq-ai/WebApp1K-models-leaderboard
posted an update 9 days ago
I've built a small utility to split safetensors files, file by file.
The need came up when I tried to convert the new DeepSeek V3 model from FP8 to BF16.
The only Ada-architecture GPU I have is an RTX 4080, and its 16GB of VRAM just wasn't enough for the conversion.

BTW: I'll upload the BF16 version here:
DevQuasar/deepseek-ai.DeepSeek-V3-Base-bf16
(it will take a while - days with my upload speed)
If anyone has access to the resources to test it, I'd appreciate feedback on whether it works.

The tool is available here:
https://github.com/csabakecskemeti/ai_utils/blob/main/safetensor_splitter.py
It splits every file into n pieces, by layer where possible, and creates a new "model.safetensors.index.json" file.
I've tested it with Llama 3.1 8B and multiple split sizes, and validated the results by running an inference pipeline.
Use --help for usage.
Please note that the current version expects the model to already be split across multiple files and to have a "model.safetensors.index.json" layer-to-safetensors mapping file.
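
For readers who want a feel for the index side of this, here is a minimal sketch of regrouping a "weight_map" by layer. It is not the code from the linked script: `split_weight_map`, the `-partXofN` file naming, and the assumption that tensor names follow the Llama-style `model.layers.<i>.` pattern are all illustrative. It only rebuilds the tensor-to-file mapping; a real splitter also has to write the tensors into the new files.

```python
import json
import re
from collections import defaultdict

# Assumes Llama-style tensor names such as "model.layers.12.mlp.up_proj.weight".
LAYER_RE = re.compile(r"\.layers\.(\d+)\.")


def split_weight_map(weight_map, pieces):
    """Reassign each source shard's tensors to at most `pieces` new shard names."""
    by_file = defaultdict(list)
    for tensor_name, filename in weight_map.items():
        by_file[filename].append(tensor_name)

    new_map = {}
    for filename, names in sorted(by_file.items()):
        # Group tensors by layer index; tensors outside any layer share group -1.
        groups = defaultdict(list)
        for name in names:
            match = LAYER_RE.search(name)
            groups[int(match.group(1)) if match else -1].append(name)
        keys = sorted(groups)
        n = min(pieces, len(keys))  # cannot make more pieces than layer groups
        stem = filename.removesuffix(".safetensors")
        for position, key in enumerate(keys):
            part = position * n // len(keys)  # neighbouring layers stay together
            for name in groups[key]:
                new_map[name] = f"{stem}-part{part + 1}of{n}.safetensors"
    return new_map


weight_map = {
    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
    "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
    "model.layers.2.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
    "lm_head.weight": "model-00002-of-00002.safetensors",
}
print(json.dumps({"weight_map": split_weight_map(weight_map, 2)}, indent=2))
```

Keeping whole layers together means a loader that walks the model layer by layer touches as few files as possible at a time.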
reacted to MoritzLaurer's post with 👍 17 days ago
Quite excited by the ModernBERT release! 0.15/0.4B small, 2T modern pre-training data and tokenizer with code, 8k context window, great efficient model for embeddings & classification!

This will probably be the basis for many future SOTA encoders! And I can finally stop using DeBERTav3 from 2021 :D

Congrats @answerdotai , @LightOnIO and collaborators like @tomaarsen !

Paper and models here 👇 https://huggingface.co/collections/answerdotai/modernbert-67627ad707a4acbf33c41deb
replied to luigi12345's post 18 days ago
posted an update 20 days ago
reacted to cutechicken's post with ❤️ 20 days ago
🚀 RAGOndevice: High-Performance Local AI Document Analysis Assistant
💫 Core Value
RAGOndevice is a high-performance AI system running locally without cloud dependency. Using CohereForAI's optimized 7B model, it enables professional-grade document analysis on standard PCs. ✨
🌟 Ondevice AI Advantages
1. 🔋 Efficient Resource Utilization

🎯 Optimized 7B Model: Runs on standard PCs
⚡ Local Processing: Instant response without cloud
💻 Low-Spec Compatible: Performs well on regular GPUs
🔄 Optimized Memory: Ensures stable operation

2. 🛡️ Data Security & Cost Efficiency

🔒 Complete Privacy: No external data transmission
🌐 Offline Operation: No internet required
💰 No Subscription: One-time installation
⚙️ Resource Optimization: Uses existing hardware

🎮 Key Features
1. 📊 Powerful Document Analysis

📁 Multi-Format Support: TXT, CSV, PDF, Parquet
🧠 Intelligent Analysis: Automatic structure recognition
👁️ OCR Support: Advanced PDF text extraction
💬 Real-time Chat: Natural language interaction

2. 🔍 Local RAG System

🎯 Efficient Search: TF-IDF based local search
🧩 Context Understanding: Accurate information retrieval
📚 Wikipedia Integration: Rich background knowledge
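
To make "TF-IDF based local search" concrete, here is a minimal pure-Python sketch of the general technique. It is not RAGOndevice's code: the function names, the whitespace tokenizer, and the `log(N/df) + 1` weighting are assumptions chosen to keep the example dependency-free.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def build_index(docs):
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Offset by 1 so a term found in every document still gets non-zero weight.
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(tokens).items()} for tokens in tokenized]
    return idf, vectors


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def search(query, docs, top_k=1):
    idf, vectors = build_index(docs)
    # Query terms unseen in the corpus get weight 0 and do not affect ranking.
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    ranked = sorted(range(len(docs)), key=lambda i: cosine(q, vectors[i]), reverse=True)
    return [docs[i] for i in ranked[:top_k]]


docs = [
    "invoice payment terms and due dates",
    "gpu memory usage during inference",
    "meeting notes from the research group",
]
print(search("gpu inference memory", docs))
```

Everything here runs on the local machine with no network access, which is the property the post emphasizes; the retrieved passages would then be placed in the model's prompt.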

🎯 Use Cases

🏢 Enterprise: Secure confidential document processing
🔬 Personal Research: Private data analysis
📚 Education: Personal learning material analysis
💻 Development: Local codebase analysis

⭐ Differentiators

🏃‍♂️ Independent Operation: Zero cloud dependency
⚡ Instant Response: No network latency
🔐 Complete Security: Full data control
💎 Cost Efficiency: No ongoing costs

🔮 Future Plans

🚀 Enhanced model optimization
📚 Local knowledge base expansion
⚡ Hardware optimization
📁 Extended file support


🌟 RAGOndevice democratizes high-performance AI, providing the optimal local AI solution for security-sensitive environments. 🚀

🔥 Power of Local AI: Experience enterprise-grade AI capabilities right on your device!

VIDraft/RAGOndevice