Dhruv PRO

dhruv3006

AI & ML interests

None yet

Recent Activity

reacted to singhsidhukuldeep's post with 👍 about 11 hours ago

Groundbreaking Research Alert: Rethinking RAG with Cache-Augmented Generation (CAG)

Researchers from National Chengchi University and Academia Sinica have introduced a paradigm-shifting approach that challenges the conventional wisdom of Retrieval-Augmented Generation (RAG). Instead of the traditional retrieve-then-generate pipeline, their innovative Cache-Augmented Generation (CAG) framework preloads documents and precomputes key-value caches, eliminating the need for real-time retrieval during inference.

Technical Deep Dive:
- CAG preloads external knowledge and precomputes KV caches, storing them for future use
- The system processes documents only once, regardless of subsequent query volume
- During inference, it loads the precomputed cache alongside user queries, enabling rapid response generation
- The cache reset mechanism allows efficient handling of multiple inference sessions through strategic token truncation

Performance Highlights:
- Achieved superior BERTScore metrics compared to both sparse and dense retrieval RAG systems
- Demonstrated up to 40x faster generation times compared to traditional approaches
- Particularly effective with both SQuAD and HotPotQA datasets, showing robust performance across different knowledge tasks

Why This Matters: The approach significantly reduces system complexity, eliminates retrieval latency, and mitigates common RAG pipeline errors. As LLMs continue evolving with expanded context windows, this methodology becomes increasingly relevant for knowledge-intensive applications.
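The preload, query, and reset cycle described in the post can be illustrated with a toy sketch. This is not the paper's implementation: `ToyKVCache`, `preload`, and `answer` are hypothetical names, whitespace splitting stands in for tokenization, and string pairs stand in for the key/value tensors a real model would compute. It only shows the bookkeeping: documents are encoded once, each query is appended to the cache, and the cache is truncated back to the preloaded length afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class ToyKVCache:
    """Hypothetical stand-in for a transformer KV cache: one entry per token."""
    entries: list = field(default_factory=list)
    tokens_encoded: int = 0  # counts how many tokens were ever encoded

    def extend(self, tokens):
        # A real model would compute key/value tensors here; we store dummy pairs.
        for tok in tokens:
            self.entries.append((f"k:{tok}", f"v:{tok}"))
            self.tokens_encoded += 1

    def truncate(self, length):
        # Cache reset: drop everything appended after the preloaded documents.
        del self.entries[length:]


def preload(documents):
    """Encode all documents once and remember where the preloaded part ends."""
    cache = ToyKVCache()
    for doc in documents:
        cache.extend(doc.split())  # assumption: whitespace "tokenizer"
    return cache, len(cache.entries)


def answer(cache, preload_len, query):
    """Append the query, report the visible context size, then reset the cache."""
    cache.extend(query.split())
    context_len = len(cache.entries)  # documents + query seen during generation
    cache.truncate(preload_len)       # ready for the next query
    return context_len


docs = ["CAG preloads documents", "KV caches are precomputed once"]
cache, preload_len = preload(docs)
first = answer(cache, preload_len, "what is preloaded ?")
second = answer(cache, preload_len, "how often ?")
```

In this sketch the document tokens are counted in `tokens_encoded` only once, no matter how many queries follow; each query adds only its own tokens, and truncation restores the cache to its preloaded state between queries.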

liked a Space 5 days ago

black-forest-labs/FLUX.1-Fill-dev

reacted to reach-vb's post with 🔥 11 days ago

VLMs are going through quite an open revolution AND on-device friendly sizes:

1. Google DeepMind w/ PaliGemma2 - 3B, 10B & 28B: https://huggingface.co/collections/google/paligemma-2-release-67500e1e1dbfdd4dee27ba48
2. OpenGVLab w/ InternVL 2.5 - 1B, 2B, 4B, 8B, 26B, 38B & 78B: https://huggingface.co/collections/OpenGVLab/internvl-25-673e1019b66e2218f68d7c1c
3. Qwen w/ Qwen 2 VL - 2B, 7B & 72B: https://huggingface.co/collections/Qwen/qwen2-vl-66cee7455501d7126940800d
4. Microsoft w/ FlorenceVL - 3B & 8B: https://huggingface.co/jiuhai
5. Moondream2 w/ 0.5B: https://huggingface.co/vikhyatk/

What a time to be alive! 🔥

Organizations

dhruv3006's activity

upvoted a paper about 1 month ago

In-Context Learning Creates Task Vectors

Paper • 2310.15916 • Published Oct 24, 2023 • 42

upvoted a collection 2 months ago

Model Merging

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 222