- In-situ graph reasoning and knowledge expansion using Graph-PReFLexOR
  Paper • 2501.08120 • Published • 4
- Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers
  Paper • 2501.02393 • Published • 8
- Decoupling Knowledge and Reasoning in Transformers: A Modular Architecture with Generalized Cross-Attention
  Paper • 2501.00823 • Published
Collections
Collections including paper arxiv:2501.02393
- lamm-mit/Llama-3.2-3B-Instruct-Sparse-GIN-orca-math-word-problems
  Updated • 11 • 1
- lamm-mit/Llama-3.2-3B-Instruct-Sparse-GIN-logic
  Updated • 9 • 1
- lamm-mit/Llama-3.2-3B-Instruct-Sparse-GIN-bio
  Updated • 9
- Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers
  Paper • 2501.02393 • Published • 8

- Video Creation by Demonstration
  Paper • 2412.09551 • Published • 8
- DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
  Paper • 2412.07589 • Published • 45
- Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
  Paper • 2412.06531 • Published • 71
- APOLLO: SGD-like Memory, AdamW-level Performance
  Paper • 2412.05270 • Published • 38

- Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement
  Paper • 2411.06558 • Published • 34
- SlimLM: An Efficient Small Language Model for On-Device Document Assistance
  Paper • 2411.09944 • Published • 12
- Look Every Frame All at Once: Video-Ma^2mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing
  Paper • 2411.19460 • Published • 10
- MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
  Paper • 2412.05237 • Published • 47

- Qwen/Qwen2-VL-7B-Instruct
  Image-Text-to-Text • Updated • 1.73M • 1.06k
- In-situ graph reasoning and knowledge expansion using Graph-PReFLexOR
  Paper • 2501.08120 • Published • 4
- Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers
  Paper • 2501.02393 • Published • 8
- Decoupling Knowledge and Reasoning in Transformers: A Modular Architecture with Generalized Cross-Attention
  Paper • 2501.00823 • Published