-
DocLLM: A layout-aware generative language model for multimodal document understanding
Paper • 2401.00908 • Published • 181 -
DocGraphLM: Documental Graph Language Model for Information Extraction
Paper • 2401.02823 • Published • 35 -
TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents
Paper • 2312.01279 • Published • 3 -
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models
Paper • 2311.07575 • Published • 13
Collections
Discover the best community collections!
Collections including paper arxiv:2401.00908
-
UI Layout Generation with LLMs Guided by UI Grammar
Paper • 2310.15455 • Published • 2 -
You Only Look at Screens: Multimodal Chain-of-Action Agents
Paper • 2309.11436 • Published • 1 -
Never-ending Learning of User Interfaces
Paper • 2308.08726 • Published • 2 -
LMDX: Language Model-based Document Information Extraction and Localization
Paper • 2309.10952 • Published • 65
-
Woodpecker: Hallucination Correction for Multimodal Large Language Models
Paper • 2310.16045 • Published • 15 -
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Paper • 2310.14566 • Published • 25 -
SILC: Improving Vision Language Pretraining with Self-Distillation
Paper • 2310.13355 • Published • 8 -
Conditional Diffusion Distillation
Paper • 2310.01407 • Published • 20
-
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53 -
DocLLM: A layout-aware generative language model for multimodal document understanding
Paper • 2401.00908 • Published • 181 -
DocFormerv2: Local Features for Document Understanding
Paper • 2306.01733 • Published • 1
-
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 77 -
Table-GPT: Table-tuned GPT for Diverse Table Tasks
Paper • 2310.09263 • Published • 39 -
Context-Aware Meta-Learning
Paper • 2310.10971 • Published • 16
-
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper • 2312.00752 • Published • 138 -
Elucidating the Design Space of Diffusion-Based Generative Models
Paper • 2206.00364 • Published • 14 -
GLU Variants Improve Transformer
Paper • 2002.05202 • Published • 1 -
StarCoder 2 and The Stack v2: The Next Generation
Paper • 2402.19173 • Published • 136