JuanRafap's Collections
Interés
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum
Reinforcement Learning
Paper
•
2411.02337
•
Published
•
35
Mixture-of-Transformers: A Sparse and Scalable Architecture for
Multi-Modal Foundation Models
Paper
•
2411.04996
•
Published
•
50
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle
Grandmaster Level
Paper
•
2411.03562
•
Published
•
64
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via
Inference-time Hybrid Information Structurization
Paper
•
2410.08815
•
Published
•
44
Game-theoretic LLM: Agent Workflow for Negotiation Games
Paper
•
2411.05990
•
Published
•
7
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large
Language Models on Mobile Devices
Paper
•
2411.10640
•
Published
•
44
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs
Paper
•
2411.19146
•
Published
•
13
Snowflake/snowflake-arctic-embed-m-v2.0
Sentence Similarity
•
Updated
•
7.19k
•
47
Snowflake/snowflake-arctic-embed-l-v2.0
Sentence Similarity
•
Updated
•
28.6k
•
88
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases
Paper
•
2412.04862
•
Published
•
50
ruliad/deepthought-8b-llama-v0.01-alpha
Text Generation
•
Updated
•
24.6k
•
139
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's
Reasoning Capability
Paper
•
2411.19943
•
Published
•
56
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on
Retrieval-Augmented Generation
Paper
•
2412.02592
•
Published
•
20
RL Zero: Zero-Shot Language to Behaviors without any Supervision
Paper
•
2412.05718
•
Published
•
4
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal
Retrieval-Augmented Generation
Paper
•
2412.10704
•
Published
•
15
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented
Generation for Preference Alignment
Paper
•
2412.13746
•
Published
•
9
Wonderful Matrices: Combining for a More Efficient and Effective
Foundation Model Architecture
Paper
•
2412.11834
•
Published
•
6
Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation
Model Internet Agents
Paper
•
2412.13194
•
Published
•
12
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Paper
•
2412.14711
•
Published
•
15
Ensembling Large Language Models with Process Reward-Guided Tree Search
for Better Complex Reasoning
Paper
•
2412.15797
•
Published
•
16
Progressive Multimodal Reasoning via Active Retrieval
Paper
•
2412.14835
•
Published
•
71
MixLLM: LLM Quantization with Global Mixed-precision between
Output-features and Highly-efficient System Design
Paper
•
2412.14590
•
Published
•
13
Learned Compression for Compressed Learning
Paper
•
2412.09405
•
Published
•
11
Token-Budget-Aware LLM Reasoning
Paper
•
2412.18547
•
Published
•
44
ericsonwillians/distilbert-base-uncased-steam-sentiment
Text Classification
•
Updated
•
16
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via
Collective Monte Carlo Tree Search
Paper
•
2412.18319
•
Published
•
34