DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Paper • 2408.08152 • Published Aug 15, 2024
CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents Paper • 2407.01511 • Published Jul 1, 2024
A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models Paper • 2405.18208 • Published May 28, 2024
CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society Paper • 2303.17760 • Published Mar 31, 2023
Rapid Adaptation in Online Continual Learning: Are We Evaluating It Right? Paper • 2305.09275 • Published May 16, 2023
SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training? Paper • 2402.01832 • Published Feb 2, 2024
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch Paper • 2406.14563 • Published Jun 20, 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published May 7, 2024
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Paper • 2405.14333 • Published May 23, 2024
DeepSeek-VL: Towards Real-World Vision-Language Understanding Paper • 2403.05525 • Published Mar 8, 2024
Can Large Language Model Agents Simulate Human Trust Behaviors? Paper • 2402.04559 • Published Feb 7, 2024