
Shuo Zhang

Meteonis

00index

AI & ML interests

None yet

Meteonis's activity

liked a model 11 days ago

deepseek-ai/DeepSeek-V3-Base

Updated 8 days ago • 8.36k • 1.16k

upvoted 2 collections 12 days ago

long-cot-dataset

Collection

16 items • Updated 15 days ago • 3

DeepSeek-V3

Collection

3 items • Updated about 22 hours ago • 98

New activity in deepseek-ai/DeepSeek-V3-Base 12 days ago

vllm/sglang deploy script?

#14 opened 12 days ago by Meteonis

liked a Space 16 days ago

🔍 QwQ-32B-Preview

Space • Running • 833

upvoted a paper 4 months ago

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published Aug 23, 2024 • 23

commented on a paper 4 months ago

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published Aug 23, 2024 • 23

upvoted a paper 4 months ago

In-Context Imitation Learning via Next-Token Prediction

Paper • 2408.15980 • Published Aug 28, 2024 • 9

upvoted a paper 5 months ago

Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models

Paper • 2408.06663 • Published Aug 13, 2024 • 16

liked a Space 5 months ago

🐯 ChuanhuChatGPT

Space • Runtime error • 920

authored 3 papers 7 months ago

upvoted 2 papers 7 months ago

Scaling Laws of RoPE-based Extrapolation

Paper • 2310.05209 • Published Oct 8, 2023 • 7

InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning

Paper • 2402.06332 • Published Feb 9, 2024 • 18

authored a paper 11 months ago

InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning

Paper • 2402.06332 • Published Feb 9, 2024 • 18

liked a model 12 months ago

internlm/internlm2-chat-20b

Text Generation • Updated Aug 20, 2024 • 4.12k • 87

upvoted a paper 12 months ago

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Paper • 2401.06066 • Published Jan 11, 2024 • 44

upvoted a paper about 1 year ago

Leveraging Large Language Models for Automated Proof Synthesis in Rust

Paper • 2311.03739 • Published Nov 7, 2023 • 5

New activity in internlm/internlm-chat-20b over 1 year ago

prompt format?

#3 opened over 1 year ago by lucasjin