Zesen Cheng

ClownRat

AI & ML interests

Multi-modal foundation models; segmentation, detection, and tracking

Recent Activity

authored a paper about 7 hours ago

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

upvoted a paper about 10 hours ago

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

updated a model about 16 hours ago

ClownRat/VideoLLaMA2.1-7B-16F

Collections 1

Papers 10

arxiv:2501.00599

arxiv:2411.08147

arxiv:2410.17243

arxiv:2410.12787

Models 5

ClownRat/VideoLLaMA2.1-7B-16F

Text Generation • Updated about 16 hours ago • 2

ClownRat/resnet-50-torchvision

Updated 12 days ago • 1.57k

ClownRat/mask2former-resnet-50-coco-instance

Updated 12 days ago • 847

ClownRat/resnet-101-torchvision

Updated 14 days ago • 7

ClownRat/mask2former-resnet-101-coco-instance

Updated 21 days ago • 10

Datasets 1

ClownRat/COCO2017-Instance

Viewer • Updated 26 days ago • 123k • 21 • 1