jmkim0309's Collections
daily papers
updated
GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation
Paper • 2312.04557 • Published • 12
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
Paper • 2312.04410 • Published • 14
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
Paper • 2312.04461 • Published • 60
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Paper • 2401.02955 • Published • 21
Denoising Vision Transformers
Paper • 2401.02957 • Published • 28
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation
Paper • 2312.16272 • Published • 6
PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion
Paper • 2312.16486 • Published • 6
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
Paper • 2411.07126 • Published • 28
Motion Control for Enhanced Complex Action Video Generation
Paper • 2411.08328 • Published • 5
JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation
Paper • 2411.07975 • Published • 27
Pyramidal Flow Matching for Efficient Video Generative Modeling
Paper • 2410.05954 • Published • 39
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
Paper • 2412.04432 • Published • 14
LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment
Paper • 2412.04814 • Published • 45
Mind the Time: Temporally-Controlled Multi-Event Video Generation
Paper • 2412.05263 • Published • 10
OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows
Paper • 2412.01169 • Published • 12
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation
Paper • 2410.13861 • Published • 53
UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency
Paper • 2412.15216 • Published • 5
MotiF: Making Text Count in Image Animation with Motion Focal Loss
Paper • 2412.16153 • Published • 6
Large Motion Video Autoencoding with Cross-modal Video VAE
Paper • 2412.17805 • Published • 24