gary109's Collections
Text-to-Audio
Large-Scale Automatic Audiobook Creation
Paper • 2309.03926 • Published • 54
FoleyGen: Visually-Guided Audio Generation
Paper • 2309.10537 • Published • 8
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
Paper • 2310.11954 • Published • 25
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper • 2310.00704 • Published • 21
E3 TTS: Easy End-to-End Diffusion-based Text to Speech
Paper • 2311.00945 • Published • 14
In-Context Prompt Editing For Conditional Audio Generation
Paper • 2311.00895 • Published • 10
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
Paper • 2312.03491 • Published • 33
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation
Paper • 2407.02869 • Published • 18
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Paper • 2407.04051 • Published • 35
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation
Paper • 2407.15060 • Published • 9
Improving Text-To-Audio Models with Synthetic Captions
Paper • 2406.15487 • Published