Lumina Text-to-Music

We will release our implementation and pretrained models as open source in this repository soon.

📰 News

  • [2024-06-07] 🚀🚀🚀 We release the initial version of Lumina-T2Music for text-to-music generation.

Installation

Before installation, ensure that you have a working nvcc:

# The command should work and report a version number matching ours (12.1 in our case).
nvcc --version

On some outdated distros (e.g., CentOS 7), you may also want to check that a recent enough version of gcc is available:

# The command should work and show a version of at least 6.0.
# If not, consult distro-specific tutorials to obtain a newer version or build manually.
gcc --version
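
The two checks above can also be scripted. The following is a minimal sketch, assuming the usual output formats of `nvcc --version` (a `release X.Y` fragment) and `gcc --version` (a dotted version number on the first line). The helper names `nvcc_release`, `gcc_version`, and `version_output` are hypothetical and not part of Lumina-T2X.

```python
import re
import shutil
import subprocess


def nvcc_release(output):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def gcc_version(output):
    """Extract (major, minor) from the first dotted number in `gcc --version` output."""
    match = re.search(r"(\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def version_output(tool):
    """Return the stdout of `<tool> --version`, or None if the tool is not on PATH."""
    if shutil.which(tool) is None:
        return None
    result = subprocess.run([tool, "--version"], capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    nvcc_out = version_output("nvcc")
    gcc_out = version_output("gcc")
    print("nvcc release:", nvcc_release(nvcc_out) if nvcc_out else "nvcc not found")
    gcc = gcc_version(gcc_out) if gcc_out else None
    print("gcc version:", gcc if gcc else "gcc not found")
    # 6.0 is the minimum gcc version stated in this README.
    if gcc and gcc < (6, 0):
        print("gcc is older than 6.0; consider installing a newer version")
```

Versions are compared as tuples, so `(4, 8) < (6, 0)` flags the default CentOS 7 compiler while `(11, 4)` passes.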

Download the Lumina-T2X repo from GitHub:

git clone https://github.com/Alpha-VLLM/Lumina-T2X

1. Create a conda environment and install PyTorch

Note: You may want to adjust the CUDA version according to your driver version.

conda create -n Lumina_T2X -y
conda activate Lumina_T2X
conda install python=3.11 pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y
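
After the environment is created, it can help to confirm that the installed PyTorch build matches the intended CUDA version. This is a small sketch rather than an official check. The function name `torch_report` is hypothetical, and it returns a message instead of failing when PyTorch is missing.

```python
import importlib.util


def torch_report():
    """Summarize the installed PyTorch build, or say that PyTorch is missing."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch

    return (
        f"torch {torch.__version__}, "
        f"built for CUDA {torch.version.cuda}, "
        f"GPU available: {torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(torch_report())
```

If the reported CUDA build differs from what your driver supports, adjust `pytorch-cuda` in the conda command accordingly.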

2. Install dependencies

The environment dependencies for Lumina-T2Music are different from those for Lumina-T2I. Please install the appropriate environment.

Install the Lumina-T2Music dependencies:

cd .. # If you are in the `lumina_music` directory, execute this line.
pip install -e ".[music]"

Alternatively, install the dependencies from requirements.txt:

cd lumina_music # If you are not in the `lumina_music` folder, run this line.
pip install -r requirements.txt

3. Install flash-attn

pip install flash-attn --no-build-isolation

4. Install nvidia apex (optional)

While Apex can improve efficiency, it is not required for Lumina-T2X to work.

Note that Lumina-T2X works smoothly with either:

  • Apex not installed at all; OR
  • Apex successfully installed with CUDA and C++ extensions.

However, it will fail when:

  • A Python-only build of Apex is installed.

If the error No module named 'fused_layer_norm_cuda' appears, it typically means you are using a Python-only build of Apex. To resolve this, please run pip uninstall apex, and Lumina-T2X should then function correctly.
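
The three cases above can be told apart before running the model. The sketch below assumes that the presence of the top-level `fused_layer_norm_cuda` module (the one named in the error message) is a reasonable proxy for a full CUDA and C++ build. The function names `apex_status` and `detect_apex` are hypothetical.

```python
import importlib.util


def apex_status(has_apex, has_cuda_ext):
    """Map the two observations onto the three cases described in this README."""
    if not has_apex:
        return "not installed (supported)"
    if has_cuda_ext:
        return "full build with CUDA and C++ extensions (supported)"
    return "python-only build (unsupported: run `pip uninstall apex`)"


def detect_apex():
    """Check the current environment without importing Apex itself."""
    has_apex = importlib.util.find_spec("apex") is not None
    # Assumption: a full build installs `fused_layer_norm_cuda` as a top-level module.
    has_cuda_ext = importlib.util.find_spec("fused_layer_norm_cuda") is not None
    return apex_status(has_apex, has_cuda_ext)


if __name__ == "__main__":
    print("Apex:", detect_apex())
```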

To install Apex, clone the repo and follow the official guidelines (note that we expect a full build, i.e., with CUDA and C++ extensions):

pip install ninja
git clone https://github.com/NVIDIA/apex
cd apex
# if pip >= 23.1 (ref: https://pip.pypa.io/en/stable/news/#v23-1) which supports multiple `--config-settings` with the same key...
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --config-settings "--build-option=--cpp_ext" --config-settings "--build-option=--cuda_ext" ./
# otherwise
pip install -v --disable-pip-version-check --no-cache-dir --no-build-isolation --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Inference

Preparation

Prepare the pretrained checkpoints.

⭐⭐ (Recommended) Use huggingface-cli to download our model:

huggingface-cli download --resume-download Alpha-VLLM/Lumina-T2Music --local-dir /path/to/ckpt

Or clone the model repository with git:

git clone https://huggingface.co/Alpha-VLLM/Lumina-T2Music
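
The same download can be done from Python with the `huggingface_hub` library, which may be convenient inside a setup script. This is a sketch, not part of the repository. The wrapper name `download_checkpoints` is hypothetical, and `/path/to/ckpt` remains a placeholder for your own directory.

```python
def download_checkpoints(local_dir, repo_id="Alpha-VLLM/Lumina-T2Music"):
    """Download the model repository into `local_dir` and return the local path."""
    # Imported lazily so this module can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


if __name__ == "__main__":
    # Replace the placeholder with a real directory before running.
    print(download_checkpoints("/path/to/ckpt"))
```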

Web Demo

To host a local gradio demo for interactive inference, follow the steps below:

  1. Update the AutoencoderKL checkpoint path

Update configs/lumina-text2music.yaml to set the AutoencoderKL checkpoint path. Replace /path/to/ckpt with the path where your checkpoints are located.

  ...
        depth: 16
        max_len: 1000

    first_stage_config:
      target: models.autoencoder1d.AutoencoderKL
      params:
        embed_dim: 20
        monitor: val/rec_loss
        - ckpt_path: /path/to/ckpt/maa2/maa2.ckpt
        + ckpt_path: <real_path>/maa2/maa2.ckpt
        ddconfig:
          double_z: true
          in_channels: 80
          out_ch: 80
  ...
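
The path substitution shown in the diff can also be applied with a short script. The sketch below treats the config as plain text, so the rest of the YAML formatting stays untouched, and assumes every occurrence of `/path/to/ckpt` in the file should point at the same checkpoint root. The function names are hypothetical.

```python
from pathlib import Path

PLACEHOLDER = "/path/to/ckpt"


def point_config_at(config_text, ckpt_root):
    """Return `config_text` with the placeholder replaced by `ckpt_root`."""
    root = str(ckpt_root).rstrip("/")
    return config_text.replace(PLACEHOLDER, root)


def update_config_file(config_path, ckpt_root):
    """Rewrite the config file in place with the real checkpoint root."""
    path = Path(config_path)
    path.write_text(point_config_at(path.read_text(), ckpt_root))
```

For example, `update_config_file("configs/lumina-text2music.yaml", "/data/lumina")` would turn the placeholder line into `ckpt_path: /data/lumina/maa2/maa2.ckpt`, where `/data/lumina` is an illustrative directory.
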
  2. Set the Lumina-T2Music and vocoder checkpoint paths and run the demo

Please replace /path/to/ckpt with the actual downloaded path.

# `/path/to/ckpt` should be a directory containing `music_generation`, `maa2`, and `bigvnat`.

# default
python -u demo_music.py \
    --ckpt "/path/to/ckpt/music_generation" \
    --vocoder_ckpt "/path/to/ckpt/bigvnat" \
    --config_path "configs/lumina-text2music.yaml" \
    --sample_rate 16000
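
Before launching the demo, it can be useful to confirm that the checkpoint directory has the layout described in the comment above. The following sketch checks for the three subdirectories and assembles the same command line. The helper names `missing_subdirs` and `demo_command` are hypothetical, and the flags are exactly those shown in the command above.

```python
from pathlib import Path

EXPECTED_SUBDIRS = ("music_generation", "maa2", "bigvnat")


def missing_subdirs(ckpt_root):
    """List the expected checkpoint subdirectories that are absent under `ckpt_root`."""
    root = Path(ckpt_root)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


def demo_command(ckpt_root, config_path="configs/lumina-text2music.yaml", sample_rate=16000):
    """Build the argument list for the demo, mirroring the shell command in this README."""
    root = Path(ckpt_root)
    return [
        "python", "-u", "demo_music.py",
        "--ckpt", str(root / "music_generation"),
        "--vocoder_ckpt", str(root / "bigvnat"),
        "--config_path", config_path,
        "--sample_rate", str(sample_rate),
    ]
```

A setup script could refuse to start when `missing_subdirs(...)` is non-empty, and otherwise pass the result of `demo_command(...)` to `subprocess.run`.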

Disclaimer

Any organization or individual is prohibited from using any technology mentioned in this paper to generate someone's speech without his/her consent, including but not limited to government leaders, political figures, and celebrities. If you do not comply with this item, you could be in violation of copyright laws.
