Model Card for Unit 1 of the Diffusion Models Class 🧨
This model is a diffusion model for unconditional image generation of beer images 🍺.
Model Description
This model follows the DDPM (Denoising Diffusion Probabilistic Models) framework and was trained specifically to generate images of beer. Its denoising network is a UNet with self-attention in the middle layers.
Model Architecture
- Type: UNet2DModel with self-attention
- Input Channels: 3 (RGB)
- Output Channels: 3
- Image Resolution: 32x32 pixels
- Layers per Block: 2
- Channel Dimensions: (64, 128, 128, 256)
- Attention Layers: Self-attention in the middle down and up blocks
Usage
from diffusers import DDPMPipeline
# Load the model
pipeline = DDPMPipeline.from_pretrained('ffjefckds/sd-class-beer-32')
# Generate an image
image = pipeline().images[0]
image.save("generated_beer.png")
Training
Training Data
The model was trained on a synthetic dataset of beer images, which is available at ffjefckds/small-beer-images.
Training Procedure
- Optimizer: AdamW
- Learning Rate: 4e-4
- Epochs: 500
- Noise Scheduler: DDPM with 1000 timesteps
- Beta Schedule: "squaredcos_cap_v2"
- Batch Size: Determined by the dataloader configuration
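To make the beta schedule above concrete, the following sketch computes a cosine schedule in plain Python. It assumes that "squaredcos_cap_v2" refers to the squared-cosine schedule with each beta capped at 0.999; the offset `s = 0.008`, the cap, and the function name `cosine_betas` are illustrative assumptions and are not stated in this card.

```python
import math


def cosine_betas(num_timesteps=1000, max_beta=0.999, s=0.008):
    """Return per-step betas for a squared-cosine noise schedule.

    ASSUMPTION: s and max_beta follow the commonly used cosine formulation;
    this card does not state them.
    """

    def alpha_bar(t):
        # Cumulative signal level at normalized time t in [0, 1].
        return math.cos((t + s) / (1 + s) * math.pi / 2) ** 2

    betas = []
    for i in range(num_timesteps):
        t1 = i / num_timesteps
        t2 = (i + 1) / num_timesteps
        # beta_i = 1 - alpha_bar(t2) / alpha_bar(t1), capped for stability.
        betas.append(min(1 - alpha_bar(t2) / alpha_bar(t1), max_beta))
    return betas


betas = cosine_betas()
print(f"first beta: {betas[0]:.2e}, last beta: {betas[-1]:.3f}")
```

The betas start very small and grow toward the cap, so noise is added gently at early timesteps and aggressively near the end.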
Training Infrastructure
- Framework: PyTorch
- Library: 🧨 Diffusers
Limitations and Bias
- The model was trained at a low resolution (32x32 pixels)
- Synthetic training data might lead to limited diversity
- At 32x32 pixels, the generated images are too small for most practical applications
- The model may exhibit biases present in the training data
License
MIT License