Model Card for Unit 1 of the Diffusion Models Class 🧨

This is a diffusion model for unconditional generation of beer images 🍺.

Model Description

This model follows the DDPM (Denoising Diffusion Probabilistic Models) framework and was trained specifically to generate images of beer. Its denoising network is a UNet with self-attention in the middle layers.

Model Architecture

Type: UNet2DModel with self-attention
Input Channels: 3 (RGB)
Output Channels: 3
Image Resolution: 32x32 pixels
Layers per Block: 2
Channel Dimensions: (64, 128, 128, 256)
Attention Layers: Present in middle down and up blocks
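
The settings above can be sketched as a UNet2DModel configuration. This is an illustration, not the author's confirmed training code: the exact block types are an assumption that follows the layout commonly used in Unit 1 of the Diffusion Models Class, and the names UNET_CONFIG and build_unet are hypothetical.

```python
# Hypothetical config mirroring the values listed in this model card.
UNET_CONFIG = {
    "sample_size": 32,                       # image resolution (32x32)
    "in_channels": 3,                        # RGB input
    "out_channels": 3,
    "layers_per_block": 2,
    "block_out_channels": (64, 128, 128, 256),
    # Assumption: attention sits in the deeper down blocks and the
    # matching early up blocks; the card does not list exact block types.
    "down_block_types": (
        "DownBlock2D",
        "DownBlock2D",
        "AttnDownBlock2D",
        "AttnDownBlock2D",
    ),
    "up_block_types": (
        "AttnUpBlock2D",
        "AttnUpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
    ),
}


def build_unet(config=UNET_CONFIG):
    """Instantiate the UNet; diffusers is imported lazily so the config
    can be inspected without the library installed."""
    from diffusers import UNet2DModel

    return UNet2DModel(**config)
```

Calling build_unet() returns a randomly initialized network; the trained weights come from the pipeline shown under Usage.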

Usage

from diffusers import DDPMPipeline

# Load the pretrained pipeline (UNet + noise scheduler) from the Hub
pipeline = DDPMPipeline.from_pretrained("ffjefckds/sd-class-beer-32")

# Generate one 32x32 image and save it
image = pipeline().images[0]
image.save("generated_beer.png")
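
To generate several images at once, or to make sampling reproducible, a small wrapper along the following lines may help. The helper name generate_images is hypothetical; it relies on the batch_size, generator, and num_inference_steps arguments of the DDPMPipeline call.

```python
def generate_images(pipeline, batch_size=4, seed=None, num_inference_steps=1000):
    """Return a list of generated images from a DDPMPipeline-like callable.

    If `seed` is given, a seeded torch.Generator is passed to the pipeline
    so repeated calls yield the same samples.
    """
    generator = None
    if seed is not None:
        import torch  # imported lazily; only needed for seeding

        generator = torch.Generator().manual_seed(seed)

    output = pipeline(
        batch_size=batch_size,
        generator=generator,
        num_inference_steps=num_inference_steps,
    )
    return output.images
```

For example, generate_images(pipeline, batch_size=8, seed=0) would return eight PIL images that can be saved individually.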

Training

Training Data

The model was trained on a synthetic dataset of beer images, which is available at ffjefckds/small-beer-images.

Training Procedure

Optimizer: AdamW
Learning Rate: 4e-4
Epochs: 500
Noise Scheduler: DDPM with 1000 timesteps
Beta Schedule: "squaredcos_cap_v2"
Batch Size: Determined by the dataloader configuration
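
For intuition about the "squaredcos_cap_v2" beta schedule, here is a plain-Python sketch of a cosine schedule with a cap on each beta. It reflects my understanding of how that schedule is defined (squared-cosine cumulative alpha with a 0.008 offset and a 0.999 cap); the function name cosine_betas is hypothetical, and the library implementation remains the reference.

```python
import math


def cosine_betas(num_timesteps=1000, max_beta=0.999, offset=0.008):
    """Per-step noise variances (betas) for a capped cosine schedule."""

    def alpha_bar(t):
        # Cumulative signal fraction at continuous time t in [0, 1]
        return math.cos((t + offset) / (1 + offset) * math.pi / 2) ** 2

    betas = []
    for i in range(num_timesteps):
        t1 = i / num_timesteps
        t2 = (i + 1) / num_timesteps
        # beta_i = 1 - alpha_bar(t2) / alpha_bar(t1), capped to avoid
        # a singular final step
        betas.append(min(1 - alpha_bar(t2) / alpha_bar(t1), max_beta))
    return betas


betas = cosine_betas()
print(len(betas), betas[0], betas[-1])
```

The betas start very small, so early timesteps add little noise, and the final step is clipped at the cap.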

Training Infrastructure

Framework: PyTorch
Library: 🧨 Diffusers
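
Put together, a training loop with PyTorch and 🧨 Diffusers might look roughly like the sketch below. It is not the author's training script: the function name train, the TRAIN_CONFIG dict, and the "images" batch key are assumptions, and the noise-prediction MSE loss is the standard DDPM objective rather than something this card states explicitly.

```python
# Hypothetical hyperparameters copied from the Training Procedure section.
TRAIN_CONFIG = {
    "learning_rate": 4e-4,
    "num_epochs": 500,
    "num_train_timesteps": 1000,
    "beta_schedule": "squaredcos_cap_v2",
}


def train(model, dataloader, device="cpu", config=TRAIN_CONFIG):
    """Train a UNet2DModel to predict the noise added to clean images."""
    import torch
    import torch.nn.functional as F
    from diffusers import DDPMScheduler

    scheduler = DDPMScheduler(
        num_train_timesteps=config["num_train_timesteps"],
        beta_schedule=config["beta_schedule"],
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=config["learning_rate"])
    model.to(device).train()

    for _ in range(config["num_epochs"]):
        for batch in dataloader:
            clean = batch["images"].to(device)  # assumed batch key
            noise = torch.randn_like(clean)
            # Pick a random timestep for each image in the batch
            timesteps = torch.randint(
                0,
                scheduler.config.num_train_timesteps,
                (clean.shape[0],),
                device=device,
            ).long()
            # Forward diffusion: mix clean images with noise
            noisy = scheduler.add_noise(clean, noise, timesteps)
            # The model predicts the noise that was added
            pred = model(noisy, timesteps).sample
            loss = F.mse_loss(pred, noise)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```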

Limitations and Bias

The model is trained at a low resolution (32x32 pixels), so generated images are too small for most practical applications
Synthetic training data may limit the diversity of generated images
The model may reproduce biases present in the training data

License

MIT License