Model Card for Unit 1 of the Diffusion Models Class 🧨

This is a diffusion model for unconditional generation of beer images 🍺.

Image 1 Image 2 Image 3

Model Description

This model follows the DDPM (Denoising Diffusion Probabilistic Models) approach and was trained specifically to generate images of beer. The denoising network is a UNet with self-attention in its middle blocks.

Model Architecture

  • Type: UNet2DModel with self-attention
  • Input Channels: 3 (RGB)
  • Output Channels: 3
  • Image Resolution: 32x32 pixels
  • Layers per Block: 2
  • Channel Dimensions: (64, 128, 128, 256)
  • Attention Layers: Present in middle down and up blocks
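The settings above map fairly directly onto the constructor arguments of `UNet2DModel` in Diffusers. The following is a minimal sketch, not the exact training configuration: the card does not list the block types, so the `down_block_types` and `up_block_types` tuples are an assumption chosen to place attention in the blocks nearest the bottleneck. The helper names `unet_kwargs` and `build_unet` are hypothetical.

```python
def unet_kwargs():
    """Keyword arguments mirroring the architecture list in this card."""
    return {
        "sample_size": 32,                          # image resolution
        "in_channels": 3,                           # RGB input
        "out_channels": 3,                          # predicted noise, same shape as input
        "layers_per_block": 2,                      # ResNet layers per UNet block
        "block_out_channels": (64, 128, 128, 256),  # channel dimensions
        # Assumption: the card only says attention sits in the middle
        # down and up blocks; this particular placement is a guess.
        "down_block_types": (
            "DownBlock2D",
            "DownBlock2D",
            "AttnDownBlock2D",
            "AttnDownBlock2D",
        ),
        "up_block_types": (
            "AttnUpBlock2D",
            "AttnUpBlock2D",
            "UpBlock2D",
            "UpBlock2D",
        ),
    }


def build_unet():
    """Instantiate an untrained UNet with the settings above."""
    # Imported lazily so the config can be inspected without diffusers installed.
    from diffusers import UNet2DModel

    return UNet2DModel(**unet_kwargs())
```

Calling `build_unet()` returns a randomly initialised network; the published weights are loaded through the pipeline shown under Usage instead.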

Usage

from diffusers import DDPMPipeline

# Load the model
pipeline = DDPMPipeline.from_pretrained('ffjefckds/sd-class-beer-32')

# Generate an image
image = pipeline().images[0]
image.save("generated_beer.png")

Training

Training Data

The model was trained on a synthetic dataset of beer images, which is available at ffjefckds/small-beer-images.

Training Procedure

  • Optimizer: AdamW
  • Learning Rate: 4e-4
  • Epochs: 500
  • Noise Scheduler: DDPM with 1000 timesteps
  • Beta Schedule: "squaredcos_cap_v2"
  • Batch Size: Not fixed; determined by the dataloader configuration
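To make the noise schedule above more concrete, here is a dependency-free sketch of the cosine schedule that the name "squaredcos_cap_v2" refers to, together with the DDPM forward noising step. It is written from the published formula and is believed to match what Diffusers computes, but it is an illustration rather than the library's code. The function names `cosine_betas`, `cumulative_alphas`, and `add_noise` are hypothetical.

```python
import math


def cosine_betas(num_timesteps=1000, max_beta=0.999):
    """Per-step noise variances for a squared-cosine schedule, capped at max_beta."""

    def alpha_bar(t):
        # Fraction of signal remaining at continuous time t in [0, 1].
        return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

    betas = []
    for i in range(num_timesteps):
        t1 = i / num_timesteps
        t2 = (i + 1) / num_timesteps
        betas.append(min(1 - alpha_bar(t2) / alpha_bar(t1), max_beta))
    return betas


def cumulative_alphas(betas):
    """Running product of (1 - beta), i.e. alpha_bar at each discrete timestep."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(x0, eps, alpha_bar_t):
    """Forward process: mix clean values x0 with Gaussian noise eps."""
    signal = math.sqrt(alpha_bar_t)
    noise = math.sqrt(1.0 - alpha_bar_t)
    return [signal * x + noise * e for x, e in zip(x0, eps)]


betas = cosine_betas(1000)
alpha_bars = cumulative_alphas(betas)
```

During training, a timestep is sampled for each image, the image is noised as in `add_noise`, and the UNet is optimised to predict `eps`. Early timesteps leave the image almost intact (`alpha_bars[0]` is close to 1), while the final timestep is almost pure noise.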

Training Infrastructure

  • Framework: PyTorch
  • Library: 🧨 Diffusers

Limitations and Bias

  • The model was trained at a low resolution (32x32 pixels)
  • Synthetic training data may limit the diversity of generated images
  • Generated images are too small for most practical applications
  • The model may reproduce biases present in the training data

License

MIT License
