arxiv:2311.16863

Power Hungry Processing: Watts Driving the Cost of AI Deployment?

Published on Nov 28, 2023

Abstract

Recent years have seen a surge in the popularity of commercial AI products based on generative, multi-purpose AI systems promising a unified approach to building machine learning (ML) models into technology. However, this ambition of "generality" comes at a steep cost to the environment, given the amount of energy these systems require and the amount of carbon that they emit. In this work, we propose the first systematic comparison of the ongoing inference cost of various categories of ML systems, covering both task-specific models (i.e., finetuned models that carry out a single task) and "general-purpose" models (i.e., those trained for multiple tasks). We measure deployment cost as the amount of energy and carbon required to perform 1,000 inferences on a representative benchmark dataset using these models. We find that multi-purpose, generative architectures are orders of magnitude more expensive than task-specific systems for a variety of tasks, even when controlling for the number of model parameters. We conclude with a discussion around the current trend of deploying multi-purpose generative ML systems, and caution that their utility should be more intentionally weighed against increased costs in terms of energy and emissions. All the data from our study can be accessed via an interactive demo to carry out further exploration and analysis.
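
The abstract's cost metric (energy and carbon per 1,000 inferences) can be illustrated with a small arithmetic sketch. This is not the authors' code: the function names, the example energy reading, and the grid carbon intensity below are all hypothetical values chosen for illustration, and the sketch assumes emissions are estimated as energy multiplied by a grid carbon-intensity factor.

```python
def energy_per_1000_inferences(total_energy_kwh: float, n_inferences: int) -> float:
    """Scale a measured energy total to a 1,000-inference basis."""
    if n_inferences <= 0:
        raise ValueError("n_inferences must be positive")
    return total_energy_kwh * 1000 / n_inferences


def emissions_grams(energy_kwh: float, grid_intensity_g_per_kwh: float) -> float:
    """Estimate emissions (g CO2eq) as energy times grid carbon intensity."""
    return energy_kwh * grid_intensity_g_per_kwh


# Hypothetical run: 0.5 kWh measured over 2,000 inferences,
# on a grid assumed to emit 400 g CO2eq per kWh (made-up figure).
energy_kwh = energy_per_1000_inferences(0.5, 2000)
carbon_g = emissions_grams(energy_kwh, 400.0)
print(f"{energy_kwh} kWh and {carbon_g} g CO2eq per 1,000 inferences")
```

Normalizing to a fixed number of inferences is what makes models with different run lengths comparable; the carbon figure additionally depends on where and when the hardware runs, so the same energy reading can map to quite different emissions.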

Community

The paper says:

"We provide all the code used for our experiments in our GitHub repository, alongside the logs produced by Code Carbon, which not only provides the total energy consumed but also a more fine-grained breakdown by hardware component"

I was not able to find the repo, though. Is there a link to it?

Yes, sorry, it's here: https://github.com/sashavor/co2_inference/
and there is a demo Space here: https://huggingface.co/spaces/sasha/CO2_inference

Can I add them to the paper page?

@sasha

  1. Do I understand correctly that, for the stable-diffusion-xl-base-1.0 run, you included the idle power draw of 7 GPUs and 1148 GiB of mostly unused RAM, with only a single GPU doing inference?
  2. Is the step count around 130?
