What is an actorder group and what are the advantages of running this in vLLM?

by nickandbro - opened 4 days ago

Discussion

nickandbro

4 days ago

I would really like to know, as I am curious how this might benefit the Pixtral setup my team uses. Thanks!

kylesayrs

NM Testing org 4 days ago

Hi @nickandbro, you can see all the available activation ordering strategies here.

In short, compressing your model with "group" activation ordering leads to better accuracy recovery, but can lead to slightly higher latency. If latency is a concern for your application, consider compressing your model with "weight" activation ordering.
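To make the trade-off concrete, here is a toy sketch in plain Python. It is not llm-compressor or vLLM code: the names `group_index` and `is_contiguous` are hypothetical, and the per-column importance scores are made up. It assumes that "group" ordering forms quantization groups after sorting columns by activation importance, while "weight" ordering keeps the original contiguous groups.

```python
def group_index(importance, group_size, actorder):
    """Return, for each column, the quantization group it belongs to (toy model)."""
    n = len(importance)
    if actorder == "group":
        # Assumption: sort columns by descending activation importance,
        # then cut the sorted sequence into consecutive groups.
        order = sorted(range(n), key=lambda col: -importance[col])
        g_idx = [0] * n
        for rank, col in enumerate(order):
            g_idx[col] = rank // group_size
        return g_idx
    if actorder == "weight":
        # Assumption: groups stay contiguous in the original column order.
        return [col // group_size for col in range(n)]
    raise ValueError(f"unknown actorder: {actorder!r}")


def is_contiguous(g_idx):
    """True when each group covers one unbroken run of columns."""
    return g_idx == sorted(g_idx)


# Made-up importance scores for 8 columns, quantized in groups of 4.
importance = [0.1, 9.0, 0.2, 8.0, 0.3, 7.0, 0.4, 6.0]
for strategy in ("group", "weight"):
    g_idx = group_index(importance, 4, strategy)
    print(strategy, g_idx, "contiguous:", is_contiguous(g_idx))
```

In this toy model, "group" puts columns with similar importance into the same group, which fits the better accuracy recovery described above. The resulting column-to-group map is no longer contiguous, so the runtime would need a per-column lookup, and that is one plausible source of the extra latency. With "weight", the groups stay contiguous and no such lookup is needed.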

kylesayrs

NM Testing org 4 days ago

For additional information on multimodal model compression using llm-compressor for vLLM, see the associated VLM support PR.

kylesayrs changed discussion status to closed 4 days ago

robertgshaw2

NM Testing org 4 days ago

NOTE: We typically suggest using WEIGHT for the strategy

nickandbro

4 days ago

Thanks!
