What is an actorder group and what are the advantages of running this in vLLM?

#1
by nickandbro - opened

Would really like to know, as I am curious how this may benefit the Pixtral setup my team uses. Thanks!

NM Testing org

Hi @nickandbro, you can see all the available activation ordering strategies here.

In short, compressing your model with "group" activation ordering leads to better accuracy recovery, but can also lead to slightly higher latency. If latency is a concern for your application, consider compressing your model with "weight" activation ordering instead.
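To make the tradeoff a bit more concrete, here is a rough conceptual sketch in plain Python. It is not llm-compressor's actual implementation: the function name `group_index` and the `importance` scores are made up for illustration, and it assumes that "group" ordering forms quantization groups after sorting columns by activation importance, while "weight" ordering keeps groups contiguous in the original column layout.

```python
def group_index(importance, group_size, strategy):
    """Return the quantization group id of each input column, in original column order.

    `importance` is a hypothetical per-column activation score (higher = more important).
    """
    n = len(importance)
    if strategy == "weight":
        # Groups stay contiguous in the original layout, so no per-column
        # lookup table is needed at inference time.
        return [col // group_size for col in range(n)]
    if strategy == "group":
        # Sort columns by descending importance, then form groups in that order.
        order = sorted(range(n), key=lambda col: -importance[col])
        g_idx = [0] * n
        for rank, col in enumerate(order):
            g_idx[col] = rank // group_size
        return g_idx
    raise ValueError(f"unknown strategy: {strategy!r}")


scores = [0.1, 0.9, 0.4, 0.8]
print(group_index(scores, 2, "group"))   # [1, 0, 1, 0]
print(group_index(scores, 2, "weight"))  # [0, 0, 1, 1]
```

Under these assumptions, the "group" result is non-contiguous, so the kernel has to consult a per-column group index when dequantizing, which is roughly where the extra latency would come from. In exchange, columns of similar importance share a scale, which tends to help accuracy.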

NM Testing org

For additional information on multimodal model compression using llm-compressor for vLLM, see the associated VLM support PR.

kylesayrs changed discussion status to closed
NM Testing org

NOTE: We typically suggest using WEIGHT for the strategy.
