Can't run vLLM with current setup, which version of xFormers is compatible?


Hi team,

I'm trying to run vLLM, but I'm hitting runtime errors related to xFormers.

My environment:

  • vLLM version: 0.10.2.dev2+gf5635d62e.d20250807
  • PyTorch version: 2.9.0.dev20250804+cu128
  • CUDA: 12.8
  • OS: Linux (Kaggle)
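
For reference, this is roughly how I'm checking the installed versions (the try/except around the xFormers import is just to confirm whether it's importable at all):

    import torch, vllm

    print("torch:", torch.__version__)
    print("vllm:", vllm.__version__)
    try:
        import xformers
        print("xformers:", xformers.__version__)
    except ImportError as e:
        print("xformers not importable:", e)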

Problem:

  • When I load the model, vLLM crashes with errors coming from the attention kernels.
  • It looks like xFormers is either missing or incompatible with my current PyTorch nightly.
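
A minimal repro of what I'm running (facebook/opt-125m is just a stand-in here; my actual model differs, but the crash looks the same):

    from vllm import LLM, SamplingParams

    # Loading the model is already enough to trigger the
    # attention-kernel errors on my setup
    llm = LLM(model="facebook/opt-125m")
    params = SamplingParams(max_tokens=16)
    out = llm.generate(["Hello, my name is"], params)
    print(out[0].outputs[0].text)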

Question:
Could you please share which version of xFormers is officially supported / tested with this vLLM + PyTorch combination?
Or is xFormers even required here? Should I disable it and rely on a different attention backend instead (see the sketch below)?
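
From skimming the vLLM docs, it looks like the backend can be forced via the VLLM_ATTENTION_BACKEND environment variable. Treat this as a guess on my part rather than something I've confirmed works; FLASH_ATTN is one of the documented values, but I don't know if it's the right choice here:

    import os

    # Set before importing vllm, since the attention backend
    # is chosen at engine startup
    os.environ["VLLM_ATTENTION_BACKEND"] = "FLASH_ATTN"

    from vllm import LLM
    llm = LLM(model="facebook/opt-125m")  # same stand-in model as above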

Any guidance (exact pip install command or commit hash of xFormers) would be really helpful.
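
In case it matters: since I'm on a PyTorch nightly, I assume the prebuilt xFormers wheels (which are built against specific stable torch releases) won't match, and that I'd need a source build along the lines of what the xFormers README describes:

    pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers

But building against a nightly feels fragile, which is why a known-good commit hash from your side would be ideal.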

Thanks!
