latency optimizations - adds AoT compilation & FA3 for faster inference

#4
by linoyts HF Staff - opened

Applies AoTI (AOTInductor, ahead-of-time) compilation and adds FA3 (FlashAttention-3) support for faster inference. Equivalent to the changes introduced in this PR: https://huggingface.co/spaces/Qwen/Qwen-Image-Edit/commit/7825c36e2569976239ab6384edbfb2a4fd6da9ea, with some adjustments for the new Plus pipeline.

linoyts changed pull request title from latency optimizations - adds AoT compilation & FA3 for faster inference [WIP - don't merge yet] to latency optimizations - adds AoT compilation & FA3 for faster inference
naykun changed pull request status to merged
