FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 61
torchao `Int8WeightOnlyConfig` is already working flawlessly in our tests:

```python
import spaces
from diffusers import FluxPipeline
from torchao.quantization.quant_api import Int8WeightOnlyConfig, quantize_

pipeline = FluxPipeline.from_pretrained(...).to('cuda')
# Quantize the transformer weights to int8 (or any other component(s))
quantize_(pipeline.transformer, Int8WeightOnlyConfig())

@spaces.GPU
def generate(prompt: str):
    return pipeline(prompt).images[0]
```
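For intuition, int8 weight-only quantization stores each weight row as int8 integers plus a per-row scale, and dequantizes back to floats when the weight is used. The sketch below is a minimal pure-Python illustration of that idea, not the torchao implementation:

```python
def quantize_int8(row):
    # Symmetric per-row quantization: the largest |w| maps to 127.
    scale = max(abs(w) for w in row) / 127 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in row]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from int8 values and the scale.
    return [v * scale for v in q]

weights = [0.5, -1.0, 0.25, 0.75]
q, s = quantize_int8(weights)
approx = dequantize(q, s)
# Per-element reconstruction error is bounded by scale / 2.
```

The weights shrink to a quarter of their float32 size, at the cost of a small, bounded rounding error per element.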
The `medium` size is now available as a power-user feature (70GB VRAM), but this paves the way for:
- `medium` offering significantly more usage than `large` (the current default)
- an `xlarge` size (141GB VRAM)
- an `auto` size (future default)
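Assuming the size is selected through a keyword argument on the `@spaces.GPU` decorator (the exact parameter name is my assumption here; check the Spaces documentation), usage might look like:

```python
import spaces

# Hypothetical sketch: explicitly requesting the `medium` (70GB VRAM) size.
# The `size` keyword name is an assumption based on the discussion above.
@spaces.GPU(size="medium")
def generate(prompt: str):
    ...
```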