SkyReels-V2-14B-540P FP8-E5M2 Quantized Models

This repository contains FP8-E5M2 quantized versions of the Skywork SkyReels-V2 14B 540P models, suitable for hardware and PyTorch builds that support this precision (e.g., NVIDIA RTX 3090 / 40-series GPUs with torch.compile) and for popular workflows such as those in ComfyUI.

These models were quantized by phazei.

Original Models

These quantized models are based on the following original FP32 models from Skywork:

  • Skywork/SkyReels-V2-DF-14B-540P
  • Skywork/SkyReels-V2-T2V-14B-540P

Please refer to the original model cards for details on their architecture, training, and intended use cases.

Quantization Details & Acknowledgements

The models were converted from their original FP32 sharded format to a mixed-precision format. The layers quantized to FP8-E5M2 are primarily the weight tensors within attention and FFN blocks; biases and normalization layers were kept in FP32. This layer selection was identified by analyzing the FP8 quantized models provided by Kijai in his repository Kijai/WanVideo_comfy.

This conversion process replicates the quantization pattern observed in Kijai's converted files to produce these FP8-E5M2 variants. Many thanks to Kijai for sharing his quantized models, which served as a clear reference for this work and continue to benefit the ComfyUI community.

The conversion was performed using PyTorch and safetensors. The scripts used for downloading the original models and performing this conversion are included in the scripts/ directory of this repository.
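
As an illustration of the selective-cast approach described above, the sketch below shows the general pattern. It is not the actual conversion code (which lives in scripts/convert_to_fp8e5m2.py); in particular, the layer-name filter here is a simplified assumption rather than the exact rule derived from Kijai's reference files.

```python
import torch
from safetensors.torch import load_file, save_file


def convert_to_fp8_e5m2(in_path: str, out_path: str) -> None:
    """Cast selected weight tensors to FP8-E5M2, keeping the rest in FP32.

    Simplified sketch only; the real script may use a different, more precise
    layer-selection rule.
    """
    state = load_file(in_path)
    converted = {}
    for name, tensor in state.items():
        # Assumed filter: 2D+ ".weight" tensors outside normalization layers
        # (i.e. attention / FFN weights) go to FP8; biases and norms stay FP32.
        is_weight_matrix = name.endswith(".weight") and tensor.dim() >= 2
        if is_weight_matrix and "norm" not in name:
            converted[name] = tensor.to(torch.float8_e5m2)
        else:
            converted[name] = tensor
    save_file(converted, out_path)
```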

Key characteristics of the quantized models:

  • Precision: Mixed (FP32, FP8-E5M2, U8 for metadata)
  • Target FP8 type: torch.float8_e5m2
  • Compatibility: Intended for use with PyTorch versions that support torch.float8_e5m2 and torch.compile, and for ComfyUI workflows that can load FP8 weights. A quick support check is shown below.
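
The following minimal check (not taken from the repository's scripts) verifies that the installed PyTorch build exposes the FP8-E5M2 dtype:

```python
import torch

# torch.float8_e5m2 is available in recent PyTorch releases (2.1+); these
# weights cannot be loaded in their native precision without it.
assert hasattr(torch, "float8_e5m2"), "This PyTorch build lacks float8_e5m2 support"

x = torch.randn(4, 4).to(torch.float8_e5m2)
print(x.dtype)  # torch.float8_e5m2
```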

Files in this Repository

  • SkyReels-V2-DF-14B-540P-fp8e5m2.safetensors: The quantized DF variant (single file).
  • SkyReels-V2-T2V-14B-540P-fp8e5m2.safetensors: The quantized T2V variant (single file).
  • scripts/: Contains Python scripts for downloading the original models and performing the quantization (a sketch of how to inspect the converted files follows this list).
    • model_download.py
    • convert_to_fp8e5m2.py
    • merge_fp8_shards.py
    • safetensors_info.py
  • README.md: This model card.
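
For example, a file's dtype breakdown can be inspected with the safetensors library. This is similar in spirit to scripts/safetensors_info.py, but is a standalone sketch whose output may differ from that script:

```python
from collections import Counter
from safetensors import safe_open

# Count tensors per dtype in one of the converted files.
path = "SkyReels-V2-T2V-14B-540P-fp8e5m2.safetensors"
counts = Counter()
with safe_open(path, framework="pt", device="cpu") as f:
    for name in f.keys():
        # Loads each tensor to read its dtype; fine for a one-off inspection.
        counts[str(f.get_tensor(name).dtype)] += 1

for dtype, n in counts.most_common():
    print(f"{dtype}: {n} tensors")
```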

Disclaimer

This is a community-contributed quantization. While efforts were made to maintain model quality by following an established quantization pattern, performance may differ from the original FP32 models or other quantized versions. Use at your own discretion.

Acknowledgements

  • Skywork AI for releasing the original SkyReels models.
  • Kijai for providing the quantized model versions that served as a reference for the quantization pattern applied in this repository.