This model is an int4 model with group_size 128 and symmetric quantization of [deepseek-ai/DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3), generated by the [intel/auto-round](https://github.com/intel/auto-round) algorithm.
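
For reference, a quantization recipe of this shape can be expressed with the auto-round Python API. This is a minimal sketch, not the exact script used to produce this model: the output directory name and loading options are assumptions, while `bits=4`, `group_size=128`, and `sym=True` mirror the settings stated above.

~~~python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "deepseek-ai/DeepSeek-V3"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# int4, group_size 128, symmetric quantization, as described above.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
autoround.quantize()

# "auto_gptq" emits a GPTQ-format checkpoint; the output path is illustrative.
autoround.save_quantized("./DeepSeek-V3-int4-sym-gptq-inc", format="auto_gptq")
~~~
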
**Loading the model in Transformers can be quite slow, especially on CUDA devices (30 minutes to 1 hour). Consider using an alternative serving framework, though note that some frameworks have overflow issues.** We have not tested other frameworks ourselves due to limited CUDA resources.

Please follow the license of the original model.
## How To Use
While we have added a workaround below to address this issue, we cannot guarantee reliable performance of the model.
~~~python
import transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Fixed version of transformers' weight-initialization check; see
# https://github.com/huggingface/transformers/pull/35493
def set_initialized_submodules(model, state_dict_keys):
    """
    Sets the `_is_hf_initialized` flag in all submodules of a given model when all its weights are in the loaded state
    dict.
    """
    state_dict_keys = set(state_dict_keys)
    not_initialized_submodules = {}
    for module_name, module in model.named_modules():
        if module_name == "":
            # When checking if the root module is loaded there's no need to prepend module_name.
            module_keys = set(module.state_dict())
        else:
            module_keys = {f"{module_name}.{k}" for k in module.state_dict()}
        if module_keys.issubset(state_dict_keys):
            module._is_hf_initialized = True
        else:
            not_initialized_submodules[module_name] = module
    return not_initialized_submodules

# Monkey-patch the stock implementation so that model loading uses the fixed version.
transformers.modeling_utils.set_initialized_submodules = set_initialized_submodules

quantized_model_dir = "OPEA/DeepSeek-V3-int4-sym-gptq-inc"
~~~
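
The usage snippet is truncated at this point in the card. As a minimal sketch of how it would typically continue with the standard Transformers API (the prompt and generation settings here are illustrative assumptions, not the card's original example), loading and inference would proceed along these lines; note that the patch above must be applied before `from_pretrained` is called:

~~~python
# Hypothetical continuation: load the quantized checkpoint and run a prompt.
model = AutoModelForCausalLM.from_pretrained(
    quantized_model_dir,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, trust_remote_code=True)

prompt = "There is a girl who likes adventure,"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
~~~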