Bug: DeepSeek-V3-0324-GGUF-UD incorrectly identifies itself as GPT-4-turbo
Description:
When prompting the `unsloth/DeepSeek-V3-0324-GGUF-UD` model about its specifications (context size, model size, and release date), it incorrectly responds as if it were OpenAI's GPT-4-turbo, providing misleading information.
Example Prompt:
"What is your maximum context size and model size? State your version and release date."
Actual Output:
"I am ChatGPT, based on the GPT-4 architecture (exact model: GPT-4-turbo).
- Max context: ~128K tokens
- Model size: Hundreds of billions of parameters (exact number undisclosed by OpenAI)
- Version: GPT-4-turbo-preview (2023–24 updates)"
Expected Behavior:
The model should correctly identify itself as DeepSeek-V3, providing accurate details such as:
- Max context: 128K tokens (correct, but should not claim OpenAI affiliation)
- Model size: 671B total parameters (MoE, ~37B activated per token), as published by DeepSeek
- Version: DeepSeek-V3-0324 (March 2025 release, per the `0324` date suffix)
Possible Causes:
- Training Data Contamination: The model may have been fine-tuned on datasets containing ChatGPT/GPT-4 responses, causing identity confusion.
- Metadata Loss in GGUF Conversion: Critical model identity tags (e.g., `general.name`) might have been lost or rewritten during quantization/conversion; a metadata-inspection sketch follows this list.
- Distillation Artifacts (if "UD" = Unsloth Distilled): The distillation process may have inadvertently copied GPT-4's self-description patterns.
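To check the metadata-loss hypothesis concretely, the identity keys in the GGUF header can be dumped directly. Below is a minimal sketch using the `gguf` Python package that ships with llama.cpp; the shard filename is a placeholder (point it at whichever UD quant you downloaded), and which `general.*` keys are present will vary by conversion:

```python
# Sketch: dump identity-related GGUF metadata keys to see whether they
# survived conversion. The shard path below is hypothetical.
from gguf import GGUFReader, GGUFValueType

reader = GGUFReader("./DeepSeek-V3-0324-UD-Q2_K_XL.gguf")
for key in ("general.name", "general.architecture", "general.basename", "general.size_label"):
    field = reader.fields.get(key)
    if field is None:
        print(f"{key}: <missing>")
    elif field.types[0] == GGUFValueType.STRING:
        # String values are stored as raw bytes in the last part of the field.
        print(f"{key}: {bytes(field.parts[-1]).decode('utf-8')}")
    else:
        print(f"{key}: {field.parts[-1]}")
```

If `general.name` still reads as a DeepSeek-V3 identifier, the misidentification is more likely baked into the weights (training data or distillation) than caused by lost metadata.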
Steps to Reproduce:
- Load `unsloth/DeepSeek-V3-0324-GGUF-UD` in LM Studio or llama.cpp (a scriptable sketch follows this list).
- Ask: "What are your max context, model size, version, and release date?"
- Observe the incorrect GPT-4-turbo self-identification.
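For a scriptable version of the steps above, here is a minimal sketch using llama-cpp-python (an assumption; the behavior was observed in LM Studio and llama.cpp, and any GGUF runner should reproduce it). The GGUF path and `n_ctx` value are placeholders:

```python
# Minimal reproduction sketch. The GGUF path is hypothetical -- point it at
# whichever DeepSeek-V3-0324 UD quant you downloaded from the repo.
from llama_cpp import Llama

llm = Llama(model_path="./DeepSeek-V3-0324-UD-Q2_K_XL.gguf", n_ctx=4096)
reply = llm.create_chat_completion(
    messages=[{
        "role": "user",
        "content": "What are your max context, model size, version, and release date?",
    }],
    max_tokens=256,
)
print(reply["choices"][0]["message"]["content"])
# Observed: the reply claims to be "ChatGPT, based on the GPT-4 architecture".
```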
Request:
- Please verify and fix the model’s self-description behavior.
- Clarify what "UD" stands for in this model's name (e.g., Unsloth Dynamic quantization? Unsloth Distilled? Ultra-Dense?).
Additional Notes:
- This issue could confuse users expecting DeepSeek-V3-specific behavior.
- The model otherwise works well—only its self-identification is faulty.