GGUF
English
qwen2
conversational

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Log in or Sign Up to review the conditions and access this model content.

Qwen2.5-3B-Instruct Fine-Tuned on Deepseek Research Papers

License: MIT

Model Overview

This model is a fine-tuned version of Qwen2.5-3B-Instruct, specifically adapted using research papers from Deepseek. It enhances the base model’s ability to understand and generate research-related text, making it more suitable for academic and technical use cases.

Model Details

  • Base Model: Qwen2.5-3B-Instruct
  • Fine-tuned Dataset: Research papers sourced from Deepseek
  • License: MIT
  • Language: English
  • Model Format: GGUF
  • Use Cases:
    • Academic question-answering
    • Scientific content generation
    • Research paper analysis

Training Methodology

The model was fine-tuned using a collection of research papers, leveraging supervised fine-tuning (SFT) and reinforcement learning techniques where applicable. The primary goal was to enhance the model's ability to process, summarize, and analyze technical content.

Quickstart

You can quickly get started with the model using llama_cpp:

from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="AkinduH/Qwen2.5-3B-Instruct-Fine-Tuned-on-Deepseek-Research-Papers",
    filename="unsloth.Q4_K_M.gguf",
)

llm.create_chat_completion(
    messages = "Summarize the key findings of a research paper on deep learning optimizers."
)

License

This model is released under the MIT License, allowing for flexible use and distribution.

Downloads last month
31
GGUF
Model size
7.62B params
Architecture
qwen2
Hardware compatibility
Log In to view the estimation

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for AkinduH/Qwen2.5-3B-Instruct-Fine-Tuned-on-Deepseek-Research-Papers

Base model

Qwen/Qwen2.5-3B
Quantized
(127)
this model

Dataset used to train AkinduH/Qwen2.5-3B-Instruct-Fine-Tuned-on-Deepseek-Research-Papers