lbourdois committed
Commit 91feb7d · verified · 1 Parent(s): 1f47223

Improve language tag

Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve referencing. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13 languages.

Files changed (1)

README.md +119 -107
README.md CHANGED
@@ -1,108 +1,120 @@
- ---
- license: creativeml-openrail-m
- datasets:
- - amphora/QwQ-LongCoT-130K
- language:
- - en
- base_model:
- - Qwen/Qwen2.5-3B-Instruct
- pipeline_tag: text-generation
- library_name: transformers
- tags:
- - text-generation-inference
- - long-CoT
- - safetensors
- - 3B
- - Instruct
- - QwQ
- - Qwen2.5
- ---
- ### **QwQ-LCoT-3B-Instruct Model Card**
-
- The **QwQ-LCoT-3B-Instruct** model is a lightweight, instruction-tuned language model designed for complex reasoning and explanation tasks. It is fine-tuned on the **Qwen2.5-3B-Instruct** base model using the **QwQ-LongCoT-130K** dataset, focusing on **long-chain-of-thought (LCoT)** reasoning for enhanced logical comprehension and detailed output generation.
-
- | **File Name** | **Size** | **Description** | **Upload Status** |
- |----------------------------------------|----------------|-------------------------------------------------|--------------------|
- | `.gitattributes` | 1.57 kB | Specifies LFS tracking for large files. | Uploaded |
- | `README.md` | 267 Bytes | Basic project information file. | Updated |
- | `added_tokens.json` | 657 Bytes | Custom tokens added to the tokenizer. | Uploaded |
- | `config.json` | 859 Bytes | Configuration file for the model. | Uploaded |
- | `generation_config.json` | 281 Bytes | Configuration file for text generation settings.| Uploaded |
- | `merges.txt` | 1.82 MB | Contains the byte-pair encoding (BPE) merges. | Uploaded |
- | `pytorch_model-00001-of-00002.bin` | 4.96 GB | First shard of the model weights in PyTorch format. | Uploaded (LFS) |
- | `pytorch_model-00002-of-00002.bin` | 1.21 GB | Second shard of the model weights in PyTorch format. | Uploaded (LFS) |
- | `pytorch_model.bin.index.json` | 36 kB | Index mapping for sharded model weights. | Uploaded |
- | `special_tokens_map.json` | 644 Bytes | Maps special tokens to their roles. | Uploaded |
- | `tokenizer.json` | 11.4 MB | Serialized tokenizer data. | Uploaded (LFS) |
- | `tokenizer_config.json` | 7.73 kB | Tokenizer configuration settings. | Uploaded |
- | `vocab.json` | 2.78 MB | Vocabulary file for the tokenizer. | Uploaded |
-
- ### **Sample Long CoT:**
-
- ![Screenshot 2024-12-13 211732.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Mgm9LmQZlFZmglKYwEDYA.png)
-
- ### **Key Features:**
-
- 1. **Long Chain-of-Thought Reasoning:**
- - Specifically designed to generate comprehensive, step-by-step explanations for complex queries.
-
- 2. **Lightweight and Efficient:**
- - With only 3 billion parameters, it is optimized for systems with limited computational resources without compromising reasoning capabilities.
-
- 3. **Instruction Optimization:**
- - Fine-tuned to follow prompts and provide concise, actionable, and structured responses.
-
- ---
-
- ### **Training Details:**
-
- - **Base Model:** [Qwen2.5-3B-Instruct](#)
- - **Dataset:** [amphora/QwQ-LongCoT-130K](#)
- - Comprising 133,000 annotated samples focusing on logical tasks and structured thinking.
- ---
-
- ### **Capabilities:**
-
- 1. **Text Generation:**
- - Provides detailed, structured, and logical text outputs tailored to user prompts.
-
- 2. **Reasoning Tasks:**
- - Solves step-by-step problems in math, logic, and science.
-
- 3. **Educational Assistance:**
- - Generates coherent explanations for academic and research purposes.
-
- 4. **Dialogue and Summarization:**
- - Handles conversational queries and summarizes long documents effectively.
-
- ---
-
- ### **Usage Instructions:**
-
- 1. **Setup:**
- Download all model files and ensure compatibility with the Hugging Face Transformers library.
-
- 2. **Loading the Model:**
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model_name = "prithivMLmods/QwQ-LCoT-3B-Instruct"
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForCausalLM.from_pretrained(model_name)
- ```
-
- 3. **Generate Long-Chain Reasoning Outputs:**
- ```python
- input_text = "Explain the process of photosynthesis step-by-step."
- inputs = tokenizer(input_text, return_tensors="pt")
- outputs = model.generate(**inputs, max_length=300, temperature=0.5)
- print(tokenizer.decode(outputs[0], skip_special_tokens=True))
- ```
-
- 4. **Customize Output Generation:**
- Modify the `generation_config.json` file for different scenarios:
- - **`temperature`**: Controls randomness (lower = deterministic, higher = creative).
- - **`max_length`**: Sets response length.
- - **`top_p`**: Adjusts sampling for diversity in outputs.
-
+ ---
+ license: creativeml-openrail-m
+ datasets:
+ - amphora/QwQ-LongCoT-130K
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ base_model:
+ - Qwen/Qwen2.5-3B-Instruct
+ pipeline_tag: text-generation
+ library_name: transformers
+ tags:
+ - text-generation-inference
+ - long-CoT
+ - safetensors
+ - 3B
+ - Instruct
+ - QwQ
+ - Qwen2.5
+ ---
+ ### **QwQ-LCoT-3B-Instruct Model Card**
+
+ The **QwQ-LCoT-3B-Instruct** model is a lightweight, instruction-tuned language model designed for complex reasoning and explanation tasks. It is fine-tuned on the **Qwen2.5-3B-Instruct** base model using the **QwQ-LongCoT-130K** dataset, focusing on **long-chain-of-thought (LCoT)** reasoning for enhanced logical comprehension and detailed output generation.
+
+ | **File Name** | **Size** | **Description** | **Upload Status** |
+ |----------------------------------------|----------------|-------------------------------------------------|--------------------|
+ | `.gitattributes` | 1.57 kB | Specifies LFS tracking for large files. | Uploaded |
+ | `README.md` | 267 Bytes | Basic project information file. | Updated |
+ | `added_tokens.json` | 657 Bytes | Custom tokens added to the tokenizer. | Uploaded |
+ | `config.json` | 859 Bytes | Configuration file for the model. | Uploaded |
+ | `generation_config.json` | 281 Bytes | Configuration file for text generation settings.| Uploaded |
+ | `merges.txt` | 1.82 MB | Contains the byte-pair encoding (BPE) merges. | Uploaded |
+ | `pytorch_model-00001-of-00002.bin` | 4.96 GB | First shard of the model weights in PyTorch format. | Uploaded (LFS) |
+ | `pytorch_model-00002-of-00002.bin` | 1.21 GB | Second shard of the model weights in PyTorch format. | Uploaded (LFS) |
+ | `pytorch_model.bin.index.json` | 36 kB | Index mapping for sharded model weights. | Uploaded |
+ | `special_tokens_map.json` | 644 Bytes | Maps special tokens to their roles. | Uploaded |
+ | `tokenizer.json` | 11.4 MB | Serialized tokenizer data. | Uploaded (LFS) |
+ | `tokenizer_config.json` | 7.73 kB | Tokenizer configuration settings. | Uploaded |
+ | `vocab.json` | 2.78 MB | Vocabulary file for the tokenizer. | Uploaded |
+
+ ### **Sample Long CoT:**
+
+ ![Screenshot 2024-12-13 211732.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Mgm9LmQZlFZmglKYwEDYA.png)
+
+ ### **Key Features:**
+
+ 1. **Long Chain-of-Thought Reasoning:**
+ - Specifically designed to generate comprehensive, step-by-step explanations for complex queries.
+
+ 2. **Lightweight and Efficient:**
+ - With only 3 billion parameters, it is optimized for systems with limited computational resources without compromising reasoning capabilities.
+
+ 3. **Instruction Optimization:**
+ - Fine-tuned to follow prompts and provide concise, actionable, and structured responses.
+
+ ---
+
+ ### **Training Details:**
+
+ - **Base Model:** [Qwen2.5-3B-Instruct](#)
+ - **Dataset:** [amphora/QwQ-LongCoT-130K](#)
+ - Comprising 133,000 annotated samples focusing on logical tasks and structured thinking.
+ ---
+
+ ### **Capabilities:**
+
+ 1. **Text Generation:**
+ - Provides detailed, structured, and logical text outputs tailored to user prompts.
+
+ 2. **Reasoning Tasks:**
+ - Solves step-by-step problems in math, logic, and science.
+
+ 3. **Educational Assistance:**
+ - Generates coherent explanations for academic and research purposes.
+
+ 4. **Dialogue and Summarization:**
+ - Handles conversational queries and summarizes long documents effectively.
+
+ ---
+
+ ### **Usage Instructions:**
+
+ 1. **Setup:**
+ Download all model files and ensure compatibility with the Hugging Face Transformers library.
+
+ 2. **Loading the Model:**
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "prithivMLmods/QwQ-LCoT-3B-Instruct"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(model_name)
+ ```
+
+ 3. **Generate Long-Chain Reasoning Outputs:**
+ ```python
+ input_text = "Explain the process of photosynthesis step-by-step."
+ inputs = tokenizer(input_text, return_tensors="pt")
+ outputs = model.generate(**inputs, max_length=300, temperature=0.5)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+
+ 4. **Customize Output Generation:**
+ Modify the `generation_config.json` file for different scenarios:
+ - **`temperature`**: Controls randomness (lower = deterministic, higher = creative).
+ - **`max_length`**: Sets response length.
+ - **`top_p`**: Adjusts sampling for diversity in outputs.
+
  ---
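
The README's step 4 names `temperature`, `max_length`, and `top_p` without showing what the file looks like; these correspond to standard `transformers` generation settings. The sketch below is a hedged illustration, not part of this repository: the parameter values and the local output directory are assumptions, chosen only to show how the same knobs can be set programmatically and written back out as a `generation_config.json`.

```python
from transformers import GenerationConfig

# Minimal sketch (illustrative values): the parameters described in the
# README's "Customize Output Generation" step, expressed as a
# GenerationConfig and saved back out as generation_config.json.
gen_config = GenerationConfig(
    do_sample=True,    # sampling must be enabled for temperature/top_p to take effect
    temperature=0.5,   # lower = more deterministic, higher = more creative
    top_p=0.9,         # nucleus-sampling threshold controlling output diversity
    max_length=300,    # overall cap on prompt + generated tokens
)
gen_config.save_pretrained("qwq-lcot-3b-local")  # hypothetical directory; writes generation_config.json
```

The same parameters can also be passed directly to `model.generate(...)` at call time, which overrides the values stored in `generation_config.json` for that call.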