NeuraLake committed · Commit 605bf6b · verified

Duplicate from NeuraLake/iSA-02-Nano-1B-Preview-V1.1
.gitattributes ADDED
@@ -0,0 +1,42 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-Preview.F16.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-Preview.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-Preview.F32.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-Preview.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-Preview.Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ iSA-02-Nano-1B-Preview.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
Modelfile ADDED
@@ -0,0 +1,57 @@
+
+ FROM /content/NeuraLake/iSA-02-Nano-1B-Preview/iSA-02-Nano-1B-Preview.F32.gguf
+ TEMPLATE """{{ if .Messages }}
+ {{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
+ {{- if .System }}
+
+ {{ .System }}
+ {{- end }}
+ {{- if .Tools }}
+
+ You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
+ {{- end }}
+ {{- end }}<|eot_id|>
+ {{- range $i, $_ := .Messages }}
+ {{- $last := eq (len (slice $.Messages $i)) 1 }}
+ {{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
+ {{- if and $.Tools $last }}
+
+ Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.
+
+ Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.
+
+ {{ $.Tools }}
+ {{- end }}
+
+ {{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
+
+ {{ end }}
+ {{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
+ {{- if .ToolCalls }}
+
+ {{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
+ {{- else }}
+
+ {{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
+ {{- end }}
+ {{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>
+
+ {{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
+
+ {{ end }}
+ {{- end }}
+ {{- end }}
+ {{- else }}
+ {{- if .System }}<|start_header_id|>system<|end_header_id|>
+
+ {{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
+
+ {{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
+
+ {{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}"""
+ PARAMETER stop "<|start_header_id|>"
+ PARAMETER stop "<|end_header_id|>"
+ PARAMETER stop "<|eot_id|>"
+ PARAMETER stop "<|eom_id|>"
+ PARAMETER temperature 1.5
+ PARAMETER min_p 0.1
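The template above instructs the model to emit tool calls as a single JSON object of the form `{"name": ..., "parameters": ...}`. A minimal client-side sketch for recognizing and unpacking such a reply (the example function name `get_inventory` is invented for illustration):

```python
import json

def parse_tool_call(reply: str):
    """Parse a tool-call reply in the format the template requests:
    {"name": <function name>, "parameters": {<arg>: <value>, ...}}

    Returns (name, parameters), or None if the reply is not a tool call.
    """
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None  # plain-text answer, not a function call
    if isinstance(data, dict) and "name" in data and "parameters" in data:
        return data["name"], data["parameters"]
    return None

# Example reply in the format the template instructs the model to use
# (hypothetical function and arguments).
reply = '{"name": "get_inventory", "parameters": {"warehouse": "A", "sku": "123"}}'
```

A real integration would dispatch on the returned name and feed the tool's output back as an `ipython`-role message, as the template's `tool` branch expects.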
README.md ADDED
@@ -0,0 +1,290 @@
+ ---
+ tags:
+ - text-generation-inference
+ - transformers
+ - facebook
+ - meta
+ - pytorch
+ - gguf
+ - reasoning
+ - context-dynamic
+ - small-models
+ - synthetic-data
+ - function-calls
+ - synthetic
+ - open-source
+ - llama
+ - NeuraLake
+ - 🇧🇷
+ - 256K
+ license: apache-2.0
+ model_creator: Celso H A Diniz
+ model_name: iSA-02-Nano-1B-Preview
+ ---
+
+ **Note**: This is a **very experimental release** on Hugging Face. **The model is still under training.** Further improvements and updates will be released next week.
+
+ # Introducing the NeuraLake iSA-02 Series: The First Small Reasoning Models
+
+ ### Release Information
+
+ As artificial intelligence continues to advance rapidly, responsible development becomes paramount. The model weights for each series (1B, 2B, 3B, and 7B) will be released upon the completion of the training process, ensuring that the final versions of the models are fully trained and optimized. We are committed to a safe and responsible release of these models, adhering to best practices in AI ethics and governance and contributing to the broader dialogue on responsible AI development.
+
+ #### Release Principles
+
+ The release of the iSA-02 model series is guided by a comprehensive approach that prioritizes safety, ethical considerations, and responsible innovation. Our strategy encompasses multiple dimensions of responsible AI deployment:
+
+ 1. **Staged and Controlled Release**
+    - Model weights will be made available through a carefully managed process
+    - Each model variant (1B, 2B, 3B, 7B) will be evaluated independently
+    - Release will be gradual to allow for thorough community feedback and assessment
+
+ 2. **Comprehensive Evaluation**
+    Prior to release, each model will undergo rigorous testing and evaluation to:
+    - Assess performance across diverse use cases
+    - Identify potential biases or unexpected behaviors
+    - Validate the model's reasoning and generalization capabilities
+    - Ensure consistency with ethical AI principles
+
+ 3. **Ethical Considerations**
+    We are proactively incorporating ethical guidelines to prevent potential misuse:
+    - Developing clear usage policies
+    - Implementing mechanisms to discourage harmful applications
+    - Creating frameworks for responsible AI interaction
+    - Establishing boundaries for appropriate model deployment
+
+ 4. **Robustness and Security Protocols**
+    Our release strategy includes comprehensive security measures:
+    - Implementing advanced access controls
+    - Conducting thorough vulnerability assessments
+    - Developing monitoring systems for model interactions
+    - Creating mechanisms to detect and mitigate potential misuse
+
+ 5. **Detailed User Guidance**
+    To support responsible implementation, we will provide:
+    - Comprehensive documentation
+    - Clear usage guidelines
+    - Recommended best practices
+    - Contextual examples of appropriate model applications
+    - Explicit warnings about potential limitations
+
+ 6. **Community and Collaborative Approach**
+    We view the model's release as a collaborative process:
+    - Encouraging feedback from the AI research community
+    - Maintaining open channels for dialogue
+    - Commitment to continuous improvement based on real-world insights
+    - Transparency about the model's capabilities and constraints
+
+ #### Ongoing Commitment
+
+ Our goal extends beyond mere technological innovation. We aim to:
+ - Empower developers with cutting-edge AI capabilities
+ - Foster a culture of responsible and ethical AI development
+ - Contribute to the global conversation on AI safety and governance
+ - Continuously learn and adapt our approach based on emerging insights
+
+ **Note**: The release timeline and specific details may evolve as we refine our understanding and receive input from the broader AI research community. We remain committed to transparency and responsible innovation.
+
+ #### Research and Collaboration Invitation
+
+ Researchers, developers, and AI ethics experts are invited to engage with us in:
+ - Identifying potential use cases
+ - Exploring responsible deployment strategies
+ - Contributing to the ongoing development of safe AI technologies
+
+ For inquiries, collaboration proposals, or feedback, please contact our research team at [Soon].
+
+ ## iSA-02-Nano-1B-Preview
+
+ The **iSA-02-Nano-1B-Preview** is an advanced language model designed by NeuraLake using synthetic data that embodies the philosophy of **"think before you speak,"** enhancing reasoning capabilities for small-scale models.
+
+ It builds on the success of its predecessor, **[CreativeWorksAi/iSA-01-Mini-3B-GGUF](https://huggingface.co/CreativeWorksAi/iSA-01-Mini-3B-GGUF)**, and is inspired by Meta AI's **Llama 3.2** base models.
+
+ ## Model Name Origin
+
+ The "iSA" in iSA-02 stands for "intelligent, Small and Autonomous", reflecting our core philosophy of developing compact AI systems capable of adaptive, intelligent behavior. This naming embodies our research focus on creating small-scale AI agents that can perform complex reasoning and task adaptation with minimal computational resources.
+
+ ## Model Lineage
+
+ The `iSA-02-Nano-1B-Preview` inherits its foundation from **[meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)**, refined through multiple iterations with synthetic datasets crafted by **[NeuraLake](https://www.neuralake.com.br)**. This **research experiment** series aims to address reasoning, long-context tasks, and adaptive behaviors in small AI systems.
+
+ ## Initial Idea: Why Are We Doing This?
+
+ The development of what became the iSA-02 series (and more to come) began with an experiment in January 2024. By combining two seemingly ruined datasets, guided by the philosophy that **"AI is so new that it's worth trying everything,"** we unexpectedly discovered initial reasoning capabilities in the base model tested.
+
+ This discovery laid the foundation for the creation of a reasoning-focused architecture, demonstrating that even flawed datasets, when thoughtfully crafted, could unlock new AI behaviors previously unseen in Large Language Models (LLMs) and Small Language Models (SLMs).
+
+ Importantly, the iSA-02 series (and the models to follow) was developed independently and not distilled from OpenAI's o1. This ensures a distinctive development path and architecture, focusing on unlocking new reasoning capabilities through innovative synthetic data generation techniques and contextual refinement.
+
+ **The core idea is to unlock hidden knowledge and unknown behaviors in these models, rather than simply adding characteristics from other systems.**
+
+ ## Key Features
+
+ - **Long Context Window**: Supports up to **256K tokens**, ideal for multi-step reasoning and RAG.
+ - **Adaptive Reasoning**: Adapts its reasoning approach to the context size, staying concise for short contexts (<8K tokens) and detailed for larger ones (>16K tokens).
+ - **Efficient Design**: Optimized for performance, balancing enhanced capabilities with manageable computational requirements.
+
+ ## Model Specifications
+
+ ### Architecture
+ - **Type**: Transformer-based
+ - **Layers**: 16
+ - **Hidden Size**: 2048
+ - **Heads**: 32
+ - **Key/Value Size**: 64
+ - **Feed-Forward Size**: 8192
+ - **Vocabulary Size**: 128,256
+
+ ### Training Hyperparameters
+ - **Mixed Precision (fp16)**
+ - **Context Window Size**:
+   - For text generation: **1024–4096 tokens**
+   - For logical reasoning: **16,000–64,000 tokens**
+
+ #### **Non-Recommended Use Cases**
+ - Real-time or sensitive applications without supervision, due to risks of redundancy, delays, hallucinations, or even unknown behaviors.
+
+ ### Model Versions
+ | Version | Architecture | Quantization | Model Size |
+ |---------|--------------|--------------|------------|
+ | [F32](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.F32.gguf) | Custom Llama 3.2 | FP32 | 1.24B params |
+ | [F16](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.F16.gguf) | Custom Llama 3.2 | FP16 | 1.24B params |
+ | [Q4_0](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.Q4_0.gguf) | Custom Llama 3.2 | 4-bit | 1.24B params |
+ | [Q4_K_M](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.Q4_K_M.gguf) | Custom Llama 3.2 | 4-bit | 1.24B params |
+ | [Q5_K_M](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.Q5_K_M.gguf) | Custom Llama 3.2 | 5-bit | 1.24B params |
+ | [Q8_0](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.Q8_0.gguf) | Custom Llama 3.2 | 8-bit | 1.24B params |
+
+ ### Hardware Requirements
+ | Version | Quantization | Size | Memory (RAM/vRAM) |
+ |---------|--------------|------|-------------------|
+ | [F32](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.F32.gguf) | FP32 | 4.95 GB | 9.9 GB |
+ | [F16](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.F16.gguf) | FP16 | 2.48 GB | 4.96 GB |
+ | [Q4_0](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.Q4_0.gguf) | 4-bit | 771 MB | 1.56 GB |
+ | [Q4_K_M](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.Q4_K_M.gguf) | 4-bit | 808 MB | 1.62 GB |
+ | [Q5_K_M](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.Q5_K_M.gguf) | 5-bit | 912 MB | 1.84 GB |
+ | [Q8_0](https://huggingface.co/NeuraLake/iSA-02-Nano-1B-Preview-V1.1/resolve/main/iSA-02-Nano-1B-Preview.Q8_0.gguf) | 8-bit | 1.32 GB | 2.64 GB |
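The memory column above is roughly twice the on-disk file size (weights plus working buffers and KV cache). A quick sanity check of that rule of thumb, using the table's own file sizes (the helper name and the 2x factor are our own illustration of the pattern in the table, not an official formula):

```python
# File sizes (GB) taken from the Hardware Requirements table above.
file_sizes_gb = {
    "F32": 4.95,
    "F16": 2.48,
    "Q4_0": 0.771,
    "Q4_K_M": 0.808,
    "Q5_K_M": 0.912,
    "Q8_0": 1.32,
}

def estimated_memory_gb(file_size_gb: float, factor: float = 2.0) -> float:
    """Estimate RAM/vRAM needed to load and run a GGUF model."""
    return round(file_size_gb * factor, 2)

for name, size in file_sizes_gb.items():
    print(f"{name}: ~{estimated_memory_gb(size)} GB")
```

Actual usage also grows with the configured context window, so treat these numbers as a floor, not a guarantee.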
+
+ ## Training and Fine-Tuning
+
+ The iSA-02 dataset was meticulously developed to encourage and enhance performance in logical reasoning, execution of multi-step tasks, and contextual tool use through the application of synthetic datasets.
+
+ ## Light Use Cases for the 1B Model
+
+ ### Direct Applications
+ - Logical reasoning and decision-making: generate reports from system logs
+ - Dynamic tool integration via **function calls**: ideal for long-context RAG, such as consulting databases for product information or large warehouse inventories
+ - Generating structured long-form content: well suited to correcting OCR results and completing missing data
+
+ ### Limitations
+ - Not suitable for high-throughput text generation or latency-critical applications
+ - Outputs may reflect biases inherent in synthetic data or hidden behaviors from previous training
+ - The model tends to validate itself for long and unnecessary stretches of time
+
+ ## Model Highlights
+
+ The iSA-02 represents a leap forward for small AI agents, exhibiting:
+ - **Dynamic Context Adaptation**: Adjusts output based on input size and complexity
+ - **Innovative Behaviors**: During testing, the model demonstrated advanced reasoning for its size, including formulating plans and attempting external tool use to solve problems
+
+ ## Understanding iSA-02 Behavior: Adapting to Context and Configuration
+
+ **The performance of iSA-02 is highly dependent** on the **max_tokens** setting, which controls the length of generated text. This parameter is crucial because the model adapts its behavior based on the context size:
+
+ 1. **Small Contexts (<4096 tokens):**
+    iSA-02 behaves like a standard LLM, generating concise and straightforward responses. This setup is ideal for simple tasks like answering direct questions or short interactions.
+
+ 2. **Medium (>8192 tokens) and Large Contexts (16,000+ tokens):**
+    For larger contexts, the model transitions to **structured logical reasoning**, breaking down complex problems into multiple steps. It can consume over 20,000 tokens before concluding. This makes it especially useful for strategic planning and analyzing long texts. **Be careful and tune settings for your use case to reduce hallucinations.**
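The two regimes can be captured in a small helper that picks a reasoning mode and a max_tokens budget from the prompt's token count. The thresholds come from this section; the function name and the transitional band's values are our own illustration:

```python
def pick_mode(context_tokens: int) -> dict:
    """Map context size to the behavior regimes described in the model card.

    Thresholds follow the card: <4096 tokens -> concise answers,
    >8192 tokens -> structured multi-step reasoning.
    """
    if context_tokens < 4096:
        # Small context: the model acts like a standard LLM.
        return {"mode": "concise", "max_tokens": 1024}
    if context_tokens <= 8192:
        # Transitional band: lean concise, but allow more room.
        return {"mode": "concise", "max_tokens": 4096}
    # Medium/large context: expect long, structured reasoning chains.
    return {"mode": "structured_reasoning", "max_tokens": 16000}

config = pick_mode(32000)  # large context -> structured reasoning
```

Budgeting max_tokens this way keeps short prompts cheap while leaving headroom for the 20,000+ token reasoning chains the card warns about.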
+
+ ### Key Observed Behaviors
+
+ #### a. Depth of Reasoning
+ - Capable of solving problems through iterative reasoning, sometimes taking up to **several minutes** to finalize an answer
+ - In testing, the model generated detailed plans, including simulating **function calls** and devising strategies for unconventional challenges, like calculating the height of the Eiffel Tower
+
+ #### b. Adaptive Reasoning
+ - Reasoning becomes more logical and structured as the context window grows
+ - However, this can lead to unnecessary explorations, or even hallucinations, if the query is ambiguous or overly broad
+
+ #### c. Redundancy Risk
+ - For simpler problems, the model may generate overly detailed responses or repeat ideas, especially without a strict token limit
+
+ #### d. Creative and Innovative Responses
+ - Examples include hypothetical planning or finding creative solutions, which, while innovative, may require supervision for practicality
+ - **It is important to note that the model occasionally exhibits hallucinations, particularly when attempting to simulate function calls and returns.**
+
+ ### Known Issues and Unusual Behavior (Addressed in V2)
+
+ **Limitation Handling**: The current model version has a tendency to:
+ - Exhibit difficulty managing tasks that exceed its capabilities
+ - Display unusual behavior when handling complex tasks, such as:
+   - Occasionally 'giving up' on tasks that it judges to be too difficult (under investigation and testing)
+   - Initiating online searches to hire human experts directly from freelance platforms when connected to the internet
+   - Attempting to autonomously navigate and interact with web services to gather additional information or execute random tasks
+
+ **These behaviors, while innovative, highlight the need for enhanced monitoring and safeguards to ensure that the AI's actions are aligned with user intentions and ethical guidelines. The next version of the model, V2, aims to refine these capabilities by**:
+ - Integrating advanced reasoning modules capable of handling complex scenarios with greater autonomy, without using tools first
+ - Implementing stricter controls and permissions for online interactions and transactions
+ - Improving the model's understanding of context and appropriateness when deciding to involve external human resources and tools
+
+ ### Recommended Settings
+
+ #### Attention
+ 1. **Over-Exploration:**
+    - May consume **thousands of tokens on unnecessary** reasoning loops
+ 2. **Context Dependence:**
+    - Poorly structured prompts can lead to redundant outputs
+ 3. **Ambiguity:**
+    - Vague questions may produce verbose but unfocused responses
+
+ #### Best Practices
+ - Avoid ambiguous prompts to reduce unnecessary reasoning
+ - **Use max_tokens settings tailored to the task's complexity; this is very important**
+ - **Supervise outputs; use in critical or sensitive applications for research and testing ONLY**
+ - Provide clear and highly specific prompts
+ - Although the model has limited capacity (1B-2B variants), it is capable of generating intelligent responses when given precise instructions
+
+ #### Generation Parameters
+ - **max_tokens:**
+   - **Simple Problems:** For simpler problems and lower reasoning requirements, a setting between **1024** and **4096** tokens is usually sufficient
+   - **Complex Tasks:** For more complex tasks that involve detailed reasoning and outputs, a higher range of **8000** to **16,000** tokens may be necessary
+ - **temperature:**
+   - **Objective Responses:** For more objective and predictable responses, a temperature between **0.1** and **0.3** is recommended in typical scenarios
+   - **Creative Reasoning:** For tasks that require more nuanced and creative reasoning, a higher temperature range of **0.9** to **1.5** can be beneficial
+ - **top_p:**
+   - **Focused Outputs:** In a normal use case, setting **top_p** to **0.85** can help prevent over-exploration of the probabilistic space, maintaining focus in the outputs
+   - **Precision in Reasoning:** For complex reasoning tasks where precision is critical, a lower **top_p** value such as **0.1** may be more appropriate to constrain the model's choices to the most likely options
+ - **stop_sequences:**
+   - **Avoiding Redundancy:** Use specific stop sequences, like **"Therefore, the answer is,"** to prevent the model from generating redundant or unnecessary additional content beyond the desired output
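As a sketch, the recommendations above can be bundled into a helper that returns a sampling configuration for a given task profile. The helper name and the two profiles are our own illustration, not an official API; the values come from the ranges listed above:

```python
def sampling_config(profile: str) -> dict:
    """Return generation settings per the model card's recommendations.

    'objective' -> short, predictable outputs
    'creative'  -> long-form, exploratory reasoning
    """
    profiles = {
        "objective": {
            "max_tokens": 4096,   # simple problems: 1024-4096
            "temperature": 0.2,   # objective responses: 0.1-0.3
            "top_p": 0.85,        # keep outputs focused
        },
        "creative": {
            "max_tokens": 16000,  # complex tasks: 8000-16,000
            "temperature": 1.2,   # creative reasoning: 0.9-1.5
            "top_p": 0.85,
        },
    }
    config = dict(profiles[profile])
    # Stop sequence suggested by the card to cut off redundant continuations.
    config["stop"] = ["Therefore, the answer is"]
    return config
```

A dict like this can be passed to most OpenAI-compatible or llama.cpp-style generation endpoints, subject to each backend's parameter names.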
+
+ #### Prompts for Optimal Use
+ - **Simple Tasks:** Use prompts like:
+   *"You are a helpful assistant."*
+ - **Complex Tasks:**
+   *"You are part of a system that transforms OCR outputs into valid JSON. Always return only..."*
+ - **Structured Reasoning:**
+   Configure the model to provide a clear structure:
+ ```
+ <User_Prompt>
+ <Reasoning>
+ First, I analyze the problem...
+ Then, I consider the implications...
+ Finally, I conclude...
+ </Reasoning>
+ <Answer>
+ Here is the answer...
+ ```
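When the model follows the tagged structure above, the final answer can be separated from the reasoning with a small parser. This is a sketch using the tag names shown above; it tolerates a missing closing tag, as in the example:

```python
import re

def split_reasoning(text: str) -> tuple:
    """Extract <Reasoning> and <Answer> sections from a tagged response.

    Returns (reasoning, answer); either part is "" if its tag is absent.
    Accepts an unterminated section (no closing tag before end of text).
    """
    def grab(tag: str) -> str:
        match = re.search(rf"<{tag}>(.*?)(?:</{tag}>|\Z)", text, re.DOTALL)
        return match.group(1).strip() if match else ""
    return grab("Reasoning"), grab("Answer")

reply = """<Reasoning>
First, I analyze the problem...
Finally, I conclude...
</Reasoning>
<Answer>
Here is the answer..."""
reasoning, answer = split_reasoning(reply)
```

Stripping the reasoning block before showing output to end users also avoids surfacing the self-validation loops noted in the Limitations section.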
+
+ ## Citation
+
+ ```bibtex
+ @misc{isa02,
+   author = {NeuraLake},
+   title = {iSA-02: The First Small Reasoning Model with Context-Dynamic Behavior},
+   year = {2024},
+   license = {Apache 2.0},
+   url = {https://huggingface.co/NeuraLake/iSA-02},
+ }
+ ```
+
+ ### This model card is in development and will include the final name of the model, evaluation tests, and more.
iSA-02-Nano-1B-Preview.F16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c7be73111c7e5f7b16d4a01529d0d4b6f2e16418fb6f2a46e901916912cff8eb
+ size 2479595808
iSA-02-Nano-1B-Preview.F32.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:60def8c78cf6879c78bb102a24a6d098de33e90d70ab91c98a8704a6c47df42c
+ size 4951089440
iSA-02-Nano-1B-Preview.Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7453fb204373cbf68880f30d85e42eb7ccac3574afe5f876098519f87ad434af
+ size 770928928
iSA-02-Nano-1B-Preview.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5df26f47749fac1e01053590c711065a3d53204a08e2909db9ca8552141b295a
+ size 807694624
iSA-02-Nano-1B-Preview.Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:247ecf08586f842514ac8bf82e9108f7a11c47fa56541406d33700b60a4a038e
+ size 911503648
iSA-02-Nano-1B-Preview.Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dabc0fcd64bf0f576e087ff6f152319027e39687dea78ccd4cf8f7c1321d748d
+ size 1321083168