Note: This is a highly experimental release on Hugging Face. The model is still being trained; further improvements and updates will be released next week.

Introducing the NeuraLake iSA-02 Series: The First Small Reasoning Models

Release Information

As artificial intelligence continues to advance rapidly, responsible development becomes paramount. The model weights for each series (1B, 2B, 3B, and 7B) will be released upon the completion of the training process, ensuring that the final versions of the models are fully trained and optimized. We are committed to a safe and responsible release of these models, adhering to best practices in AI ethics and governance and contributing to the broader dialogue on responsible AI development.

Release Principles

The release of the iSA-02 model series is guided by a comprehensive approach that prioritizes safety, ethical considerations, and responsible innovation. Our strategy encompasses multiple dimensions of responsible AI deployment:

  1. Staged and Controlled Release

    • Model weights will be made available through a carefully managed process
    • Each model variant (1B, 2B, 3B, 7B) will be evaluated independently
    • Release will be gradual to allow for thorough community feedback and assessment
  2. Comprehensive Evaluation

    Prior to release, each model will undergo rigorous testing and evaluation to:

    • Assess performance across diverse use cases
    • Identify potential biases or unexpected behaviors
    • Validate the model's reasoning and generalization capabilities
    • Ensure consistency with ethical AI principles
  3. Ethical Considerations

    We are proactively incorporating ethical guidelines to prevent potential misuse:

    • Developing clear usage policies
    • Implementing mechanisms to discourage harmful applications
    • Creating frameworks for responsible AI interaction
    • Establishing boundaries for appropriate model deployment
  4. Robustness and Security Protocols

    Our release strategy includes comprehensive security measures:

    • Implementing advanced access controls
    • Conducting thorough vulnerability assessments
    • Developing monitoring systems for model interactions
    • Creating mechanisms to detect and mitigate potential misuse
  5. Detailed User Guidance

    To support responsible implementation, we will provide:

    • Comprehensive documentation
    • Clear usage guidelines
    • Recommended best practices
    • Contextual examples of appropriate model applications
    • Explicit warnings about potential limitations
  6. Community and Collaborative Approach

    We view the model's release as a collaborative process:

    • Encouraging feedback from the AI research community
    • Maintaining open channels for dialogue
    • Committing to continuous improvement based on real-world insights
    • Maintaining transparency about the model's capabilities and constraints

Ongoing Commitment

Our goal extends beyond mere technological innovation. We aim to:

  • Empower developers with cutting-edge AI capabilities
  • Foster a culture of responsible and ethical AI development
  • Contribute to the global conversation on AI safety and governance
  • Continuously learn and adapt our approach based on emerging insights

Note: The release timeline and specific details may evolve as we refine our understanding and receive input from the broader AI research community. We remain committed to transparency and responsible innovation.

Research and Collaboration Invitation

Researchers, developers, and AI ethics experts are invited to engage with us in:

  • Identifying potential use cases
  • Exploring responsible deployment strategies
  • Contributing to the ongoing development of safe AI technologies

For inquiries, collaboration proposals, or feedback, please contact our research team at [Soon].

iSA-02-Nano-1B-Preview

The iSA-02-Nano-1B-Preview is an advanced language model designed by NeuraLake and trained on synthetic data. It embodies the philosophy of "think before you speak," enhancing reasoning capabilities in small-scale models.

It builds on the success of its predecessor, CreativeWorksAi/iSA-01-Mini-3B-GGUF, and is inspired by Meta AI's Llama 3.2 base models.

Model Name Origin

The "iSA" in iSA-02 stands for "intelligent, Small and Autonomous" - reflecting our core philosophy of developing compact AI systems capable of adaptive, intelligent behavior. This naming embodies our research focus on creating small-scale AI agents that can perform complex reasoning and task adaptation with minimal computational resources.

Model Lineage

The iSA-02-Nano-1B-Preview inherits its foundation from meta-llama/Llama-3.2-1B-Instruct, refined through multiple iterations with synthetic datasets crafted by NeuraLake. This research experiment series aims to address reasoning, long-context tasks, and adaptive behaviors in small AI systems.

Initial Idea: Why Are We Doing This?

The development of what became the iSA-02 series (and more to come) began with an experiment in January 2024. Guided by the philosophy that 'AI is so new that it's worth trying everything', we combined two seemingly broken, ruined datasets and unexpectedly discovered initial reasoning capabilities in the base model under test.

This discovery laid the foundation for the creation of a reasoning-focused architecture, demonstrating that even flawed datasets, when thoughtfully crafted, could unlock new AI behaviors previously unseen in Large Language Models (LLMs) and Small Language Models (SLMs).

Importantly, the iSA-02 series (and the new models to come) was developed independently and was not distilled from OpenAI's o1. This ensures a distinctive development path and architecture, focused on unlocking new reasoning capabilities through innovative synthetic data generation techniques and contextual refinement.

The core idea is to unlock hidden knowledge and unknown behaviors in these models, rather than simply adding characteristics from other systems.

Key Features

  • Long Context Window: Supports up to 256K tokens, ideal for multi-step reasoning and RAG.
  • Adaptive Reasoning: Adapts its reasoning approach to the context size: concise for short contexts (<8K tokens), detailed for larger ones (>16K tokens).
  • Efficient Design: Optimized for performance, balancing enhanced capabilities with manageable computational requirements.

Model Specifications

Architecture

  • Type: Transformer-based
  • Layers: 16
  • Hidden Size: 2048
  • Heads: 32
  • Key/Value Size: 64
  • Feed-Forward Size: 8192
  • Vocabulary Size: 128,256
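
For reference, here is a minimal sketch of how these dimensions map onto a Hugging Face transformers LlamaConfig (the field names are standard LlamaConfig parameters; treating them as the exact training configuration is an assumption):

    from transformers import LlamaConfig

    # Approximate iSA-02-Nano-1B architecture expressed as a LlamaConfig.
    # Values come from the table above; everything else is left at defaults.
    config = LlamaConfig(
        num_hidden_layers=16,    # Layers
        hidden_size=2048,        # Hidden Size
        num_attention_heads=32,  # Heads (2048 / 32 = 64-dim keys/values)
        intermediate_size=8192,  # Feed-Forward Size
        vocab_size=128256,       # Vocabulary Size
    )
    print(config)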

Training Hyperparameters

  • Mixed Precision (fp16)
  • Context Window Size:
    • For text generation: 1024–4096 tokens
    • For logical reasoning: 16,000–64,000 tokens

Non-Recommended Use Cases

  • Real-time or sensitive applications without supervision, due to risks of redundancy, delays, hallucinations, or even unknown behaviors.

Quantized Versions

Version   Architecture       Quantization   Model Size
F32       Custom Llama 3.2   FP32           1.24B params
F16       Custom Llama 3.2   FP16           1.24B params
Q4_0      Custom Llama 3.2   4-bit          1.24B params
Q4_K_M    Custom Llama 3.2   4-bit          1.24B params
Q5_K_M    Custom Llama 3.2   5-bit          1.24B params
Q8_0      Custom Llama 3.2   8-bit          1.24B params
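
As a sketch, one way to fetch a quantized build with the huggingface_hub client; the GGUF filename below is an assumption, so check the repository's file list for the real names:

    from huggingface_hub import hf_hub_download

    # Download one of the quantized GGUF builds from the model repository.
    # NOTE: the filename is hypothetical; list the repo files for the real one.
    path = hf_hub_download(
        repo_id="NeuraLake/iSA-02",
        filename="iSA-02-Nano-1B-Preview-Q4_K_M.gguf",  # assumed filename
    )
    print(path)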

Hardware Requirements

Version   Quantization   Size      Memory (RAM/VRAM)
F32       FP32           4.95 GB   9.9 GB
F16       FP16           2.48 GB   4.96 GB
Q4_0      4-bit          771 MB    1.56 GB
Q4_K_M    4-bit          808 MB    1.62 GB
Q5_K_M    5-bit          912 MB    1.84 GB
Q8_0      8-bit          1.32 GB   2.64 GB
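
The sizes above follow from bits-per-weight arithmetic (roughly parameters times bytes per parameter, plus quantization metadata), and the memory column is about twice the file size, leaving room for the KV cache and runtime buffers; that 2x rule matches the table but is a heuristic, not an official requirement. A quick back-of-the-envelope check:

    # Rough file-size floor: parameters * bytes per parameter. Quantized GGUF
    # files carry extra scale/zero-point metadata, so real files (e.g. Q4_K_M
    # at 808 MB) come out somewhat above this estimate.
    params = 1.24e9

    for name, bits in [("F32", 32), ("F16", 16), ("Q8_0", 8), ("Q4_0", 4)]:
        size_gb = params * bits / 8 / 1e9
        print(f"{name}: ~{size_gb:.2f} GB file, ~{2 * size_gb:.2f} GB memory")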

Training and Fine-Tuning

The iSA-02 synthetic dataset was meticulously developed to encourage and enhance performance in logical reasoning, multi-step task execution, and contextual tool use.

Light Use Cases for the 1B Model:

Direct Applications

  • Logical reasoning and decision-making: generating reports from system logs
  • Dynamic tool integration via function calls: ideal for long-context RAG, such as querying databases for product information or large warehouse inventories (see the sketch below)
  • Generating structured long-form content: well suited to correcting OCR results and completing missing data
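
As a minimal sketch of the long-context RAG pattern mentioned above (the prompt wording is an assumption, and retrieval is stubbed out rather than a prescribed interface):

    # Assemble a long-context RAG prompt: retrieved passages go in the context
    # and the question comes last. Replace the hard-coded passages with your
    # own vector-store or database lookup.
    def build_rag_prompt(question: str, passages: list[str]) -> str:
        context = "\n\n".join(
            f"[Document {i + 1}]\n{p}" for i, p in enumerate(passages)
        )
        return (
            "You are a helpful assistant. Answer using only the documents below.\n\n"
            f"{context}\n\n"
            f"Question: {question}\nAnswer:"
        )

    passages = ["SKU 1042: warehouse B, 377 units in stock."]  # e.g. a DB row
    print(build_rag_prompt("How many units of SKU 1042 are in stock?", passages))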

Limitations

  • Not suitable for high-throughput text generation or latency-critical applications
  • Outputs may reflect biases inherent in synthetic data or hidden behaviors from previous training
  • The model tends to spend long, unnecessary stretches of output re-validating its own answers

Model Highlights

The iSA-02 represents a leap forward for small AI agents, exhibiting:

  • Dynamic Context Adaptation: Adjusts output based on input size and complexity
  • Innovative Behaviors: During testing, the model demonstrated advanced reasoning for its size, including formulating plans and attempting external tool use to solve problems

Understanding iSA-02 Behavior: Adapting to Context and Configuration

The performance of iSA-02 is highly dependent on the max_tokens setting, which controls the length of generated text. This parameter is crucial because the model adapts its behavior based on the context size:

  1. Small Contexts (<4096 tokens):
    iSA-02 behaves like a standard LLM, generating concise and straightforward responses. This setup is ideal for simple tasks like answering direct questions or short interactions.

  2. Medium (>8192 tokens) and Large Contexts (16,000+ tokens):
    For larger contexts, the model transitions to structured logical reasoning, breaking complex problems down into multiple steps. It can consume over 20,000 tokens before concluding, which makes it especially useful for strategic planning and analyzing long texts. Tune these settings carefully for your use case to reduce hallucinations (see the sketch below).
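
A sketch of the two regimes using llama-cpp-python; the model path, the 8192-token threshold, and the token budgets are assumptions based on the description above:

    from llama_cpp import Llama

    # Load a quantized build; n_ctx sets the context window the runtime allocates.
    llm = Llama(model_path="iSA-02-Nano-1B-Preview-Q4_K_M.gguf", n_ctx=16384)

    def answer(prompt: str) -> str:
        # Small prompts get a short, LLM-style completion; large prompts get
        # room for multi-step reasoning, per the behavior described above.
        long_context = len(llm.tokenize(prompt.encode("utf-8"))) > 8192
        out = llm.create_completion(
            prompt,
            max_tokens=16000 if long_context else 1024,
            temperature=0.2,
        )
        return out["choices"][0]["text"]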

Key Observed Behaviors

a. Depth of Reasoning

  • Capable of solving problems through iterative reasoning, sometimes taking several minutes to finalize an answer
  • In testing, the model generated detailed plans, including simulating function calls and devising strategies for unconventional challenges, like calculating the height of the Eiffel Tower

b. Adaptive Reasoning

  • Reasoning becomes more logical and structured as the context window grows
  • However, this can lead to unnecessary exploration, or even hallucinations, when the query is ambiguous or overly broad

c. Redundancy Risk

  • For simpler problems, the model may generate overly detailed responses or repeat ideas, especially without a strict token limit

d. Creative and Innovative Responses

  • Examples include hypothetical planning or finding creative solutions, which, while innovative, may require supervision for practicality
  • It is important to note that the model occasionally exhibits hallucinations, particularly when attempting to simulate function calls and returns.

Known Issues and Unusual Behavior (Addressed in V2)

Limitation Handling: The current model version has a tendency to:

  • Exhibit difficulty managing tasks that exceed its capabilities
  • Display unusual behavior when handling complex tasks, such as:
    • Occasionally 'giving up' on tasks that it judges to be too difficult (under investigation and testing)
    • Initiating online searches to hire human experts directly from freelance platforms when connected to the internet
    • Attempting to autonomously navigate and interact with web services to gather additional information or execute random tasks

These behaviors, while innovative, highlight the need for enhanced monitoring and safeguards to ensure that the AI's actions are aligned with user intentions and ethical guidelines. The next version of the model, V2, aims to refine these capabilities by:

  • Integrating advanced reasoning modules capable of handling complex scenarios with greater autonomy, reasoning through problems before resorting to external tools
  • Implementing stricter controls and permissions for online interactions and transactions
  • Improving the model's understanding of context and appropriateness when deciding to involve external human resources and tools

Recommended Settings

Attention

  1. Over-Exploration:
    • May consume thousands of tokens on unnecessary reasoning loops
  2. Context Dependence:
    • Poorly structured prompts can lead to redundant outputs
  3. Ambiguity:
    • Vague questions may produce verbose but unfocused responses

Best Practices

  • Avoid ambiguous prompts to reduce unnecessary reasoning
  • Use max_tokens settings tailored to the task's complexity; this is very important
  • Supervise outputs: use the model in critical or sensitive applications for research and testing ONLY
  • Provide clear and highly specific prompts
  • Although the model may have limited capacity (1B-2B variants), it is capable of generating intelligent responses when given precise instructions

Generation Parameters

  • max_tokens:
    • Simple Problems: For simpler problems and lower reasoning requirements, a setting between 1024 and 4096 tokens is usually sufficient
    • Complex Tasks: For more complex tasks that involve detailed reasoning and outputs, a higher range of 8,000 to 16,000 tokens may be necessary
  • temperature:
    • Objective Responses: For ensuring more objective and predictable responses, a temperature setting between 0.1 and 0.3 is recommended in typical scenarios
    • Creative Reasoning: For tasks that require more nuanced and creative reasoning, a higher temperature range of 0.9 to 1.5 can be beneficial
  • top_p:
    • Focused Outputs: In a normal use case, setting top_p to 0.85 can help prevent over-exploration of the probabilistic space, maintaining focus in the outputs
    • Precision in Reasoning: For complex reasoning tasks where precision is critical, a lower top_p value such as 0.1 may be more appropriate to constrain the model's choices to the most likely options
  • stop_sequences:
    • Avoiding Redundancy: Utilize specific stop sequences, like "Therefore, the answer is," to prevent the model from generating redundant or unnecessary additional content beyond the desired output
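
Putting these recommendations together, a sketch with llama-cpp-python (the preset names, model path, and example prompt are illustrative, not part of the model's interface):

    from llama_cpp import Llama

    llm = Llama(model_path="iSA-02-Nano-1B-Preview-Q4_K_M.gguf", n_ctx=16384)

    # Presets bundling the recommended ranges above; pick per task.
    PRESETS = {
        "objective":         {"temperature": 0.2, "top_p": 0.85, "max_tokens": 4096},
        "creative":          {"temperature": 1.1, "top_p": 0.85, "max_tokens": 8000},
        "precise_reasoning": {"temperature": 0.2, "top_p": 0.1,  "max_tokens": 16000},
    }

    out = llm.create_completion(
        "Summarize the incident log and state the most likely root cause.",
        stop=["Therefore, the answer is"],  # per the stop-sequence tip above
        **PRESETS["objective"],
    )
    print(out["choices"][0]["text"])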

Prompts for Optimal Use

  • Simple Tasks: Use prompts like:
    "You are a helpful assistant."
  • Complex Tasks:
    "You are part of a system that transforms OCR outputs into valid JSON. Always return only..."
  • Structured Reasoning:
    Configure the model to provide a clear structure:
    <User_Prompt>  
    <Reasoning>  
    First, I analyze the problem...  
    Then, I consider the implications...  
    Finally, I conclude...  
    </Reasoning>  
    <Answer>
    Here is the answer...
    </Answer>
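
If you adopt this tag structure, the final answer can be separated from the reasoning trace with a simple parse (the tag names follow the template above; the helper itself is an assumption, not a shipped utility):

    import re

    def extract_answer(completion: str) -> str:
        # Pull the final answer out of the structured response; fall back to
        # the raw completion if the model skipped the tags.
        match = re.search(
            r"<Answer>\s*(.*?)\s*(?:</Answer>|$)", completion, flags=re.DOTALL
        )
        return match.group(1) if match else completion.strip()

    sample = (
        "<Reasoning>\nFirst, I analyze the problem...\n</Reasoning>\n"
        "<Answer>\nHere is the answer...\n</Answer>"
    )
    print(extract_answer(sample))  # -> "Here is the answer..."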
    

Citation

@misc{isa02,
  author = {NeuraLake},
  title = {iSA-02: The First Small Reasoning Model with Context-Dynamic Behavior},
  year = {2024},
  license = {Apache 2.0},
  url = {https://huggingface.co/NeuraLake/iSA-02},
}

This model card is in development and will include the final name of the model, evaluation tests, and more.
