Instructions to use deepnight-research/ai1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

- Libraries
- Transformers

How to use deepnight-research/ai1 with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="deepnight-research/ai1")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("deepnight-research/ai1")
model = AutoModelForCausalLM.from_pretrained("deepnight-research/ai1")
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM

How to use deepnight-research/ai1 with vLLM:

Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "deepnight-research/ai1"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "deepnight-research/ai1",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker:

```shell
docker model run hf.co/deepnight-research/ai1
```

- SGLang

How to use deepnight-research/ai1 with SGLang:

Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "deepnight-research/ai1" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "deepnight-research/ai1",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images:

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "deepnight-research/ai1" \
  --host 0.0.0.0 \
  --port 30000
```

- Docker Model Runner

How to use deepnight-research/ai1 with Docker Model Runner:

```shell
docker model run hf.co/deepnight-research/ai1
```
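Since vLLM and SGLang both expose an OpenAI-compatible API, the curl calls above can also be issued from Python. A minimal sketch, assuming the default vLLM endpoint shown above and the same request fields (the actual HTTP send is commented out because it requires a running server and the `requests` package):

```python
import json

# Payload for the OpenAI-compatible /v1/completions endpoint,
# mirroring the fields used in the curl examples above.
payload = {
    "model": "deepnight-research/ai1",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5,
}

# With a server running, the request would be sent like this:
# import requests
# resp = requests.post("http://localhost:8000/v1/completions", json=payload)
# print(resp.json()["choices"][0]["text"])

body = json.dumps(payload)
```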

DEEPNIGHT ai1
The 600 Billion+ Parameter Model. Yes! We did this!
The second largest model in the world, right after GPT-4.
We at DEEPNIGHT have been working on this for quite some time. We have successfully built the second-largest model, called ai1, which comes with 600 billion+ parameters.
ai1 performs as well as GPT-4 and has a context window of 8k tokens.
ai1 was trained with a new approach. We first trained the model on a corpus of text from various sources, including but not limited to:
- RefinedWeb
- Open-source code from GitHub
- Common Crawl

We then fine-tuned the model on a huge dataset (generated manually and with automation) for logical understanding and reasoning. We also trained the model for function-calling capabilities.
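The card does not publish the function-calling format ai1 expects. Purely for illustration, the sketch below uses the common OpenAI-style JSON function schema; the name `get_weather` and all fields are hypothetical:

```python
import json

# Hypothetical function schema -- ai1's actual function-calling format
# has not been published; this follows the familiar OpenAI-style shape.
get_weather_schema = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# A schema like this would be serialized and supplied to the model.
schema_json = json.dumps(get_weather_schema, indent=2)
```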
What is special about ai1?
ai1 works on a built-in chaining methodology. When it receives an input from the user, it first tries to understand that input before starting generation: it internally generates an instruction-based prompt and then works on generating the response. The benefit? We'll just say the jobs of Prompt Engineering are over.
Unlike ChatGPT, GPT-4, Llama, and other models, ai1 doesn't require heavy prompt engineering to provide good answers; the understanding-development phase in the model takes care of that.
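The two-phase flow described above can be sketched as a simple pipeline. Everything here is illustrative: ai1's internal chaining is not public, and these function names are hypothetical stand-ins for its phases, not any real API.

```python
# Illustrative sketch only -- hypothetical stand-ins for ai1's internal
# phases: understand the input, build an internal instruction prompt,
# then generate the response from that prompt.

def understand(user_input: str) -> str:
    """Phase 1: analyze the raw input (intent, tone, emotion)."""
    return f"intent/emotion analysis of: {user_input!r}"

def build_instruction_prompt(analysis: str, user_input: str) -> str:
    """Turn the analysis into an internal instruction-based prompt."""
    return (
        "Instruction: answer the user based on the analysis below.\n"
        f"Analysis: {analysis}\n"
        f"User input: {user_input}"
    )

def generate(instruction_prompt: str) -> str:
    """Phase 2: generate the final response from the internal prompt."""
    return f"<response conditioned on>\n{instruction_prompt}"

def chain(user_input: str) -> str:
    analysis = understand(user_input)
    internal_prompt = build_instruction_prompt(analysis, user_input)
    return generate(internal_prompt)
```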
What else?
- performs as well as GPT-4
- excels in automation tasks
- can infer the user's emotions from the conversation (while understanding the input in Phase 1), resulting in better, curated generations
- understands human emotions, which helps the model curate content accordingly
- excels in roleplay
- excels in writing code
- has a few global memory units used to store data outside the context window. These memory units are mostly used to store function schemas, but ultimately the model itself decides what to store in them.
- as for cost: on average, $0.005 per 1,000 tokens.
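At that rate, cost scales linearly with token count; a quick sanity check using the average rate quoted above:

```python
RATE_PER_1K_TOKENS = 0.005  # USD, the average rate quoted above

def estimated_cost(tokens: int) -> float:
    """Estimated USD cost for a given number of tokens at the quoted rate."""
    return tokens / 1000 * RATE_PER_1K_TOKENS

# Filling the full 8k-token context window costs about 4 cents:
print(estimated_cost(8_000))      # → 0.04
# A million tokens comes to five dollars:
print(estimated_cost(1_000_000))  # → 5.0
```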
Future goals
We don't discuss that. Especially after seeing how SOME AI COMPANY ON THEIR DEV DAY just used open-source research and publications to profit themselves... Hah.
Are we going to allow access?
Not for some time. We are still running evaluations and have a lot to learn about how this model can be made better.
Feel free to reach out to us at research@deepnight.tech
- Team DEEPNIGHT