|
--- |
|
license: apache-2.0 |
|
--- |
|
|
|
# BitAgent-8B |
|
|
|
BitAgent-8B is an open-source, tool-calling language model fine-tuned and incentivized on [Bittensor](https://bittensor.com) Subnet #20 -- BitAgent. This model was trained for complex function-calling tasks, drawing on decentralized AI efforts and an open, community-driven training approach. |
|
|
|
 |
|
|
|
--- |
|
|
|
## Overview |
|
|
|
**BitAgent-8B** arose from the collaborative efforts within Bittensor Subnet #20. It leverages: |
|
- **Decentralized AI**: Community-driven hosting and validation that provide continuous training signals, ensuring that the model adapts to a wide range of function-calling and workflow-building tasks. |
|
- **Previous Dataset Top Performer**: The model's parameters were refined across multiple training competitions; this particular checkpoint performed strongly several rounds ago on `tool_shuffle_small` round 1, part of our constantly evolving task set.
|
- **Broad Tool-Calling Agency**: It handles a diverse set of functions spanning financial calculations, workflow management, deployment scripts, and more. |
|
|
|
--- |
|
|
|
## Key Features |
|
- **Enhanced Tool Usage**: Fine-tuned to select, utilize, and chain functions from a toolset effectively. |
|
- **BFCL-Style Adherence**: Engineered for strong performance on the [Berkeley Function Calling Leaderboard](https://gorilla.cs.berkeley.edu/leaderboard.html). |
|
- **Decentralized & Community Driven**: Developed and hosted on a global, miner-supported network (Subnet #20 on Bittensor), encouraging open contribution and verifying model performance without reliance on centralized resources. |
|
|
|
--- |
|
|
|
## Performance on BFCL |
|
BitAgent-8B secured a **top-10 rank (6th place)** on the BFCL, ahead of widely used models such as:

- GPT-4o Mini
|
- Gemini |
|
- Qwen |
|
- DeepSeek |
|
- Claude |
|
|
|
While some high-ranking small-form models on BFCL may have overfit specific function-calling tasks, **BitAgent-8B** was purposefully trained to preserve broad generalization. This emphasis on diverse tasks ensures robust, consistent performance across a variety of real-world use cases. |
|
|
|
--- |
|
|
|
## Open-Source Incentive Training |
|
**BitAgent-8B** was developed with an **incentive mechanism** on Bittensor Subnet #20: |
|
- Miners contributed compute to fine-tune and host candidate models. |
|
- Validators continuously tested these models on domain-specific tool-calling prompts. |
|
- The incentive mechanism drove iterative improvements, with unique challenges designed to combat overfitting.
|
|
|
This ecosystem ensures BitAgent-8B remains: |
|
1. **Adaptable** to new tasks. |
|
2. **Decentralized** in design, with no single entity controlling its training pipeline. |
|
3. **Transparent** in performance metrics, with all scoring data publicly available on the BFCL. |
|
|
|
For additional background on Subnet #20 and details about the training setup, see the [Subnet 20 Readme](https://github.com/RogueTensor/bitagent_subnet).
|
|
|
--- |
|
|
|
## Installation & Usage |
|
|
|
Below is a minimal example of loading **BitAgent-8B** using Hugging Face `transformers`. Adjust paths and parameters according to your environment: |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "BitAgent/Bitagent-8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Example usage in Python
# (Please adapt to your environment or use Bittensor's APIs accordingly)
prompt = """You are an expert in function calling.
You are given a question and a set of possible tools.
You must decide if a tool should be invoked.
Format tool calls strictly as: [tool_name(param=value, param2=value2)]
If no tool is relevant or required parameters are missing, please respond that the request can't be fulfilled."""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Passing **inputs forwards the attention mask along with the input IDs.
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
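
The model is trained to emit tool calls in the bracketed format shown in the system prompt above, e.g. `[tool_name(param=value, param2=value2)]`. One way to turn such a response into a function name plus keyword arguments is to lean on Python's `ast` module. The helper and the `get_weather` example below are an illustrative sketch, not part of any BitAgent API:

```python
import ast

def parse_tool_calls(response: str):
    """Parse a bracketed tool-call string such as
    '[func(a=1, b="x"), other(c=2)]' into (name, kwargs) pairs."""
    tree = ast.parse(response.strip(), mode="eval")
    if not isinstance(tree.body, ast.List):
        raise ValueError("expected a bracketed list of tool calls")
    calls = []
    for node in tree.body.elts:
        if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
            raise ValueError("expected a simple function call")
        # literal_eval accepts AST nodes, so keyword values parse safely
        # without executing arbitrary code.
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        calls.append((node.func.id, kwargs))
    return calls

calls = parse_tool_calls('[get_weather(city="Paris", days=3)]')
# calls == [("get_weather", {"city": "Paris", "days": 3})]
```

Using `ast.literal_eval` on the argument values keeps parsing safe: only Python literals are accepted, so a malformed or malicious model response cannot execute code.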
|
|
|
## Sample Prompt & Invocation Structure |
|
|
|
Following the BFCL-style function invocation structure, BitAgent-8B expects prompts that look like this: |
|
### Prompt Template |
|
```
"You are an expert in composing functions.
You are given a question and a set of possible functions.
Based on the question, you will need to make one or more function/tool calls to achieve the purpose.
If none of the functions can be used, point it out.
If the given question lacks the parameters required by any function, also point it out.
You should only return the function call in tools call sections.

If you decide to invoke any of the function(s),
you MUST put it in the format of [func_name(params_name1=params_value1, params_name2=params_value2...)].
You SHOULD NOT include any other text in the response.
Here is a list of functions in JSON format that you can invoke:

{functions}"
```
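
To assemble a complete prompt, the `{functions}` placeholder is replaced with a JSON list of tool schemas. Below is a minimal sketch; the `get_exchange_rate` tool is a hypothetical example, and the template string is trimmed for brevity (use the full template above in practice):

```python
import json

# Trimmed stand-in for the full prompt template shown above.
PROMPT_TEMPLATE = (
    "You are an expert in composing functions. "
    "Here is a list of functions in JSON format that you can invoke:\n\n"
    "{functions}"
)

# Hypothetical example tool schema in BFCL-style JSON.
functions = [
    {
        "name": "get_exchange_rate",
        "description": "Get the exchange rate between two currencies.",
        "parameters": {
            "type": "dict",
            "properties": {
                "base": {"type": "string", "description": "Base currency code, e.g. USD."},
                "target": {"type": "string", "description": "Target currency code, e.g. EUR."},
            },
            "required": ["base", "target"],
        },
    }
]

# str.replace avoids clashes with any literal braces inside the template body.
prompt = PROMPT_TEMPLATE.replace("{functions}", json.dumps(functions, indent=2))
print(prompt)
```

Given this prompt and a question like "What is the USD to EUR rate?", a well-formed response would be `[get_exchange_rate(base="USD", target="EUR")]`.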
|
|
|
--- |
|
## License |
|
BitAgent-8B is open-sourced under the Apache 2.0 License.
|
|