Safetensors
English
llama
File size: 5,694 Bytes
1a93bc7
4c1b7a4
1a93bc7
 
 
 
 
 
d15b856
 
 
 
1a93bc7
 
96b1599
 
 
1a93bc7
 
 
 
 
 
 
7c9a3ed
1a93bc7
 
 
7c9a3ed
 
 
 
1a93bc7
 
 
 
 
 
04a8bc4
1a93bc7
 
 
 
b5533fa
1a93bc7
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
04a8bc4
 
 
 
 
 
 
 
 
 
 
 
1a93bc7
b5533fa
1a93bc7
 
 
 
 
 
 
b5533fa
 
 
 
 
 
 
 
 
 
 
 
 
 
1a93bc7
 
 
 
 
 
 
17fb2ca
5955417
 
 
17fb2ca
754c7d5
17fb2ca
1a93bc7
 
 
 
7c9a3ed
 
 
 
 
 
 
 
1a93bc7
 
 
7c9a3ed
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
---
license: cc-by-nc-4.0
language:
- en
metrics:
- accuracy
base_model:
- meta-llama/Llama-3.3-70B-Instruct
datasets:
- uiuc-convai/CALM-IT
pipeline_tag: text-generation
library_name: transformers
---

# CALM-70B: Conversational Agentic Language Model

[![Made with Oumi](https://badgen.net/badge/Made%20with/Oumi/%23085CFF?icon=https%3A%2F%2Foumi.ai%2Flogo_dark.svg)](https://github.com/oumi-ai/oumi)

## Model Description
**CALM-70B** is our middle scale **Conversational Agentic Language Model**, designed to integrate **Task-Oriented Dialogue (TOD) capabilities** with **Language Agent (LA) functionalities** at a **larger scale** than its predecessor CALM-8B. By leveraging **CALM-IT**, a multi-task dataset interleaving **multi-turn ReAct reasoning** with **complex API usage**, CALM-70B achieves **state-of-the-art performance** across TOD and function-calling benchmarks.

CALM-70B has been fine-tuned on a **comprehensive multi-tasking** covering dialogue state tracking, function calling, and multi-turn reasoning, surpassing even proprietary models like **GPT-4o** on major conversational evaluation benchmarks: **MultiWOZ 2.4 (TOD), BFCL V3 (LA), and API-Bank (LA).**


## Model Sources

<!-- Provide the basic links for the model. -->

- πŸ“ **Paper:** https://arxiv.org/abs/2502.08820
- πŸ’» **Repository:** https://github.com/oumi-ai/oumi/tree/main/configs/projects/calm
- πŸ’Ž **Dataset:** https://huggingface.co/datasets/uiuc-convai/CALM-IT


---
## Model Details

- **Model Name:** CALM-70B  
- **Developed by:** Colloboration of UIUC Conversational AI LAB and Oumi 
- **License:** cc-by-nc-4.0  
- **Architecture:** Fine-tuned **Llama 3.3 70B Instruct**  
- **Parameter Count:** 70B  
- **Training Data:** CALM-IT
- **Training Type:** Full Fine-tunning (FFT)
- **Fine-tuning Framework:** [Oumi](https://github.com/oumi-ai/oumi)
- **Training Hardware:** 8 NVIDIA H100 GPUs  
- **Training Duration:** ~24 hours  
- **Evaluation Benchmarks:** MultiWOZ 2.4, BFCL V3, API-Bank  
- **Release Date:** February 5, 2025  

---
## Capabilities and Features

### πŸ—£ Conversational Agentic Abilities
- **Multi-turn Dialogue Mastery:** Handles long conversations with accurate state tracking.
- **Advanced Function Calling:** Dynamically selects and executes API calls for task completion.
- **Enhanced ReAct-based Reasoning:** Integrates structured reasoning (User-Thought-Action-Observation-Thought-Response).
- **Zero-Shot Generalization:** Excels in unseen function-calling and TOD tasks.

### πŸš€ Benchmark Performance
- **MultiWOZ 2.4 (TOD):** Strong performance in dialogue state tracking and task success.
- **BFCL V3 (LA):** Superior function-calling abilities compared to language agents.
- **API-Bank (LA):** High accuracy in API call generation and response synthesis.

---
## Training Process
### πŸ”§ Fine-tuning Stages
1. **TOD Fine-tuning:** Optimized for dialogue state tracking (e.g., augmented SNIPS in instruction-tuned format).
2. **Function Calling Fine-tuning:** Trained to generate precise API calls from LA datasets.
3. **ReAct-based Fine-tuning:** Enhances multi-turn conversations with API integrations through structured reasoning.

### πŸ” Training Hyperparameters
- **Base Model:** Llama 3.3 70B Instruct
- **LoRA Config:** Rank = 16, Scaling Factor = 32
- **Batch Size:** 7
- **Learning Rate:** 4e-5
- **Optimizer:** AdamW (betas = 0.9, 0.999, epsilon = 1e-8)
- **Precision:** Mixed precision (bfloat16)
- **Warm-up Steps:** 24
- **Gradient Accumulation Steps:** 1

---


## πŸ’‘ CALM-IT Dataset
<img src="table.png" alt="CALM-IT Dataset Statistics" width="800"/>


---
## πŸ“Š Benchmark Performance

<img src="results.png" alt="CALM-IT Dataset Statistics" width="1000"/>


## Usage
### πŸ— How to Load the Model using HuggingFace
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("uiuc-convai/CALM-70B")
model = AutoModelForCausalLM.from_pretrained("uiuc-convai/CALM-70B")
```

### πŸ›  Example Oumi Inference
```bash
pip install oumi

# See oumi_infer.yaml in this model's /oumi/ directory.
oumi infer -i -c ./oumi_infer.yaml
```

### πŸ›  Example Oumi Fine-Tuning
```bash
pip install oumi

# See oumi_train.yaml in this model's /oumi/ directory.
oumi train -c ./oumi_train.yaml
```


---
- **Scalability to CALM-405B:** Next iteration will extend capabilities for even larger-scale conversations.
- **Continuous Open-Source Expansion:** Ongoing release of datasets, model weights, and training artifacts to foster community research.

---
## Acknowledgements
We'd like to thank the [Oumi AI Team](https://github.com/oumi-ai/oumi) for collaborating on training the models, as well as [Together AI](https://www.together.ai/) for providing the compute resources necessary to train CALM 405B.

## License
This model is licensed under [Creative Commons NonCommercial (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/legalcode).

---
## Citation
If you use **CALM-70B** in your research, please cite:
```
@misc{acikgoz2025singlemodelmastermultiturn,
      title={Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model}, 
      author={Emre Can Acikgoz and Jeremiah Greer and Akul Datta and Ze Yang and William Zeng and Oussama Elachqar and Emmanouil Koukoumidis and Dilek Hakkani-TΓΌr and Gokhan Tur},
      year={2025},
      eprint={2502.08820},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2502.08820}, 
}
```

For more details, visit [Project Repository](https://github.com/oumi-ai/oumi/tree/main/configs/projects/calm) or contact **[email protected]**.