|
--- |
|
license: mit |
|
language: |
|
- zh |
|
- en |
|
pipeline_tag: text-generation |
|
library_name: transformers |
|
--- |
|
|
|
# GLM-4-32B-0414 |
|
|
|
## Introduction |
|
|
|
Based on our latest technological advancements, we have trained the `GLM-4-0414` series of models. During pretraining, we incorporated more code-related and reasoning-related data. In the alignment phase, we optimized the models specifically for agent capabilities, which significantly improved their performance on agent tasks such as tool use, web search, and coding.
|
|
|
 |
|
|
|
| Models | IFEval | SWE-Bench | BFCL-v3 (Overall) | BFCL-v3 (MultiTurn) | TAU-Bench (Retail) | TAU-Bench (Airline) | SimpleQA | HotpotQA | |
|
|------------------|---------|------------------|-------------------|---------------------|--------------------|---------------------|----------|----------| |
|
| Qwen2.5-Max | 85.6 | 24.4 | 50.9 | 30.5 | 58.3 | 22.0 | 79.0 | 52.8 | |
|
| GPT-4o-1120 | 81.9 | 38.8 | 69.6 | 41.0 | 62.8 | 46.0 | 82.8 | 63.9 | |
|
| DeepSeek-V3-0324 | 83.4 | 38.8 (oh) | 66.2 | 35.8 | 60.7 | 32.4 | 82.6 | 54.6 |

| DeepSeek-R1 | 84.3 | 34 (oh) / 49.2 (al) | 57.5 | 12.4 | 33.0 | 37.3 | 83.9 | 63.1 |

| GLM-4-32B-0414 | 86.5 | | 69.6 | 41.5 | 68.7 | 51.2 | 88.1 | 63.8 |

(oh) = evaluated with the OpenHands framework; (al) = evaluated with the Agentless framework.
|
|
|
|
|
## Inference Code |
|
|
|
Make sure you are using `transformers>=4.51.3`.
|
|
|
This model is a base model. If you need a chat model, please use [GLM-4-32B-Chat-0414](https://huggingface.co/THUDM/GLM-4-32B-Chat-0414).
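Since this section does not include the code itself, here is a minimal inference sketch with `transformers`. It assumes the checkpoint is published under the repo id `THUDM/GLM-4-32B-0414` (taken from this card's title) and, because this is a base model, it does plain text continuation without a chat template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, derived from this card's title.
MODEL_ID = "THUDM/GLM-4-32B-0414"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Continue `prompt` with the base model (no chat template applied)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # pick the checkpoint's native dtype
        device_map="auto",    # shard across available GPUs / fall back to CPU
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("def quick_sort(arr):"))
```

Note that a 32B-parameter model requires substantial GPU memory; `device_map="auto"` lets `accelerate` place the weights across the devices you have.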