---
language:
- en
- fr
- de
- es
- pt
- it
- ja
- ko
- ru
- zh
- ar
- fa
- id
- ms
- ne
- pl
- ro
- sr
- sv
- tr
- uk
- vi
- hi
- bn
license: apache-2.0
library_name: vllm
inference: false
---

# Model Card for Mistral-Small-3.1-24B-Base-2503 (TEXT ONLY)

This is the text-only variant of [mistralai/Mistral-Small-3.1-24B-Base-2503](https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Base-2503). It also serves as the base model for [mistralai/Devstral-Small-2505](https://huggingface.co/mistralai/Devstral-Small-2505), for which no official base model was released.

Features:

- Text-only, no multimodality.
- 128k context length.

How was a text-only model achieved? The vision encoder was removed and the model architecture was converted from `mistral3` to `mistral`. The tokenizer was not modified.
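The conversion script itself was not published, but under the Llava-style `transformers` layout for Mistral3 checkpoints it amounts to roughly the following. This is a minimal sketch, not the exact procedure used: the `language_model.`/`vision_tower.`/`multi_modal_projector.` key prefixes and the output directory name are assumptions, and the key layout varies across `transformers` versions, so check the state-dict keys first.

```python
# Minimal sketch of the conversion described above (not the exact script used).
# Assumes language weights live under "language_model.*" and vision weights
# under "vision_tower.*" / "multi_modal_projector.*"; verify against your
# transformers version before running.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Mistral3ForConditionalGeneration,
)

src = "mistralai/Mistral-Small-3.1-24B-Base-2503"
dst = "Mistral-Small-3.1-24B-Base-2503-Text-Only"

mm = Mistral3ForConditionalGeneration.from_pretrained(src, torch_dtype="auto")

# The text backbone's config becomes the config of the standalone model.
text_config = mm.config.text_config
text_config.architectures = ["MistralForCausalLM"]
lm = AutoModelForCausalLM.from_config(text_config, torch_dtype=mm.dtype)

# Keep only the language-model weights; drop the vision encoder and projector.
state = {
    k.removeprefix("language_model."): v
    for k, v in mm.state_dict().items()
    if k.startswith("language_model.")
}
lm.load_state_dict(state)
lm.save_pretrained(dst)

# The tokenizer was not modified, so it is copied over as-is.
AutoTokenizer.from_pretrained(src).save_pretrained(dst)
```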
## Reproduced eval

Serve with vLLM:

```
vllm serve casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only
```
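Once the server is up, a quick sanity check against vLLM's OpenAI-compatible API (the prompt and sampling parameters are only illustrative; since this is a base model, use the completions endpoint rather than chat):

```python
# Smoke test for the vLLM server started above.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
out = client.completions.create(
    model="casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only",
    prompt="The capital of France is",
    max_tokens=8,
    temperature=0.0,
)
print(out.choices[0].text)
```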
The reproduced results are below. The text-only variant matches the multimodal original within the reported standard errors (a quick back-of-the-envelope check follows the full tables).

| Model                           | MMLU (0-shot)  |
|---------------------------------|----------------|
| Small 3.1 24B Base (Text Only)  | 77.25% ± 0.33% |
| Small 3.1 24B Base (Multimodal) | 77.34% ± 0.33% |

### Original Multimodal: Full MMLU (Reproduced)

```
lm_eval --model local-completions \
  --model_args "base_url=http://localhost:8000/v1/completions,model=mistralai/Mistral-Small-3.1-24B-Base-2503" \
  --tasks mmlu \
  --batch_size 128
```

| Tasks |Version|Filter|n-shot|Metric| |Value | |Stderr|
|---------------------------------------|------:|------|-----:|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc |↑ |0.7734|± |0.0033|
| - humanities | 2|none | |acc |↑ |0.6820|± |0.0062|
| - formal_logic | 1|none | 0|acc |↑ |0.5714|± |0.0443|
| - high_school_european_history | 1|none | 0|acc |↑ |0.8303|± |0.0293|
| - high_school_us_history | 1|none | 0|acc |↑ |0.9363|± |0.0171|
| - high_school_world_history | 1|none | 0|acc |↑ |0.9241|± |0.0172|
| - international_law | 1|none | 0|acc |↑ |0.9091|± |0.0262|
| - jurisprudence | 1|none | 0|acc |↑ |0.8148|± |0.0376|
| - logical_fallacies | 1|none | 0|acc |↑ |0.8589|± |0.0274|
| - moral_disputes | 1|none | 0|acc |↑ |0.8208|± |0.0206|
| - moral_scenarios | 1|none | 0|acc |↑ |0.3844|± |0.0163|
| - philosophy | 1|none | 0|acc |↑ |0.8296|± |0.0214|
| - prehistory | 1|none | 0|acc |↑ |0.8704|± |0.0187|
| - professional_law | 1|none | 0|acc |↑ |0.6095|± |0.0125|
| - world_religions | 1|none | 0|acc |↑ |0.8713|± |0.0257|
| - other | 2|none | |acc |↑ |0.8317|± |0.0064|
| - business_ethics | 1|none | 0|acc |↑ |0.8200|± |0.0386|
| - clinical_knowledge | 1|none | 0|acc |↑ |0.8679|± |0.0208|
| - college_medicine | 1|none | 0|acc |↑ |0.7803|± |0.0316|
| - global_facts | 1|none | 0|acc |↑ |0.6600|± |0.0476|
| - human_aging | 1|none | 0|acc |↑ |0.7982|± |0.0269|
| - management | 1|none | 0|acc |↑ |0.9029|± |0.0293|
| - marketing | 1|none | 0|acc |↑ |0.9359|± |0.0160|
| - medical_genetics | 1|none | 0|acc |↑ |0.8900|± |0.0314|
| - miscellaneous | 1|none | 0|acc |↑ |0.9183|± |0.0098|
| - nutrition | 1|none | 0|acc |↑ |0.8791|± |0.0187|
| - professional_accounting | 1|none | 0|acc |↑ |0.6277|± |0.0288|
| - professional_medicine | 1|none | 0|acc |↑ |0.8603|± |0.0211|
| - virology | 1|none | 0|acc |↑ |0.5602|± |0.0386|
| - social sciences | 2|none | |acc |↑ |0.8736|± |0.0059|
| - econometrics | 1|none | 0|acc |↑ |0.6491|± |0.0449|
| - high_school_geography | 1|none | 0|acc |↑ |0.8990|± |0.0215|
| - high_school_government_and_politics| 1|none | 0|acc |↑ |0.9637|± |0.0135|
| - high_school_macroeconomics | 1|none | 0|acc |↑ |0.8103|± |0.0199|
| - high_school_microeconomics | 1|none | 0|acc |↑ |0.9034|± |0.0192|
| - high_school_psychology | 1|none | 0|acc |↑ |0.9358|± |0.0105|
| - human_sexuality | 1|none | 0|acc |↑ |0.8855|± |0.0279|
| - professional_psychology | 1|none | 0|acc |↑ |0.8578|± |0.0141|
| - public_relations | 1|none | 0|acc |↑ |0.7909|± |0.0390|
| - security_studies | 1|none | 0|acc |↑ |0.8327|± |0.0239|
| - sociology | 1|none | 0|acc |↑ |0.9154|± |0.0197|
| - us_foreign_policy | 1|none | 0|acc |↑ |0.9300|± |0.0256|
| - stem | 2|none | |acc |↑ |0.7545|± |0.0073|
| - abstract_algebra | 1|none | 0|acc |↑ |0.4600|± |0.0501|
| - anatomy | 1|none | 0|acc |↑ |0.8148|± |0.0336|
| - astronomy | 1|none | 0|acc |↑ |0.9211|± |0.0219|
| - college_biology | 1|none | 0|acc |↑ |0.9444|± |0.0192|
| - college_chemistry | 1|none | 0|acc |↑ |0.5700|± |0.0498|
| - college_computer_science | 1|none | 0|acc |↑ |0.7100|± |0.0456|
| - college_mathematics | 1|none | 0|acc |↑ |0.6200|± |0.0488|
| - college_physics | 1|none | 0|acc |↑ |0.6569|± |0.0472|
| - computer_security | 1|none | 0|acc |↑ |0.8300|± |0.0378|
| - conceptual_physics | 1|none | 0|acc |↑ |0.8170|± |0.0253|
| - electrical_engineering | 1|none | 0|acc |↑ |0.7931|± |0.0338|
| - elementary_mathematics | 1|none | 0|acc |↑ |0.7910|± |0.0209|
| - high_school_biology | 1|none | 0|acc |↑ |0.9323|± |0.0143|
| - high_school_chemistry | 1|none | 0|acc |↑ |0.7586|± |0.0301|
| - high_school_computer_science | 1|none | 0|acc |↑ |0.8900|± |0.0314|
| - high_school_mathematics | 1|none | 0|acc |↑ |0.5185|± |0.0305|
| - high_school_physics | 1|none | 0|acc |↑ |0.6291|± |0.0394|
| - high_school_statistics | 1|none | 0|acc |↑ |0.7593|± |0.0292|
| - machine_learning | 1|none | 0|acc |↑ |0.6250|± |0.0460|

| Groups |Version|Filter|n-shot|Metric| |Value | |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc |↑ |0.7734|± |0.0033|
| - humanities | 2|none | |acc |↑ |0.6820|± |0.0062|
| - other | 2|none | |acc |↑ |0.8317|± |0.0064|
| - social sciences| 2|none | |acc |↑ |0.8736|± |0.0059|
| - stem | 2|none | |acc |↑ |0.7545|± |0.0073|

### Text Only: Full MMLU

```
lm_eval --model local-completions \
  --model_args "base_url=http://localhost:8000/v1/completions,model=casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only" \
  --tasks mmlu \
  --batch_size 128
```

| Tasks |Version|Filter|n-shot|Metric| |Value | |Stderr|
|---------------------------------------|------:|------|-----:|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc |↑ |0.7725|± |0.0033|
| - humanities | 2|none | |acc |↑ |0.6793|± |0.0062|
| - formal_logic | 1|none | 0|acc |↑ |0.5397|± |0.0446|
| - high_school_european_history | 1|none | 0|acc |↑ |0.8364|± |0.0289|
| - high_school_us_history | 1|none | 0|acc |↑ |0.9363|± |0.0171|
| - high_school_world_history | 1|none | 0|acc |↑ |0.9198|± |0.0177|
| - international_law | 1|none | 0|acc |↑ |0.9008|± |0.0273|
| - jurisprudence | 1|none | 0|acc |↑ |0.8148|± |0.0376|
| - logical_fallacies | 1|none | 0|acc |↑ |0.8405|± |0.0288|
| - moral_disputes | 1|none | 0|acc |↑ |0.8237|± |0.0205|
| - moral_scenarios | 1|none | 0|acc |↑ |0.3765|± |0.0162|
| - philosophy | 1|none | 0|acc |↑ |0.8264|± |0.0215|
| - prehistory | 1|none | 0|acc |↑ |0.8704|± |0.0187|
| - professional_law | 1|none | 0|acc |↑ |0.6108|± |0.0125|
| - world_religions | 1|none | 0|acc |↑ |0.8713|± |0.0257|
| - other | 2|none | |acc |↑ |0.8339|± |0.0064|
| - business_ethics | 1|none | 0|acc |↑ |0.8300|± |0.0378|
| - clinical_knowledge | 1|none | 0|acc |↑ |0.8679|± |0.0208|
| - college_medicine | 1|none | 0|acc |↑ |0.7746|± |0.0319|
| - global_facts | 1|none | 0|acc |↑ |0.6800|± |0.0469|
| - human_aging | 1|none | 0|acc |↑ |0.8027|± |0.0267|
| - management | 1|none | 0|acc |↑ |0.9029|± |0.0293|
| - marketing | 1|none | 0|acc |↑ |0.9402|± |0.0155|
| - medical_genetics | 1|none | 0|acc |↑ |0.8900|± |0.0314|
| - miscellaneous | 1|none | 0|acc |↑ |0.9208|± |0.0097|
| - nutrition | 1|none | 0|acc |↑ |0.8791|± |0.0187|
| - professional_accounting | 1|none | 0|acc |↑ |0.6312|± |0.0288|
| - professional_medicine | 1|none | 0|acc |↑ |0.8603|± |0.0211|
| - virology | 1|none | 0|acc |↑ |0.5602|± |0.0386|
| - social sciences | 2|none | |acc |↑ |0.8739|± |0.0059|
| - econometrics | 1|none | 0|acc |↑ |0.6667|± |0.0443|
| - high_school_geography | 1|none | 0|acc |↑ |0.8939|± |0.0219|
| - high_school_government_and_politics| 1|none | 0|acc |↑ |0.9585|± |0.0144|
| - high_school_macroeconomics | 1|none | 0|acc |↑ |0.8103|± |0.0199|
| - high_school_microeconomics | 1|none | 0|acc |↑ |0.9076|± |0.0188|
| - high_school_psychology | 1|none | 0|acc |↑ |0.9358|± |0.0105|
| - human_sexuality | 1|none | 0|acc |↑ |0.8855|± |0.0279|
| - professional_psychology | 1|none | 0|acc |↑ |0.8578|± |0.0141|
| - public_relations | 1|none | 0|acc |↑ |0.7909|± |0.0390|
| - security_studies | 1|none | 0|acc |↑ |0.8327|± |0.0239|
| - sociology | 1|none | 0|acc |↑ |0.9104|± |0.0202|
| - us_foreign_policy | 1|none | 0|acc |↑ |0.9400|± |0.0239|
| - stem | 2|none | |acc |↑ |0.7520|± |0.0073|
| - abstract_algebra | 1|none | 0|acc |↑ |0.4500|± |0.0500|
| - anatomy | 1|none | 0|acc |↑ |0.8296|± |0.0325|
| - astronomy | 1|none | 0|acc |↑ |0.9211|± |0.0219|
| - college_biology | 1|none | 0|acc |↑ |0.9444|± |0.0192|
| - college_chemistry | 1|none | 0|acc |↑ |0.5600|± |0.0499|
| - college_computer_science | 1|none | 0|acc |↑ |0.7100|± |0.0456|
| - college_mathematics | 1|none | 0|acc |↑ |0.6200|± |0.0488|
| - college_physics | 1|none | 0|acc |↑ |0.6569|± |0.0472|
| - computer_security | 1|none | 0|acc |↑ |0.8300|± |0.0378|
| - conceptual_physics | 1|none | 0|acc |↑ |0.8213|± |0.0250|
| - electrical_engineering | 1|none | 0|acc |↑ |0.7862|± |0.0342|
| - elementary_mathematics | 1|none | 0|acc |↑ |0.7804|± |0.0213|
| - high_school_biology | 1|none | 0|acc |↑ |0.9290|± |0.0146|
| - high_school_chemistry | 1|none | 0|acc |↑ |0.7488|± |0.0305|
| - high_school_computer_science | 1|none | 0|acc |↑ |0.8900|± |0.0314|
| - high_school_mathematics | 1|none | 0|acc |↑ |0.5222|± |0.0305|
| - high_school_physics | 1|none | 0|acc |↑ |0.6225|± |0.0396|
| - high_school_statistics | 1|none | 0|acc |↑ |0.7500|± |0.0295|
| - machine_learning | 1|none | 0|acc |↑ |0.6339|± |0.0457|

| Groups |Version|Filter|n-shot|Metric| |Value | |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc |↑ |0.7725|± |0.0033|
| - humanities | 2|none | |acc |↑ |0.6793|± |0.0062|
| - other | 2|none | |acc |↑ |0.8339|± |0.0064|
| - social sciences| 2|none | |acc |↑ |0.8739|± |0.0059|
| - stem | 2|none | |acc |↑ |0.7520|± |0.0073|
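As the promised back-of-the-envelope check that the headline gap is noise rather than real degradation, combine the top-level numbers from the tables above. This treats the two runs as independent (in fact both score the same questions, so this overstates the combined error):

```python
# Compare the headline MMLU accuracies using the stderr values reported above.
acc_mm, se_mm = 0.7734, 0.0033  # multimodal base
acc_to, se_to = 0.7725, 0.0033  # text-only variant

diff = acc_mm - acc_to
se_diff = (se_mm**2 + se_to**2) ** 0.5  # independence assumption
print(f"gap = {diff:.4f} ({diff / se_diff:.2f} standard errors)")
# gap = 0.0009 (0.19 standard errors)
```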