Rationalyst
This model is a fine-tuned version of LLaMa-3-Instruct-8B. It was introduced in the paper RATIONALYST: Pre-training Process-Supervision for Improving Reasoning. The code for rationale extraction, model training, and inference can be found here.
Model description
Implicit rationales are often embedded in unlabelled text, reflecting the natural thought processes behind speech and writing. RATIONALYST is a self-supervised approach that extracts and filters these implicit rationales from unlabelled text and uses them to provide process supervision for reasoning.
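For intuition, the sketch below shows one way such a self-supervised filter could look: a candidate rationale is kept only if conditioning on it makes the following text more likely under a reference language model. The base model, prompt concatenation, scoring rule, and margin here are illustrative assumptions rather than the exact recipe from the paper; see the linked code for the actual extraction and filtering pipeline.

```python
# Illustrative sketch: filter an implicit rationale by whether it helps predict
# the subsequent text. The scoring rule and margin are assumptions, not the
# exact procedure from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ref_model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(ref_model_id)
lm = AutoModelForCausalLM.from_pretrained(ref_model_id, device_map="auto")

def avg_log_likelihood(context: str, continuation: str) -> float:
    """Average log-probability of `continuation` given `context`.

    Token counting at the context/continuation boundary is approximate,
    which is acceptable for a rough filtering sketch.
    """
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids.to(lm.device)
    full_ids = tokenizer(context + continuation, return_tensors="pt").input_ids.to(lm.device)
    cont_len = full_ids.shape[1] - ctx_ids.shape[1]
    with torch.no_grad():
        logits = lm(full_ids).logits
    # Shift so position t predicts token t+1, then score only continuation tokens.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = full_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return token_lp[:, -cont_len:].mean().item()

def keep_rationale(prefix: str, rationale: str, next_text: str, margin: float = 0.0) -> bool:
    # Keep the rationale only if it makes the following text more likely.
    with_rationale = avg_log_likelihood(prefix + " " + rationale + " ", next_text)
    without_rationale = avg_log_likelihood(prefix + " ", next_text)
    return with_rationale - without_rationale > margin
```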
How to use
To use it, input a question and a partial reasoning trajectory; the model outputs a rationale that supervises the next reasoning step, as in the sketch below.
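A minimal usage sketch with Hugging Face Transformers is shown below. The repo id placeholder and the plain-text prompt format are assumptions for illustration; consult the linked code for the exact prompt template used during training and inference.

```python
# Minimal usage sketch; the model path and prompt format below are illustrative
# assumptions, not confirmed by this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/rationalyst"  # replace with the actual repo id or local checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

question = (
    "Natalia sold clips to 48 of her friends in April, and then she sold half "
    "as many clips in May. How many clips did Natalia sell altogether?"
)
partial_trajectory = "In April, Natalia sold 48 clips."

# Concatenate the question and the partial reasoning trajectory as the prompt.
prompt = f"Question: {question}\nReasoning so far: {partial_trajectory}\nRationale:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
rationale = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(rationale)  # rationale used to supervise the next reasoning step
```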
Training data
Rationalyst is trained on 65k implicit rationales extracted from The Pile and 14k implicit rationales extracted from GSM8K and ECQA. The data used can be found here.
Evaluation results
When used to supervise reasoning on downstream tasks, this model achieves the following results:
| Task | GSM8K | MATH | ECQA | HellaSwag | ProofWriter | ARC | MMLU-Pro |
|---|---|---|---|---|---|---|---|
| Accuracy (%) | 81.6 | 32.5 | 75.2 | 60.3 | 90.7 | 80.7 | 45.3 |