# FlowCoT Checkpoints

Checkpoints for the FlowCoT latent-reasoning code-generation model, built on Qwen3-8B-Base with a normalizing-flow latent path and VAE-encoded reasoning compression.
## Checkpoints

### `dual_path_stage2_checkpoint-22900`
Best dual-path (NF + CE) checkpoint. Trained in Stage 2 with `config_coding.yaml` (original cap schedule).

### `unified_stage1_checkpoint-3125`
Stage 1 of the new unified (single-forward-pass) architecture with the fixlen dataset caps. Training is ongoing.
## Evaluation: pass@1 (mean@16)

Comparison of `dual_path_stage2_checkpoint-22900` against LaDiR, a diffusion-based baseline with the same backbone and training data.
| Benchmark | FlowCoT (ours) | LaDiR |
|---|---:|---:|
| MBPP | 74.4% | 66.8% |
| MBPP+ | 77.5% | 59.5% |
| HumanEval | 82.9% | 87.4% |
| HumanEval+ | 77.8% | 73.2% |
FlowCoT numbers are pass@1 averaged over 16 samples per problem (8 seeds × 2 samples, temperature 0.6).
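The pass@1 (mean@16) metric above is the fraction of sampled completions that pass the benchmark's tests, averaged over problems. A minimal sketch of that computation (the function name and the boolean-matrix input format are illustrative, not from the FlowCoT codebase):

```python
import statistics

def pass_at_1_mean_at_k(results: list[list[bool]]) -> float:
    """results[i][j] is True iff sample j for problem i passed its tests.

    pass@1 (mean@k): per-problem pass rate over the k samples,
    then the mean of those rates across problems.
    """
    per_problem = [sum(samples) / len(samples) for samples in results]
    return statistics.mean(per_problem)

# Toy example with k=4 samples per problem (the card uses k=16):
# problem 0 passes 3/4 samples, problem 1 passes 1/4.
results = [
    [True, True, False, True],
    [False, False, True, False],
]
print(pass_at_1_mean_at_k(results))  # 0.5
```

With temperature sampling, each of the k completions is an independent draw, so this average is an unbiased estimate of single-sample (pass@1) accuracy.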
## Usage

Training and evaluation code: https://github.com/GMLR-Penn/FlowCoT
## Model tree for Penn-GMLR/FlowCoT_ckpt

Base model: Qwen/Qwen3-8B-Base