Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems
Abstract
Scientists often infer abstract procedures from specific instances of problems and use the abstractions to generate new, related instances. For example, programs encoding the formal rules and properties of a system have been useful in fields ranging from RL (procedural environments) to physics (simulation engines). These programs can be seen as functions which execute to different outputs based on their parameterizations (e.g., gridworld configuration or initial physical conditions). We introduce the term EFA (Executable Functional Abstraction) to denote such programs for math problems. EFA-like constructs have been shown to be useful for math reasoning as problem generators for stress-testing models. However, prior work has been limited to abstractions for grade-school math (whose simple rules are easy to encode in programs), while generating EFAs for advanced math has thus far required human engineering. We explore the automatic construction of EFAs for advanced math problems. We operationalize the task of automatically constructing EFAs as a program synthesis task, and develop EFAGen, which conditions an LLM on a seed math problem and its step-by-step solution to generate candidate EFA programs that are faithful to the generalized problem and solution class underlying the seed problem. Furthermore, we formalize properties any valid EFA must possess in terms of executable unit tests, and show how the tests can be used as verifiable rewards to train LLMs to become better writers of EFAs. We demonstrate that EFAs constructed by EFAGen behave rationally by remaining faithful to seed problems, produce learnable problem variations, and that EFAGen can infer EFAs across multiple diverse sources of competition-level math problems. Finally, we show downstream uses of model-written EFAs e.g. finding problem variations that are harder or easier for a learner to solve, as well as data generation.
Community
TL;DR We propose a way to transform a static math problem into an executable program that can generate unlimited variants of the static problem. It works on diverse sources of math problems (including Olympiad level problems) and we show it is both an effective source of data for augmentation and can be used to search for harder variants of existing problems.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis (2025)
- DeduCE: Deductive Consistency as a Framework to Evaluate LLM Reasoning (2025)
- Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity (2025)
- L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution (2025)
- Evaluating the Generalization Capabilities of Large Language Models on Code Reasoning (2025)
- Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs (2025)
- AutoLogi: Automated Generation of Logic Puzzles for Evaluating Reasoning Abilities of Large Language Models (2025)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment:
@librarian-bot
recommend
Models citing this paper 1
Datasets citing this paper 2
Spaces citing this paper 0
No Space linking this paper