Abstract
The controllability of generative models is analyzed through a theoretical framework that estimates controllable sets with distribution-free bounds, revealing that controllability is fragile and context-dependent.
As generative models become ubiquitous, there is a critical need for fine-grained control over the generation process. Yet, while controlled-generation methods, from prompting to fine-tuning, proliferate, a fundamental question remains unanswered: are these models truly controllable in the first place? In this work, we provide a theoretical framework to formally answer this question. Framing human-model interaction as a control process, we propose a novel algorithm to estimate the controllable sets of models in a dialogue setting. Notably, we provide formal guarantees on the estimation error as a function of sample complexity: we derive probably approximately correct (PAC) bounds for controllable-set estimates that are distribution-free, require no assumptions except output boundedness, and work for any black-box nonlinear control system (i.e., any generative model). We empirically demonstrate the theoretical framework on several dialogue-control tasks, for both language models and text-to-image generation. Our results show that model controllability is surprisingly fragile and highly dependent on the experimental setting. This highlights the need for rigorous controllability analysis, shifting the focus from simply attempting control to first understanding its fundamental limits.
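The abstract describes sampling-based estimation of controllable sets with distribution-free, PAC-style guarantees, but does not spell out the algorithm itself. The sketch below is not the paper's method; it only illustrates the general flavor of such an estimate: query a black-box model with sampled control inputs, record how often a target output set is reached, and attach a Hoeffding-style confidence interval that needs nothing beyond bounded (here, binary) outcomes. All names (`estimate_reachability`, `in_target`, the toy model) are hypothetical.

```python
# Illustrative only: Monte Carlo estimate of how often sampled control inputs
# (e.g., prompts) steer a black-box generative model into a target output set,
# with a distribution-free Hoeffding-style PAC bound on the estimate.
import math
import random


def estimate_reachability(model, candidate_controls, in_target, n_samples, delta=0.05):
    """Return (empirical_rate, epsilon) such that, with probability >= 1 - delta,
    the true reachability rate lies within empirical_rate +/- epsilon.

    model:              black-box callable mapping a control input to an output
    candidate_controls: pool of control inputs (e.g., candidate prompts) to sample from
    in_target:          callable output -> bool, True if the output is in the target set
    n_samples:          number of i.i.d. Monte Carlo queries to the model
    delta:              allowed failure probability of the bound
    """
    hits = 0
    for _ in range(n_samples):
        u = random.choice(candidate_controls)   # sample a control input
        y = model(u)                            # query the black-box model once
        hits += int(in_target(y))               # indicator: did we reach the target set?
    empirical_rate = hits / n_samples
    # Hoeffding's inequality for bounded (here {0,1}) outcomes; no distributional
    # assumptions on the model's outputs are needed.
    epsilon = math.sqrt(math.log(2.0 / delta) / (2.0 * n_samples))
    return empirical_rate, epsilon


if __name__ == "__main__":
    # Toy stand-in for a generative model: a noisy map from a scalar "prompt strength"
    # to a scalar output; the target set is outputs above a threshold.
    toy_model = lambda u: u + random.gauss(0.0, 0.5)
    rate, eps = estimate_reachability(
        toy_model,
        candidate_controls=[0.0, 0.5, 1.0, 1.5, 2.0],
        in_target=lambda y: y > 1.0,
        n_samples=2000,
    )
    print(f"estimated reachability: {rate:.3f} +/- {eps:.3f} (95% confidence)")
```

The bound epsilon = sqrt(ln(2/delta) / (2n)) shrinks at rate 1/sqrt(n), mirroring the abstract's point that estimation error can be controlled purely as a function of sample complexity, with no assumptions on the model beyond boundedness.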
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API:
- A Reason-then-Describe Instruction Interpreter for Controllable Video Generation (2025)
- EVE: A Generator-Verifier System for Generative Policies (2025)
- Eliciting Behaviors in Multi-Turn Conversations (2025)
- SkillWrapper: Generative Predicate Invention for Skill Abstraction (2025)
- From Word to World: Can Large Language Models be Implicit Text-based World Models? (2025)
- SAGE: An Agentic Explainer Framework for Interpreting SAE Features in Language Models (2025)
- Propose, Solve, Verify: Self-Play Through Formal Verification (2025)