G-FOCUS: Towards a Robust Method for Assessing UI Design Persuasiveness
Abstract
G-FOCUS, a novel inference-time reasoning strategy, enhances Vision-Language Models for assessing UI persuasiveness, complementing A/B testing.
Evaluating user interface (UI) design effectiveness extends beyond aesthetics to influencing user behavior, a principle central to Design Persuasiveness. A/B testing is the predominant method for determining which UI variations drive higher user engagement, but it is costly and time-consuming. While recent Vision-Language Models (VLMs) can automate UI analysis, current approaches focus on isolated design attributes rather than comparative persuasiveness, the key factor in optimizing user interactions. To address this, we introduce WiserUI-Bench, a benchmark designed for the Pairwise UI Design Persuasiveness Assessment task, featuring 300 real-world UI image pairs labeled with A/B test results and expert rationales. Additionally, we propose G-FOCUS, a novel inference-time reasoning strategy that enhances VLM-based persuasiveness assessment by reducing position bias and improving evaluation accuracy. Experimental results show that G-FOCUS surpasses existing inference strategies in consistency and accuracy for pairwise UI evaluation. By promoting VLM-driven evaluation of UI persuasiveness, our work offers an approach to complement A/B testing, advancing scalable UI preference modeling and design optimization. Code and data will be released publicly.
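To make the notions of position bias and consistency in pairwise evaluation concrete, here is a minimal sketch of an order-swap check. It is not G-FOCUS and not the benchmark's official scoring: the `judge` callable, the `evaluate_pairs` helper, and the two metric names are hypothetical, and the paper's own metric definitions may differ. The idea is only that a judge is queried once per presentation order, and a pair counts as consistent when both orders yield the same winner.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical judge interface: given two UI identifiers in presentation
# order, return "first" or "second" for the one judged more persuasive.
Judge = Callable[[str, str], str]


def evaluate_pairs(judge: Judge, pairs: List[Tuple[str, str, str]]) -> Dict[str, float]:
    """Score a judge on (ui_a, ui_b, winner) triples, where winner is 'a' or 'b'.

    Each pair is judged in both presentation orders. A pair is consistent
    if both orders pick the same UI; it is counted as correct only if it is
    consistent and the pick matches the labeled winner.
    """
    if not pairs:
        raise ValueError("pairs must be non-empty")

    consistent = 0
    correct = 0
    for ui_a, ui_b, winner in pairs:
        # Forward order: ui_a is shown first.
        forward = "a" if judge(ui_a, ui_b) == "first" else "b"
        # Swapped order: ui_b is shown first, so "first" now means 'b'.
        swapped = "b" if judge(ui_b, ui_a) == "first" else "a"
        if forward == swapped:
            consistent += 1
            if forward == winner:
                correct += 1

    n = len(pairs)
    return {
        "consistency": consistent / n,
        "consistent_accuracy": correct / n,
    }
```

Under this sketch, a judge that always prefers whichever UI is shown first scores zero consistency, because swapping the order flips its answer on every pair.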
Community
We introduce WiserUI-Bench, a benchmark with 300 real-world UI image pairs and A/B test results for assessing design persuasiveness. Our reasoning strategy, G-FOCUS, enhances VLMs' reliability in UI evaluation by reducing bias and improving accuracy.