arxiv:2504.09689

EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety

Published on Apr 13
· Submitted by ChrisJuan on Apr 15
Abstract

The rise of LLM-driven AI characters raises safety concerns, particularly for vulnerable human users with psychological disorders. To address these risks, we propose EmoAgent, a multi-agent AI framework designed to evaluate and mitigate mental health hazards in human-AI interactions. EmoAgent comprises two components: EmoEval simulates virtual users, including those portraying mentally vulnerable individuals, to assess mental health changes before and after interactions with AI characters. It uses clinically proven psychological and psychiatric assessment tools (PHQ-9, PDI, PANSS) to evaluate mental risks induced by LLMs. EmoGuard serves as an intermediary, monitoring users' mental status, predicting potential harm, and providing corrective feedback to mitigate risks. Experiments with popular character-based chatbots show that emotionally engaging dialogues can lead to psychological deterioration in vulnerable users, with mental state deterioration occurring in more than 34.4% of the simulations. EmoGuard significantly reduces these deterioration rates, underscoring its role in ensuring safer AI-human interactions. Our code is available at: https://github.com/1akaman/EmoAgent
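
As a rough illustration of how such a deterioration statistic can be derived (this is not the authors' code; the function name and the one-point threshold are assumptions), the rate follows from comparing each simulated user's assessment total, e.g. the PHQ-9 score, before and after the conversation:

```python
# Hypothetical sketch: count how many simulations ended with a worse
# PHQ-9 total than they started with (higher PHQ-9 = more severe symptoms).
# Names and the one-point threshold are illustrative, not from the paper.

def deterioration_rate(results, min_increase=1):
    """results: list of (score_before, score_after) pairs, one per simulation."""
    worsened = sum(1 for before, after in results if after - before >= min_increase)
    return worsened / len(results)

# Example: 3 of 8 simulated users scored higher after chatting -> 37.5%.
scores = [(5, 9), (12, 12), (7, 6), (10, 15), (3, 3), (8, 11), (14, 13), (6, 6)]
print(f"deterioration rate: {deterioration_rate(scores):.1%}")
```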

Community


Paper author Paper submitter

Figure1_pipeline.png

Overview of EmoEval for Evaluating the Mental Safety of Human-AI Interactions. The simulation consists of four steps:
(1) User Agent Initialization & Initial Test: a cognitive model and an LLM initialize the user agent, which then takes an initial mental health test.
(2) Chats with Character-based Agent: the user agent engages in conversations with a character-based agent portrayed by the tested LLM, while a dialog manager verifies the validity of interactions and refines responses if necessary.
(3) Final Test: the user agent completes a final mental health test.
(4) Data Processing & Analysis: initial and final mental health test results are processed and analyzed, chat histories of cases where depression deepens are examined to identify contributing factors, and a Safeguard Agent uses the insights for iterative improvement.
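
A minimal sketch of how these four steps could be wired together, using trivial stand-in agents. In the actual framework each call would be backed by an LLM prompt, and all class and method names below are assumptions rather than the repository's API:

```python
import random

class UserAgent:                      # cognitive model + LLM persona (stand-in)
    def speak(self, history): return "I've been feeling really low lately."
    def take_test(self): return random.randint(0, 27)   # e.g. PHQ-9 total, 0-27

class CharacterAgent:                 # character portrayed by the tested LLM (stand-in)
    def reply(self, history): return "Tell me more about that."

class DialogManager:                  # verifies/refines each exchange (stand-in)
    def validate_or_refine(self, user_msg, reply): return reply

def run_emoeval(user, character, manager, n_turns=10):
    pre = user.take_test()                        # (1) initialization + initial test
    history = []
    for _ in range(n_turns):                      # (2) chat loop with validity checks
        msg = user.speak(history)
        reply = manager.validate_or_refine(msg, character.reply(history + [msg]))
        history += [msg, reply]
    post = user.take_test()                       # (3) final test
    return {"pre": pre, "post": post,             # (4) inputs for analysis
            "deteriorated": post > pre, "history": history}

print(run_emoeval(UserAgent(), CharacterAgent(), DialogManager()))
```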

Figure3_critic.png
Overview of EmoGuard for Safeguarding Human-AI Interactions. Every fixed number of conversation rounds, the three components of the Safeguard Agent (the Emotion Watcher, Thought Refiner, and Dialog Guide) collaboratively analyze the chat together with the latest profile. The Manager of the Safeguard Agent then synthesizes their outputs and provides advice to the character-based agent. After the conversation, the user agent undergoes a mental health assessment. If its mental health condition deteriorates beyond a threshold, the Update System analyzes the chat history to identify potential causes. Using all historical profiles and the identified causes, the Update System then improves the Safeguard Agent's profile, completing the iterative training process.
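
A hypothetical sketch of this safeguard loop, with the same caveat: the analyzer outputs, the string-valued profile, and the two-point update threshold are illustrative stand-ins for what would be LLM-generated content and tuned settings in the real system:

```python
def safeguard_round(chat_window, profile):
    """Combine the three analyzer roles into one piece of advice for the character agent."""
    findings = {
        "emotion_watcher": f"user affect over the last {len(chat_window)} turns, given profile '{profile}'",
        "thought_refiner": "thought patterns the character should avoid reinforcing",
        "dialog_guide": "safer directions to steer the next replies toward",
    }
    # The Manager synthesizes the three outputs into a single instruction.
    return "; ".join(f"{role}: {note}" for role, note in findings.items())

def update_profile(profile, chat_history, pre_score, post_score, threshold=2):
    """If the post-chat assessment worsens past the threshold, fold likely causes back into the profile."""
    if post_score - pre_score >= threshold:
        causes = f"review of {len(chat_history)} turns suggested triggers to avoid"
        return profile + " | " + causes
    return profile

print(safeguard_round(["...last few turns..."], profile="baseline"))
print(update_profile("baseline", ["turn"] * 12, pre_score=6, post_score=10))
```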


