StresSLM

StresSLM is an audio-text-to-text model fine-tuned with LoRA adapters on top of the Qwen/Qwen2-Audio-7B-Instruct base model. It is designed to tackle Sentence Stress Detection (SSD) and Sentence Stress Reasoning (SSR) tasks on the StressTest benchmark. StresSLM predicts stress patterns and reasoning based on spoken audio. For more information, see our paper and code:

πŸ“ƒ StressTest Paper | πŸ’» Code | πŸ€— StressTest Dataset


Usage

This model can be loaded using the HuggingFace Transformers library:

from transformers import AutoProcessor, Qwen2AudioForConditionalGeneration
from peft import PeftModel, PeftConfig

# Load processor
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-Audio-7B-Instruct")

# Load LoRA config and base model
peft_config = PeftConfig.from_pretrained("slprl/StresSLM")
base_model = Qwen2AudioForConditionalGeneration.from_pretrained(peft_config.base_model_name_or_path)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "slprl/StresSLM")

Tasks

  • Sentence Stress Detection (SSD): Identify stressed words in an utterance.
  • Sentence Stress Reasoning (SSR): Reason about the speaker’s intention using stress patterns.

For evaluation scripts and benchmarks, refer to the StressTest GitHub repository.


πŸ“– Citation

If you use this model, please cite:

@misc{yosha2025stresstest,
      title={StressTest: Can YOUR Speech LM Handle the Stress?},
      author={Iddo Yosha and Gallil Maimon and Yossi Adi},
      year={2025},
      eprint={2505.22765},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.22765},
}
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for slprl/StresSLM

Adapter
(7)
this model

Collection including slprl/StresSLM