ITACLIP: Boosting Training-Free Semantic Segmentation with Image, Text, and Architectural Enhancements Paper • 2411.12044 • Published Nov 18, 2024 • 14
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Paper • 2409.11355 • Published Sep 17, 2024 • 31
InstrumentGen: Generating Sample-Based Musical Instruments From Text Paper • 2311.04339 • Published Nov 7, 2023
Distortion Audio Effects: Learning How to Recover the Clean Signal Paper • 2202.01664 • Published Feb 3, 2022
Generating Sample-Based Musical Instruments Using Neural Audio Codec Language Models Paper • 2407.15641 • Published Jul 22, 2024
DSP-informed bandwidth extension using locally-conditioned excitation and linear time-varying filter subnetworks Paper • 2407.15624 • Published Jul 22, 2024
Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech Paper • 2210.13397 • Published Oct 24, 2022
Unsupervised Pre-Training for Vietnamese Automatic Speech Recognition in the HYKIST Project Paper • 2309.15869 • Published Sep 26, 2023
Real-time Speech Summarization for Medical Conversations Paper • 2406.15888 • Published Jun 22, 2024 • 1
VietMed: A Dataset and Benchmark for Automatic Speech Recognition of Vietnamese in the Medical Domain Paper • 2404.05659 • Published Apr 8, 2024 • 2