Add GitHub repo

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +14 -11
README.md CHANGED
@@ -1,13 +1,4 @@
  ---
- license: bsd-3-clause
- pipeline_tag: feature-extraction
- tags:
- - automatic-speech-recognition
- - audio-classification
- - audio
- - speech
- - music
- library_name: transformers
  datasets:
  - openslr/librispeech_asr
  - facebook/multilingual_librispeech
@@ -17,7 +8,19 @@ datasets:
  - agkphysics/AudioSet
  language:
  - en
  ---
  # USAD: Universal Speech and Audio Representation via Distillation

  **Universal Speech and Audio Distillation (USAD)** is a unified **speech**, **sound**, and **music** encoder distilled from domain-specific teachers.
@@ -25,6 +28,7 @@ Trained on 126k hours of mixed data, USAD delivers competitive performance acros

  [👀 **Read Full Paper**](https://arxiv.org/abs/2506.18843)

  ---

  ## 🗂️ Models
@@ -39,7 +43,6 @@ USAD models are all transformer encoders operating at **50Hz frame rate**. The t

  ---

-
  ## 🚀 How To Use

  **Installation**
@@ -89,4 +92,4 @@ See [usad_model.py](https://huggingface.co/MIT-SLS/USAD-Small/blob/main/usad_mod

  ## 🙏 Acknowledgement

- Our implementation is based on the awesome [facebookresearch/fairseq](https://github.com/facebookresearch/fairseq), [cwx-worst-one/EAT](https://github.com/cwx-worst-one/EAT), and [sooftware/conformer](https://github.com/sooftware/conformer) repositories.
 
  ---
  datasets:
  - openslr/librispeech_asr
  - facebook/multilingual_librispeech
 
  - agkphysics/AudioSet
  language:
  - en
+ library_name: transformers
+ license: bsd-3-clause
+ pipeline_tag: feature-extraction
+ tags:
+ - automatic-speech-recognition
+ - audio-classification
+ - audio
+ - speech
+ - music
+ - distillation
+ - audio-representation
  ---
+
  # USAD: Universal Speech and Audio Representation via Distillation

  **Universal Speech and Audio Distillation (USAD)** is a unified **speech**, **sound**, and **music** encoder distilled from domain-specific teachers.
 

  [👀 **Read Full Paper**](https://arxiv.org/abs/2506.18843)

+ Code: [https://github.com/MIT-SLS/universal_audio_representation](https://github.com/MIT-SLS/universal_audio_representation)
  ---

  ## 🗂️ Models
 

  ---

  ## 🚀 How To Use

  **Installation**
 

  ## 🙏 Acknowledgement

+ Our implementation is based on the awesome [facebookresearch/fairseq](https://github.com/facebookresearch/fairseq), [cwx-worst-one/EAT](https://github.com/cwx-worst-one/EAT), and [sooftware/conformer](https://github.com/sooftware/conformer) repositories.