taejinp committed on
Commit d2207a6 · verified · 1 Parent(s): ca79a82

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -196,7 +196,7 @@ This model accepts single-channel (mono) audio sampled at 16,000 Hz.
  The output of the model is a T x S matrix, where:
  - S is the maximum number of speakers (in this model, S = 4).
  - T is the total number of frames, including zero-padding. Each frame corresponds to a segment of 0.08 seconds of audio.
- Each element of the T x S matrix represents the speaker activity probability in the [0, 1] range. For example, a matrix element a(150, 2) = 0.95 indicates a 95% probability of activity for the second speaker during the time range [12.00, 12.08] seconds.
+ - Each element of the T x S matrix represents the speaker activity probability in the [0, 1] range. For example, a matrix element a(150, 2) = 0.95 indicates a 95% probability of activity for the second speaker during the time range [12.00, 12.08] seconds.
 
 
  ## Train and evaluate Sortformer diarizer using NeMo
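
Reviewer note on the output format described in the hunk above: the T x S matrix can be turned into time ranges with a few lines of NumPy. The sketch below is not taken from NeMo. The helper names `frame_time_range` and `active_segments`, the 0.5 threshold, and the 0-based indexing of both frames and speaker columns are assumptions made for illustration. Under 0-based frame indexing, frame 150 starts at 150 × 0.08 = 12.00 s, which matches the time range quoted in the README example; the README does not say which indexing convention it uses.

```python
import numpy as np

FRAME_SEC = 0.08  # frame duration stated in the README


def frame_time_range(t, frame_sec=FRAME_SEC):
    """Return (start, end) in seconds for frame index t (0-based, an assumption)."""
    return round(t * frame_sec, 2), round((t + 1) * frame_sec, 2)


def active_segments(probs, threshold=0.5, frame_sec=FRAME_SEC):
    """Convert a T x S activity-probability matrix into per-speaker segments.

    Returns a dict mapping the speaker column index (0-based) to a list of
    (start_sec, end_sec) tuples. The threshold is a hypothetical choice.
    """
    probs = np.asarray(probs)
    segments = {}
    for s in range(probs.shape[1]):
        active = probs[:, s] > threshold
        # Pad with False so every active run has a rising and a falling edge.
        padded = np.concatenate(([False], active, [False]))
        edges = np.flatnonzero(padded[1:] != padded[:-1])
        starts, ends = edges[0::2], edges[1::2]  # end index is exclusive
        segments[s] = [
            (round(float(a) * frame_sec, 2), round(float(b) * frame_sec, 2))
            for a, b in zip(starts, ends)
        ]
    return segments


# Toy input: 200 frames, 4 speakers, one speaker active for three frames.
probs = np.zeros((200, 4))
probs[150:153, 1] = 0.95
print(frame_time_range(150))
print(active_segments(probs)[1])
```

Real diarization pipelines usually add smoothing and minimum-duration rules on top of plain thresholding; this sketch leaves them out to keep the frame-to-time mapping visible.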