synthetic-speaker-jpko

This model is a fine-tuned version of pyannote/segmentation-3.0 on the objects76/synthetic-ja_ko-speaker-overlap-3200 dataset. It achieves the following results on the evaluation set:

Loss: 0.1189
Model Preparation Time: 0.0017
DER (diarization error rate): 0.0382
False Alarm: 0.0128
Missed Detection: 0.0191
Confusion: 0.0064
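
The diarization error rate is conventionally the sum of the false alarm, missed detection, and speaker confusion rates. The sketch below checks that the reported components are consistent with the reported DER. It assumes that this decomposition applies to the numbers above and that each value was rounded to four decimals independently; the tolerance is an illustrative choice.

```python
# Reported evaluation metrics, copied from the list above.
false_alarm = 0.0128
missed_detection = 0.0191
confusion = 0.0064
reported_der = 0.0382

# Assumption: DER = false alarm + missed detection + confusion.
component_sum = false_alarm + missed_detection + confusion

# Each component is rounded to 4 decimals, so the sum may differ from
# the reported DER by a unit or two in the last digit.
tolerance = 2e-4
is_consistent = abs(component_sum - reported_der) <= tolerance

print(f"sum of components: {component_sum:.4f}")
print(f"reported DER:      {reported_der:.4f}")
print(f"consistent within rounding: {is_consistent}")
```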

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 768
eval_batch_size: 768
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
num_epochs: 70
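
For intuition about the cosine schedule, the sketch below computes the learning rate at a few optimizer steps. It assumes no warmup and a single half-cosine decay to zero over 770 total steps (70 epochs at 11 steps each, as logged in the training results); the card does not state any warmup settings. `cosine_lr` is a hypothetical helper written for this illustration, not a function from Transformers.

```python
import math

BASE_LR = 0.001      # learning_rate from the list above
TOTAL_STEPS = 770    # assumption: 70 epochs x 11 steps, no warmup


def cosine_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Half-cosine decay from base_lr at step 0 to 0 at total_steps."""
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Start of training, end of epoch 1, midpoint, and final step.
for step in (0, 11, 385, 770):
    print(f"step {step:3d}: lr = {cosine_lr(step):.6f}")
```

Under these assumptions the rate is halved at the midpoint of training and decays most slowly near the start and the end, which fits the nearly flat validation metrics in the last few epochs.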

Training results

Training Loss	Epoch	Step	Validation Loss	Model Preparation Time	DER	False Alarm	Missed Detection	Confusion
No log	1.0	11	0.4459	0.0017	0.1267	0.0342	0.0530	0.0394
No log	2.0	22	0.3200	0.0017	0.0939	0.0184	0.0507	0.0247
0.547	3.0	33	0.2593	0.0017	0.0794	0.0255	0.0359	0.0180
0.547	4.0	44	0.2373	0.0017	0.0734	0.0197	0.0367	0.0170
0.2434	5.0	55	0.2270	0.0017	0.0720	0.0240	0.0312	0.0168
0.2434	6.0	66	0.2117	0.0017	0.0669	0.0204	0.0321	0.0145
0.198	7.0	77	0.2029	0.0017	0.0653	0.0230	0.0282	0.0141
0.198	8.0	88	0.1998	0.0017	0.0646	0.0211	0.0282	0.0152
0.198	9.0	99	0.1927	0.0017	0.0614	0.0192	0.0290	0.0132
0.1821	10.0	110	0.1872	0.0017	0.0601	0.0212	0.0261	0.0128
0.1821	11.0	121	0.1806	0.0017	0.0583	0.0200	0.0257	0.0125
0.1657	12.0	132	0.1705	0.0017	0.0554	0.0192	0.0248	0.0113
0.1657	13.0	143	0.1674	0.0017	0.0538	0.0163	0.0261	0.0115
0.157	14.0	154	0.1651	0.0017	0.0535	0.0175	0.0245	0.0115
0.157	15.0	165	0.1571	0.0017	0.0502	0.0146	0.0269	0.0088
0.1494	16.0	176	0.1533	0.0017	0.0500	0.0181	0.0230	0.0088
0.1494	17.0	187	0.1521	0.0017	0.0481	0.0125	0.0268	0.0087
0.1494	18.0	198	0.1492	0.0017	0.0481	0.0179	0.0215	0.0087
0.1573	19.0	209	0.1440	0.0017	0.0466	0.0132	0.0252	0.0083
0.1573	20.0	220	0.1440	0.0017	0.0467	0.0159	0.0223	0.0085
0.1346	21.0	231	0.1432	0.0017	0.0465	0.0147	0.0232	0.0086
0.1346	22.0	242	0.1409	0.0017	0.0458	0.0159	0.0213	0.0085
0.1249	23.0	253	0.1391	0.0017	0.0451	0.0147	0.0224	0.0080
0.1249	24.0	264	0.1359	0.0017	0.0441	0.0142	0.0221	0.0078
0.1263	25.0	275	0.1356	0.0017	0.0440	0.0149	0.0212	0.0079
0.1263	26.0	286	0.1369	0.0017	0.0444	0.0141	0.0223	0.0080
0.1263	27.0	297	0.1332	0.0017	0.0434	0.0144	0.0212	0.0078
0.1249	28.0	308	0.1323	0.0017	0.0435	0.0132	0.0225	0.0078
0.1249	29.0	319	0.1304	0.0017	0.0432	0.0145	0.0214	0.0074
0.1189	30.0	330	0.1288	0.0017	0.0427	0.0137	0.0221	0.0069
0.1189	31.0	341	0.1277	0.0017	0.0425	0.0139	0.0223	0.0063
0.1152	32.0	352	0.1262	0.0017	0.0418	0.0140	0.0212	0.0066
0.1152	33.0	363	0.1248	0.0017	0.0415	0.0142	0.0208	0.0065
0.1152	34.0	374	0.1251	0.0017	0.0413	0.0133	0.0216	0.0065
0.1197	35.0	385	0.1230	0.0017	0.0406	0.0139	0.0201	0.0066
0.1197	36.0	396	0.1217	0.0017	0.0404	0.0136	0.0202	0.0065
0.1094	37.0	407	0.1209	0.0017	0.0400	0.0135	0.0202	0.0063
0.1094	38.0	418	0.1227	0.0017	0.0406	0.0127	0.0213	0.0067
0.1023	39.0	429	0.1229	0.0017	0.0408	0.0132	0.0208	0.0068
0.1023	40.0	440	0.1221	0.0017	0.0406	0.0135	0.0206	0.0066
0.1046	41.0	451	0.1198	0.0017	0.0394	0.0138	0.0192	0.0064
0.1046	42.0	462	0.1188	0.0017	0.0389	0.0131	0.0195	0.0063
0.1046	43.0	473	0.1196	0.0017	0.0392	0.0141	0.0186	0.0066
0.1195	44.0	484	0.1221	0.0017	0.0394	0.0117	0.0209	0.0068
0.1195	45.0	495	0.1241	0.0017	0.0402	0.0125	0.0208	0.0068
0.1062	46.0	506	0.1235	0.0017	0.0401	0.0152	0.0181	0.0068
0.1062	47.0	517	0.1223	0.0017	0.0399	0.0140	0.0191	0.0068
0.1155	48.0	528	0.1213	0.0017	0.0392	0.0113	0.0211	0.0068
0.1155	49.0	539	0.1203	0.0017	0.0391	0.0127	0.0196	0.0067
0.1036	50.0	550	0.1200	0.0017	0.0388	0.0133	0.0190	0.0066
0.1036	51.0	561	0.1201	0.0017	0.0388	0.0123	0.0200	0.0065
0.1036	52.0	572	0.1208	0.0017	0.0391	0.0122	0.0203	0.0066
0.1062	53.0	583	0.1216	0.0017	0.0397	0.0125	0.0206	0.0066
0.1062	54.0	594	0.1213	0.0017	0.0394	0.0133	0.0195	0.0065
0.0989	55.0	605	0.1211	0.0017	0.0393	0.0138	0.0190	0.0066
0.0989	56.0	616	0.1206	0.0017	0.0390	0.0137	0.0187	0.0066
0.1015	57.0	627	0.1198	0.0017	0.0388	0.0128	0.0194	0.0065
0.1015	58.0	638	0.1194	0.0017	0.0386	0.0122	0.0199	0.0065
0.1015	59.0	649	0.1193	0.0017	0.0385	0.0122	0.0198	0.0065
0.0971	60.0	660	0.1193	0.0017	0.0384	0.0124	0.0196	0.0065
0.0971	61.0	671	0.1192	0.0017	0.0384	0.0129	0.0192	0.0063
0.0964	62.0	682	0.1191	0.0017	0.0383	0.0131	0.0189	0.0063
0.0964	63.0	693	0.1191	0.0017	0.0383	0.0129	0.0190	0.0064
0.0959	64.0	704	0.1190	0.0017	0.0383	0.0129	0.0190	0.0064
0.0959	65.0	715	0.1189	0.0017	0.0382	0.0128	0.0191	0.0064
0.0956	66.0	726	0.1189	0.0017	0.0382	0.0128	0.0191	0.0064
0.0956	67.0	737	0.1189	0.0017	0.0382	0.0127	0.0191	0.0064
0.0956	68.0	748	0.1189	0.0017	0.0382	0.0127	0.0191	0.0064
0.0979	69.0	759	0.1189	0.0017	0.0382	0.0128	0.0191	0.0064
0.0979	70.0	770	0.1189	0.0017	0.0382	0.0128	0.0191	0.0064
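
One simple way to pick a checkpoint from a log like this is to select the epoch with the lowest validation loss or the lowest DER. The sketch below does that on a few rows excerpted from the table above. The `EvalRow` type and the `rows` list are illustrative names, and the excerpt is not the full log, so the full table would need to be loaded to repeat the comparison over every epoch.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    val_loss: float
    der: float


# Excerpt of the table above (epoch, validation loss, DER); not the full log.
rows = [
    EvalRow(1, 0.4459, 0.1267),
    EvalRow(10, 0.1872, 0.0601),
    EvalRow(20, 0.1440, 0.0467),
    EvalRow(30, 0.1288, 0.0427),
    EvalRow(42, 0.1188, 0.0389),
    EvalRow(50, 0.1200, 0.0388),
    EvalRow(60, 0.1193, 0.0384),
    EvalRow(70, 0.1189, 0.0382),
]

# min() returns the first row with the smallest value of the chosen metric.
best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_der = min(rows, key=lambda r: r.der)

print(f"lowest validation loss: epoch {best_by_loss.epoch} ({best_by_loss.val_loss:.4f})")
print(f"lowest DER:             epoch {best_by_der.epoch} ({best_by_der.der:.4f})")
```

In these rows the two criteria disagree slightly: validation loss is lowest at epoch 42, while DER is lowest at epoch 70.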

Framework versions

Transformers 4.50.3
PyTorch 2.6.0+cu124
Datasets 3.5.0
Tokenizers 0.21.1

Model repository: objects76/synthetic-jp_ko-2.25sec-250408_0953
