WpythonW committed on
Commit 7ad4ec8 · verified · 1 Parent(s): b743507

Update README.md

Files changed (1)
  1. README.md +2 -48
README.md CHANGED
@@ -57,42 +57,6 @@ This model is a binary classification head fine-tuned version of [MIT/ast-finetu
  - **Output**: Probabilities [fake_prob, real_prob]
  - **Training Hardware**: 2x NVIDIA T4 GPUs
 
- ## Training Configuration
-
- ```python
- {
- 'learning_rate': 1e-5,
- 'weight_decay': 0.01,
- 'n_iterations': 1500,
- 'batch_size': 16,
- 'gradient_accumulation_steps': 8,
- 'validate_every': 500,
- 'val_samples': 5000
- }
- ```
-
- ## Dataset Distribution
-
- The model was trained on a filtered dataset with the following class distribution:
-
- ```
- Training Set:
- - Fake Audio (0): 29,089 samples (53.97%)
- - Real Audio (1): 24,813 samples (46.03%)
-
- Test Set:
- - Fake Audio (0): 7,229 samples (53.64%)
- - Real Audio (1): 6,247 samples (46.36%)
- ```
-
- ## Model Performance
-
- Final metrics on validation set:
- - Accuracy: 0.9662 (96.62%)
- - F1 Score: 0.9710 (97.10%)
- - Precision: 0.9692 (96.92%)
- - Recall: 0.9728 (97.28%)
-
  # Usage Guide
 
  ## Model Usage
@@ -167,16 +131,6 @@ for filename, probs in zip(audio_files, probabilities):
  ## Limitations
 
  Important considerations when using this model:
- 1. The model works best with 16kHz audio input
+ 1. The model works with 16kHz audio input
  2. Performance may vary with different types of audio manipulation not present in training data
- 3. Very short audio clips (<1 second) might not provide reliable results
- 4. The model should not be used as the sole determiner for real/fake audio detection
-
- ## Training Details
-
- The training process involved:
- 1. Loading the base AST model pretrained on AudioSet
- 2. Replacing the classification head with a binary classifier
- 3. Fine-tuning on the fake audio detection dataset for 1500 iterations
- 4. Using gradient accumulation (8 steps) with batch size 16
- 5. Implementing validation checks every 500 steps
+ 3. The model was trained on audio samples ranging from 4 to 10 seconds in duration.
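
The updated Limitations list names two input constraints: 16kHz audio and training clips of 4 to 10 seconds. A caller may want to check a clip against them before running inference. The following is a minimal sketch of such a check, not part of the model's code: `check_clip` and the three constants are hypothetical names introduced here, and treating the 4 to 10 second training range as a warning threshold is an assumption, since the README does not say how the model behaves outside it.

```python
from typing import List

# Values taken from the Limitations list; all names here are hypothetical.
EXPECTED_SAMPLE_RATE = 16_000  # Hz
MIN_DURATION_S = 4.0           # shortest clip length seen in training
MAX_DURATION_S = 10.0          # longest clip length seen in training


def check_clip(num_samples: int, sample_rate: int) -> List[str]:
    """Return warnings for a clip; an empty list means nothing was flagged."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")

    warnings = []
    if sample_rate != EXPECTED_SAMPLE_RATE:
        warnings.append(
            f"sample rate is {sample_rate} Hz, expected {EXPECTED_SAMPLE_RATE} Hz"
        )

    # Duration is derived from the raw sample count, so it can be checked
    # before any resampling.
    duration = num_samples / sample_rate
    if not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        warnings.append(
            f"duration is {duration:.2f} s, outside the "
            f"{MIN_DURATION_S:.0f} to {MAX_DURATION_S:.0f} s training range"
        )
    return warnings
```

A clip that returns an empty list matches the stated conditions. Any returned warning suggests resampling or trimming the clip first, or treating the prediction with more caution.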