alpha31476 committed
Commit 632cf1e · verified · 1 parent: 0733b34

Image Audio Alignment: Train OpenCLIP With Features

This view is limited to 50 files because the commit contains too many changes; see the raw diff for the full set.
Files changed (50)
  1. .gitattributes +3 -0
  2. Vaani/.vscode/settings.json +4 -0
  3. Vaani/Img_Audio_Alignment/Train_OpenClip/_2.1.1_Train_OpenCLIP.py +0 -0
  4. Vaani/Img_Audio_Alignment/_2.1.1_Train_OpenCLIP.py +34 -16
  5. Vaani/Img_Audio_Alignment/_2.1.2_OpenCLIP_Image_Features.ipynb +0 -0
  6. Vaani/Img_Audio_Alignment/_2.1.2_OpenCLIP_Image_Features.py +0 -0
  7. Vaani/Img_Audio_Alignment/_2.1.2_Train_OpenCLIP.py +0 -0
  8. Vaani/Img_Audio_Alignment/_2_Image_data.ipynb +0 -0
  9. Vaani/Img_Audio_Alignment/available_img_audios_TEST3.csv +0 -0
  10. Vaani/Img_Audio_Alignment/available_img_audios_TRAIN3.csv +3 -0
  11. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/_2.1.2_Train_OpenCLIP.py +0 -0
  12. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/csip_best_epoch_539.pt +3 -0
  13. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/csip_best_epoch_567.pt +3 -0
  14. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/csip_epoch_572_best_epoch_-1.pt +3 -0
  15. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/csip_epoch_573_best_epoch_-1.pt +3 -0
  16. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_101_loss_4.1500.png +0 -0
  17. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_102_loss_4.1500.png +0 -0
  18. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_103_loss_4.1499.png +0 -0
  19. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_104_loss_4.1499.png +0 -0
  20. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_105_loss_4.1498.png +0 -0
  21. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_107_loss_4.1498.png +0 -0
  22. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_108_loss_4.1497.png +0 -0
  23. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_109_loss_4.1497.png +0 -0
  24. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_110_loss_4.1496.png +0 -0
  25. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_111_loss_4.1496.png +0 -0
  26. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_113_loss_4.1495.png +0 -0
  27. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_114_loss_4.1495.png +0 -0
  28. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_115_loss_4.1494.png +0 -0
  29. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_116_loss_4.1494.png +0 -0
  30. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_117_loss_4.1494.png +0 -0
  31. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_118_loss_4.1493.png +0 -0
  32. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_120_loss_4.1493.png +0 -0
  33. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_121_loss_4.1492.png +0 -0
  34. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_122_loss_4.1492.png +0 -0
  35. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_123_loss_4.1491.png +0 -0
  36. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_125_loss_4.1491.png +0 -0
  37. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_126_loss_4.1490.png +0 -0
  38. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_128_loss_4.1490.png +0 -0
  39. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_129_loss_4.1489.png +0 -0
  40. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_130_loss_4.1489.png +0 -0
  41. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_131_loss_4.1489.png +0 -0
  42. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_132_loss_4.1488.png +0 -0
  43. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_133_loss_4.1488.png +0 -0
  44. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_134_loss_4.1487.png +0 -0
  45. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_135_loss_4.1487.png +0 -0
  46. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_136_loss_4.1487.png +0 -0
  47. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_137_loss_4.1486.png +0 -0
  48. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_138_loss_4.1486.png +0 -0
  49. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_139_loss_4.1486.png +0 -0
  50. Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_140_loss_4.1485.png +0 -0
.gitattributes CHANGED
@@ -420,3 +420,6 @@ Vaani/Img_Audio_Alignment/checkpoints/csip/probs/probs_epoch_85_loss_4.1363.png
 Vaani/Img_Audio_Alignment/checkpoints/csip/probs/probs_epoch_8_loss_4.1522.png filter=lfs diff=lfs merge=lfs -text
 Vaani/Img_Audio_Alignment/checkpoints/csip/probs/probs_epoch_96_loss_4.1361.png filter=lfs diff=lfs merge=lfs -text
 Vaani/Img_Audio_Alignment/checkpoints/csip/probs/probs_epoch_9_loss_4.1521.png filter=lfs diff=lfs merge=lfs -text
+Vaani/Img_Audio_Alignment/available_img_audios_TRAIN3.csv filter=lfs diff=lfs merge=lfs -text
+Vaani/Img_Audio_Alignment/dog[[:space:]]and[[:space:]]cat.png filter=lfs diff=lfs merge=lfs -text
+Vaani/SDFT/SD2_1_Audio/dog[[:space:]]and[[:space:]]cat.png filter=lfs diff=lfs merge=lfs -text
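The three added lines above route large files through Git LFS; note how spaces in a tracked filename are written as `[[:space:]]` so the pattern field stays unbroken. A minimal illustrative sketch (not part of the commit; `lfs_attr_line` is a hypothetical helper) of how such a line is built, mirroring what `git lfs track` writes:

```python
# Illustrative only: build a .gitattributes line that routes a path through
# Git LFS, escaping spaces as "[[:space:]]" the way `git lfs track` does
# for files like "dog and cat.png".

def lfs_attr_line(path: str) -> str:
    """Return a .gitattributes LFS line for `path` (hypothetical helper)."""
    pattern = path.replace(" ", "[[:space:]]")  # a bare space would split the pattern field
    return f"{pattern} filter=lfs diff=lfs merge=lfs -text"

print(lfs_attr_line("Vaani/Img_Audio_Alignment/dog and cat.png"))
# → Vaani/Img_Audio_Alignment/dog[[:space:]]and[[:space:]]cat.png filter=lfs diff=lfs merge=lfs -text
```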
Vaani/.vscode/settings.json ADDED
@@ -0,0 +1,4 @@
+{
+    "editor.fontFamily": "JetBrains Mono Light",
+    "svg.preview.background": "black"
+}
Vaani/Img_Audio_Alignment/Train_OpenClip/_2.1.1_Train_OpenCLIP.py ADDED
The diff for this file is too large to render. See raw diff
 
Vaani/Img_Audio_Alignment/_2.1.1_Train_OpenCLIP.py CHANGED
@@ -5989,7 +5989,8 @@ def load_checkpoint(
5989
 
5990
  # /home/IITB/ai-at-ieor/23m1521/.conda/envs/openclip2/lib/python3.11/site-packages/open_clip/factory.py
5991
  HF_HUB_PREFIX = 'hf-hub:'
5992
- _MODEL_CONFIG_PATHS = [Path(__file__).parent / f"model_configs/"]
 
5993
  _MODEL_CONFIGS = {} # directory (model_name: config) of model architecture configs
5994
 
5995
  import json
@@ -6952,15 +6953,24 @@ class VaaniImageAudioDataset(torch.utils.data.Dataset):
6952
 
6953
  def __getitem__(self, idx):
6954
  return {
6955
- 'image_path': open_clip_imgaug(Image.open(self.image_paths[idx])),
6956
  'audio_path': self.audio_paths[idx]
6957
  }
6958
 
6959
 
6960
  def collate_fn(batch):
6961
- image_tensor = [item['image_path'] for item in batch]
6962
- audio_tensor = CLAPAudioProcessor([item['audio_path'] for item in batch], resample=True)
6963
- return {'image_tensor': torch.stack(image_tensor), 'audio_tensor': audio_tensor}
 
 
 
 
 
 
 
 
 
6964
 
6965
 
6966
 
@@ -6996,8 +7006,10 @@ test_dataloader = torch.utils.data.DataLoader(
6996
  )
6997
 
6998
  batch = next(iter(train_dataloader))
6999
- image_tensor_batch = batch['image_tensor']
7000
- audio_tensor_batch = batch['audio_tensor']
 
 
7001
  print("Image batch shape:", image_tensor_batch.shape) # [BATCH_SIZE, 3, 224, 224]
7002
  print("Audio batch shape:", audio_tensor_batch.shape) # [BATCH_SIZE, 1, 44100]
7003
 
@@ -7063,8 +7075,8 @@ def load_checkpoint(checkpoint_dir, model, optimizer, scheduler):
7063
  path = os.path.join(checkpoint_dir, best_ckpt)
7064
  checkpoint = torch.load(path)
7065
  model.load_state_dict(checkpoint['model_state'])
7066
- optimizer.load_state_dict(checkpoint['optimizer_state'])
7067
- scheduler.load_state_dict(checkpoint['scheduler_state'])
7068
  start_epoch = checkpoint['epoch']
7069
  best_loss = checkpoint['best_loss']
7070
  print(f"Resumed training from epoch {start_epoch+1} with best loss {best_loss:.4f}")
@@ -7187,7 +7199,8 @@ def train_model(model, train_loader, test_loader,
7187
  writer.add_scalar("Similarity/Test", avg_test_sim, epoch + 1)
7188
  writer.add_scalar("Learning Rate", current_lr, epoch + 1)
7189
 
7190
- print(f"Epoch {epoch+1} | Loss: ({avg_train_loss:.4f}, {avg_test_loss:.4f}) |"
 
7191
  f"LR: {current_lr:.2e} |"
7192
  f"Similarity: ({avg_train_sim:.4f}, {avg_test_sim:.4f})"
7193
  f"| I-Loss: ({avg_train_i_loss:.4f}, {avg_test_i_loss:.4f})"
@@ -7224,16 +7237,21 @@ def train_model(model, train_loader, test_loader,
7224
 
7225
 
7226
  model_name = "csip_model_openClip_CLAP"
 
 
7227
  learning_rate = 1e-5
7228
- epochs = 100
7229
  optimizer = torch.optim.AdamW(csip_model.parameters(), lr=learning_rate)
7230
  scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs, eta_min=1e-10)
7231
 
7232
- subprocess.run([
7233
- "rm",
7234
- "-rf",
7235
- f"/home/IITB/ai-at-ieor/23m1521/ashish/MTP/Vaani/Img_Audio_Alignment/{model_name}",
7236
- ])
 
 
 
 
7237
 
7238
  train_model(
7239
  model=csip_model,
 
5989
 
5990
  # /home/IITB/ai-at-ieor/23m1521/.conda/envs/openclip2/lib/python3.11/site-packages/open_clip/factory.py
5991
  HF_HUB_PREFIX = 'hf-hub:'
5992
+ # _MODEL_CONFIG_PATHS = [Path(__file__).parent / f"model_configs/"]
5993
+ _MODEL_CONFIG_PATHS = [Path("/home/IITB/ai-at-ieor/23m1521/ashish/MTP/Vaani/Img_Audio_Alignment/model_configs")]
5994
  _MODEL_CONFIGS = {} # directory (model_name: config) of model architecture configs
5995
 
5996
  import json
 
6953
 
6954
  def __getitem__(self, idx):
6955
  return {
6956
+ 'image_path': self.image_paths[idx],
6957
  'audio_path': self.audio_paths[idx]
6958
  }
6959
 
6960
 
6961
  def collate_fn(batch):
6962
+ image_tensor = [open_clip_imgaug(Image.open(item['image_path'])) for item in batch]
6963
+ image_paths = [item['image_path'] for item in batch]
6964
+
6965
+ audio_paths = [item['audio_path'] for item in batch]
6966
+ audio_tensor = CLAPAudioProcessor(audio_paths, resample=True)
6967
+
6968
+ return {
6969
+ 'image_tensor': torch.stack(image_tensor),
6970
+ 'image_paths': image_paths,
6971
+ 'audio_tensor': audio_tensor,
6972
+ 'audio_paths': audio_paths
6973
+ }
6974
 
6975
 
6976
 
 
7006
  )
7007
 
7008
  batch = next(iter(train_dataloader))
7009
+ image_tensor_batch = batch['image_tensor'].to(device=device)
7010
+ audio_tensor_batch = batch['audio_tensor'].to(device=device)
7011
+ image_paths_batch = batch['image_paths']
7012
+ audio_paths_batch = batch['audio_paths']
7013
  print("Image batch shape:", image_tensor_batch.shape) # [BATCH_SIZE, 3, 224, 224]
7014
  print("Audio batch shape:", audio_tensor_batch.shape) # [BATCH_SIZE, 1, 44100]
7015
 
 
7075
  path = os.path.join(checkpoint_dir, best_ckpt)
7076
  checkpoint = torch.load(path)
7077
  model.load_state_dict(checkpoint['model_state'])
7078
+ # optimizer.load_state_dict(checkpoint['optimizer_state'])
7079
+ # scheduler.load_state_dict(checkpoint['scheduler_state'])
7080
  start_epoch = checkpoint['epoch']
7081
  best_loss = checkpoint['best_loss']
7082
  print(f"Resumed training from epoch {start_epoch+1} with best loss {best_loss:.4f}")
 
7199
  writer.add_scalar("Similarity/Test", avg_test_sim, epoch + 1)
7200
  writer.add_scalar("Learning Rate", current_lr, epoch + 1)
7201
 
7202
+ print(f"\n\n |"
7203
+ f"Epoch {epoch+1} | Loss: ({avg_train_loss:.4f}, {avg_test_loss:.4f}, {best_loss:.4f}) |"
7204
  f"LR: {current_lr:.2e} |"
7205
  f"Similarity: ({avg_train_sim:.4f}, {avg_test_sim:.4f})"
7206
  f"| I-Loss: ({avg_train_i_loss:.4f}, {avg_test_i_loss:.4f})"
 
7237
 
7238
 
7239
  model_name = "csip_model_openClip_CLAP"
7240
+ epochs = 500
7241
+
7242
  learning_rate = 1e-5
 
7243
  optimizer = torch.optim.AdamW(csip_model.parameters(), lr=learning_rate)
7244
  scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs, eta_min=1e-10)
7245
 
7246
+ # learning_rate = 1e-3
7247
+ # optimizer = torch.optim.AdamW(csip_model.parameters(), lr=learning_rate)
7248
+ # scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2, eta_min=1e-5)
7249
+
7250
+ # subprocess.run([
7251
+ # "rm",
7252
+ # "-rf",
7253
+ # f"/home/IITB/ai-at-ieor/23m1521/ashish/MTP/Vaani/Img_Audio_Alignment/{model_name}",
7254
+ # ])
7255
 
7256
  train_model(
7257
  model=csip_model,
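The key dataset change in this diff moves image decoding out of `__getitem__` into `collate_fn`, and passes the raw paths through alongside the batched tensors. A minimal sketch of that pattern, with hypothetical plain-Python stand-ins (`fake_image_transform`, `fake_audio_processor`) in place of the project-specific `open_clip_imgaug` and `CLAPAudioProcessor`:

```python
# Sketch of the collate pattern from this commit: the dataset hands out only
# paths; the collate function decodes the batch and also returns the paths.
# The transform/processor below are fake stand-ins, not the real ones.

def fake_image_transform(path):                   # stand-in for open_clip_imgaug(Image.open(...))
    return [len(path)] * 3                        # pretend image "tensor"

def fake_audio_processor(paths, resample=True):   # stand-in for CLAPAudioProcessor
    return [[0.0]] * len(paths)                   # pretend audio "tensor"

class PathDataset:
    def __init__(self, image_paths, audio_paths):
        self.image_paths, self.audio_paths = image_paths, audio_paths
    def __getitem__(self, idx):
        # Cheap: only paths leave the dataset; decoding happens in collate_fn.
        return {'image_path': self.image_paths[idx],
                'audio_path': self.audio_paths[idx]}
    def __len__(self):
        return len(self.image_paths)

def collate_fn(batch):
    image_tensor = [fake_image_transform(item['image_path']) for item in batch]
    image_paths = [item['image_path'] for item in batch]
    audio_paths = [item['audio_path'] for item in batch]
    audio_tensor = fake_audio_processor(audio_paths, resample=True)
    return {'image_tensor': image_tensor, 'image_paths': image_paths,
            'audio_tensor': audio_tensor, 'audio_paths': audio_paths}

ds = PathDataset(['a.jpg', 'bb.jpg'], ['a.wav', 'bb.wav'])
batch = collate_fn([ds[0], ds[1]])
print(sorted(batch))  # ['audio_paths', 'audio_tensor', 'image_paths', 'image_tensor']
```

Keeping `__getitem__` trivial and batching the audio processor call over the whole list is a common way to let a vectorized processor do one call per batch instead of one per sample.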
Vaani/Img_Audio_Alignment/_2.1.2_OpenCLIP_Image_Features.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
Vaani/Img_Audio_Alignment/_2.1.2_OpenCLIP_Image_Features.py ADDED
The diff for this file is too large to render. See raw diff
 
Vaani/Img_Audio_Alignment/_2.1.2_Train_OpenCLIP.py ADDED
The diff for this file is too large to render. See raw diff
 
Vaani/Img_Audio_Alignment/_2_Image_data.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
Vaani/Img_Audio_Alignment/available_img_audios_TEST3.csv ADDED
The diff for this file is too large to render. See raw diff
 
Vaani/Img_Audio_Alignment/available_img_audios_TRAIN3.csv ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:719e5fef279658370c6f754a34c8bc1ba1af4fe259c46b53002b3932a1889e89
+size 17466633
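The CSV and `.pt` checkpoint files in this commit are stored as Git LFS pointers: three lines giving the spec version, the sha256 oid of the real blob, and its size in bytes. A sketch of reading one (`parse_lfs_pointer` is a hypothetical helper, following the format at git-lfs.github.com/spec/v1):

```python
# Parse a Git LFS pointer file (three "key value" lines) into a dict.
# Hypothetical helper, shown for illustration only.

def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")   # split on the first space only
        fields[key] = value
    return {
        'version': fields['version'],
        'oid': fields['oid'].removeprefix('sha256:'),  # hash of the real blob
        'size': int(fields['size']),                   # real file size in bytes
    }

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:719e5fef279658370c6f754a34c8bc1ba1af4fe259c46b53002b3932a1889e89
size 17466633
"""
info = parse_lfs_pointer(pointer)
print(info['size'])  # 17466633  (~17 MB CSV)
```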
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/_2.1.2_Train_OpenCLIP.py ADDED
The diff for this file is too large to render. See raw diff
 
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/csip_best_epoch_539.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b03f512670a8be47816cdb9132322ce8eddc5fe76c289ffd95dbd887a85bfddd
+size 2691054590
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/csip_best_epoch_567.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fd8f88bdc9e15be4d9af9d1378a8019072cf78a566dd6c78d2e113c96603a4c2
+size 2691054590
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/csip_epoch_572_best_epoch_-1.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b88cd9caf9c0049c9c5c97c120df6bfbfda1aa5d27aa23e0d06816503816226b
+size 2691066446
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/csip_epoch_573_best_epoch_-1.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2560edf996a13c7874a2e3fa461d0342b82aa4824f0e48f960e4c6ef487cc03d
+size 2691066446
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_101_loss_4.1500.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_102_loss_4.1500.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_103_loss_4.1499.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_104_loss_4.1499.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_105_loss_4.1498.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_107_loss_4.1498.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_108_loss_4.1497.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_109_loss_4.1497.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_110_loss_4.1496.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_111_loss_4.1496.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_113_loss_4.1495.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_114_loss_4.1495.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_115_loss_4.1494.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_116_loss_4.1494.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_117_loss_4.1494.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_118_loss_4.1493.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_120_loss_4.1493.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_121_loss_4.1492.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_122_loss_4.1492.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_123_loss_4.1491.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_125_loss_4.1491.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_126_loss_4.1490.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_128_loss_4.1490.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_129_loss_4.1489.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_130_loss_4.1489.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_131_loss_4.1489.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_132_loss_4.1488.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_133_loss_4.1488.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_134_loss_4.1487.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_135_loss_4.1487.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_136_loss_4.1487.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_137_loss_4.1486.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_138_loss_4.1486.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_139_loss_4.1486.png ADDED
Vaani/Img_Audio_Alignment/csip_model_openClip_CLAP/checkpoints/csip/logits/raw_logits_epoch_140_loss_4.1485.png ADDED