Spaces: ShesterG/TTIC-SHuBERT-ASLVideo-to-EnglishText

ShesterG committed on Jul 10

Commit 8ff74f5

1 Parent(s): 7987f3c

updated fps

Files changed (9)

__pycache__/body_features.cpython-38.pyc CHANGED

Binary files a/__pycache__/body_features.cpython-38.pyc and b/__pycache__/body_features.cpython-38.pyc differ

__pycache__/crop_face.cpython-38.pyc CHANGED

Binary files a/__pycache__/crop_face.cpython-38.pyc and b/__pycache__/crop_face.cpython-38.pyc differ

__pycache__/crop_hands.cpython-38.pyc CHANGED

Binary files a/__pycache__/crop_hands.cpython-38.pyc and b/__pycache__/crop_hands.cpython-38.pyc differ

__pycache__/dinov2_features.cpython-38.pyc CHANGED

Binary files a/__pycache__/dinov2_features.cpython-38.pyc and b/__pycache__/dinov2_features.cpython-38.pyc differ

__pycache__/inference.cpython-38.pyc CHANGED

Binary files a/__pycache__/inference.cpython-38.pyc and b/__pycache__/inference.cpython-38.pyc differ

__pycache__/kpe_mediapipe.cpython-38.pyc CHANGED

Binary files a/__pycache__/kpe_mediapipe.cpython-38.pyc and b/__pycache__/kpe_mediapipe.cpython-38.pyc differ

__pycache__/shubert.cpython-38.pyc CHANGED

Binary files a/__pycache__/shubert.cpython-38.pyc and b/__pycache__/shubert.cpython-38.pyc differ

app.py CHANGED

@@ -502,10 +502,11 @@ This app uses TTIC's foundation model SHuBERT (introduced in an ACL 2025 paper,
    **Requirements:**
    - We recommend that videos be under 20 seconds.  Performance for longer videos has not been tested.
-   - The signer should be the main part of the video. Videos recorded from a phone camera, tablet, or personal computer should work well. Studio recordings where the signer is farther from the camera may not work as well.
+   - The signer should be the main part (like 90% space-wise) of the video. Videos recorded from a phone camera, tablet, or personal computer should work well. Studio recordings where the signer is farther from the camera may not work as well.
    - Supported formats: MP4, MOV
    **Note:**
+   - This is just a demo of a research project, and should NOT be used to replace an interpreter in any way.
    - Videos will be deleted after the output is generated.
    - Inquires or Feedback? Please email us at [email protected]
    """
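The requirements in this help text (MP4 or MOV, ideally under 20 seconds) could also be checked before inference. The sketch below is only an illustration and is not part of this commit: `check_upload`, its parameters, and the warning strings are hypothetical names, and the two constants are simply copied from the help text above.

```python
from pathlib import Path

# Values copied from the app's help text; everything else here is hypothetical.
SUPPORTED_SUFFIXES = {".mp4", ".mov"}
RECOMMENDED_MAX_SECONDS = 20.0


def check_upload(filename, num_frames, fps):
    """Return a list of warnings for a clip; an empty list means no issues found."""
    warnings = []
    # Compare the extension case-insensitively, so "clip.MOV" is accepted.
    if Path(filename).suffix.lower() not in SUPPORTED_SUFFIXES:
        warnings.append("unsupported format")
    if fps <= 0:
        # Duration cannot be derived without a positive frame rate.
        warnings.append("unknown frame rate")
    elif num_frames / fps >= RECOMMENDED_MAX_SECONDS:
        # The help text recommends clips *under* 20 seconds.
        warnings.append("longer than recommended")
    return warnings
```

The frame count and frame rate are passed in as plain numbers so the check does not depend on any particular video library.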

features.py CHANGED

@@ -51,8 +51,9 @@ class SHuBERTProcessor:
         signer_video = decord.VideoReader(video_path)
         signer_video_fps = signer_video.get_avg_fps()
-        target_fps = 12
-        stride = max(1, int(round(signer_video_fps / target_fps)))
+        # target_fps = 12
+        # stride = max(1, int(round(signer_video_fps / target_fps)))
+        stride = 1
         index_list = list(range(0, len(signer_video), stride))
         signer_video = signer_video.get_batch(index_list)
         signer_video = signer_video.asnumpy()
@@ -106,7 +107,7 @@ if __name__ == "__main__":
     # input_clip = "/share/data/pals/shester/datasets/openasl/clips_bbox/J-0KHhPS_m4.029676-029733.mp4"
     # input_clip = "/share/data/pals/shester/inference/recordings/sabrin30fps.mp4"
-    input_clip = "/share/data/pals/shester/inference/recordings/sabrina30fps.mp4"
+    input_clip = "/share/data/pals/shester/inference/recordings/sample_sabrina.mp4"
     processor = SHuBERTProcessor(config)
     output_text = processor.process_video(input_clip)
     print(f"The English translation is: {output_text}")
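To make the effect of the stride change concrete, the index arithmetic from the first hunk can be isolated from `decord`. This is a minimal sketch, not code from the repository: `frame_indices` and its `target_fps=None` convention are hypothetical, and only the two stride expressions are taken from the diff.

```python
def frame_indices(num_frames, source_fps, target_fps=None):
    """Return the indices of the frames that would be kept.

    target_fps=None mirrors the new code path (stride = 1, keep every frame).
    A numeric target_fps mirrors the removed lines (subsample toward that rate).
    """
    if target_fps is None:
        stride = 1
    else:
        # Same expression as the removed line; max(1, ...) guards against a
        # stride of 0 when the source rate is below the target rate.
        stride = max(1, int(round(source_fps / target_fps)))
    return list(range(0, num_frames, stride))
```

With the old setting, a 60 fps clip gets a stride of 5 and is sampled at exactly 12 fps. A 30 fps clip gets a stride of 2, because Python 3 rounds 2.5 to the nearest even integer, so it is sampled at 15 fps rather than 12. With `stride = 1`, every frame is kept regardless of the source frame rate.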