TheDenk committed on
Commit a285a9b · verified · 1 Parent(s): dc550b0

Update README.md

  1. README.md +3 -1
README.md CHANGED
@@ -14,6 +14,8 @@ library_name: transformers
  # Qwen2.5-VL-3B-TrackAnyObject-LoRa-v1


+ <video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/63fde49f6315a264aba6a7ed/cPo3S-tuu3UgV9_aIOhU1.mp4"></video>
+
  ## Introduction
  Qwen2.5-VL was not originally trained for object tracking tasks. While it can perform object detection on individual frames or across video inputs, processing N frames sequentially results in identical predictions for each frame. Consequently, the model cannot maintain consistent object IDs across predictions.
  We provide a LoRA adapter for Qwen2.5-VL-3B that enables object tracking capabilities.
@@ -117,7 +119,7 @@ device = "cuda"
  objects_for_tracking = "person" ## "person, cat", "person, cat, dog"

  ## Load video and convert to numpy array of shape (num_frames, height, width, channels)
- video, fps = read_video(video_path="bear.mp4", start_frame=0, frames_count=16, max_side=896)
+ video, fps = read_video(video_path="path to video.mp4", start_frame=0, frames_count=16, max_side=896)
  ```

  ### Run inference
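
The changed line calls `read_video` with `start_frame`, `frames_count`, and `max_side`, but the helper itself is not part of this diff. As a rough illustration of what those arguments plausibly control, here is a minimal sketch. It assumes that `max_side` caps the longer image side while preserving aspect ratio, and that the frame window is clipped to the video length. `plan_video_read` is a hypothetical name, not a function from the README.

```python
def plan_video_read(total_frames, width, height,
                    start_frame=0, frames_count=16, max_side=896):
    """Return (frame_indices, (new_width, new_height)) for a clip.

    Hypothetical helper: the semantics of max_side are an assumption
    (longer side capped, aspect ratio kept, no upscaling).
    """
    if start_frame < 0 or frames_count <= 0:
        raise ValueError("start_frame must be >= 0 and frames_count > 0")

    # Clip the requested window to the frames that actually exist.
    stop = min(start_frame + frames_count, total_frames)
    indices = list(range(start_frame, stop))

    # Scale down only when the longer side exceeds max_side.
    scale = min(1.0, max_side / max(width, height))
    new_size = (max(1, round(width * scale)), max(1, round(height * scale)))
    return indices, new_size


indices, size = plan_video_read(total_frames=300, width=1920, height=1080)
short_indices, short_size = plan_video_read(total_frames=10, width=640, height=480,
                                            start_frame=5)
```

Under these assumptions, a 1920x1080 source with the arguments from the diff yields 16 frame indices and a target size whose longer side is 896. A video shorter than the requested window returns fewer frames, so downstream code should not assume exactly `frames_count` frames.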