dcy0577 committed on
Commit 3180509 · verified · 1 Parent(s): 6600256

Fix: intersection_area & torch cpu device

1. The original intersection_area function was incorrect.
The left edge of the intersection rectangle (x1) is max(bbox_left[0], bbox_right[0]), and its top edge (y1) is max(bbox_left[1], bbox_right[1]).
The previous implementation used min for both (min(bbox_left[0], bbox_right[0]) and min(bbox_left[1], bbox_right[1])), so it overestimated the overlap and could report a positive area for boxes that do not intersect at all. The intersection_area function [within the remove_overlap_new function](https://github.com/microsoft/OmniParser/blob/5171b092483ab3e74ca50b9357e225f9f3571f18/util/utils.py#L242-L247) in the OmniParser GitHub repo uses the standard formula (max for the top-left corner, min for the bottom-right corner).
2. Running on CPU raised an error. The pretrained model is loaded in torch.float16, but the inputs were cast to torch.float16 only on CUDA and MPS, so on CPU the inputs kept their original dtype and no longer matched the model's.
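One way to avoid the mismatch described in item 2 without listing device types is to cast inputs to whatever dtype the model's parameters actually use. The sketch below is only an illustration: `cast_inputs_to_model` is a hypothetical helper that is not part of handler.py, and it assumes the inputs are a plain dict of tensors rather than the processor output used in the handler.

```python
from typing import Any, Dict

import torch


def cast_inputs_to_model(
    inputs: Dict[str, Any], model: torch.nn.Module, device: torch.device
) -> Dict[str, Any]:
    """Hypothetical helper: move tensors to `device` and cast floating-point
    tensors to the dtype of the model's parameters."""
    model_dtype = next(model.parameters()).dtype
    out: Dict[str, Any] = {}
    for name, value in inputs.items():
        if torch.is_tensor(value) and value.is_floating_point():
            # Pixel values and similar inputs must match the model dtype.
            out[name] = value.to(device=device, dtype=model_dtype)
        elif torch.is_tensor(value):
            # Integer tensors (token ids, masks) keep their dtype.
            out[name] = value.to(device=device)
        else:
            out[name] = value
    return out


# Stand-in for a model loaded in half precision (assumption for the demo).
model = torch.nn.Linear(4, 2).half()
batch = {"pixel_values": torch.rand(1, 4), "input_ids": torch.tensor([[1, 2]])}
batch = cast_inputs_to_model(batch, model, torch.device("cpu"))
print(batch["pixel_values"].dtype, batch["input_ids"].dtype)
```

With this approach the cast follows the model, so loading the model in a different precision would not require touching the device check.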

Files changed (1)
  1. handler.py +9 -4
handler.py CHANGED
@@ -296,7 +296,7 @@ class EndpointHandler:
             return_tensors="pt",
             do_resize=False,
         )
-        if self.device.type in {"cuda", "mps"}:
+        if self.device.type in {"cuda", "mps", "cpu"}:
             inputs = inputs.to(device=self.device, dtype=torch.float16)
 
         with torch.inference_mode():
@@ -352,9 +352,14 @@ def area(bbox: List[int]) -> int:
 
 
 def intersection_area(bbox_left: List[int], bbox_right: List[int]) -> int:
-    return max(
-        0, min(bbox_left[2], bbox_right[2]) - min(bbox_left[0], bbox_right[0])
-    ) * max(0, min(bbox_left[3], bbox_right[3]) - min(bbox_left[1], bbox_right[1]))
+    ix1 = max(bbox_left[0], bbox_right[0])
+    iy1 = max(bbox_left[1], bbox_right[1])
+    ix2 = min(bbox_left[2], bbox_right[2])
+    iy2 = min(bbox_left[3], bbox_right[3])
+
+    width = max(0, ix2 - ix1)
+    height = max(0, iy2 - iy1)
+    return width * height
 
 
 def intersection_over_union(bbox_left: List[int], bbox_right: List[int]) -> float:
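
The effect of the intersection_area change can be checked on two small cases. The sketch below copies the old and new formulas from the diff under the illustrative names `intersection_area_old` and `intersection_area_new` (these names are not in handler.py) and assumes boxes are given as [x1, y1, x2, y2].

```python
from typing import List


def intersection_area_old(bbox_left: List[int], bbox_right: List[int]) -> int:
    # Pre-fix formula: min() is used for the top-left corner, which is wrong.
    return max(
        0, min(bbox_left[2], bbox_right[2]) - min(bbox_left[0], bbox_right[0])
    ) * max(0, min(bbox_left[3], bbox_right[3]) - min(bbox_left[1], bbox_right[1]))


def intersection_area_new(bbox_left: List[int], bbox_right: List[int]) -> int:
    # Post-fix formula: max() for the top-left corner, min() for the bottom-right.
    ix1 = max(bbox_left[0], bbox_right[0])
    iy1 = max(bbox_left[1], bbox_right[1])
    ix2 = min(bbox_left[2], bbox_right[2])
    iy2 = min(bbox_left[3], bbox_right[3])
    return max(0, ix2 - ix1) * max(0, iy2 - iy1)


a = [0, 0, 10, 10]
overlapping = [5, 5, 15, 15]   # shares a 5 x 5 square with `a`
disjoint = [20, 20, 30, 30]    # does not touch `a` at all

print(intersection_area_old(a, overlapping))  # 100 (overestimate)
print(intersection_area_new(a, overlapping))  # 25
print(intersection_area_old(a, disjoint))     # 100 (should be 0)
print(intersection_area_new(a, disjoint))     # 0
```

In both cases the old formula returns the full area of the smaller-coordinate box, which would make the IoU of unrelated boxes look large.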