nielsr/yolov10l

This model has been pushed to the Hub using the PyTorchModelHubMixin integration.

Installation

First, install the YOLOv10 GitHub repository along with supervision, which provides utilities for bounding box processing.

pip install git+https://github.com/nielsrogge/yolov10.git@feature/add_hf supervision

Usage

One can perform inference as follows:

import numpy as np
import requests
import supervision as sv
from PIL import Image
from ultralytics import YOLOv10

# load model
model = YOLOv10.from_pretrained("nielsr/yolov10l")

# load image
url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
image = Image.open(requests.get(url, stream=True).raw)
image = np.array(image)

# perform inference
results = model(source=image, conf=0.25, verbose=False)[0]
detections = sv.Detections.from_ultralytics(results)
box_annotator = sv.BoxAnnotator()

category_dict = {
    0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus',
    6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant',
    11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat',
    16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear',
    22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag',
    27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard',
    32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove',
    36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle',
    40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl',
    46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli',
    51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake',
    56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table',
    61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard',
    67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink',
    72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors',
    77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'
}

labels = [
    f"{category_dict[class_id]} {confidence:.2f}"
    for class_id, confidence in zip(detections.class_id, detections.confidence)
]
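# NOTE: passing `labels` here matches the supervision release this example was
# written against; newer releases may expect a separate sv.LabelAnnotator for the text.
annotated_image = box_annotator.annotate(
    image.copy(), detections=detections, labels=labels
)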

Image.fromarray(annotated_image)

This shows the following:

https://cdn-uploads.huggingface.co/production/uploads/5f1158120c833276f61f1a84/IL9mL4_WUdcSxRQ7AsrTT.png
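
The label-building step in the usage example looks up each class ID in a hand-written dictionary and fails with a KeyError for any ID that is missing. A minimal sketch of the same step as a reusable helper follows. The name `format_labels` and the sample values are hypothetical and only illustrate the idea; they are not part of ultralytics or supervision. With real inference output, one could likely pass the ID-to-name mapping that Ultralytics attaches to each result (`results.names`) instead of a hand-written dictionary, assuming this fork keeps that attribute.

```python
def format_labels(class_ids, confidences, names):
    """Return one "<class name> <confidence>" string per detection."""
    labels = []
    for class_id, confidence in zip(class_ids, confidences):
        # Fall back to the raw ID when the mapping has no entry for it.
        name = names.get(int(class_id), str(int(class_id)))
        labels.append(f"{name} {float(confidence):.2f}")
    return labels


# Hypothetical values standing in for detections.class_id and detections.confidence.
sample_names = {15: "cat", 65: "remote"}
sample_labels = format_labels([15, 65, 99], [0.91, 0.5, 0.3], sample_names)
print(sample_labels)
```

In the usage example, the equivalent call would be `format_labels(detections.class_id, detections.confidence, category_dict)`.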

BibTeX Entry and Citation Info

@misc{wang2024yolov10,
     title={YOLOv10: Real-Time End-to-End Object Detection}, 
     author={Ao Wang and Hui Chen and Lihao Liu and Kai Chen and Zijia Lin and Jungong Han and Guiguang Ding},
     year={2024},
     eprint={2405.14458},
     archivePrefix={arXiv},
     primaryClass={cs.CV}
}