Here is a series of pruned and quantized YOLOv8 models that I generated for my master's thesis.
Here are my main results:
As you can see, accuracy drops quickly as the pruning ratio increases (the pruning ratio is shown by the black label next to each point on the graph). Quantization and input-resolution reduction offer much better speed/accuracy trade-offs, while structured pruning turned out to be too aggressive for this task. Keep in mind that YOLO already has a very efficient architecture: structured pruning tends to pay off more on over-parameterized models such as AlexNet or VGG-Net, and detection models are generally much more sensitive to pruning than plain classification models.
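For reference, here is a minimal sketch of the kind of optimization steps involved (not the exact thesis pipeline; see the repositories below for that): structured L2 channel pruning with `torch.nn.utils.prune`, followed by an INT8 export at a reduced input size through the Ultralytics export API. The base weights, pruning ratio, dataset config and image size are illustrative placeholders, and `prune.ln_structured` only zeroes channels; physically removing them to get real speedups requires a dedicated structural-pruning tool.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # illustrative base weights

# Structured L2 pruning: zero out whole output channels of each Conv2d layer.
# This only masks channels; a structural-pruning library is needed to actually
# remove them and shrink the network.
PRUNE_RATIO = 0.25  # illustrative pruning ratio
for module in model.model.modules():
    if isinstance(module, nn.Conv2d):
        prune.ln_structured(module, name="weight", amount=PRUNE_RATIO, n=2, dim=0)
        prune.remove(module, "weight")  # bake the pruning mask into the weights

# The pruned model should then be fine-tuned on the target dataset, e.g.:
# model.train(data="trash.yaml", epochs=50)  # placeholder dataset config

# INT8 quantization plus a reduced input resolution via export (TFLite here).
model.export(format="tflite", int8=True, imgsz=320)
```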
Here are some real prediction examples:
Check out the repository with all the model-optimization code: https://github.com/Alejandro-Casanova/YOLOv8-Pruned
And the repo with the demo Android App: https://github.com/Alejandro-Casanova/Android-Trash-Detection-with-YOLO
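A hypothetical inference snippet with one of the resulting models (the weight filename and image path below are placeholders; the actual files and scripts live in the repositories above):

```python
from ultralytics import YOLO

# Placeholder filename: use the actual pruned/quantized weights from the repo.
model = YOLO("yolov8n-trashdet-pruned.pt")

# Run detection at the resolution the model was optimized for (example: 320).
results = model.predict("trash_image.jpg", imgsz=320)
results[0].show()  # visualize the predicted bounding boxes
```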