---
license: mit
library_name: pytorch
tags:
- faster-rcnn
- object-detection
- computer-vision
- pytorch
- kitti
- autonomous-driving
- from-scratch
pipeline_tag: object-detection
datasets:
- kitti
widget:
- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bounding-boxes-sample.png
  example_title: "Sample Image"
model-index:
- name: faster-rcnn-kitti-vanilla
  results:
  - task:
      type: object-detection
    dataset:
      type: kitti
      name: KITTI Object Detection
    metrics:
    - type: mean_average_precision
      name: mAP
      value: "TBD"
---

# Faster R-CNN - KITTI Object Detection Vanilla

A Faster R-CNN model trained from scratch on the KITTI dataset for object detection in autonomous driving.

## Model Details

- **Model Type**: Faster R-CNN Object Detection
- **Dataset**: KITTI Object Detection
- **Training Method**: Trained from scratch
- **Framework**: PyTorch
- **Task**: Object Detection

## Dataset Information

This model was trained on the **KITTI Object Detection** dataset to detect the following object classes:

car, pedestrian, cyclist

### Dataset-specific Details:

**KITTI Object Detection Dataset:**
- Real-world autonomous driving dataset
- Contains stereo imagery from vehicle-mounted cameras
- Focus on cars, pedestrians, and cyclists
- Challenging scenes with occlusion, truncation, and varying lighting conditions
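
As a rough illustration of how KITTI annotations map onto the three classes above, the sketch below parses one line of a KITTI label file. It assumes the standard 15-field KITTI object label layout and lower-cases the class name. The helper name `parse_kitti_label`, the sample line, and the choice to skip every other category are illustrative assumptions, not necessarily the preprocessing used to train this model.

```python
# Hypothetical helper: parse one line of a KITTI object label file.
# Standard field layout: type, truncated, occluded, alpha,
# bbox left/top/right/bottom, 3D dimensions (3), 3D location (3), rotation_y.
KEPT_CLASSES = ("car", "pedestrian", "cyclist")  # classes listed in this card


def parse_kitti_label(line):
    """Return a dict with the class name and 2D box, or None for other categories."""
    fields = line.split()
    if len(fields) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(fields)}")
    name = fields[0].lower()
    if name not in KEPT_CLASSES:
        return None  # e.g. Van, Truck, Tram, Misc, DontCare are skipped here
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return {
        "label": name,
        "truncated": float(fields[1]),
        "occluded": int(fields[2]),
        "bbox": (left, top, right, bottom),  # pixel coordinates: x1, y1, x2, y2
    }


# Made-up sample line in the KITTI layout, for illustration only.
example = "Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59"
parsed = parse_kitti_label(example)
print(parsed)
```

Skipping the remaining categories is only one option; some pipelines instead merge related categories or use `DontCare` regions to ignore detections during evaluation.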

## Usage

This model can be used with PyTorch and common object detection frameworks:

```python
import torch
import torchvision.transforms as transforms
from PIL import Image

# Load the model (example using torchvision).
# Assumes the file stores the full pickled model rather than a state_dict;
# PyTorch 2.6+ defaults to weights_only=True, so it is disabled explicitly.
# Only do this for checkpoints you trust.
model = torch.load('path/to/model.pth', map_location='cpu', weights_only=False)
model.eval()

# Prepare your image
transform = transforms.Compose([
    transforms.ToTensor(),  # PIL image -> float tensor in [0, 1], shape (C, H, W)
])

image = Image.open('path/to/image.jpg').convert('RGB')  # force 3 channels
image_tensor = transform(image).unsqueeze(0)  # add a batch dimension

# Run inference
with torch.no_grad():
    predictions = model(image_tensor)

# Process results (one dict per input image)
boxes = predictions[0]['boxes']    # (N, 4) boxes as x1, y1, x2, y2
scores = predictions[0]['scores']  # (N,) confidence scores
labels = predictions[0]['labels']  # (N,) integer class IDs
```
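
The raw outputs usually include many low-confidence boxes. Below is a minimal post-processing sketch, assuming `boxes`, `scores`, and `labels` have been converted to plain Python lists (for example with `.tolist()`) and that label IDs follow a background-first ordering (0 = background, 1 = car, 2 = pedestrian, 3 = cyclist). That mapping, the helper name `filter_detections`, and the 0.5 threshold are assumptions, so check them against the actual checkpoint.

```python
# Assumed label mapping; verify it against the checkpoint's training configuration.
CLASS_NAMES = {1: "car", 2: "pedestrian", 3: "cyclist"}


def filter_detections(boxes, scores, labels, score_threshold=0.5):
    """Keep detections whose score is at least score_threshold."""
    kept = []
    for box, score, label in zip(boxes, scores, labels):
        if score >= score_threshold:
            kept.append({
                "box": tuple(box),
                "score": score,
                "class": CLASS_NAMES.get(label, f"unknown({label})"),
            })
    return kept


# Stand-in values; with real outputs, pass boxes.tolist(), scores.tolist(), labels.tolist().
demo_boxes = [[10.0, 20.0, 110.0, 90.0], [30.0, 40.0, 60.0, 120.0]]
demo_scores = [0.92, 0.31]
demo_labels = [1, 2]

detections = filter_detections(demo_boxes, demo_scores, demo_labels)
for det in detections:
    print(det["class"], round(det["score"], 2), det["box"])
```

A higher threshold trades recall for precision, so the right value depends on the application.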

## Model Performance

This model was trained from scratch on the KITTI Object Detection dataset using the Faster R-CNN architecture. Quantitative results (mAP) have not been reported yet.
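
Detection quality on KITTI is commonly reported as average precision, where a detection counts as correct when its intersection over union (IoU) with a ground-truth box reaches a class-specific threshold (the official KITTI benchmark uses 0.7 for cars and 0.5 for pedestrians and cyclists). The sketch below shows only this IoU building block for boxes in `(x1, y1, x2, y2)` format. The function names are illustrative, and this is not the evaluation code used for this model.

```python
# IoU thresholds of the official KITTI 2D object benchmark.
IOU_THRESHOLDS = {"car": 0.7, "pedestrian": 0.5, "cyclist": 0.5}


def box_iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_match(pred_box, gt_box, class_name):
    """True if the predicted box overlaps the ground truth enough for its class."""
    return box_iou(pred_box, gt_box) >= IOU_THRESHOLDS[class_name]


# Two 2x2 boxes shifted by one pixel: intersection 2, union 6.
print(box_iou((0, 0, 2, 2), (1, 0, 3, 2)))
```

A full average-precision computation additionally sorts detections by score, matches each ground-truth box at most once, and integrates the resulting precision-recall curve.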

## Architecture

**Faster R-CNN** (Region-based Convolutional Neural Network) is a two-stage object detection framework:

1. **Region Proposal Network (RPN)**: Generates candidate object regions (proposals) from shared convolutional features
2. **Fast R-CNN detector**: Classifies each proposal and refines its bounding box coordinates
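
To make the first stage more concrete: the RPN scores a fixed set of reference boxes (anchors) at every feature-map location and regresses offsets from them. The sketch below generates the nine base anchors described in the Faster R-CNN paper (three scales, three aspect ratios). The scales and ratios are the paper's defaults; the anchor configuration of this particular checkpoint is not documented here, so treat the values and the helper name `base_anchors` as assumptions.

```python
import math


def base_anchors(scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors centred on the origin.

    Each anchor has area scale**2 and a height/width ratio equal to `ratio`.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            width = scale / math.sqrt(ratio)
            height = scale * math.sqrt(ratio)
            anchors.append((-width / 2, -height / 2, width / 2, height / 2))
    return anchors


anchors = base_anchors()
for x1, y1, x2, y2 in anchors:
    print(f"w={x2 - x1:7.1f}  h={y2 - y1:7.1f}")
```

In the full network these base anchors are shifted to every position of the feature map; after box refinement and non-maximum suppression, the highest-scoring ones become the proposals passed to the second stage.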

Key advantages:
- High-accuracy object detection
- Precise localization
- Good performance on small objects
- Well-established architecture with extensive research backing

## Intended Use

- **Primary Use**: Object detection in autonomous driving scenarios
- **Suitable for**: Research, development, and deployment of object detection systems
- **Limitations**: Performance may vary on images significantly different from the training distribution

## Citation

If you use this model, please cite:

```bibtex
@article{ren2015faster,
  title={Faster {R-CNN}: Towards Real-Time Object Detection with Region Proposal Networks},
  author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
  journal={Advances in Neural Information Processing Systems},
  volume={28},
  year={2015}
}
```

## License

This model is released under the MIT License.

## Keywords

Faster R-CNN, Object Detection, Computer Vision, KITTI, Autonomous Driving, Deep Learning, Two-Stage Detection