---
license: mit
library_name: pytorch
tags:
- faster-rcnn
- object-detection
- computer-vision
- pytorch
- kitti
- autonomous-driving
- from-scratch
pipeline_tag: object-detection
datasets:
- kitti
widget:
- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bounding-boxes-sample.png
  example_title: "Sample Image"
model-index:
- name: faster-rcnn-kitti-vanilla
  results:
  - task:
      type: object-detection
    dataset:
      type: kitti
      name: KITTI Object Detection
    metrics:
    - type: mean_average_precision
      name: mAP
      value: "TBD"
---
# Faster R-CNN - KITTI Object Detection Vanilla
A Faster R-CNN model trained from scratch on the KITTI dataset for object detection in autonomous driving scenes.
## Model Details
- **Model Type**: Faster R-CNN Object Detection
- **Dataset**: KITTI Object Detection
- **Training Method**: trained from scratch
- **Framework**: PyTorch
- **Task**: Object Detection
## Dataset Information
This model was trained on the **KITTI Object Detection** dataset, which contains the following object classes:
car, pedestrian, cyclist
### Dataset-specific Details:
**KITTI Object Detection Dataset:**
- Real-world autonomous driving dataset
- Contains stereo imagery from vehicle-mounted cameras
- Focus on cars, pedestrians, and cyclists
- Challenging scenes with occlusion, truncation, and varying lighting
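
For readers preparing their own data, here is a minimal sketch of reading one line of a KITTI object label file and keeping only the three classes listed above. It assumes the standard KITTI label layout (type, truncation, occlusion, alpha, then the 2D box as left, top, right, bottom, followed by 3D fields). The helper names `KittiBox` and `parse_kitti_label_line` are hypothetical, the example lines are made up, and the preprocessing actually used for this model is not documented here.

```python
from dataclasses import dataclass

# Classes this model card lists; other KITTI label types are skipped.
TARGET_CLASSES = {"Car": "car", "Pedestrian": "pedestrian", "Cyclist": "cyclist"}


@dataclass
class KittiBox:
    label: str
    truncated: float
    occluded: int
    box: tuple  # (left, top, right, bottom) in pixels


def parse_kitti_label_line(line):
    """Parse one KITTI label line; return None for classes outside TARGET_CLASSES."""
    fields = line.split()
    if len(fields) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(fields)}")
    if fields[0] not in TARGET_CLASSES:
        return None
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return KittiBox(
        label=TARGET_CLASSES[fields[0]],
        truncated=float(fields[1]),
        occluded=int(fields[2]),
        box=(left, top, right, bottom),
    )


# Made-up lines in the KITTI layout (illustrative values, not real annotations).
example = "Car 0.00 0 -1.58 587.00 173.25 614.00 200.50 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59"
parsed = parse_kitti_label_line(example)
skipped = parse_kitti_label_line(
    "DontCare -1 -1 -10 503.0 169.0 590.0 190.0 -1 -1 -1 -1000 -1000 -1000 -10"
)
```

Filtering on the type field first means regions such as `DontCare` are dropped before any numeric parsing.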
## Usage
The snippet below loads the checkpoint with PyTorch and runs inference on a single image:
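```python
import torch
import torchvision.transforms as transforms
from PIL import Image

# Load the model (example using torchvision).
# Assumes the .pth file stores the whole pickled model object, not just a state_dict.
# weights_only=False is needed for that on recent PyTorch releases; only load files you trust.
model = torch.load('path/to/model.pth', map_location='cpu', weights_only=False)
model.eval()

# Prepare your image (convert to RGB so the tensor always has 3 channels)
transform = transforms.Compose([
    transforms.ToTensor(),
])
image = Image.open('path/to/image.jpg').convert('RGB')
image_tensor = transform(image).unsqueeze(0)

# Run inference
with torch.no_grad():
    predictions = model(image_tensor)

# Process results (one dict per input image)
boxes = predictions[0]['boxes']
scores = predictions[0]['scores']
labels = predictions[0]['labels']
```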
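
Raw predictions usually include many low-confidence boxes, so a common next step is to threshold the scores and map label ids to class names. The sketch below is one way to do that. The id-to-name mapping in `CLASS_NAMES` is an assumption (torchvision-style detectors reserve 0 for background, but the ordering used by this checkpoint is not stated), and `filter_detections` is a hypothetical helper, not part of any library.

```python
# Assumed label mapping (hypothetical): id 0 is treated as background.
CLASS_NAMES = {1: "car", 2: "pedestrian", 3: "cyclist"}


def _to_list(values):
    # Accept torch tensors (via .tolist()) as well as plain Python sequences.
    return values.tolist() if hasattr(values, "tolist") else list(values)


def filter_detections(prediction, score_threshold=0.5, class_names=CLASS_NAMES):
    """Keep detections scoring at least score_threshold and attach class names."""
    boxes = _to_list(prediction["boxes"])
    scores = _to_list(prediction["scores"])
    labels = _to_list(prediction["labels"])
    results = []
    for box, score, label in zip(boxes, scores, labels):
        if score >= score_threshold:
            results.append({
                "box": [round(v, 1) for v in box],
                "score": score,
                "class": class_names.get(int(label), f"unknown({label})"),
            })
    return results


# Stand-in for predictions[0]; plain lists keep the sketch runnable without a model.
fake_prediction = {
    "boxes": [[100.0, 150.0, 220.0, 260.0], [300.0, 160.0, 330.0, 240.0]],
    "scores": [0.92, 0.31],
    "labels": [1, 2],
}
detections = filter_detections(fake_prediction, score_threshold=0.5)
```

With real model output, the call would be `filter_detections(predictions[0])`; the threshold of 0.5 is an arbitrary starting point and worth tuning per class.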
## Model Performance
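This model was trained from scratch on the KITTI Object Detection dataset using the Faster R-CNN architecture. No quantitative results have been reported yet; the mAP entry in the model metadata is still marked TBD.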
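
For anyone running their own evaluation, average precision depends on matching each detection to a ground-truth box by intersection over union (IoU). The sketch below shows that matching test in plain Python, using the overlap thresholds of the KITTI 2D object benchmark (0.7 for cars, 0.5 for pedestrians and cyclists). The function names are hypothetical, and this is only the matching criterion, not a full mAP implementation.

```python
# Minimum IoU for a detection to count as a true positive on the KITTI 2D benchmark.
KITTI_IOU_THRESHOLDS = {"car": 0.7, "pedestrian": 0.5, "cyclist": 0.5}


def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, class_name):
    """Apply the class-specific IoU threshold to one prediction/ground-truth pair."""
    return box_iou(pred_box, gt_box) >= KITTI_IOU_THRESHOLDS[class_name]


gt = (0.0, 0.0, 10.0, 10.0)
pred = (0.0, 0.0, 10.0, 6.0)  # covers 60% of the ground-truth box
iou = box_iou(pred, gt)
```

The same overlap can pass for a pedestrian and fail for a car, which is why per-class thresholds matter when comparing numbers across benchmarks.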
## Architecture
**Faster R-CNN** (Region-based Convolutional Neural Network) is a two-stage object detection framework:
1. **Region Proposal Network (RPN)**: Generates object proposals
2. **Fast R-CNN detector**: Classifies proposals and refines bounding box coordinates
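
To make "refines bounding box coordinates" concrete, here is a minimal sketch of how a proposal is adjusted by predicted offsets, following the box parameterization described in the Faster R-CNN paper: the centre shifts in proportion to the proposal size, and the width and height scale exponentially. The function name `decode_box` is hypothetical, and real implementations often rescale or clip the deltas, so this may differ in detail from what this checkpoint does.

```python
import math


def decode_box(proposal, deltas):
    """Apply (dx, dy, dw, dh) offsets to a proposal given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = proposal
    dx, dy, dw, dh = deltas
    width = x2 - x1
    height = y2 - y1
    center_x = x1 + 0.5 * width
    center_y = y1 + 0.5 * height
    # Centre shifts are relative to the proposal size; sizes scale by exp(delta).
    new_center_x = center_x + dx * width
    new_center_y = center_y + dy * height
    new_width = width * math.exp(dw)
    new_height = height * math.exp(dh)
    return (
        new_center_x - 0.5 * new_width,
        new_center_y - 0.5 * new_height,
        new_center_x + 0.5 * new_width,
        new_center_y + 0.5 * new_height,
    )


proposal = (10.0, 20.0, 50.0, 100.0)
unchanged = decode_box(proposal, (0.0, 0.0, 0.0, 0.0))
shifted = decode_box(proposal, (0.5, 0.0, 0.0, 0.0))
widened = decode_box(proposal, (0.0, 0.0, math.log(2.0), 0.0))
```

Zero deltas leave the proposal untouched, so the second stage only has to learn small corrections to proposals that are already roughly right.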
Key advantages:
- High accuracy object detection
- Precise localization
- Reasonable performance on small objects, particularly when paired with a multi-scale backbone
- Well-established architecture with extensive research backing
## Intended Use
- **Primary Use**: Object detection in autonomous driving scenarios
- **Suitable for**: Research, development, and deployment of object detection systems
- **Limitations**: Performance may vary on images significantly different from the training distribution
## Citation
If you use this model, please cite:
```bibtex
@article{ren2015faster,
title={Faster r-cnn: Towards real-time object detection with region proposal networks},
author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
journal={Advances in neural information processing systems},
volume={28},
year={2015}
}
```
## License
This model is released under the MIT License.
## Keywords
Faster R-CNN, Object Detection, Computer Vision, KITTI, Autonomous Driving, Deep Learning, Two-Stage Detection