---
license: mit
library_name: pytorch
tags:
- faster-rcnn
- object-detection
- computer-vision
- pytorch
- kitti
- autonomous-driving
- from-scratch
pipeline_tag: object-detection
datasets:
- kitti
widget:
- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bounding-boxes-sample.png
  example_title: "Sample Image"
model-index:
- name: faster-rcnn-kitti-vanilla
  results:
  - task:
      type: object-detection
    dataset:
      type: kitti
      name: KITTI Object Detection
    metrics:
    - type: mean_average_precision
      name: mAP
      value: "TBD"
---
# Faster R-CNN - KITTI Object Detection Vanilla
A Faster R-CNN model trained from scratch on the KITTI dataset for object detection in autonomous driving scenes.
## Model Details
- **Model Type**: Faster R-CNN Object Detection
- **Dataset**: KITTI Object Detection
- **Training Method**: trained from scratch
- **Framework**: PyTorch
- **Task**: Object Detection
## Dataset Information
This model was trained on the **KITTI Object Detection** dataset, which contains the following object classes:
car, pedestrian, cyclist
### Dataset-specific Details:
**KITTI Object Detection Dataset:**
- Real-world autonomous driving dataset
- Contains stereo imagery from vehicle-mounted cameras
- Focus on cars, pedestrians, and cyclists
- Challenging scenarios with varying lighting and weather conditions
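For readers preparing their own data, the sketch below reads one line of a KITTI object label file and keeps only the three classes listed above. The field layout (object type first, 2D box in fields 5 to 8 as left, top, right, bottom) follows the public KITTI label format. The helper name `parse_kitti_label_line` and the `TARGET_CLASSES` mapping are hypothetical, and how this model's training pipeline actually handled the labels is not stated in this card.

```python
from typing import List, Optional, Tuple

# Hypothetical mapping: KITTI type names to the class names used in this card.
# Other KITTI types (e.g. Van, Truck, DontCare) are skipped in this sketch.
TARGET_CLASSES = {"Car": "car", "Pedestrian": "pedestrian", "Cyclist": "cyclist"}


def parse_kitti_label_line(line: str) -> Optional[Tuple[str, List[float]]]:
    """Return (class_name, [left, top, right, bottom]) or None if the line is skipped."""
    fields = line.split()
    if len(fields) < 15:
        # A KITTI ground-truth label line has 15 whitespace-separated fields.
        return None
    name = TARGET_CLASSES.get(fields[0])
    if name is None:
        return None
    # Fields 4..7 (0-indexed) hold the 2D bounding box in pixel coordinates.
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return name, [left, top, right, bottom]


sample = ("Pedestrian 0.00 0 -0.20 712.40 143.00 810.73 307.92 "
          "1.89 0.48 1.20 1.84 1.47 8.41 0.01")
print(parse_kitti_label_line(sample))
```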
## Usage
This model can be used with PyTorch and common object detection frameworks:
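```python
import torch
import torchvision.transforms as transforms
from PIL import Image

# Load the model. This assumes the checkpoint is a fully pickled model object;
# recent PyTorch versions need weights_only=False for that, so only load files
# you trust. A state_dict checkpoint would need the model class built first.
model = torch.load('path/to/model.pth', map_location='cpu', weights_only=False)
model.eval()

# Prepare your image (convert to RGB so grayscale or RGBA files give 3 channels)
transform = transforms.Compose([
    transforms.ToTensor(),
])
image = Image.open('path/to/image.jpg').convert('RGB')
image_tensor = transform(image).unsqueeze(0)

# Run inference
with torch.no_grad():
    predictions = model(image_tensor)

# Process results for the first (and only) image in the batch
boxes = predictions[0]['boxes']
scores = predictions[0]['scores']
labels = predictions[0]['labels']
```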
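Raw detector output usually contains many low-confidence boxes, so a common next step is to keep only detections above a score threshold and attach readable class names. The sketch below works on plain Python lists, for example `boxes.tolist()`, `scores.tolist()`, and `labels.tolist()` from the snippet above. The function name `filter_detections`, the default threshold of 0.5, and the `CLASS_NAMES` index order are assumptions; the card does not document the label mapping, so check it against the training configuration.

```python
from typing import Dict, List, Sequence, Union

# Assumed label map (index 0 is typically background in torchvision-style
# detectors). The actual order for this model is not documented in the card.
CLASS_NAMES = {1: "car", 2: "pedestrian", 3: "cyclist"}


def filter_detections(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    labels: Sequence[int],
    score_threshold: float = 0.5,
) -> List[Dict[str, Union[str, float, List[float]]]]:
    """Keep detections with score >= score_threshold and add class names."""
    kept = []
    for box, score, label in zip(boxes, scores, labels):
        if score >= score_threshold:
            kept.append({
                "box": [float(v) for v in box],
                "score": float(score),
                # Unknown indices fall back to a generic name instead of failing.
                "label": CLASS_NAMES.get(int(label), f"class_{int(label)}"),
            })
    return kept


# Example with hand-written values standing in for model output.
example = filter_detections(
    boxes=[[0, 0, 10, 10], [5, 5, 20, 20]],
    scores=[0.9, 0.3],
    labels=[1, 3],
)
print(example)
```

A higher threshold trades recall for precision; the right value depends on the application and should be tuned on validation data.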
## Model Performance
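This model was trained from scratch on the KITTI Object Detection dataset using the Faster R-CNN architecture. No evaluation results have been reported yet; the mAP entry in the model metadata is still marked "TBD".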
## Architecture
**Faster R-CNN** (Region-based Convolutional Neural Network) is a two-stage object detection framework:
1. **Region Proposal Network (RPN)**: Generates object proposals
2. **Fast R-CNN detector**: Classifies proposals and refines bounding box coordinates
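Both stages rely on intersection over union (IoU) to compare boxes, and overlapping candidates are typically pruned with non-maximum suppression (NMS). The pure-Python sketch below illustrates those two operations on `[x1, y1, x2, y2]` boxes. It is an illustration only, not this model's implementation; the function names and the 0.5 threshold are assumptions.

```python
from typing import List, Sequence


def iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Sequence[float]], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


candidates = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
candidate_scores = [0.9, 0.8, 0.7]
kept_indices = nms(candidates, candidate_scores)
print(kept_indices)
```

In practice a vectorized routine such as `torchvision.ops.nms` is the usual choice; the loop above is quadratic and meant only to show the logic.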
Key advantages:
- High accuracy object detection
- Precise localization
- Can handle small objects, though results depend on the backbone and anchor configuration
- Well-established architecture with extensive research backing
## Intended Use
- **Primary Use**: Object detection in autonomous driving scenarios
- **Suitable for**: Research, development, and deployment of object detection systems
- **Limitations**: Performance may vary on images significantly different from the training distribution
## Citation
If you use this model, please cite:
```bibtex
@article{ren2015faster,
title={Faster {R-CNN}: Towards Real-Time Object Detection with Region Proposal Networks},
author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
journal={Advances in Neural Information Processing Systems},
volume={28},
year={2015}
}
```
## License
This model is released under the MIT License.
## Keywords
Faster R-CNN, Object Detection, Computer Vision, KITTI, Autonomous Driving, Deep Learning, Two-Stage Detection