---
license: mit
library_name: pytorch
tags:
- faster-rcnn
- object-detection
- computer-vision
- pytorch
- bdd100k
- autonomous-driving
- BDD 100K
- from-scratch
pipeline_tag: object-detection
datasets:
- bdd100k
widget:
- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bounding-boxes-sample.png
  example_title: "Sample Image"
model-index:
- name: faster-rcnn-bdd-vanilla
  results:
  - task:
      type: object-detection
    dataset:
      type: bdd100k
      name: Berkeley DeepDrive (BDD) 100K
    metrics:
    - type: mean_average_precision
      name: mAP
      value: "TBD"
---

# Faster R-CNN - Berkeley DeepDrive (BDD) 100K

Vanilla Faster R-CNN model trained from scratch on the Berkeley DeepDrive (BDD) 100K dataset for object detection in autonomous driving scenarios.

## Model Details

- **Model Type**: Faster R-CNN Object Detection
- **Dataset**: Berkeley DeepDrive (BDD) 100K
- **Training Method**: Trained from scratch
- **Framework**: PyTorch
- **Task**: Object Detection

## Dataset Information

This model was trained on the **Berkeley DeepDrive (BDD) 100K** dataset, which contains the following object classes:

car, truck, bus, motorcycle, bicycle, person, traffic light, traffic sign, train, rider

### Dataset-specific Details:

**Berkeley DeepDrive (BDD) 100K Dataset:**

- 100,000+ driving images with diverse weather and lighting conditions
- Designed for autonomous driving applications
- Contains urban driving scenarios from multiple cities
- Annotations include bounding boxes for vehicles, pedestrians, and traffic elements

## Usage

This model can be used with PyTorch and common object detection frameworks:

```python
import torch
import torchvision.transforms as transforms
from PIL import Image

# Load the model. This assumes the checkpoint stores the full pickled model
# object; PyTorch 2.6+ defaults to weights_only=True, which rejects such files.
# If the file holds only a state_dict, build the architecture first and call
# model.load_state_dict(...) instead.
model = torch.load('path/to/model.pth', map_location='cpu', weights_only=False)
model.eval()

# Prepare your image (convert to RGB so the tensor always has 3 channels)
transform = transforms.Compose([
    transforms.ToTensor(),
])
image = Image.open('path/to/image.jpg').convert('RGB')
image_tensor = transform(image).unsqueeze(0)

# Run inference
with torch.no_grad():
    predictions = model(image_tensor)

# Process results
boxes = predictions[0]['boxes']
scores = predictions[0]['scores']
labels = predictions[0]['labels']
```

## Model Performance

This model was trained from scratch on the Berkeley DeepDrive (BDD) 100K dataset using the Faster R-CNN architecture. No evaluation metrics have been reported yet; the mAP entry in the model metadata is still marked "TBD".

## Architecture

**Faster R-CNN** (Region-based Convolutional Neural Network) is a two-stage object detection framework:

1. **Region Proposal Network (RPN)**: Generates object proposals
2. **Fast R-CNN detector**: Classifies proposals and refines bounding box coordinates

Key advantages:

- High-accuracy object detection
- Precise localization
- Good performance on small objects
- Well-established architecture with extensive research backing

## Intended Use

- **Primary Use**: Object detection in autonomous driving scenarios
- **Suitable for**: Research, development, and deployment of object detection systems
- **Limitations**: Performance may vary on images that differ significantly from the training distribution

## Citation

If you use this model, please cite:

```bibtex
@article{ren2015faster,
  title={Faster {R-CNN}: Towards real-time object detection with region proposal networks},
  author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
  journal={Advances in neural information processing systems},
  volume={28},
  year={2015}
}
```

## License

This model is released under the MIT License.

## Keywords

Faster R-CNN, Object Detection, Computer Vision, BDD 100K, Autonomous Driving, Deep Learning, Two-Stage Detection
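
## Example: Filtering Predictions

The Usage section stops once `boxes`, `scores`, and `labels` have been extracted. In practice, low-confidence detections are usually dropped and label ids mapped to class names before the results are used. The following is a minimal sketch of that step, not part of the released model. The helper `filter_detections`, the `CLASS_NAMES` ordering, the background class at index 0, and the default threshold of 0.5 are all assumptions for illustration; the actual label-id mapping depends on how the training labels were encoded and is not documented in this card.

```python
# Hypothetical class order: the names come from the class list in this card,
# but their ids are an ASSUMPTION. Index 0 is assumed to be background, as in
# torchvision's detection models.
CLASS_NAMES = [
    "__background__",
    "car", "truck", "bus", "motorcycle", "bicycle",
    "person", "traffic light", "traffic sign", "train", "rider",
]


def filter_detections(boxes, scores, labels, score_threshold=0.5):
    """Keep detections whose score is at least score_threshold.

    boxes  -- list of [x1, y1, x2, y2] coordinate lists
    scores -- list of float confidence scores
    labels -- list of integer class ids
    Returns a list of dicts with keys "box", "score", "label", "name".
    """
    kept = []
    for box, score, label in zip(boxes, scores, labels):
        if score < score_threshold:
            continue  # drop low-confidence detections
        # Fall back to a generic name for ids outside the assumed mapping.
        if 0 <= label < len(CLASS_NAMES):
            name = CLASS_NAMES[label]
        else:
            name = f"class_{label}"
        kept.append({"box": box, "score": score, "label": label, "name": name})
    return kept
```

With the tensors from the Usage section, the call would look like `filter_detections(boxes.tolist(), scores.tolist(), labels.tolist())`. The threshold trades recall for precision, so it is worth tuning on held-out data rather than relying on the default.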