A phased exploration of autonomous driving perception, evolving from 2D multi-view object detection to SOTA 3D LiDAR-based perception using the nuScenes dataset.
- Objective: Build a perception stack that processes multi-modal sensor data (Camera + LiDAR) for autonomous driving.
- Dataset: nuScenes v1.0-mini (6 Cameras, 1 LiDAR, Radar).
- Models: YOLOv8n (2D detection via Ultralytics) and PointPillars (3D LiDAR detection via MMDetection3D).
- Output: Visualization videos demonstrating 2D & 3D situational awareness.
- Composite Visualization: Synchronized view of Front Camera (Top) and LiDAR Point Cloud (Bottom).
- 3D Object Rendering: Solid, semi-transparent 3D bounding boxes in LiDAR view.
- Class-Specific Styling: Color-coded objects (Cyan=Car, Red=Pedestrian, Orange=Truck) with text labels (see the sketch after this list).
- Sensor Fusion: 3D bounding boxes projected onto the 2D camera image.
- Point Cloud Features: LiDAR points colored by intensity to reveal road markings.
- Ego Vehicle: Visualization of the ego car's position and orientation.
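The class color map and intensity coloring boil down to a few lines. This is a minimal sketch, not the exact code in `viz_3d_video.py`; the BGR tuples and the viridis colormap are assumptions.

```python
# Minimal sketch of the styling logic described above (illustrative values).
import numpy as np
from matplotlib import cm

# BGR tuples for OpenCV drawing, matching the legend above (assumed values).
CLASS_COLORS = {
    'car': (255, 255, 0),        # cyan
    'pedestrian': (0, 0, 255),   # red
    'truck': (0, 165, 255),      # orange
}

def color_points_by_intensity(points: np.ndarray) -> np.ndarray:
    """Map LiDAR intensity (column 3 of an Nx4 array) to RGB colors.

    High-reflectivity returns (e.g. painted road markings) stand out
    against the low-intensity asphalt background.
    """
    intensity = points[:, 3].astype(np.float64)
    norm = (intensity - intensity.min()) / (np.ptp(intensity) + 1e-6)
    return (cm.viridis(norm)[:, :3] * 255).astype(np.uint8)
```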
This setup is sufficient for running `video_demo.py` and `demo_3d_dual_viz.py`:

```bash
# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```
MMDetection3D is additionally required for running `demo_phase2_lidar.py`. It is difficult to compile on macOS, so Google Colab, Linux, or any machine with NVIDIA GPU support is recommended.
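On a supported platform, inference with the PointPillars weights follows the standard MMDetection3D API. The sketch below assumes the 1.x API and hypothetical filenames under `models/weights/`; return types vary slightly between MMDetection3D versions.

```python
# Minimal PointPillars inference sketch (MMDetection3D 1.x API).
from mmdet3d.apis import init_model, inference_detector

config = 'models/weights/pointpillars_nuscenes.py'       # hypothetical filename
checkpoint = 'models/weights/pointpillars_nuscenes.pth'  # hypothetical filename
model = init_model(config, checkpoint, device='cuda:0')

# nuScenes LiDAR sweeps are flat float32 .bin files (x, y, z, intensity, ring).
pcd_path = 'data/nuscenes/samples/LIDAR_TOP/example.bin'  # any LIDAR_TOP sweep
result, _ = inference_detector(model, pcd_path)

print(result.pred_instances_3d.bboxes_3d)  # boxes as (x, y, z, l, w, h, yaw)
print(result.pred_instances_3d.scores_3d)  # per-box confidence scores
```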
Download nuScenes v1.0-mini (~4 GB) from:
https://www.nuscenes.org/download
Choose: Full dataset (v1.0) → mini → v1.0-mini.tgz
Extract to: `./data/nuscenes/`
Verify that this path exists: `./data/nuscenes/v1.0-mini/maps`
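Beyond the path check, the devkit itself can confirm the split loaded correctly. A quick sanity check, assuming only the extraction path above:

```python
# Verify the dataset loads with nuscenes-devkit.
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version='v1.0-mini', dataroot='./data/nuscenes', verbose=True)
print(f'Loaded {len(nusc.scene)} scenes and {len(nusc.sample)} samples')
# v1.0-mini should report 10 scenes and 404 samples.
```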
Generates a composite video with solid 3D boxes, class labels, and intensity rendering:

```bash
source venv/bin/activate
python3 viz_3d_video.py
```

Output: `output/lidar_3d_viz_composite.mp4`

```
self_driving_perception/
├── viz_3d_video.py # [Phase 2] Custom 3D LiDAR Visualizer (Main)
├── video_demo.py # [Phase 1] Dual-View 2D Detection (YOLO)
├── demo_3d_dual_viz.py # [Phase 2] Legacy 3D GT Visualization
├── demo_phase2_lidar.py # [Phase 2] LiDAR Inference (PointPillars)
├── data/
│ ├── nuscenes_loader.py # Enhanced loader: Camera + LiDAR + Calibration
│ └── nuscenes/v1.0-mini/ # Dataset location
├── models/
│ └── weights/ # Model weights (.pth) + configs (.py)
├── output/ # Generated demo videos
├── requirements.txt # Python dependencies for Phase 1
└── README.md # This file
```
- Inputs: Synchronized `CAM_FRONT` and `CAM_BACK` images
- Detection: YOLOv8n (runs per frame)
- Fusion: Vertical concatenation using `cv2.vconcat` for a split-screen dashboard
- Depth Estimation: Heuristic estimate from bounding-box height and pinhole camera geometry (see the sketch after this list)
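To make the depth heuristic concrete, here is a minimal sketch. The object-height priors and the approximate nuScenes CAM_FRONT focal length are illustrative assumptions; the actual values in `video_demo.py` may differ.

```python
# Sketch of the pinhole depth heuristic: Z ≈ f * H_real / h_pixels.
import cv2
from ultralytics import YOLO

ASSUMED_HEIGHT_M = {'car': 1.5, 'person': 1.7, 'truck': 3.0}  # rough priors

def estimate_depth(focal_px: float, class_name: str, box_height_px: float) -> float:
    """Pinhole model: depth Z = f * real-world height / pixel height."""
    return focal_px * ASSUMED_HEIGHT_M.get(class_name, 1.5) / max(box_height_px, 1.0)

model = YOLO('yolov8n.pt')
front = cv2.imread('front.jpg')  # placeholder frames; the demo reads nuScenes images
back = cv2.imread('back.jpg')

for det in model(front)[0].boxes:
    x1, y1, x2, y2 = det.xyxy[0].tolist()
    name = model.names[int(det.cls)]
    z = estimate_depth(1266.0, name, y2 - y1)  # nuScenes CAM_FRONT fx ≈ 1266 px
    cv2.rectangle(front, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 255), 2)
    cv2.putText(front, f'{name} {z:.1f} m', (int(x1), int(y1) - 4),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 255), 1)

dashboard = cv2.vconcat([front, back])  # front on top, back below (equal widths)
```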
- Coordinate Transformation: LiDAR frame (x, y, z) → Ego frame → Camera frame → Image plane (u, v)
- Projection: 3D cuboid corners projected into 2D using the camera intrinsic matrix
- Data Fusion: LiDAR-derived spatial data (3D boxes, orientation, coordinates) overlaid onto camera RGB frames (see the sketch after this list)
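A minimal version of this transformation chain using nuscenes-devkit helpers is sketched below. For simplicity it treats the LiDAR sweep and camera frame as time-aligned; the full devkit example also compensates ego motion between the two timestamps via the global frame.

```python
# Project LiDAR points into the CAM_FRONT image plane.
import numpy as np
from nuscenes.nuscenes import NuScenes
from nuscenes.utils.data_classes import LidarPointCloud
from nuscenes.utils.geometry_utils import view_points
from pyquaternion import Quaternion

nusc = NuScenes(version='v1.0-mini', dataroot='./data/nuscenes')
sample = nusc.sample[0]
lidar = nusc.get('sample_data', sample['data']['LIDAR_TOP'])
cam = nusc.get('sample_data', sample['data']['CAM_FRONT'])

pc = LidarPointCloud.from_file(nusc.get_sample_data_path(lidar['token']))

# LiDAR frame -> ego frame (LiDAR extrinsics).
lidar_cal = nusc.get('calibrated_sensor', lidar['calibrated_sensor_token'])
pc.rotate(Quaternion(lidar_cal['rotation']).rotation_matrix)
pc.translate(np.array(lidar_cal['translation']))

# Ego frame -> camera frame (inverse of camera extrinsics).
cam_cal = nusc.get('calibrated_sensor', cam['calibrated_sensor_token'])
pc.translate(-np.array(cam_cal['translation']))
pc.rotate(Quaternion(cam_cal['rotation']).rotation_matrix.T)

# Camera frame -> image plane (u, v) via the intrinsic matrix.
depths = pc.points[2, :]
uv = view_points(pc.points[:3, :], np.array(cam_cal['camera_intrinsic']), normalize=True)
in_front = depths > 1.0  # keep only points ahead of the camera
uv = uv[:, in_front]
```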
- Python 3.8+
- PyTorch
- OpenCV (`cv2`)
- NumPy
- nuscenes-devkit
- ultralytics (YOLOv8)
- MMDetection3D
- MMCV
- CUDA Toolkit (for GPU acceleration)
- "Dataset not found":
Ensure the directory exists:
data/nuscenes/v1.0-mini/
-
"Video format not supported" on macOS:
Videos use H.264 (avc1). Try VLC if default macOS apps fail. -
"No module named mmdet3d":
You are running a Phase 2 script without an MMDetection3D environment.
Use Google Colab or a Linux GPU machine.
MIT License — Educational project for autonomous driving perception.
