Original article can be found here (source): Deep Learning on Medium
KITTI 3D object detection data set
Exploring 2D bounding boxes
The KITTI 3D detection data set was developed for learning 3D object detection in a traffic setting. In upcoming articles I will discuss different aspects of this data set.
Motivation for 3D detection
Autonomous robots and vehicles need to track the positions of nearby objects. These can be other traffic participants, obstacles and drivable areas.
For path planning and collision avoidance, merely detecting these objects is not enough. To make informed decisions, the vehicle also needs to know the relative position, relative speed and size of each object.
The 3D detection task
The task of 3D detection consists of several subtasks. Objects need to be detected, classified, and located relative to the camera. Finally, each object has to be enclosed in a tightly fitting bounding box.
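To make the output of these subtasks concrete, here is a minimal sketch of what a single 3D detection could look like in code. The `Box3D` class, its field names and the axis convention (x right, y down, z forward, box position given at its centre) are assumptions made for illustration; this is not the KITTI label format or any library's API.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical container for one detected object (illustration only)."""
    label: str     # object class, e.g. "Car"
    x: float       # box centre in metres, camera frame (assumed: x right)
    y: float       # assumed: y down
    z: float       # assumed: z forward
    length: float  # extent along the object's heading, metres
    width: float   # extent across the heading, metres
    height: float  # vertical extent, metres
    yaw: float     # rotation about the vertical (y) axis, radians

    def distance(self):
        """Straight-line distance from the camera to the box centre."""
        return math.sqrt(self.x ** 2 + self.y ** 2 + self.z ** 2)

    def corners(self):
        """Return the 8 box corners as (x, y, z) tuples in the camera frame."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        points = []
        for dx in (self.length / 2, -self.length / 2):
            for dy in (self.height / 2, -self.height / 2):
                for dz in (self.width / 2, -self.width / 2):
                    # Rotate the corner offset about the y axis, then translate.
                    rx = c * dx + s * dz
                    rz = -s * dx + c * dz
                    points.append((self.x + rx, self.y + dy, self.z + rz))
        return points


# A car 10 m straight ahead, aligned with the camera's x axis.
car = Box3D("Car", x=0.0, y=0.0, z=10.0,
            length=4.0, width=2.0, height=1.5, yaw=0.0)
print(car.distance())
print(len(car.corners()))
```

The class label covers classification, the centre and yaw cover localisation relative to the camera, and the three extents define the tightly fitting box.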
The KITTI data set has the following directory structure:
There are two visual cameras and a Velodyne laser scanner.
The two cameras can be used for stereo vision. Overlaying the images from the two cameras looks like this:
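Such an overlay can be produced by blending the two images pixel by pixel. The sketch below assumes two equally sized grayscale images stored as plain lists of rows; the `overlay` function and the tiny sample images are hypothetical and stand in for real camera frames.

```python
def overlay(left, right, alpha=0.5):
    """Blend two equally sized grayscale images given as lists of rows.

    alpha is the weight of the left image; (1 - alpha) is the weight
    of the right image.
    """
    same_rows = len(left) == len(right)
    if not same_rows or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("images must have the same shape")
    return [
        [round(alpha * a + (1 - alpha) * b) for a, b in zip(row_l, row_r)]
        for row_l, row_r in zip(left, right)
    ]


# Tiny made-up 2x3 images standing in for the left and right camera frames.
left = [[0, 100, 200], [50, 50, 50]]
right = [[100, 100, 0], [50, 150, 250]]
blended = overlay(left, right)
print(blended)
```

In a blended stereo pair, nearby objects appear doubled with a larger horizontal offset than distant ones, and that offset is the cue stereo vision uses to estimate depth.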