KITTI 3D object detection data set

Original article can be found here (source): Deep Learning on Medium


Exploring 2D bounding boxes

Example image from camera 2 in the KITTI 3D object detection data set

The KITTI 3D detection data set was developed for learning 3D object detection in a traffic setting. In upcoming articles I will discuss different aspects of this data set.

Motivation for 3D detection

Autonomous robots and vehicles track positions of nearby objects. These can be other traffic participants, obstacles and drivable areas.

For path planning and collision avoidance, merely detecting these objects is not enough. To make informed decisions, the vehicle also needs to know the relative position, relative speed, and size of each object.

The 3D detection task

The task of 3D detection consists of several subtasks. Objects need to be detected, classified, and located relative to the camera. Finally, each object has to be enclosed in a tightly fitting bounding box.

Directory structure

The KITTI data set has the following directory structure:

{training,testing}/image_2/id.png
{training,testing}/image_3/id.png
{training,testing}/label_2/id.txt
{training,testing}/velodyne/id.bin
{training,testing}/calib/id.txt
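
The layout above means that all files of one sample share the same id, so they can be looked up together. A minimal sketch of such a lookup follows. The function name `sample_paths` and the root directory `kitti` are hypothetical, and the code assumes that ids are zero-padded to six digits (for example `000042`), which the listing above does not state.

```python
from pathlib import Path


def sample_paths(root, split, sample_id):
    """Return the files that belong to one sample of the data set.

    Assumption: ids are zero-padded to six digits (e.g. 000042).
    The function only builds paths; it does not check that they exist
    (label_2, for instance, may be absent for the testing split).
    """
    if split not in ("training", "testing"):
        raise ValueError(f"unknown split: {split!r}")
    base = Path(root) / split
    name = f"{sample_id:06d}"
    return {
        "image_2": base / "image_2" / f"{name}.png",
        "image_3": base / "image_3" / f"{name}.png",
        "label_2": base / "label_2" / f"{name}.txt",
        "velodyne": base / "velodyne" / f"{name}.bin",
        "calib": base / "calib" / f"{name}.txt",
    }


# "kitti" is a placeholder for wherever the data set was unpacked.
paths = sample_paths("kitti", "training", 42)
print(paths["image_2"].as_posix())  # kitti/training/image_2/000042.png
```

Keeping the five paths in one dictionary makes it easy to load an image together with its labels, point cloud, and calibration file.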

The data come from two cameras (image_2 and image_3) and a Velodyne laser scanner (velodyne).

The two cameras can be used for stereo vision. Overlaying the images of the two cameras looks like this: