Published: 20 October 2016
The KITTI and Cityscapes datasets have been popular arenas for many years. Their participants include many world-class research institutes, such as Baidu, Samsung, NVIDIA, and NEC, and top universities, such as Stanford and the University of California.
An authoritative public benchmark dataset is important for evaluating the technical competence of a team. The KITTI Vision Benchmark Suite, established by Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago, is a benchmark for vision-based autonomous driving. KITTI includes real images collected from a variety of road scenes, from urban streets to country roads to highways. Each image contains a complex scenario involving, for instance, crowded vehicles and pedestrians, with various levels of occlusion.
KITTI object detection includes vehicle, pedestrian and cyclist detection. KITTI object tracking includes vehicle and pedestrian tracking. KITTI road segmentation includes four individual scenarios: urban unmarked, urban marked, urban multiple marked lanes, and urban road, which combines the former three.
TuSimple swept KITTI’s nine individual tests, ranking first in the world in all of them, while other well-known institutes had previously held only one or two individual top ranks.
The Cityscapes Dataset is published by Mercedes-Benz and provides a segmentation dataset for autonomous driving. It is used to evaluate how well algorithms perform at semantic understanding in an urban setting. Cityscapes covers 50 cities with different scenes, backgrounds and seasons. It has 5,000 finely annotated images, 20,000 coarsely annotated images and 30 object classes.
The Cityscapes benchmark has two subsets: fine and coarse. The former provides 5,000 images with very detailed, pixel-level labeling and the latter provides an extra 20,000 images with coarse-level labeling. TuSimple’s algorithm triumphed under each set of criteria.
In addition to its success in the KITTI and Cityscapes self-driving benchmarks, TuSimple also achieved first place by a landslide in the facial landmark localization benchmarks 300W and AFLW. This technique is mainly used in driver monitoring systems, where locating the driver’s facial landmarks helps detect fatigue or distracted driving.