This README describes the Daimler Pedestrian Path Prediction Benchmark Dataset introduced in the publication
N. Schneider and D. M. Gavrila. Pedestrian Path Prediction with Recursive Bayesian Filters: A Comparative Study
German Conference on Pattern Recognition (GCPR) 2013
This dataset is made freely available to academic and non-academic entities for research purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use, copy, and distribute the data given that you agree:
This dataset contains a collection of 68 sequences with pedestrian trajectory data, including 8-bit PGM stereo image pairs (1176x640 pixels), ground truth annotations and pedestrian detector measurements.
The sequences comprise a total of 19612 stereo image pairs (c0 = left image, c1 = right image); 12489 images contain a (single) manually labelled pedestrian bounding box and 9316 images contain pedestrian detector measurements. For evaluation, a range of 5-50 m has been defined and only bounding boxes with valid disparity have been used, leading to 9135 ground truth and 7908 measurement objects. Sequences are further labelled with event tags and time-to-event (TTE, in frames) values. For stopping pedestrians, the last placement of the foot on the ground at the curbside is labelled TTE = 0. For crossing pedestrians, TTE = 0 marks the closest point to the curbside (before entering the roadway); for pedestrians bending in or starting to walk, it marks the first moment of visually recognizable body turning or leg movement. All frames preceding an event have TTE values > 0, and all frames following the event have TTE values < 0.
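The TTE convention above (positive before the event, zero at the event, negative after it) makes it easy to select frames around an event. A minimal sketch; the "tte" field name is assumed for illustration and is not the dataset's own API:

```python
# Hypothetical helper illustrating the TTE sign convention: frames before
# an event have TTE > 0, the event frame has TTE = 0, later frames TTE < 0.

def frames_in_tte_window(frames, tte_min, tte_max):
    """Select frames whose time-to-event lies within [tte_min, tte_max]."""
    return [f for f in frames if tte_min <= f["tte"] <= tte_max]

# Toy trajectory: 15 frames, event at frame index 10 (TTE = 0).
frames = [{"idx": i, "tte": 10 - i} for i in range(15)]
around_event = frames_in_tte_window(frames, tte_min=-3, tte_max=3)
```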
Sequences are split into training and test data. For installation, simply extract the provided .tar.gz archives. This will create the folders
Data/TestData and
Data/TrainingData
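Extraction can also be scripted, for example with the Python standard library; the archive names below are placeholders for whatever files you downloaded:

```python
# Unpack the dataset archives with the standard-library tarfile module.
# Extraction creates the Data/TestData and Data/TrainingData folders.
import tarfile

def extract_archives(archives, dest="."):
    for archive in archives:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(path=dest)

# e.g. extract_archives(["TrainingData.tar.gz", "TestData.tar.gz"])
```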
Ground truth is obtained by manually labeling the pedestrian bounding box position in the left camera image (top-left pixel is (0,0)) and computing the median disparity over the rough upper pedestrian body area. Pedestrian positions in the vehicle coordinate system are derived from the median disparity, the pedestrian footpoint in the bounding box, the camera parameters, the vehicle ego motion and the camera-to-vehicle homography matrix. The transformed positions are fitted with a curvilinear model, and 3D ground truth locations are obtained by longitudinal projection onto the fitted curve. Ego motion compensation is then removed and the ground truth disparity is computed by projecting the camera-pedestrian distance into the image. Ground truth is provided in Database format in the file
<SequenceFolder>/LabelData/gt.db
Specification of the Database format is given below. Timestamps (TimestampNs), median disparity (mediandisp), a flag for valid disparities (ok3d), TTE values (turning/stopped/starting/atcurb), footpoints in vehicle coordinates (foot_xw/foot_xz), footpoints in image pixels (foot_u/foot_v) and fitted ground truth (fitted_xw/fitted_zw/fitted_disp) are provided as additional key-value pair attributes.
Measurements are obtained by a state-of-the-art HOG/linSVM pedestrian detector, given regions of interest supplied by an obstacle detection component using dense stereo data. The resulting bounding boxes are used to calculate a median disparity over the upper pedestrian body area based on the disparity maps. The measurement vector z = (u, d) is derived from the central lateral position of the bounding box and this median disparity value. Measurements are provided in Database format in the file
<SequenceFolder>/LabelData/meas.db
Specification of the Database format is given below. Median disparity (mediandisp), footpoints in vehicle coordinates (foot_xw/foot_xz) and footpoints in image pixels (foot_u/foot_v) are provided as additional key-value pair attributes.
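How a measurement z = (u, d) is formed from a detector bounding box can be sketched as follows; function and argument names are illustrative, not the dataset's own API:

```python
# Sketch: build z = (u, d) from a bounding box and disparity samples
# taken over the upper pedestrian body area.
import statistics

def measurement_from_box(min_x, max_x, upper_body_disparities):
    """z = (u, d): u is the central lateral bounding box position in
    pixels, d the median disparity over the upper body area."""
    u = 0.5 * (min_x + max_x)
    d = statistics.median(upper_body_disparities)
    return (u, d)

z = measurement_from_box(100, 140, [15.8, 16.1, 16.0, 15.9, 16.3])
```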
Calibration/camera_parameters.txt
The camera is mounted inside the vehicle below the rear-view mirror. The world coordinate system origin lies below the rear axle of the vehicle on the ground plane; x points to the right, y points up and z points in the driving direction. The camera is 1.2 m above the ground and 1.9 m behind the front of the vehicle; the distance of the camera to the rear axle is 1.8 m.
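The mounting geometry above implies a simple mapping from disparity measurements to vehicle coordinates. A sketch assuming a rectified pinhole stereo rig with no camera tilt; the focal length f (pixels), baseline B (m) and principal point (u0, v0) come from Calibration/camera_parameters.txt, and the numeric values used below are placeholders, not the real calibration:

```python
# Stereo triangulation plus camera-to-vehicle mapping (tilt ignored).
CAM_HEIGHT = 1.2        # camera height above the ground plane (m)
CAM_TO_REAR_AXLE = 1.8  # longitudinal camera offset from the origin (m)

def camera_point(u, v, d, f, B, u0, v0):
    """Triangulate an image point (u, v) with disparity d into camera
    coordinates (X right, Y down, Z along the optical axis)."""
    Z = f * B / d
    X = (u - u0) * Z / f
    Y = (v - v0) * Z / f
    return X, Y, Z

def to_vehicle(X, Y, Z):
    """Map camera coordinates to the vehicle system (origin below the
    rear axle on the ground; x right, y up, z in driving direction)."""
    return X, CAM_HEIGHT - Y, Z + CAM_TO_REAR_AXLE

# Placeholder calibration: f = 1200 px, B = 0.3 m, principal point (588, 320).
X, Y, Z = camera_point(u=612, v=320, d=18.0, f=1200.0, B=0.3, u0=588.0, v0=320.0)
xv, yv, zv = to_vehicle(X, Y, Z)
```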
<SequenceFolder>/TxtVehicle/vehicle_<img_idx>.txt
Each file contains the vehicle velocity (m/s), the yaw rate (m/s^2) and a time-stamp (milliseconds). The file name includes an index corresponding to image indices.
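Reading such a record is straightforward; the sketch below assumes the three values appear as whitespace-separated numbers in the order given above, and should be adjusted if the actual file layout differs:

```python
# Minimal parser for the contents of a vehicle_<img_idx>.txt file
# (assumed layout: velocity, yaw rate, timestamp, whitespace-separated).

def parse_vehicle_record(text):
    velocity, yaw_rate, timestamp_ms = map(float, text.split()[:3])
    return {"velocity": velocity, "yaw_rate": yaw_rate,
            "timestamp_ms": timestamp_ms}

record = parse_vehicle_record("8.5 0.02 1234567")
```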
Ground truth and pedestrian detector measurements are provided in the ASCII Database format. Specification is given below.
| Token | Description |
| --- | --- |
| : | sequence separator, initiates an image sequence entry |
| seq_id | string identifier describing the sequence |
| absolute_path | path to the directory containing the sequence |
| nr_images | length of the sequence, or -1 if unknown |
| ; | image (frame) separator, initiates an image frame entry |
| image_name | image file name |
| image_width image_height | image size |
| object_class nr_of_objects | default object class; number of objects, or -1 if unknown |
| # [object_class] | 2D object separator, initiates an object entry in 2D (image) coordinates; the optional object class overrides the default entry above. Object classes: 0 = fully visible pedestrian, 1 = bicyclist, 2 = motorcyclist, 10 = pedestrian group, 255 = partially visible pedestrian, bicyclist or motorcyclist |
| obj_id [unique_id] | object ID identifying trajectories of the same physical object; optional additional ID unique to each object entry |
| <confidence_1 ... confidence_n> | a vector of up to 16 float values |
| min_x min_y max_x max_y | 2D bounding box coordinates (integer values) |
| nr_contour_points | number of contour points to follow |
| x_1 y_1 ... x_n y_n | list of contour points |
| % attr_key attr_value | optional object attributes given as key-value pairs |
| ... | (end of 2D object entry, more objects to follow) |
| § [object_class] | 3D object separator, initiates an object entry in 3D (world) coordinates; the optional object class overrides the default entry above |
| obj_id [unique_id] | object ID identifying trajectories of the same physical object; optional additional ID unique to each object entry |
| <confidence_1 ... confidence_n> | a vector of up to 16 float values |
| obj_min_x obj_min_y obj_min_z | 3D bounding box coordinates (float values) |
| obj_max_x obj_max_y obj_max_z | |
| % attr_key attr_value | optional object attributes given as key-value pairs |
| ... | (end of 3D object entry, more objects to follow) |
| ... | (end of image frame entry, more images to follow) |
| ... | (end of image sequence entry, more sequences to follow) |
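The format can be read with a small stand-alone parser. The sketch below handles sequence (:), frame (;) and 2D object (#) entries with bounding boxes and % attributes, and skips confidence vectors, contour points and 3D (§) bodies; it is illustrative only, and the provided evaluation/DBFileParser remains the authoritative reader:

```python
# Minimal, hypothetical sketch parser for the ASCII Database format.
def parse_db(lines):
    sequences = []
    seq = frame = obj = None
    it = iter(l.strip() for l in lines if l.strip())
    for line in it:
        if line == ":":                          # image sequence entry
            seq = {"seq_id": next(it), "path": next(it),
                   "nr_images": int(next(it)), "frames": []}
            sequences.append(seq)
        elif line == ";":                        # image frame entry
            frame = {"image": next(it),
                     "size": tuple(map(int, next(it).split())),
                     "objects": []}
            next(it)                             # object_class nr_of_objects
            seq["frames"].append(frame)
        elif line.startswith("#"):               # 2D object entry
            obj = {"class": int(line[1:]) if line[1:].strip() else None,
                   "id": next(it).split()[0], "attrs": {}}
            next(it)                             # confidence vector, skipped
            obj["bbox"] = tuple(map(int, next(it).split()))
            if int(next(it)) > 0:                # nr_contour_points
                next(it)                         # contour points, skipped
            frame["objects"].append(obj)
        elif line.startswith("§"):               # 3D object entry, skipped
            obj = {"attrs": {}}                  # its attrs are discarded
            for _ in range(4):                   # obj_id, confidences, 2 bbox lines
                next(it)
        elif line.startswith("%"):               # key-value attribute
            key, value = line[1:].split(None, 1)
            obj["attrs"][key] = value
    return sequences

sample = """\
:
seq0001
/data/seq0001
1
;
img_000.pgm
1176 640
0 1
# 0
42
<0.9>
100 50 140 170
0
% mediandisp 16.0
"""
sequences = parse_db(sample.splitlines())
```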
Authors who wish to evaluate pedestrian path prediction on this dataset
are encouraged to follow the benchmarking procedure and criteria as
detailed in the publication given above. Evaluation methodology and
objective function for process noise parameter optimization are
provided as Matlab functions in the directory
evaluation
See
evaluation/example.m
for details on how to use the functions.
Note that this software is provided "as is" without warranty of any kind.
The original authors would like to hear about other publications that make use of the benchmark data set in order to include corresponding references on the benchmark website.
For convenience, a Matlab parser for the Database format is provided in the directory
evaluation/DBFileParser
See
evaluation/DBFileParser/example.m
for details on how to interpret the Database format.
Please direct questions regarding the dataset and benchmarking procedure to Prof. Dr. Dariu Gavrila