Daimler Pedestrian Path Prediction Benchmark Dataset


Nicolas Schneider and Dariu M. Gavrila
August 29, 2013
(C) 2013 by Daimler AG

Contents

  1. Introduction
  2. License Agreement
  3. Dataset
  4. Ground Truth
  5. Measurements
  6. Camera Parameters
  7. Vehicle Data
  8. Ground Truth and Measurement File Format
  9. Benchmarking Procedure
  10. Ground Truth Parser (Matlab)
  11. Contact

Introduction

This README describes the Daimler Pedestrian Path Prediction Benchmark Dataset introduced in the publication

N. Schneider and D. M. Gavrila. Pedestrian Path Prediction with Recursive Bayesian Filters: A Comparative Study.
German Conference on Pattern Recognition (GCPR), 2013.

This dataset contains a collection of sequences with pedestrian trajectory data including stereo image pairs, ground truth annotations and pedestrian detector measurements. It is made publicly available to academic and non-academic entities for research purposes.


License Agreement

This dataset is made freely available to academic and non-academic entities for research purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use, copy, and distribute the data given that you agree:

  1. That the dataset comes "AS IS", without express or implied warranty. Although every effort has been made to ensure accuracy, Daimler does not accept any responsibility for errors or omissions.
  2. That you include a reference to the above publication in any published work that makes use of the dataset.
  3. That if you have altered the content of the dataset or created derivative work, prominent notices are made so that any recipients know that they are not receiving the original data.
  4. That you may not use or distribute the dataset or any derivative work for commercial purposes such as, for example, licensing or selling the data, or using the data with the purpose of procuring a commercial gain.
  5. That this license agreement is retained with all copies of the dataset.
  6. That all rights not expressly granted to you are reserved by Daimler.

Dataset

This dataset contains a collection of 68 sequences with pedestrian trajectory data, including 8-bit PGM stereo image pairs (1176x640 pixels), ground truth annotations and pedestrian detector measurements.

The sequences comprise a total of 19612 stereo image pairs (c0 = left image, c1 = right image); 12489 images contain a (single) manually labelled pedestrian bounding box and 9316 images contain pedestrian detector measurements. For evaluation, a range of 5-50 m has been defined and only bounding boxes with valid disparity have been used, leading to 9135 ground truth and 7908 measurement objects. Sequences are further labeled with event tags and time-to-event (TTE, in frames) values. For stopping pedestrians, the last placement of the foot on the ground at the curbside is labeled as TTE = 0. For crossing pedestrians, TTE = 0 marks the point closest to the curbside (before entering the roadway); for pedestrians bending in and starting to walk, TTE = 0 marks the first moment of visually recognizable body turning or leg movement. All frames preceding an event have TTE values > 0, and all frames following the event have TTE values < 0.

Sequences are split into training and test data. For installation, simply extract the provided .tar.gz archives. This will create the folders

Data/TestData
and
Data/TrainingData
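
As a quick check of an extracted sequence, a stereo pair can be inspected with standard Matlab image I/O. The following is a minimal sketch only; '<SequenceFolder>' and the image file names are placeholders to be replaced by the actual names found on disk.

  % Minimal sketch: load and display one rectified stereo pair from an
  % extracted sequence. Folder and file names are placeholders.
  seqDir = fullfile('Data', 'TrainingData', '<SequenceFolder>');
  imL = imread(fullfile(seqDir, '<image_name_c0>.pgm'));  % c0 = left camera, 8-bit, 1176x640
  imR = imread(fullfile(seqDir, '<image_name_c1>.pgm'));  % c1 = right camera
  figure; imagesc([imL, imR]); colormap(gray); axis image off;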


Ground Truth

Ground truth is obtained by manually labeling the pedestrian bounding box position in the left camera image (top-left pixel is (0,0)) and computing the median disparity over the rough upper pedestrian body area. Pedestrian positions in the vehicle coordinate system are derived from the median disparity, the pedestrian footpoint in the bounding box, the camera parameters, the vehicle ego motion and the camera-to-vehicle homography matrix. The transformed positions are fitted with a curvilinear model, and 3D ground truth locations are obtained by longitudinal projection onto the fitted curve. The ego-motion compensation is then removed, and the ground truth disparity is computed by projecting the camera-pedestrian distance into the image. Ground truth is provided in Database format in the file

<SequenceFolder>/LabelData/gt.db
Specification of the Database format is given below. Timestamps (TimestampNs), median disparity (mediandisp), flag for valid disparities (ok3d), TTE values (turning/stopped/starting/atcurb), footpoints in vehicle coordinates (foot_xw/foot_xz), footpoints in image pixels (foot_u/foot_v) and fitted ground truth (fitted_xw/fitted_zw/fitted_disp) are provided as additional key-value pair attributes.
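
To illustrate the curve-fitting step described above, the sketch below fits a low-order polynomial to ego-motion-compensated footpoints. This is only a crude stand-in under assumed variable names (footX, footZ), not the actual curvilinear model used to generate the ground truth.

  % Crude stand-in for the curvilinear fit. footX and footZ are assumed
  % per-frame footpoint coordinates (lateral/longitudinal) in vehicle
  % coordinates after ego-motion compensation.
  p       = polyfit(footZ, footX, 3);   % assumed 3rd-order polynomial model
  fittedX = polyval(p, footZ);          % longitudinal projection onto the fitted curve
  fittedZ = footZ;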

Measurements

Measurements are obtained by a state-of-the-art HOG/linSVM pedestrian detector, given regions of interest supplied by an obstacle detection component using dense stereo data. The resulting bounding boxes are used to calculate a median disparity over the upper pedestrian body area based on the disparity maps. The measurement vector z = (u, d) is derived from the central lateral position of the bounding box and this median disparity value. Measurements are provided in Database format in the file

<SequenceFolder>/LabelData/meas.db
Specification of the Database format is given below. Median disparity (mediandisp), footpoints in vehicle coordinates (foot_xw/foot_xz) and footpoints in image pixels (foot_u/foot_v) are provided as additional key-value pair attributes.
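
For illustration, a measurement z = (u, d) could be assembled from a detection and a disparity map roughly as follows. The exact definition of the upper body area and the handling of invalid disparities are assumptions here, not the dataset's own implementation.

  % Sketch: form z = (u, d) from a detector bounding box
  % bbox = [min_x min_y max_x max_y] (0-based image coordinates) and a
  % dense disparity map dispMap of the left image. Assumptions: the upper
  % body area is the upper half of the box; non-positive disparity = invalid.
  u        = 0.5 * (bbox(1) + bbox(3));                        % central lateral position (pixels)
  rowRange = (bbox(2) : round(0.5 * (bbox(2) + bbox(4)))) + 1; % upper half; +1 for Matlab indexing
  colRange = (bbox(1) : bbox(3)) + 1;
  patch    = dispMap(rowRange, colRange);
  d        = median(patch(patch > 0));                         % median over valid disparities
  z        = [u; d];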


Camera Parameters

Provided images have been rectified. Camera parameters necessary to project from 2D to 3D are described in the file:
Calibration/camera_parameters.txt

The camera is mounted inside the vehicle below the rear-view mirror. The origin of the world coordinate system lies below the rear axle of the vehicle on the ground plane; x points to the right, y points up and z points in the driving direction. The camera is 1.2 m above the ground and 1.9 m behind the front of the vehicle. The distance of the camera to the rear axle is 1.8 m.
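
For orientation, back-projecting an image point with disparity into vehicle coordinates for a rectified rig could look roughly as follows. The parameter names (focalLength, baseline, u0, v0) are assumptions standing in for the values in Calibration/camera_parameters.txt, and the transfer to the vehicle frame below ignores any camera tilt; the dataset itself uses a camera-to-vehicle homography for this step.

  % Sketch: back-project an image point (u, v) with disparity d into camera
  % coordinates, then shift into the vehicle frame using the mounting
  % geometry given above. Parameter names are placeholders.
  Zc = focalLength * baseline / d;      % depth from disparity
  Xc = (u - u0) * Zc / focalLength;     % right of the optical axis
  Yc = (v - v0) * Zc / focalLength;     % below the optical axis (image v grows downwards)
  xVeh = Xc;                            % vehicle x: to the right
  yVeh = 1.2 - Yc;                      % vehicle y: up; camera mounted 1.2 m above ground
  zVeh = 1.8 + Zc;                      % vehicle z: driving direction; camera assumed 1.8 m ahead of the rear axle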


Vehicle Data

Vehicle velocity and yaw-rate measurements are available at the camera cycle time from on-board sensors. The measurements for each image can be found in the 'TxtVehicle' directory in the sequence folders:
<SequenceFolder>/TxtVehicle/vehicle_<img_idx>.txt

Each file contains the vehicle velocity (m/s), the yaw rate (rad/s) and a timestamp (milliseconds). The file name includes an index corresponding to the image indices.
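
Reading one of these files in Matlab is straightforward. The variable names below are assumptions, and the exact layout of the values in the file may differ slightly.

  % Sketch: read velocity, yaw rate and timestamp for one image of a
  % sequence. '<SequenceFolder>' and '<img_idx>' are placeholders taken
  % from the sequence folder and image file names.
  vehFile   = fullfile('<SequenceFolder>', 'TxtVehicle', 'vehicle_<img_idx>.txt');
  vals      = dlmread(vehFile);
  velocity  = vals(1);   % m/s
  yawRate   = vals(2);
  timestamp = vals(3);   % milliseconds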


Ground Truth and Measurement File Format

Ground truth and pedestrian detector measurements are provided in the ASCII Database format. The specification is given below; a minimal parsing sketch follows after the format listing.

:
sequence separator, initiates an image sequence entry
seq_id
string identifier describing the sequence
absolute_path
path to directory containing sequence
nr_images
length of sequence, or -1 if unknown

  ;
image (frame) separator, initiates an image frame entry
  image_name
image file name
  image_width image_height
image size
  object_class nr_of_objects
default object class; number of objects, or -1 if unknown

    # [object_class]
2D object separator, initiates an object entry in 2D (image) coordinates;
optional object class, overriding the default entry above
object class:
0=fully-visible pedestrian
1=bicyclist
2=motorcyclist
10=pedestrian group
255=partially visible pedestrian, bicyclist, motorcyclist
    obj_id [unique_id]
object ID to identify trajectories of the same physical object;
optional additional ID unique to each object entry
    <confidence_1 ... confidence_n>
a vector of up to 16 float values
    min_x min_y max_x max_y
2D bounding box coordinates (integer values)
    nr_contour_points
number of contour points to follow
    x_1 y_1 ... x_n y_n
list of contour points
    % attr_key attr_value
optional object attributes given as key-value pairs
    ...
(end of 2D object entry, more objects to follow)

    § [object_class]
3D object separator, initiates an object entry in 3D (world) coordinates;
optional object class, overriding the default entry above
    obj_id [unique_id]
object ID to identify trajectories of the same physical object;
optional additional ID unique to each object entry
    <confidence_1 ... confidence_n>
a vector of up to 16 float values
    obj_min_x obj_min_y obj_min_z
    obj_max_x obj_max_y obj_max_z
3D bounding box coordinates (float values)

    % attr_key attr_value
optional object attributes given as key-value pairs
    ...
(end of 3D object entry, more objects to follow)
  ...
(end of image frame entry, more images to follow)
...
(end of image sequence entry, more sequences to follow)
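
A complete Matlab parser ships with the dataset (see "Ground Truth Parser (Matlab)" below). Purely for illustration, a heavily simplified reader, assuming one 2D object per frame, no 3D entries, a line-by-line layout exactly as specified above, and numeric attribute values, might look like this:

  % Heavily simplified .db reader (illustration only; use the provided
  % parser in evaluation/DBFileParser for real work).
  function frames = read_db_sketch(dbFile)
  fid    = fopen(dbFile, 'r');
  frames = struct('name', {}, 'bbox', {}, 'attr', {});
  while true
      line = fgetl(fid);
      if ~ischar(line), break; end
      tok = strtrim(line);
      if isempty(tok), continue; end
      switch tok(1)
          case ':'                                  % sequence entry: skip seq_id, path, nr_images
              fgetl(fid); fgetl(fid); fgetl(fid);
          case ';'                                  % image entry: keep the name, skip size and class/count
              frames(end+1).name = strtrim(fgetl(fid)); %#ok<AGROW>
              frames(end).attr   = struct();
              fgetl(fid); fgetl(fid);
          case '#'                                  % 2D object entry
              fgetl(fid);                           % obj_id [unique_id]
              fgetl(fid);                           % confidence vector
              frames(end).bbox = sscanf(fgetl(fid), '%d %d %d %d')';
              nContour = sscanf(fgetl(fid), '%d');
              if nContour > 0, fgetl(fid); end      % skip contour points
          case '%'                                  % key-value attribute
              kv = strsplit(strtrim(tok(2:end)));
              frames(end).attr.(kv{1}) = str2double(kv{2});
      end
  end
  fclose(fid);
  end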



Benchmarking Procedure

Authors who wish to evaluate pedestrian path prediction on this dataset are encouraged to follow the benchmarking procedure and criteria as detailed in the publication given above. Evaluation methodology and objective function for process noise parameter optimization are provided as Matlab functions in the directory

evaluation

See

evaluation/example.m

for details on how to use the functions.

Note that this software is provided "as is" without warranty of any kind.
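
As a rough orientation only (this is not the provided evaluation code), a toy computation of mean lateral localization error as a function of TTE could look as follows; predX, gtX and tte are assumed column vectors of predicted lateral position, ground truth lateral position and time-to-event collected over all test frames.

  % Toy illustration: mean absolute lateral error per TTE value.
  tteBins = 30:-1:-10;                          % arbitrary TTE range for illustration
  meanErr = nan(size(tteBins));
  for i = 1:numel(tteBins)
      sel = (tte == tteBins(i));
      if any(sel)
          meanErr(i) = mean(abs(predX(sel) - gtX(sel)));
      end
  end
  plot(tteBins, meanErr);
  xlabel('TTE [frames]'); ylabel('mean lateral error [m]');
  set(gca, 'XDir', 'reverse');                  % time runs left-to-right towards the event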

The original authors would like to hear about other publications that make use of the benchmark data set in order to include corresponding references on the benchmark website.


Ground Truth Parser (Matlab)

For convenience, a Matlab parser for the Database format is provided in the directory

evaluation/DBFileParser
See
evaluation/DBFileParser/example.m

for details on how to interpret the Database format.

Note that this software is provided "as is" without warranty of any kind.


Contact

Please direct questions regarding the dataset and the benchmarking procedure to Prof. Dr. Dariu Gavrila.