Long-Term Visual Object Tracking Benchmark

We propose a new long-video dataset, Track Long and Prosper (TLP), and a benchmark for visual object tracking. The dataset consists of 50 HD videos from real-world scenarios with a total duration of over 400 minutes (676K frames). This makes it more than 20 times larger in average duration per sequence, and more than 8 times larger in total duration, than existing generic datasets for visual tracking. We benchmark 17 state-of-the-art trackers on the dataset and rank them by tracking accuracy and run-time speed. To the best of our knowledge, the TLP benchmark is the first large-scale evaluation of state-of-the-art trackers that focuses on long-duration tracking, and it makes a strong case for much-needed research in this direction.

Read paper

Download

TLP V2 and TinyTLP V2 released!

Please visit this page for more information.

The TLP dataset consists of 50 long HD sequences (676,431 frames in total). Each sequence contains a single object to be tracked, marked in the first frame. TinyTLP is a challenging high-resolution short-term dataset for visual tracking, derived from TLP. It consists of the first 600 frames (20 sec) of each sequence of the TLP dataset. The length of 20 sec is chosen to match the average per-sequence length of the OTB dataset. We propose TinyTLP as a short-term counterpart for comparison, to highlight the challenges specific to long-term tracking.

Annotation Format

Per-frame bounding box annotations are provided for the target object in each sequence in the following format:
[frameID, xmin, ymin, width, height, isLost]

frameID - Frame that this annotation represents.
xmin - Top left x-coordinate of the bounding box.
ymin - Top left y-coordinate of the bounding box.
width - Width of annotation box.
height - Height of annotation box.
isLost - If 1, the target object is not visible at all; else 0.
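
As an illustration of the format above, here is a minimal Python sketch that parses annotation lines into named fields. The names `Annotation`, `parse_annotation_line`, and `load_annotations` are hypothetical helpers, not part of the dataset. The sketch assumes the six fields are separated by commas and/or whitespace, which should be checked against the downloaded files.

```python
import re
from typing import List, NamedTuple


class Annotation(NamedTuple):
    frame_id: int
    xmin: float    # top-left x-coordinate
    ymin: float    # top-left y-coordinate
    width: float
    height: float
    is_lost: bool  # True when the target is not visible at all


def parse_annotation_line(line: str) -> Annotation:
    """Parse one line: frameID, xmin, ymin, width, height, isLost."""
    # Assumption: fields are separated by commas and/or whitespace.
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    frame_id, xmin, ymin, width, height, is_lost = fields
    return Annotation(int(frame_id), float(xmin), float(ymin),
                      float(width), float(height), int(is_lost) == 1)


def load_annotations(path: str) -> List[Annotation]:
    """Read every non-empty line of an annotation file."""
    with open(path) as fh:
        return [parse_annotation_line(line) for line in fh if line.strip()]
```

Frames with `is_lost` set can then be skipped or handled separately when comparing against tracker output.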

Directory Structure

For easy integration of trackers, each sequence has the same directory structure as in OTB. For example, the sequence CarChase1 has the following directory structure once downloaded from the link below and unzipped:

TLP
 ├──Bharatanatyam/
 ├──Drone1/
 ├──CarChase1/
       ├── groundtruth_rect.txt
       └── img/
            ├── 00001.jpg
            ├── 00002.jpg
            ├── 00003.jpg
            ├── 00004.jpg
            ....

The annotation format and directory structure of TinyTLP sequences are exactly the same as those of the TLP dataset.
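
Given this layout, a tracker wrapper only needs the sequence directory to find its frames. The sketch below is one possible way to list them in order; `frame_paths` is a hypothetical helper, and it assumes every file in `img/` is a `.jpg` with a purely numeric name, as in the listing above. Passing `limit=600` selects the first 600 frames of a sequence, which mirrors how TinyTLP is described as being derived from TLP.

```python
from pathlib import Path
from typing import List, Optional, Union


def frame_paths(sequence_dir: Union[str, Path],
                limit: Optional[int] = None) -> List[Path]:
    """Return the frame images of one sequence, ordered by frame number."""
    img_dir = Path(sequence_dir) / "img"
    # Sort numerically so the order does not depend on zero-padding width.
    # Assumption: every .jpg stem is an integer frame number (e.g. 00001).
    frames = sorted(img_dir.glob("*.jpg"), key=lambda p: int(p.stem))
    return frames if limit is None else frames[:limit]
```

For example, `frame_paths("TLP/CarChase1")` would list all frames of that sequence, and the annotations for the same sequence live next to `img/` in `groundtruth_rect.txt`.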

Tracking Results

Results of all 17 evaluated trackers can be downloaded from the following link. Each tracker directory contains 50 space-separated text files, one for each sequence of the TLP dataset, in the format [xmin ymin width height].
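
A result line can be parsed and compared with a ground-truth box as in the sketch below. Intersection over union (IoU) is used here only as a common overlap measure and is an assumption of this example, not a statement of the exact evaluation protocol; `parse_result_line` and `iou` are hypothetical helper names. Both boxes are taken to be in (xmin, ymin, width, height) form.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, width, height)


def parse_result_line(line: str) -> Box:
    """Parse one space-separated result line: xmin ymin width height."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    x, y, w, h = (float(f) for f in fields)
    return (x, y, w, h)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlap rectangle, clamped at zero.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```

When pairing result lines with annotations, it is worth verifying that both files have the same number of lines and deciding how frames marked isLost should be treated.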

TLP Dataset

Collection of all 50 sequences of the TLP dataset (39 GB)

TLP Sequences

Link to download individual sequences of TLP dataset

TinyTLP Dataset

Collection of all 50 sequences of the TinyTLP dataset (1.7 GB)

TinyTLP Sequences

Link to download individual sequences of TinyTLP dataset

TLPattr Dataset

Collection of 90 sequences in total, 15 sequences for each attribute (3.2 GB)

Tracking Results

Results of all 17 evaluated trackers on the TLP dataset (26 MB)

Citation

If you use any of our datasets or find our work useful in your research, please cite:

@inproceedings{moudgil2018long,
    title={Long-term Visual Object Tracking Benchmark},
    author={Moudgil, Abhinav and Gandhi, Vineet},
    booktitle={Asian Conference on Computer Vision},
    pages={629--645},
    year={2018},
    organization={Springer}
}

Contact

For more information or help, please get in touch with us via email.

Abhinav Moudgil

MS by Research, IIIT Hyderabad

(abhinav.moudgil)@research.iiit.ac.in

Vineet Gandhi

Assistant Professor, IIIT Hyderabad

(vgandhi)@iiit.ac.in