We propose a new long video dataset (called Track Long and Prosper - TLP) and benchmark for visual object tracking. The dataset consists of 50 HD videos from real-world scenarios, encompassing a duration of over 400 minutes (676K frames), making it more than 20-fold larger in average duration per sequence and more than 8-fold larger in total covered duration than existing generic datasets for visual tracking. We benchmark 17 state-of-the-art trackers on the dataset and rank them according to tracking accuracy and run-time speed. To the best of our knowledge, the TLP benchmark is the first large-scale evaluation of state-of-the-art trackers that focuses on the long-duration aspect, and it makes a strong case for much-needed research efforts in this direction.
Please visit this page for more information.
[frameID, xmin, ymin, width, height, isLost]

frameID - Frame that this annotation represents.
xmin - Top left x-coordinate of the bounding box.
ymin - Top left y-coordinate of the bounding box.
width - Width of annotation box.
height - Height of annotation box.
isLost - If 1, the target object is not visible at all; else 0.

A sequence, for example CarChase1, would have the following directory structure when downloaded from the link below and unzipped:

TLP
├── Bharatanatyam/
├── Drone1/
├── CarChase1/
│   ├── groundtruth_rect.txt
│   └── img/
│       ├── 00001.jpg
│       ├── 00002.jpg
│       ├── 00003.jpg
│       ├── 00004.jpg
│       ....

The annotation format and directory structure of TinyTLP sequences are exactly the same as those of the TLP dataset.
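As a rough illustration of how the annotation records and directory layout described above might be consumed, here is a minimal Python sketch. It is not part of the official TLP tooling. The names Annotation, parse_annotation_line, load_annotations, and frame_path are hypothetical, and two things are assumed rather than stated in the text: that fields in groundtruth_rect.txt are separated by commas and/or whitespace, and that frameID N corresponds to the image img/<N zero-padded to five digits>.jpg.

```python
import re
from pathlib import Path
from typing import List, NamedTuple


class Annotation(NamedTuple):
    """One record: [frameID, xmin, ymin, width, height, isLost]."""
    frame_id: int
    xmin: float
    ymin: float
    width: float
    height: float
    is_lost: bool


def parse_annotation_line(line: str) -> Annotation:
    """Parse a single annotation record into an Annotation tuple."""
    # Assumption: fields are separated by commas and/or whitespace.
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    frame_id, xmin, ymin, width, height, is_lost = fields
    return Annotation(
        int(frame_id),
        float(xmin),
        float(ymin),
        float(width),
        float(height),
        int(is_lost) == 1,  # isLost is 1 when the target is not visible
    )


def load_annotations(sequence_dir: str) -> List[Annotation]:
    """Read groundtruth_rect.txt from a sequence directory, skipping blank lines."""
    path = Path(sequence_dir) / "groundtruth_rect.txt"
    with path.open() as fh:
        return [parse_annotation_line(line) for line in fh if line.strip()]


def frame_path(sequence_dir: str, frame_id: int) -> Path:
    """Build the image path for a frame.

    Assumption: frameID N maps to img/<N zero-padded to 5 digits>.jpg,
    as suggested by the file names 00001.jpg, 00002.jpg, ...
    """
    return Path(sequence_dir) / "img" / f"{frame_id:05d}.jpg"
```

With a sequence unzipped as shown above, load_annotations("TLP/CarChase1") would return one Annotation per record, and frames with is_lost set can be skipped when the target is not visible.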
[xmin ymin width height], corresponding to each sequence of the TLP dataset.
If you use any of our datasets or find our work useful in your research, please cite:
@inproceedings{moudgil2018long,
  title={Long-term Visual Object Tracking Benchmark},
  author={Moudgil, Abhinav and Gandhi, Vineet},
  booktitle={Asian Conference on Computer Vision},
  pages={629--645},
  year={2018},
  organization={Springer}
}
For more information or help, please get in touch with us via email.
(abhinav.moudgil)@research.iiit.ac.in
(vgandhi)@iiit.ac.in