Group News

NOT-156: Night Object Tracking using Low-light and Thermal Infrared:

From Multi-modal Common-aperture Camera to Benchmark Datasets

Abstract

   NOT-156 is a novel low-light visible and thermal infrared (LOL-T) multi-modal benchmark dataset for night object tracking. All videos in the NOT-156 dataset were acquired from real night-time environments using our well-designed common-aperture LOL-T camera. Many scenes were captured under extreme low-illumination conditions (below 10^-2 lux), where the LOL-T sensor can still capture the target while an RGB sensor cannot. The dataset consists of 156 video sequences with a total of 170k annotated frames, covering a variety of low-illumination night scenes such as dark rooms, streets, and corridors. Compared with existing datasets, NOT-156 offers more comprehensive and distinctive attributes (thermal variation, noise, high-illumination overexposure, etc.). We believe that NOT-156 has great potential for the application and development of night vision.

1. The NOT-156 dataset
  
   The dataset was captured at Wuhan University and the surrounding areas using our common-aperture LOL-T camera, which is meticulously engineered to capture synchronized image pairs within the same field of view. All frames in the dataset are cropped to 640×512 and annotated precisely according to the OTB format. An overview of the dataset is provided in Fig. 1.
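As a convenience for new users, OTB-format ground truth can be read with a few lines of code. The sketch below assumes the common OTB convention of one axis-aligned box per line as "x,y,w,h" (comma- or whitespace-separated); the function name and file layout are illustrative, not part of the NOT-156 release.

```python
def load_otb_groundtruth(path):
    """Parse an OTB-style ground-truth file into a list of (x, y, w, h) boxes.

    Assumes one bounding box per line, with fields separated by commas
    or whitespace, as is conventional for OTB annotations.
    """
    boxes = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Normalize separators, then take the first four numeric fields
            parts = line.replace(",", " ").split()
            x, y, w, h = (float(v) for v in parts[:4])
            boxes.append((x, y, w, h))
    return boxes
```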



Fig. 1. Overview of the NOT-156 dataset. (a) The common-aperture LOL-T camera. (b) Examples with specific subcategories and representative attributes of NOT-156.

2. Experiment

  Table 1 presents the test results of multi-modal object tracking methods on the NOT-156 dataset, reported as success rate (SR) and precision rate (PR).

Table 1. Test results of multi-modal object tracking methods on the NOT-156 dataset.

Method  ADRNet[1]  MANet[2]  APFNet[3]  DFAT[4]  mfDiMP[5]  ViPT[6]
SR      0.481      0.422     0.496      0.529    0.548      0.654
PR      0.627      0.560     0.651      0.654    0.706      0.788
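For readers reproducing these numbers, SR and PR in OTB-style evaluation are conventionally the fraction of frames whose predicted box overlaps the ground truth above an IoU threshold, and the fraction whose center location error falls below a pixel threshold (commonly 20 px). The sketch below implements these common definitions; it is an assumption-labeled illustration, not the paper's exact evaluation code, and the thresholds are the conventional defaults.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def center_error(a, b):
    """Euclidean distance between box centers, in pixels."""
    acx, acy = a[0] + a[2] / 2, a[1] + a[3] / 2
    bcx, bcy = b[0] + b[2] / 2, b[1] + b[3] / 2
    return ((acx - bcx) ** 2 + (acy - bcy) ** 2) ** 0.5

def success_rate(preds, gts, iou_thresh=0.5):
    """Fraction of frames with IoU above the threshold."""
    return sum(iou(p, g) > iou_thresh for p, g in zip(preds, gts)) / len(gts)

def precision_rate(preds, gts, px_thresh=20.0):
    """Fraction of frames with center error below the pixel threshold."""
    return sum(center_error(p, g) < px_thresh for p, g in zip(preds, gts)) / len(gts)
```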



3. Download

  We hope that the release of the NOT-156 dataset will promote the development of night vision. You can click the links below to download the data:
NOT-156: Night Object Tracking using Low-light and Thermal Infrared
● Baidu Drive: https://y2npxzyvcl9swgmt.mikecrm.com/b5cp5ui
● Google Drive: https://y2npxzyvcl9swgmt.mikecrm.com/KzDVXHg


4. Copyright

  The copyright belongs to the Intelligent Data Extraction, Analysis and Applications of Remote Sensing (RSIDEA) academic research group, State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing (LIESMARS), Wuhan University, China. The NOT-156 dataset may only be used for academic purposes, and the following paper must be cited; any commercial use is prohibited. Any form of secondary development of this dataset, including annotation work, is also strictly prohibited. Otherwise, RSIDEA of Wuhan University reserves the right to pursue legal responsibility.

Sun C, Wang X, Fan S et al. NOT-156: Night Object Tracking using Low-light and Thermal Infrared: From Multi-modal Common-aperture Camera to Benchmark Datasets[J].

5. Contact

  If you have any problems or feedback when using the NOT-156 dataset, please contact:
  Mr. Chen Sun: sunchen@whu.edu.cn
  Dr. Xinyu Wang: wangxinyu@whu.edu.cn
  Prof. Yanfei Zhong: zhongyanfei@whu.edu.cn


Reference:

[1] P. Zhang, D. Wang, H. Lu, and X. Yang, “Learning adaptive attribute-driven representation for real-time rgb-t tracking,” Int. J. Comput. Vision, vol. 129, pp. 2714–2729, 2021.

[2] C. Li, A. Lu, A. Zheng, Z. Tu, and J. Tang, "Multi-adapter rgbt tracking," in Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2019, pp. 2262–2270.

[3] Y. Xiao, M. Yang, C. Li, L. Liu, and J. Tang, "Attribute-based progressive fusion network for rgbt tracking," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 3, 2022, pp. 2831–2838.

[4] Z. Tang, T. Xu, H. Li, X.-J. Wu, X. Zhu, and J. Kittler, "Exploring fusion strategies for accurate rgbt visual object tracking," Inf. Fusion, p. 101881, 2023.

[5] L. Zhang, M. Danelljan, A. Gonzalez-Garcia, J. Van De Weijer, and F. Shahbaz Khan, "Multi-modal fusion for end-to-end rgb-t tracking," in Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2019.

[6] J. Zhu, S. Lai, X. Chen, D. Wang, and H. Lu, "Visual prompt multi-modal tracking," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 9516–9526.
 
Copyright @RS-IDEA | Address: No. 129 Luoyu Road, Wuhan | Affiliation: State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University | Office: Xinghu Building, Room 709 | zhongyanfei@whu.edu.cn