WHU-CR: A large-scale benchmark dataset specifically designed for rural road extraction in China
Abstract
We construct a large-scale benchmark dataset, WHU-CR, specifically designed for rural road extraction in China. The dataset contains 130,412 pairs of high-resolution images and corresponding annotations collected from typical rural regions across 14 provinces in seven major grain-producing areas. WHU-CR encompasses diverse rural landscapes, terrain conditions, and road types, thereby capturing the complexity and pronounced spatial heterogeneity of rural road systems in China. Owing to its scale, geographic breadth, and close alignment with real rural environments, WHU-CR provides a representative data foundation for training and evaluating rural road extraction models under a wide range of geographic and environmental settings.

1. The WHU-CR dataset

1.1 Study areas
To construct a representative and widely applicable rural road remote sensing dataset, we selected 14 provinces from China's seven major grain-producing regions, namely Xinjiang, Ningxia, Shaanxi, Qinghai, Sichuan, Guangxi, Hubei, Hunan, Jiangsu, Zhejiang, Hebei, Shandong, Jilin, and Heilongjiang (Fig. 1). Rural roads in the Northeast are generally straight and wide, while those in the Huanghuaihai Plain and Yangtze River Basin form relatively dense networks. In the Southeast, rural roads mainly follow hills and river systems, whereas in the Southwest and Northwest, they are often narrow and winding. On the Qinghai-Tibet Plateau, rural roads are sparse and highly region-specific. The selected regions also differ substantially in natural geography, land use, and urbanization level.

We further analyzed the distribution of rural road types and scene categories. Farmland, village, mountainous, and other scenes account for approximately 45%, 45%, 5%, and 5% of the dataset, respectively. Asphalt, cement, dirt, and other road types account for about 29%, 43%, 24%, and 4%, respectively. These statistics indicate that the dataset covers a wide variety of road types and scenes, which supports its representativeness for rural road extraction research.
1.2 Data acquisition and annotation
This dataset is primarily based on high-resolution remote sensing images from Google Earth (Google Inc.), with spatial resolutions ranging from 0.3 to 0.8 meters. Each image was cropped into 512 × 512-pixel patches, and detailed semantic segmentation labels were created to clearly distinguish road targets from the background. To guarantee annotation quality, all annotators underwent standardized training to unify their understanding of road types and labeling protocols. Each patch was independently labeled by at least two annotators, with consistency assessed through cross-validation between annotators, yielding an overall agreement rate above 90%. Any discrepancies or potential errors were reviewed and corrected by experts to ensure the accuracy and reliability of the final annotations.

In total, 130,412 image-mask pairs were produced, with representative examples shown in Fig. 2. To avoid spatial overlap and ensure dataset independence, the images were split at the patch level, with non-overlapping regions randomly assigned to the training or test set in an approximate 3:2 ratio, resulting in 79,000 pairs for training and 51,412 for testing.
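The patching, agreement check, and split arithmetic described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, dropping partial edge tiles is an assumption, and pixel-wise agreement is only one plausible way to measure consistency between annotators (the page does not specify the metric used).

```python
import numpy as np

PATCH = 512  # patch size stated in the text


def crop_patches(image, size=PATCH):
    """Yield (row, col, patch) for non-overlapping size x size tiles.

    Assumption: partial tiles at the right/bottom edges are dropped.
    """
    h, w = image.shape[:2]
    for r in range(0, h - size + 1, size):
        for c in range(0, w - size + 1, size):
            yield r, c, image[r:r + size, c:c + size]


def pixel_agreement(mask_a, mask_b):
    """Fraction of pixels on which two binary road masks agree.

    Assumption: a simple pixel-wise rate; the actual metric is not stated.
    """
    return float(np.mean(mask_a.astype(bool) == mask_b.astype(bool)))


def split_summary(total, train):
    """Return (train, test, train fraction) for a given split."""
    return train, total - train, train / total


if __name__ == "__main__":
    # A synthetic 1100 x 1536 RGB image stands in for a real scene.
    scene = np.zeros((1100, 1536, 3), dtype=np.uint8)
    print(len(list(crop_patches(scene))), "patches")

    # Counts reported in the text: 130,412 pairs, 79,000 for training.
    train, test, frac = split_summary(130_412, 79_000)
    print(train, test, round(frac, 3))
```

With the reported counts, the training fraction is about 0.606, which is consistent with the approximate 3:2 ratio (0.6) stated above.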
1.3 Download
We provide download links for the WHU-CR dataset on Baidu Drive and MEGA. Please fill in a short questionnaire before downloading; it appears after clicking the following link:
Baidu Drive and MEGA: download

2. Copyright
The copyright belongs to the Intelligent Data Extraction, Analysis and Applications of Remote Sensing (RSIDEA) academic research group, State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing (LIESMARS), Wuhan University. The WHU-CR dataset may be used for academic purposes only, and any such use must cite the following paper. Commercial use is prohibited; RSIDEA of Wuhan University reserves the right to pursue legal responsibility for violations.

Wang N, Wang X, Pan Y, et al. Identifying rural roads in remote sensing imagery: From benchmark dataset to coarse-to-fine extraction network-A case study in China[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2026, 231: 487-506.

3. Contact
If you have any problems or feedback when using the WHU-CR dataset, please contact:
Ms. Ningjing Wang: 1121906691@qq.com
Dr. Xinyu Wang: wangxinyu@whu.edu.cn
Prof. Yanfei Zhong: zhongyanfei@whu.edu.cn