Acta Geodaetica et Cartographica Sinica, 2022, Vol. 51, Issue (11): 2355-2364. doi: 10.11947/j.AGCS.2022.20200522

• Photogrammetry and Remote Sensing •

Semantic segmentation of aerial image based on semi-supervised network with multi-scale shared coding

LI Jiatian1, YANG Ruchun1, YAO Yanji1, HE Rixing2,3, A Xiaohui1, LÜ Shaoyun1   

    1. Faculty of Land and Resources Engineering, Kunming University of Science and Technology, Kunming 650093, China;
    2. College of Resources Environment and Tourism, Capital Normal University, Beijing 100048, China;
    3. Key Laboratory of 3D Information Acquisition and Application, MOE, Capital Normal University, Beijing 100048, China
  • Received: 2020-10-23 Revised: 2022-05-05 Published: 2022-11-30
  • Supported by:
    The National Natural Science Foundation of China (No. 41561082)

Abstract: In semi-supervised semantic segmentation of aerial images, accuracy is mainly improved by pairing an encoder with master-auxiliary decoders, a structure that brings unlabeled samples into the computation. However, continuous downsampling during encoding loses shallow detail features, which leaves the boundaries of ground objects incomplete. To address this, a semi-supervised network with multi-scale shared encoding is proposed for semantic segmentation of aerial images. The encoder uses ResNet-50 to obtain the shallow features of the image and links them through a multi-scale shared coding module embedded at the end of ResNet-50, building a dense feature pyramid and expanding the receptive field to obtain multi-scale detail information about the target features. The method is validated by comparison with UNet, DeepLabv3+, FCN, CCT, XModalNet and VLCNet on two datasets, LandCover.ai and DroneDeploy, and the results show clear advantages in both the number of labels required and accuracy. On the LandCover.ai dataset, with 6000 labeled and 6500 unlabeled samples, the overall mIoU increased by 1.15%. On the DroneDeploy dataset, with 30 labeled and 5 unlabeled samples, the overall mIoU increased by 0.94%; segmentation accuracy for ground objects also improved significantly, yielding clear and complete object boundaries.
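
The abstract reports its gains as mIoU without stating the formula. As a minimal sketch, assuming the paper uses the standard definition (per-class intersection over union, averaged over classes), the metric could be computed as follows. The function name `mean_iou` and the toy label maps are hypothetical and are not taken from the paper.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union for flattened integer label maps.

    Assumption: standard mIoU; classes absent from both maps are skipped.
    """
    # Confusion matrix: rows are true classes, columns are predicted classes.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        intersection = confusion[c][c]
        true_count = sum(confusion[c])                 # pixels labeled c
        pred_count = sum(row[c] for row in confusion)  # pixels predicted c
        union = true_count + pred_count - intersection
        if union > 0:  # skip classes that appear in neither map
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 2x3 label maps, flattened row by row.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(truth, pred, num_classes=3)
```

For these toy maps the per-class IoU values are 1/3, 2/3 and 1/2, so `score` is 0.5.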

Key words: semi-supervised, semantic segmentation, multiscale shared encoder, master-auxiliary decoder, aerial images

CLC Number: