Knowledge graph-guided deep network for high-resolution remote sensing image scene classification

doi:10.11947/j.AGCS.2024.20230125

Abstract

Driven by rapid advances in deep network theory and methods, deep networks have become the mainstream technology for remote sensing image scene classification. However, existing deep network-based methods depend heavily on large numbers of manually labeled training samples and cannot effectively integrate the rich prior knowledge of the remote sensing domain. To make better use of domain knowledge while reducing the dependence on labeled samples, this paper proposes a knowledge graph-guided deep network learning method for high-resolution remote sensing image scene classification. First, a land cover concept knowledge graph is constructed that integrates multiple sources of domain knowledge, so that prior knowledge can be applied more flexibly and conveniently. Then, a knowledge graph representation learning method expresses the semantic categories of remote sensing scenes in this knowledge graph as semantic vectors, which form a semantic benchmark for the scene categories. In the knowledge-guided learning stage, a cross-modal alignment constraint between the scene category semantic vectors and the shallow visual feature vectors of the deep network guides the shallow layers to learn features shared across scene categories more effectively, while the deep layers are still supervised by scene category labels to learn features that discriminate between scenes. In the testing stage, the optimized deep network classifies remote sensing image scenes with high accuracy without relying on any prior knowledge.
Experiments on the largest publicly available remote sensing image scene classification dataset to date show that the proposed knowledge-guided learning method achieves the best classification performance among the compared methods at training sample ratios of 10%, 30%, and 50%. At a 10% sample ratio, the proposed method improves overall accuracy (OA) by 5.11 percentage points over the baseline deep network.
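
The knowledge-guided learning stage described in the abstract can be pictured as one training objective with two terms. The following is a minimal sketch, not the authors' implementation: it assumes the alignment constraint is a cosine distance between a shallow visual feature (already projected to the dimension of the semantic vectors) and the semantic vector of the labeled category, and that the two terms are combined with a hypothetical weight `align_weight`. The function names and the toy vectors are illustrative assumptions; the abstract does not specify the form of the constraint.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy for one sample, given raw class scores and the true class index."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def cosine_alignment_loss(visual_vec, semantic_vec):
    """1 - cosine similarity (assumed form of the cross-modal alignment constraint)."""
    dot = sum(v * s for v, s in zip(visual_vec, semantic_vec))
    norm_v = math.sqrt(sum(v * v for v in visual_vec))
    norm_s = math.sqrt(sum(s * s for s in semantic_vec))
    return 1.0 - dot / (norm_v * norm_s + 1e-12)


def joint_loss(shallow_feat, logits, label, class_semantics, align_weight=1.0):
    """Label-supervised loss on the deep output plus alignment loss on the shallow feature.

    class_semantics[k] is the semantic vector of category k, e.g. obtained from
    knowledge graph representation learning. align_weight is a hypothetical
    balancing factor, not a value taken from the paper.
    """
    align = cosine_alignment_loss(shallow_feat, class_semantics[label])
    cls = softmax_cross_entropy(logits, label)
    return cls + align_weight * align


# Toy example: two categories with 2-D semantic vectors.
class_semantics = [[1.0, 0.0], [0.0, 1.0]]
loss = joint_loss([0.9, 0.1], [2.0, 0.5], 0, class_semantics, align_weight=0.5)
print(round(loss, 4))
```

Only the training objective uses `class_semantics`; a prediction needs just the logits, which is consistent with the abstract's statement that the testing stage requires no prior knowledge.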

Key words: remote sensing image scene classification, land cover concept knowledge graph, knowledge graph representation learning, knowledge graph-guided deep network

CLC Number: P237

Yansheng LI, Minlang WU, Yongjun ZHANG. Knowledge graph-guided deep network for high-resolution remote sensing image scene classification[J]. Acta Geodaetica et Cartographica Sinica, 2024, 53(4): 677-688.

References 54

[1]	LI Yansheng, ZHANG Yongjun, ZHU Zhihui. Error-tolerant deep learning for remote sensing image scene classification[J]. IEEE Transactions on Cybernetics, 2021, 51(4):1756-1768.
[2]	龚希, 吴亮, 谢忠, 等. 融合全局和局部深度特征的高分辨率遥感影像场景分类方法[J]. 光学学报, 2019, 39(3):0301002.
	GONG Xi, WU Liang, XIE Zhong, et al. Classification method of high-resolution remote sensing scenes based on fusion of global and local deep features[J]. Acta Optica Sinica, 2019, 39(3):0301002.
[3]	白坤, 慕晓冬, 陈雪冰, 等. 融合半监督学习的无监督遥感影像场景分类[J]. 测绘学报, 2022, 51(5):691-702. DOI: 10.11947/J.AGCS.2022.20210270.
	BAI Kun, MU Xiaodong, CHEN Xuebing, et al. Unsupervised remote sensing image scene classification based on semi-supervised learning[J]. Acta Geodaetica et Cartographica Sinica, 2022, 51(5):691-702. DOI: 10.11947/J.AGCS.2022.20210270.
[4]	ZHU Qiqi, ZHONG Yanfei, ZHANG Liangpei, et al. Adaptive deep sparse semantic modeling framework for high spatial resolution image scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(10):6180-6195.
[5]	HUANG Xin, HAN Xiaopeng, MA Song, et al. Monitoring ecosystem service change in the city of Shenzhen by the use of high-resolution remotely sensed imagery and deep learning[J]. Land Degradation & Development, 2019, 30(12):1490-1501.
[6]	YAO Xiwen, HAN Junwei, CHENG Gong, et al. Semantic annotation of high-resolution satellite images via weakly supervised learning[J]. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(6):3660-3671.
[7]	LI Yansheng, CHEN Wei, HUANG Xin, et al. MFVNet: a deep adaptive fusion network with multiple field-of-views for remote sensing image semantic segmentation[J]. Science China Information Sciences, 2023, 66(4):140305.
[8]	YANG Yi, NEWSAM S. Comparing SIFT descriptors and Gabor texture features for classification of remote sensed imagery[C]//Proceedings of the 15th IEEE International Conference on Image Processing. San Diego: IEEE, 2008: 1852-1855.
[9]	ZHONG Yanfei, ZHU Qiqi, ZHANG Liangpei. Scene classification based on the multifeature fusion probabilistic topic model for high spatial resolution remote sensing imagery[J]. IEEE Transactions on Geoscience and Remote Sensing, 2015, 53(11):6207-6222.
[10]	LI Yansheng, TAO Chao, TAN Yihua, et al. Unsupervised multilayer feature learning for satellite image scene classification[J]. IEEE Geoscience and Remote Sensing Letters, 2016, 13(2):157-161.
[11]	XU Kejie, DENG Peifang, HUANG Hong. Vision transformer: an excellent teacher for guiding small networks in remote sensing image scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60:3152566.
[12]	XIA Guisong, HU Jingwen, HU Fan, et al. AID: a benchmark data set for performance evaluation of aerial scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(7):3965-3981.
[13]	CHENG Gong, HAN Junwei, LU Xiaoqiang. Remote sensing image scene classification: benchmark and state of the art[J]. Proceedings of the IEEE, 2017, 105(10):1865-1883.
[14]	YANG Yi, NEWSAM S. Bag-of-visual-words and spatial extensions for land-use classification[C]//Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems. San Jose: ACM Press, 2010: 270-279.
[15]	CAO Ran, FANG Leyuan, LU Ting, et al. Self-attention-based deep feature fusion for remote sensing scene classification[J]. IEEE Geoscience and Remote Sensing Letters, 2021, 18(1):43-47.
[16]	ZHAO Zhicheng, LI Jiaqi, LUO Ze, et al. Remote sensing image scene classification based on an enhanced attention module[J]. IEEE Geoscience and Remote Sensing Letters, 2021, 18(11):1926-1930.
[17]	ZHANG Yue, ZHENG Xiangtao, LU Xiaoqiang. Pairwise comparison network for remote-sensing scene classification[J]. IEEE Geoscience and Remote Sensing Letters, 2022, 19:1-5.
[18]	BAZI Y, BASHMAL L, AL RAHHAL M M, et al. Vision transformers for remote sensing image classification[J]. Remote Sensing, 2021, 13(3):516.
[19]	LÜ Pengyuan, WU Wenjun, ZHONG Yanfei, et al. SCViT: a spatial-channel feature preserving vision transformer for remote sensing image scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60:3157671.
[20]	LI Lingjun, HAN Junwei, YAO Xiwen, et al. DLA-MatchNet for few-shot remote sensing image scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(9):7844-7853.
[21]	LI Yansheng, KONG Deyu, ZHANG Yongjun, et al. Robust deep alignment network with remote sensing knowledge graph for zero-shot and generalized zero-shot remote sensing image scene classification[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2021, 179:145-158.
[22]	LI Yansheng, OUYANG Song, ZHANG Yongjun. Combining deep learning and ontology reasoning for remote sensing image semantic segmentation[J]. Knowledge-Based Systems, 2022, 243:108469.
[23]	LI Yansheng, ZHOU Yuhan, ZHANG Yongjun, et al. DKDFN: domain knowledge-guided deep collaborative fusion network for multimodal unitemporal remote sensing land cover classification[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2022, 186:170-189.
[24]	MA Xiaorui, WANG Hongyu, LIU Yi, et al. Knowledge guided classification of hyperspectral image based on hierarchical class tree[C]//Proceedings of 2019 IEEE International Geoscience and Remote Sensing Symposium. Yokohama: IEEE, 2019: 2702-2705.
[25]	ZHU Qiqi, LEI Yang, SUN Xiongli, et al. Knowledge-guided land pattern depiction for urban land use mapping: a case study of Chinese cities[J]. Remote Sensing of Environment, 2022, 272:112916.
[26]	FENSEL D, SIMSEK U, ANGELE K, et al. Introduction: what is a knowledge graph?[M]//Knowledge Graphs. Cham: Springer, 2020: 1-10.
[27]	田玲, 张谨川, 张晋豪, 等. 知识图谱综述：表示、构建、推理与知识超图理论[J]. 计算机应用, 2021, 41(8):2161-2186.
	TIAN Ling, ZHANG Jinchuan, ZHANG Jinhao, et al. Knowledge graph survey: representation, construction, reasoning and knowledge hypergraph theory[J]. Journal of Computer Applications, 2021, 41(8):2161-2186.
[28]	BOLLACKER K, EVANS C, PARITOSH P, et al. Freebase: a collaboratively created graph database for structuring human knowledge[C]//Proceedings of 2008 ACM SIGMOD international conference on Management of data. Vancouver: ACM Press, 2008: 1247-1250.
[29]	LEHMANN J, ISELE R, JAKOB M, et al. DBpedia-a large-scale, multilingual knowledge base extracted from Wikipedia[J]. Semantic Web, 2015, 6(2):167-195.
[30]	CHEN Jindong, WANG Ao, CHEN Jiangjie, et al. CN-Probase: a data-driven approach for large-scale Chinese taxonomy construction[C]//Proceedings of the 35th International Conference on Data Engineering (ICDE). Macao: IEEE, 2019: 1706-1709.
[31]	李彦胜, 武康, 欧阳松, 等. 地学知识图谱引导的遥感影像语义分割[J]. 遥感学报, 2024, 28(2):455-469. DOI: 10.11834/Jrs.20231110.
	LI Yansheng, WU Kang, OUYANG Song, et al. Geographic knowledge graph-guided deep semantic segmentation network for remote sensing imagery[J]. National Remote Sensing Bulletin, 2024, 28(2):455-469. DOI: 10.11834/Jrs.20231110.
[32]	JANOWICZ K, HITZLER P, LI Wenwen, et al. Know, know where, KnowWhereGraph: a densely connected, cross-domain knowledge graph and geo-enrichment service stack for applications in environmental intelligence[J]. AI Magazine, 2022, 43(1):30-39.
[33]	张雪英, 张春菊, 吴明光, 等. 顾及时空特征的地理知识图谱构建方法[J]. 中国科学：信息科学, 2020, 50(7):1019-1032.
	ZHANG Xueying, ZHANG Chunju, WU Mingguang, et al. Spatiotemporal features based geographical knowledge graph construction[J]. Scientia Sinica (Informationis), 2020, 50(7):1019-1032.
[34]	STADLER C, LEHMANN J, HÖFFNER K, et al. LinkedGeoData: a core for a web of spatial open data[J]. Semantic Web, 2012, 3(4):333-354.
[35]	李彦胜, 张永军. 耦合知识图谱和深度学习的新一代遥感影像解译范式[J]. 武汉大学学报(信息科学版), 2022, 47(8):1176-1190.
	LI Yansheng, ZHANG Yongjun. A new paradigm of remote sensing image interpretation by coupling knowledge graph and deep learning[J]. Geomatics and Information Science of Wuhan University, 2022, 47(8):1176-1190.
[36]	张永军, 程鑫, 李彦胜, 等. 利用知识图谱的国土资源数据管理与检索研究[J]. 武汉大学学报(信息科学版), 2022, 47(8):1165-1175.
	ZHANG Yongjun, CHENG Xin, LI Yansheng, et al. Research on land and resources management and retrieval using knowledge graph[J]. Geomatics and Information Science of Wuhan University, 2022, 47(8):1165-1175.
[37]	张永军, 王飞, 李彦胜, 等. 遥感知识图谱创建及其典型场景应用技术[J]. 遥感学报, 2023, 27(2):249-266.
	ZHANG Yongjun, WANG Fei, LI Yansheng, et al. Remote sensing knowledge graph construction and its application in typical scenarios[J]. National Remote Sensing Bulletin, 2023, 27(2):249-266.
[38]	MIKOLOV T, SUTSKEVER I, CHEN Kai, et al. Distributed representations of words and phrases and their compositionality[C]//Proceedings of the 26th International Conference on Neural Information Processing Systems. Lake Tahoe: ACM Press, 2013: 3111-3119.
[39]	DEVLIN J, CHANG Mingwei, LEE K, et al. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. [2024-01-05]. http://arxiv.org/abs/1810.04805.
[40]	BORDES A, USUNIER N, GARCIA-DURÁN A, et al. Translating embeddings for modeling multi-relational data[C]//Proceedings of the 26th International Conference on Neural Information Processing Systems. Lake Tahoe: ACM Press, 2013: 2787-2795.
[41]	WANG Zhen, ZHANG Jianwen, FENG Jianlin, et al. Knowledge graph embedding by translating on hyperplanes[C]//Proceedings of the 28th AAAI Conference on Artificial Intelligence. Québec City: ACM Press, 2014: 1112-1119.
[42]	JI Guoliang, HE Shizhu, XU Liheng, et al. Knowledge graph embedding via dynamic mapping matrix[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. Stroudsburg: IEEE, 2015: 687-696.
[43]	SUN Zhiqing, DENG Zhihong, NIE Jianyun, et al. RotatE: knowledge graph embedding by relational rotation in complex space[EB/OL]. [2024-01-01]. http://arxiv.org/abs/1902.10197.
[44]	ZHANG Zhanqiu, CAI Jianyu, ZHANG Yongdong, et al. Learning hierarchy-aware knowledge graph embeddings for link prediction[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(3):3065-3072.
[45]	NICKEL M, TRESP V, KRIEGEL H P. A three-way model for collective learning on multi-relational data[C]//Proceedings of 2011 International Conference on Machine Learning. New York: ACM Press, 2011.
[46]	YANG Bishan, YIH Wentau, HE Xiaodong, et al. Embedding entities and relations for learning and inference in knowledge bases[EB/OL]. [2024-01-01]. http://arxiv.org/abs/1412.6575.
[47]	TROUILLON T, WELBL J, RIEDEL S, et al. Complex embeddings for simple link prediction[C]//Proceedings of the 33rd International Conference on International Conference on Machine Learning. New York: ACM Press, 2016: 2071-2080.
[48]	BALAZEVIC I, ALLEN C, HOSPEDALES T. TuckER: tensor factorization for knowledge graph completion[C]//Proceedings of 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Stroudsburg: IEEE, 2019: 5185-5194.
[49]	MARINO K, SALAKHUTDINOV R, GUPTA A. The more you know: using knowledge graphs for image classification[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 20-28.
[50]	SPEER R, CHIN J, HAVASI C. ConceptNet 5.5: an open multilingual graph of general knowledge[C]//Proceedings of 2017 AAAI Conference on Artificial Intelligence. San Francisco: ACM Press, 2017: 4444-4451.
[51]	TAN Mingxing, LE Q. EfficientNet: rethinking model scaling for convolutional neural networks[C]//Proceedings of 2019 International Conference on Machine Learning. Long Beach: IEEE, 2019: 6105-6114.
[52]	LI Yansheng, ZHU Zhihui, YU Jingang, et al. Learning deep cross-modal embedding networks for zero-shot remote sensing image scene classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(12):10590-10603.
[53]	BAZI Y, AL RAHHAL M M, ALHICHRI H, et al. Simple yet effective fine-tuning of deep CNNs using an auxiliary classification loss for remote sensing scene classification[J]. Remote Sensing, 2019, 11(24):2908.
[54]	SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. [2024-01-01]. http://arxiv.org/abs/1409.1556.

Ablation model	10% training samples (OA/%)	30% training samples (OA/%)	50% training samples (OA/%)
EFNet	83.86	91.38	94.40
EFNet+One-hot	84.32(+0.46)	91.98(+0.60)	94.45(+0.05)
EFNet+Word2Vec	86.13(+2.27)	91.98(+0.60)	94.58(+0.18)
EFNet+BERT	86.87(+3.01)	92.09(+0.71)	94.64(+0.24)
EFNet+KGE	87.62(+3.76)	92.85(+1.47)	94.85(+0.45)
Proposed knowledge-guided learning method	88.97(+5.11)	94.57(+3.19)	96.62(+2.22)
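
The bracketed values in the ablation table are gains in overall accuracy over the EFNet baseline in the first row, in percentage points. The short check below recomputes them for two of the rows as a worked example; the variable and function names (`baseline`, `rows`, `gains`) are hypothetical and the numbers are copied from the table.

```python
# Overall accuracy (%) at 10%, 30%, and 50% training samples,
# copied from the EFNet baseline row and two other rows of the ablation table.
baseline = [83.86, 91.38, 94.40]  # EFNet
rows = {
    "EFNet+KGE": [87.62, 92.85, 94.85],
    "Proposed method": [88.97, 94.57, 96.62],
}


def gains(scores, reference):
    """Difference to the reference in percentage points, rounded to two decimals."""
    return [round(s - r, 2) for s, r in zip(scores, reference)]


for name, scores in rows.items():
    print(name, gains(scores, baseline))
```

For the proposed method the gain at 10% training samples is 88.97 - 83.86 = 5.11, which is the improvement quoted in the abstract.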

Backbone	Scene classification method	10% training samples (OA/%)	30% training samples (OA/%)	50% training samples (OA/%)
ViT	v16_21k[18]	81.90	83.87	88.92
ViT	SCViT[19]	82.34	83.65	90.71
VGG16	SAFF[15]	83.81	89.42	91.21
VGG16	Proposed knowledge-guided learning method (VGG16)	85.52	91.65	94.36
EfficientNet-B3	EFNet-aux[54]	84.32	91.98	94.45
EfficientNet-B3	Proposed knowledge-guided learning method (EFNet)	88.97	94.57	96.62
ResNet-101	EAM[16]	87.37	92.76	93.26
ResNet-101	PCNet[17]	88.76	91.57	94.46
ResNet-101	Proposed knowledge-guided learning method (PCNet)	90.53	94.48	96.09