Acta Geodaetica et Cartographica Sinica ›› 2020, Vol. 49 ›› Issue (4): 509-521. doi: 10.11947/j.AGCS.2020.20190174

• Photogrammetry and Remote Sensing •

Comparison of machine learning algorithms based on Sentinel-1A data to detect icebergs

XIAO Xiangwen1,2, SHEN Xiaoyi1, KE Changqing1, ZHOU Xinghua2

  1. School of Geographic & Oceanographic Science, Nanjing University, Nanjing 210000, China;
  2. The First Institute of Oceanography, MNR, Qingdao 266000, China
  • Received: 2019-05-07 Revised: 2019-10-17 Published: 2020-04-17
  • Corresponding author: KE Changqing E-mail: kecq@nju.edu.cn
  • About the first author: XIAO Xiangwen (1996-), male, master's degree; research interest: recognition of polar remote sensing imagery. E-mail: m18709205750@163.com
  • Supported by:
    National Key Research and Development Program of China (Nos. 2018YFC1407200; 2018YFC1407203); National Natural Science Foundation of China (No. 41830105)

Abstract: Iceberg detection is of great significance for marine environmental monitoring and the safe navigation of vessels, and it is an important part of opening Arctic shipping routes and developing the Arctic. Synthetic aperture radar (SAR) imagery offers unique advantages for iceberg detection, and many machine learning algorithms can be applied to detect icebergs in SAR images. To get the best performance from these algorithms, it is necessary to evaluate the algorithms together with the features and feature standardization methods used with them, and so select the optimal iceberg detection workflow. Based on Sentinel-1A SAR images, this paper therefore applies several machine learning algorithms, feature combinations and feature standardization methods to iceberg detection and compares the detection performance of the resulting workflows. The machine learning algorithms are the Bayes classifier (Bayes), back propagation neural network (BPNN), linear discriminant analysis (LDA), random forest (RF) and support vector machine (SVM); the feature standardization methods are Min-max standardization, Z-score standardization and log function standardization. The data set comprises 969 iceberg and non-iceberg samples, each with 12 SAR image features, located mainly on the east coast of Greenland. Classification performance is measured by the area under the receiver operating characteristic (ROC) curve (AUC). The results show that RF with its best configuration achieves the highest AUC, 0.945, which is 0.09 higher than that of the worst performer, Bayes. In terms of detection rate, at an iceberg recall of 80%, RF achieves the best non-iceberg recall, 92.6%, which is 1.4 percentage points higher than the second-ranked BPNN and 2.6 percentage points higher than the worst-ranked Bayes; at an iceberg recall of 90%, BPNN achieves a non-iceberg recall of 87.4%, which is 0.8 percentage points higher than the second-ranked RF and 2.7 percentage points higher than the worst-ranked Bayes. These results show that, for iceberg detection, it is very important to select both the best machine learning algorithm and the best features and feature standardization method.
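The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not give the exact form of the log function standardization, so `log_scale` assumes the common form log10(x) / log10(max(x)). The function names and the toy scores and labels are hypothetical, with label 1 standing for iceberg and 0 for non-iceberg.

```python
import math

def min_max(values):
    """Min-max standardization: rescale linearly to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Z-score standardization: subtract the mean, divide by the population std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

def log_scale(values):
    """Assumed log form: log10(x) / log10(max(x)); needs x > 0 and max(x) > 1."""
    denom = math.log10(max(values))
    return [math.log10(v) / denom for v in values]

def split_scores(scores, labels):
    """Separate classifier scores into iceberg (1) and non-iceberg (0) groups."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return pos, neg

def auc(scores, labels):
    """AUC as the fraction of (iceberg, non-iceberg) pairs ranked correctly; ties count 0.5."""
    pos, neg = split_scores(scores, labels)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def non_iceberg_recall_at(scores, labels, iceberg_recall):
    """Best non-iceberg recall over thresholds that keep iceberg recall >= the target."""
    pos, neg = split_scores(scores, labels)
    best = 0.0
    for t in sorted(set(scores)):
        recall = sum(p >= t for p in pos) / len(pos)   # icebergs scored at or above t
        if recall >= iceberg_recall:
            best = max(best, sum(n < t for n in neg) / len(neg))
    return best

# Hypothetical classifier scores for eight samples.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
print(auc(scores, labels))                          # 0.875
print(non_iceberg_recall_at(scores, labels, 0.75))  # 0.75
```

Reporting non-iceberg recall at a fixed iceberg recall, as the abstract does for 80% and 90%, amounts to reading one operating point off the ROC curve, whereas AUC summarizes the whole curve.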

Key words: iceberg, machine learning, Sentinel-1A, SAR

CLC number: