A Geographic Weighted Regression Method Based on Semi-supervised Learning

  • ZHAO Yangyang
  • LIU Jiping
  • XU Shenghua
  • ZHANG Fuhao
  • YANG Yi
  • 1. School of Mapping and Geographical Science, Liaoning Technical University, Fuxin 123000, China;
    2. Chinese Academy of Surveying and Mapping, Beijing 100830, China

Received date: 2015-09-14

Revised date: 2016-07-11

Online published: 2017-02-06

Supported by

The Special Scientific Research Fund of Public Welfare Profession of China (No. 201512032); The National Key Research and Development Program of China (No. 2016YFC0803101)

Abstract

The accuracy of geographically weighted regression (GWR) depends on the quantity of labeled data. In practice, however, labeled data are difficult to obtain, whereas unlabeled data are easy to obtain. It is therefore necessary to find a way to use unlabeled data to improve the regression results. Semi-supervised learning is a class of learning tasks and techniques that also make use of unlabeled data for training, typically combining a small amount of labeled data with a large amount of unlabeled data. This article develops a semi-supervised-learning geographically weighted regression (SSLGWR). First, a GWR model is built from the labeled data. Second, the GWR model predicts values for the unlabeled data, which are then treated as newly labeled data. Third, the GWR model is rebuilt from both the original labeled data and the newly labeled data to improve its precision. Experiments on both simulated and real data compare GWR, COGWR and SSLGWR, with mean squared error chosen as the criterion for evaluating the models. On simulated data, the proposed model improves performance by 39.66%, 11.92% and 0.94% when 10%, 30% and 50% of the data are labeled, respectively. On real data, it improves performance by 8.94%, 3.36% and 5.87%. The results demonstrate that SSLGWR offers substantial benefits over GWR.
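
The three steps described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the abstract does not state the kernel, the bandwidth selection or whether the procedure is iterated, so the code assumes a Gaussian kernel with a fixed bandwidth and a single pseudo-labeling pass. The function names (gwr_predict, sslgwr_predict, mse) and all parameters are hypothetical.

```python
import numpy as np


def gwr_predict(coords_train, X_train, y_train, coords_query, X_query, bandwidth):
    """Basic GWR prediction (assumed: Gaussian kernel, fixed bandwidth).

    A separate weighted least-squares fit is solved at every query location,
    with training points weighted by their distance to that location.
    X_train and X_query are 2-D arrays of shape (n_points, n_features).
    """
    # Add an intercept column to both design matrices.
    Xt = np.column_stack([np.ones(len(X_train)), X_train])
    Xq = np.column_stack([np.ones(len(X_query)), X_query])
    preds = np.empty(len(Xq))
    for i, (loc, x) in enumerate(zip(coords_query, Xq)):
        dist = np.linalg.norm(coords_train - loc, axis=1)
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)
        # Weighted least squares: scale rows by sqrt(weight), then solve.
        sw = np.sqrt(weights)
        beta, *_ = np.linalg.lstsq(Xt * sw[:, None], y_train * sw, rcond=None)
        preds[i] = x @ beta
    return preds


def sslgwr_predict(coords_lab, X_lab, y_lab, coords_unl, X_unl,
                   coords_query, X_query, bandwidth):
    """One pseudo-labeling pass followed by a GWR fit on the enlarged set."""
    # Steps 1 and 2: fit GWR on labeled data and predict the unlabeled points.
    y_pseudo = gwr_predict(coords_lab, X_lab, y_lab, coords_unl, X_unl, bandwidth)
    # Step 3: merge labeled and newly labeled data, then rebuild the model.
    coords_all = np.vstack([coords_lab, coords_unl])
    X_all = np.vstack([X_lab, X_unl])
    y_all = np.concatenate([y_lab, y_pseudo])
    return gwr_predict(coords_all, X_all, y_all, coords_query, X_query, bandwidth)


def mse(y_true, y_pred):
    """Mean squared error, the evaluation criterion named in the abstract."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))
```

Comparing mse(y_test, gwr_predict(...)) with mse(y_test, sslgwr_predict(...)) on held-out points gives the kind of GWR versus SSLGWR comparison reported above. The bandwidth here is a free parameter that would need to be chosen for the data at hand, for example by cross-validation.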

Cite this article

ZHAO Yangyang, LIU Jiping, XU Shenghua, ZHANG Fuhao, YANG Yi. A Geographic Weighted Regression Method Based on Semi-supervised Learning[J]. Acta Geodaetica et Cartographica Sinica, 2017, 46(1): 123-129. DOI: 10.11947/j.AGCS.2017.20150470
