Extracting spatial and semantic relations between two geo-entities from Web texts requires robust and effective solutions. This paper puts forward a novel approach. First, the characteristics of terms (part of speech, position and distance) are analyzed by means of bootstrapping. Second, the weight of each term is calculated and the keyword is picked out as the clue to the geo-entity relation. Third, the geo-entity pairs and their keywords are organized into structured information. Finally, an experiment is conducted with Baidubaike and Stanford CoreNLP. The study shows that the presented method can automatically explore part of the lexical features and find additional relational terms, requiring neither domain expert knowledge nor large-scale corpora. Moreover, compared with three classical frequency statistics methods, namely Frequency, TF-IDF and PPMI, precision and recall are improved by about 5% and 23%, respectively.
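For orientation, the three frequency statistics baselines named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation evaluated in the paper: the toy contexts, the relation labels, the choice to treat each relation label as one "document" for TF-IDF, and the function names `frequency`, `tf_idf` and `ppmi` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data (not from the paper): each item pairs an assumed
# relation label with the context terms found between two geo-entities.
contexts = [
    ("located_in", ["lies", "in", "the", "north", "of"]),
    ("located_in", ["is", "located", "in"]),
    ("flows_through", ["flows", "through", "the"]),
    ("flows_through", ["flows", "across"]),
]

def frequency(contexts):
    """Raw count of each term over all contexts."""
    return Counter(t for _, terms in contexts for t in terms)

def tf_idf(contexts):
    """TF-IDF per (term, label); each label's pooled terms form one document (an assumption)."""
    docs = {}
    for label, terms in contexts:
        docs.setdefault(label, Counter()).update(terms)
    n_docs = len(docs)
    # Document frequency: number of labels whose pooled context contains the term.
    df = Counter(t for counts in docs.values() for t in counts)
    scores = {}
    for label, counts in docs.items():
        total = sum(counts.values())
        for t, c in counts.items():
            scores[(t, label)] = (c / total) * math.log(n_docs / df[t])
    return scores

def ppmi(contexts):
    """Positive pointwise mutual information between a term and a label."""
    pair, term, lab = Counter(), Counter(), Counter()
    for label, terms in contexts:
        for t in terms:
            pair[(t, label)] += 1
            term[t] += 1
            lab[label] += 1
    n = sum(pair.values())
    # PMI = log2(p(t, l) / (p(t) * p(l))); negative values are clipped to 0.
    return {
        (t, l): max(0.0, math.log2(c * n / (term[t] * lab[l])))
        for (t, l), c in pair.items()
    }
```

In this toy setting, a term such as "the" that occurs under every label receives a TF-IDF score of zero, whereas a term such as "flows" that occurs under only one label scores positively under both TF-IDF and PPMI. This is the kind of ranking such baselines use to select relation keywords.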
YU Li, LU Feng, LIU Xiliang. A Bootstrapping Based Approach for Open Geo-entity Relation Extraction[J]. Acta Geodaetica et Cartographica Sinica, 2016, 45(5): 616-622.
DOI: 10.11947/j.AGCS.2016.20150181