Acta Geodaetica et Cartographica Sinica ›› 2017, Vol. 46 ›› Issue (12): 2032-2040. doi: 10.11947/j.AGCS.2017.20170166

• Cartography and Geographic Information •

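Analyzing the Spatial Correlation of Urban Road Traffic Using a Word Embedding Model

LIU Kang1,2, QIU Peiyuan1, LIU Xiliang1, ZHANG Hengcai1, WANG Shaohua1, LU Feng1,2,3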

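  1. Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101;
    2. University of Chinese Academy of Sciences, Beijing 100049;
    3. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing, Jiangsu 210023
  • Received: 2017-04-05 Revised: 2017-11-08 Online: 2017-12-20 Published: 2017-12-28
  • Corresponding author: LU Feng E-mail: luf@lreis.ac.cn
  • About the author: LIU Kang (1991-), female, PhD candidate, research interest: spatio-temporal data mining. E-mail: liukang@lreis.ac.cn
  • Supported by:
    The National Natural Science Foundation of China (41631177); The National Key Research and Development Program (2016YFB0502104); Key Project of the Chinese Academy of Sciences (ZDRW-ZS-2016-6-3)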

Measuring Traffic Correlations in Urban Road System Using Word Embedding Model

LIU Kang1,2, QIU Peiyuan1, LIU Xiliang1, ZHANG Hengcai1, WANG Shaohua1, LU Feng1,2,3   

  1. Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China;
    2. University of Chinese Academy of Sciences, Beijing 100049, China;
    3. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
  • Received: 2017-04-05 Revised: 2017-11-08 Online: 2017-12-20 Published: 2017-12-28
  • Supported by:
    The National Natural Science Foundation of China (No. 41631177); The National Key Research and Development Program (No. 2016YFB0502104); Key Project of the Chinese Academy of Sciences (No. ZDRW-ZS-2016-6-3)

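Abstract: Characterizing the traffic correlations among urban roads is the basis for improving traffic interpolation and forecasting. Existing studies and applications usually assume that roads within a certain spatial or topological distance are correlated with one another, which ignores the spatio-temporal heterogeneity of traffic influence among roads. For example, traffic flow from an upstream road usually does not spread evenly to all downstream roads but concentrates in particular directions. The root cause of traffic influence and interaction among roads is the large number of motor vehicles traveling through them. To measure traffic correlations among roads from a data-driven perspective, and thereby account for their spatio-temporal heterogeneity, this paper uses the word embedding model Word2Vec to mine the traffic interaction relationships among roads from a large number of vehicle travel routes. First, "road segment-route" is treated as analogous to "word-document". Second, the Word2Vec model is used to train a real-valued vector (word vector) for each road segment (word) from a large number of routes (documents). Third, the cosine similarity between vectors is used to measure the traffic correlation between the corresponding road segments. Finally, the results are validated with traffic condition data. Experiments on 2 million taxi travel routes in Beijing show that: ① on average, the higher the vector similarity between neighboring road segments, the more similar the trends of their traffic conditions, which demonstrates that the method correctly measures traffic correlations among roads and captures their spatial heterogeneity; ② traffic correlations among road segments are stronger during workday morning and evening rush hours and on holidays than during workday off-peak hours and on Saturdays and Sundays, and the plausibility of this pattern indicates that the method correctly captures the temporal heterogeneity of road traffic correlations. The method and analysis can provide a methodological and theoretical basis for traffic planning, traffic guidance, and related applications.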

Keywords: traffic correlation, Word2Vec, travel routes, floating car data

Abstract: Characterizing traffic correlations among urban roads helps improve traffic-related applications, such as traffic interpolation and short-term traffic forecasting. Previous studies model the traffic correlation between two roads by their spatial or topological distance. However, distance-based methods neglect the spatio-temporal heterogeneity of traffic influence among roads. In this paper, we combine the travel routes of GPS-enabled vehicles with word embedding techniques from the Natural Language Processing (NLP) domain to quantify traffic correlations between road segments in different time intervals. Firstly, correspondences between transportation elements (i.e., road segments and travel routes) and NLP terms (i.e., words and documents) are established. Secondly, real-valued vectors of road segments are trained from massive travel routes using a word embedding model called "Word2Vec". Thirdly, the traffic correlation between two roads is measured by the cosine similarity of their vectors. Finally, the results are evaluated using real traffic condition data. Results of a case study using a large-scale taxi trajectory dataset in Beijing show that: ① road segments with stronger traffic correlations are also more similar in their traffic conditions, as measured by the roads' average travel speeds, indicating that our approach can quantify road segment traffic correlations and detect their spatial heterogeneity; ② road segments' traffic correlations are stronger during workday rush hours and on holidays than on weekends and during workday non-rush hours, indicating that our approach can detect temporal variations. Our approach and analysis provide a methodological and theoretical basis for transportation-related applications using NLP and machine learning models.

Key words: traffic correlation, Word2Vec, travel routes, floating car data
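
The first three steps described in the abstract (road segments as words, travel routes as documents, cosine similarity of the learned vectors) can be illustrated with a minimal sketch. The abstract does not state which Word2Vec variant or hyperparameters were used, so the code below assumes a simplified skip-gram with negative sampling written in pure Python, with negatives drawn uniformly rather than from a frequency-weighted distribution. The function names, parameter values, and toy routes (`s1` to `s10`) are hypothetical and are not taken from the paper.

```python
import math
import random


def train_segment_vectors(routes, dim=16, window=2, negatives=3,
                          epochs=50, lr=0.05, seed=0):
    """Simplified skip-gram with negative sampling.

    Each route is a list of road-segment IDs (the "document");
    each segment ID plays the role of a "word".
    """
    rng = random.Random(seed)
    vocab = sorted({seg for route in routes for seg in route})
    index = {seg: i for i, seg in enumerate(vocab)}
    # Segment vectors start small and random; context vectors start at zero.
    w_in = [[(rng.random() - 0.5) / dim for _ in range(dim)] for _ in vocab]
    w_out = [[0.0] * dim for _ in vocab]

    def sigmoid(x):
        # Clamp the argument to avoid overflow in math.exp.
        return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, x))))

    for _ in range(epochs):
        for route in routes:
            ids = [index[seg] for seg in route]
            for pos, center in enumerate(ids):
                lo, hi = max(0, pos - window), min(len(ids), pos + window + 1)
                for ctx_pos in range(lo, hi):
                    if ctx_pos == pos:
                        continue
                    positive = ids[ctx_pos]
                    # One observed context segment plus a few random negatives.
                    samples = [(positive, 1.0)]
                    for _ in range(negatives):
                        neg = rng.randrange(len(vocab))
                        if neg != positive:
                            samples.append((neg, 0.0))
                    grad_center = [0.0] * dim
                    for target, label in samples:
                        score = sum(a * b for a, b in
                                    zip(w_in[center], w_out[target]))
                        g = lr * (label - sigmoid(score))
                        for d in range(dim):
                            grad_center[d] += g * w_out[target][d]
                            w_out[target][d] += g * w_in[center][d]
                    for d in range(dim):
                        w_in[center][d] += grad_center[d]
    return {seg: w_in[index[seg]] for seg in vocab}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy routes: each list is one trip through consecutive segments.
routes = [
    ["s1", "s2", "s3", "s4"],
    ["s1", "s2", "s3", "s5"],
    ["s6", "s2", "s3", "s4"],
    ["s7", "s8", "s9"],
    ["s7", "s8", "s10"],
]
vectors = train_segment_vectors(routes)
similarity = cosine_similarity(vectors["s2"], vectors["s3"])
```

In this sketch, `similarity` stands in for the traffic correlation between segments `s2` and `s3`. Training separate models on routes from different time intervals (for example, workday rush hours versus weekends) would be one way to compare correlations over time, in the spirit of the abstract's temporal analysis.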
