Acta Geodaetica et Cartographica Sinica, 2017, Vol. 46, Issue (12): 2032-2040. doi: 10.11947/j.AGCS.2017.20170166


Measuring Traffic Correlations in Urban Road System Using Word Embedding Model

LIU Kang1,2, QIU Peiyuan1, LIU Xiliang1, ZHANG Hengcai1, WANG Shaohua1, LU Feng1,2,3   

    1. Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China;
    2. University of Chinese Academy of Sciences, Beijing 100049, China;
    3. Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
  • Received: 2017-04-05; Revised: 2017-11-08; Online: 2017-12-20; Published: 2017-12-28
  • Supported by:
    The National Natural Science Foundation of China (No. 41631177); The National Key Research and Development Program (No. 2016YFB0502104); Key Project of the Chinese Academy of Sciences (No. ZDRW-ZS-2016-6-3)

Abstract: A good characterization of traffic correlations among urban roads can improve traffic-related applications such as traffic interpolation and short-term traffic forecasting. Previous studies model the traffic correlation between two roads by their spatial or topological distance. However, distance-based methods neglect the spatio-temporal heterogeneity of traffic influence among roads. In this paper, we combine the travel routes of GPS-equipped vehicles with word embedding techniques from the Natural Language Processing (NLP) domain to quantify the traffic correlations of road segments in different time intervals. Firstly, correspondences are established between transportation elements (i.e., road segments and travel routes) and NLP terms (i.e., words and documents). Secondly, real-valued vectors of road segments are trained from massive travel routes using a word embedding model called "Word2Vec". Thirdly, the traffic correlation between two roads is measured by the cosine similarity of their vectors. Finally, the results are evaluated using real traffic condition data. Results of a case study using a large-scale taxi trajectory dataset in Beijing show that: (1) road segments with stronger traffic correlations are also more similar in their traffic conditions, as measured by the roads' average travel speeds, which shows that our approach can quantify road segment traffic correlations and detect their spatial heterogeneity; (2) road segments' traffic correlations are stronger during workday rush hours and on holidays than on weekends and during workday non-rush hours, which shows that our approach can detect temporal variations. Our approach and analysis provide a methodological and theoretical basis for transportation-related applications using NLP and machine learning models.

Key words: traffic correlation, Word2Vec, travel routes, floating car data
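
The mapping described in the abstract (road segments as words, travel routes as documents, cosine similarity as the correlation measure) can be illustrated with a small Python sketch. It is not the authors' implementation: to stay self-contained it substitutes simple co-occurrence count vectors for trained Word2Vec embeddings, and the segment IDs, routes, window size, and function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(routes, window=2):
    """Build a sparse vector per road segment by counting which segments
    appear within `window` steps of it on the same route.
    ASSUMPTION: a stand-in for Word2Vec training, not the paper's model."""
    vectors = defaultdict(Counter)
    for route in routes:  # a route plays the role of a document
        for i, seg in enumerate(route):  # a segment plays the role of a word
            lo = max(0, i - window)
            hi = min(len(route), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[seg][route[j]] += 1
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical map-matched routes for one time interval.
routes = [
    ["s1", "s2", "s3", "s4"],
    ["s1", "s2", "s3", "s5"],
    ["s6", "s7", "s8"],
]

vectors = cooccurrence_vectors(routes)
sim_near = cosine_similarity(vectors["s1"], vectors["s2"])  # share routes
sim_far = cosine_similarity(vectors["s1"], vectors["s6"])   # never co-occur
print(sim_near > sim_far)
```

To compare time intervals as the abstract describes, one would build a separate set of routes for each interval (for example, workday rush hours versus weekends) and compute the vectors and similarities for each set independently.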
