REVIEW   Open Access    

Overview of machine learning-based traffic flow prediction

Abstract: Traffic flow prediction is an important component of intelligent transportation systems. Recently, unprecedented data availability and the rapid development of machine learning techniques have led to tremendous progress in this field. This article first introduces research on traffic flow prediction and the challenges it currently faces. It then proposes a classification method for the literature, discussing and analyzing existing research on using machine learning methods for traffic flow prediction from the perspectives of the prediction preparation process and the construction of prediction models. The article also summarizes innovative modules in these models. Finally, we provide improvement strategies for current baseline models and discuss future challenges and research directions in the field of traffic flow prediction.

References

    [1] Guo M, Sun Z, Pan J, Xu M. 2008. Research on short time traffic flow forecasting method. Application Research of Computers 25(9):2676−78. doi: 10.3969/j.issn.1001-3695.2008.09.031
    [2] Asghari M, Deng D, Shahabi C, Demiryurek U, Li Y. 2016. Price-aware real-time ride-sharing at scale: an auction-based approach. Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, Burlingame, CA, USA, 2016. New York, USA: Association for Computing Machinery. pp. 1−10. https://doi.org/10.1145/2996913.2996974
    [3] Gilmore JF, Abe N. 1995. Neural network models for traffic control and congestion prediction. Journal of Intelligent Transportation Systems 2(3):231−52. doi: 10.1080/10248079508903828
    [4] Qin X. 2023. Traffic flow prediction based on two-channel multi-modal fusion of MCB and attention. IEEE Access 11:58745−53. doi: 10.1109/ACCESS.2023.3280068
    [5] Nguyen H, Kieu LM, Wen T, Cai C. 2018. Deep learning methods in transportation domain: a review. IET Intelligent Transport Systems 12:998−1004. doi: 10.1049/iet-its.2018.0064
    [6] Zhang J, Wang F, Wang K, Lin W, Xu X, et al. 2011. Data-driven intelligent transportation systems: a survey. IEEE Transactions on Intelligent Transportation Systems 12(4):1624−39. doi: 10.1109/TITS.2011.2158001
    [7] Singh G, Al'Aref SJ, Van Assen M, Kim TS, van Rosendael A, et al. 2018. Machine learning in cardiac CT: basic concepts and contemporary data. Journal of Cardiovascular Computed Tomography 12(3):192−201. doi: 10.1016/j.jcct.2018.04.010
    [8] Ahsan MM, Luna SA, Siddique Z. 2022. Machine-learning-based disease diagnosis: a comprehensive review. Healthcare 10(3):541. doi: 10.3390/healthcare10030541
    [9] Dey A. 2016. Machine learning algorithms: a review. International Journal of Computer Science and Information Technologies 7(3):1174−79
    [10] Dhall D, Kaur R, Juneja M. 2019. Machine learning: a review of the algorithms and its applications. In Proceedings of ICRIC 2019. Lecture Notes in Electrical Engineering, eds. Singh P, Kar A, Singh Y, Kolekar M, Tanwar S. vol 597. Cham, Switzerland: Springer. pp. 47−63. https://doi.org/10.1007/978-3-030-29407-6_5
    [11] Osisanwo FY, Akinsola JET, Oludele A, Hinmikaiye JO, Olakanmi O, et al. 2017. Supervised machine learning algorithms: classification and comparison. International Journal of Computer Trends and Technology 48(3):128−38. doi: 10.14445/22312803/IJCTT-V48P126
    [12] Obulesu O, Mahendra M, ThrilokReddy M. 2018. Machine learning techniques and tools: a survey. International Conference on Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 2018. USA: IEEE. pp. 605−11. https://doi.org/10.1109/ICIRCA.2018.8597302
    [13] Ray S. 2019. A quick review of machine learning algorithms. International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, 2019. USA: IEEE. pp. 35−39. https://doi.org/10.1109/COMITCon.2019.8862451
    [14] Kumar R, Verma RK. 2012. Classification algorithms for data mining: a survey. International Journal of Innovations in Engineering and Technology 1(2):7−14
    [15] Nikam SS. 2015. A comparative study of classification techniques in data mining algorithms. Oriental Journal of Computer Science & Technology 8(1):13−19
    [16] Stein G, Chen B, Wu AS, Hua KA. 2005. Decision tree classifier for network intrusion detection with GA-based feature selection. Proceedings of the 43rd Annual Southeast Regional Conference, Kennesaw, Georgia, 2005. vol 2. New York, USA: Association for Computing Machinery. pp. 136−41. https://doi.org/10.1145/1167253.1167288
    [17] Damanik IS, Windarto AP, Wanto A, Poningsih, Andani SR, et al. 2019. Decision tree optimization in C4.5 algorithm using genetic algorithm. Journal of Physics: Conference Series 1255:012012. doi: 10.1088/1742-6596/1255/1/012012
    [18] Mahesh B. 2020. Machine learning algorithms: a review. International Journal of Science and Research 9:381−86. doi: 10.21275/ART20203995
    [19] Charbuty B, Abdulazeez A. 2021. Classification based on decision tree algorithm for machine learning. Journal of Applied Science and Technology 2(1):20−28. doi: 10.38094/jastt20165

    [20] Belgiu M, Drăguţ L. 2016. Random forest in remote sensing: a review of applications and future directions. ISPRS Journal of Photogrammetry and Remote Sensing 114:24−31. doi: 10.1016/j.isprsjprs.2016.01.011
    [21] He Y, Lee E, Warner TA. 2017. A time series of annual land use and land cover maps of China from 1982 to 2013 generated using AVHRR GIMMS NDVI3g data. Remote Sensing of Environment 199:201−17. doi: 10.1016/j.rse.2017.07.010
    [22] Maxwell AE, Warner TA, Fang F. 2018. Implementation of machine-learning classification in remote sensing: an applied review. International Journal of Remote Sensing 39(9):2784−817. doi: 10.1080/01431161.2018.1433343
    [23] Gow J, Baumgarten R, Cairns P, Colton S, Miller P. 2012. Unsupervised modeling of player style with LDA. IEEE Transactions on Computational Intelligence and AI in Games 4(3):152−66. doi: 10.1109/TCIAIG.2012.2213600
    [24] Achille A, Soatto S. 2018. Information dropout: learning optimal representations through noisy computation. IEEE Transactions on Pattern Analysis and Machine Intelligence 40:2897−905. doi: 10.1109/TPAMI.2017.2784440
    [25] Wilkes JT, Gallistel CR. 2017. Information theory, memory, prediction, and timing in associative learning. In Computational Models of Brain and Behavior, ed. Moustafa AA. Hoboken, NJ, USA: John Wiley & Sons. pp. 481−92. https://doi.org/10.1002/9781119159193.ch35
    [26] Lizotte DJ, Laber EB. 2016. Multi-objective Markov decision processes for data-driven decision support. Journal of Machine Learning Research 17:211
    [27] Nguyen G, Dlugolinsky S, Bobák M, Tran V, López García Á, Heredia I, et al. 2019. Machine learning and deep learning frameworks and libraries for large-scale data mining: a survey. Artificial Intelligence Review 52(1):77−124. doi: 10.1007/s10462-018-09679-z
    [28] LeCun Y, Bengio Y, Hinton G. 2015. Deep learning. Nature 521:436−44. doi: 10.1038/nature14539
    [29] Schmidhuber J. 2015. Deep learning in neural networks: an overview. Neural Networks 61:85−117. doi: 10.1016/j.neunet.2014.09.003
    [30] Wang D, Cai Z, Zeng J, Zhang G, Guo J. 2020. Review of traffic data collection research on urban traffic control. Journal of Transportation Systems Engineering and Information Technology 20(3):95−102. doi: 10.16097/j.cnki.1009-6744.2020.03.015
    [31] Zhou L, Zhang Q, Yin C, Ye W. 2022. Research on short-term traffic flow prediction based on KNN-GRU. 2022 China Automation Congress (CAC), Xiamen, China, 2022. USA: IEEE. pp. 1924−28. https://doi.org/10.1109/CAC57257.2022.10055164
    [32] Yu B, Yin H, Zhu Z. 2018. Spatio-temporal graph convolutional networks: a deep learning framework for traffic forecasting. Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI '18), Stockholm, 2018. USA: International Joint Conferences on Artificial Intelligence. pp. 3634−40. https://doi.org/10.24963/ijcai.2018/505
    [33] Guo S, Lin Y, Feng N, Song C, Wan H. 2019. Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 31st Innovative Applications of Artificial Intelligence Conference and 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Honolulu, Hawaii, USA, 2019. Palo Alto, California, USA: AAAI Press. pp. 922−29. https://doi.org/10.1609/aaai.v33i01.3301922
    [34] Diao Z, Wang X, Zhang D, Liu Y, Xie K, et al. 2019. Dynamic spatial-temporal graph convolutional neural networks for traffic forecasting. Proceedings of the 33rd AAAI Conference on Artificial Intelligence, 31st Innovative Applications of Artificial Intelligence Conference and 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Honolulu, Hawaii, USA, 2019. Palo Alto, California, USA: AAAI Press. pp. 890−97. https://doi.org/10.1609/aaai.v33i01.3301890
    [35] Wu J, Fu J, Ji H, Liu L. 2023. Graph convolutional dynamic recurrent network with attention for traffic forecasting. Applied Intelligence 00:1−15. doi: 10.1007/s10489-023-04621-5
    [36] Ni Q, Zhang M. 2022. STGMN: a gated multi-graph convolutional network framework for traffic flow prediction. Applied Intelligence 52:15026−39. doi: 10.1007/s10489-022-03224-w
    [37] Yu H, Wu Z, Wang S, Wang Y, Ma X. 2017. Spatiotemporal recurrent convolutional networks for traffic prediction in transportation networks. Sensors 17(7):1501. doi: 10.3390/s17071501

    [38] Yao H, Tang X, Wei H, Zheng G, Li Z. 2019. Revisiting spatial-temporal similarity: a deep learning framework for traffic prediction. Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, Hawaii, USA, 2019. Palo Alto, California, USA: AAAI Press. pp. 5668−75. https://doi.org/10.1609/aaai.v33i01.33015668
    [39] Ma X, Dai Z, He Z, Ma J, Wang Y, et al. 2017. Learning traffic as images: a deep convolutional neural network for large-scale transportation network speed prediction. Sensors 17(4):818. doi: 10.3390/s17040818
    [40] Khaleghi B, Khamis A, Karray FO, Razavi SN. 2013. Multi-sensor data fusion: a review of the state-of-the-art. Information Fusion 14(1):28−44. doi: 10.1016/j.inffus.2011.08.001
    [41] Castanedo F. 2013. A review of data fusion techniques. The Scientific World Journal 2013:704504. doi: 10.1155/2013/704504
    [42] Lu B, Shu Q, Ma G. 2019. Short-time traffic flow prediction based on multi-source traffic data fusion. Journal of Chongqing Jiaotong University (Natural Science) 5:13−19+56. doi: 10.3969/j.issn.1674-0696.2019.05.03
    [43] Xiang C, Yang P, Xiao F, Fan X. 2023. Urban traffic application: traffic volume prediction. In Multi-dimensional Urban Sensing Using Crowdsensing Data. Singapore: Springer. pp. 113−50. https://doi.org/10.1007/978-981-19-9006-9_5
    [44] Cai B, Wang Y, Huang C, Liu J, Teng W. 2022. GLSNN network: a multi-scale spatiotemporal prediction model for urban traffic flow. Sensors 22:8880. doi: 10.3390/s22228880
    [45] Fang Z, Pan L, Chen L, Du Y, Gao Y. 2021. MDTP: a multi-source deep traffic prediction framework over spatio-temporal trajectory data. Proceedings of the VLDB Endowment 14(8):1289−97. doi: 10.14778/3457390.3457394
    [46] Lin L, Li J, Chen F, Ye J, Huai J. 2018. Road traffic speed prediction: a probabilistic model fusing multi-source data. IEEE Transactions on Knowledge and Data Engineering 30(7):1310−23. doi: 10.1109/TKDE.2017.2718525
    [47] Zhang J, Zheng Y, Qi D. 2017. Deep spatio-temporal residual networks for citywide crowd flows prediction. Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI '17), San Francisco, California, USA, 2017. Palo Alto, California, USA: AAAI Press. pp. 1655−61. https://doi.org/10.1609/aaai.v31i1.10735
    [48] Zhang Q, Jin Q, Chang J, Xiang S, Pan C. 2018. Kernel-weighted graph convolutional network: a deep learning approach for traffic forecasting. 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 2018. USA: IEEE. pp. 1018−23. https://doi.org/10.1109/ICPR.2018.8545106
    [49] Hu J, Guo C, Yang B, Jensen CS. 2019. Stochastic weight completion for road networks using graph convolutional networks. IEEE 35th International Conference on Data Engineering (ICDE), Macao, China, 2019. USA: IEEE. pp. 1274−85. https://doi.org/10.1109/ICDE.2019.00116
    [50] Luo X, Peng J, Liang J. 2022. Directed hypergraph attention network for traffic forecasting. IET Intelligent Transport Systems 16(4):85−98. doi: 10.1049/itr2.12130
    [51] Li J, Han Z, Cheng H, Su J, Wang P, et al. 2019. Predicting path failure in time-evolving graphs. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '19), Anchorage, USA, 2019. New York, USA: Association for Computing Machinery. pp. 1279−89. https://doi.org/10.1145/3292500.3330847
    [52] Zhao L, Song Y, Zhang C, Liu Y, Wang P. 2020. T-GCN: a temporal graph convolutional network for traffic prediction. IEEE Transactions on Intelligent Transportation Systems 21:3848−58. doi: 10.1109/TITS.2019.2935152
    [53] Yu JJQ, Gu J. 2019. Real-time traffic speed estimation with graph convolutional generative autoencoder. IEEE Transactions on Intelligent Transportation Systems 20(10):3940−51. doi: 10.1109/TITS.2019.2910560
    [54] Huang Y, Weng Y, Yu S, Chen X. 2019. Diffusion convolutional recurrent neural network with rank influence learning for traffic forecasting. 18th IEEE International Conference on Trust, Security and Privacy in Computing and Communications/13th IEEE International Conference on Big Data Science and Engineering (TrustCom/BigDataSE), Rotorua, New Zealand, 2019. USA: IEEE. pp. 678−85. https://doi.org/10.1109/TrustCom/BigDataSE.2019.00096
    [55] Li F, Feng J, Yan H, Jin G, Yang F, et al. 2023. Dynamic graph convolutional recurrent network for traffic prediction: benchmark and solution. ACM Transactions on Knowledge Discovery from Data 17(1):1−12. doi: 10.1145/3532611

    [56] Guo S, Lin Y, Feng N, Song C, Wan H. 2019. Attention based spatial temporal graph convolutional networks for traffic flow forecasting. Proceedings of the 33rd AAAI Conference on Artificial Intelligence, Honolulu, Hawaii, USA, 2019. Palo Alto, California, USA: AAAI Press. pp. 922−29. https://doi.org/10.1609/aaai.v33i01.3301922
    [57] Ge L, Li H, Liu J, Zhou A. 2019. Temporal graph convolutional networks for traffic speed prediction considering external factors. 20th IEEE International Conference on Mobile Data Management (MDM), Hong Kong, China, 2019. USA: IEEE. pp. 234−42. https://doi.org/10.1109/MDM.2019.00-52
    [58] Salort Sánchez C, Wieder A, Sottovia P, Bortoli S, Baumbach J. 2020. GANNSTER: graph-augmented neural network spatio-temporal reasoner for traffic forecasting. International Workshop on Advanced Analytics and Learning on Temporal Data (AALTD), eds. Lemaire V, Malinowski S, Bagnall A, Guyet T, Tavenard R, et al. vol 12588. Cham, Switzerland: Springer. pp. 63−76. https://doi.org/10.1007/978-3-030-65742-0_5
    [59] Zhang Y, Wang S, Chen B, Cao J. 2019. GCGAN: generative adversarial nets with graph CNN for network-scale traffic prediction. International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 2019. USA: IEEE. pp. 1−8. https://doi.org/10.1109/IJCNN.2019.8852211
    [60] Chai D, Wang L, Yang Q. 2018. Bike flow prediction with multi-graph convolutional networks. Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL '18), Seattle, Washington, 2018. New York, USA: Association for Computing Machinery. pp. 397−400. https://doi.org/10.1145/3274895.3274896
    [61] Han Y, Wang S, Ren Y, Wang C, Gao P, et al. 2019. Predicting station-level short-term passenger flow in a citywide metro network using spatiotemporal graph convolutional neural networks. ISPRS International Journal of Geo-Information 8(6):243. doi: 10.3390/ijgi8060243
    [62] Li Y, Yu R, Shahabi C, Liu Y. 2018. Diffusion convolutional recurrent neural network: data-driven traffic forecasting. International Conference on Learning Representations 2018. https://arxiv.org/pdf/1707.01926v3.pdf
    [63] Jiang J, Han C, Xin W, Wang J. 2023. PDFormer: propagation delay-aware dynamic long-range transformer for traffic flow prediction. Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington DC, USA, 2023. Washington, DC, USA: AAAI Press. pp. 4365−73. https://doi.org/10.1609/aaai.v37i4.25556
    [64] Deng P, Zhao Y, Liu J, Jia X, Wang M. 2023. Spatio-temporal neural structural causal models for bike flow prediction. Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington DC, USA, 2023. Washington, DC, USA: AAAI Press. pp. 4242−49. https://doi.org/10.1609/aaai.v37i4.25542
    [65] Guo M, Xiao X, Lan J. 2009. A summary of the short-time traffic flow forecasting methods. Techniques of Automation and Applications 28(6):8−9. doi: 10.3969/j.issn.1003-7241.2009.06.003
    [66] Williams BM, Hoel LA. 2003. Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: theoretical basis and empirical results. Journal of Transportation Engineering 129(6):664−72. doi: 10.1061/(ASCE)0733-947X(2003)129:6(664)
    [67] Pan B, Demiryurek U, Shahabi C. 2012. Utilizing real-world transportation data for accurate traffic prediction. 2012 IEEE 12th International Conference on Data Mining, Brussels, Belgium, 2012. USA: IEEE. pp. 595−604. https://doi.org/10.1109/ICDM.2012.52
    [68] Apaydin H, Feizi H, Sattari MT, Colak MS, Shamshirband S, et al. 2020. Comparative analysis of recurrent neural network architectures for reservoir inflow forecasting. Water 12(5):1500. doi: 10.3390/w12051500
    [69] Zhao Z, Chen W, Wu X, Chen PCY, Liu J. 2017. LSTM network: a deep learning approach for short-term traffic forecast. IET Intelligent Transport Systems 11(2):68−75. doi: 10.1049/iet-its.2016.0208
    [70] Liu C. 2022. Short-term traffic flow prediction based on LSTM and its variants. Transport Energy Conservation & Environmental Protection 18(4):99−105. doi: 10.3969/j.issn.1673-6478.2022.04.019
    [71] Xue X, Jia X, Wang Y, Sheng Y. 2020. Expressway traffic flow prediction model based on Bi-LSTM neural networks. 2020 4th International Conference on Traffic Engineering and Transportation System, IOP Conference Series: Earth and Environmental Science, Dalian, China, 2020. UK: IOP Publishing. 587:012007
    [72] Fu R, Zhang Z, Li L. 2016. Using LSTM and GRU neural network methods for traffic flow prediction. 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Wuhan, China, 2016. USA: IEEE. pp. 324−28. https://doi.org/10.1109/YAC.2016.7804912
    [73] Bai S, Kolter JZ, Koltun V. 2018. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint. doi: 10.48550/arXiv.1803.01271
    [74] Wu Z, Pan S, Long G, Jiang J, Zhang C. 2019. Graph WaveNet for deep spatial-temporal graph modeling. Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI-19). California, USA: International Joint Conferences on Artificial Intelligence Organization. pp. 1907−13. https://doi.org/10.24963/ijcai.2019/264
    [75] Ren H, Kang J, Zhang K. 2022. Spatio-temporal graph-TCN neural network for traffic flow prediction. 19th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), Chengdu, China, 2022. USA: IEEE. pp. 1−4. https://doi.org/10.1109/ICCWAMTIP56608.2022.10016530
    [76] Sun Y, Jiang X, Hu Y, Duan F, Guo K, et al. 2022. Dual dynamic spatial-temporal graph convolution network for traffic prediction. IEEE Transactions on Intelligent Transportation Systems 23(12):23680−93. doi: 10.1109/TITS.2022.3208943
    [77] Gao H, Jia H, Yang L, Li R. 2022. An improved CEEMDAN-FE-TCN model for highway traffic flow prediction. Journal of Advanced Transportation 2022:2265000. doi: 10.1155/2022/2265000

    [78] Brauwers G, Frasincar F. 2023. A general survey on attention mechanisms in deep learning. IEEE Transactions on Knowledge and Data Engineering 35(4):3279−98. doi: 10.1109/TKDE.2021.3126456
    [79] Zhang Z, Jiao X. 2021. A deep network with analogous self-attention for short-term traffic flow prediction. IET Intelligent Transport Systems 15(7):902−15. doi: 10.1049/itr2.12070
    [80] Zhang H, Zou Y, Yang X, Yang H. 2022. A temporal fusion transformer for short-term freeway traffic speed multistep prediction. Neurocomputing 500:329−40. doi: 10.1016/j.neucom.2022.05.083
    [81] Cai L, Janowicz K, Mai G, Yan B, Zhu R. 2020. Traffic transformer: capturing the continuity and periodicity of time series for traffic forecasting. Transactions in GIS 24:736−55. doi: 10.1111/tgis.12644
    [82] Tedjopurnomo DA, Choudhury FM, Qin AK. 2023. TrafFormer: a transformer model for predicting long-term traffic. arXiv preprint. doi: 10.48550/arXiv.2302.12388
    [83] Xu J, Deng D, Demiryurek U, Shahabi C, van der Schaar M. 2015. Mining the situation: spatiotemporal traffic prediction with big data. IEEE Journal of Selected Topics in Signal Processing 9(4):702−15. doi: 10.1109/JSTSP.2015.2389196
    [84] Min W, Wynter L. 2011. Real-time road traffic prediction with spatio-temporal correlations. Transportation Research Part C: Emerging Technologies 19(4):606−16. doi: 10.1016/j.trc.2010.10.002
    [85] Zhou J, Cui G, Hu S, Zhang Z, Yang C, et al. 2020. Graph neural networks: a review of methods and applications. AI Open 1:57−81. doi: 10.1016/j.aiopen.2021.01.001
    [86] Liu Q, Li J, Lu Z. 2021. ST-Tran: spatial-temporal transformer for cellular traffic prediction. IEEE Communications Letters 25(10):3325−29. doi: 10.1109/LCOMM.2021.3098557
    [87] Feng A, Tassiulas L. 2022. Adaptive graph spatial-temporal transformer network for traffic forecasting. Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM '22), Atlanta, USA, 2022. New York, USA: Association for Computing Machinery. pp. 3933−37. https://doi.org/10.1145/3511808.3557540
    [88] Fang Y, Jiang J, He Y. 2021. Traffic speed prediction based on LSTM-graph attention network (L-GAT). 4th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE), Changsha, China, 2021. USA: IEEE. pp. 788−93. https://doi.org/10.1109/AEMCSE51986.2021.00163
    [89] Guo H, Xie K. 2021. Research on traffic forecasting based on graph structure generation. 16th International Conference on Computer Science & Education (ICCSE), Lancaster, United Kingdom, 2021. USA: IEEE. pp. 855−58. https://doi.org/10.1109/ICCSE51940.2021.9569274
    [90] Yeghikyan G, Opolka FL, Nanni M, Lepri B, Liò P. 2020. Learning mobility flows from urban features with spatial interaction models and neural networks. IEEE International Conference on Smart Computing (SMARTCOMP), Bologna, Italy, 2020. USA: IEEE. pp. 57−64. https://doi.org/10.1109/SMARTCOMP50058.2020.00028
    [91] Zhang W, Yao R, Du X, Liu Y, Wang R, et al. 2023. Traffic flow prediction under multiple adverse weather based on self-attention mechanism and deep learning models. Physica A: Statistical Mechanics and its Applications 625:128988. doi: 10.1016/j.physa.2023.128988
    [92] Dong L, Zhang X, Liu L. 2022. Deep spatial-temporal network based on residual networks and dilated convolution for traffic flow prediction. IEEE 7th International Conference on Intelligent Transportation Engineering (ICITE), Beijing, China, 2022. USA: IEEE. pp. 284−89. https://doi.org/10.1109/ICITE56321.2022.10101467
    [93] Sun K, Ren Q, Jin H, Lv X. 2022. Deep spatio-temporal residual shrinkage networks for traffic prediction. IEEE 24th International Conference on High Performance Computing & Communications, Hainan, China, 2022. USA: IEEE. pp. 1036−41. https://doi.org/10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00164
    [94] Zhao Y, Deng P, Liu J, Jia X, Wang M. 2023. Causal conditional hidden Markov model for multimodal traffic prediction. Proceedings of the 37th AAAI Conference on Artificial Intelligence, Washington DC, USA, 2023. Washington, DC, USA: AAAI Press. pp. 4929−36. https://doi.org/10.1609/aaai.v37i4.25619
    [95] Liu C, Sun X, Wang J, Tang H, Li T, et al. 2020. Learning causal semantic representation for out-of-distribution prediction. arXiv preprint. doi: 10.48550/arXiv.2011.01681

    [96] Koesdwiady A, Soua R, Karray F. 2016. Improving traffic flow prediction with weather information in connected cars: a deep learning approach. IEEE Transactions on Vehicular Technology 65:9508−17. doi: 10.1109/TVT.2016.2585575
    [97] Yuan L, Zeng Y, Chen H, Jin J. 2022. Terminal traffic situation prediction model under the influence of weather based on deep learning approaches. Aerospace 9(10):580. doi: 10.3390/aerospace9100580
    [98] Fan Z. 2023. Short-term traffic flow prediction method with multiple factors and deep learning. 2023 IEEE 3rd International Conference on Electronic Technology, Communication and Information (ICETCI), Changchun, China, 2023. USA: IEEE. pp. 1237−43. https://doi.org/10.1109/ICETCI57876.2023.10176734
    [99] Lai Y, Chen S, Wang S, Lin B. 2022. A weather-based traffic prediction system using big data techniques. 12th International Conference on Advanced Computer Information Technologies (ACIT), Ruzomberok, Slovakia, 2022. USA: IEEE. pp. 379−83. https://doi.org/10.1109/ACIT54803.2022.9913125
    [100] Wang M, Tian S, Chen C, Zhong J. 2020. Short-time traffic flow forecast with weather characteristics. International Conference on Computer Communication and Network Security (CCNS), Xi'an, China, 2020. USA: IEEE. pp. 142−45. https://doi.org/10.1109/CCNS50731.2020.00039
    [101] Zhang W, Yao R, Du X, Ye J. 2021. Hybrid deep spatio-temporal models for traffic flow prediction on holidays and under adverse weather. IEEE Access 9:157165−81. doi: 10.1109/ACCESS.2021.3127584
    [102] Yao R, Zhang W, Long M. 2021. DLW-Net model for traffic flow prediction under adverse weather. Transportmetrica B: Transport Dynamics 10:499−524. doi: 10.1080/21680566.2021.2008280
    [103] Li T, Ma J, Lee C. 2020. Markov-based time series modeling framework for traffic-network state prediction under various external conditions. Journal of Transportation Engineering, Part A: Systems 146(6):04020042. doi: 10.1061/jtepbs.0000347
    [104] Shabarek A, Chien S, Hadri S. 2020. Deep learning framework for freeway speed prediction in adverse weather. Transportation Research Record: Journal of the Transportation Research Board 2674(10):28−41. doi: 10.1177/0361198120947421
    [105] Gao Y, Chiang Y, Zhang X, Zhang M. 2022. Traffic volume prediction for scenic spots based on multi-source and heterogeneous data. Transactions in GIS 26:2415−39. doi: 10.1111/tgis.12975
    [106] Song C, Lin Y, Guo S, Wan H. 2020. Spatial-temporal synchronous graph convolutional networks: a new framework for spatial-temporal network data forecasting. Proceedings of the AAAI Conference on Artificial Intelligence, New York, USA, 2020. Palo Alto, California, USA: AAAI Press. pp. 914−21. https://doi.org/10.1609/aaai.v34i01.5438

  • Cite this article

    Xing Z, Huang M, Peng D. 2023. Overview of machine learning-based traffic flow prediction. Digital Transportation and Safety 2(3):164−175 doi: 10.48130/DTS-2023-0013



    • In recent years, intelligent transportation systems have developed rapidly, spanning traffic management, rail transit, smart highways, and operations management, and bringing considerable convenience to daily life. Short-term traffic flow prediction, as a prerequisite for real-time traffic signal control, traffic assignment, route guidance, automatic navigation, and the planning of residents' travel connections in intelligent transportation systems, is currently a research hotspot in the transportation field[1]. The goal of traffic flow prediction is to estimate the future traffic conditions of a road network based on historical observations. According to the prediction time span, traffic prediction can be divided into short-term prediction and long-term prediction. As shown in Fig. 1, traffic flow prediction has significant application value in reducing road congestion, optimizing vehicle dispatch[2], formulating traffic control measures[3], reducing environmental pollution, and so on.

      Figure 1. 

      Benefits of traffic flow prediction.

      Short-term traffic flow prediction research poses certain challenges. On the one hand, owing to the randomness and uncertainty of traffic flow changes, the shorter the prediction period, the more difficult the prediction becomes. On the other hand, traffic flow has complex temporal and spatial dependencies[4]. Consider the traffic flow on a target road segment a. In terms of temporal dynamics, sudden accidents or rush-hour periods can make the traffic flow time series unstable. In terms of spatial correlation, as shown in Fig. 2, the adjacent upstream and downstream segments b and c, which run in the same direction as the target segment a, exhibit stronger correlation with a because of their smaller Euclidean distance, whereas the opposing segment d, despite a similar Euclidean distance to a, may exhibit weaker correlation. Moreover, a region in the road network is usually spatially dependent on another region through various non-Euclidean relationships, such as spatial adjacency, points of interest (POI), and semantic information. Therefore, how to model these dependency relationships remains a challenge.

      Figure 2. 

      Schematic diagram of spatial location distribution.

      With the development of Intelligent Transportation System (ITS) technologies, traffic information collection devices and transmission technologies have matured considerably. Devices such as loop detectors and vehicle GPS units can acquire vast amounts of real-time traffic data. As a result, the focus of traffic flow prediction has shifted from knowledge-driven to data-driven approaches[5]. This article therefore provides an overview of research on machine learning-based traffic flow prediction.

      In this study, the literature search was conducted using the Web of Science core database. The search scope was from 2000 to 2023, and the keywords used in the search included traffic prediction, traffic flow prediction, machine learning and deep learning. We proposed a novel classification method for the literature. Firstly, the research process of traffic flow prediction was divided into the prediction preparation process and the model establishment process. In the prediction preparation process, the literature was categorized and summarized based on data types and road network topologies. Next, in the model establishment process, the literature was classified and discussed based on whether spatial dependencies were modeled. Additionally, we provided a summary of innovative external modules that have improved prediction accuracy. Finally, we presented improvement strategies for current baseline models and discussed the challenges and research directions in the field of traffic flow prediction in the future.

    • ITS provides a large amount of high-quality traffic data for data-driven traffic flow prediction[6]. As shown in Fig. 3, machine learning and deep learning are considered subsets of artificial intelligence (AI) and have grown exponentially in the past few years. These methods have performed well in predicting traffic flow. This section presents the theoretical background of machine learning and deep learning for traffic flow prediction.

      Figure 3. 

      Relationships between artificial intelligence (AI), machine learning (ML) and deep learning (DL).

    • Machine learning (ML) techniques are statistical models used for classification and prediction based on the data provided. Machine learning is a field of artificial intelligence that focuses on developing predictive algorithms and aims to discover the intrinsic patterns in large datasets rather than designing models for one specific task[7,8]. ML models can be classified into three categories based on the learning technique they use: supervised learning, unsupervised learning, and reinforcement learning. The main methods within each category are shown in Table 1.

      Table 1.  Main methods of machine learning.

      Machine learning category    Main methods
      Supervised learning          Support vector machine (SVM)[9−11]; K-nearest neighbors (KNN)[12,13]; Logistic regression[9,13]; Linear regression[12,13]; Decision trees[14−19]; Random forest[20−22]
      Unsupervised learning        K-means clustering[9,13]; Principal component analysis[9]; Latent Dirichlet allocation[23]
      Reinforcement learning       Q-learning[24,25]; Monte Carlo tree search[26]
    • Around a decade ago, deep learning (DL) emerged as an effective machine learning technique and has shown good performance in multiple application domains. The core idea of deep learning methods is to use deep neural networks (DNN) to learn abstract features from data. These algorithms do not require manually engineered features; they learn complex features automatically[27].

      Deep Learning Architectures (DLAs) usually consist of nonlinear modules that transform low-level feature representations into higher and more abstract representations[28]. With enough of these transformations, the model can learn complex functions and structures. For example, in classification tasks, important features are typically preserved from higher-level representations while suppressing irrelevant variations. In contrast to traditional methods, the key advantage of deep learning is that the feature selection process is automatically done by a universal learning process without human intervention. With its specifiable depth of hierarchical learning, deep learning performs well in discovering high-dimensional data structures. Figure 4 illustrates the comparison between a traditional neural network and a DLA, where the difference lies in the number of hidden layers. Simple neural networks usually have only one hidden layer and require a feature selection process. Deep learning neural networks have two or more hidden layers, allowing for optimal feature selection and model adjustment during the learning process[29]. Currently, deep learning architectures mainly include Recurrent Neural Network (RNN), Long-Short Term Memory Network (LSTM), Convolutional Neural Network (CNN), Graph Convolutional Neural Network (GCN), Stacked Auto-Encoders (SAEs), Deep Belief Network (DBN), etc.

      Figure 4. 

      Difference between simple neural network and deep learning neural network.
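      The structural difference in Fig. 4 can be made concrete with a small sketch: a 'simple' network with one hidden layer versus a deeper stack of hidden layers. The layer sizes below are arbitrary and the networks are not tied to any traffic prediction model.

        import torch.nn as nn

        # Simple neural network: one hidden layer between input and output.
        shallow = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

        # Deep network: several hidden layers, so higher layers can build
        # increasingly abstract representations of the input features.
        deep = nn.Sequential(
            nn.Linear(10, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 1),
        )
        print(shallow)
        print(deep)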

      Deep learning is a subset of machine learning, so we will focus on reviewing and discussing research that utilizes machine learning and deep learning to address traffic flow prediction problems in the following sections.

    • In this section, we categorize existing research from two aspects, the types of data used and the topology of the road network, and discuss their characteristics separately.

    • We categorize the input data for traffic flow prediction models in current research into three types: fixed detection data, mobile detection data, and multi-source fusion data.

    • Fixed detection data mainly includes loop detection data, geomagnetic detection data, and microwave detection data[30]. Their principles and characteristics are shown in Table 2.

      Table 2.  Fixed detection data.

      Type                          Detection data type                              Characteristic
      Loop detection data           Traffic flow; Speed; Occupancy                   High detection accuracy, but accuracy decreases under traffic congestion
      Geomagnetic detection data    Traffic flow; Speed; Occupancy                   Unable to detect stationary and slow-moving vehicles
      Microwave detection data      Traffic flow; Speed; Occupancy; Density; Queue   Detection errors may occur when large vehicles obstruct the reflected waves of small vehicles

      Fixed detection data is collected by the corresponding fixed detectors; most detectors collect a reading every 30 s, and the readings are then aggregated into data samples on a 5-min cycle. As the most widely used method for collecting traffic data, fixed detectors play a crucial role in the development of ITS. Well-known public datasets in the transportation field, such as PeMS, NGSIM, and the UK Highways Agency traffic flow data, are collected with fixed detectors. For example, Zhou et al.[31] used traffic flow data from the PeMS dataset as input for their prediction model, and many other researchers[32−36] also use the PeMS dataset to train their traffic volume prediction models.
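      To make this aggregation step concrete, the short Python sketch below resamples 30 s detector counts into 5-min flow samples. It is a minimal illustration: the records and the column names 'timestamp' and 'vehicle_count' are hypothetical, not taken from PeMS or any other dataset named above.

        import pandas as pd

        # Hypothetical 30 s detector records: one row per reading.
        raw = pd.DataFrame({
            "timestamp": pd.date_range("2023-01-01 00:00", periods=20, freq="30s"),
            "vehicle_count": [3, 5, 4, 6, 2, 7, 5, 4, 6, 3, 8, 5, 4, 6, 7, 3, 5, 6, 4, 5],
        })

        # Aggregate to a 5-min cycle: flow is the sum of counts in each window.
        flow_5min = raw.set_index("timestamp")["vehicle_count"].resample("5min").sum()
        print(flow_5min)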

    • Mobile detection data usually refers to data collected by floating cars and connected vehicles[30]. Vehicles equipped with GPS positioning devices or network modules can record detailed information such as the vehicle's geographic coordinates and instantaneous speed in real-time while driving on the road. If the vehicle's driving trajectory is first converted into traffic volume[37, 38], travel time, driving speed[39], and other parameters, and then matched to the map, it can serve as a basis for subsequent research, as shown in Fig. 5. Unlike fixed detection data, mobile detection data covers a wider and more continuous range of vehicle driving trajectories, and is typically used for studies that consider road network or long-range dependencies.

      Figure 5. 

      Track data is matched to the map.

    • With the development of traffic data collection technology, more and more useful data are being collected through various methods, and different data sources have their own areas of applicability. Research using multi-source data[40,41] as model input has therefore begun to emerge. The process of multi-source data fusion is shown in Fig. 6. Lu et al.[42] fed microwave, geomagnetic, floating-car, and video detection data into separate prediction models, weighted each model's output according to its prediction error, and fused the weighted outputs to obtain the final prediction; the resulting multi-source model was more accurate than any single-source prediction model. Xiang et al.[43] innovatively proposed predicting traffic volume from fused data collected by sensors in surrounding buildings. Many scholars weigh how strongly each data source affects the problem under study and choose different sources to fuse, such as taxi data with detector data[44,45], or sensor speed data with map and traffic-platform data[46]. Experimental results show that feeding multi-source data into a model can improve its prediction accuracy. Because multi-source fusion exploits the complementary advantages of different data types, combining the accuracy of fixed detection data with the full temporal and spatial coverage of mobile detection data, using multi-source data as model input can effectively improve prediction accuracy.

      Figure 6. 

      Multi-source data fusion process.
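      The error-weighted fusion described for Lu et al.[42] can be sketched as follows. This is a minimal illustration with assumed numbers rather than their exact formulation: each single-source model's prediction is weighted in inverse proportion to its recent prediction error, and the weighted predictions are summed.

        import numpy as np

        # Hypothetical predictions from models trained on different sources
        # (microwave, geomagnetic, floating car, video) and their recent errors.
        preds  = np.array([412.0, 398.0, 405.0, 420.0])   # predicted 5-min flow per source
        errors = np.array([18.0, 25.0, 15.0, 30.0])       # e.g. recent MAE of each model

        # Weight each source inversely to its error, normalised to sum to 1.
        weights = (1.0 / errors) / np.sum(1.0 / errors)
        fused = float(np.dot(weights, preds))
        print(weights.round(3), fused)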

    • The two main types of road topology structures that have emerged in current research are grid structure and graph structure.

    • Figure 7 provides an example of dividing the road network into a grid structure. The regular grid structure is convenient for convolutional neural networks, which slide over layers to extract features. Therefore, many studies[37,47] partition the transportation network into uniformly sized grids to model spatial dependency. Ma et al.[39] matched GPS trajectory data onto a map and, after modeling the traffic network as an image-like grid, employed a CNN for spatial feature learning. Yao et al.[38] divided New York City (USA) into grids, with each grid cell representing a region, and defined the initial traffic volume of a region as the number of vehicles departing from or arriving in the region within a fixed time interval. They then utilized a local CNN and LSTM to process spatio-temporal information for traffic flow prediction.

      Figure 7. 

      Grid structure of traffic network.
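      The grid partition used in these studies can be illustrated with a short sketch. The bounding box, grid size, and departure points below are arbitrary placeholders; the idea is simply to map each departure event in a time interval to a grid cell and count events per cell.

        import numpy as np

        # Hypothetical map-matched departure points (lon, lat) within one interval.
        points = np.array([[-73.99, 40.73], [-73.95, 40.78], [-73.98, 40.75], [-73.96, 40.74]])

        # Study-area bounding box and an n x n grid (values chosen arbitrarily).
        lon_min, lon_max, lat_min, lat_max, n = -74.05, -73.90, 40.70, 40.82, 16

        # Map each point to a grid cell and count departures per cell.
        col = np.clip(((points[:, 0] - lon_min) / (lon_max - lon_min) * n).astype(int), 0, n - 1)
        row = np.clip(((points[:, 1] - lat_min) / (lat_max - lat_min) * n).astype(int), 0, n - 1)
        outflow = np.zeros((n, n), dtype=int)
        np.add.at(outflow, (row, col), 1)   # departure count per region in this interval
        print(outflow.sum(), outflow.max())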

    • Since roads are continuous in real life and transportation networks are not regular Euclidean structures, dividing the road network into a grid has the drawback of destroying the underlying structure of the transportation network. Therefore, another topology for transportation networks, the graph structure[32], was developed, as shown in Fig. 8.

      Figure 8. 

      Graph structure of traffic network.

      A transportation network is typically represented as a graph:

      $ {\bf G} = ({\bf V}, {\bf E}, {\bf A}) $

      Graph representations of transportation networks can be classified into weighted graphs[48−50] and unweighted graphs[51,52], directed graphs[53−55] and undirected graphs[33]. V represents the nodes of the graph, which can be detectors[56,57], road segments[58,59], or intersections[48,49]. Each node can carry one or more types of feature information, such as traffic volume[54] or traffic speed[34]. E represents the edges connecting the nodes. A is the adjacency matrix, which encodes the topological information of the traffic network. In a basic adjacency matrix the elements are either 0 or 1: a value of 1 indicates a connection between two nodes, while 0 signifies no connection. The elements of the adjacency matrix can also represent the distance between nodes[54,57].
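      To illustrate how an adjacency matrix A might be built, the sketch below constructs a binary adjacency (thresholded on distance) and a distance-weighted adjacency using a thresholded Gaussian kernel, one common choice for distance-based graphs. The pairwise distances, threshold, and kernel width are assumed values for illustration only.

        import numpy as np

        # Hypothetical pairwise road-network distances (km) between 4 nodes.
        dist = np.array([
            [0.0, 1.2, 3.5, 9.0],
            [1.2, 0.0, 2.1, 8.0],
            [3.5, 2.1, 0.0, 6.4],
            [9.0, 8.0, 6.4, 0.0],
        ])

        # Binary adjacency: 1 if two distinct nodes are closer than a threshold.
        A_binary = (dist < 4.0).astype(float) - np.eye(4)

        # Weighted adjacency: Gaussian kernel of distance, small weights cut to 0.
        sigma = dist[dist > 0].std()
        A_weighted = np.exp(-(dist ** 2) / (sigma ** 2))
        A_weighted[A_weighted < 0.1] = 0.0
        print(A_binary)
        print(A_weighted.round(2))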

      Convolutional operations[32,60,61] can extract high-dimensional features of the entire graph using the graph structure of the transportation network. Subsequent researchers have made many improvements, such as adding attention structures in ASTGCN[33], modeling transportation networks into directed graphs and adding diffusion processes in DCRNN[62], and using dynamic graph convolution in DGCN[34]. Many experimental results[36,63,64] have shown that the graph structure topology improves the model prediction accuracy.

    • In this section, we summarize and analyze existing traffic volume prediction models or frameworks from three perspectives: temporal modeling, spatial modeling, and other external modules.

    • Early traffic flow prediction was often modeled as a time series regression problem, such as in Formula 1. Therefore, various time series analysis methods have been applied to the field of traffic flow prediction.

      $ \tilde t = (s_{i,j},\ s_{i+1,j},\ \cdots,\ s_{n-1,j},\ s_{n,j})^T $ (1)

      The historical average model computes the average of the historical data over the whole period and uses that average directly as the prediction. It is therefore computationally simple, but its prediction accuracy is low and it is easily affected by outliers in the data[65]. Auto-Regressive Integrated Moving Average (ARIMA) is a commonly used time series analysis model that has been applied successfully to traffic flow prediction. As research has progressed, various ARIMA variants have also appeared in the field: Seasonal ARIMA[66] captures the periodicity of traffic flow, and ARIMA combined with historical averages[67] better simulates traffic behavior during peak periods. Other time series traffic prediction methods include KNN, SVR, and so on. However, time series models usually rely on a stationarity assumption that real-world traffic data often violate. To model non-linear temporal dependence, neural network-based methods have been applied to traffic prediction.
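      The two simplest baselines above can be written in a few lines. The sketch below uses a synthetic flow series; the ARIMA order (2, 1, 2) is illustrative rather than tuned, and the statsmodels ARIMA class is just one possible implementation.

        import numpy as np
        from statsmodels.tsa.arima.model import ARIMA

        # Synthetic 5-min flow series: one daily-like cycle (288 steps) plus noise.
        rng = np.random.default_rng(0)
        t = np.arange(288)
        flow = 300 + 120 * np.sin(2 * np.pi * t / 288) + rng.normal(0, 15, 288)
        train, horizon = flow[:276], 12            # predict the next hour (12 x 5 min)

        # Historical average baseline: predict the mean of the training window.
        ha_pred = np.full(horizon, train.mean())

        # ARIMA baseline with an untuned, purely illustrative order.
        arima_pred = ARIMA(train, order=(2, 1, 2)).fit().forecast(steps=horizon)

        mae = lambda y, p: float(np.mean(np.abs(y - p)))
        print("HA MAE:", mae(flow[276:], ha_pred), "ARIMA MAE:", mae(flow[276:], arima_pred))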

      As a powerful neural network model with memory function in time series analysis, Recurrent Neural Network (RNN) uses the previous output as the input for the next stage and repeatedly cycles in the hidden layer, making the model output more comprehensive[68]. The main structure of RNN is shown in Fig. 9.

      Figure 9. 

      Structure of RNN.

      Because RNNs suffer from vanishing gradients when dealing with long time series, Long Short-Term Memory (LSTM) was proposed as a variant of RNN. LSTM introduces gate structures: a forget gate selectively discards part of the information from the previous node while keeping important features to pass on to the next node. The main structure of LSTM is shown in Fig. 10.

      Figure 10. 

      Structure of LSTM.

      Zhao et al.[69] used LSTM for traffic flow prediction and compared their LSTM network with other deep learning methods such as RNN and SAE. The experimental results show that the LSTM network outperforms the other methods and handles long time series better than RNN. Zhou et al.[31] used Euclidean distance to characterize the spatial correlation within the traffic network and a gated recurrent neural network to capture the temporal dependency of traffic volume, and showed that the resulting model fits traffic flow trends better than LSTM. Overall, LSTM plays a pivotal role in traffic flow prediction research that does not consider spatial modeling. Various optimized models and variants of the LSTM network[70], such as Bi-LSTM[71] and GRU[72], have been proposed and have achieved good results in time series prediction.
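      A minimal LSTM predictor of the kind discussed above can be sketched in PyTorch as follows. The window length, hidden size, and random data are placeholders; this is a generic univariate predictor, not the architecture of any particular cited paper.

        import torch
        import torch.nn as nn

        class LSTMPredictor(nn.Module):
            """Predict the next flow value from a window of past observations."""
            def __init__(self, hidden=64):
                super().__init__()
                self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, num_layers=2, batch_first=True)
                self.head = nn.Linear(hidden, 1)

            def forward(self, x):                    # x: (batch, window, 1)
                out, _ = self.lstm(x)                # out: (batch, window, hidden)
                return self.head(out[:, -1, :])      # last time step -> (batch, 1)

        model = LSTMPredictor()
        window = torch.randn(8, 12, 1)               # 8 samples, 12 past 5-min readings each
        loss = nn.functional.mse_loss(model(window), torch.randn(8, 1))
        loss.backward()                              # one illustrative backward pass
        print(loss.item())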

      In 2018, Bai et al. showed that Temporal Convolutional Networks (TCN) can handle sequence modeling tasks effectively, with performance that can even surpass recurrent models[73]. The structure of TCN is shown in Fig. 11. Owing to TCN's advantage in parallel computation[74], many traffic flow prediction studies have started adopting TCN to extract temporal correlations[75−77].

      Figure 11. 

      Structure of TCN.
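      The core building block of a TCN is a dilated causal convolution, which pads only on the left so that the output at a time step never depends on future inputs. A minimal sketch follows, with channel count, kernel size, and dilation chosen arbitrarily.

        import torch
        import torch.nn as nn

        class CausalDilatedConv(nn.Module):
            """One dilated causal convolution, the building block of a TCN."""
            def __init__(self, channels=16, kernel_size=3, dilation=2):
                super().__init__()
                self.pad = (kernel_size - 1) * dilation        # left padding: no future leakage
                self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

            def forward(self, x):                              # x: (batch, channels, time)
                x = nn.functional.pad(x, (self.pad, 0))        # pad the left side only (causal)
                return torch.relu(self.conv(x))

        block = CausalDilatedConv()
        x = torch.randn(4, 16, 48)      # 4 sequences, 16 channels, 48 time steps
        print(block(x).shape)           # temporal length is preserved: (4, 16, 48)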

      With the attention mechanism[78] being applied to traffic flow prediction[33,79], using the Transformer to model temporal dependence[80,81] has become a hot topic in the field. As shown in Fig. 12, the Transformer contains multiple self-attention mechanisms, so it can capture the correlations among multiple feature dimensions of the raw data and produce more accurate predictions. Tedjopurnomo et al. proposed a new Transformer model with time and date embeddings[82], which avoids the issues associated with recurrent neural networks and effectively captures medium- to long-term traffic patterns, improving the accuracy of medium- to long-term traffic prediction.

      Figure 12. 

      Structure of Transformer.
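      At the heart of the Transformer is scaled dot-product self-attention, in which every time step attends to every other time step of the input sequence. A single-head sketch with randomly initialised projection matrices (purely illustrative) looks as follows:

        import torch

        def self_attention(x, w_q, w_k, w_v):
            """Single-head scaled dot-product self-attention over a sequence."""
            q, k, v = x @ w_q, x @ w_k, x @ w_v                      # (time, d) each
            scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)  # pairwise time-step affinities
            weights = torch.softmax(scores, dim=-1)                  # each step attends to all steps
            return weights @ v

        d = 8
        x = torch.randn(12, d)                                       # 12 time steps of d-dim features
        w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
        print(self_attention(x, w_q, w_k, w_v).shape)                # torch.Size([12, 8])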

      The above discussion is about modeling traffic time series, but traffic flow also shows complex spatial correlations. In order to further improve the prediction accuracy, people have begun to consider the spatial dependence between traffic flows.

    • In order to capture the spatial dependence between traffic time series, many scholars first extended the existing multivariate time series processing methods. These mainly include spatio-temporal HMM[83], spatio-temporal ARIMA[84], and so on. With the rise of deep learning, Convolutional Neural Network (CNN) has entered the public's view due to its excellent ability to extract high-dimensional features, and the spatial dependence modeling of traffic flow has also taken an important step forward.

      The basic CNN structure is shown in Fig. 13. A CNN performs feature extraction through convolutional layers and pooling layers: in the convolutional layer, a receptive field of a given size extracts local features, and the pooling layer filters the extracted features. Alternating the two several times extracts features from each local region of the data. To model spatial dependence, some researchers[37−39] divide the traffic network into grids, model it as a 2D matrix, and use a CNN to extract spatial features. Although this approach extracts spatial features, it ignores the underlying structure of the traffic network.

      Figure 13. 

      Structure of CNN.

      Because transportation networks are non-Euclidean in structure in reality, the modeling method of dividing the network into grids for convolutional operations using CNNs to some extent destroys the structural information of the transportation network. However, the Graph Convolutional Neural Network (GCN)[32] solves this issue. The basic structure of GCN is shown in Fig. 14. Unlike CNN, GCN is a method that directly propagates node features on graph data. It aggregates the feature information of neighboring nodes and performs feature transformation to transfer the information in the graph structure to the feature representation of the nodes. Specifically, graph convolution utilizes the feature aggregation of neighboring nodes, introduces non-linearity through linear transformation and activation functions, and updates the feature representation of the nodes. Through multiple iterations, the graph convolutional model can learn spatial dependencies between nodes and extract richer and more meaningful node representations[85]. After the spread of graph neural networks to various domains, Yu et al.[32] proposed a new deep learning framework, the Spatio-Temporal Graph Convolutional Network (STGCN), which does not use conventional CNNs and RNNs, but instead represents the transportation network with a graph structure and establishes a model with complete convolutional structures. This reduces the number of model parameters and increases training speed. Meanwhile, it also solves the limitation of traditional convolutional neural networks, which require Euclidean structures for convolutional operations, and better extracts spatial features.

      Figure 14. 

      Structure of GCN.
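      A single graph-convolution layer can be sketched with the widely used symmetric normalisation of the adjacency matrix: add self-loops, normalise by node degree, then apply a linear transform and a non-linearity. The small graph and random features below are illustrative only; this shows the generic propagation rule, not any specific traffic model.

        import numpy as np

        def gcn_layer(A, H, W):
            """One graph convolution: neighbour aggregation with symmetric normalisation."""
            A_hat = A + np.eye(A.shape[0])                 # add self-loops
            d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
            A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
            return np.maximum(A_norm @ H @ W, 0.0)         # ReLU activation

        # 4 road-network nodes, 3 input features each, mapped to 2 hidden features.
        A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
        H = np.random.rand(4, 3)
        W = np.random.rand(3, 2)
        print(gcn_layer(A, H, W))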

      After modeling transportation networks with graph structures was proposed, GCN has achieved remarkable results in the field of traffic flow prediction[86,87]. It has not only improved the accuracy of conventional road traffic flow prediction[44,80,81] but has also appeared in research fields such as bicycle flow prediction[60] and metro passenger flow prediction[61]. Building on this, Guo et al.[33] proposed the attention-based spatial-temporal graph convolutional network (ASTGCN), which models the daily and weekly dependencies of traffic flow using GCN with an added attention mechanism to capture the dynamic correlations in traffic data more effectively. Diao et al.[34] proposed a dynamic graph convolutional neural network (DGCNN) for traffic prediction and designed a dynamic Laplacian matrix to track the spatial dependencies in traffic data as they change over time.

      In the field of graph neural networks, apart from Graph Convolutional Neural Networks (GCN), there are other networks that can model spatial dependencies, such as Graph Attention Networks[88] (GAT), Graph Convolutional Recurrent Networks[89] (GCRN), Graph Autoencoders[90] (GAE), and Graph Generative Adversarial Networks (Graph GANs). Currently, their application in the field of traffic prediction is relatively limited, but they can be considered as future research directions.

      Given the great advantages of graph structures in preserving transportation network features, GCN is likely to remain a leading choice for spatial dependency modeling.

    • With the development of relevant theories, an increasing number of researchers have not only modeled traditional spatio-temporal dependencies but also added various auxiliary modules to the entire prediction model to further improve the accuracy of traffic flow prediction. For example, attention mechanisms[4, 33, 79, 91], residual connections[47, 92, 93], and other modules were incorporated. In this section, we will focus on discussing some innovative modules.

    • An increasing number of traffic prediction models focus heavily on spatio-temporal correlations while neglecting other factors that contribute to the observed outcomes. Moreover, the influence of spatio-temporal correlations is considered unstable under different conditions[94]. Random contextual conditions in historical observation data can lead to spurious correlations between data and features[95], degrading model performance. To accurately capture the correlations between observed outcomes and influencing factors, Deng et al. proposed spatio-temporal neural structural causal models from a causal-relationship perspective[64]. Specifically, they first constructed a causal graph to describe traffic prediction and analyzed the causal relationships among input data, contextual conditions, spatio-temporal states, and prediction outcomes. They then applied the backdoor criterion to eliminate confounding factors during feature extraction. Finally, they introduced a counterfactual representation inference module to extrapolate the spatio-temporal states from factual scenarios to future counterfactual scenarios. Experimental results demonstrated that the model exhibits superior performance and robust anti-interference capability.

    • Weather, as one of the significant factors influencing traffic flow[96, 97], has received considerable attention from researchers and has been integrated into prediction models[98−101]. Yao et al.[102] developed a hybrid deep learning model called DLW-Net that focuses on adverse weather conditions; the model uses LSTMs to capture the variations in both traffic flow and weather data. Li et al.[103] and Shabarek et al.[104] also proposed deep learning models for traffic flow prediction under adverse weather. Experimental results show that incorporating weather factors into prediction models improves accuracy to some extent.

      However, Zhang et al.[91] pointed out that many existing weather-aware traffic flow prediction studies use only specific weather conditions as input features, so the resulting models do not generalize well enough to predict traffic accurately under a variety of adverse weather conditions. To address this, they developed a Deep Hybrid Attention (DHA) model that considers light rain, moderate rain, heavy rain, light fog, haze, fog, moderate wind, and strong wind. In the DHA model, the weather module is built from a ConvLSTM network augmented with an attention mechanism, allowing it to capture the spatio-temporal patterns of weather data. Experimental results show that DHA achieves satisfactory performance under adverse weather conditions.
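      To make the general idea of weather-aware prediction concrete, the sketch below simply concatenates weather covariates with traffic readings at each time step and feeds them to an LSTM. It is a toy stand-in for models such as DLW-Net or DHA, whose actual architectures (e.g., ConvLSTM with attention) are more elaborate; all names and dimensions here are assumptions.

```python
import torch
import torch.nn as nn

class WeatherAwareLSTM(nn.Module):
    """Illustrative sketch: concatenate traffic readings with weather
    covariates (rain intensity, visibility, wind, ...) at every time step
    and let a single LSTM learn their joint temporal dynamics."""
    def __init__(self, n_traffic=1, n_weather=4, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_traffic + n_weather, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)            # predict next-step flow

    def forward(self, traffic, weather):
        x = torch.cat([traffic, weather], dim=-1)   # (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])                # use the last hidden state

# Toy usage: 8 samples, 12 historical time steps, 4 weather covariates
model = WeatherAwareLSTM()
flow = torch.randn(8, 12, 1)
wx = torch.randn(8, 12, 4)
print(model(flow, wx).shape)   # -> torch.Size([8, 1])
```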

    • The spatial dependency between locations in a traffic system is highly dynamic rather than static: it changes over time with travel patterns and unexpected events. On the one hand, because of the division of urban functions, two locations that are far apart may exhibit almost identical traffic patterns owing to their similar functions, which means that spatial dependency can be long-range in some cases. Existing Graph Neural Network (GNN) models, however, suffer from over-smoothing, which makes long-range spatial correlations difficult to capture. On the other hand, the impact of unexpected events on the spatial dependency of the traffic system cannot be ignored: when a traffic accident occurs at one location, its effect on the traffic conditions of adjacent locations only appears after a delay of several minutes. This characteristic is often overlooked in GNN models.

      To address these issues, Jiang et al.[63] proposed the PDFormer model, based on a spatio-temporal self-attention mechanism. It consists mainly of a spatial self-attention module that models both local geographic neighborhoods and global semantic neighborhoods, and a traffic delay-aware feature transformation module that models the time delay in spatial information propagation. PDFormer achieves high accuracy, computational efficiency, and interpretability.
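      The sketch below illustrates only the generic mechanism of masked spatial self-attention over road-network nodes, where a mask restricts attention to a chosen neighborhood; it is not the PDFormer implementation, and the function name and toy mask are our own assumptions.

```python
import torch
import torch.nn.functional as F

def masked_spatial_attention(node_feats, mask):
    """Minimal sketch of spatial self-attention over road-network nodes:
    every node attends to the others, and a boolean mask restricts
    attention to, e.g., a geographic or semantic neighbourhood."""
    q = k = v = node_feats                        # (nodes, dim); learned projections omitted
    scores = q @ k.T / (q.shape[-1] ** 0.5)       # scaled dot-product similarity
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v          # weighted aggregation over allowed neighbours

# Toy usage: 4 nodes with 8-dim features; each row of the mask marks allowed neighbours
feats = torch.randn(4, 8)
mask = torch.tensor([[1, 1, 0, 0],
                     [1, 1, 1, 1],
                     [0, 1, 1, 1],
                     [0, 1, 1, 1]], dtype=torch.bool)
print(masked_spatial_attention(feats, mask).shape)   # -> torch.Size([4, 8])
```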

    • Although scholars have conducted extensive research in the field of traffic flow prediction, there is still a lot of work to be done in this area. We present in Table 3 the current performance of baseline models on public datasets and potential optimization methods that can be considered in the future. Furthermore, we discuss and analyze the future research directions in traffic flow prediction.

      Table 3.  Baseline model performance and its prospects.

      Model            MAE (PeMS04)   RMSE (PeMS04)   Optimization prospect
      ARIMA            32.11          44.59           Improve robustness
      LSTM             28.83          37.32           Multi-LSTM stack / increase dropout
      GRU              28.32          40.21           Bi-GRU / use semi-supervised training
      Transformer[81]  18.92          21.28           Reduce quadratic complexity
      DCRNN[62]        24.70          33.60           Replace the activation function / add residual connections
      STGCN[32]        25.15          31.45           Change convolution kernel size / add skip connections
      ASTGCN[33]       21.80          28.05           Modify the spatial convolution / integrate external factors
      STSGCN[106]      21.19          24.26           Add multi-granularity information / use semi-supervised learning to increase robustness
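      As one concrete reading of the optimization prospects in Table 3, the following minimal sketch shows a stacked LSTM with inter-layer dropout, corresponding to the "Multi-LSTM stack / increase dropout" entry for the LSTM baseline; the hyperparameter values are illustrative assumptions only.

```python
import torch.nn as nn

# Illustrative sketch of one optimization prospect from Table 3:
# a stacked (multi-layer) LSTM with dropout applied between layers.
stacked_lstm = nn.LSTM(
    input_size=1,      # univariate flow per detector
    hidden_size=64,
    num_layers=3,      # "Multi-LSTM stack"
    dropout=0.2,       # "increase dropout" between stacked layers
    batch_first=True,
)
```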
    • So far, prediction of traffic flow under normal conditions is well developed, but predicting traffic flow in extreme environments is also worth exploring: for example, during holiday peak periods, after accidents, during the rainy season in certain regions, or when roads remain icy for prolonged periods in extreme weather.

    • Traffic flow usually exhibits very long-term temporal dependence: the current traffic state may be strongly correlated with conditions a day, a week, or even several months earlier. However, the most popular non-linear temporal modeling methods, RNNs and their variants, struggle to capture such long-range dependencies. In addition, RNNs are difficult to parallelize, so their training cost is relatively high. Future research can therefore focus on modeling long-term non-linear temporal dependencies.

    • Some scholars have begun to account for the impact of external factors such as weather[54] and geographical information[105] on traffic flow, but the complex dependencies between traffic flow and external factors also involve many other aspects, including road characteristics, traffic demand, road planning, vehicle delays, and flow-control strategies.

    • As discussed earlier, researchers have explored multi-source data fusion as model input, and some have used cross-domain data[43,46,106] to predict traffic flow. However, the heterogeneity of multi-source data poses a significant challenge for fusion, and the data may need to be reconciled in three respects: data structure, data parameters, and data distribution.

    • As model accuracy improves, model structures become more complex and combine multiple algorithms, which incurs higher time costs. To meet the requirements of real-time prediction tasks, it is also important to improve model efficiency and reduce running time.

    • Common evaluation metrics for traffic flow prediction include Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE), all of which average the errors over every predicted point. However, not all prediction points are equally difficult: generally, the larger the standard deviation of a location's flow, the harder it is to predict. For example, busy intersections during peak hours may produce larger prediction errors, whereas roads in the early morning hours are typically predicted with smaller errors. Evaluation metrics that weight hard- and easy-to-predict points differently before computing the final error could therefore describe model performance more accurately; one possible weighting scheme is sketched below.
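      As a rough sketch of the weighting idea just described, the function below weights each location by the standard deviation of its observed flow before averaging per-location MAE values. This specific scheme is our own assumption for illustration, not an established metric.

```python
import numpy as np

def weighted_mae(y_true, y_pred):
    """Weighted MAE: harder-to-predict locations (larger flow standard
    deviation) receive larger weights before the per-location absolute
    errors are averaged.
    y_true, y_pred: arrays shaped (time_steps, locations)."""
    loc_std = y_true.std(axis=0)
    weights = loc_std / (loc_std.sum() + 1e-12)             # normalize weights to sum to 1
    per_loc_mae = np.abs(y_true - y_pred).mean(axis=0)      # MAE per location
    return float((weights * per_loc_mae).sum())

# Toy usage: 100 time steps, 5 locations
rng = np.random.default_rng(0)
obs = rng.random((100, 5)) * 100
pred = obs + rng.normal(scale=5, size=obs.shape)
print(weighted_mae(obs, pred))
```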

    • Most prediction models are trained and tested only on data from specific road segments, which may degrade their performance when predicting traffic flow on other segments. How to effectively improve the generalization ability of models is therefore also an ongoing concern.

    • Most current prediction models assume that spatial dependencies are fixed. However, in a real-world traffic network, spatial dependencies change across time steps, driven by factors such as accidents, weather conditions, and rush versus non-rush hours. Investigating how to build models that capture these dynamic spatial dependencies, and thereby improve performance across multi-step predictions, is therefore required.

      • This work was supported by the 2022 Shenyang Philosophy and Social Science Planning project under grant SY202201Z and the Liaoning Provincial Department of Education Project under grant LJKZ0588.

      • The authors declare that they have no conflict of interest.

      • Copyright: © 2023 by the author(s). Published by Maximum Academic Press, Fayetteville, GA. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.
  • About this article
    Cite this article
    Xing Z, Huang M, Peng D. 2023. Overview of machine learning-based traffic flow prediction. Digital Transportation and Safety 2(3):164−175 doi: 10.48130/DTS-2023-0013