Abstract
Traffic forecasting is vital for Intelligent Transportation Systems, for which Machine Learning (ML) methods have been extensively explored to develop data-driven Artificial Intelligence (AI) solutions. Recent research focuses on modelling spatial-temporal correlations for short-term traffic prediction, leaving the more desirable long-term forecasting a challenging and open issue. This paper presents a comparative study on large-scale real-world signalized arterial and freeway traffic flow datasets, aiming to evaluate promising ML methods over long forecasting horizons of up to 30 days. Focusing on modelling capacity for temporal dynamics, we develop one ensemble ML method, eXtreme Gradient Boosting (XGBoost), and a range of Deep Learning (DL) methods, including Recurrent Neural Network (RNN)-based methods and the state-of-the-art Transformer-based method. Time embedding is leveraged to enhance their understanding of seasonality and event factors. Experimental results highlight that while the attention mechanism/Transformer framework is effective for capturing long-range dependencies in sequential data, the key to effective traffic forecasting gradually shifts from capturing temporal dependencies to modelling periodicity as the forecasting horizon extends. Time embedding is particularly effective in this context, helping a naive RNN outperform Informer by 31.1% for 30-day-ahead forecasting. Meanwhile, XGBoost, an efficient and robust model, performs competitively with DL methods while learning solely from time features. Moreover, we investigate the impacts of various factors such as input sequence length, holiday traffic, data granularity, and training data size. The findings offer valuable insights and serve as a reference for future long-term traffic forecasting research and for improving the corresponding learning capabilities of AI.