Jinghui Ma1,2,3, Zhongqi Yu2,3, Yuanhao Qu2,3, Jianming Xu2,3,4, Yu Cao2,3

Fudan University, Shanghai 200433, China
Shanghai Typhoon Institute, Shanghai Meteorological Service, Shanghai 200030, China
Shanghai Key Laboratory of Meteorology and Health, Shanghai Meteorological Service, Shanghai 200030, China
Anhui Province Key Laboratory of Atmospheric Science and Satellite Remote Sensing, Hefei 230000, China

Received: August 23, 2019
Revised: November 18, 2019
Accepted: November 28, 2019
DOI: https://doi.org/10.4209/aaqr.2019.08.0408

Cite this article:

Ma, J., Yu, Z., Qu, Y., Xu, J. and Cao, Y. (2020). Application of the XGBoost Machine Learning Method in PM2.5 Prediction: A Case Study of Shanghai. Aerosol Air Qual. Res. 20: 128-138. doi: 10.4209/aaqr.2019.08.0408.


  • XGBoost can improve the accuracy of WRF-Chem predictions of PM2.5.
  • The XGBoost model can accurately predict heavy winter pollution.
  • It provides a new method to enhance the capacity of air quality forecasting in China.


Air quality forecasting is crucial to reducing air pollution in China, which has detrimental effects on human health. Atmospheric chemical-transport models can provide air pollutant forecasts with high temporal and spatial resolution and are widely used for routine air quality predictions (e.g., 1–3 days in advance). However, their performance is limited by uncertainties in the emission inventory and biases in the initial and boundary conditions, as well as deficiencies in the current chemical and physical schemes. As a result, new methods such as machine learning are being explored in the field of air quality forecasting. This study combined hourly PM2.5 mass concentration forecasts from an operational air quality numerical prediction system (WRF-Chem) at the Shanghai Meteorological Service (SMS) with comprehensive near-surface measurements of air pollutants and meteorological conditions to develop a machine learning model that estimates the daily PM2.5 mass concentration in Shanghai, China. With correlation coefficients that are higher by 50–100% and a standard deviation that is lower by 14–24 µg m–3, the machine learning model provides significantly better daily forecasting of PM2.5 than the WRF-Chem model. Thus, this research offers a new technique for enhancing air quality forecasting in China.
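The comparison described in the abstract rests on two skill scores: the correlation between forecast and observed daily PM2.5, and a standard deviation. A minimal sketch of how such scores might be computed is given here. It assumes the standard deviation refers to the spread of forecast-minus-observation errors, which the abstract does not specify. The function names and all numerical values are hypothetical and purely illustrative; they are not data or results from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def error_std(forecast, observed):
    """Population standard deviation of forecast-minus-observation errors.

    Assumption: this is one plausible reading of "standard deviation"
    in the abstract, not necessarily the authors' definition.
    """
    errors = [f - o for f, o in zip(forecast, observed)]
    mean_err = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean_err) ** 2 for e in errors) / len(errors))


# Synthetic daily PM2.5 values in µg m-3 (hypothetical, NOT from the paper).
observed = [35.0, 60.0, 110.0, 80.0, 45.0, 150.0]
raw_forecast = [50.0, 40.0, 90.0, 120.0, 70.0, 100.0]          # stands in for WRF-Chem
corrected_forecast = [38.0, 55.0, 100.0, 85.0, 50.0, 140.0]   # stands in for the ML model

for name, forecast in [("raw", raw_forecast), ("corrected", corrected_forecast)]:
    r = pearson_r(forecast, observed)
    s = error_std(forecast, observed)
    print(f"{name}: r = {r:.2f}, error std = {s:.1f} µg m-3")
```

Under this reading, a better forecast shows a higher correlation and a smaller error spread against the same observations, which is the direction of improvement the abstract reports for the machine learning model relative to WRF-Chem.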

Keywords: XGBoost algorithm; PM2.5; WRF-Chem; Machine learning.
