Lu Liang This email address is being protected from spambots. You need JavaScript enabled to view it.1, Jacob Daniels2 

1 Department of Geography and the Environment, University of North Texas, Denton, TX 76203, USA
2 Department of Electrical Engineering, University of North Texas, Denton, TX 76203, USA

Received: February 14, 2022
Revised: June 6, 2022
Accepted: June 23, 2022

 Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.

Download Citation: ||  

Cite this article:

Liang, L., Daniels, J. (2022). What Influences Low-cost Sensor Data Calibration? - A Systematic Assessment of Algorithms, Duration, and Predictor Selection. Aerosol Air Qual. Res. 22, 220076.


  • Regression-based algorithms appear to be an efficient yet reliable methods.
  • Small sample size can greatly affect the sensor calibration performance.
  • Particle counts and relative humidity are crucial factors in correcting LCS data.


The low-cost sensor has changed the air quality monitoring paradigm with the capacity for efficient network expansion and community engagement. The surge in its use has sparked new research interests in understanding its data quality. Many studies have employed field calibration to improve sensor agreement with co-located reference monitors. Yet, studies that systematically examine the performance of different calibration techniques are limited in scope and depth. This study comprehensively assessed ten widely used data techniques, namely AdaBoost, Bayesian ridge, gradient tree boosting, K-nearest neighbors, Lasso, multivariable linear regression, neural network, random forest, ridge regression, and support vector machine. We compared their performance using a standardized baseline dataset and their responses to various parameter combinations. We further assessed the training sample size effect to understand the optimal duration of field calibration for achieving good accuracy. Finally, we tested different predictor combinations to address whether the inclusion of more predictors will lead to better performance. Using baseline data, the neural network achieved the best performance, followed by the four regression-based methods, showing very consistent and stable performance. While confirming that the latest research tendency is deep learning, regression is still a viable option for studies with limited effort in parameter tuning and method selection, especially considering its computational efficiency and simplicity. The sample size effect is most evident when the sample size drops below 30%, which is equivalent to six weeks of continuously collected hourly data. Although algorithms react differently to the number of predictors, their performance was typically boosted by adding more predictors, especially the particle count and humidity. Our study not only describes an approach of sophisticated data-driven calibration for practical applications, but also provides insights into the compounding impacts of parameters, samples, and predictors in algorithm performance.

Keywords: PurpleAir, Machine learning, Particulate matter, PM2.5, Air quality

Share this article with your colleagues 


Subscribe to our Newsletter 

Aerosol and Air Quality Research has published over 2,000 peer-reviewed articles. Enter your email address to receive latest updates and research articles to your inbox every second week.

77st percentile
Powered by
   SCImago Journal & Country Rank

2021 Impact Factor: 4.53
5-Year Impact Factor: 3.668

Aerosol and Air Quality Research partners with Publons

Aerosol and Air Quality Research partners with Publons

Aerosol and Air Quality Research (AAQR) is an independently-run non-profit journal that promotes submissions of high-quality research and strives to be one of the leading aerosol and air quality open-access journals in the world. We use cookies on this website to personalize content to improve your user experience and analyze our traffic. By using this site you agree to its use of cookies.