Special Issue on Air Quality in a Changed World: Regional, Ambient, and Indoor Air Concentrations from the COVID to Post-COVID Era (VI)

Ludmilla Manera Conti1, Dirceu Luís Herdies1, Débora Souza Alvim1,2, Sergio Machado Corrêa3

1 National Institute for Space Research, Cachoeira Paulista, SP 12630-000, Brazil
2 Lorena School of Engineering (EEL), University of Sao Paulo (USP), Lorena 05508-050, SP, Brazil
3 Faculty of Technology, Rio de Janeiro State University (UERJ), Resende, RJ 27537-000, Brazil

Received: November 30, 2021
Revised: April 14, 2022
Accepted: June 10, 2022

 Copyright The Author(s). This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are cited.

https://doi.org/10.4209/aaqr.210364

Cite this article:

Conti, L.M., Herdies, D.L., Alvim, D.S., Corrêa, S.M. (2022). Analysis of the Effect of the Truck Strike and COVID-19 on the concentration of NOx and O3 in the Metropolitan Region of the Vale do Paraiba, São Paulo, Brazil. Aerosol Air Qual. Res. 22, 210364. https://doi.org/10.4209/aaqr.210364


  • During the truckers' strike and the COVID-19 lockdown, NO concentrations decreased by 58%–70%.
  • Owing to the lockdown imposed by the pandemic, O3 increased by up to 24%.
  • The changes in pollutant concentrations were not attributable to meteorological parameters.


The diurnal patterns of nitrogen oxides (NO, NO2) and ozone (O3), together with temperature, relative humidity, pressure, wind direction and speed, and solar radiation, were studied from 2017 to 2020 within a period of 21 days in two towns in the Paraiba Valley: São José dos Campos (SJC) and Guaratinguetá (GRT). In 2018 there was a truckers' strike in Brazil, and in 2020 a partial lockdown was imposed in response to the coronavirus pandemic; in this study, Machine Learning techniques and a multivariate statistical analysis were used to compare these different periods. During both 2018 and 2020 there was a reduction in the NO and NO2 concentrations, particularly in NO, a primary pollutant emitted during peak hours of vehicular traffic; the reduction was most notable in 2018 owing to the truckers' strike. An application of the Tukey test to the NO, NO2 and O3 data showed that the datasets were similar on a decreasing scale (NO, NO2 and O3), although the differences between them remained statistically significant. The Principal Component Analysis (PCA) identified the first principal component for both towns over the entire study period; it explained around 42% of the variance in the data and captured the interrelationships between the variables, with a strong positive influence of O3 concentrations, temperature (T), wind speed (WS) and solar radiation (SR). In addition, the Boruta algorithm revealed a considerable difference between the towns in the variables that influence O3 concentrations: NO2 and relative humidity were the most important variables for feature selection in GRT, whereas NO2 and global solar radiation were the most important in SJC.

Keywords: Air Pollution, COVID-19 pandemic, Strike, Machine Learning, PCA


Brazil faces a number of regional environmental problems, including the concentration of industrial plants, high traffic density, wildfires and densely populated areas (Gonçalves et al., 2017; Souza et al., 2017; Souza et al., 2014; Souza et al., 2015). The indices for these factors in the South-East region, in particular the State of São Paulo, are the highest in the country (IBGE, 2019). The towns of São José dos Campos (SJC) and Guaratinguetá (GRT) located in the metropolitan region of Paraiba Valley (MRPV) were chosen for a case study.

According to the U.S. EPA, the presence of NOx in the atmosphere is the main indicator of anthropogenic sources of air pollution, such as motor vehicles and energy generators (U.S. EPA, 2018). On a mass basis, mobile sources account for 59.0% and 76.1% of the pollutants emitted per year in GRT and SJC, respectively (CETESB, 2018). Tropospheric O3, on the other hand, is a toxic secondary pollutant that is harmful to human and plant health and also acts as a greenhouse gas. It is formed as a result of photochemical reactions between NOx and volatile organic compounds (VOCs) in the presence of sunlight (Chiquetto et al., 2021; Seinfeld and Pandis, 2016).

Previous studies on changes in air quality and vehicular activity have reached different findings regarding their effects on pollutant concentrations. Decreases in the concentrations of pollutants such as NO, CO, BC, PM and O3 were observed during periods of reduced vehicle traffic in Spain, Italy, Israel and India (Sharma et al., 2020; Basagaña et al., 2018; Meinardi et al., 2008; Levy, 2013). Increases in O3 concentration were found during the COVID-19 lockdown periods in Brazil, Spain, Chile, Indonesia and India (Mahato et al., 2020; Nakada and Urban, 2020; Siciliano et al., 2020; Tobías et al., 2020; Morales-Solís et al., 2021; Rendana, 2021). A recent study by Naqvi et al. (2021) showed a moderate positive correlation between COVID-19 cases and mortality and NO2 concentrations, while O3 showed a weak to moderate positive correlation. This suggests that populations living in adverse conditions are more susceptible to infection and mortality due to COVID-19 (Coccia, 2021). Thus, emission patterns and concentrations of pollutants are complex, and further studies are needed to understand these relationships.

Atmospheric chemistry is a complex scientific area, and the relationship between the variables involved is highly non-linear. Machine Learning and multivariate statistical techniques have been employed to discover potential hidden relationships in big datasets, since pairwise correlations and time-series graphs alone do not reveal such visual or behavioral patterns. New spatio-temporal knowledge about the environment obtained from complex relationships can be used in the workflow of forecasting and monitoring atmospheric pollutants. It can also be used as input data for other processes and models, either to improve the accuracy of the results or to decrease the overall computational cost by enabling streaming analyses (Schultz et al., 2021; Amuthadevi et al., 2021). Machine learning models have been used to understand the chemical composition and spatiotemporal characteristics of the variation of air pollutants (Xiao et al., 2020; Oduber et al., 2021; Govender and Sivakumar, 2020; Binaku and Schmeling, 2017; Khatri and Hayasaka, 2021). Interest in artificial intelligence and machine learning has been renewed, prompting discussion of relevant applications to specific problems of data analysis, numerical modeling and post-processing, mainly because of the statistical and probabilistic approach that these methodologies provide (Schultz et al., 2021).

From May 21 to June 1, 2018, there was a national truck drivers' strike in Brazil, driven by demands from the SCC (truck drivers' union) for a reduction in fuel prices and improvements in working conditions for the sector. This industrial action brought the circulation of heavy vehicles almost to a halt throughout Brazil. The year 2020 was characterized by the implementation of social distancing measures in response to the COVID-19 pandemic. In the State of São Paulo, where this study was carried out, the suspension of services to the public in commercial establishments began on March 16 (the adoption of these measures naturally led to a reduction in road traffic similar to that of 2018), and only began to be eased at the beginning of June.

Our aim in this study is to assess air quality by adopting a multivariate statistical approach to analyze meteorological and gaseous pollutant data between May and June of each year from 2017 to 2020. The years 2018 and 2020 were marked by a reduction in vehicular traffic, owing to the truck drivers' strike and the partial lockdown imposed during the COVID-19 pandemic, respectively.


2.1 Description of the Sites

The MRPV, where this study was carried out, is an important industrial and technological zone and a key factor in the growth of the Brazilian economy (contributing approximately 5% of the Brazilian GNP in 2016). It is located on the axis between the two largest cities of Brazil, São Paulo and Rio de Janeiro, and is thus densely populated.

Hourly averages of NO, NO2 and O3, T, SR, RH, WS and WD were obtained from May 21 to June 1 in the years 2017, 2018, 2019 and 2020, making a total of 48 days with 29,184 hourly averages, for the two towns under study (SJC and GRT).

The SJC station (23k 408858 7431443 UTM) is located in an "urban district" with an extensive commercial area, although most of it is still residential. The station is located 1000 m from the Presidente Dutra Highway, where the impact of primary pollutants is low. The GRT station (23k 480385 7478395 UTM), where the impact of primary pollutants is similarly low, is located within the State University of São Paulo, whose campus covers a total area of 175 thousand m2, with 14 thousand m2 being built up and the rest a green area (Fig. 1).

Fig. 1. Location of the two studied sites GRT and SJC at MRPV.

2.2 Description of Local Meteorological Conditions

According to the Technical Bulletin of Synoptic Analysis of the National Institute for Space Research (INPE - http://tempo.cptec.inpe.br/boletimtecnico/pt), South America was under the influence of the South Pacific Subtropical High (HSPS), the South Atlantic Subtropical High (HSAS) and the Intertropical Convergence Zone (ITCZ) during the study period. The Subtropical Jet (JST) was also present in parts of the study period, with the exception of 2017, when it was not observed. The JST favored the passage of low convection cloud bands over part of Brazil, including the study region. The years 2018, 2019 and 2020 were also marked by the presence of anticyclonic circulations close to the study region, which hinder the formation of clouds and contribute to the low RH (on average 3% lower than in 2017), as can be seen in Table 1.

Table 1. Average values for the variables during 2017–2020 for SJC and GRT.

2.3 Dataset Treatment

The first step was to confirm that no rainfall was recorded in the dataset, which was also confirmed by the local synoptic situation of the periods. The daytime (approximately 6 AM to 6 PM) and nighttime (approximately 6 PM to 6 AM) data were handled in the same way.

After the collected data had been filtered and validated, descriptive statistics were computed, with a record of the minimum, maximum, mean, median and quartiles, as well as the amount of missing data.
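The descriptive summary listed above can be sketched as follows. This is only an illustration using the Python standard library, not the R workflow used in the study; the function name `describe`, the use of `None` for a missing hourly value, the inclusive quartile method and the sample values are all assumptions made for the example.

```python
import statistics

def describe(values):
    """Summarize one variable; None marks a missing hourly value (assumption)."""
    present = sorted(v for v in values if v is not None)
    # Quartiles by linear interpolation over the observed range.
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {
        "min": present[0],
        "q1": q1,
        "median": median,
        "mean": statistics.mean(present),
        "q3": q3,
        "max": present[-1],
        "missing": len(values) - len(present),
    }

# Hypothetical hourly NO concentrations, two of them missing.
summary = describe([10, None, 20, 30, 40, None, 50])
print(summary)
```

Other quartile conventions exist and give slightly different values for small samples, so the method should be stated whenever quartiles are reported.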

After this initial overview by means of descriptive statistics, the Pearson correlation coefficient (Friendly, 2002) was calculated to determine the correlation between all the variables (Asuero et al., 2006). The results are displayed in the form of correlation matrices. All the data were used in the study, without removing outliers, so that the extreme values could be determined precisely.
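As a minimal sketch of this step, the Pearson coefficient and a correlation matrix can be computed as below. The function names, the pairwise handling of missing values (`None`) and the sample values are assumptions for illustration; the study itself used R packages such as corrplot.

```python
import math

def pearson(x, y):
    """Pearson correlation using only the pairs where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical values, chosen only to show the sign of each relationship.
data = {
    "O3": [20, 35, 50, 60],
    "SR": [100, 300, 500, 700],
    "NO": [30, 22, 12, 8],
}
matrix = correlation_matrix(data)
print(round(matrix["O3"]["SR"], 3), round(matrix["O3"]["NO"], 3))
```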

The Tukey HSD test is a useful tool for comparing the observations of different datasets: it tests the significance of differences between sample means. All pairwise differences are tested while the probability of making one or more Type I errors (also called false positives) is controlled (Härdle and Simar, 2015; Montgomery and George, 2002). It was applied between the two towns for the three pollutants and showed statistically significant differences between the datasets for each of the years, meaning that the differences between the GRT and SJC data are not random and that each set of observations has characteristics of its own.
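For reference, the decision rule behind the test can be written in its textbook (Tukey-Kramer) form. The notation is introduced here for illustration and is not taken from the original article: k is the number of groups being compared, N the total number of observations, n_i the size of group i, MS_W the within-group mean square from the analysis of variance, and q the critical value of the studentized range distribution at significance level α.

```latex
\mathrm{HSD}_{ij} = q_{\alpha,\,k,\,N-k}\,
  \sqrt{\frac{MS_W}{2}\left(\frac{1}{n_i}+\frac{1}{n_j}\right)},
\qquad
\text{groups } i \text{ and } j \text{ differ significantly if }
\left|\bar{y}_i-\bar{y}_j\right| > \mathrm{HSD}_{ij}
```

Equivalently, the family-wise confidence interval for a pair is the difference of the two means plus or minus HSD_ij, and the difference is significant when this interval excludes zero.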

Principal Component Analysis (PCA) reduces the complexity of a high-dimensional dataset by projecting it geometrically onto a smaller number of dimensions, called principal components (PCs), chosen to capture as much of the variance as possible. It can also summarize the results and the relationships between the variables over the years (Lever et al., 2017). SJC and GRT can be described differently on the basis of their PCs.
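To make the idea concrete, the sketch below extracts only the first principal component of a correlation matrix by power iteration. It is a teaching example under stated assumptions, not the FactoMineR procedure used in the study: the function names, the start vector and the sample values are hypothetical, and a full PCA would compute all components with an eigendecomposition routine.

```python
import math

def standardize(column):
    """Center a column and scale it to unit sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]

def correlation_matrix(columns):
    """Correlation matrix of a list of equally long, complete columns."""
    z = [standardize(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]

def first_component(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector.

    The asymmetric start vector is an arbitrary choice; the method fails
    in the rare case where it is orthogonal to the dominant eigenvector.
    """
    size = len(matrix)
    vector = [1.0 / (i + 1) for i in range(size)]
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)]
        eigenvalue = math.sqrt(sum(p * p for p in product))
        vector = [p / eigenvalue for p in product]
    return eigenvalue, vector

# Hypothetical hourly values for O3, SR and NO.
columns = [
    [20, 35, 50, 60, 45],
    [100, 300, 500, 700, 400],
    [30, 22, 12, 8, 15],
]
eigenvalue, vector = first_component(correlation_matrix(columns))
explained_share = eigenvalue / len(columns)          # fraction of total variance
loadings = [v * math.sqrt(eigenvalue) for v in vector]
print(round(explained_share, 3), [round(l, 3) for l in loadings])
```

Because the variables are standardized, the eigenvalues sum to the number of variables, so dividing an eigenvalue by that number gives the share of variance explained by its component.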

Finally, the Boruta algorithm was used, which is a feature selection method built around the Random Forest methodology. Its objective is to reflect the importance of the variables in the dataset in terms of the value of the target variable, which in this study was O3, and thus to select the features that can be used by, for example, a prediction algorithm (Kursa and Rudnicki, 2010). Initially, the method duplicates the dataset and shuffles the values within each duplicated column, creating a shuffled ("shadow") copy of every variable. It then repeatedly trains a Random Forest on the extended dataset and compares the importance of each original variable with the highest importance achieved by the shadow copies; variables that consistently outperform the shadows are retained, which makes it possible to rank the importance of all the variables in the O3 formation.
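The shadow-feature idea can be illustrated with a simplified sketch. Note the substitution: this example does not train a Random Forest. It uses the absolute Pearson correlation with the target as a stand-in importance score, and it simply counts how often each variable beats the best shadow instead of applying Boruta's statistical test. All names and the synthetic data are hypothetical.

```python
import math
import random

def abs_correlation(x, y):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return abs(cov) / math.sqrt(vx * vy)

def shadow_selection(features, target, trials=20, seed=0):
    """Count how often each feature beats the best shuffled (shadow) copy."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(trials):
        shadow_scores = []
        for column in features.values():
            shadow = column[:]          # duplicate the column...
            rng.shuffle(shadow)         # ...and break its link with the target
            shadow_scores.append(abs_correlation(shadow, target))
        threshold = max(shadow_scores)
        for name, column in features.items():
            if abs_correlation(column, target) > threshold:
                hits[name] += 1
    return hits

# Synthetic data: the target depends on "signal" but not on "noise".
data_rng = random.Random(1)
signal = [data_rng.uniform(0, 1) for _ in range(200)]
noise = [data_rng.uniform(0, 1) for _ in range(200)]
target = [2 * s + data_rng.gauss(0, 0.1) for s in signal]

hits = shadow_selection({"signal": signal, "noise": noise}, target)
print(hits)
```

A variable that carries real information about the target beats the shadows in every trial, whereas an uninformative one does so only occasionally, by chance.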

In undertaking this study, the R language (R Core Team, 2020) was used with the following libraries: FactoMineR, Boruta, ggpubr, factoextra, corrplot, openair, ggplot2 and GGally.


Descriptive statistics for SJC and GRT, with values from the lower and upper outliers, the first and third quartiles, and the median and mean of NO, NO2 and O3, can be analyzed from different perspectives, as shown in Fig. 2. The nitrogen oxide distributions are concentrated around lower median values for the years 2018 and 2020 for GRT and SJC, respectively, which can be regarded as a chemical response to the decrease in emissions in both periods, owing to the truck drivers' strike and the partial lockdown imposed during the COVID-19 pandemic, respectively.

Fig. 2. Violin plot for O3, NO and NO2 for GRT and SJC. It is possible to verify the data distribution, with a holistic view of outliers, averages and dispersion. All SJC distributions present a greater dispersion and consequently reach higher values. This may be due to the variation in the size of the city and, consequently, greater exposure to quantitative variation in values.

The years 2017 and 2019, on the other hand, present a more unequal distribution of values, that is, a greater spread of values around the mean. In the violin plots for SJC and GRT, NO has a similar distribution for the years 2017 and 2019, and similar characteristics for the years 2018 (the truck drivers' strike) and 2020 (the partial lockdown).

NO2 values vary between GRT and SJC. In the case of GRT, the distribution is approximately similar for each of the four years studied; they all have a median close to 10 µg m–3, with well-distributed data. For SJC, the distribution profile is in general more restricted, with a median around 20 µg m–3 for all years. However, there is a clear disparity in 2020 compared with the other years, with a larger share of the distribution concentrated around the mean, which indicates a decrease in the values above it that is more pronounced than in 2018. This difference may have occurred because the truck drivers' strike caused a decrease in heavy vehicle traffic followed by a gradual decrease in light vehicle traffic, in contrast to the lockdown, which brought traffic to an abrupt halt.

The O3 values, however, had a median between 20 and 40 µg m–3 for GRT and between 25 and 50 µg m–3 for SJC. O3 also showed a more uniform distribution between the lower and upper outliers, with more accentuated bimodality for SJC, and did not show such marked differences as those mentioned above. The Tukey test is applied below to verify whether there is a statistically significant difference between these data, with particular attention to O3, for which the visual difference is not as marked as for the other compounds.

The average hourly concentrations are shown in Fig. 3. In general, the behavior of NO and NO2 is remarkable, since there are two pronounced peaks at times characterized by heavy traffic, at 7–10 h and 20–22 h. Between 11 h and 16 h there is a decrease in NOx, owing to the reduction in vehicular traffic and also because this is the time when these pollutants are consumed most in the formation of O3, which has higher concentrations between 12–16 h, with a maximum at 15 h as a result of its close relationship with solar radiation. This corroborates the behavior observed in the previously mentioned studies of several cities around the world, each with its specific variations.
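A diurnal profile of this kind is obtained by averaging all observations that share the same hour of the day. The sketch below illustrates the calculation with the Python standard library; the function name, the `(hour, value)` record layout and the sample values are assumptions, and the study itself relied on R packages such as openair.

```python
from collections import defaultdict

def diurnal_profile(records):
    """Mean concentration per hour of day; None marks a missing value."""
    by_hour = defaultdict(list)
    for hour, value in records:
        if value is not None:
            by_hour[hour].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(by_hour.items())}

# Hypothetical NO observations from different days: (hour of day, µg m-3).
records = [(8, 40.0), (8, 60.0), (15, 10.0), (15, None), (21, 30.0)]
profile = diurnal_profile(records)
peak_hour = max(profile, key=profile.get)
print(profile, peak_hour)
```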

Fig. 3. Diurnal pattern of O3, NO and NO2 for GRT and SJC. O3, NO and NO2 have different diurnal patterns that are well established in the literature. The increase in O3 concentrations and decrease in NO and NO2 can be observed for 2020 and 2018.

Therefore, in Fig. 2 and Fig. 3 for 2018 (the truck drivers' strike) and 2020 (the partial lockdown), there is a reduction in NO and NO2 concentrations, particularly for NO, a primary pollutant whose emissions are highest during peak periods of heavy traffic; the reduction is more accentuated in 2018 because of the truckers' strike. An increase in O3 concentrations is also observed, owing to the decrease in the concentrations of primary pollutants, which are the main pathway for its chemical consumption. Such a decrease in NO2 and increase in O3 was also observed in similar studies carried out during the period of COVID-19 lockdown (Dantas et al., 2021; Morales-Solís et al., 2021; Naqvi et al., 2021). In the study by Morales-Solís et al. (2021), which compared the period between March and May 2020 with the corresponding months of 2017–2019 in 16 cities in central and southern Chile, significant decreases in NO2, between –27% and –5%, were observed in the 4 cities where data for this pollutant were available, while significant increases in O3, between 1% and 4%, were found in 4 of the 5 cities. Local meteorological variables did not show significant changes between the two periods in that work. For Brazil, a decrease of around 1% in the optical density of atmospheric aerosols was also observed (Naqvi et al., 2021).

The Tukey test was conducted with a confidence level of 95%, as shown in Fig. 4, so that the concentrations of the key pollutants in the four years of study could be compared. In these graphs, the further a horizontal line (which compares a pair of years) lies from the dotted vertical line, the greater the difference between the datasets. In the case of O3, the biggest differences for the two towns can be observed between the years 2017–2018, followed by the years 2018–2019 and 2017–2020, which suggests that the events in 2018 (the truck drivers' strike) and 2020 (the partial lockdown due to COVID-19) had a statistically significant effect on the O3 concentrations. In the case of NO, the differences between the two towns were not so sharp (i.e., the differences in the coordinate axis), and only the pairs of years 2018–2019 and 2019–2020 can be considered different, each of which again includes an atypical year. In the case of NO2, the two towns also behave in a similar way, with differences between the datasets for 2017–2020, 2018–2019 and 2019–2020, once again showing the significance that can be attached to the aforementioned events. An interesting observation for all the pollutants in the two towns is that, when the event years 2018 (the truck drivers' strike) and 2020 (the partial lockdown due to COVID-19) are compared with each other, the datasets are similar, on a decreasing scale of similarity: NO, NO2 and O3.

Fig. 4. Tukey plot with a 95% family-wise confidence level for SJC and GRT. When the interval for a pair of years includes zero, the difference between them is not significant. The years 2018 and 2020 have the highest number of pairs with significant values.

By grouping the entire dataset, including pollutants and meteorological data, it was possible to create a Pearson's correlation matrix coupled with a hierarchical cluster analysis (dendrogram) based on Euclidean distances (Berthold and Höppner, 2016), as shown in Fig. 5. In the case of both towns, positive correlations can be observed between T, O3 and SR and between NO and NO2, which is also corroborated by the dendrograms to the right of the matrices. This confirms that the O3 formation results from photochemistry and that there is little possibility that this pollutant has been transported over long distances. Ozone, on the other hand, shows negative correlations with RH, NO and NO2, and this once again corroborates the fact that the formation of ozone is the result of reactions of NOx with VOCs (not measured here) in the presence of sunlight. The wind direction has little influence on the formation of pollutants, which suggests there is little likelihood of NO being transported in SJC (correlation of 0.24).

Fig. 5. Pearson correlation matrices for SJC and GRT during 2017–2020. Positive correlations were observed between O3, T and SR and between NO and NO2 and negative correlations between T and RH and between O3, NO and NO2.

We observed significant improvements in air quality, in the form of reductions in the monitored air pollutants that are highly influenced by vehicular traffic (NO and NO2). Intense reductions in the concentrations of these pollutants were found during 2018 (the truck drivers' strike) and 2020 (the partial lockdown due to COVID-19). During these periods, vehicle traffic decreased considerably in all areas analyzed, improving air quality.

Whereas the previous tests used the complete dataset, a reduced dataset was used for the Principal Component Analysis (PCA): every observation (row) with missing data in any of the variables was removed. Out of a total of 1824 observations and 11 variables from each town, 1426 and 1445 observations remained for SJC and GRT, respectively. PCA was then applied and, based on the Kaiser rule, 3, 2, 3 and 2 principal components (PCs) were retained for 2017, 2018, 2019 and 2020, respectively, for SJC; for GRT, 3 PCs were retained for all the years. Together, the retained PCs were responsible for about 70% of the accumulated variance of the original data. The main loading values of the components are shown in Tables S1 to S4, which also show the eigenvalue and the cumulative percentage of variance explained for each PC. Only loading values of ±0.500 or more in absolute value were interpreted.
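The two selection rules used here, the Kaiser rule (retain components whose eigenvalue exceeds 1) and the ±0.500 loading threshold, can be sketched as follows. The function names and the eigenvalues are hypothetical; the loadings are the PC1 values for SJC in 2017 quoted later in the text.

```python
def kaiser_rule(eigenvalues):
    """Return the number of retained PCs and their cumulative variance (%)."""
    retained = [ev for ev in eigenvalues if ev > 1.0]
    cumulative = 100.0 * sum(retained) / sum(eigenvalues)
    return len(retained), cumulative

def interpretable(loadings, threshold=0.5):
    """Keep only loadings whose absolute value reaches the threshold."""
    return {name: value for name, value in loadings.items() if abs(value) >= threshold}

# Hypothetical eigenvalues of a 4-variable correlation matrix (they sum to 4).
n_retained, cumulative_pct = kaiser_rule([2.1, 1.2, 0.5, 0.2])

# PC1 loadings for SJC in 2017, as quoted in the text.
pc1_sjc_2017 = {"O3": 0.897, "T": 0.839, "RH": -0.824,
                "SR": 0.715, "NO2": -0.452, "NO": -0.584}
kept = interpretable(pc1_sjc_2017)
print(n_retained, round(cumulative_pct, 1), sorted(kept))
```

With these loadings, NO2 (–0.452) falls below the threshold and is excluded from the interpretation of PC1, whereas NO (–0.584) is kept.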

The years 2018 and 2020 are characterized by a decrease in NO concentrations of 70% in SJC and 58% in GRT, caused by the truckers' strike in 2018 and the partial lockdown imposed by the coronavirus pandemic in 2020, when compared with the average for the years 2017 and 2019, when there was neither a strike nor a pandemic. There was also a decrease in NO2 of 20% in SJC and 22% in GRT and an increase in O3 of 24% in SJC and 2% in GRT; the smaller change in NO2 reflects the fact that NO is a primary pollutant, whereas NO2 can be either primary or secondary, but is mainly secondary. In a study by Morales-Solís et al. (2021), a 55% increase in NOx was evidenced during the COVID-19 pandemic lockdown period for the central and southern urban regions of Chile, as well as an increase of between 18 and 43% in O3 levels. Overall, the impacts of the strike on NO2 are more complex than on primary pollutants (CO and NO), as demonstrated by recent studies that investigated changes in pollutant emissions during the shutdowns caused by the COVID-19 pandemic (Kanniah et al., 2020; Muhammad et al., 2020; Nakada and Urban, 2020). Further studies must be carried out to understand this pollutant, by taking account of the meteorology and atmospheric chemistry.
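The percentage changes quoted above compare the event years with the average of the reference years. A minimal sketch of that arithmetic is given below, assuming that the yearly means are simply averaged within each group; the function name and the concentrations are hypothetical and are not the study's data.

```python
def percent_change(baseline_means, event_means):
    """Percentage change of the event-year average relative to the baseline average."""
    baseline = sum(baseline_means) / len(baseline_means)
    event = sum(event_means) / len(event_means)
    return 100.0 * (event - baseline) / baseline

# Hypothetical mean NO concentrations (µg m-3):
# baseline years (2017, 2019) versus event years (2018, 2020).
change = percent_change(baseline_means=[20.0, 20.0], event_means=[6.0, 6.0])
print(change)
```

A negative result indicates a reduction relative to the reference years, and a positive result an increase.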

Increases in O3 were also found in a study conducted in Rio de Janeiro during the 2018 truckers' strike (Dantas et al., 2019). Similar situations were also encountered during the COVID-19 lockdown periods in Brazil, Spain and India, which altered vehicle emissions in a comparable way (Mahato et al., 2020; Nakada and Urban, 2020; Siciliano et al., 2020; Tobías et al., 2020). The processes for the formation and consumption of O3 are highly non-linear, since its concentrations depend on the availability of sunlight, the NOx/VOCs ratio and the speciation of VOCs (mixture of reactivity). Although we do not have access to VOCs data, NOx emissions clearly decreased, which certainly changed the NOx/VOCs ratio. Since MRPV is a NOx-saturated environment, a reduction in the O3 concentration will depend on a decrease in the VOCs concentration. In such an environment, a reduction in NOx concentrations instead leads to an increase in O3 concentrations, as was seen for the town of SJC during the strike and pandemic periods, when there was a decrease in NOx and an increase in O3, and as was also found by Morales-Solís et al. (2021) for the lockdown period in Chile, for example. Thus, the decrease in NO, which reacts rapidly with O3, together with the increased availability of sunlight, may have played a decisive role in the increase in O3 observed in the events (Alvim et al., 2018). This mechanism was also used to explain the high O3 levels during the weekends in Rio de Janeiro (Geraldino et al., 2020).
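The reasoning above rests on the standard textbook NOx-O3 cycle (see, for example, Seinfeld and Pandis, 2016), which is reproduced here for reference and does not appear in the original article. M denotes a third body (N2 or O2) that carries away excess energy.

```latex
\begin{aligned}
\mathrm{NO_2} + h\nu &\rightarrow \mathrm{NO} + \mathrm{O(^3P)} \\
\mathrm{O(^3P)} + \mathrm{O_2} + \mathrm{M} &\rightarrow \mathrm{O_3} + \mathrm{M} \\
\mathrm{NO} + \mathrm{O_3} &\rightarrow \mathrm{NO_2} + \mathrm{O_2}
\end{aligned}
```

The third reaction is the titration of O3 by NO: when NO emissions fall, less O3 is removed by this pathway, which is consistent with the higher O3 concentrations observed during the strike and the lockdown.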

In general, based on the PC analyses between the years and between both cities, it is possible to notice that all the first components show a strong positive influence of O3 concentrations, temperature (T), wind speed (WS) and solar radiation (SR), and a negative influence of the concentrations of NO2 and NO and of relative humidity (RH) (Fig. 6). In a previous study by Coccia (2021), higher WS was associated with greater dispersion and consequently a lower rate of COVID-19 infections. That same study also demonstrated how the increase in the number of individuals infected by COVID-19 is directly related to high rates of pollution, which interact with viral agents.

Fig. 6. Main PCA dimensions for (a) 2017, (b) 2018, (c) 2019, and (d) 2020, for SJC and GRT. Positive and negative influences can be observed in the first three dimensions of the first PC. Both cities show similar behavior for all years.

The second component is marked by the positive influence of the concentrations of NO and NO2 and the negative influence of pressure. The exceptions are the years 2017 and 2020 for GRT only, which presented a positive/negative variation in WS/RH for 2017 and in RH/WS for 2020. The third component is mainly marked by the influence of the wind direction (WD).

The first PC (PC1) for SJC (Fig. 6(a1) and Table S1) for 2017 explains 58.2% of the variance of the original data and mainly corresponds with O3 (0.897), T (0.839), RH (–0.824) and SR (0.715), as these variables have the highest loading values. The positive sign of the loading values suggests a similarity in behavior, with an inverse pattern with regard to humidity. As the main pathway for O3 production is NO2 photolysis, PC1 can be taken as a measure of the air processed during the middle of the day and in the afternoon. This association with the hours of highest O3 concentrations is supported by the negative loading values of NO2 and NO (–0.452 and –0.584, respectively). From 12:00 to 16:00, the local temperatures increase (large positive loading), which favors the formation of O3 soon after the emission of primary pollutants (VOCs, NOx) when sunlight is available; at this time there is also a reduction in NOx concentrations owing to the formation of O3.

The loading values for PC2 suggest there is a relationship between NO (0.583) and NO2 (0.652), which describes their behavior, while the loading of O3 (0.067) tends to zero and shows no influence. PC2 can thus be regarded as a description of the nighttime behavior of the atmosphere. This is also the time when O3 tends to have the lowest values because of the absence of sunlight, which prevents the photolysis process from taking place, and when the atmospheric boundary layer is shallower and more stable, with a greater pressure owing to lower temperatures. At night, O3 is no longer formed as there is a lack of sunlight; this is the time when O3 is consumed by NO2 and forms NO3, which will form N2O5 (Abdul-Wahab et al., 2005). The other components can be analyzed in a similar way and a more detailed description can be found in the Supplementary Material.

When all the years are considered, Fig. 7 shows the individual contribution made by the main variables to the two first dimensions. In the case of SJC, O3, RH, T and P make the main contribution, while in the case of GRT it is made by O3, RH, T, P and SR.

Fig. 7. The main contributions made by the top 10 variables for dimensions 1 and 2 over all the years.

Given the importance of O3, as discussed earlier, the classification provided by the Boruta algorithm for the other variables, with O3 as the target variable, is shown in Fig. 8. SJC and GRT show significant differences, but NO2 is important for both. In the case of SJC, NO2 and SR have a notable level of importance of around 50; for GRT, the importance of RH is approximately 60, whereas that of NO2 is equivalent to its value in SJC. This variation may be due to the different geographical locations of the two towns, and the wind direction factor (not shown) may explain why there is a considerable difference.

Fig. 8. Importance of all the variables for the O3 formation estimated by the Boruta algorithm, for the two cities. The existence of the singularity between sites is important for computational modeling to present values closer to reality.


This study allowed us to compare the behavior of the pollutants NO, NO2 and O3 during periods with and without a strike and with and without a pandemic. It was also possible to correlate the variables measured by two automatic air quality monitoring stations in the towns of SJC and GRT, which belong to the MRPV, and thus to assist in understanding the interrelationship of the variables, in a synergy between the chemistry of the atmosphere and statistical tools.

In 2018 (the truck drivers' strike) and 2020 (the partial COVID-19 lockdown), NO concentrations fell by 70% in SJC and 58% in GRT. Over the same two periods, NO2 fell by 20% in SJC and 22% in GRT, while O3 rose by 24% in SJC and 2% in GRT, when these periods are compared with the years 2017 and 2019. The meteorological variables varied little over the study years, so they are unlikely to have affected the concentrations studied.
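Percentages of this kind follow from comparing the mean concentration in an event period with the mean in the reference years. The sketch below shows that arithmetic under stated assumptions: the function name `percent_change` and all concentration values are hypothetical, and the authors' exact averaging procedure is not reproduced here.

```python
from statistics import mean


def percent_change(event_values, reference_values):
    """Percent change of the event-period mean relative to the reference-period mean."""
    reference_mean = mean(reference_values)
    if reference_mean == 0:
        raise ValueError("reference mean is zero; percent change is undefined")
    return 100.0 * (mean(event_values) - reference_mean) / reference_mean


# Hypothetical NO concentrations (ug/m3), NOT values from the study.
no_reference = [40.0, 50.0, 60.0]  # e.g. the same 21-day window in 2017 and 2019
no_event = [16.0, 20.0, 24.0]      # e.g. the same window in 2018 or 2020

change = percent_change(no_event, no_reference)
print(f"NO change: {change:.0f}%")
```

A negative result indicates a reduction relative to the reference years and a positive result an increase, which is how a fall in NO and a rise in O3 can be reported on the same scale.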

The PCA provides an effective reduction in the dimensionality of the data and gives a reliable picture of the relationships between the variables. For both towns over the 4 years under study, PC1 explains around 42% of the variance in the original data. PC1 describes daytime air masses, which are influenced by O3 concentrations, T, RH and SR in all cases, whereas PC2 is influenced by the concentrations of NO and NO2, without interference from the O3 loading values. PC2 also shows signs of the influence of P and residual relations with WD, especially when describing the behavior of NO and NO2. This is because the O3 loading tends towards zero in PC2. PC2 can thus be read as a description of nocturnal behavior, when O3 values are lower because of the lack of photolysis and P values are higher (i.e., a lower absolute value). PC3 in general only showed a positive correlation with P or a negative correlation with WS. Judging by the remaining loading values, it did not show a close relationship with the other variables.
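To make the quantities in this paragraph concrete, the sketch below computes the share of variance explained by PC1 and the loadings of each variable from a correlation matrix, using plain power iteration. It is an illustration under assumptions, not the authors' workflow: the data are synthetic, the helper names are hypothetical, and the per-variable contribution is taken as the squared entry of the unit eigenvector times 100, which is one common definition.

```python
import random
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Correlation matrix of a list of equally long numeric columns."""
    n = len(columns[0])
    z = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        z.append([(v - m) / s for v in col])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_component(matrix, iterations=500):
    """Leading eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    p = len(matrix)
    v = [1.0 / p ** 0.5] * p
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(p)) for row in matrix]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    mv = [sum(row[j] * v[j] for j in range(p)) for row in matrix]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v


def noisy_copy(base, sign, rng):
    """Signed copy of a base signal with added Gaussian noise."""
    return [sign * b + 0.3 * rng.gauss(0, 1) for b in base]


# Synthetic data (hypothetical, NOT the study's measurements): O3, T and SR rise
# with a shared daytime signal, RH falls with it, and NO is independent noise.
rng = random.Random(7)
n = 300
daylight = [rng.gauss(0, 1) for _ in range(n)]
names = ["O3", "T", "SR", "RH", "NO"]
columns = [
    noisy_copy(daylight, 1, rng),
    noisy_copy(daylight, 1, rng),
    noisy_copy(daylight, 1, rng),
    noisy_copy(daylight, -1, rng),
    [rng.gauss(0, 1) for _ in range(n)],
]

corr = correlation_matrix(columns)
eigenvalue, loadings = first_component(corr)

# The trace of a correlation matrix equals the number of variables,
# so eigenvalue / p is the share of total variance explained by PC1.
explained_share = eigenvalue / len(names)
contributions = {name: 100.0 * l ** 2 for name, l in zip(names, loadings)}

print(f"PC1 explains {100 * explained_share:.0f}% of the variance")
for name in names:
    print(f"{name}: contribution {contributions[name]:.1f}%")
```

In this synthetic setup the variables tied to the daytime signal dominate PC1 and the independent variable contributes almost nothing, which mirrors the way a near-zero loading is interpreted in the text.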

Some variations can be observed, but they are linked to specific processes that changed the values, such as meteorological events. In general, however, the loading values for the towns of SJC and GRT were close, followed consistent patterns, and clustered around a central average value. The relationships found between pollutants and meteorology are useful for understanding how pollution evolves over time and make it possible to use meteorological forecasts to anticipate the variability of air pollution.


The authors express their gratitude to Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) for providing the post-doc fellowship for Débora Souza Alvim under contract number 88887.371883/2019-00 and CAPES for partially funding this work through the project CAPES/Modelagem Grant number 88881.148662/2017-01. This work has been supported by the following Brazilian research agencies: FAPERJ, CAPES, CNPq. Thanks to CETESB for the availability of the observed data. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.


