Assessment of PM2.5 Patterns in Malaysia Using the Clustering Method

Particulate matter is the parameter of most concern in air quality monitoring in Malaysia. This study discusses the variations and clustering of PM2.5 recorded from 2018 to 2019 at 65 stations of the Continuous Air Quality Monitoring Network of the Malaysian Department of Environment. PM2.5 concentrations were recorded continuously using a tapered element oscillating microbalance. The cluster analysis was conducted using the Agglomerative Hierarchical Cluster (AHC) method. The results show that the daily average of PM2.5 concentrations ranged between 8 and 31 μg m–3. The cluster regions were classified into High Pollution Regions (HPR), Medium Pollution Regions (MPR) and Low Pollution Regions (LPR) based on the AHC analysis. The mean concentration of PM2.5 recorded in HPR was significantly higher with 23.04 μg m–3 followed by MPR and LPR. The results also showed that the highest concentration of PM2.5 was recorded during the 2019 haze episode for all three regions, with the air pollutant index indicating very unhealthy and dangerous


INTRODUCTION
Air pollution has been found to kill more people worldwide than diseases such as breast cancer, malaria or tuberculosis (WHO, 2014). As described in Beelen et al. (2013), airborne particulate matter (PM) is especially detrimental to health and has previously been estimated to cause between three and seven million deaths every year, primarily by creating or worsening cardio-respiratory disease (Hoek et al., 2013). The two main categories of particulate are fine and coarse. Coarse particulate has an aerodynamic diameter below 10 µm (PM10) and fine particulate has an aerodynamic diameter below 2.5 µm (PM2.5) (Shaylinda et al., 2008; Wang and Ogawa, 2015). Most studies focus on PM2.5 due to its effects on the environment, such as visibility and climate, and its ability to pass through the lungs and affect human health (Franceschi et al., 2018).
Rapid development and urbanization have affected air quality and have led to an interest in studying the causes and effects of PM2.5. Sinkemani et al. (2018) and Khalili et al. (2018) indicated that PM2.5 derives from fuel burning, vehicular exhaust, and some industrial activities, while Khan et al. (2016) found that motor vehicle emissions, secondary inorganic aerosol, and coal-fired power plants are the predominant sources of PM2.5. PM10 and PM2.5 also originate from industrial and intensive commercial activities (Fava and Letizia Ruello, 2008). Li et al. (2020, 2021) suggested that airborne dust events contributed to high PM2.5 concentrations in Middle Eastern countries such as Kuwait, Iraq, Iran and Saudi Arabia. High concentrations of PM2.5 in Malaysia have been linked to Southeast Asian haze incidents, which are the consequence of the uncontrolled burning of forests in Indonesia (Rahman et al., 2015). These trends show that PM2.5 is a critical issue requiring immediate development of regulation and policies to address the problem of PM2.5 globally.
As a developing country, Malaysia needs a good air quality monitoring system that incorporates the measurement of PM2.5 for the benefit of its citizens. A recent improvement to the Malaysian Department of Environment (DOE) air quality monitoring network has been to incorporate continuous measurement of PM2.5 in the national environmental monitoring program. Standards and guidelines for PM2.5 were introduced in the middle of 2017. By monitoring PM2.5, the actual situation concerning high particulate matter concentration due to combustion, such as from biomass burning and vehicle emissions, can be better represented than with PM10 alone. Hence, the calculation of the Air Pollutant Index (API) in Malaysia has been improved with the addition of PM2.5 to the group of sub-index parameters. The API value is determined based on the sub-indices of six air pollutants: ozone (O3), carbon monoxide (CO), nitrogen dioxide (NO2), sulfur dioxide (SO2), and particulate matter sized under 10 µm (PM10) and 2.5 µm (PM2.5) (DOE, 2019). The higher the API number, the more polluted the air is and the greater the health risk. The DOE has been monitoring and reporting on air quality in Malaysia since 2014 under the Malaysia Ambient Air Quality Standard (MAAQS).
Clustering is an exploratory data analysis technique used for investigating the underlying structure in data. Since the 1980s, two well-known and widely used approaches, k-means and hierarchical agglomerative clustering, have been applied in air pollution research and have received a lot of attention (Govender and Sivakumar, 2020). Previous reviews on clustering applications, such as those by Gong and Richman (1995) and Jolliffe and Philipp (2010), have primarily concentrated on climate and precipitation, with a minor focus on air pollution. Given the dangers of PM exposure to humans, a better knowledge of the temporal and spatial behavior and dynamics of air pollutants is essential. Agglomerative Hierarchical Cluster (AHC) analysis is a technique for grouping objects into clusters in which the objects (monitoring stations) inside a cluster are similar to each other while objects in other clusters are dissimilar (Pires et al., 2008a, b). Therefore, this study aims to illustrate the overall trend of PM2.5 from 2018 to 2019 at the 65 monitoring stations in Malaysia based on the spatial classification of cluster analysis. The annual Air Pollutant Index (API) and the monthly and annual PM2.5 variations have also been explored in this study. The relationship between PM2.5 and other air pollutants and meteorological factors has been investigated according to the clustered groups.

Study Area
Malaysia is situated at approximately 2°30′N, 112°30′E, within 150 km of the Equator, in central Southeast Asia on the South China Sea. The country comprises Peninsular Malaysia (West Malaysia) and the states of Sabah and Sarawak (East Malaysia) on the island of Borneo. The area of Malaysia is approximately 330,000 square kilometers, the majority of which is on the island of Borneo, with Peninsular Malaysia accounting for only roughly 40% of the total. Tropical forests cover around half of Malaysia, with the majority in Sabah and Sarawak. The Klang Valley, located in the middle of Peninsular Malaysia's west coast, is the most developed and fastest growing region (DOS, 2017a, b). Malaysia experiences a moderately uniform annual temperature which ranges from 26°C to 28°C. Malaysia has two monsoon seasons, the northeast monsoon and the southwest monsoon, which occur from November to March and from May to September, respectively. Occasionally, the northeast monsoon brings heavy rain, while the southwest monsoon causes a lack of rain cloud formation, resulting in less rainfall during the period. Despite the monsoons, Malaysia is safe from natural disasters such as volcanic eruptions and typhoons.

PM and Other Air Pollutant Measurements
Malaysia's air quality is continuously monitored at 65 stations located throughout the country (Fig. 1) to detect any significant change in air quality that could harm human health and the environment. These monitoring stations are located in industrial (I), urban (U), suburban (SU) and rural (R) areas based on Malaysian Department of Environment classifications to represent different backgrounds. One monitoring station has been classified as a background station (B), based on the surrounding area and location in relation to potential major air pollutant sources. The classifications are listed in Table S1. Data from air monitoring stations are transmitted to the DOE Environmental Data Centre (EDC) in Putrajaya and go through quality assurance and quality control (QA/QC) procedures. In addition to PM10 and PM2.5, other variables used in this study are SO2, NO2, O3 and CO. Along with air quality data, these stations also record the meteorological parameters of wind speed, wind direction, humidity, temperature and solar radiation. The data used in this study are from January 1, 2018, to December 31, 2019. The data were recorded hourly at each station and went through QA/QC procedures to ensure their validity. The levels of air pollutants were monitored hourly using the following specific and calibrated equipment: a Thermo Scientific tapered element oscillating microbalance (TEOM) 1405-DF (USA) was used to measure PM10 and PM2.5; CO and O3 were measured with a Thermo Scientific Model 48i (USA) CO analyzer and a Thermo Scientific Model 49i (USA) O3 analyzer; SO2 was measured with a Thermo Scientific Model 43i (USA) SO2 analyzer; and NO2 was measured with a Thermo Scientific Model 42i (USA) NO2 analyzer. A Climatronics AIO 2 Weather Sensor (Climatronics Corporation, USA) was used to measure the relative humidity and temperature.

Air Pollutant Index (API)
The Malaysian Air Pollutant Index (API) is primarily based on the U.S. EPA Ambient Air Quality Index which takes into account six criteria pollutants. These six parameters are particulate matter (PM10 and PM2.5), ozone (O3), sulfur dioxide (SO2), nitrogen dioxide (NO2) and carbon monoxide (CO). For each pollutant, a sub-index is calculated hourly and the pollutant with the highest sub-index value is selected as the representative API for the hour. The data for API in this study were calculated and recorded by the Malaysian DOE. The API informs the public about air pollution levels, recommends actions and gives health advice. The index is numbered 0-500 and is divided into five breakpoints or bands providing air pollution levels in a simple way such as: 0-50 (good), 51-100 (moderate), 101-200 (unhealthy), 201-300 (very unhealthy), and > 300 (hazardous). The calculation of the API in different categories is presented in Table S2.
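The selection rule and the banding described above can be illustrated with a minimal Python sketch. The band limits come from the text; the function names and the example sub-index values are hypothetical, and the sub-index formulas themselves (Table S2) are not reproduced here.

```python
# Upper limits of the API bands listed in the text; values above 300 are hazardous.
API_BANDS = [
    (50, "good"),
    (100, "moderate"),
    (200, "unhealthy"),
    (300, "very unhealthy"),
]


def api_category(api):
    """Return the band name for an API value."""
    if api < 0:
        raise ValueError("API cannot be negative")
    for upper, name in API_BANDS:
        if api <= upper:
            return name
    return "hazardous"


def hourly_api(sub_indices):
    """Select the pollutant with the highest sub-index as the hourly API."""
    pollutant = max(sub_indices, key=sub_indices.get)
    return pollutant, sub_indices[pollutant]


# Hypothetical sub-index values for one hour at one station.
example = {"PM2.5": 112, "PM10": 74, "O3": 41, "CO": 12, "SO2": 5, "NO2": 18}
dominant, api = hourly_api(example)
print(dominant, api, api_category(api))
```

In this illustration PM2.5 has the highest sub-index, so it sets the hourly API, which falls in the unhealthy band.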

Quality Control and Quality Assurance
All air quality data from the Continuous Air Quality Monitoring Network (CAQM) go through QA/QC procedures before submission to the DOE. Gas detection devices are examined manually every two weeks, while PM10 and PM2.5 instruments are calibrated once a month in accordance with standard operating protocols. Insufficient data leads to data deletion (which results in negative values), and outliers trigger second-level QC tests. Some of the observations were confirmed using observable sources, while others were ruled out due to instrument malfunction.

Statistical and Multivariate Analysis
The air pollutants analyzed in this study are those listed in the MAAQS IT-2 as shown in Table S3. The daily averages were calculated from hourly PM2.5 data, giving a total of 47,450 daily values (730 days per station × 65 stations). The Pearson correlation test was used to determine the linear relationship between PM2.5 and other pollutants and meteorological factors. If the two variables have a perfect linear relationship with a positive slope, r = 1. If the two variables have a perfect linear relationship with a negative slope, r = −1. A correlation coefficient of 0 indicates that the variables have no linear relationship (Carslaw and Ropkins, 2012).
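As an illustration of the two calculations described above, the following Python sketch averages hourly values into daily values and computes the Pearson coefficient. It is not the XLSTAT procedure used in the study; the function names and the handling of missing hours (marked as None and skipped) are assumptions made for the example.

```python
from math import sqrt
from statistics import mean


def daily_means(hourly, hours_per_day=24):
    """Average consecutive blocks of hourly values into daily values.

    None marks a missing hour and is skipped; a day with no valid
    hours yields None.
    """
    days = []
    for start in range(0, len(hourly), hours_per_day):
        block = [v for v in hourly[start:start + hours_per_day] if v is not None]
        days.append(mean(block) if block else None)
    return days


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # undefined if either series is constant


# Hypothetical daily PM2.5 and PM10 values (µg m-3) for five days.
pm25 = [12.0, 18.5, 25.1, 40.2, 33.0]
pm10 = [20.3, 29.9, 41.0, 66.5, 52.8]
print(round(pearson_r(pm25, pm10), 3))
```

A perfectly linear pair such as [1, 2, 3] and [2, 4, 6] gives r = 1, and reversing one series gives r = −1, matching the interpretation given in the text.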
Agglomerative Hierarchical Cluster is one of the multivariate analyses used in this study. The AHC process begins with single-observation clusters and gradually joins pairs of clusters, resulting in fewer clusters with more observations in each (Miligan, 1980; Myatt, 2009). Ward's approach was employed in this study and is one of several well-established AHC procedures (Cunningham and Ogilvie, 1972; Johnson and Wichern, 2002). The classification of the objects can be depicted in a dendrogram, which displays the degree of similarity between them, as measured by Euclidean distances using Ward's approach (Juahir et al., 2011). The quotient of the linkage distance divided by the maximal distance is represented as (Dlink/Dmax) × 100. Euclidean distance is based on a single linkage (also known as a nearest neighbor). The quotient is commonly multiplied by 100 to normalize the linkage distance indicated by the y-axis (Singh et al., 2004; Shrestha and Kazama, 2007). Euclidean distance can be defined by Eq. (1) (Sharma, 1996):

$$D_{ij}^{2} = \sum_{k=1}^{p} \left( x_{ik} - x_{jk} \right)^{2} \qquad (1)$$

where $D_{ij}^{2}$ is the squared distance between subjects i and j, $x_{ik}$ is the value of the kth variable for the ith subject, $x_{jk}$ is the value of the kth variable for the jth subject and p is the number of variables.
The daily average PM2.5 data from the 65 monitoring stations were analyzed using AHC analysis based on the characteristics of PM2.5 throughout Malaysia. Further analysis and discussion will focus on the cluster formation from AHC analysis. The AHC analysis, Pearson correlation and the other statistical analysis were performed using XLSTAT 2014 add-in software developed by Addinsoft.
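The agglomeration step can be sketched in plain Python as follows. This is a naive illustration of Ward's minimum-variance criterion on squared Euclidean distances, not the XLSTAT implementation used in the study. The station labels and values are hypothetical, and complete, equal-length daily series are assumed.

```python
def ward_clusters(series, n_clusters):
    """Naive Ward agglomeration.

    series maps a station label to its list of daily PM2.5 values.
    Returns the member lists of the remaining clusters.
    """
    # Every station starts as its own cluster.
    clusters = [
        {"members": [name], "centroid": list(values), "size": 1}
        for name, values in series.items()
    ]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b are merged:
        # (n_a * n_b / (n_a + n_b)) * squared Euclidean distance of centroids.
        d2 = sum((p - q) ** 2 for p, q in zip(a["centroid"], b["centroid"]))
        return a["size"] * b["size"] / (a["size"] + b["size"]) * d2

    while len(clusters) > n_clusters:
        # Find the pair whose merger increases the variance least.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        a, b = clusters[i], clusters[j]
        n = a["size"] + b["size"]
        merged = {
            "members": a["members"] + b["members"],
            "centroid": [(a["size"] * p + b["size"] * q) / n
                         for p, q in zip(a["centroid"], b["centroid"])],
            "size": n,
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c["members"]) for c in clusters]


# Hypothetical three-day series for six stations (µg m-3).
toy = {
    "A": [30, 32, 31], "B": [29, 33, 30],
    "C": [15, 16, 14], "D": [16, 15, 15],
    "E": [8, 9, 7], "F": [9, 8, 8],
}
print(sorted(ward_clusters(toy, 3)))
```

With these toy values the stations fall into three pairs with high, medium and low concentrations. The search over all pairs at every step is slow for large inputs but adequate for 65 stations; a library routine would normally be preferred in practice.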

Overall Variation Pattern of PM2.5 Concentration
Descriptive statistics of the daily average of PM2.5 for the 65 monitoring stations from 2018 to 2019 are presented in Table S4. The minimum, maximum, first quartile, median, third quartile, and mean values between the whiskers were represented using a boxplot (Fig. 2). Overall, the trend shows that PM2.5 concentrations were generally below the standard at all stations, although some stations occasionally exceeded the MAAQS IT-2 limit for daily PM2.5, which is 50 µg m–3. The annual mean PM2.5 concentration recorded at all stations was between 8 and 31 µg m–3. The lowest 24-h averaged PM2.5 concentration recorded was 2 µg m–3 in Jerantut (CA39C), while the highest concentrations were recorded in Sri Aman (CA63Q) and ILP Miri (CA55Q), with 382.64 µg m–3 and 345.05 µg m–3 respectively. Most of the stations recorded a mean value of PM2.5 higher than the median. This shows that episodes of high PM2.5 concentration pull the distribution of the particulate matter data away from a normal distribution. Malaysia was hit by a severe haze event caused by local and transboundary haze from neighboring countries in mid-September 2019, when the maximum concentration of PM2.5 was recorded. Open burning locally and transboundary haze from Sumatra and Kalimantan, Indonesia are among the reasons for the high concentration of PM2.5 (Latif et al., 2018). This is in line with Heil and Goldammer (2001), who found that an increase in PM2.5 caused the majority of the particle loading in the atmosphere during the smoke haze episode. This demonstrates that higher PM2.5 affected suburban and rural areas, due to open burning during the hot season (DOE, 2019).
Compared with other ASEAN countries, the daily concentration of PM2.5 is relatively low in Malaysia. Fold et al. (2020)

PM2.5 and Air Pollutant Index (API) Based on Clustering
In this study, the AHC method was used to cluster daily average concentrations of PM2.5 parameter pollutants collected from the 65 monitoring stations with different backgrounds from 2018 to 2019. Three significant clusters that share the characteristic of homogeneity are shown in the dendrogram in Fig. S5. The AHC results shown in the dendrogram reveal the dissimilarity between the clusters involved. The three clusters termed High Pollution Regions (HPR), Medium Pollution Regions (MPR) and Low Pollution Regions (LPR) are shown in Table S6.
The classification of stations using AHC (HPR, MPR and LPR) based on PM2.5 concentrations in Malaysia is shown in Fig. 3. One station located in Sarawak, and some stations located in the eastern, southern and northern regions of Peninsular Malaysia, were classified as MPR. All the LPR stations are located in Sarawak. Fig. S7 shows the annual average concentration of PM2.5 from 2018 to 2019 for HPR, MPR and LPR. The trend was higher in 2019 than in 2018 due to annual haze events from Sumatra and Kalimantan, Indonesia, as well as open burning from bush fires and agricultural clearance in Malaysia. The events affected most rural areas in LPR, such as Sri Aman and ILP Miri in Sarawak, leading to the annual average concentration of PM2.5 in LPR being greater than in MPR in 2019.
The overall trend in the annual average PM2.5 concentration in ambient air for all regions in 2018 and 2019 was within the limit of the MAAQS, which is 25 µg m–3, except for HPR, which exceeded the MAAQS in 2019 with 26.158 µg m–3. In general, HPR recorded the highest annual average PM2.5 concentrations compared with MPR and LPR. The boxplots in Fig. 4 show the daily PM2.5 average from 2018 to 2019 for the HPR, MPR and LPR clusters.
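A comparison of this kind can be expressed as a short Python sketch. The two limits are those quoted in the text (25 µg m–3 for the annual mean and 50 µg m–3 for the daily MAAQS IT-2 value); the function name and the sample values are hypothetical.

```python
MAAQS_ANNUAL_LIMIT = 25.0  # µg m-3, annual limit quoted in the text
MAAQS_DAILY_LIMIT = 50.0   # µg m-3, daily IT-2 limit quoted in the text


def exceedance_summary(daily_values):
    """Summarize one year of daily PM2.5 averages against both limits."""
    annual_mean = sum(daily_values) / len(daily_values)
    days_over = sum(1 for v in daily_values if v > MAAQS_DAILY_LIMIT)
    return {
        "annual_mean": annual_mean,
        "annual_exceeded": annual_mean > MAAQS_ANNUAL_LIMIT,
        "days_over_daily_limit": days_over,
    }


# Hypothetical daily averages (µg m-3) standing in for a full year.
sample = [18.0, 22.5, 27.0, 61.0, 19.5]
print(exceedance_summary(sample))
```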
The mean value of PM2.5 was 23.04 µg m–3 in HPR, and 16.41 µg m–3 and 16.18 µg m–3 in MPR and LPR respectively. In each of the regions, the mean PM2.5 concentration was higher than the median. Table S8 shows that the distribution of PM2.5 was skewed to the right in all regions. Positive skew suggests the presence of significant pollution levels (Abd Wahab et al., 2016; Sansuddin et al., 2011). In this study, LPR recorded the highest skewness value; therefore, LPR was the region most affected by extreme PM2.5 concentrations in the atmosphere during the study period.
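The link between a mean above the median and positive skew can be illustrated with a small Python sketch. It uses the population moment form of skewness, which may differ slightly from the sample-adjusted estimator reported by statistical packages; the daily values are hypothetical, with two haze days added to an otherwise low series.

```python
from statistics import mean, median


def skewness(values):
    """Population (moment) skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mu = mean(values)
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical daily PM2.5 averages (µg m-3); the last two are haze days.
daily = [12, 14, 15, 13, 16, 14, 15, 13, 95, 140]
print(mean(daily), median(daily), round(skewness(daily), 2))
```

The two extreme days raise the mean well above the median and give a positive skewness, which is the pattern described for all three regions.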
Air quality status for each region is shown in the API. Figs. 5 and 6 show the frequency of good, moderate, unhealthy, very unhealthy and hazardous air quality status in all three regions. It shows that moderate API levels have the highest incidence followed by a good API level from 2018 to 2019 in HPR and MPR. Meanwhile, good and moderate API levels in LPR did not show significant changes in 2018 and 2019.
Unhealthy API levels were recorded in 2018 and 2019 in all three regions. In 2019, unhealthy and very unhealthy API levels became more frequent than in 2018. HPR, MPR and LPR recorded 222, 342 and 111 occurrences of unhealthy levels and 10, 23 and 40 occurrences of very unhealthy levels, respectively. Hazardous API levels were also observed in 2019 in LPR with 13 occurrences. The high number of occurrences of unhealthy, very unhealthy and hazardous API levels in 2019 was due to the prolonged and massive haze from Kalimantan and Sumatra, Indonesia in that year.

Comparison of PM2.5 in HPR, MPR and LPR
The number of stations with different classifications based on HPR, MPR and LPR is shown in Table 1. Each cluster contains stations located in industrial (I), urban (U), suburban (SU) and rural (R) areas, while the background (B) station falls in MPR. The trend of PM2.5 based on U, SU, I, R and B areas is shown in Fig. 7 for HPR, MPR and LPR.
In HPR, there are 12 stations located in suburban areas, followed by urban areas (four stations) and rural areas (three stations), and none in industrial areas. The highest number of stations is found in MPR, with 20 stations located in suburban areas, six in urban areas, six in industrial areas and four in rural areas. In contrast, in LPR there are five stations located in rural areas, two in suburban areas and one each in industrial and urban areas. The highest annual average of PM2.5 in HPR was observed in urban and suburban areas in 2019 with 26.3 µg m–3 and 26.7 µg m–3 respectively, which exceeded the MAAQS. For industrial areas in MPR and LPR, the highest reading was recorded in 2019 and the same trend applies to the other areas in MPR and LPR, but the readings fall within the MAAQS limit.
Several stations located mostly in the Klang Valley, namely Batu Muda, Cheras, Klang, Shah Alam and Petaling Jaya, were classified as HPR. The rapid transformation of the Klang Valley into an expansive urban region during the 1990s has resulted in air pollution from motor vehicles, industrial activities and the urbanization process. Particulate matter in the ambient air is one of the main contaminants in urbanized environments (Mahapatra et al., 2018). Furthermore, in most developing countries, motor vehicles are the primary mobile source of air pollution in metropolitan areas (Azmi et al., 2010; Zakaria et al., 2010; Ishii et al., 2007). Leh et al. (2012) stated that air pollutants are produced at a higher rate in urban areas than in less developed areas and the natural environment. Therefore, the location of many stations in urban areas is a factor in their classification as HPR. Forest fires in Sumatra, meanwhile, are one of the events that promote long-range transport of haze to Malaysia during the northeast monsoon season, affecting most of the suburban stations classified as HPR, which are located along the west coast of Peninsular Malaysia in the states of Selangor, Negeri Sembilan, Melaka and Johor. The three rural stations classified as HPR recorded high concentrations of PM2.5 due to local bush fires and open burning activities in peatland areas. A fire in peat soil can burn for a long time, giving it enough time to spread deep underground (Zaccone et al., 2014). These prolonged peat fires can cause hazy conditions with high PM2.5 concentrations.
Most of the stations located in the northern, southern, and eastern areas of Peninsular Malaysia and Sabah were classified as MPR. Most MPR stations are suburban and vary greatly in terms of commercial and industrial development, motor vehicles and transboundary haze, conditions similar to those in the studies of Leh et al. (2012), Zakaria et al. (2010), Afroz et al. (2003) and Makmom et al. (2012). Power plants, industrial waste incinerators, dust from urban construction works and quarries, as well as open burning, are all sources of air pollution in suburban areas (Dominick et al., 2012). Meteorology is a crucial factor influencing particles of various sizes in different ways in MPR. Regardless of the impact of changes in sources, a changing climate which alters local and global climatic factors might have an impact on particle properties (de Jesus et al., 2020). Latif et al. (2014) concluded that wind direction was a factor in the transport of particulate matter from more urbanized and industrial areas to rural areas.
Meanwhile, most of the stations classified as LPR were less polluted due to the low PM2.5 concentrations in the industrial, urban and suburban areas located there. However, these areas are sometimes affected by high PM2.5 concentrations due to forest and bush fires, PM2.5 long range transportation from Kalimantan, Indonesia during forest fires, and local open burning for clearing agricultural land in rural areas during the northeast monsoon season. According to  Leewe et al. (2016), most forest fires are caused by human activities during prolonged dry and hot weather. Forest fires can be detected by satellites using hotspots. DOE (2019) reported a total of 520 hotspots in Malaysia and 174 in Sarawak, which was the highest recorded among Malaysian states in 2018. Frequent open burning incidents result from activities such as burning of garbage in residential areas, garbage burning by the roadside and the burning of brush and agricultural land, which normally occurs during the hot and dry period. Besides the transboundary haze affecting air quality in Malaysia, local burning in urban, suburban and rural areas also contributes to the presence of pollutants. Motor vehicles are one of the primary contributors to air pollution in urban areas, which includes dust, suspended particulate matter, and lead (Awang et al., 2000). Air pollution has a variety of sources including automobiles, industrial waste incinerators, power plants and airborne dust from construction and quarries in urban areas (Rahman et al., 2015). The accelerated urbanization process has resulted in significant levels of pollution in each region. Azmi et al. (2010) found that extremely poor air quality occurs in heavily populated areas. However, in this study, industrial areas are not categorized as HPR based on PM2.5, while suburban and rural areas recorded high annual concentrations of PM2.5. Industrial areas showed good air quality in terms of low PM2.5 concentration. 
This is supported by DOE (2019), which reported that there were 31 categories of industries achieving 100% compliance subject to the Environmental Quality (Clean Air) Regulation, 2014.
According to Rosofsky et al. (2018), inequality in exposure to pollutants recorded at the stations located in urban, suburban, industrial as well as rural areas might be due to spatiotemporal shifts in air pollution. This is due to the presence of a high background concentration of contaminants, which is common in urban, suburban, and industrial areas. It has been shown that the variety of station locations gives different readings for each region (Amran et al., 2015). Therefore, the various pollutant sources present in urban, suburban, industrial and background areas promote inequitable exposure to pollutants. Fig. 8 shows the monthly average concentration of PM2.5 from 2018 to 2019 by region at monitoring stations in Malaysia. The highest monthly concentration of PM2.5 is found in HPR compared with MPR and LPR, except in August where the monthly concentration of PM2.5 is slightly lower than LPR. Meanwhile, the concentration of PM2.5 in LPR is slightly lower than in MPR throughout the year except for July, August and September where the concentration of PM2.5 increased drastically.
September and December had the highest and lowest monthly PM2.5 values, respectively. The southwest monsoon occurs between May and September, whereas the northeast monsoon occurs between November and March. As a result, the monsoon event is most likely responsible for the highest and lowest monthly PM2.5 concentrations in September and December. PM2.5 concentrations were generally greater during the southwest monsoon (May-September) than during the northeast monsoon (November-March) (Abdullah et al., 2017). The higher PM2.5 concentrations during this period are mostly due to drier weather conditions, a stable atmosphere, local effects, and transboundary movement of air pollution from biomass burning in neighboring countries (Abdullah et al., 2011). According to Asif et al. (2018), low rainfall and steady meteorological conditions caused the high PM2.5 concentrations, as stagnant meteorological conditions accelerate the accumulation of PM2.5. Results from this study also show that PM2.5 levels remained quite high during the southwest monsoon.
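The seasonal comparison above can be sketched in Python by labelling each month with its monsoon season and averaging within each label. The month ranges follow the text; the label "inter-monsoon" for April and October, the function names and the sample values are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def monsoon_season(month):
    """Label a calendar month (1-12) using the ranges given in the text."""
    if month in (11, 12, 1, 2, 3):
        return "northeast"
    if 5 <= month <= 9:
        return "southwest"
    return "inter-monsoon"  # April and October; label assumed here


def seasonal_means(records):
    """Average PM2.5 by season; records is an iterable of (month, pm25)."""
    groups = defaultdict(list)
    for month, pm25 in records:
        groups[monsoon_season(month)].append(pm25)
    return {season: mean(values) for season, values in groups.items()}


# Hypothetical monthly averages (µg m-3).
monthly = [(1, 14.0), (4, 17.0), (8, 24.0), (9, 36.0), (12, 12.0)]
print(seasonal_means(monthly))
```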

The Relationship between Air Contaminants and Weather Conditions
Correlation analysis can be used to determine the strength of a link between two variables. The relationship between PM2.5 and other pollutant parameters in Malaysia in HPR, MPR and LPR is shown in Table S9. CO and PM10 were shown to have a strong and significant relationship with PM2.5 in HPR. The correlation coefficient (r) between PM2.5 and PM10 is 0.981 and between PM2.5 and CO is 0.777. For MPR, PM10, CO and O3 were correlated with PM2.5 as follows: PM10 (r = 0.987), CO (r = 0.786) and O3 (r = 0.556). For LPR, PM10, CO and O3 were correlated with PM2.5 as follows: PM10 (r = 0.978), CO (r = 0.790) and O3 (r = 0.606). The strong correlation of PM2.5 and PM10 in HPR, MPR and LPR indicates that PM2.5 is significantly associated with PM10. Based on the relationship between particulate matter (PM10 and PM2.5) and CO, Dominick et al. (2012) suggested that CO is the major pollutant accompanying high particulate concentrations due to the combustion process in motor vehicles. Meanwhile, the correlation between particulate matter (PM10 and PM2.5) and O3 in MPR and LPR points to the precursor pollutants (NOx and VOCs), which are connected to particulate matter from the same sources.
Table S10 presents the relationship between PM2.5 and meteorological parameters in Malaysia for HPR, MPR and LPR. For each region, there was a moderate to poor association between PM2.5 and meteorological indicators such as wind speed, temperature, and humidity. However, they remained significant and had an indirect impact on pollution concentrations. Wind speed is the primary meteorological characteristic that dilutes pollutants (Akhtar et al., 2018). Meanwhile, rising temperatures aid chemical reactions that generate finely divided particulate matter (Afzali et al., 2014). As a result, the concentration of PM2.5 is indirectly increased. Although this study showed a low correlation between PM2.5 and meteorological parameters, atmospheric dynamics and meteorological conditions are assumed to be an essential part of controlling air pollution. The Pearson correlation enables the determination of a relationship between PM2.5 and each of the meteorological parameters. In addition, a modeling technique such as a mixed-effect regression model may further explain the variation of PM2.5 as it allows both fixed and random effects in the model. The changes in PM2.5 due to other pollutants and meteorological factors could be determined using a mixed-effect regression model, which provides a better understanding of the effect of meteorological parameters and pollutants on PM concentrations.
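For illustration only, one common form of such a model is a random-intercept regression. The specification below is an assumption made for this sketch and was not fitted in this study: i indexes days, j indexes monitoring stations, the $x_{kij}$ are the other pollutants and meteorological parameters, and $u_j$ is a station-level random effect.

```latex
\mathrm{PM}_{2.5,ij} = \beta_0 + \sum_{k=1}^{K} \beta_k \, x_{kij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^{2}),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2})
```

Here the $\beta_k$ are the fixed effects shared by all stations, while $u_j$ captures persistent differences between stations that the covariates do not explain.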

CONCLUSION
PM2.5 is of most concern as it is more harmful to human health than PM10. In this study, the average concentration of PM2.5 recorded in Malaysia was between 8 and 31 µg m–3. Using AHC analysis, 65 monitoring stations in Malaysia were classified into HPR, MPR and LPR with 19, 37 and 9 stations respectively based on the parameter PM2.5. The results show that the concentration of PM2.5 in HPR was the highest, with an average concentration of 23.04 µg m–3, and higher concentrations were recorded in urban, suburban and rural areas compared with industrial areas in HPR and MPR. Meanwhile, LPR was most affected when extreme events (haze) occurred, with the greatest skew compared with HPR and MPR. In 2019, the highest levels of PM2.5 were observed for HPR, MPR, and LPR, with very unhealthy and hazardous API levels recorded in LPR (40 and 13 occurrences, respectively). The annual concentration of PM2.5 also exceeded the standard in HPR in 2019. During the southwest monsoon, PM2.5 levels were extremely high due to the hot and dry season. From the correlation analysis, PM10 and CO were significantly correlated with PM2.5 while meteorological parameters had moderate and poor correlation with PM2.5 in all regions. For future work, other air pollutants such as SO2, NO2, CO and O3 should be investigated in detail together with PM2.5 concentration based on different regions. Mixed-effect regression could be considered in the future to give a better understanding of the relationship between PM2.5 and other pollutants and meteorological factors.