Principal Component Analysis and Mapping to Characterize the Emission of Volatile Organic Compounds in a Typical Petrochemical Industrial Park

The petrochemical industry generates a substantial amount of VOC pollution. Given the size and complexity of this industry, as well as the dissimilarities between its sectors and companies, comprehensively understanding and controlling these VOC emissions is a challenge in many countries. This study demonstrates an approach for characterizing and identifying critical sources by using multivariate analysis, including principal component analysis (PCA) and projection pursuit. A representative petrochemical industrial park in southern Taiwan, comprising 20 up-, mid-, and downstream companies and 519,442 emission sources, was selected for analysis over the period from 2012 to 2014. The results indicated that although the total emissions decreased during this period, which was attributable to controlling the upstream sector (65.5–72.1% of the total emission) and larger sources, such as equipment components (ECs; 63.5% of the total emission in 2012, which decreased to 59.3% in 2014), significant emissions were associated with the mid- and downstream companies and other sources, such as cooling towers (CTs) and storage tanks. PCA revealed that the sources in 5 PCs (in decreasing importance: storage tanks, CTs, ECs, wastewater treatment plants, and stacks) explained 88.7% of the data variability. Furthermore, the variance in emission exhibited stronger correlations with the mid- and downstream sectors than with their upstream counterpart. In addition to the amount of emissions they produce, key sources that must be controlled in order to effectively reduce VOCs in the petrochemical industry may be indicated by the correlation among their emission variations.


INTRODUCTION
The petrochemical industry plays a critical role in the economy of Taiwan. The presence of this industry has been accompanied by notorious air pollution (Liu et al., 2008; Yang et al., 2013; Chen et al., 2019). The air pollutants, namely volatile organic compounds (VOCs), have attracted public concern because of their environmental nuisance and their effects on public health (Simpson et al., 2013; Chen et al., 2014; Liu et al., 2014). Many VOCs are well known as toxic air pollutants associated with carcinogenicity, neurotoxicity, and other serious health effects, such as headaches, nausea, and damage to the liver and kidneys by inhalation (Guo et al., 2004; Yang et al., 2012; U.S. EPA, 2017). Examples of VOCs from the petrochemical industry include benzene, butadiene, and dichloromethane, and cancers resulting from long-term exposure to these compounds have been reported (Thepanondh et al., 2011; U.S. EPA, 2017).
Another concern regarding VOC pollution is the potential of these compounds to be involved in photochemical reactions that form ozone and particulate contaminants and affect urban air quality (Tsai et al., 2012; Simpson et al., 2013; Kim et al., 2018). The Taiwan Environmental Protection Agency (TWEPA) reported that the major pollutants affecting air quality in most cities were ozone and fine particulate matter (PM2.5) (TWEPA, 2019). Furthermore, the contribution of ozone has increased in recent years, suggesting the increasingly important impact of photochemical reactions on urban air quality. There is a well-established understanding of the tropospheric chemistry of VOCs involved in the photochemical formation of ozone (Kansal, 2009; Yan et al., 2017). High levels of ozone formation potential are typically associated with the occurrence of aromatics or VOCs with a high molecular weight. These compounds are commonly produced in the petrochemical industry (Cheng and Chou, 2003; Ras et al., 2009; Chen et al., 2016); because of this, it is critical to modify and strengthen the control strategies for VOCs emitted from this industry.
The air pollutant sources in the petrochemical industry typically include manufacturing processes, combustion, fugitive sources, storage and handling, as well as auxiliary emissions (U.S. EPA, 2008; Ragothaman and Anderson, 2017). The problem becomes more complex because the manufacturing processes typical of the petrochemical industry are complicated and cumbersome, with diversified factories, equipment, and pipe fittings among up-, mid-, and downstream companies. In addition to intentional emissions during manufacturing processes, raw materials, intermediates, and final products, which are commonly in the gas or liquid phase, elevate the potential for fugitive emissions that originate from unintentional equipment leaks, such as those from flanges or valves, increasing the challenge of effectively and efficiently reducing VOC emissions (Na et al., 2001; Cetin et al., 2003; Leuchner and Rappengluck, 2010).
Multivariate analysis, particularly principal component analysis (PCA), has been used to investigate air pollution data with respect to source identification and characterization (Han et al., 2006; Tokalioglu and Kartal, 2006; Liu et al., 2014). The statistical method is based on a covariance matrix, with the multivariate data being mapped onto low-dimensional manifolds. This analysis is useful for revealing the structure of the data set while maintaining the contributions of the variables to the variance of the observations. In a system with multiple input variables, this approach is used to determine the variables with the greater eigenvalues, further identifying the core of the problem (Wold et al., 1987; Edwards et al., 2001). Projection pursuit is a data-driven linear transformation; this mapping approach visualizes multivariate data on a low-dimensional scale that optimally preserves the structure of the original data (Koren and Carmel, 2004). Projection pursuit has been frequently applied in different research fields. Rajeevan et al. (2007) forecasted southwest monsoon rainfall over India by using models based on the projection pursuit regression technique. The projection pursuit method was used to effectively assess the temporal water quality variations at different sampling sites and then classify the critical pollution features (Huang and Lu, 2014). In another study assessing phytotoxicity during composting, projection pursuit was combined with other techniques to reduce the phytotoxin concentration and increase the treatment efficiency (Cui et al., 2017).
Instead of naming the companies or sources for VOC control simply based on their emission amounts, this study identified the critical companies and sources (including stacks, equipment components [ECs], flares, storage tanks, loading/unloading areas [L/ULs], wastewater treatment plants [WWTPs], oil-water separators [OWSs], and cooling towers [CTs]) in the up-, mid-, and downstream sectors of a representative petrochemical industrial park in southern Taiwan. This study provides insights into effective VOC management for this particular and important emission source.

Study Site
This study selected one typical petrochemical industrial park that has been established for more than 40 years (since the 1970s) in the suburban area of Kaohsiung City in southern Taiwan (Fig. 1). The site is the second largest petrochemical industrial park in Taiwan, with an overall area of 403 hectares. A complete petrochemical industrial structure is present in the park, containing 1 upstream vendor for cracking light oil (denoted as F01 in the following discussion), 10 midstream manufacturers for synthesis of petrochemical materials from the upstream vendor (denoted as F02 to F11), and 9 downstream manufacturers responsible for production of different petrochemical products (denoted as F12 to F20). Although the study includes only 1 upstream vendor, this vendor (F01) has the comprehensive structure of an upstream vendor in the petrochemical industry. It includes 3 naphtha cracking plants, 1 catalytic reforming unit, 3 aromatic separation plants, and 1 dimethyl benzene separation plant, comprising 27 processes producing typical materials in the petrochemical industry such as ethylene, propylene, butadiene, benzene, toluene, and dimethyl benzene mixtures. In these companies (F01–F20), the amounts of VOCs emitted from different potential sources, including stacks, ECs, flare towers, storage tanks, L/ULs, WWTPs, OWSs, and CTs, were monitored, with the analytical details being provided in the next section.

VOC Analysis
VOC emission levels from different sources in 20 companies from 2012 to 2014 were analyzed. The potential sources comprise 126 stacks, 13 flare towers, 517,750 ECs (including pump shaft seals, compressor shaft seals, valves, flanges, pressure relief valves, and open pipelines), 1,420 storage tanks (including fixed tanks, inner floating roof tanks, outer floating roof tanks, underground tanks, and pressure tanks), 71 L/ULs, 18 WWTPs, 6 OWSs, and 38 CTs. Two different approaches were employed to determine VOC emissions from these sources. VOC concentrations and mass flow rates were sampled and analyzed to calculate VOC emissions from the stacks. For other sources, AP-42 (Compilation of Air Pollutant Emission Factors), which was developed from source test data, material balance studies, and engineering estimates and contains emission factors and process information for more than 200 air pollution source categories, was used for estimation of the associated VOC emissions (U.S. EPA, 1995). With the total emissions given in the permits of different companies, the method estimates the VOC emissions from different sources in the 20 companies. The AP-42 method has been applied in other fields as well: Zhang et al. (2017) combined it with aerosol emission models to predict the silt loading of paved roads in China, and Mihajlovic et al. (2016) used it to estimate the VOC emission during river barge transportation of petrochemicals, emphasizing the need for reviewing the VOC policy in the field.
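The two estimation routes described above can be illustrated with a minimal sketch. It assumes the general emission-factor relation from the AP-42 introduction, E = A × EF × (1 − ER/100), and a simple concentration × flow × time product for measured stacks. The function names, units, and all numerical inputs are hypothetical and are not taken from this study.

```python
def stack_emission_t_per_year(conc_mg_per_m3, flow_m3_per_h, hours_per_year):
    """Measured route: concentration x volumetric flow x operating time."""
    mg_per_year = conc_mg_per_m3 * flow_m3_per_h * hours_per_year
    return mg_per_year / 1e9  # 1 metric ton = 1e9 mg


def factor_emission_t_per_year(activity, factor_kg_per_unit, control_eff_pct=0.0):
    """Emission-factor route: E = A x EF x (1 - ER/100)."""
    kg_per_year = activity * factor_kg_per_unit * (1.0 - control_eff_pct / 100.0)
    return kg_per_year / 1e3  # 1 metric ton = 1e3 kg


# Hypothetical inputs, for illustration only.
stack_t = stack_emission_t_per_year(conc_mg_per_m3=50.0,
                                    flow_m3_per_h=10_000.0,
                                    hours_per_year=8_000.0)
fugitive_t = factor_emission_t_per_year(activity=1_000,          # e.g., number of components
                                        factor_kg_per_unit=2.0,   # assumed kg per component per year
                                        control_eff_pct=50.0)
print(stack_t, fugitive_t)
```

Keeping the two routes as separate functions mirrors the distinction in the text between sampled stacks and factor-based estimates for all other source types.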
VOC emissions from stacks were investigated by following the U.S. Environmental Protection Agency (U.S. EPA)'s TO-15 standard method with modification (U.S. EPA, 1999). The air sample was collected by using pre-cleaned and certified canisters. VOC analysis of canister samples was accomplished with gas chromatography (7890B; Agilent) coupled with mass spectrometry (5977A; Agilent) (GC-MS). Air samples from the canister were passed through multi-sorbent packing followed by purging with helium for drying. The packing was heated to desorb VOCs on the surface. The air sample was then passed through a concentrator to condense VOCs on a reduced temperature surface. Condensed VOCs were thermally desorbed and injected into the GC-MS to determine concentrations. A capillary column (60 m × 0.25 mm I.D., DB-VRX; Agilent) was used to achieve a high temporal resolution of VOC analysis. The acquisition mode was set to scan in the proper ranges of mass-to-charge ratios at 0.5 scan per second.

Principal Component Analysis
PCA is a method for identifying the directions in observations along which the data have the highest variability through an eigenvalue decomposition of the covariance matrix of the data (Yu et al., 1998; Larsen and Baker, 2003). This approach is commonly used to analyze air pollution data, which are typically observed and recorded in discrete forms composed of a series of environmental variables and pollutants, and to identify critical pollutant species, sources, or environmental variables (Han et al., 2006; Tokalioglu and Kartal, 2006; Liu et al., 2014). The complex observations are converted into a set of linearly uncorrelated variables to determine principal components (PCs) represented as functions of the original variables. The 1st PC (PC1) has the largest possible variance to account for as much of the variability in the observations as possible. Each succeeding component (e.g., PC2 and PC3) in turn has the largest possible variance under the constraint that it is uncorrelated with the preceding components. PCA was conducted using SPSS 17.0 for VOC emission profiles observed from 2012 to 2014. The emission data were standardized to give all variables identical variations and to calculate the associated eigenvalues. PCs consisting of different companies or sources were analyzed by applying varimax orthogonal rotation, which leaves the sum of eigenvalues unchanged. The number of PCs was determined by extracting those with eigenvalues larger than 1, namely the Kaiser criterion. The loading values of the original variables (i.e., different companies or emission sources) in each PC were calculated. A loading value larger than 0.6 was considered to indicate a strong correlation between the PC and the emission factories or sources.
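The procedure above (standardization, eigenvalue decomposition, the Kaiser criterion, and the 0.6 loading threshold) can be sketched with NumPy as follows. This is an illustrative sketch only: the study used SPSS 17.0, the varimax rotation step is omitted here for brevity, and the function name `pca_kaiser` and its arguments are hypothetical.

```python
import numpy as np


def pca_kaiser(X, loading_cut=0.6):
    """PCA on standardized data; rows are observations, columns are variables."""
    X = np.asarray(X, dtype=float)
    # Standardize each variable to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # For standardized data the covariance matrix equals the correlation matrix.
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                      # Kaiser criterion
    # Unrotated loadings: eigenvector scaled by the square root of its eigenvalue.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    explained = eigvals[keep] / eigvals.sum() # fraction of total variance per PC
    strong = np.abs(loadings) > loading_cut   # variables strongly tied to each PC
    return eigvals[keep], explained, loadings, strong
```

Called on a matrix whose columns are companies (or sources) and whose rows are emission observations, the `strong` mask marks which variables exceed the 0.6 threshold on each retained PC. Without rotation, variables may load on several PCs at once, which is the situation varimax rotation is meant to reduce.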

Total VOC Emission
The total VOC emissions from 20 companies including 519,442 potential sources from 2012 to 2014 were estimated (Fig. 2), and Table 1 lists the emission data of each company. The emissions were 1252.8, 1038.0, and 993.0 tons in 2012, 2013, and 2014, respectively. Compared with 2012, the emissions decreased by 17.1% (214.8 t) and 20.7% (259.8 t) in 2013 and 2014, respectively. During this period, the petrochemical industrial park managed to limit the VOC emissions from fugitive sources. The approaches comprised the exclusive use of flare towers for emergency VOC release, replacement of ECs with leak-free types, and the addition of caps in WWTPs and OWSs. These methods appear to have assisted in reducing total VOC emissions in this industrial park. Fig. 3 further differentiates VOC emissions from different sources from 2012 to 2014. The EC was the dominant source, contributing 63.5% (795.1 t), 60.1% (624.2 t), and 59.3% (589.3 t) of the total emissions in 2012, 2013, and 2014, respectively (Fig. 2). The CT represents another critical source, contributing 11.8%, 14.3%, and 14.8% of the total emissions from 2012 to 2014. The VOC reductions observed in this typical petrochemical industrial park are possibly attributable to the regulations and strategies adopted by TWEPA to control air pollution from stationary sources in the last decades. With the increasing use of no-leak ECs, the reduction of the emission from this source was the highest (a decrease equal to 16.4% of the 2012 total emission from 2012 to 2014). The reduction ratios of VOC emissions from OWSs and WWTPs (by capping the separators and plants) were 3.1% (38.9 t) and 2.3% (28.6 t) in these three years, respectively. However, 25% of the total emission was associated with stacks, storage tanks, and CTs and showed limited change during this period, suggesting a need for more effective control strategies for these particular sources.
Fig. 4 illustrates VOC emissions by different sectors of manufacturing in this petrochemical industrial park from 2012 to 2014. VOCs emitted by the upstream sector were 902.7, 704.0, and 650.2 tons (decreasing over the years) and represent 72.1%, 67.8%, and 65.5% of the total emissions in 2012, 2013, and 2014, respectively. However, VOC emissions by the mid- and downstream companies were not reduced (19.3%, 21.4%, and 23.8% of the total VOCs were emitted by midstream companies, with 8.6%, 10.8%, and 10.8% being contributed by downstream companies in 2012, 2013, and 2014, respectively). Current VOC regulations and control strategies seem to be effective for the upstream sector, which has relatively greater emissions, whereas the management of emissions by mid- and downstream companies was limited.
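The percentages quoted in this section follow directly from the reported tonnages. The short sketch below recomputes them as a consistency check; the emission figures are those stated in the text, whereas the variable and function names are introduced here only for illustration.

```python
# Reported emissions in metric tons, by year (values from the text).
totals = {2012: 1252.8, 2013: 1038.0, 2014: 993.0}
ec = {2012: 795.1, 2013: 624.2, 2014: 589.3}
upstream = {2012: 902.7, 2013: 704.0, 2014: 650.2}


def pct(part, whole):
    """Share of `part` in `whole`, in percent, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Reduction of the total relative to 2012.
reduction = {y: pct(totals[2012] - totals[y], totals[2012]) for y in (2013, 2014)}
# Yearly shares of equipment components and of the upstream sector.
ec_share = {y: pct(ec[y], totals[y]) for y in totals}
upstream_share = {y: pct(upstream[y], totals[y]) for y in totals}
# EC reduction from 2012 to 2014, expressed against the 2012 total.
ec_drop_vs_2012_total = pct(ec[2012] - ec[2014], totals[2012])

print(reduction, ec_share, upstream_share, ec_drop_vs_2012_total)
```

The check shows that the 16.4% figure for ECs is expressed relative to the 2012 total emission rather than relative to the 2012 EC emission.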

Company and Source Identification by PCA
To verify the hypothesis regarding the existence of critical companies or sources in the study site for effective VOC management, PCA was applied as a classification method to the emission data. Fig. 5(a) depicts the result of PC1, which accounts for 29.2% of the total data variance. The companies with a loading value greater than 0.6 comprised F04, F06, F08, F15, F16, F17, and F20. Figs. 5(b)–5(d) further illustrate the ratios of different sources contributing to VOC emissions in these critical companies from 2012 to 2014. VOCs emitted from tank storage appeared to be relatively more important in these companies (the contribution ratios ranged from 38.5–77.6%, from 30.6–77.9%, and from 38.4–82.1% in 2012, 2013, and 2014, respectively). In the result of PC2 (19.5% of the data variance), the critical companies included F02, F03, F07, F09, and F13 (Fig. 5(a)). Figs. 6(b)–6(d) indicate that the CT was the dominant source (the contribution ratios ranged from 30.6–66.1%, from 37.6–65.7%, and from 40.1–77.3% in 2012, 2013, and 2014, respectively). PC3 accounted for 19.5% of the data variance, and only three critical companies were identified. The VOC data appeared to be mainly loaded by the ECs (the contribution ratios ranged from 36.6–78.5%, from 53.8–73.4%, and from 31.0–77.4% in 2012, 2013, and 2014, respectively).
For the last two PCs, Fig. S1(a) depicts the result of PC4, which accounts for 11.5% of the total data variance; the critical companies comprised F10, F12, and F14. The WWTPs appeared to be the important source (the contribution ratios ranged from 53.9–74.6%, from 39.7–66.9%, and from 23.7–54.9% in 2012, 2013, and 2014, respectively). PC5 accounted for 9.0% of the data variance; only one company (F11) was identified, with the stack being the dominant source (the contribution ratios ranged from 44.6–60.9% during the period of 2012–2014).
Considering the effects of particular companies or sources identified by PCA, 88.5% of total VOC emissions were loaded by the companies considered critical in the five PCs. The emission sources, in order of decreasing importance, are tank storage > cooling towers > equipment components > wastewater treatment plants > stacks. These sources potentially represent the higher priorities for effectively and efficiently controlling VOC emissions at this site. As to the influences of different sectors of the industry (Table 2), the EC was the critical source in the upstream sector. Additional sources, including the CTs (40%), tank storage (20%), ECs (20%), WWTPs (10%), and stacks (10%), dominated the midstream sector, whereas tank storage (44.4%), WWTPs (22.2%), CTs (11.1%), and ECs (11.1%) were important sources for the downstream sector.
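As a quick cross-check of the figures in this section, the variance explained by the five PCs can be accumulated and the dominant sources ordered by that variance. The percentages and source assignments below are those stated in the text; the data structure and variable names are illustrative assumptions.

```python
# (PC, explained variance in %, dominant source), as reported in the text.
pcs = [
    ("PC1", 29.2, "tank storage"),
    ("PC2", 19.5, "cooling towers"),
    ("PC3", 19.5, "equipment components"),
    ("PC4", 11.5, "wastewater treatment plants"),
    ("PC5", 9.0, "stacks"),
]

# Running total of explained variance, rounded to one decimal.
cumulative = []
running = 0.0
for name, variance, _ in pcs:
    running += variance
    cumulative.append((name, round(running, 1)))

# Sources ordered by the variance of their PC; the stable sort keeps
# PC2 ahead of PC3, which share the same percentage.
ranking = [src for _, _, src in sorted(pcs, key=lambda p: p[1], reverse=True)]

print(cumulative)
print(" > ".join(ranking))
```

The cumulative value for PC5 reproduces the 88.7% of data variability cited for the five PCs, and the ordering matches the ranking of sources given in the text.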

Projection Pursuit of the PCA Result
Mapping of multivariate data onto low-dimensional manifolds is useful for converting the data into a more perceivable format in which the inherent pattern and characteristics can be observed more easily (Friedman and Tukey, 1974; Wold et al., 1987). With the concepts of vector projection and weighted averaging, the comprehensive eigenvalue of a company was estimated from its loading values and the eigenvalues of the five individual PCs (Vo and Durlofsky, 2014). In this calculation, LV_i and EV_i represent the loading value and eigenvalue of the company in the i-th PC, respectively; V_i denotes the variance of the i-th PC; and S represents the sum of the variances of all individual PCs. Table 3 shows the results of PCA and the comprehensive eigenvalues calculated in this study. By calculating the comprehensive eigenvalues, which quantify the factor by mapping the data to one dimension, the companies ranked in order of decreasing importance for VOC emission are F04 > F16 > F18 > F03 > F20. These companies belong to the mid- and downstream sectors of the petrochemical industry, with the critical sources being associated with storage tanks, CTs, and ECs. Fig. 4 shows that VOC emissions in the upstream sector were larger than those in the mid- and downstream sectors (p < 0.01). However, the larger number of processes in the mid- or downstream sectors, such as polymerization, esterification, and alkylation, potentially elevated the importance of companies in these sectors.
Table 2. Companies with different critical emission sources in up-, mid-, and downstream sectors of the study site and the ratios of VOC emission by the companies to the total emission. a EC, CT, TS, and WWTP denote equipment component, cooling tower, tank storage, and wastewater treatment plant, respectively. b The number in the parentheses denotes the ratio of VOCs emitted by the selected companies to the total VOC emission.
In addition to the amount of emissions and loading values in the first few PCs, the quantity and complexity of sources could be another factor for identifying the critical companies for effective VOC control in the petrochemical industry. Wei et al. (2018) developed and applied an inverse-dispersion method to calculate the VOC emissions from a typical petrochemical plant in central Taiwan. The estimated average VOC emissions totaled 666.0 t year⁻¹, with ethylene (115.4 t), propylene (102.5 t), benzene (92.7 t), isopentane (79.3 t), toluene (74.1 t), isobutane (61.6 t), n-pentane (56.8 t), n-butane (44.3 t), propane (21.1 t), and ethane (18.1 t) being among the major species, and were on the same order as the results using the AP-42 method. Another study that investigated the VOC emissions from a petrochemical industrial district in the same region found that equipment components, storage tanks, and stacks were primarily responsible for VOCs emitted in the district (Yen and Horng, 2009), similar to the observations in this study. Our study evaluated the VOC emissions from a petrochemical park with a typical industrial structure (an upstream sector for producing the material, a midstream sector for synthesis and processing, and a downstream sector for manufacturing the products) and monitored the reduction in these emissions from 2012 to 2014. We found that this reduction mainly resulted from controlling the ECs, WWTPs, and OWSs, whereas far more limited measures were implemented for the stacks, storage tanks, and CTs. Furthermore, the mid- and downstream sectors displayed smaller decreases in VOCs than the upstream sector.
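The expression for the comprehensive eigenvalue is not reproduced in this text. One form consistent with the definitions of LV_i, EV_i, V_i, and S given in this section, offered only as an assumption, is a variance-weighted sum over the five PCs, where CE is a symbol introduced here for the comprehensive eigenvalue of a company:

```latex
% Assumed form, not confirmed by the source: variance-weighted sum over the five PCs.
CE = \sum_{i=1}^{5} \frac{V_i}{S}\, LV_i \, EV_i,
\qquad S = \sum_{i=1}^{5} V_i
```

Under this assumed form, a company that loads strongly on PCs with large eigenvalues and large explained variance receives a high CE, which is in line with the weighted-averaging idea described in this section; the exact weighting used in the study may differ.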

DISCUSSION AND CONCLUSIONS
PCA, which was applied to determine the emission characteristics of the study site, revealed that 88.7% of the data variability during the monitoring period was explained by the sources in 5 PCs, with PC1 through PC3 being more significant. Ranked according to their importance, the sources influencing the total VOC emissions were storage tanks > CTs > ECs > WWTPs > stacks; appropriate measures for reduction depend upon the specific emission source. For example, a closed vent system addresses VOC emissions from storage tanks. Anti-erosion measures for the equipment and piping, as well as the inhibition of algae growth, effectively reduce emissions from CTs, whereas the use of leak-free components decreases those from ECs. The highly concentrated emissions from stacks can be pre-treated with a thermal method to enhance the VOC removal efficiencies. Furthermore, the variance in emission exhibited stronger correlations with the mid- and downstream sectors than with their upstream counterpart during the monitoring period. In addition to the amount of emissions they produce, key sources that must be controlled in order to effectively reduce VOCs in the petrochemical industry may be indicated by the correlation among their emission variations. The findings in this study add to our knowledge of VOC pollution in this industry and can be used to develop effective management strategies that may also be applicable to other industries with similar emission characteristics.

ACKNOWLEDGMENTS
This study was conducted under the auspices of the Taiwan Environmental Protection Agency (TWEPA) and Ministry of Science and Technology (MOST) in Taiwan under Contract Numbers MOST 106-EPA-F-011-001 and MOST 106-2621-M-110-003. Additional financial support from the Industrial Development Bureau in the Ministry of Economic Affairs in Taiwan under Grant Number 980521 is greatly appreciated. Its content is solely the responsibility of the authors and does not necessarily represent the official views of the institutions.

SUPPLEMENTARY MATERIAL
Supplementary data associated with this article can be found in the online version at http://www.aaqr.org.