A Prognostic Approach Based on Fuzzy-Logic Methodology to Forecast PM10 Levels in Khaldiya Residential Area, Kuwait

A prognostic approach based on a fuzzy-logic model is proposed to estimate suspended dust concentrations, related to PM10, in a specific residential area in Kuwait with high traffic and industrial influences. Seven input variables, including four important meteorological parameters (wind speed, wind direction, relative humidity and solar radiation) and the ambient concentrations of three gaseous pollutants (methane, carbon monoxide and ozone), were fuzzified using a system with a graphical user interface (GUI) and an artificial intelligence-based approach. Trapezoidal membership functions with ten and fifteen levels were employed for the fuzzy subsets of each model variable. A Mamdani-type fuzzy inference system (FIS) was developed to introduce a total of 146 rules in the IF-THEN format. The product (prod) and the centre of gravity (centroid) methods were used as the inference operator and the defuzzification method, respectively, for the proposed FIS. The results obtained using fuzzy logic were compared with the outputs of an exponential regression model. The predictive performances of the models were compared based on various descriptive statistical indicators, and the proposed method was tested against additional observed data. The prognostic model presented in this work produced very small deviations from the actual results, and showed better predictive performance than the other model with regard to forecasting PM10 levels, with a very high determination coefficient of over 0.99.


INTRODUCTION
Particulate matter (PM) is one of the most prevalent atmospheric pollutants in the urban atmosphere. It consists of suspended solids and liquids, and it comes from a variety of natural and anthropogenic sources. Some particles are directly emitted into the air from vehicles and industries, whereas other particles are indirectly formed by the chemical change of combustion gases in the presence of sunlight and water vapor.
Air quality problems related to PM10 have become a topic of considerable importance. The term PM10 refers to atmospheric particles with an aerodynamic diameter of less than 10 μm. These small particles are targeted because they can easily penetrate into the deepest regions of the lungs. Epidemiological studies have indicated that exposure to PM10 induces an increase in lung cancer, morbidity and cardiopulmonary mortality (Nel, 2005; Bhaskaran et al., 2011). Dockery et al. (1993) were the first to report a significant association between PM10 and mortality. Pope et al. (1995), in another study, also linked PM10 to cardiopulmonary and lung cancer mortality. Studies have also indicated that there is a significant association between PM10 concentrations and medical visits for lower respiratory symptoms in children and upper respiratory symptoms in the elderly (Ostro et al., 2001). Other environmental effects of particulate matter are visibility reduction, acidic precipitation, and the transport of pollutants from industrial regions to remote and pristine areas (Marshall et al., 1986; Wolff et al., 1986; Swietlicki et al., 1996). Particles have also been suggested to be responsible for possible global climate change through their direct and indirect role in the earth's radiation balance (Study of Critical Environmental Problems (SCEP), 1970; Charlson et al., 1992) and possible modification of cloud processes (Barret et al., 1979; Parungo et al., 1979; Parungo et al., 1982).
PM10 consists of many different compounds and has a variety of primary and secondary sources (Wilson and Suh, 1997). This makes prediction and control of PM10 a difficult task. According to Grivas and Chaloulakou (2006), the prediction of particulate concentrations is more difficult than the modeling of gaseous pollutants. This is due to the complexity of the processes that control the formation, transportation, and removal of aerosols in the atmosphere. It is expected that the input variables which are responsible for the PM10 levels in the air may differ from one location to another. Identifying which sources or chemical processes are associated with PM10 levels in a certain location can be challenging. Therefore, the selection of the right input variables for a particular location is crucial, and it should be done before any modeling is attempted. Studies reported in the literature showed that meteorology and ambient concentrations of gaseous pollutants play a very important role in the behavior of PM10. In this regard, Van der Wal and Janssen (2000) showed that 45% of the variance of PM10 concentrations may be explained by changes in wind direction, temperature, and duration of precipitation. Other studies found that PM10 concentrations in ambient air are significantly affected by wind speed, wind direction, solar radiation, relative humidity, rainfall, boundary layer depth, precipitation, temperature and number of consecutive days with synoptic weather patterns (Alpert et al., 1998; Monn, 2001; Giri et al., 2008; Barmpadimos et al., 2012). Furthermore, the ambient concentrations of some gaseous pollutants such as ozone (O3), carbon monoxide (CO) and sulfur dioxide (SO2) also have their roles in the behavior of PM10 (Rizzo et al., 2002).
In order to curb the increasing deterioration of ambient air quality, urgent risk assessment and proper risk management tools are required to ensure a robust and resilient control of PM10 levels. Considering the complicated inter-relationships among a number of system factors in the dispersion and transport of atmospheric pollutants under several meteorological conditions, mathematical models have become essential tools to develop early-warning and control strategies, as well as to investigate future emission scenarios. Although statistical models may be able to establish a relationship between the input and the output variables without detailing the causes and effects in the formation of pollutants, they are not capable of capturing the inherent non-linear nature of the problem and forecasting short-term pollution levels (Agirre-Basurko et al., 2006; Barai et al., 2007; Akkoyunlu et al., 2010). Since the number of meteorological and pollution parameters implies a high-dimensional input space and demands high computational capacity, it is believed that artificial intelligence-based techniques may provide a good alternative to traditional techniques due to their speed, robustness and non-linear characteristics (Yetilmezsoy and Sapci-Zengin, 2009; Akkoyunlu et al., 2010).
Because of their non-parametric regression capabilities, generalization properties and ease of working with high-dimensional data, several artificial intelligence-based methods, such as artificial neural networks (Abdul-Wahab and Al-Alawi, 2002; Yetilmezsoy, 2006; Yetilmezsoy and Saral, 2007; Akkoyunlu et al., 2010) and fuzzy-logic/neuro-fuzzy methodology (Nunnari et al., 2004; Yildirim and Bayramoglu, 2006; Carnevale et al., 2009; Noori et al., 2010), have recently been utilized in the modeling of various real-life problems in the air pollution field. There have also been other specific studies reporting the advantages and adaptability of artificial intelligence-based models for the prediction of daily and/or hourly particulate matter (PM2.5 and PM10) emissions in many urban and residential areas (Chaloulakou et al., 2003; Chelani, 2005; Grivas and Chaloulakou, 2006; Karaca et al., 2009).
Considering the non-linear nature of PM10-based air pollution problems, attempts to develop artificial intelligence-based control of PM10 emissions may help to provide a continuous early-warning strategy without requiring complex formulations or laborious parameter estimation procedures. Therefore, implementation of a knowledge-based methodology may be regarded as a particular field of investigation for controlling PM10 emissions, which is necessary to mitigate one of the major public health issues associated with exposure to high concentrations of atmospheric particles.
Based on the above-mentioned facts, the specific objectives of this study were: (1) to estimate suspended dust concentrations by means of a new fuzzy-logic-based model consisting of several important meteorological parameters (i.e., wind speed, wind direction, relative humidity and solar radiation) and ambient concentrations of some gaseous pollutants (i.e., methane, carbon monoxide and ozone) affecting PM10 concentrations; (2) to compare the proposed artificial intelligence-based approach with the conventional multiple regression-based method for various descriptive statistical indicators; and (3) to verify the validity of the proposed prognostic methodology using several testing data.

Description of Study Domain
Khaldiya (Al-Khaldiyah) residential area is a suburb of Kuwait City and it is located within the boundaries of Al-Asimah Governorate in Kuwait (Fig. 1). The center of the area is situated at the latitude 29°19′32′′ north and the longitude 47°57′47′′ east. The Shuwaikh industrial area lies towards the western boundaries of Khaldiya, Yarmouk subarea lies to the south, Kaifan subarea is located on the northern boundary of the area, while Adailiyah subarea marks the eastern boundary.
Relatively heavy traffic surrounds the study area at Khaldiya, and therefore it is mainly affected by the air pollutants emitted from the traffic load in view of the proximity of major highways, such as the Third and the Fourth Ring Roads and the International Airport Road (Abdul-Wahab and Al-Alawi, 2002). The monitoring site is also situated downwind from the Shuwaikh industrial area and the Shuwaikh power plant. Hence, the monitoring site can also be affected by these two sources if the levels of air pollutants released from them are significant. It should be noted that the Shuwaikh industrial area is known as the industrial section of Kuwait, as most manufacturers can be found in it. Car repair shops and many car dealerships are also located in this part of Kuwait.
The climate in the region is typically arid, with very hot summers and relatively cold and dry winters. The summer season, which lasts from May to September, is extremely hot and dry, with temperatures easily exceeding 45°C during daytime. The winter season, from November through February, is cool with some precipitation and average temperatures around 13°C, with extremes from -2°C to 27°C. Annual rainfall averages less than 127 mm and occurs mainly between October and April. The spring season in March is warm and pleasant with occasional thunderstorms (Yassin and Almouqatea, 2010). Dust and sandstorms are also common throughout the year. They are more frequent in the winter months and in midsummer (Abdul-Wahab and Al-Alawi, 2002). The frequent winds from the northwest are cool in winter and spring and hot in summer. Southeasterly winds, usually hot and damp, spring up between July and October, whilst hot and dry south winds prevail in spring and early summer (Yassin and Almouqatea, 2010).
Considering the above-mentioned facts, investigating PM10 concentrations in Khaldiya residential area is vital to predict environmental changes and to study future scenarios that include the impacts of changing populations and of new commercial developments.

Collection of the Data
Ambient air quality data and meteorological conditions were recorded every five minutes using an air pollution mobile monitoring station located at Khaldiya residential area in the State of Kuwait. These 5-min data were used to determine the variations of PM10 with the other pollutants and with meteorological parameters. The mobile station was operated for 24 hours on a daily basis in July (summer period). Sampling and analysis were conducted automatically, and the data were subsequently transferred to the data station. The station was monitored on a daily basis by examination of the collected data. In addition, all equipment was recalibrated and aligned on a monthly basis.
The location of the air pollution mobile monitoring station (29°19′20′′N, 48°58′18′′E) was selected as the sampling site on the basis of the availability of power, security and the topography of the area. Care was taken that no high buildings or trees were present within 500 m of the site in any direction. This condition was intended to eliminate the effect of such obstructions on the instrumentation and measurements.
In terms of its operation, the mobile laboratory is characterized by the following: sampling inlets were located on top of the laboratory 10 m above the ground; all the monitors were controlled by an intelligent data logger; automatic zero and span calibrations were performed using a calibration gas once every 23 hours (thus the same hourly data were not lost each day).
The mobile laboratory has the capability of measuring 18 variables, which include concentrations of methane, non-methane hydrocarbons, nitrogen oxides (NO and NO2), sulfur dioxide, carbon oxides (CO and CO2), hydrogen sulfide, ammonia, ozone, and total dust. In addition, several meteorological parameters, such as temperature, pressure, humidity, solar radiation, wind direction, and wind speed, can be measured by the mobile laboratory.
Collected data were processed on a monthly basis in order to assess the consistency of measurements. Analysis included inspection of spot readings in order to locate and eliminate measurements corresponding to calibration and adjustment periods. In addition, a quality check of the data was performed by examining all data in graphical form. Subsequently, hourly averages and spot minimums and maximums were generated. The number of complete data points with values for all 8 variables recorded was 1096.
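As an illustration of the aggregation step described above, the following minimal sketch reduces a regularly sampled 5-min series to hourly averages, minimums and maximums. It assumes twelve equally spaced readings per hour with no gaps; the function name `hourly_stats` and its parameters are hypothetical and are not taken from the data logger software used in the study.

```python
import numpy as np

def hourly_stats(values, samples_per_hour=12):
    """Return (mean, min, max) per hour for a regularly sampled series.

    Assumes 5-min readings (12 per hour); a trailing incomplete hour is dropped.
    NaN entries (e.g. readings removed during calibration) are ignored.
    """
    v = np.asarray(values, dtype=float)
    n_hours = len(v) // samples_per_hour
    blocks = v[: n_hours * samples_per_hour].reshape(n_hours, samples_per_hour)
    return (np.nanmean(blocks, axis=1),
            np.nanmin(blocks, axis=1),
            np.nanmax(blocks, axis=1))

# Two hours of synthetic readings: 0, 1, ..., 23
hour_mean, hour_min, hour_max = hourly_stats(np.arange(24))
```

Marking calibration readings as NaN before calling the function keeps them out of the hourly statistics without shifting the hour boundaries.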

Instrumentation and Measurements
The air pollution mobile monitoring station was fitted with chemical monitors and real-time instruments for assessing pollutant concentrations with high sensitivity and specificity. In the present study, the gaseous pollutants measured include methane (CH4), carbon monoxide (CO) and ozone (O3). The meteorological parameters monitored simultaneously consisted of wind speed, wind direction, relative humidity and solar radiation.
Methane (CH4) was measured by gas chromatography using a flame ionization detector (Model MAS-1030A, Mine Safety Appliances Company), which had a detection limit of 0.05 ppm. Carbon monoxide (CO) concentrations were measured using a non-dispersive infrared (IR) analyzer (Model 48, Thermo Environmental Instruments) with a minimum detectable limit of 0.1 ppm and a measuring range up to 20 ppm. Ozone (O3) concentrations were measured using a non-dispersive ultraviolet (UV) photometer (Model ML 9812, Monitors Labs) with a measuring range of 1000 ppb. Suspended dust (PM10) was measured gravimetrically (TEOM® Series 1400a, Thermo Electron Corporation). This was a real-time device used for assessing particulate concentration for sizes smaller than 10 μm in diameter (Abdul-Wahab and Al-Alawi, 2002).
The sensor for solar radiation enabled readings with an accuracy of 0.02 kW/m² and a range of 0.0 to 2 kW/m². The sensor for relative humidity was calibrated for measurements with an accuracy of 3% and a range of 0.0 to 100%. The sensor for wind speed had an accuracy of 0.2 m/s and a range of 0.4 to 76 m/s. Furthermore, the sensor for wind direction provided measurements with an accuracy of 5° and a range of 0.0 to 360°. Other details of the mobile laboratory's meteorological sensors can be found in previous studies (Abdul-Wahab et al., 1996; Abdul-Wahab et al., 2000; Elkamel et al., 2001). Table 1 summarizes the methods, ranges and accuracy of the measurements conducted in the mobile laboratory. Variations of the model components considered in the proposed prognostic approach are depicted in Fig. 2.

Variation of PM10 Levels during the Study Period
It is known that atmospheric pollutants and meteorological conditions can exhibit remarkable seasonal variations. During the measurement period, PM10 levels ranged from 35 to 2257.42 μg/m³, with an average concentration of 183.55 μg/m³, as depicted in Fig. 2. On the basis of the complete PM10 data set, about 94.4% of the overall suspended dust concentrations were lower than 500 μg/m³. While the maximum PM10 concentration observed during the study period was 2257.42 μg/m³, some local peaks between 892.50 and 1823.14 μg/m³ were also recorded. These peaks are attributed to construction activities in the area surrounding the monitoring site during the investigation period.

Fuzzy-Logic Methodology
In the fuzzy-logic-based methodology (Zadeh, 1965), there are five parts of the fuzzy inference process: fuzzification of the input variables, application of the fuzzy operator (AND or OR) in the antecedent, implication from the antecedent to the consequent, aggregation of the consequents across the rules, and defuzzification. In the fuzzification step, numerical inputs and outputs (crisp variables) are converted into linguistic terms (e.g., A, B, C, etc.) or some specific adjectives (e.g., cold, warm, hot, low, high, big, small, etc.) according to the corresponding degrees and numbers of specific membership functions used in the fuzzy inference system (FIS) (Altunkaynak et al., 2005; Yetilmezsoy, 2012). The input is always a crisp numerical value of the input variable (in the present case, for wind speed (WS), a value in the interval between 0.82 and 3.73 m/s) and the output is a fuzzified degree of membership in the qualifying linguistic set (always in the interval between 0 and 1). Once the inputs have been fuzzified, the fuzzy operator is applied to obtain a number that represents the result of the antecedent for a given rule. This number will then be applied to the output function. In the FIS, two built-in AND methods (min (minimum) and prod (product)) and two built-in OR methods (max (maximum) and probor (the probabilistic OR method)) are available (Rubens, 2006).
Once proper weighting (a number between 0 and 1) has been assigned to each rule, the implication method is implemented in the third step. The input of this process is a single number given by the antecedent, and the output is a fuzzy set represented by a specific membership function. For the implication process, two built-in methods are supported by the FIS, and they are the same functions that are used by the AND method: min (minimum), which truncates the output fuzzy set, and prod (product), which scales the output fuzzy set.
Since the decision is based on all of the rules in the FIS, the rules are combined to make the decision in the fourth step. Aggregation is the process by which the fuzzy sets that represent the outputs of each rule are combined into a single fuzzy set. The input of the aggregation process is the list of fuzzy sets that represent the outputs of each rule. There are a number of different aggregation methods (e.g., max (maximum), sum (simply the sum of each rule's output set), probor, etc.) supported by the FIS (Rubens, 2006).
In the fifth and final step, the fuzzy set is defuzzified in order to resolve a single output value from the set. In this process, linguistic results obtained from the FIS are transformed into crisp numerical outputs (real values of variables) based on the predefined fuzzy rules in the fuzzy rule base (Biyikoglu et al., 2005; Kusan et al., 2010). Briefly, the input for the defuzzification process is the aggregated output fuzzy set and the output is a single number. In the relevant literature, several defuzzification techniques, such as centre of gravity (COG or centroid), bisector of area, mean of maxima, leftmost maximum and rightmost maximum, have been reported (Jantzen, 1999; Rubens, 2006).
In this study, the product (prod) technique was used as the inference operator due to its better performance in capturing all the relations among input and output fuzzy sets in the fuzzy rule base (Turkdogan-Aydinol and Yetilmezsoy, 2010; Yetilmezsoy et al., 2012; Yetilmezsoy, 2012). Moreover, the sum operator was used as the aggregation method in the proposed FIS, as in previous studies (Turkdogan-Aydinol and Yetilmezsoy, 2010; Yetilmezsoy et al., 2012; Yetilmezsoy, 2012). Furthermore, the centre of gravity (COG or centroid) method, which is the most commonly used defuzzification technique, was employed, as in several fuzzy-logic-based studies (Akkurt et al., 2004; Sadiq et al., 2004; Altunkaynak et al., 2005; Turkdogan-Aydinol and Yetilmezsoy, 2010; Yetilmezsoy et al., 2012; Yetilmezsoy, 2012). Considering the above-mentioned steps, a detailed schematic of the proposed prognostic approach to forecast PM10 levels in the Khaldiya area of Kuwait is illustrated in Fig. 3.
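The five steps described above can be sketched in a few lines of Python. This is a toy two-input, two-rule system, not the 146-rule FIS of the study: the fuzzy set breakpoints, the two rules and all names (`trapmf`, `RULES`, `infer_pm10`) are hypothetical, and prod is assumed here for both the AND operator and the implication, with sum aggregation and centroid defuzzification on a discrete grid.

```python
import numpy as np

def trapmf(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and shoulders b, c (a <= b <= c <= d)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    rise = np.clip((x - a) / (b - a), 0.0, 1.0) if b > a else (x >= a).astype(float)
    fall = np.clip((d - x) / (d - c), 0.0, 1.0) if d > c else (x <= d).astype(float)
    return np.minimum(rise, fall)

# Hypothetical fuzzy sets (illustrative breakpoints, not the values of the study).
WS_A, WS_B = (0.8, 0.8, 1.5, 2.0), (1.5, 2.0, 3.8, 3.8)          # wind speed, m/s
RH_A, RH_B = (10.0, 10.0, 18.0, 24.0), (18.0, 24.0, 39.0, 39.0)  # relative humidity, %
PM_A, PM_B = (0.0, 0.0, 100.0, 200.0), (100.0, 200.0, 500.0, 500.0)

# Hypothetical rule base: IF WS is .. AND RH is .. THEN PM10 is ..
RULES = [((WS_A, RH_B), PM_A),
         ((WS_B, RH_A), PM_B)]

PM_GRID = np.linspace(0.0, 500.0, 5001)  # discretized output universe

def infer_pm10(ws, rh):
    aggregated = np.zeros_like(PM_GRID)
    for (ws_set, rh_set), out_set in RULES:
        # Steps 1-2: fuzzify both inputs, combine with the prod AND operator.
        strength = trapmf(ws, *ws_set)[0] * trapmf(rh, *rh_set)[0]
        # Steps 3-4: prod implication scales the output set; sum aggregation adds it.
        aggregated += strength * trapmf(PM_GRID, *out_set)
    if aggregated.sum() == 0.0:
        return float("nan")  # no rule fired for this input
    # Step 5: centroid (centre of gravity) of the aggregated set.
    return float((PM_GRID * aggregated).sum() / aggregated.sum())

low_case = infer_pm10(1.0, 30.0)   # only the first rule fires
high_case = infer_pm10(3.0, 12.0)  # only the second rule fires
```

With prod implication the output set of a rule is scaled rather than truncated, so a weakly fired rule still contributes its full shape to the centroid, only with less weight.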

Selection of Membership Functions
In fuzzy-logic-based models, the shape of the membership functions of fuzzy sets can be triangular, trapezoidal, bell-shaped, sigmoidal, or another appropriate form, depending on the nature of the system being studied. Among them, triangular- and trapezoidal-shaped membership functions are predominant in current applications of fuzzy set theory, due to their simplicity in both design and implementation based on little information (Rihani et al., 2009; Yetilmezsoy, 2012; Yetilmezsoy et al., 2012). In this regard, several combinations of triangular (trimf) and trapezoidal (trapmf) shaped membership functions were pre-trained with different numbers of levels (i.e., 8, 10 and 15) to investigate the best-fit fuzzy-logic model structure for the present study. The measured data collected from Khaldiya residential area were arbitrarily classified into different fuzzy set categories with respective minimum and maximum values of the model variables. Then, different scalar ranges of both triangular and trapezoidal membership functions were tested until satisfactory outputs were obtained with respect to the set of rules used in the FIS, as in previous studies (Mitra et al., 1998; Turkdogan-Aydinol and Yetilmezsoy, 2010; Yetilmezsoy et al., 2012; Yetilmezsoy, 2012). Results of the preliminary analysis indicated that trapezoidal-shaped membership functions with ten levels for the input variables and fifteen levels for the output variable demonstrated the optimum prediction performance in estimation of PM10 levels at the studied area.
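For readers unfamiliar with these shapes, a minimal sketch of a trapezoidal membership function is given below. It follows the usual four-parameter definition (feet a and d, shoulders b and c); the function is written from that definition rather than copied from any toolbox, and the breakpoints in the example are hypothetical. Setting b equal to c yields the triangular shape as a special case.

```python
import numpy as np

def trapmf(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear in between."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    y = np.zeros_like(x)
    # Rising edge between the left foot a and the left shoulder b.
    rising = (x > a) & (x < b)
    y[rising] = (x[rising] - a) / (b - a)
    # Plateau between the two shoulders.
    y[(x >= b) & (x <= c)] = 1.0
    # Falling edge between the right shoulder c and the right foot d.
    falling = (x > c) & (x < d)
    y[falling] = (d - x[falling]) / (d - c)
    return y

# Hypothetical wind speed level with feet at 1.0 and 1.7 m/s.
trapezoid = trapmf([1.1, 1.35, 1.6, 2.0], 1.0, 1.2, 1.5, 1.7)
# Triangular special case: both shoulders at 1.25 m/s.
triangle = trapmf([1.0, 1.25, 1.5], 1.0, 1.25, 1.25, 1.5)
```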

Fuzzification of Input and Output Variables
In this study, the FIS (Fuzzy Inference System) Editor GUI (graphical user interface) in the Fuzzy Logic Toolbox within the framework of MATLAB® V7.0 (The MathWorks, Inc., USA, R14) software, running on a Pentium® 4 CPU (Intel® Atom™ Processor 3.00 GHz, 480 MB of RAM) PC, was used for modeling and simulation purposes. In the computational analysis, the input variables (wind speed, wind direction, relative humidity, solar radiation, methane, carbon monoxide and ozone) and the output variable (suspended dust concentration) were built using a Mamdani-type FIS Editor, and fuzzified with ten and fifteen trapezoidal membership functions, respectively. Fig. 4 shows the input and output variables in the MATLAB® numeric computing environment.
Methane (CH4) concentration ranged from 1.617 to 2.083 ppm on the X-axis. Fig. 5(a) depicts the shape and range of each level for the first input variable. Carbon monoxide (CO) concentration, the second input variable, ranged from 0.205 to 7.063 ppm, and the shape and range of its membership functions are illustrated in Fig. 5(b). Ozone (O3) concentration and wind speed (WS), considered as the seventh and the fourth input variables, ranged from 1.5 to 97.54 ppb, and from 0.82 to 3.73 m/s, respectively (Fig. 5(c) and Fig. 6(a)). Other input variables were fuzzified in the following ranges: wind direction (WD) = 58.41-300.9°, relative humidity (RH) = 10.77-38.69% and solar energy (SOLAR) = 0.038-0.875 kW/m². Shapes and ranges of the trapezoidal membership functions for these input variables (WD, RH and SOLAR) are depicted in Fig. 6.
Suspended dust concentration (PM10), the output variable of the proposed fuzzy-logic model, ranged from 35 to 2257.42 μg/m³, as shown in Fig. 7. According to the variation of the output data, ten of the trapezoidal-shaped membership functions (from A to J) were restricted to narrow ranges compared to the remaining five membership functions (from K to O). This made it possible to obtain better predictions for suspended dust concentrations lower than 500 μg/m³. Table 2 summarizes the number of trapezoidal membership functions (trapmf) and their ranks for each of the input and output variables considered in the present fuzzy-logic-based model.
In order to simplify processing of the implemented rules, the fuzzy set categories were defined in the form of letters (i.e., A, B, C, etc.) instead of long definitions such as moderately low, low, moderate, moderately high, high, very high, etc. In this regard, each input variable had ten trapezoidal-shaped membership functions, namely A, B, C, D, E, F, G, H, I and J. Likewise, the output variable consisted of fifteen trapezoidal-shaped membership functions, namely A, B, C, D, E, F, G, H, I, J, K, L, M, N and O. For instance, according to the ranges and codes given in Table 1, an experimental set of "methane concentration = 1.648 ppm, carbon monoxide concentration = 0.893 ppm, wind speed = 1.99 m/s, wind direction = 261.57°, relative humidity = 14.33%, solar energy = 0.6955 kW/m² and ozone concentration = 41.29 ppb" was coded as "A, B, D, I, B, H, E and J", respectively. Based on both the developed fuzzy set categories and the ranges of the existing measured data, a total of 146 rules were established in the IF-THEN format using the Fuzzy Rule Editor for the best-fit model structure (trapezoidal-shaped membership functions with ten levels for the input variables and fifteen levels for the output variable). For example, Table 3 presents the rule base of 25 rule sets randomly selected from the overall fuzzy sets built within the framework of MATLAB® software.
As mentioned above, fuzzy-logic-based models, called fuzzy inference systems (FIS), consist of a number of conditional "IF-THEN" rules. Although there are no universally accepted criteria that can be applied in all cases, for the designer who understands the system, these rules are easy to write, and as many rules as necessary can be supplied to describe the system adequately. However, the number of induced rules may become enormous and the rule description can be complex because of the number of variables. On the other hand, the rules will be easier to interpret if they are defined by the most influential variables, and the system behavior will be easier to understand as the number of rules gets smaller. Therefore, variable selection and rule reduction are two important steps of the rule generation process (Guillaume, 2001).
The measured data were imported directly from Microsoft® Excel, used as an open database connectivity data source, and then the regression analysis was conducted. As regression models were solved, they were automatically sorted according to the goodness-of-fit criteria into a graphical interface in the DataFit® numeric computing environment. Additionally, t-ratios and the corresponding p-values were computed to evaluate the significance of the regression coefficients. Descriptive statistics of the residual errors were also calculated for the appraisal of the multiple regression model performance. An alpha (α) level of 0.05 (or 95% confidence) was used to determine the statistical significance of the model components.

Measuring of the Goodness of the Estimate
Measuring the goodness of the estimate is an important part of model development, and it can be achieved by several visual and numerical methods (Akkoyunlu et al., ). Kolehmainen (2004) has reported that although visual methods help to get an intuitive hold of the model performance, numerical methods provide a more robust ground for comparing and enhancing the models in a scientific way. In this regard, various statistical indicators, such as the coefficient of determination (R²), mean absolute error (MAE), root mean square error (RMSE), systematic and unsystematic RMSE (RMSE_S and RMSE_U, respectively), index of agreement (IA), the factor of two (FA2), fractional variance (FV), proportion of systematic error (PSE), coefficient of variation (CV) and Durbin-Watson statistic (DW), were utilized as helpful mathematical tools to appraise the fit between the measured data and the estimated outputs. Some descriptive statistics were also checked using the StatsDirect (V2.7.2, Copyright© 1990-2008, StatsDirect Ltd.) statistical software package for the verification of the obtained results. Detailed definitions of these estimators can be found in several studies (Kolehmainen, 2004; Agirre-Basurko et al., 2006; Gomez-Sanchis et al., 2006; Appel et al., 2007; Ibarra-Berastegi et al., 2008; Yetilmezsoy and Sakar, 2008; Yetilmezsoy et al., 2009; Yetilmezsoy, 2011).
Yetilmezsoy and Abdul-Wahab, Aerosol and Air Quality Research, 12: 1217-1236, 2012
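Several of the indicators named in this section can be computed with a few lines of Python, as sketched below. The formulas are the commonly used textbook forms and are assumptions here, since the exact definitions adopted in the study are given only in the cited references: R² is taken as the squared Pearson correlation, RMSE_S and RMSE_U follow the Willmott decomposition based on a least-squares line of predictions against observations, IA is the Willmott index of agreement, FA2 is the fraction of predictions within a factor of two of the observations, and DW is the Durbin-Watson statistic of the residuals. The function name `fit_metrics` is hypothetical, and FV, PSE and CV are omitted.

```python
import numpy as np

def fit_metrics(obs, pred):
    """Goodness-of-fit indicators for paired observations and predictions."""
    o = np.asarray(obs, dtype=float)
    p = np.asarray(pred, dtype=float)
    err = p - o
    # Least-squares line of predictions against observations (Willmott decomposition).
    slope, intercept = np.polyfit(o, p, 1)
    p_hat = intercept + slope * o
    ratio = p / o  # assumes strictly positive observations, as for concentrations
    return {
        "R2": np.corrcoef(o, p)[0, 1] ** 2,
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "RMSE_S": np.sqrt(np.mean((p_hat - o) ** 2)),
        "RMSE_U": np.sqrt(np.mean((p - p_hat) ** 2)),
        "IA": 1.0 - np.sum(err ** 2)
              / np.sum((np.abs(p - o.mean()) + np.abs(o - o.mean())) ** 2),
        "FA2": np.mean((ratio >= 0.5) & (ratio <= 2.0)),
        "DW": np.sum(np.diff(err) ** 2) / np.sum(err ** 2),
    }

# Synthetic example (not data from the study).
metrics = fit_metrics([100, 150, 200, 250, 300], [110, 140, 210, 240, 310])
```

Under these definitions the systematic and unsystematic parts satisfy RMSE² = RMSE_S² + RMSE_U², which gives a quick consistency check on any reported triple of values.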

Prediction of Suspended Dust Concentration (PM10)
In this study, a prognostic approach based on the fuzzy-logic methodology and a multiple regression analysis were conducted to forecast suspended dust concentrations (PM10) in a specific residential area. In the multiple regression analysis, one exponential model and two first-order polynomial models were obtained for estimation of PM10 levels. Results are summarized in Table 4. Regression variable results, including the standard error, the t-statistics and the corresponding p-values for the best-fit regression model (herein the exponential model), are given in Table 5. The exponential model derived as a function of seven input variables [PM10 = f(CH4, CO, WS, WD, RH, SOLAR, O3)], including four meteorological parameters (wind speed, wind direction, relative humidity and solar radiation) and ambient concentrations of three gaseous pollutants (methane, carbon monoxide and ozone), is expressed as follows:
PM10 = exp[4.591(CH4) + 0.187(CO) + 0.526(WS) - 0.0047(WD) + 0.068(RH) + 1.056(SOLAR) + 0.0126 …
It is reported that a larger t-ratio indicates a more significant parameter in the regression model. Moreover, the variable with the lowest p-value is considered the most significant (Yetilmezsoy and Sapci-Zengin, 2009). According to the t-ratios in Table 4, the relative humidity, methane concentration and wind speed have more importance than other variables for the derived exponential model in prediction of suspended dust concentration. Looking at the p-values (Table 4), it can also be seen that all p-values are less than the alpha (α) level of 0.05 (or 95% confidence), indicating the statistical significance of all components in the regression model. Scatter plots of PM10 concentration as a function of each of the predictor variables are illustrated in Fig. 8.
Consequently, all variables exhibited a certain importance, indicating that they should not be eliminated from the models. It is noted that the physicochemical aspects (i.e., photochemical reactions, atmospheric dispersion, the adsorption effect, automotive emission, the photolytic cycle, etc.) of the effects of meteorological conditions and gaseous pollutants on PM10 concentrations are fully discussed in previous studies (Van der Wal and Janssen, 2000; Monn, 2001; Abdul-Wahab and Al-Alawi, 2002; Rizzo et al., 2002; Al-Salem, 2008; Giri et al., 2008; Barmpadimos et al., 2012).
Fig. 9 shows a head-to-head comparison of the performances of the multiple regression-based models in the prediction of PM10 levels. Although the exponential model (multiple regression model-1) produced smaller deviations compared to the first-order polynomial models (multiple regression model-2 with constant term and multiple regression model-3 without constant term), in general, the multiple regression-based methodology showed a poor prediction performance on the measured data, with high residual errors. Considering the overall performances, the conventional regression approach did not yield predictions of the PM10 levels as good as those of the proposed fuzzy-logic-based model (Fig. 10).
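To make the form of such an exponential model concrete, the sketch below fits y = exp(b0 + b1·x1 + … + bk·xk) by ordinary least squares on ln(y). This log-linear fit is a swapped-in technique chosen for simplicity; it is not claimed to be the estimation procedure used by the regression software in the study, and it weights errors differently from a direct non-linear fit. The intercept b0, the function names, the coefficients and the synthetic data are all hypothetical; only the wind speed and relative humidity ranges are taken from the text.

```python
import numpy as np

def fit_exponential(X, y):
    """Fit y = exp(b0 + X @ b) by least squares on ln(y); returns [b0, b1, ..., bk]."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])  # intercept column plus predictors
    coef, *_ = np.linalg.lstsq(A, np.log(np.asarray(y, dtype=float)), rcond=None)
    return coef

def predict_exponential(coef, X):
    X = np.asarray(X, dtype=float)
    return np.exp(coef[0] + X @ coef[1:])

# Synthetic, noise-free data for two predictors (WS in m/s, RH in %).
rng = np.random.default_rng(0)
X = rng.uniform([0.82, 10.77], [3.73, 38.69], size=(200, 2))
true_coef = np.array([3.0, 0.5, 0.05])  # hypothetical b0, b_WS, b_RH
y = np.exp(true_coef[0] + X @ true_coef[1:])

coef = fit_exponential(X, y)
```

Because the response must be strictly positive for the logarithm to exist, this approach suits concentration data such as PM10 but would need adjustment for variables that can be zero.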

Model Testing and Validation
To validate the prediction capability of the models, a testing data set was used to test each of the developed models (fuzzy-logic and exponential). The resulting predictions were then compared with the actual results, and various statistical measures (i.e., R², MAE, RMSE, RMSE_S, RMSE_U, PSE, IA, FV, FA2, CV and DW) were calculated. Results are summarized in Table 6. In this study, the fuzzy-logic-based model was developed based on a total of 146 rules in the IF-THEN format and tested against 45 additional observations (Fig. 11). Looking at the testing outputs and deviations of the developed models (Fig. 11), it can be concluded that the proposed fuzzy-logic model demonstrated a very satisfactory performance in the prediction of PM10 concentrations compared to the multiple regression approach.
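Several of the indices listed above can be computed directly from paired observed and predicted values. The following is a minimal sketch using common textbook definitions, which are assumed here and may differ in detail from the formulas adopted in Table 6: R² is taken as the squared Pearson correlation, IA as Willmott's index of agreement, and FA2 as the fraction of predictions within a factor of two of the observations. The function name and the sample values are hypothetical.

```python
import math


def performance_indices(observed, predicted):
    """Return MAE, RMSE, R2, IA and FA2 for paired observations/predictions.

    Both series must be non-constant, otherwise R2 and IA are undefined.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    pairs = list(zip(observed, predicted))
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in pairs]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # R2 taken here as the squared Pearson correlation coefficient.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in pairs)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov * cov / (var_o * var_p)

    # Willmott's index of agreement (1 means perfect agreement).
    denom = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in pairs)
    ia = 1.0 - sum(e * e for e in errors) / denom

    # FA2: share of predictions within a factor of two of the observation.
    fa2 = sum(1 for o, p in pairs if o > 0 and 0.5 <= p / o <= 2.0) / n

    return {"MAE": mae, "RMSE": rmse, "R2": r2, "IA": ia, "FA2": fa2}


# Hypothetical values for illustration only (not data from this study).
indices = performance_indices([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```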
As also seen in Table 6, the descriptive performance indices revealed that the fuzzy-logic-based model produced very small deviations and demonstrated a superior predictive performance compared to the conventional multiple regression-based method. The values of the determination coefficient (R² > 0.99 for both the overall data and the testing data) indicated that only about 1% of the total variation was not explained by the fuzzy-logic model in the estimation of PM10 levels. However, for the best-fit multiple regression model (exponential model), about 33.5% and 24.4% of the total variation was not explained for the overall data set (R² = 0.665) and the testing data set (R² = 0.756), respectively.
Additionally, the values of PSE (0.058 and 0.314) and FV (0.018 and 0.029) obtained using the fuzzy-logic model demonstrated greater accuracy than the multiple regression approach in the forecast of PM10 concentration. Moreover, the lowest values of MAE (17.78 and 19.83),

Table 3. A random selection of 25 rule sets from the total 146 sets.

b p-values < 0.05 were considered to be significant.

Yetilmezsoy and Abdul-Wahab, Aerosol and Air Quality Research, 12: 1217-1236, 2012

For the present case, the DW statistics (DW = 1.979 for the overall data and 2.082 for the testing data) were determined to be very close to 2, indicating the goodness-of-fit of the fuzzy-logic model (Hewings et al., 2002). On the basis of the above-mentioned results, this study has clearly indicated a simple means of modeling and the potential of the artificial intelligence-based approach for capturing complicated inter-relationships between suspended dust and other factors in a highly non-linear air pollution problem at a particular location.
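The Durbin-Watson statistic referred to above is the ratio of the sum of squared successive differences of the residuals to the sum of squared residuals. A minimal sketch is given below; the function name and the sample residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals in their sequential order."""
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero; the statistic is undefined")
    return numerator / denominator


# Hypothetical residuals: perfectly alternating signs give a value above 2,
# identical residuals give 0.
dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
```

The statistic ranges from 0 to 4. Values near 2 indicate no first-order autocorrelation in the residuals, values toward 0 indicate positive autocorrelation, and values toward 4 indicate negative autocorrelation.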
Finally, it is important to note that although process-based deterministic approaches may give a good insight into the mechanism at steady-state conditions, re-calibration of these models is extremely time-consuming and difficult for PM10-related air pollution problems in different periods of time. On the other hand, calibration of artificial intelligence-based models (i.e., the fuzzy-logic model) is considerably easier than that of white-box models, as fewer parameters are used in the model development process. Since model calibration should include comparisons between model-simulated conditions and the observed data from field conditions, it is important to minimize the difference between model simulations and field conditions. In this regard, it is noted that re-calibration of fuzzy-logic-based models needs the assignment of specific thresholds (i.e., residuals should be less than 5-10 percent of the variability) and membership degrees by decision makers and experts. Even if the circumstances change or the calibrated model is applied in a different period of time, this procedure can be handled in a straightforward manner by using efficient computational methods and user-friendly artificial intelligence-based software solutions.
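The quantities that such a re-calibration adjusts, namely the membership-function breakpoints and the rule base, can be illustrated with a minimal Mamdani-type sketch using trapezoidal membership functions, the product operator and centroid defuzzification. The sketch is not the model developed in this study: it uses only two inputs and two rules, and all breakpoints, ranges, set names and rules are hypothetical values chosen for illustration.

```python
def trapmf(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and shoulders b, c (a<=b<=c<=d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical fuzzy subsets; re-calibration would adjust these breakpoints.
WS_SETS = {"low": (0, 0, 2, 5), "high": (3, 6, 10, 10)}        # wind speed
RH_SETS = {"dry": (0, 0, 30, 60), "humid": (40, 70, 100, 100)}  # humidity
PM_SETS = {"low": (0, 0, 50, 120), "high": (80, 200, 300, 300)}  # output
PM_MIN, PM_MAX = 0.0, 300.0

# Hypothetical rules: IF WS is <set> AND RH is <set> THEN PM10 is <set>.
RULES = [("low", "dry", "high"), ("high", "humid", "low")]


def mamdani_predict(ws, rh, n_points=301):
    """Product inference, max aggregation and centroid defuzzification."""
    # Firing strength of each rule: product of the antecedent memberships.
    fired = [
        (trapmf(ws, *WS_SETS[ws_set]) * trapmf(rh, *RH_SETS[rh_set]), out_set)
        for ws_set, rh_set, out_set in RULES
    ]
    numerator = denominator = 0.0
    for i in range(n_points):
        y = PM_MIN + (PM_MAX - PM_MIN) * i / (n_points - 1)
        # Scale each consequent by its firing strength, then aggregate by max.
        mu = max(w * trapmf(y, *PM_SETS[out_set]) for w, out_set in fired)
        numerator += y * mu
        denominator += mu
    if denominator == 0.0:
        return None  # no rule fired for these inputs
    return numerator / denominator  # discrete centroid of the aggregated set


# Hypothetical inputs for illustration only.
dusty = mamdani_predict(1.0, 20.0)
clean = mamdani_predict(8.0, 90.0)
```

In this toy setting, changing a breakpoint in `PM_SETS` or adding a rule to `RULES` is the whole of the re-calibration step, which is the sense in which such models have few parameters to tune by hand.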

CONCLUSIONS
An artificial intelligence-based approach has been applied to develop a prognostic model that can make reliable predictions of PM10 levels in a specific residential area with high traffic and industrial influences. For seven fundamental model components (wind speed, wind direction, relative humidity, solar radiation, methane, carbon monoxide and ozone), the proposed prognostic approach based on the fuzzy-logic methodology has shown precise and very effective predictions compared to the conventional multiple regression-based method. This study has indicated that the fuzzy-logic-based methodology is well suited to, and gave promising results for, the modeling of a highly non-linear air pollution problem at a particular location.

Fig. 1. Location of Khaldiya residential area in relation to Kuwait City and coordinates of the mobile air pollution monitoring laboratory situated in the area.

Fig. 3. A detailed flowchart of the MISO fuzzy-logic methodology implemented in this study.

Fig. 4. Input and output variables considered for the proposed fuzzy inference system (FIS).

Fig. 8. Scatter plots of PM10 concentration as a function of each of the predictor variables.

Fig. 9. A head-to-head comparison of performances for measured data, fuzzy-logic outputs and the multiple regression models (exponential, polynomial with constant term and polynomial without constant term, respectively) by means of suspended dust concentration (PM10).

Table 1. Methods, ranges and accuracy of the measurements conducted in the mobile laboratory.

Table 2. Number of trapezoidal membership functions (trapmf) and their ranks for each of the input and output variables considered in the present fuzzy sets.

Table 4. Summary of the multiple regression-based results.

Table 6. Descriptive statistical performance indices for the data sets considered in the present prognostic approach.
a O, P, m and reg are the subscripts indicating the observed, predicted, mean and regression, respectively.
a Fuzzy-logic model.
b Multiple regression model.