Temporal Dynamics of Air Bacterial Communities in a University Health Centre Using Illumina MiSeq Sequencing

Bacterial contamination of air may have human health implications through the transmission of potential human pathogens. Therefore, assessment of air bacterial abundance and composition in different built environments is essential. The Jawaharlal Nehru University health centre (UHC) is a primary healthcare setting providing need-based medication to university students. Using an active air sampling method, we collected eight air samples from the indoor and outdoor areas of the UHC across four different seasons. The total genomic DNA was extracted from the air samples and subjected to 16S rRNA gene-based next-generation sequencing. We performed taxonomic classification along with comparative analysis of the air bacterial communities. This study revealed that Proteobacteria, Actinobacteria, Bacteroidetes and Firmicutes are the dominant phyla in the sampled air. Overall, the air bacterial composition in our studied samples was comparatively simple; only ten taxonomic families accounted for ~75% of the total sequences determined. We also observed ESKAPE pathogens in the air metagenomes at a low percentage (4.42%); these were dominated by Pseudomonas, Acinetobacter and Staphylococcus. Proteobacteria, Actinobacteria and Firmicutes showed significant correlations with PM2.5. We suggest that routine air monitoring and microbiological surveys are essential for maintaining air quality standards and detecting potential human pathogens in health care settings. This is the first report from India to uncover the temporal dynamics of air bacterial communities in a UHC using Illumina MiSeq (PE300) sequencing and Quantitative Insights into Microbial Ecology (QIIME).


INTRODUCTION
The University Health Centre (UHC) provides primary health care services to university students and residents of the campus of Jawaharlal Nehru University (JNU). When unwell, people visit the UHC for medical services such as consultation and laboratory investigation. The UHC microenvironment may contain aerosolised microorganisms, which may have a potential health impact on patients as well as healthcare personnel. Therefore, the study of airborne bacterial diversity is important for public health assessment and infection control measures (Lax and Gilbert, 2015). The most frequently reported microorganisms involved in nosocomial infections are Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species (ESKAPE pathogens), and most of them show multidrug resistance (Santajit and Indrawattana, 2016).
Previous studies have characterised the airborne microbes based on traditional culture method in the hospital environment (Qudiesat et al., 2009; Gilbert et al., 2010; Sudharsanam et al., 2012; Matouskova and Holy, 2013; Park et al., 2013; Gaüzère et al., 2014; Frías-De León et al., 2016). The culture-based methods only estimate a low proportion of total bacterial diversity in a particular environment (Pace, 1997). Therefore, culture-based studies fail to assess actual bacterial diversity of the hospital environment. To overcome these limitations, we must rely on culture-independent methods to determine the total bacterial diversity prevailing in a particular environment. In recent years, quantitative polymerase chain reaction (qPCR) and next-generation sequencing (NGS) have been widely used to characterise bacteria in environmental samples using 16S rRNA gene (Lee et al., 2010; Bertolini et al., 2013). The NGS has emerged as a better tool for estimation of bacterial diversity in different indoor and outdoor environments (Jiang et al., 2015; Gao et al., 2017). Previously, pyrosequencing has been applied to explore the bacterial diversity in a healthcare setting (Luna et al., 2007; Poza et al., 2012). Illumina MiSeq (PE300) sequencing provides a more accurate estimation of the bacterial community dynamics compared to previously used DNA sequencing techniques.
* Corresponding author. Tel.: 91-11-26704307 E-mail address: kasturim@mail.jnu.ac.in
In India, various research groups have studied the microbial contaminants present in air samples of different indoor and outdoor environments. In previous studies carried out in Rajasthan (Yadav et al., 2007), Mumbai (Gangamma et al., 2011), Nagpur (Jagzape et al., 2013), Tamil Nadu (Srikanth et al., 2008; Sudharsanam et al., 2012), Gwalior (Yadav et al., 2015), Agra (Mamta et al., 2015), Chennai (Valsan et al., 2015; Priyamvada et al., 2018), Munnar (Valsan et al., 2015) and Kolkata (Debasmita, 2011), researchers have studied biogenic materials by traditional culture, microscopy, PCR and MALDI techniques. Some researchers also characterised microbes from different indoor areas in hospital settings (Sudharsanam et al., 2012; Bajpai et al., 2014; Paul et al., 2015). In addition, various research groups in Delhi (Yadav et al., 2007; Srivastava et al., 2012; Balyan et al., 2017; Lal et al., 2017; Balyan et al., 2019) have tried to characterise different biogenic components (pollens, bacteria and fungi) of aerosol using culture-based and morphological identification methods. Srivastava and colleagues collected air samples from the JNU health centre and used a culture-based method to detect Gram-positive and Gram-negative bacteria. Because the authors identified cultured bacteria based on their morphology and cell shape, they could not report the actual bacterial abundance and diversity (Srivastava et al., 2012). As NGS and metagenomics are emerging fields in aerobiological research, their application will provide valuable information that might otherwise be missed in these culture-based studies.
As no data are available in the Indian scenario on air bacterial communities in health care settings studied using advanced 16S rRNA gene-based Illumina MiSeq sequencing, we have attempted to characterise the air bacterial communities from the indoor and outdoor areas of the UHC using this approach. For this purpose, air samples were collected using a microbial air sampler during four different seasons: spring, monsoon, winter and summer. Next, the abundance and composition of bacterial communities present in the collected air samples were estimated. To gain further knowledge of air bacterial diversity, α- and β-diversity were estimated. The abundance and composition of aerosolised nosocomial pathogens, with a special focus on ESKAPE pathogens, were also investigated. Finally, correlations between the dominant phyla and the environmental factors (temperature, relative humidity, PM2.5 and PM10) were analysed.

Air Sample Collection and Meteorological Conditions
We collected air samples from the indoor and outdoor areas of the JNU Health Centre (28.54°N, 77.16°E) from March 2018 to June 2019 in four seasons, namely spring, monsoon, winter and summer (Fig. S1). The samples were collected in 12 mL of sterile collection liquid (PBS buffer, pH 7.4) using a microbial air sampler (Coriolis micro air sampler, Bertin Technologies, France) at an average flow rate of 300 L min⁻¹ for two hours; each sampling consisted of 12 runs of 10 min each, with a one-minute pause between runs for refilling the collection cone. All the sampler fittings were sterilised before sample collection to avoid potential contamination. After collection, the samples were maintained at 4°C and transported to the laboratory within an hour. Because air is continuously moving, the long sampling time and high flow rate of the sampler, combined with Illumina MiSeq sequencing, could minimise the effect of taking a single air sample per season on our determination of bacterial abundance and community composition.
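The sampling parameters above imply a total air volume per sample, which is useful when comparing with studies that use other samplers. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative, and the calculation assumes the average flow rate held throughout all runs.

```python
# Sampling parameters as stated in the text above.
FLOW_RATE_L_PER_MIN = 300   # average flow rate of the sampler
RUNS = 12                   # runs per sampling event
RUN_MINUTES = 10            # duration of each run, excluding refill pauses


def sampled_air_volume_m3(flow_l_per_min, runs, run_minutes):
    """Return the total air volume drawn through the sampler, in cubic metres."""
    total_minutes = runs * run_minutes
    litres = flow_l_per_min * total_minutes
    return litres / 1000.0  # 1 m^3 = 1000 L


total_minutes = RUNS * RUN_MINUTES
volume_m3 = sampled_air_volume_m3(FLOW_RATE_L_PER_MIN, RUNS, RUN_MINUTES)
print(total_minutes, volume_m3)  # prints: 120 36.0
```

Under these assumptions each sample represents about 36 m³ of air collected over 120 minutes of active sampling.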

DNA Extraction and PCR Amplification
The collected air samples (now hydrosol) were concentrated by centrifugation at 12,000 × g for 60 min at 4°C (Jiang et al., 2015), and the resulting pellet was resuspended in 4 mL of PBS buffer (pH 7.4). The genomic DNA was extracted using a DNeasy PowerLyzer kit as per the manufacturer's protocol (Qiagen, India). The concentration and quality of the extracted DNA were measured with a NanoDrop Spectrophotometer ND-2000c (Thermo Fisher Scientific, USA).

Illumina MiSeq Library Preparation, Cluster Generation and Sequencing
The quality of the PCR-amplified DNA was determined using a NanoDrop spectrophotometer. The amplicon libraries were prepared using the Nextera XT Index Kit (Illumina Inc.). The 16S rRNA gene-specific forward primer (5'-GCCTACGGGNGGCWGCAG-3') and reverse primer (5'-ACTACHVGGGTATCTAATCC-3') were used to amplify the bacterial V3-V4 regions (Klindworth et al., 2013). The paired-end libraries (2 × 300) were sequenced following the 16S metagenomic sequencing library preparation protocol (Part# 15044223 Rev.B). All the unassembled, high-throughput sequencing reads obtained were deposited in the NCBI Sequence Read Archive (SRA) under BioProject accession number PRJNA589998. Trimmomatic v0.38 (Bolger et al., 2014) and FLASH (v1.2.11) (Magoč and Salzberg, 2011) were used to obtain high-quality clean reads and to stitch the paired-end data into single reads based on their overlap. The merged sequences remaining after removal of adapter sequences, ambiguous reads and low-quality sequences were defined as 'trimmed sequences', from which chimaeras were filtered out using UPARSE (Edgar, 2013).
We used the Quantitative Insights into Microbial Ecology (QIIME) pipeline for post-sequencing analysis (Caporaso et al., 2010). We picked Operational Taxonomic Units (OTUs) at 97% sequence similarity among reads and chose a representative sequence from each OTU, which was compared against the Greengenes database (v13_8) (McDonald et al., 2012) for taxonomic identification of the OTU. An OTU table was generated containing the representative sequences of the OTUs with the corresponding taxonomic ranks assigned against the Greengenes database. We used the alpha_diversity.py and alpha_rarefaction.py workflow scripts to calculate the alpha diversity (Shannon, Simpson and Chao1) and to generate rarefaction curves, respectively. The QIIME script beta_diversity_through_plots.py was used to calculate the beta diversity for comparing the different bacterial communities, illustrated as PCoA plots using weighted UniFrac diversity metrics (Lozupone and Knight, 2005). We also calculated Spearman's rank correlation coefficient (SRCC), ρ, between the different bacterial phyla and the environmental factors (temperature, relative humidity, PM2.5 and PM10).
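To make the three alpha diversity indices named above concrete, the following is a minimal pure-Python sketch of how they can be computed from the OTU counts of one sample. It is not the QIIME implementation. The conventions used here (Shannon with a base-2 logarithm, Simpson as one minus the sum of squared proportions, and the bias-corrected form of Chao1) are assumptions about common practice, and the counts are hypothetical rather than data from this study.

```python
import math


def shannon(counts, base=2.0):
    """Shannon index H = -sum(p_i * log(p_i)); base 2 is an assumed convention."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


def simpson(counts):
    """Simpson index expressed as 1 - sum(p_i^2), so higher means more diverse."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2.0 * (doubletons + 1))


# Hypothetical OTU counts for two samples (illustrative only).
even_sample = [10, 10, 10, 10]
skewed_sample = [5, 3, 1, 1, 2]

print(shannon(even_sample), simpson(even_sample), chao1(even_sample))
print(shannon(skewed_sample), simpson(skewed_sample), chao1(skewed_sample))
```

A perfectly even sample of four OTUs gives a Shannon index of 2 bits and a Simpson index of 0.75. Chao1 exceeds the observed richness only when singletons are present, as in the second sample.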

Meteorological Conditions and Particulate Matter (PM) Concentration
The average temperature (T), relative humidity (RH), PM2.5 and PM10 varied across the sampling seasons. The average T and RH during the sampling periods were (28.05 ± 8.14)°C and (49.5 ± 17.26)%, respectively (Table S1 in Supplementary material). The concentrations of PM2.5 and PM10 ranged from 13.53 µg m⁻³ to 477.6 µg m⁻³ and from 80.17 µg m⁻³ to 739.02 µg m⁻³, respectively (Table S1 in Supplementary material). Sample codes combine the season (SP, spring; M, monsoon; W, winter; SU, summer) with the location (I, indoor; O, outdoor). The PM2.5 concentration was highest in SUI and lowest in MI. On the other hand, the PM10 concentration was highest in WI owing to the thick and persistent fog conditions during winter, which facilitate the prolonged suspension of these particulates in the air. The rain in monsoon, however, brought down the concentration of particulate matter, including both PM2.5 and PM10.

Richness and Diversity Analysis across four Different Seasons
Based on the Shannon and Simpson diversity indices, the bacterial diversity of indoor air was highest in SUI and lowest in MI. Furthermore, the bacterial diversity of outdoor air was highest in SUO and lowest in SPO (Table S3 in Supplementary material), which was also depicted by the rarefaction curves (Fig. S2 in Supplementary material). The Shannon indices of the collected air samples ranged from 3.2 to 9.1. The Shannon indices of SPO and MI were close to each other and were the lowest among the eight studied samples. Similarly, the Shannon indices of the WO and SPI samples were close to each other (Table S3 in Supplementary material). The rarefaction curves reached saturation, indicating that most of the bacterial community was covered by the sequences obtained by MiSeq PE300 (Fig. S2 in Supplementary material).
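The idea of a rarefaction curve reaching saturation can be illustrated with the analytical expectation of richness in a random subsample. This is a sketch only: QIIME's alpha_rarefaction.py works by repeated random subsampling rather than by this closed-form expression, and the OTU counts below are hypothetical, not data from this study.

```python
from math import comb


def expected_richness(counts, depth):
    """Expected number of OTUs seen in a random subsample of `depth` reads."""
    total = sum(counts)
    if not 0 <= depth <= total:
        raise ValueError("depth must lie between 0 and the total read count")
    all_draws = comb(total, depth)
    # An OTU with c reads is missed only when every drawn read comes from
    # the remaining total - c reads (sampling without replacement).
    return sum(1 - comb(total - c, depth) / all_draws for c in counts if c > 0)


# Hypothetical OTU counts for one sample (illustrative only).
toy_counts = [50, 30, 10, 5, 3, 1, 1]

for depth in (1, 10, 50, 100):
    print(depth, round(expected_richness(toy_counts, depth), 2))
```

The curve flattens as the depth approaches the total read count. A curve that is still rising steeply at the maximum depth would indicate that rare OTUs remain unsampled.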
We classified the obtained sequences into various taxa ranging from phylum to genus. A total of 34 phyla, 96 classes, 182 orders, 339 families, and 749 genera were retrieved (Table S4 in Supplementary material). For the different air samples, the numbers of taxa determined at the phylum, class, order, family and genus levels were in the range of 16-27, 39-78, 56-146, 115-283 and 220-604, respectively. A proportion of the sequences could not be assigned to any known group, indicating the occurrence of novel sequences. The proportions of unclassified taxa were 0.003%-0.162%, 0.045%-1.559%, 0.795%-9.353% and 10.789%-43.634% for class, order, family and genus, respectively (Table S5 in Supplementary material).

Bacterial Community Composition at Different Taxonomic Levels
Proteobacteria was the most dominant phylum in all air samples, ranging from 39.21% to 78.04% of the sequences in the respective library of each air sample. Actinobacteria, Bacteroidetes and Firmicutes were also found in all air samples, but at variable relative abundances in the sequences of each library (Fig. 1). Our findings are consistent with previous studies showing that Proteobacteria, Actinobacteria and Firmicutes were the most abundant microbes in hospital air (Du et al., 2018; Gao et al., 2018). Proteobacteria, Actinobacteria and Firmicutes were also the dominant phyla in urban PM2.5 samples collected across the seasons (Du et al., 2018; Li et al., 2019).
The 52 dominant genera (> 1% abundance in at least one sample) were selected from a total of 750 genera, as shown in Fig. 5. Psychrobacter, Moraxellaceae unclassified and Arthrobacter showed the highest abundance, with average abundances of 24.66%, 7.69% and 6.40%, respectively. Psychrobacter was the most abundant genus in all samples except MI, MO and SUO. The genus Psychrobacter, a member of the family Moraxellaceae, is a widespread and evolutionarily successful bacterial group, the biology of which may provide better insights into environmental adaptation and survival (Bowman, 2006). Moraxellaceae unclassified was abundant in all samples except SUO. Arthrobacter was abundant in all samples except WI. Members of the genus Arthrobacter, which belongs to the family Micrococcaceae, have been isolated from soil and sediments, but some clinical isolates have also been reported (Busse and Wieser, 2014). Another 16 genera with average abundance more than 1% of the total sequences were Exiguobacterium (4.69%), Gillisia (3.96%), Planococcus  . Moreover, Staphylococcus (51%) and Micrococcus (37%) were dominant among the bacterial genera in a Portuguese hospital (Cabo Verde et al., 2015). A culture-based study of a tertiary care centre in Central India reported coagulase-negative Staphylococci (CoNS) followed by Bacillus, Staphylococcus aureus and Pseudomonas aeruginosa in air samples (Bajpai et al., 2014). Micrococci, CoNS, Enterobacter and Pseudomonas were predominant in hospital air collected from west Chennai, India using passive and active methods (Sudharsanam et al., 2012). Hospital indoor air samples from Kalyani, West Bengal, India showed the predominance of Escherichia coli, Staphylococcus aureus, Pseudomonas aeruginosa and Klebsiella sp. (Paul et al., 2015).
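The selection criterion for dominant genera (more than 1% relative abundance in at least one sample) and the average abundance reported for each genus can be sketched as follows. The table, genus labels and function names are hypothetical and serve only to illustrate the filtering rule; they are not data from this study.

```python
def dominant_taxa(abundance_pct, threshold=1.0):
    """Return taxa exceeding `threshold` percent in at least one sample."""
    return sorted(
        taxon
        for taxon, per_sample in abundance_pct.items()
        if any(value > threshold for value in per_sample)
    )


def mean_abundance(abundance_pct):
    """Average the percent relative abundance of each taxon across samples."""
    return {
        taxon: sum(per_sample) / len(per_sample)
        for taxon, per_sample in abundance_pct.items()
    }


# Hypothetical percent relative abundances in three samples (illustrative only).
toy_table = {
    "Genus_A": [30.0, 12.5, 0.4],
    "Genus_B": [0.2, 0.3, 1.8],
    "Genus_C": [0.1, 0.0, 0.9],
}

selected = dominant_taxa(toy_table)
averages = mean_abundance(toy_table)
print(selected)
```

Note that a genus such as Genus_B is retained even though its average abundance is below 1%, because the criterion is applied per sample rather than to the mean.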
Principal coordinate analysis (PCoA) based on the weighted UniFrac distance matrix was applied to compare the bacterial communities across the air samples. The first two principal coordinates, P1 and P2, explained 49.12% and 19.55% of the variation in overall community structure, respectively (Fig. 6). Samples WO and SPO tended to cluster together, whereas the remaining samples lay far from this cluster and did not group with one another. This plot showed that the majority of the samples were distinct from each other (Fig. 6).

Correlation between Bacterial Community Structure, Meteorological Parameters (T, RH) and Particulate Matter (PM2.5 and PM10)
Airborne microbes are a vital component of particulate matter and have a potential impact on human health (Zhai et al., 2018). Owing to their small aerodynamic size, PM2.5 and PM10 loaded with airborne pathogens can directly enter the human respiratory system, resulting in respiratory illnesses (Zhai et al., 2018). Here we analysed the correlation between bacterial community structure (at the phylum level), meteorological parameters (T, RH) and respirable particulate matter (PM2.5 and PM10) concentration by calculating the SRCC across different seasons (Table S2 in the supplementary material). Actinobacteria showed a positive correlation with temperature. Proteobacteria showed a positive correlation with PM2.5 concentration (ρ = 0.857, p < 0.05), suggesting that the relative abundance of Proteobacteria increases with the concentration of PM2.5. Proteobacteria has frequently been reported as a dominant phylum of PM2.5 in China (Gao et al., 2017; Gao et al., 2018; Li et al., 2019; Pan et al., 2019). Interestingly, Actinobacteria showed a negative correlation with both PM2.5 (ρ = -0.893, p < 0.01) and PM10 (ρ = -0.964, p < 0.01). Firmicutes showed a negative correlation with PM2.5 concentration only (ρ = -0.893, p < 0.01). These negative correlations can be explained by the interphyla correlations given in Table S7 in the supplementary material, where Proteobacteria showed a negative correlation with Actinobacteria (ρ = -0.762, p < 0.05). Since all the bacterial communities existed simultaneously in the air at the time of sample collection, the relative abundances of all bacterial phyla in a given sample must sum to 100%. Therefore, the increase in the relative abundance of Proteobacteria with increasing PM concentration might indirectly result in the decrease in the relative abundance of Actinobacteria with increasing PM concentration.
All remaining phyla showed no significant correlation with meteorological parameters or particulate matter concentration. The SRCC analysis can describe the relationship between the bacterial community structure, meteorological parameters and particulate matter concentration, but it cannot estimate the relative importance of these factors in shaping the bacterial community structure (Gao et al., 2017).
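The closure argument above (relative abundances sum to 100%, so a rise in one phylum forces a fall in the others) can be demonstrated with a small pure-Python sketch of Spearman's ρ. The values are hypothetical and deliberately simplified to two phyla; the sketch computes ρ only, not the p-values reported in the text, and it is not the software used in this study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for eight samples (illustrative only).
pm25 = [10, 20, 30, 40, 50, 60, 70, 80]
phylum_a = [40, 45, 50, 55, 60, 65, 70, 75]   # percent, rises with PM2.5
phylum_b = [100 - a for a in phylum_a]        # closure: the two sum to 100%

rho_a = spearman_rho(pm25, phylum_a)
rho_b = spearman_rho(pm25, phylum_b)
print(round(rho_a, 3), round(rho_b, 3))
```

In this two-phylum toy case the second phylum is perfectly negatively correlated with PM2.5 purely because of closure, which is why negative correlations in compositional data should be interpreted with care.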

Distribution of ESKAPE Pathogens
Nosocomial infections are also known as hospital-associated infections (HAIs). The primary causative agents of HAIs are Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa and Enterobacter spp. (in short, ESKAPE pathogens). Some of the nosocomial pathogens show multi-drug resistance (MDR) owing to improper use of broad-spectrum antibiotics in healthcare settings (Khan et al., 2015). In this study, we obtained 22056 ESKAPE sequences, accounting for 4.42% of the total 498671 sequences. The number of ESKAPE sequences ranged from 90 to 10323 across the different air samples. It was lowest in SPO, WO, WI and SPI, moderate in SUI and MO, and highest in SUO and MI (Table S6 in the supplementary material). A plausible explanation is that the climatic conditions during spring and winter are unfavourable for bacterial growth, which would account for the lowest pathogen load in those seasons. In contrast, during the summer and monsoon seasons, the pathogen load was higher, which we attribute to climatic conditions that are optimal for bacterial growth.
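The reported ESKAPE share follows directly from the two read counts given above. A minimal sketch of that arithmetic is shown here; the function name is illustrative.

```python
# Read counts as stated in the text above.
ESKAPE_READS = 22056
TOTAL_READS = 498671


def percent_of_total(part, total):
    """Return `part` as a percentage of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


eskape_pct = percent_of_total(ESKAPE_READS, TOTAL_READS)
print(f"{eskape_pct:.2f}%")  # prints: 4.42%
```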
Overall, we found 31 pathogenic species (known and unclassified), and their relative abundances are shown in Fig. 7(a). Among the 31 species, the relative abundances of unclassified species of Pseudomonas, Acinetobacter and Staphylococcus were high, at 57.27%, 18.24% and 14.35%, respectively. Airborne species of Staphylococcus and Pseudomonas showing multi-drug resistance have also been reported from a neonatal intensive care unit (Morgado-Gamero et al., 2019). Five ESKAPE pathogens had relative abundances in the range of 1% to 3%, including Acinetobacter schindleri (2.98%), Pseudomonas veronii has only occurred in WO. Previous studies also revealed that Staphylococcus aureus (~16%), coagulase-negative Staphylococci (13%-17.2%) and Micrococcus luteus (10.7%-13.3%) were the most common airborne pathogens in healthcare settings (Qudiesat et al., 2009). A higher prevalence of clinically relevant bacterial pathogens, namely coagulase-negative Staphylococci (29.6%), Staphylococcus aureus (26.3%), Acinetobacter species (9.5%) and Pseudomonas aeruginosa (5.3%), was also reported in hospital air in Ethiopia (Solomon et al., 2017). However, some of the species assigned to the ESKAPE pathogens category in this report are not yet recognised as pathogenic to humans. Moreover, as Illumina MiSeq sequencing generates short reads, accurate classification of pathogens at the species level remains uncertain (Jin et al., 2018).

CONCLUSIONS
We used next-generation sequencing to investigate the bacterial abundance and diversity in the UHC during different seasons. Proteobacteria, Actinobacteria, Bacteroidetes and Firmicutes were the dominant phyla in the sampled air. Overall, the air bacterial composition was comparatively simple in this study; only ten taxonomic families accounted for ~75% of the total sequences determined. Because air is continuously moving, the long sampling time and high flow rate of the sampler, combined with Illumina MiSeq sequencing, could minimise the limitation of single sampling per season in determining the seasonal dynamics of air bacterial communities (Gao et al., 2017). However, multiple samplings in each season might have provided a better estimate of season-wise bacterial abundance and diversity. Also, the season-wise bacterial abundance in the air needs to be confirmed using qPCR of the 16S rRNA gene (Lee et al., 2010).
We also found 31 ESKAPE pathogens, which made up a low percentage (4.42%) of the total sequences and were dominated by unclassified species of Pseudomonas, Acinetobacter and Staphylococcus.
Actinobacteria showed significant correlations with temperature, PM2.5 and PM10, whereas Proteobacteria and Firmicutes showed significant correlations with PM2.5 only. Exposure to these airborne bacterial pathogens may result in the emergence of respiratory ailments in human beings. A holistic approach, including administrative and environmental control as well as personal protective measures, may help to control airborne bacterial infections in healthcare settings.