A Comparative Study of PM2.5 Ambient Aerosol Chemical Databases

Environ. Sci. Technol. 1998, 32, 3926-3934

V. WONGPHATARAKUL,† S. K. FRIEDLANDER,*,† AND J. P. PINTO*
Department of Chemical Engineering, University of California, Los Angeles, California 90095-1592, and National Center for Environmental Assessment-RTP, Mail Drop 52, USEPA, RTP, North Carolina 27711

Comparing PM2.5 chemical databases at different sites and times is of interest in developing air quality control strategies, planning health effects studies, and “harmonizing” international standards. Three methods of comparison were applied to databases for the fine aerosol from seven sites around the world, five urban and two nonurban. The most extensive database, for Los Angeles, was used as a reference. Log-log plots of chemical concentrations at pairs of sampling sites provide an easily visualized comparison that can be characterized by the coefficient of divergence (CD), which approaches zero for similar sites and one if the sites are very different. Sites similar and dissimilar to downtown Los Angeles were Teplice (Czech Republic) and Taipei (Taiwan), respectively. Cluster analysis was used to group sampling sites with similar characteristics. The Los Angeles, Philadelphia, and Amazon Basin sampling sites each clustered strongly; Teplice fused with the Los Angeles cluster. Correlation coefficients for the spatial variation of the chemical components for aerosol sources provide a measure of source similarities for the Los Angeles sites. Differences in chemical component concentrations at different sites are caused not only by true chemical variations but also by sampling and measurement artifacts. There is a need for intercomparison and calibration to reduce such effects.

ENVIRONMENTAL SCIENCE & TECHNOLOGY / VOL. 32, NO. 24, 1998

Introduction

Recent epidemiological studies indicate associations between particulate air pollution and mortality rates in various U.S. metropolitan areas (1), and the Environmental Protection Agency (EPA) has recently revised the National Ambient Air Quality Standard (NAAQS) for particulate matter (2). Of particular interest is the fine component of the aerosol, less than 2.5 µm (PM2.5), which has been linked to adverse health effects, and for which a new NAAQS has been adopted. The NAAQS for PM2.5 is stated in terms of aerosol mass loadings, but many in the scientific community believe specific chemical components are responsible for adverse health and environmental effects.

Before 1970, aerosol chemical databases were fragmentary. Only a few chemical components had been measured simultaneously at a given sampling site, and little information on the statistical variability was available. The advent of aerosol source resolution (3) and improvements in measurement methods led to the assembling of large aerosol chemical databases in the United States and, more recently, other nations. Such databases continue to accumulate.

The purpose of this paper is to discuss quantitative methods for comparing aerosol chemical databases at various locations. Possible applications include:

1. Quantitative comparisons can be made of the uniformity of the aerosol composition in a given geographical region with multiple sampling sites. This can help in designing regional aerosol sampling strategies.
2. The similarity in the composition of aerosols at two entirely different locations, for example, on separate continents, is of interest in comparing adverse health effects caused by aerosols at the different sites. Such information would be of special value in the design of epidemiological and other health effects studies.
3. The Clean Air Act of 1990 calls for the “harmonization” of standards between the United States and its principal industrial trading partners. The approaches described in this paper permit systematic comparisons of aerosol air quality in different countries.

In this study (4), data sets from seven locations around the world are compared, five sets based on PM2.5 and two on PM2. This group provides good geographical coverage for many of the same chemical components. Both urban and nonurban sites were selected. We also sought databases that covered periods of at least a few weeks up to a year or more. Measurements of aerosol chemical composition satisfying these requirements have been reported for urban locations including Los Angeles, CA; Philadelphia, PA; Taipei, Taiwan; and Ostrava and Teplice, Czech Republic. The Los Angeles, Philadelphia, and Taipei locations had multiple sampling sites. The nonurban sites were the Amazon Basin and Hungary’s Hortobagy National Park. Figure 1 shows the sampling locations. These data were originally gathered for source resolution studies in connection with abatement programs and with health and epidemiological studies and for the characterization of the ambient aerosol in modeling studies.

Table 1 shows the chemical components reported in the databases. The most comprehensive is the South Coast Air Basin (SoCAB) database, which includes the secondary components SO₄²⁻, NO₃⁻, and NH₄⁺ in addition to organic and elemental carbon. Table 2 summarizes the databases and shows the sampling periods and methods and the original references. There were a total of 21 sampling sites, including several in close proximity.

Given the availability of a broad set of chemical databases, what relationships can be sought among them to help with the applications discussed above? Three approaches were investigated. According to the first, the relative enhancement or depletion of the individual chemical components at two different locations is displayed by plotting the concentrations against each other and calculating the coefficient of divergence for the spread. In the second approach, mathematical procedures are used to identify similar groups or clusters among the databases from the various sites. Finally, the spatial variation among sampling sites in a given region was evaluated by calculating the correlation coefficients among chemical components at different sites on the same day. The next section describes the methodology in greater detail and is followed by a discussion of the results of applying the methods.

10.1021/es9800582 CCC: $15.00 © 1998 American Chemical Society. Published on Web 10/31/1998

Methodologies

Concentrations of the chemical species present in the atmospheric aerosol vary over several orders of magnitude because of the composition and magnitude of the sources. Industrial societies have learned to live within certain concentration ranges for the aerosol components. For example, while high sulfate levels (20 µg/m³) are accepted, high levels of toxic metals are not (although lead levels in certain localities once approached sulfate levels). The wide variations in concentrations make it convenient to represent data on a logarithmic scale and/or to normalize the data; both approaches are used in this paper. Data for a well-characterized site can serve as a reference standard in pairwise comparisons. In some cases, limited sets of chemical elements are of interest. The methods of comparison discussed in the paper are not meant to be exhaustive; they demonstrate the basic concepts and the use of several rather different approaches.

FIGURE 1. Locations of aerosol databases used in the study.

Concentration Diagrams. The concentration diagram is a log-log plot of the concentration or mass fraction of the chemical components at one site against those at another. The log-log diagram is used because of the large concentration ranges for the various chemical components. The diagonal line of unit slope represents the hypothetical case in which the concentrations of the chemical components for the reference location (x axis) and comparison site are equal. The diagonal divides the diagram into regions of comparative enhancement (above the line) or depletion (below the line) for individual chemical components, relative to the reference site shown on the abscissa. In addition, the diagram shows the measurement errors for the chemical components. The data point labeled "unknown" that appears in the diagrams refers to the difference between the measured total mass concentration of PM2.5 and the sum of the measured concentrations of the chemical components common to both sampling sites.
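The reading of a concentration diagram described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' procedure: the function names `log_ratio` and `unknown_mass`, the site dictionaries, and all concentration values are hypothetical and chosen only to show the arithmetic. A positive log ratio corresponds to a point above the diagonal (enhancement relative to the reference site), a negative one to a point below it (depletion).

```python
import math

def log_ratio(reference, comparison):
    """log10(comparison / reference) for the components common to both sites.

    Assumes strictly positive concentrations; values > 0 lie above the
    diagonal of the log-log diagram, values < 0 below it.
    """
    common = sorted(set(reference) & set(comparison))
    return {c: math.log10(comparison[c] / reference[c]) for c in common}

def unknown_mass(total_mass, components):
    """Measured total mass minus the sum of the measured components."""
    return total_mass - sum(components.values())

# Hypothetical concentrations in µg/m³ (illustrative values, not from the paper)
reference = {"S": 2.0, "Fe": 0.2, "Pb": 0.1, "K": 0.1}
comparison = {"S": 4.0, "Fe": 0.2, "Pb": 0.01, "K": 0.5}

ratios = log_ratio(reference, comparison)
enhanced = [c for c, r in ratios.items() if r > 0]  # above the diagonal
depleted = [c for c, r in ratios.items() if r < 0]  # below the diagonal
unknown = unknown_mass(10.0, reference)             # hypothetical total mass of 10 µg/m³

print("enhanced:", enhanced)
print("depleted:", depleted)
print("unknown mass (µg/m³):", round(unknown, 3))
```

Working with the logarithm of the ratio rather than the ratio itself mirrors the log-log axes: a component ten times higher and one ten times lower than the reference sit at equal distances on either side of the diagonal.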
For Los Angeles, the unknown concentrations range between 3.2 and 17.5% of the total fine mass; for most of the other databases, the secondary components (NO₃⁻, SO₄²⁻, NH₄⁺), elemental carbon, and organic carbon, which constitute much of the fine particle mass, were not measured. In these cases, the comparison is limited to the metallic species and silicon. The coefficient of divergence (CD) defined in the next section was also calculated for each pair of sampling sites and is shown on the concentration diagrams. The CD is a measure of the similarity of the two databases.

Hierarchical Clustering Methods. Collections of chemical measurements made at different sampling sites can be grouped according to their similarity by a mathematical procedure known as cluster analysis (12, 13). For this purpose, a measure of closeness or similarity is needed. Similarity measures for aerosols have been evaluated by Wongphatarakul (4). The wide variation in chemical component concentrations in the aerosol databases requires some type of normalization procedure. It is convenient to introduce the coefficient of divergence (CD), used earlier in biological applications (14, 15) and defined as follows:

$$\mathrm{CD}_{jk} = \sqrt{\frac{1}{p}\sum_{i=1}^{p}\left(\frac{x_{ij}-x_{ik}}{x_{ij}+x_{ik}}\right)^{2}} \qquad (1)$$

where x_ij represents the average concentration for a chemical component i at site j, j and k represent two sampling sites, and p is the number of chemical components. If the two sampling sites are similar, the CD approaches zero. If the two sampling sites are very different, the CD approaches one. The CD is self-normalizing and can be calculated from short-term measurements or long-term averages. By using the CD, aerosol databases can be compared even if the number of chemical components measured for each site is different, provided that the same set of chemical components is used in each case.

Similarities between pairs of sampling sites were calculated using the CDs, which are tabulated in matrix form with one CD for each pair of sampling sites. When calculating the CD for two sampling sites, the maximum number of chemical components common to both sampling sites is normally used. This makes the classification more reliable because more chemical components provide a better measure of similarity for the pair of sampling sites. It is possible to select smaller sets of common chemical components for particular applications, as discussed below.

The most similar sampling sites are grouped first. Clusters are merged using one of the three commonly applied linkage methods: single linkage (minimum distance), complete linkage (maximum distance), and average linkage (average distance) (16). In single linkage, groups are fused according to the distance between their nearest (most similar) members. In complete linkage, groups are fused according to the distance between their furthest members. In average linkage, groups are fused according to the average distance between pairs of members in the respective sets. Merging continues at progressively lower similarity until all subgroups are fused into a single cluster. Finally, a dendrogram (tree diagram) is drawn to show the groupings and the linkage distances, which are proportional to the degree of dissimilarity between the clusters.
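The CD of eq 1 and the agglomerative procedure just described can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the STATISTICA implementation used in the study: the function names, the site labels A, B, and C, and the concentration values are all hypothetical, and concentrations are assumed to be strictly positive so that the denominator in eq 1 never vanishes.

```python
import math
from itertools import combinations

def coefficient_of_divergence(site_j, site_k):
    """Eq 1, evaluated over the chemical components common to both sites."""
    common = [c for c in site_j if c in site_k]
    if not common:
        raise ValueError("the two sites share no chemical components")
    total = sum(((site_j[c] - site_k[c]) / (site_j[c] + site_k[c])) ** 2
                for c in common)
    return math.sqrt(total / len(common))

def agglomerate(sites, linkage="average"):
    """Agglomerative clustering on the CD matrix.

    Returns the merges in order as (cluster_a, cluster_b, linkage_distance),
    which is the information a dendrogram displays.
    """
    combine = {"single": min,                          # nearest members
               "complete": max,                        # furthest members
               "average": lambda d: sum(d) / len(d)}[linkage]
    # One CD per pair of sites (the "matrix form" of the text)
    dist = {frozenset(pair): coefficient_of_divergence(sites[pair[0]], sites[pair[1]])
            for pair in combinations(sites, 2)}

    def between(a, b):
        return combine([dist[frozenset((x, y))] for x in a for y in b])

    clusters = [(name,) for name in sites]
    merges = []
    while len(clusters) > 1:
        # Fuse the two most similar clusters first
        a, b = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((a, b, between(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical average concentrations in µg/m³ (illustrative values only)
sites = {
    "A": {"S": 2.0, "Fe": 0.20, "Zn": 0.10},
    "B": {"S": 2.2, "Fe": 0.22, "Zn": 0.09},
    "C": {"S": 0.4, "Fe": 1.50, "Zn": 0.90},
}

merges = agglomerate(sites, linkage="average")
for a, b, d in merges:
    print(a, "+", b, "at linkage distance", round(d, 3))
```

Because the CD is computed only over components present in both dictionaries, sites with different component lists can still be compared, as the text notes. For the same data, the final linkage distance under single linkage cannot exceed that under average linkage, which in turn cannot exceed that under complete linkage.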
The STATISTICA 5.1 software program was used in the clustering calculations (17).

TABLE 1. Chemical Components in the Aerosol Databases (a)
databases (columns): Los Angeles; Philadelphia; Ostrava; Teplice; Taipei; Amazon Basin; Hortobagy National Park
chemical components (rows): Na, Mg, Al, Si, P, S, Cl, K, Ca, Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, Ga, Ge, As, Se, Br, Rb, Sr, Y, Zr, Mo, Pd, Ag, Cd, In, Sn, Sb, Te, I, Ba, La, W, Au, Hg, Pb, Na⁺, Mg²⁺, NH₄⁺, NO₃⁻, SO₄²⁻, OC, EC
(a) Key: X, concentrations used in comparisons with other aerosol databases; *, concentrations reported but not used because measurements are below the detection limit.

Correlation Coefficients for Spatial Variation of Chemical Components. Correlation coefficients for the spatial variation of the chemical components can be used to show the degree of similarity between sampling site pairs in a given region. Let x_id and y_id be the concentrations of a chemical component i in the PM2.5 for two different sites on the same date, d. The correlation coefficient, R_i, is

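$$R_i = \frac{\dfrac{1}{m}\sum_{d=1}^{m}\left(x_{id}-\overline{x_{id}}\right)\left(y_{id}-\overline{y_{id}}\right)}{\sqrt{\dfrac{1}{m}\sum_{d=1}^{m}\left(x_{id}-\overline{x_{id}}\right)^{2}}\;\sqrt{\dfrac{1}{m}\sum_{d=1}^{m}\left(y_{id}-\overline{y_{id}}\right)^{2}}} \qquad (2)$$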

where m is the number of measurements made during the sampling period. The correlation coefficients for highly correlated and uncorrelated concentrations of chemical components for pairs of sampling sites approach one and zero, respectively. High correlations indicate that the source contributions of the chemical components to the two sites are similar. Equation 2 is of limited use in comparing locations situated in different regions because of different sampling periods, sources, and meteorological conditions.

Chemical components can be grouped as markers for source types, based on chemical mass balance (18) or principal component analysis (19). Correlations of source types for pairs of sampling sites can be investigated in this way. The correlation coefficient averaged over the chemical components of a given source type is

$$R_s = \frac{1}{p}\sum_{i=1}^{p} R_i \qquad (3)$$

where s is the source type and p is the number of chemical components of the source type s.
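Equations 2 and 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `component_correlation` and `source_correlation`, the choice of Al and Si as crustal markers, and the four-day concentration series are hypothetical and serve only to show the calculation. The two series for a component must cover the same sampling dates in the same order.

```python
import math

def component_correlation(x, y):
    """Eq 2: correlation of same-day concentrations of one component at two sites.

    The 1/m factors in the numerator and denominator of eq 2 cancel,
    so they are omitted here.
    """
    m = len(x)
    if m != len(y) or m < 2:
        raise ValueError("need two series of equal length with at least 2 days")
    mean_x, mean_y = sum(x) / m, sum(y) / m
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

def source_correlation(site_x, site_y, markers):
    """Eq 3: average of R_i over the marker components of one source type."""
    return sum(component_correlation(site_x[c], site_y[c]) for c in markers) / len(markers)

# Hypothetical daily concentrations in µg/m³ on four common sampling days
site_x = {"Al": [0.10, 0.20, 0.15, 0.30], "Si": [0.30, 0.55, 0.40, 0.80]}
site_y = {"Al": [0.12, 0.22, 0.14, 0.33], "Si": [0.28, 0.60, 0.43, 0.75]}

# Al and Si are taken here as crustal markers purely for illustration
r_crustal = source_correlation(site_x, site_y, ["Al", "Si"])
print("R_crustal =", round(r_crustal, 3))
```

A constant series would give a zero spread and a division by zero, so in practice components that do not vary over the sampling period would need to be excluded before averaging.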

Results and Discussion

The methods discussed above were applied to the chemical databases listed in Table 2. First, concentration diagrams were used to explore the similarities in concentrations of chemical components between pairs of sampling sites in the same geographical region or at widely different locations. Next, hierarchical clustering methods were used to group sampling sites first for a large set of chemical components common to many sites and then for a limited set. Finally, correlation coefficients were used to study the spatial variability of sources among sampling sites within a given region over a similar time period.

TABLE 2. Chemical Characterization Studies of Particulate Matter

location: South Coast Air Basin: Anaheim (ANH86), Burbank (BUR86), Downtown LA (DLA86), Long Beach (LGB87), Hawthorne (HAW86), Rubidoux (RUB86)
sampling period: Jan 1986 - Jan 1987, 24-h samples, every sixth day
measurement: PM10 and PM2.5; total mass; trace elements (34 species); SO₄²⁻, NO₃⁻, NH₄⁺; OC and EC
methodology: PM10 sampler, PTFE and quartz filters; gravimetric analysis; XRF; ion chromatography; colorimetry; thermal/optical reflectance
ref: 5

location: Czech Republic: Ostrava, Teplice
sampling period: Ostrava, Oct 16, 1995 - Nov 14, 1995, 24-h samples, consecutive days; Teplice, Feb 17, 1992 - Aug 30, 1992, 24-h samples, consecutive days
measurement: PM2.5; total mass; trace elements (28 species for Ostrava, 40 species for Teplice); organic compounds, OC, EC (Ostrava only)
methodology: virtual impactor with Teflon filters; gravimetric analysis; XRF
ref: 6, 7

location: Philadelphia: N.E. Airport, Downtown Philadelphia, N. Philadelphia, SW Philadelphia, Valley Forge, Camden, NJ
sampling period: July 18, 1993 - Sept 2, 1993, 24-h samples, random samples
measurement: PM10 and PM2.5; total mass; trace elements (40 species)
methodology: virtual impactor, Teflon and nuclepore filters; gravimetric analysis; XRF
ref: 8

location: Taipei: Residence 1, Residence 2, Residence 3
sampling period: Nov 1992 - Feb 1993, 24-h samples, random samples
measurement: PM10 and PM2.5; total mass; trace elements (20 species)
methodology: environmental monitor samples; gravimetric analysis; XRF
ref: 9

location: Hortobagy National Park
sampling period: Feb 9, 1995 - Feb 10, 1996, 24-h samples
measurement: PM10-PM2 and PM2; total mass; trace elements (16 species)
methodology: “Gent” stacked filter unit samples; gravimetric analysis; PIXE
ref: 10

location: Amazon Basin: Serra do Navio, Alta Floresta, Cuiaba
sampling period: Jun 1990 - Apr 1993, 24-72 h samples
measurement: PM10-PM2 and PM2; total mass; trace elements (20 species)
methodology: stacked filter unit; gravimetric analysis; PIXE
ref: 11

Concentration Diagrams. Since there are 21 sets of time-averaged data, 210 pairwise comparisons of datasets are possible. A few illustrative examples are discussed below; other examples are given by Wongphatarakul (4). Los Angeles aerosol databases are the most extensive available with respect to the number of chemical components measured and the frequency and time period covered. We have selected the data for downtown Los Angeles as a reference standard with which to compare other datasets.

The first set of comparisons was made among the sampling sites in the Los Angeles region. Figure 2, the concentration diagram for downtown Los Angeles and Burbank (about nine miles northwest), shows the strong similarity between the two sites, with a CD of 0.099. Concentration diagrams for other sites in the South Coast Air Basin also show strong similarities. The results indicate that the region extending from Hawthorne to Anaheim (about 30 miles) is well mixed. The data for Rubidoux, using downtown Los Angeles as a reference site, tend to have a greater spread, with a CD of 0.225 (Figure 3). The sources of the chemical components in Rubidoux, about 45 miles east of downtown Los Angeles, differ from those of the other SoCAB sites. The Rubidoux sources include agricultural activities and lime/gypsum operations (20), resulting in high relative concentrations of Ca, NH₄⁺, and NO₃⁻.

FIGURE 2. PM2.5 chemical components in downtown Los Angeles and Burbank (1986) have similar chemical characteristics. Burbank is about 9 miles northwest of downtown Los Angeles. An asterisk (*) indicates Jan 2-Dec 28, 1986 (63 data points), 24 h sampling, sampling every 6 days, dp < 2.5 µm. Double asterisks (**) indicate Jan 2-Dec 28, 1986 (61 data points), 24 h sampling, sampling every 6 days, dp < 2.5 µm.

FIGURE 3. Concentrations of PM2.5 chemical components for Rubidoux and downtown Los Angeles (1986). Rubidoux is about 45 miles east of downtown Los Angeles. There is a significant spread in the concentrations for the two sites compared with downtown Los Angeles and Burbank (Figure 2). An asterisk (*) indicates Jan 2-Dec 28, 1986 (63 data points), 24 h sampling, sampling every 6 days, dp < 2.5 µm. Double asterisks (**) indicate Jan 2-Dec 28, 1986 (60 data points), 24 h sampling, sampling every 6 days, dp < 2.5 µm.

Figure 4 compares the annual average concentrations of coarse (PM10-PM2.5) and fine (PM2.5) chemical components in Burbank. The PM2.5 fraction contains high concentrations of secondary components such as SO₄²⁻, NH₄⁺, and NO₃⁻, while the coarse fraction is dominated by primary sources, including chemical components such as Ca, Al, and Si. The fine fraction also contains Pb, Br, EC, and OC emitted in the exhaust of motor vehicles using leaded gasoline (21, 22). Figure 4 shows the considerable spread in the chemical properties between the fine and coarse particles. Indeed, chemical differences between the size ranges at a given sampling site are generally much larger than chemical differences between PM2.5 at widely separated sites.

FIGURE 4. Concentrations of chemical components in Burbank [1 year time series (Jan 1986-Jan 1987), 24 h sampling, sampling every 6 days, dp < 2.5 µm] for coarse (PM10-PM2.5) and fine particles (PM2.5) (1986) showing the strong enrichment of fine particles in secondary components.

A concentration diagram comparing 17 chemical components for downtown Los Angeles and downtown Philadelphia is shown in Figure 5. Not surprisingly, data for the two cities, measured seven years apart, show significant differences in aerosol composition. The concentrations of Pb, Br, and Mn are much higher in downtown Los Angeles than in downtown Philadelphia because the consumption of leaded gasoline in Los Angeles in 1986 was higher than in Philadelphia in 1993. The origin of aerosol manganese is not certain; it may be present in gasoline as the octane enhancer methyl cyclopentadienyl manganese tricarbonyl (MMT) (23). The Philadelphia sulfate concentration is higher than that of Los Angeles because of the transport of sulfate from the conversion of SO2 emitted by midwest power plants and the higher sulfur content in the fuels used in Philadelphia (24).

FIGURE 5. Concentrations of PM2.5 chemical components in downtown Los Angeles and downtown Philadelphia. High concentrations of gasoline additives (Pb, Br, and possibly Mn) characterize the 1986 Los Angeles data compared with 1993 Philadelphia measurements. An asterisk (*) indicates Jan 2-Dec 28, 1986 (63 data points), 24 h sampling, sampling every 6 days, dp < 2.5 µm. Double asterisks (**) indicate Jul 1-Sept 1, 1993 (26 data points), 24 h sampling, random sampling day, dp < 2.5 µm.

Figure 6 compares chemical components in Los Angeles in 1986 and Teplice in 1992. Sulfate concentrations in Teplice are higher than in Los Angeles; data for the carbon-containing components were not available for Teplice. Of all the comparisons between sites not in the same region, Teplice and downtown Los Angeles show the greatest similarity, with a CD of 0.269. The cause of this similarity is not clear because the nature of sources in these two cities is quite different.

FIGURE 6. Concentrations of PM2.5 chemical components in Teplice and downtown Los Angeles. Surprisingly, Teplice is quite similar to downtown Los Angeles. This may be largely coincidence since the sources of the two regions differ. An asterisk (*) indicates Jan 2-Dec 28, 1986 (63 data points), 24 h sampling, sampling every 6 days, dp < 2.5 µm. Double asterisks (**) indicate Feb 17-Jul 7, 1992 (82 data points), 24 h sampling, random sampling day, dp < 2.5 µm.

FIGURE 7. Concentrations of fine chemical components in Alta Floresta and downtown Los Angeles. Concentrations of K are high because of biomass burning. High Al and Si probably result from land clearing activities. An asterisk (*) indicates Jan 2-Dec 28, 1986 (63 data points), 24 h sampling, sampling every 6 days, dp < 2.5 µm. Double asterisks (**) indicate Aug 1992-Mar 1993 (116 data points), sampling times varying between 24 and 72 h, dp < 2.5 µm.

The concentrations of chemical components at the Alta Floresta sampling site in the Amazon Basin are compared with downtown Los Angeles in Figure 7. Higher concentrations of K, Al, and Si are observed in Alta Floresta because of emissions from biomass burning and land clearing (10). Concentration diagrams for the other two sampling sites in the Amazon Basin also show higher concentrations of K than in Los Angeles. Figure 8 compares the fine particle chemical components and mass for another nonurban site, Hortobagy National Park, with Los Angeles. In Hortobagy National Park, the concentrations of the chemical components, except for K, are substantially below those of Los Angeles. Again, the K probably originates from the burning of vegetation.

FIGURE 8. Concentration of fine chemical components in Hortobagy National Park and downtown Los Angeles. Concentration of K in the park is greater probably because of biomass burning. An asterisk (*) indicates Jan 2-Dec 28, 1986 (63 data points), 24 h sampling, sampling every 6 days, dp < 2.5 µm. Double asterisks (**) indicate Feb 9, 1995-Feb 10, 1996 (86 data points), 24 h sampling, sampling two times a week, dp < 2.5 µm.

The concentration diagram comparing Taipei Residence 2 with downtown Los Angeles (Figure 9) shows that the Taipei site is much more heavily polluted. Of the paired urban sites examined, the two most dissimilar were Taipei and downtown Los Angeles (CD = 0.783). The relatively high concentrations at the Taipei site result from its proximity to several iron pipe foundries.

FIGURE 9. Concentrations of PM2.5 chemical components in Taipei Residence 2 and downtown Los Angeles. High concentrations of chemical components with respect to downtown Los Angeles are observed because the Taipei site is near several iron pipe foundries. An asterisk (*) indicates Jan 2-Dec 28, 1986 (63 data points), 24 h sampling, sampling every 6 days, dp < 2.5 µm. Double asterisks (**) indicate Nov 1992-Feb 1993 (7 data points), dp < 2.5 µm.


FIGURE 10. Dendrograms for aerosol databases. The Amazon and Taipei databases are least similar largely because their total mass concentrations are much smaller and larger, respectively, compared with the other urban databases.

High CDs occur in at least two different ways. Chemically similar aerosols may have high CDs due primarily to differences in mass concentration between the two aerosols under comparison. This is shown in Figures 8 and 9, where the general trend of the data falls on a straight line either above or below the diagonal. High CDs occur when PM2.5 mass is low (Figure 8, Hortobagy National Park, Hungary) or high (Figure 9, Taipei) compared with the reference data set. These CDs would be significantly reduced if the data were normalized by the total PM2.5 mass concentration and plotted as mass fractions. High CDs also occur when total PM2.5 mass loadings are about the same for the two sites (see Figure 5, Philadelphia/Los Angeles) but the mix of sources differs.

Cluster Analysis. Hierarchical clustering was applied to the 21 databases to find the most similar groups of sampling sites. First, cluster analyses based on CDs for all chemical components in common among the sites are described. Next, the results of a cluster analysis for a limited set of transition metals of public health concern are discussed.
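The first route to a high CD noted above, chemically similar aerosols at different mass loadings, can be checked numerically. If every component at one site is r times its value at the other, each term of eq 1 equals ((1 - r)/(1 + r))², so CD = |r - 1|/(r + 1) regardless of composition. The sketch below is a hypothetical illustration: the function names, the concentrations, and the factor r = 4 are assumptions chosen only to show that converting to mass fractions removes this contribution.

```python
import math

def coefficient_of_divergence(site_j, site_k):
    """Eq 1 over the components common to both sites (positive values assumed)."""
    common = [c for c in site_j if c in site_k]
    total = sum(((site_j[c] - site_k[c]) / (site_j[c] + site_k[c])) ** 2
                for c in common)
    return math.sqrt(total / len(common))

def mass_fractions(concentrations, total_mass):
    """Normalize component concentrations by the total PM2.5 mass."""
    return {c: v / total_mass for c, v in concentrations.items()}

# Hypothetical reference site (µg/m³) and a site with the same composition
# but four times the mass loading
reference = {"S": 2.0, "Fe": 0.2, "K": 0.1}
reference_total = 20.0
r = 4
polluted = {c: r * v for c, v in reference.items()}
polluted_total = r * reference_total

cd_raw = coefficient_of_divergence(reference, polluted)
cd_normalized = coefficient_of_divergence(
    mass_fractions(reference, reference_total),
    mass_fractions(polluted, polluted_total),
)

print("CD from concentrations:", round(cd_raw, 3))          # |r - 1| / (r + 1)
print("CD from mass fractions:", round(cd_normalized, 3))
```

Real pairs of sites differ in both loading and composition, so normalization would be expected to reduce the CD rather than eliminate it; the second route to a high CD, a different mix of sources at similar mass, is unaffected by this step.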

TABLE 3. Correlation Coefficients for Spatial Variation of PM2.5 Mass and Different Sources for Pairs of Sampling Sites in SoCAB (1986)

pair of sampling sites                  Rtot     Rcrustal  Rsecondary  Rauto   Rresidual oil
Hawthorne and Rubidoux                  -0.027
Long Beach and Rubidoux                  0.051
Anaheim and Rubidoux                     0.066
Downtown Los Angeles and Rubidoux        0.095
Burbank and Rubidoux                     0.120
Hawthorne and Anaheim                    0.760   0.034     0.768       0.492   0.170
Long Beach and Anaheim                   0.852   0.075     0.888       0.504   0.150
Burbank and Anaheim                      0.770   0.105     0.749       0.579   0.161
Downtown Los Angeles and Anaheim         0.827   0.143     0.804       0.556   0.233
Downtown Los Angeles and Hawthorne       0.808   0.568     0.854       0.669   0.533
Burbank and Hawthorne                    0.704   0.599     0.790       0.688   0.491
Long Beach and Burbank                   0.731   0.633     0.737       0.714   0.295
Long Beach and Hawthorne                 0.880   0.649     0.909       0.861   0.482
Downtown Los Angeles and Long Beach      0.842   0.653     0.817       0.719   0.378
Downtown Los Angeles and Burbank         0.928   0.825     0.960       0.871   0.606

Three different linkage methods were used to prepare the dendrogram shown in Figure 10, which includes all of the chemical components in common among the databases. Not surprisingly, the greatest similarities are observed among sites in the same geographical region. Next there is a clumping of sites in the United States together with the Czech Republic sites. The Taipei and Amazon databases were most dissimilar from the rest. The similarity in the sampling site groupings using different linkage methods shows the consistency among this set of methods.

Clustering methods provide an initial estimate of source profiles for poorly characterized sampling sites. If a poorly characterized site merges with a group of sampling sites of known source type or types, the poorly characterized site is likely to have similar chemical characteristics or source types. A large linkage distance indicates that the grouping is weak between the poorly characterized and well-characterized sampling sites. As an example, the hierarchical clustering methods show that Teplice fused with the SoCAB cluster. The linkage distances for Teplice and the SoCAB cluster on the three dendrograms were small. This is also shown in the concentration diagram for PM2.5 chemical components in Teplice and downtown Los Angeles (Figure 6). The Teplice/SoCAB linkage was surprising because coal is used as a fuel in the Teplice region but not in California.

One of the hypotheses advanced to explain observations of enhanced mortality and morbidity associated with particles is the presence of transition metals (ref 25, p 78). Cluster analysis of the transition metals Ti, Ni, V, Mn, Fe, and Zn may help in the design of epidemiological studies related to fine particles containing the transition metals. A dendrogram prepared for these metals was similar to the one for the overall chemical database and showed that Teplice fused with the SoCAB cluster (4).

Statistical Variability among Regional Sampling Sites. The concentration diagrams and cluster analyses were based on average concentrations at sites around the world. Strong similarities in average concentrations at nearby sampling sites were noted. These can be analyzed in more detail by calculating correlation coefficients for the concentration variations of the aerosol components at the various sampling sites using eq 2. This provides insight into the uniformity of concentrations across a given region over a short sampling period.

Correlation coefficients for the spatial variation of PM2.5 mass calculated for pairs of sampling sites in the Los Angeles region are shown in Table 3. Downtown Los Angeles, Burbank, Hawthorne, and Long Beach are very similar (0.70 < Rtot < 0.93). The correlation coefficients for Rubidoux are significantly smaller than for the other sites (Rtot ≤ 0.12) because the Rubidoux sources are different (20).

The correlation coefficient for the spatial variation of a source s, Rs, was obtained by averaging the correlation coefficients of the set of chemical components that represent the source (eq 3). The sources selected were crustal material, secondary material, automobile exhaust, and residual oil. Values of Rs are shown in Table 3. Rubidoux is not shown because it was so different from the other sites. Values of Rsecondary are high for all pairs of sampling sites, showing that there is a fairly uniform distribution of secondary aerosol throughout the SoCAB. Automobile exhaust also shows a uniform distribution except for the Anaheim site, for reasons not known. Values of Rcrustal between Anaheim and other sampling sites are also low compared to Rcrustal between other pairs of sampling sites. Values of Rresidual oil between pairs of sampling sites were generally low, at most 0.606, indicating that emissions from residual oil combustion are nonuniform in the SoCAB region. This is consistent with the presence of localized stationary sources, especially refineries.

Thus correlation coefficients for the spatial variation of the chemical components can be used to investigate the uniformity of sources within a given region. A long time series database between pairs of sampling sites is required for reliable results. The sampling sites must have the same sampling periods and sample sizes.

Acknowledgments

This work was supported in part by grant R821288 from the U.S. Environmental Protection Agency. The contents of the paper do not necessarily reflect EPA views and policies.

Literature Cited (1) Dockery, D. W.; Pope, C. A.; Xu, X.; Spengler, J. D.; Ware, J. H.; Fay, M. E.; Ferris, B. G., Jr.; Speizer, F. E. New Engl. J. Med. 1993, 329, 1753. (2) Environmental Protection Agency. National Ambient Air Quality Standards for Particulate Matter. Code of Federal Regulations, part 50; Title 40; Final Rule; U.S. GPO: Washington, DC, 1997; Fed. Regist. July 18, 1997, 62 (No. 138), 38651-38701. (3) Hidy, G. M.; Friedlander, S. K. In Proceedings of the Second International Clean Air Congress; Englund, H. M., Beery, W. T., Eds.; Academic Press: New York, 1971; p 391. (4) Wongphatarakul, V. M.S. Thesis in Chemical Engineering, UCLA, 1997. (5) Solomon, P. A.; Fall, T.; Salmon, L.; Lin, P.; Vasquez, F.; Cass, G. R. Acquisition of Acid Vapor and Aerosol Concentration Data for Use in Dry Deposition Studies in the South Coast Air Basin. Final report to California Air Resources Board; California Institute of Technology: Pasadena, CA. Also: Solomon, P. A.; Fall, T.; Salmon, L.; Cass, G. R.; Gary, H. A.; Davidson, A. J. Air Pollut. Control Assoc. 1989, 39, 154. VOL. 32, NO. 24, 1998 / ENVIRONMENTAL SCIENCE & TECHNOLOGY

(6) Willis, R. D.; Ellenson, W. D.; Pinto, J. P.; Hartlage, T. A.; Novak, J.; Dosdalek, H.; Cernikovsky, L.; Bures, V. USEPA Technical Report EPA/600/R-97/030; USEPA: Washington, DC, April 1997.
(7) Stevens, R. K.; Pinto, J. P.; Willis, R. D.; Mamane, Y.; Novak, J. J.; Benes, I. In Environment; Allegrini, I., De Santis, F., Eds.; NATO ASI series, Sub-series 2; Plenum: New York, 1996; Vol. 8, p 151.
(8) Sue, H. H.; Allen, G. A.; Koutrakis, P.; Burton, R. M. J. Air Waste Manage. Assoc. 1995, 45, 442.
(9) Li, C. H.; Hsu, L. Y. Chemosphere 1993, 27, 2143.
(10) Borbely-Kiss, I.; Koltay, E.; Szabo, G. Y. J. Aerosol Sci. 1994, 27, S91.
(11) Artaxo, P.; Gerab, F.; Yamasoe, M. A.; Martins, J. V. J. Geophys. Res. 1994, 99, 22857.
(12) Johnson, R. A.; Wichern, D. W. Applied Multivariate Statistical Analysis, 3rd ed.; Prentice Hall: New Jersey, 1992.
(13) Hopke, P. K.; Gladney, E. S.; Gordon, G. E.; Zoller, W. H.; Jones, A. G. Atmos. Environ. 1976, 10, 1015. See, also: Hopke, P. K. Receptor Modeling in Environmental Chemistry; John Wiley: New York, 1985; Chapter 8.
(14) Clark, P. J. Copeia 1952, 2, 61.
(15) Rhodes, A. M.; Carner, S. G.; Courter, J. W. J. Am. Soc. Hortic. Sci. 1969, 94, 98.
(16) Anderberg, M. R. Cluster Analysis for Applications; Academic Press: New York, 1973.
(17) StatSoft. Statistica 5.1; StatSoft Inc.: Tulsa, 1995.
(18) Friedlander, S. K. Smoke, Dust and Haze; Wiley: New York, 1977. See, also: Watson, J. G.; Cooper, J. A.; Huntzicker, J. J. Atmos. Environ. 1984, 18, 1347.

(19) Hopke, P. K. Receptor Modeling in Environmental Chemistry; Wiley: New York, 1985; Chapter 7.
(20) Kao, A. S.; Friedlander, S. K. Environ. Sci. Technol. 1995, 29, 19.
(21) Cooper, J. A.; Redline, D. C.; Sherman, J. R.; Valdovinos, L. M.; Pollard, W. L.; Scavone, L. C.; Badgett-West, C. PM10 Source Composition Library for the South Coast Air Basin: Source Profile Development Documentation Final Report; South Coast Air Quality Management District: El Monte, CA, 1987.
(22) Houck, J. E.; Chow, J. C.; Ahuja, M. S. In Transactions, Receptor Models in Air Resources Management; Watson, J. G., Ed.; Air Pollution Control Association: Pittsburgh, PA, 1989.
(23) Lyons, J. M.; Venkataraman, C.; Main, H. H.; Friedlander, S. K. Atmos. Environ. 1993, 27B, 237.
(24) Dzubay, T. G.; Stevens, R. K.; Gordon, G. E.; Olmez, I.; Sheffield, A. E.; Courtney, W. J. Environ. Sci. Technol. 1988, 22, 46.
(25) U.S. Environmental Protection Agency. Particulate Matter Research Needs for Human Health Risk Assessment to Support Future Reviews of the National Ambient Air Quality Standards for Particulate Matter. Office of Research and Development; National Center for Environmental Assessment: Research Triangle Park, NC, January 15, 1998; Report EPA/600/R-97/132F.

Received for review January 22, 1998. Revised manuscript received August 31, 1998. Accepted September 15, 1998. ES9800582