Environ. Sci. Technol. 2005, 39, 7980-7983
Reconciling Trajectory Ensemble Receptor Model Results with Emissions P H I L I P K . H O P K E , * ,† LIMING ZHOU,† AND RICHARD L. POIROT‡ Center for Air Resources Engineering and Science and Department of Chemical Engineering, Clarkson University, P.O. Box 5708, Potsdam, New York 13699-5708, and Vermont Department of Environmental Conservation, Building 3 South, 103 South Main Street, Waterbury, Vermont 05671-0402
Several approaches have been developed that use ensembles of air parcel back trajectories to identify likely source regions for contaminants observed at a downwind receptor site. Although the results of these models have appeared to provide results that agree with known sources, there have not been efforts to make direct comparisons between the results of such analyses with emissions inventories. In this paper, the results of residence time analysis (RTA) of data from Underhill, VT and from Brigantine, NJ are compared to the average estimated emissions of coal- and oil-fired power plants. Excellent correspondence was found between the areas of highest probability identified in the trajectory ensemble analyses and the averaged emission intensities.
Introduction Receptor models typically use the differences in the patterns of chemical species emitted by different particulate matter sources to identify and estimate the contribution of those sources to the concentrations measured at a site. Such analyses depend on the extent of a priori information available. If the sources and their composition profiles are known, then the Chemical Mass Balance Model can be employed. If only the ambient concentrations of chemical species in the particles are available, then multivariate methods such as UNMIX and Positive Matrix Factorization (2) have been developed to solve this problem. However, these methods do not provide information on the likely locations of the sources and thus, other methods have been developed that utilize meteorology in the form of air parcel back trajectories. A back trajectory model calculates the position of the air being sampled backward in time from the receptor site from various starting times throughout the sampling interval. A number of such models are available (3). For any of these models, the uncertainty in the location of the air parcel becomes more uncertain as it moves backward in time. However, the use of ensembles of trajectories can improve the reliability of the estimation of the likely source positions since random errors will tend to be averaged out of the * Corresponding author phone: 315-268-3861; fax: 315-268-4410; e-mail:
[email protected]. † Clarkson University. ‡ Vermont Department of Environmental Conservation. 7980
9
ENVIRONMENTAL SCIENCE & TECHNOLOGY / VOL. 39, NO. 20, 2005
analysis when enough trajectories can be incorporated in the calculations. An initial effort by Ashbaugh (4) made use of air parcel back trajectories to identify likely source locations for particulate sulfur observed at the Grand Canyon. A gridded array is created around the sampling location. Trajectories are a sequence of segments, each of which represents a fixed amount of time. Thus, each endpoint can be considered to be an indication that the air parcel has spent a given time within that grid cell. The total “residence time” that air spends in the given cell would be the total number of endpoints that fall into that cell. These values can be plotted over a map. The residence time values associated with high or low concentration can be plotted to examine likely directions from which contaminated or clean air is transported to the sampling site. The problem with this method is that all of the trajectories begin at the receptor site and thus, the residence time is maximum in the cells surrounding the sampling location. Ashbaugh et al. (5) suggest one solution to this problem. This method, potential source contribution function analysis, has been applied to a number of receptor model problems over a variety of spatial scales (1, 5-14). In general the method has apparently identified source regions. However, there have not been extensive studies to evaluate these methods. Cheng and Lin (16) examined the use of Potential Source Contribution Function (PSCF) to identify the emissions location for smoke from the 1998 wildfires in southern Mexico and Central America. In this case where there was a major event they were able to correctly identify the source location. A similar finding has been recently reported by Begum et al. (16). However, there has not been a detailed comparison of these methods with the emissions inventories for major sources of particulate matter in the eastern United States such as coal- and oil-fired power plants. An alternative method, which has come to be called Residence Time Analysis (RTA), was developed by Poirot and Wishinski (17). In their method, they first interpolate along each trajectory segment to estimate the fraction of time spent in each grid cell and then summing the residence time for that cell. They propose a method to adjust the resulting grid cell values for the geometrical problem of high values in the region immediately adjacent to the receptor site. In a recent study, PSCF and RTA were compared with respect to the locations of the sources contributing to the ambient aerosol concentrations at Underhill, VT (1). There was generally good agreement between the results of these two methods. In addition, a factor analysis study has been conducted on particulate matter compositions measured at Brigantine, NJ (18). In this study, the results of the RTA analysis for Underhill as well as unpublished residence time analysis performed for Brigantine for electricity generating sources were compared to emission inventories to assess the degree of correspondence between the receptor model results and the known emissions patterns.
Methods In the Underhill study, CAPITA Monte Carlo trajectories (19) were employed in two ensemble backward trajectory techniques. the Potential Source Contribution Function (PSCF) and Residence-Time Analysis (RTA), to evaluate and interpret the results from the multivariate statistical models. The PSCF and RTA techniques are similar in that both apply spatial grids to track and sort the trajectories as a function of metrics derived from the ambient measurement data (in this case, the daily source contributions identified by the mathematical 10.1021/es049816g CCC: $30.25
2005 American Chemical Society Published on Web 09/17/2005
FIGURE 1. Areas of highest probability of emissions from oil-fired power plants (left) and coal-fired power plants (right) as identified in the residence time analysis for Underhill, VT.
FIGURE 2. Areas of highest probability of emissions from oil-fired power plants (left) and coal-fired power plants (right) as identified in the residence time analysis for Brigantine, NJ. models). The trajectory techniques differ in the domains, orientations, and size of the trajectory tracking grid cells, the metrics employed to attribute trajectories to grid cells, and the metrics calculated from the gridded results. These differences are described by Poirot et al. (1). In the RTA approach, a variety of different metrics can be applied to the resultant counts of hours in the equal-area grid squares (17). One set of RTA metrics, referred to as “concentration-based sorting” begins with the conversion of the gridded trajectory hours to “probability fields” in which, for a given scenario of dates, the “upwind probability” of trajectory location in a given grid square is defined as the fraction of hours in that square compared to the total hours in all 1440 squares. An “everyday probability field” is calculated for a scenario of all sample days at the receptor, and provides an indication of areas most likely to be upwind of the receptor on a long-term or climatological basis. A “high day probability field” can be calculated for various definitions of “high” contributions at the receptor, for example upper 50th, 75th, or 90th percentile days, etc. The “incremental probability” for a given high day scenario is defined as the difference between the high day and everyday probability fields. A similar RTA analysis was also made for Brigantine, NJ based on source contributions derived by Positive Matrix Factorization of the IMPROVE network data by Lee et al. (18)
for the time interval from September 1991 to May 1999. The trajectories used in this analysis were calculated with HYSPLIT 4 (20), and reproduced here with the ATAD model (21). The factor analyses identified coal-fired and oil-fired power plants as contributors to the measured aerosol mass at Underhill and Brigantine. Using the RTA analyses, it is possible to identify the areas of highest probability related to the amounts of contributions from these two major source types. Figure 1 shows the areas of highest probability for coal-fired and oil-fired power plants to Underhill while Figure 2 provides the analogous plots for Brigantine. These results can be compared with the SO2 emissions estimates since the contributions from these sources will be proportional to the SO2 emissions. Figure 3 (left) shows the 1998 EPA EGRID SO2 emissions (22). EGRID’s default source for SO2, NOx, and CO2 data is EPA’s Emissions Tracking System/Continuous Emissions Monitoring (ETS/CEM). The oil emissions are multiplied by 10 to more clearly show their areal pattern. The point emissions are converted into grid emissions on a 100 × 100 km grid scale (center panel). These grid values are then plotted as contours in the right panel of Figure 3. The joint probability fields for the oil-fired power plants and the coal-fired power plants can be developed by averaging the probability fields for the two sampling sites VOL. 39, NO. 20, 2005 / ENVIRONMENTAL SCIENCE & TECHNOLOGY
9
7981
FIGURE 3. Major areas of SO2 emissions from coal-fired and oil-fired power plants derived from the 1998 EPA EGRID SO2 emissions inventory. Left panel shows the EPA emissions with oil values multiplied by 10. Middle panel shows these emissions converted to 100 × 100 km gridded emissions density. The right panel shows the interpolated emissions as contour plots.
FIGURE 4. Trajectory probability fields for emissions from oil-fired and coal-fired power plants compared with the emissions areas derived from the EPA emissions inventory. (Underhill and Brigantine). These joint probability fields are compared with the emissions contours in Figure 4 for oil(left) and coal-fired (right) power plants. The trajectory ensemble method has clearly provided a very good overlap with the emissions areas. Part of the oil emissions are not included in the identified area because there is less probability of transport from the eastern area of the oil emissions region to either of the sampling sites.
j, and Cij is a measure of the proximity of i and j in terms of variable values. D and C are defined as follows (when the units of x and y are the number of grid cells):
Spatial Correlation Index
where gj and hf are the average values of g and f, respectively, over the entire images. The product of very high or very low similar f and g values have the greatest contribution to C value. When two maps are similar, the weight (D) for the product of the high g and f is large because of the short distance. Haining (23) indicates that the Γ values are relatively insensitive to outliers. If there is an outlier in one map and near the outlier location, there are no high values in the other map, then this outlier contributes very little to the Γ value. If the outliers in A and B are very close, the Γ will be large. However, this possibility is very small.
It is then possible to compare the pairs of maps using a quantitative spatial correlation index as described by Haining (23). Each map is considered to be a matrix defined by the latitude and longitude with the (x, y) value being the spatial variable being plotted (probability or emissions). The two images are converted to grayscale and then normalized by dividing by 255 so values in these two matrixes range from 0 to 1. The two spatial data matrixes g and f are vectorized by unfolding each matrix into a vector. For location i, xi, yi is the coordinate and gi and fi are the emission rate or predicted probability value. The g and f series were respectively scaled between 0 and 1. A generalized spatial autocorrelation statistic, Γ, is used to describe the association between the two maps (23):
Γ)
∑D C
ij ij
(1)
ij
Dij is a measure of the spatial proximity of locations i and 7982
9
ENVIRONMENTAL SCIENCE & TECHNOLOGY / VOL. 39, NO. 20, 2005
Dij ) 1/[(xi - xj)2 + (yi - yj)2 + 1]
(2)
Cij ) [(gi - gj)(fj - hf ) + (gj - gj)(fi - hf )]/2
(3)
Constructing a Test The null hypothesis (H0): there is no spatial correlation between g and f. Alternative hypotheses were not tested. Under a randomization assumption, the two maps are a possible arrangement of all the n values. The n! permutations of n values are used to provide the reference distribution (n is the number of all grid cells investigated). The first two
moments of Γ under this randomization assumption (23) are
ER(Γ) ) S0T0/n(n - 1)
(4)
ER(Γ2) ) S1T1/2n(2) + (S2 - 2S1)(T2 - 2T1)/4n(3) + (S02 + S1 - S2)(T02 + T1 - T2)/n(4) (5) where
S0 )
1 (Gij + Gji)2; S2 ) (Gi,a + 2 ij i Gi,b)2;Gi,a ) Gij; Gi,b ) Gji; n(b) )
∑G ; S ij
ij
1
)
∑
∑ j
∑ ∑ j
n(n - 1)(n - 2)... (n - b + 1)
with Tk being the same as Sk but with Gij replaced by Cij. The Γ index can be transformed by the following equation, and the obtained r is treated as N(0,1) distribution.
r ) (Γ - E(Γ))/(E(Γ2) - (E(Γ))2)1/2 The accumulated probability from -∞ to 1.6 for N(0,1) is 95%, and by using this one-tailed test, r values over 1.6 will result in the rejection of H0. The emission rates and RTA probabilities for the coalfired and oil-fired power plant sources presented in Figure 4 were compared. To reduce the total number of points over the high-resolution images (approximately 300 by 300 pixels), grid cells for the spatial analysis were defined to be 6 by 6 pixels. The values over these 36 pixels were then averaged to produce the value for a single grid cell. The resulting r values were r ) 149 for coal and r ) 107 for oil. These r values are very large compared to the test value of 1.6 and, thus, both pairs of maps show very strong spatial correlations. The trajectory ensemble analysis does not produce a quantitative apportionment of the amount of emissions that results in particulate matter at the two receptor sites, but it does clearly show the correct locations of these two sources and thus, provides useful information on the source/receptor relationships.
Acknowledgments We acknowledge the work of Dr. Alexander Polissar for the PMF results from Underhill and Dr. Jong Hoon Lee for the PMF results from Brigantine.
Literature Cited (1) Poirot, R. L.; Wishinski, P. R.; Hopke, P. K.; Polissar, A. V. Comparative Application of Multiple Receptor Methods to Identify Aerosol Sources in Northern Vermont. Environ. Sci. Technol. 2001, 35, 4622-4636. (2) Seigneur, C.; Pai, P.; Louis, J.-F.; Hopke, P. K.; Grosjean, D. Review of Air Quality Models for Particulate Matter; Document No. CP015-97-1b; American Petroleum Institute: Washington, DC, 1997. (3) Ashbaugh, L. L. A Statistical Trajectory Technique for Determining Air Pollution Source Regions. J. Air. Pollut. Control Assoc. 1983, 33, 1096-1098. (4) Ashbaugh, L. L.; Malm, W. C.; Sadeh, W. Z. A Residence Time Proability Analysis of Sulfur Concentrations at Grand Canyon National Park. Atmos. Environ. 1985, 19, 1263-1270.
(5) Malm, W. C.; Johnson, C. E.; Bresch, J. F. Application of Principal Component Analysis for Purposes of Identifying SourceReceptor Relationships. In Receptor Methods for Source Apportionment; Pace, T. G., Ed.; Air Pollution Control Association: Pittsburgh, PA, 1986; pp 127-148. (6) Cheng, M. D.; Hopke, P. K.; Zeng, Y. A Receptor Methodology for Determining Source Regions of Particle Sulfate Composition Observed at Dorset, Ontario. J. Geophys. Res. 1993, 98, 1683916849. (7) Cheng, M. D.; Hopke, P. K.; Barrie, L. A.; Rippe, A.; Olson, M.; Landsberger, S. Qualitative Determination of Source Regions of Long-Range Transported Aerosol Using Data Collected at Canadian High Arctic. Environ. Sci. Technol. 1993, 27, 20632071. (8) Cheng, M. D.; Gao, N.; Hopke, P. K. Source Apportionment Study of Nitrogen Species Measured in Southern California in 1987. J. Environ. Eng. 1995, 122, 183-190. (9) Gao, N.; Cheng, M. D.; Hopke, P. K. Potential Source Contribution Function Analysis and Source Apportionment of Sulfur Species Measured at Rubidoux, CA during the Southern California Air Quality Study, 1987. Anal. Chim. Acta 1993, 277, 369-380. (10) Gao, N.; Cheng, M. D.; Hopke, P. K. Receptor Modeling for Airborne Ionic Species Collected in SCAQS, 1987. Atmos. Environ. 1994, 28, 1447-1470. (11) Gao, N.; Hopke, P. K.; Reid, N. W. Possible Sources of Some Trace Elements Found in Airborne Particles and Precipitation in Dorset, Ontario. J. Air Waste Manage. Assoc. 1996, 46, 10351047. (12) Hopke, P. K.; Barrie, L. A.; Li, S.-M.; Li, C.; Cheng, M. D.; Xie, Y. Possible Sources and Preferred Pathways for Biogenic and Non-Seasalt Sulfur for the High Arctic. J. Geophys. Res. 1995, 100, 16595-16603. (13) Polissar, A. V.; Hopke, P. K.; Harris, J. M. Source Regions for Atmospheric Aerosol Measured at Barrow, Alaska. Environ. Sci. Technol. 2001, 35, 4214-4226. (14) Polissar, A. V.; Hopke, P. K.; Poirot, R. L. Atmospheric Aerosol over Vermont: Chemical Composition and Sources. Environ. Sci. Technol. 2001, 35, 4604-4621. (15) Cheng, M. D.; Lin, C. J. Receptor modeling for smoke of 1998 biomass burning in Central America. J. Geophys. Res. 2001, 106, 22871-22886. (16) Begum, B. A.; Kim, E.; Jeong, C.-H.; Lee, D.-W.; Hopke, P. K. Evaluation of the Potential Source Contribution Function using the 2002 Quebec Forest Fire Episode. Atmos. Environ. 2005, 39, 3719-3724. (17) Poirot, R. L.; Wishinski, P. R. Visibility, Sulfate, and Air Mass History Associated with the Summertime Aerosol in Northern Vermont. Atmos. Environ. 1986, 20, 1457-1469. (18) Lee, J. H.; Yoshida, Y.; Turpin, B. J.; Hopke, P. K.; Poirot, R. L.; Lioy, P. J.; Oxley, J. C. Identification of Sources Contributing to the Mid-Atlantic Regional Aerosol. J. Air Waste Manage. Assoc. 2002, 52, 1186-1205. (19) Schichtel, B. A.; Husar, R. B. Regional simulation of atmospheric pollutants with the CAPITA Monte Carlo model. J. Air Waste Manage. Assoc. 1996, 47, 331-343. (20) Draxler, R. R.; Rolph, G. D. HYSPLIT (HYbrid Single-Particle Lagrangian Integrated Trajectory) Model; access via NOAA ARL READY Website (http://www.arl.noaa.gov/ready/hysplit4.html); NOAA Air Resources Laboratory: Silver Spring, MD, 2003. (21) Hefter, J. L. Air Resources Laboratories Atmospheric Transport and Dispersion Model (ARL-ATAD); NOAA Technical Memorandum ERL ARL-81; Rockville, MD, 1980. (22) U.S. Environmental Protection Agency. Emissions & Generation Resource Integrated Database Website (http://www.epa.gov/ cleanenergy/egrid/index.html); Office of Air and Radiation, U.S. EPA: Washington, DC, 2004. (23) Haining, R. Spatial Data Analysis in the Social and Environmental Sciences; Cambridge University Press: Cambridge, UK, 1990; pp 228, 324.
Received for review February 4, 2004. Revised manuscript received July 8, 2005. Accepted August 16, 2005. ES049816G
VOL. 39, NO. 20, 2005 / ENVIRONMENTAL SCIENCE & TECHNOLOGY
9
7983