Environ. Sci. Technol. 1994, 28, 823-832
Vehicle-Related Hydrocarbon Source Compositions from Ambient Data: The GRACE/SAFER Method
Ronald C. Henry,† Charles W. Lewis,* and John F. Collins†
Civil Engineering Department, Environmental Engineering Program, University of Southern California, 3620 South Vermont Avenue, Los Angeles, California 90089-2531, and Atmospheric Research and Exposure Assessment Laboratory, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711
The compositions of three volatile hydrocarbon sources (emissions from vehicles in motion, evaporation of whole gasoline, and gasoline headspace vapor) have been derived from 550 ambient, hourly concentration measurements of 37 C2-C9 volatile organic compounds (VOC). The measurements were made by automated gas chromatograph in Atlanta, GA, during the summertime of 1990. The source compositions were obtained by a novel combination of graphical analysis and multivariate receptor modeling methodologies: GRACE (Graphical Ratio Analysis for Composition Estimates) and SAFER (Source Apportionment by Factors with Explicit Restrictions). For the relatively unreactive hydrocarbon species, the ambient-derived source compositions were in good agreement with direct source measurements made in Atlanta concurrent with the ambient measurements. The prominence of the whole gasoline profile in the ambient data was an unexpected result. The GRACE/SAFER method may provide a cost-effective alternative to the usual direct source measurement of profiles.

Introduction

Source profiles, giving the fractional amount of each chemical species that together constitute the total emissions of an air pollutant source, have a variety of uses. For volatile hydrocarbon (VHC) sources, applications include preparing emissions inventories for exposure assessment of air toxics and for the design of ambient ozone abatement strategies, as well as use in the chemical mass balance (CMB) form of receptor modeling, the application of most interest to us. These applications require source profiles that are both accurate and representative of the source emissions in the airshed under consideration. Obtaining profiles by direct source measurement is laborious and expensive, and the result may still not capture the full range of variability of the source emissions. An appealing alternative is the possibility of using a multivariate procedure to extract profiles from the ambient data.
Proposed approaches to obtaining ambient-derived profiles have included target transformation factor analysis (1), absolute principal component analysis (2), specific rotation factor analysis (3), three-mode factor analysis (4), source profiles by unique ratios (5), and source apportionment by factors with explicit restrictions (SAFER) (6). All were originated for deriving aerosol elemental composition profiles, but there is no fundamental reason not to consider VHC applications. Of these approaches, SAFER is the only one that includes physical constraints, such as nonnegativity of the
† University of Southern California.
* U.S. Environmental Protection Agency.
0013-936X/94/0928-0823$04.50/0 © 1994 American Chemical Society
source compositions, as an integral part of the profile derivation procedure. An innovative use of simple physical constraints in limiting the solution space to a physically feasible region has been demonstrated by White and Macias (7). More fundamentally, Henry (8) has argued that factor analysis methods (which include most of the previous list) are mathematically ill-posed unless they carry sufficient physical constraints to guarantee a unique, physically valid solution. For this reason, we consider the SAFER approach to be the most promising.

In this article, we use SAFER to obtain from ambient data three vehicle-related VHC profiles: emissions from roadways (vehicle tailpipe + running losses), gasoline headspace vapor, and whole gasoline. Gasoline headspace vapor represents the partial evaporation of gasoline in situations such as storage tank evaporation or vehicle diurnal evaporation, for which the composition is that of vapor in equilibrium with the liquid at the relevant temperature. Whole gasoline represents the complete evaporation of gasoline, arising from situations such as spillage, leakage, and vehicle hot-soak emissions.

The ambient data set utilized is from the 1990 Atlanta Ozone Precursor Study (9, 10). The data set was ideally suited for this multivariate application because of its several hundred hourly ambient sampling periods for which a large number of VHC species concentrations were quantified. In addition, there existed vehicle-related profiles that had been obtained by direct source measurements in Atlanta in the same period during which the ambient measurements were done (11, 12). This made possible an objective validation of the SAFER-derived results.

Finally, a new tool for developing profiles is described: Graphical Ratio Analysis for Composition Estimates (GRACE). GRACE is a mathematically simple procedure that is useful both for generating constraints for input to SAFER and, by itself, for generating approximate profiles.
The profiles derived by GRACE/SAFER have been employed in CMB analyses of the 1990 Atlanta ambient data with good results (13). The methodology should be useful in analyzing the large amount of VHC data that will be produced in the 22 U.S. metropolitan ozone nonattainment areas included in the photochemical assessment monitoring stations (PAMS) network (14).

Ambient Measurements

The ambient VHC measurements used in this paper were taken from the 1990 Atlanta Ozone Precursor Study (9, 10). The massive database resulting from this study includes hourly measurements of VHC species from automated gas chromatographic systems (Chrompack Inc.) with flame ionization detection (GC-FID), operated at six ground-level sites throughout Atlanta during July and August of 1990. To minimize possible data quality
problems that have been identified in this database (15), the data utilized in this paper are only those from the Georgia Institute of Technology site, which is representative of downtown Atlanta and is thought to have data of high quality. Chromatogram processing using MetaChrom software (10) quantified 47 species from ethene through 1,2,4-trimethylbenzene in addition to total non-methane organic compounds (TNMOC). The targeted species correspond closely to those prescribed for the PAMS network (16). The TNMOC parameter was defined (10) as the sum of all chromatographic peaks (identified and unidentified) from ethene through 1,2,3,5-tetramethylbenzene. However, for purposes of this analysis, we have chosen to use a "total" defined as the sum of the 37 hydrocarbon species remaining after the screening described below, naming it TNMHC (total non-methane hydrocarbons). In the final screened data set used in this study, the mean of TNMOC is 338.7 ppb of C (standard deviation 208.2 ppb of C), and the mean of TNMHC (37 species) is 222.2 ppb of C (standard deviation 158.1 ppb of C). The correlation coefficient of TNMOC and TNMHC is 0.993. Except for inclusion of the TNMHC parameter, the steps described above had been carried out prior to our obtaining the data set. Hence, our starting point was the "public" data set from the 1990 Atlanta Ozone Precursor Study (which does not include the chromatograms themselves). Our modifications (mainly deletions) of the public data set are described next.
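As an illustration of how a species-sum total such as TNMHC relates to a reported total such as TNMOC, the following minimal Python sketch sums the species in each hourly record and computes a Pearson correlation coefficient between the two totals. The function names and the small data set are hypothetical and chosen only for the example; they are not the Atlanta measurements, and the sketch is not the software used in this study.

```python
from math import sqrt
from statistics import mean

def species_sum(record):
    """Sum the species concentrations (ppb of C) in one hourly record."""
    return sum(record)

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: four hourly records of three species (ppb of C),
# with a separately reported total for each hour.
records = [[10.0, 5.0, 2.0], [20.0, 9.0, 5.0], [15.0, 8.0, 3.0], [30.0, 16.0, 7.0]]
reported_total = [25.0, 50.0, 38.0, 80.0]

totals = [species_sum(r) for r in records]   # species-sum total per hour
r = pearson(totals, reported_total)          # agreement between the two totals
```

A correlation close to 1, as reported above for TNMOC and TNMHC, indicates that the species-sum total tracks the full chromatographic total even though its absolute level is lower.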
Ambient Data Screening

The initial data set consisted of 662 records of concentrations in ppb of C for 47 VHC species plus TNMOC. However, only about 200 of these records were complete. The remaining records had data for one or more species missing (i.e., below detection limits). Initial screening showed that the missing data were concentrated in a few species. Because the SAFER model used in this study relies on a principal component analysis (PCA), only complete data records can be used by the model. Thus, it was desirable to remove from the data set those species having large amounts of missing data. In fact, species with more than about 20 missing values were dropped. In addition, isobutane was dropped from the data set due to suspected analytical problems.

The data were next screened by examining scatterplots of VHC concentrations for each species versus every other species. This screening step revealed three types of problems: a large number of zero concentration values for some species, outliers, and an apparent bias in ethane concentrations. The scatterplots showed that the zero values were not representative of small concentrations, but actually represented missing data. This conclusion was derived from the scatterplots by observing that zero concentrations for a given species occurred while concentrations for the remaining species ranged well above their detection limits. Thus, zero concentrations were converted to missing data, and the screening to remove species with large amounts of missing data was repeated. The 37 surviving species are listed in Table 1.

The scatterplots also revealed about 50 data records that contained outliers with respect to the main body of data (as determined by visual inspection of the scatterplots). After records that had outliers and missing values
were dropped, 550 complete records remained in the final data set. It is important to note that the outliers' concentrations were not entirely random, but formed small, self-consistent groups. Thus, it is possible that some of the outliers deleted from the data set represent real but very infrequently occurring conditions, such as a large impact from an intermittent local source. However, these real but very infrequent events confound the statistical description of the main body of data upon which the model depends. Since the present study focuses on vehicle-related impacts that are ubiquitous throughout the data set, it was thought appropriate to remove the outlier records.

When ethane was plotted against most other species, a concentration offset of about 3.5 ppb of C was evident. This is consistent with a previous observation (10) and was corrected for by subtracting this amount from the ethane data.
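The automatable parts of this screening can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually run on the Atlanta data: the function names and the toy records are hypothetical, the missing-value threshold is a parameter (the text above uses "about 20"), and the outlier removal, which was done by visual inspection of scatterplots, is not represented.

```python
def zeros_to_missing(records):
    """Treat zero concentrations as missing values (None)."""
    return [[None if v == 0 else v for v in row] for row in records]

def drop_sparse_species(records, names, max_missing=20):
    """Drop species (columns) that have more than max_missing missing values."""
    keep = [j for j in range(len(names))
            if sum(row[j] is None for row in records) <= max_missing]
    kept_records = [[row[j] for j in keep] for row in records]
    return kept_records, [names[j] for j in keep]

def complete_records(records):
    """Keep only records with no missing values, as PCA requires."""
    return [row for row in records if all(v is not None for v in row)]

def subtract_offset(records, names, species="ethane", offset=3.5):
    """Subtract a constant offset (ppb of C) from one species."""
    j = names.index(species)
    return [[v - offset if k == j else v for k, v in enumerate(row)]
            for row in records]

# Hypothetical toy data (ppb of C); the third species is mostly zero.
names = ["ethane", "ethene", "speciesX"]
raw = [[5.5, 2.0, 0.0], [6.5, 3.0, 0.0], [0.0, 4.0, 1.0], [7.5, 5.0, 2.0]]

data = zeros_to_missing(raw)
data, names = drop_sparse_species(data, names, max_missing=1)  # toy threshold
data = complete_records(data)
data = subtract_offset(data, names)
```

Dropping sparse species before dropping incomplete records preserves more records, which is presumably why the species screening was repeated after the zeros were reclassified as missing.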
Source Measurements

Three types of measured source profiles specific to Atlanta in the summertime of 1990 were available: highway tunnel measurements, gasoline headspace, and whole gasoline (11, 12). The tunnel profile was derived from nine canister samples collected on three separate occasions in the underpass of the interchange of interstate highways 20 and 75/85 near downtown Atlanta. All samples were taken between 7 and 8 A.M. during August 1990. The headspace and whole gasoline profiles were derived from unleaded gasoline samples of all three octane grades of the six major brands being sold in Atlanta during August 1990. There were no sales of leaded gasoline at this time. The GC-FID measurements for all three profile types were performed at the EPA's Atmospheric Research and Exposure Assessment Laboratory. The definition of TNMHC used to normalize the resulting profiles was the same as that used for the ambient data (37-species sum). For the nine tunnel samples, the coefficient of variation, averaged over species with abundances >0.2%, was 11%. For the six headspace and whole gasoline samples of a particular octane grade, the corresponding average coefficients of variation ranged from 24% to 40%, mostly reflecting real differences between brands rather than measurement uncertainties. Details for individual species are given in Conner et al. (11).
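The averaged coefficient of variation quoted above can be illustrated with a short Python sketch. It assumes that each sample is first normalized to fractions of its species sum, that the sample standard deviation is used, and that the abundance cutoff is applied to the mean fraction; the text does not specify these details, so they are assumptions, and the function names and numbers are hypothetical.

```python
from statistics import mean, stdev

def normalize(sample):
    """Convert species concentrations to fractions of their sum."""
    total = sum(sample)
    return [v / total for v in sample]

def average_cv(samples, min_abundance=0.002):
    """Mean coefficient of variation over species whose mean fractional
    abundance exceeds min_abundance (0.002 corresponds to 0.2%)."""
    profiles = [normalize(s) for s in samples]
    cvs = []
    for j in range(len(profiles[0])):
        column = [p[j] for p in profiles]
        m = mean(column)
        if m > min_abundance:
            cvs.append(stdev(column) / m)  # sample standard deviation (assumed)
    return mean(cvs)

# Hypothetical two-species example with two samples.
cv = average_cv([[1.0, 1.0], [3.0, 1.0]])
```

Excluding low-abundance species keeps the average from being dominated by species measured near the detection limit, where relative scatter is large.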
Graphical Ratio Analysis for Composition Estimates (GRACE)

Scatterplots were generated for each species against every other species and TNMHC. Figure 1 is an example showing scatterplots of every species against acetylene. In each plot of this figure, the horizontal axis is acetylene and the vertical axis is the species identified by the number (see Table 1) in the lower right-hand corner of the plot. A striking feature of many of the plots is the sharply defined linear lower boundary of the distribution. For each such species, this means that a given concentration of acetylene is always accompanied by a minimum concentration of that species, which is proportional to acetylene. It is well known that the predominant source of urban atmospheric acetylene is motor vehicle exhaust, which accounts, for example, for 90% in Los Angeles according to the estimate of Harley et al. (17). Under the assumption that acetylene
Table 1. Species Included in the Analysis

species  abbreviation  chemical name                                         common name
1        ethene        ethene                                                ethylene
2        acetylene     ethyne                                                acetylene
3        ethane        ethane
4        propene       propene                                               propylene
5        propane       propane
6        nButane       n-butane
7        t2Butene      trans-2-butene
8        c2Butene      cis-2-butene
9        iPentane      2-methylbutane                                        isopentane
10       1Pentene      1-pentene
11       nPentane      n-pentane
12       isoprene      3-methyl-1,3-butadiene                                isoprene
13       2M2Buten      2-methyl-2-butene
14       CyPten+...    cyclopentene + 4-methyl-1-pentene
15       2MPentan      2-methylpentane
16       3MPentan      3-methylpentane
17       2M1Pente      2-methyl-1-pentene
18       nHexane       n-hexane
19       t2Hexene      trans-2-hexene
20       MCyP+2,4-DMP  methylcyclopentane + 2,4-dimethylpentane
21       benzene       benzene
22       CyHx+2MHx     cyclohexane + 2-methylhexane
23       2,3-DMP       2,3-dimethylpentane
24       3MHexan       3-methylhexane
25       2,2,4-TMP     2,2,4-trimethylpentane                                isooctane
26       nHeptane      n-heptane
27       MCyHex        methylcyclohexane
28       2,3,4-TMP     2,3,4-trimethylpentane
29       Tol+...       methylbenzene + 2,3-dimethylhexane + 2-methylheptane  toluene
30       3MHeptan      3-methylheptane
31       nOctane       n-octane
32       EBenzene      ethylbenzene
33       m/p-Xy        1,3-dimethylbenzene + 1,4-dimethylbenzene             m-xylene + p-xylene
34       o-Xy+nNo      1,2-dimethylbenzene + n-nonane                        o-xylene
35       nPBenz        n-propylbenzene
36       1,3,5-TMB     1,3,5-trimethylbenzene
37       1,2,4-TMB     1,2,4-trimethylbenzene

Species 14, 20, 22, 29, 33, and 34 are the sum of co-eluting species.
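The scatterplot-edge reasoning of this section can be mimicked numerically. The Python sketch below is only an approximation of the idea: GRACE as described here is a graphical procedure in which edges are identified by eye, whereas the sketch assumes that the lower edge passes through the origin and substitutes the minimum and a low quantile of the species-to-tracer ratios for the "extreme" and "main body" edges. The function name, the quantile choice, and the synthetic data are all hypothetical.

```python
def edge_ratio_bounds(tracer, species, quantile=0.05):
    """Estimate a range for the species-to-tracer ratio in the tracer's source.

    Returns (lowest ratio, low-quantile ratio). The lowest ratio stands in
    for an edge drawn through the extreme points; the low-quantile ratio
    stands in for an edge drawn along the main body of data. Both are
    assumptions replacing a visual reading of the scatterplot.
    """
    ratios = sorted(s / t for t, s in zip(tracer, species) if t > 0)
    if not ratios:
        raise ValueError("no records with positive tracer concentration")
    idx = round(quantile * (len(ratios) - 1))
    return ratios[0], ratios[idx]

# Synthetic data (not Atlanta measurements): 21 records constructed so that
# one point has ratio 0.10, five lie on a 0.19 edge, and the rest lie above
# it, as if a second, independent source added to the species.
tracer = [float(i) for i in range(1, 22)]
true_ratios = [0.10] + [0.19] * 5 + [0.19 + 0.05 * k for k in range(1, 16)]
species = [t * r for t, r in zip(tracer, true_ratios)]

low, body = edge_ratio_bounds(tracer, species)
```

With real data the two numbers bracket the ratio only if the tracer assumption holds, that is, if the tracer species has no significant source other than the one being profiled.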
in Atlanta has essentially no source other than tailpipe emissions, the lower edge in the plot defines the ratio of the species to acetylene in the source for which acetylene is a tracer, i.e., emissions from motor vehicles in motion. The recognition and quantitative use of such scatterplot edges is the essence of the GRACE method.

Not all species in Figure 1 have scatterplots with acetylene that exhibit sharp lower edges. The scatterplots of ethene and propene, which like acetylene are found mostly in vehicle exhaust, are essentially linear, except for measurement errors. At the other extreme, isoprene and cyclopentene (plus co-eluting 4-methyl-1-pentene) show virtually no relationship to acetylene. This is not surprising for isoprene, whose dominant source is surely biogenic emissions (13). In addition, isoprene and cyclopentene are highly reactive in the atmosphere, and the resulting changes in their concentrations distort the relationships with other species that might otherwise suggest a common source heritage. Most of the species in Figure 1 have sharp lower edges, placing them between these two extremes, and it is from these species that source composition information can be deduced by the GRACE method.

The conclusion that acetylene appears to be such a well-defined fraction of motor vehicle emissions may at first seem incompatible with the highly variable emission fractions for this species that have been measured from individual vehicles (18). Ambient concentrations, however,
arise from the emissions contributions of an enormous number of vehicles, with the result that the average emission fraction is a stable quantity. It is interesting to note that the tight linear relationship between ethene and acetylene (species 1 in Figure 1) suggests that ethene would serve as well as acetylene as a starting point for a GRACE analysis.

In the following, the GRACE method is explained in detail for 2,3-dimethylpentane (2,3-DMP) and acetylene. Consider the lower edge of the wedge-shaped region in Figure 2, the GRACE plot of 2,3-DMP versus acetylene. The slope of the lower edge along the main body of data is about 0.19, while the slope along the extreme lowest points is about 0.10. Assuming that acetylene is present only in roadway emissions, the ratio of 2,3-DMP to acetylene in roadway emissions must be in the range of 0.10-0.19. There are many data points falling well above the 0.19 line, indicating that there is at least one source of 2,3-DMP that occurs independently of roadway emissions. It is possible that the other source or sources of 2,3-DMP are always present to some extent. If this is so, then the ratio of 0.19 is certainly an upper limit on the ratio of 2,3-DMP to acetylene in exhaust. In this case, the lower limit of the ratio might be less than the 0.10 value from the GRACE plot; however, this lower limit is based on only a few extreme points, and it seems unlikely that the ratio of 2,3-DMP to acetylene in roadway emissions is less than