Environ. Sci. Technol. 2001, 35, 2314-2318

Resolution of GC-MS Data of Complex PAC Mixtures and Regression Modeling of Mutagenicity by PLS
INGVAR EIDE,*,† GUNHILD NEVERDAL,† BODIL THORVALDSEN,† HAILIN SHEN,‡ BJØRN GRUNG,‡ AND OLAV KVALHEIM‡
Statoil Research Centre, N-7005 Trondheim, Norway, and Department of Chemistry, University of Bergen, N-5007 Bergen, Norway

The present work describes a strategy to predict the mutagenicity of very complex mixtures of polycyclic aromatic compounds (PAC) from gas chromatography-mass spectrometry (GC-MS) patterns of the mixtures, each containing 260 compounds on average. The mixtures, 13 organic extracts of exhaust particles, were characterized by full scan GC-MS. The data were resolved into peaks and spectra for individual compounds by an automated curve resolution procedure. Similarity between spectra was evaluated for peaks that appeared within a time interval of 4 min, using a similarity index of 0.8 to ascertain that the same compound was represented by the same variable name (retention time) in all samples. The resolved chromatograms were integrated, resulting in a predictor matrix of size 13 × 721, which was used as input to a multivariate regression model. Partial least-squares projection to latent structures (PLS) was used to correlate the GC-MS chromatograms to mutagenicity as measured in the Ames Salmonella assay. The best model (high r² and Q²) was obtained with 52 variables. These variables covary with the observed mutagenicity and may subsequently be identified chemically. Furthermore, the regression model can be used to predict mutagenicity from GC-MS chromatograms of other organic extracts.
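The peak-matching rule in the abstract (same compound if two resolved peaks lie within 4 min of each other and their spectra reach a similarity index of 0.8) can be sketched as follows. This section does not define the similarity index, so the sketch assumes cosine similarity between intensity vectors on a shared m/z grid, and it reads "within a time interval of 4 min" as an absolute retention-time difference of at most 4 min. The function names and the example spectra are hypothetical.

```python
import math

def cosine_similarity(spec_a, spec_b):
    """Cosine of the angle between two spectra given as intensity lists
    on the same m/z grid (assumed stand-in for the similarity index)."""
    dot = sum(a * b for a, b in zip(spec_a, spec_b))
    norm = math.sqrt(sum(a * a for a in spec_a)) * math.sqrt(sum(b * b for b in spec_b))
    return dot / norm if norm else 0.0

def same_compound(peak_a, peak_b, window_min=4.0, threshold=0.8):
    """Each peak is a (retention_time_min, spectrum) tuple.
    Peaks match only if they are close in time AND similar in spectrum."""
    rt_a, spec_a = peak_a
    rt_b, spec_b = peak_b
    if abs(rt_a - rt_b) > window_min:
        return False
    return cosine_similarity(spec_a, spec_b) >= threshold

# Hypothetical resolved peaks from two samples (four m/z channels each).
reference = (25.3, [0, 10, 80, 5])
candidate = (26.1, [1, 12, 75, 4])    # close in time, similar spectrum
coeluting = (26.0, [90, 5, 0, 10])    # close in time, different spectrum

print(same_compound(reference, candidate))
print(same_compound(reference, coeluting))
```

Peaks that pass both checks would share one column (variable) of the predictor matrix; peaks that fail either check would be kept as separate variables.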

* Correspondence should be addressed to this author at Statoil Research Centre, N-7005 Trondheim, Norway. Phone: +47 73584595. Fax: +47 73967286. E-mail: [email protected].
† Statoil Research Centre.
‡ University of Bergen.

ENVIRONMENTAL SCIENCE & TECHNOLOGY / VOL. 35, NO. 11, 2001

Introduction
Organic extracts of exhaust particles, generated by the combustion of, e.g., fossil fuels, contain a variety of different polycyclic aromatic compounds (PAC) including polycyclic aromatic hydrocarbons (PAH), nitro-PAH, and oxy-PAH (1). Many of these are mutagenic and carcinogenic. It is difficult to identify and quantify all PAC in such complex mixtures, and even more difficult to predict their combined toxic or mutagenic effect. Different strategies have been described for the toxicological evaluation of mixtures: integrative (studying the mixture as a whole), dissective (dissecting or fractionating a mixture to determine causative constituents), and synthetic (studying interactions between agents in simple combinations) (2). In a recent study, fractionation of organic extracts followed by recombination of the fractions was introduced as a strategy for toxicological evaluation of mixtures (3). Spiking has been used in the evaluation of mutagenicity of complex mixtures by adding individual PAC to the mixtures (4, 5). In studies using either the synthetic approach, fractionation and recombination, or spiking, well-defined variables may be combined differently to obtain the effect of each variable and possible interactions between them. The composition of the new mixtures may be determined by means of statistical experimental design, which is increasingly being used in mixture research. However, for practical reasons, this approach is only possible with a limited number of variables.

The present work describes a strategy to relate the PAC pattern to the mutagenicity of very complex mixtures, each containing 260 compounds on average. The mixtures are organic extracts of exhaust particles, and are characterized by full scan gas chromatography-mass spectrometry (GC-MS). The first challenge is to resolve the complex GC-MS data into peaks and spectra for individual compounds. This task is performed by an automated curve resolution procedure (6). The integrated chromatographic peaks are used as input to an empirical multivariate regression model, which correlates the GC-MS chromatograms to the mutagenicity in the Ames Salmonella assay. Partial least-squares projection to latent structures (PLS) (7) is used for the regression modeling as it overcomes the problems of intercorrelated predictor variables and data matrixes where the number of variables exceeds the number of samples (8, 9). The regression model identifies those peaks that may represent the major contributors to the observed mutagenicity, and these may subsequently be identified chemically. Furthermore, the regression model can be used to predict mutagenicity from GC-MS chromatograms of other organic extracts.
This is a very attractive possibility since bioassays are generally more resource-demanding and require larger samples than chemical characterization.
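To illustrate why PLS copes with more variables than samples, here is a minimal single-response PLS (PLS1) sketch using the NIPALS deflation scheme. It is an assumption-laden illustration, not the software or settings used in the study: the function names are hypothetical, the toy matrix (4 samples × 6 variables) is invented, and only mean-centering is applied. Each component is a direction in predictor space chosen for its covariance with the response, so the fit never requires inverting the full variable-by-variable matrix.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_pls1(X, y, n_components):
    """Fit PLS1 by NIPALS on mean-centered data.
    X: list of sample rows, y: list of responses."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    components = []
    for _ in range(n_components):
        # Weight vector: covariance of each variable with the y residual.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:          # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # sample scores
        tt = dot(t, t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = dot(f, t) / tt                                     # y loading
        # Deflate: remove what this component explains from X and y.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def predict_pls1(model, x):
    """Predict the response for one new sample row x."""
    x_mean, y_mean, components = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [ev - t * pv for ev, pv in zip(e, p)]
    return y_hat

# Invented toy data: 4 "extracts" x 6 "peak areas", one response each.
X_toy = [
    [1.0, 0.0, 2.0, 0.5, 3.0, 1.0],
    [0.0, 1.0, 1.0, 2.5, 0.0, 2.0],
    [2.0, 1.0, 0.0, 1.0, 1.0, 0.0],
    [1.0, 3.0, 1.0, 0.0, 2.0, 4.0],
]
y_toy = [0.8, 2.1, 1.2, 3.9]

model = fit_pls1(X_toy, y_toy, n_components=3)
for row, observed in zip(X_toy, y_toy):
    print(observed, round(predict_pls1(model, row), 4))
```

With as many components as the centered data allow, the model reproduces the training responses, which is why the number of components is normally chosen by cross-validation (the Q² mentioned in the abstract) and not by fit alone; that validation step is not shown here.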

Materials and Methods
Organic Extracts of PAC. Thirteen different organic extracts of exhaust particles were selected and assumed to have different but overlapping composition. The samples were obtained from the combustion of autodiesel in diesel engines, and heating oil and natural gas in boilers. Dichloromethane (DCM from Merck, Darmstadt, Germany, >99.8%) was used as the solvent. The extracts were evaporated to approximately 10 mL in a turbovapor. In addition to PAC, the extracts were expected to contain some saturated hydrocarbons (alkanes).

Ames Salmonella Assay. Prior to mutagenicity testing, a volume of each of the DCM extracts was evaporated to dryness under dry nitrogen and completely dissolved in dimethyl sulfoxide (DMSO from EMS, Fort Washington, PA, >99.9%). The standard plate incorporation assay as described by Maron and Ames (10) was used for mutagenicity testing. A volume of 100 µL test solution was added to each plate. The Salmonella typhimurium strain TA98 was obtained from Dr. Bruce N. Ames, University of California, Berkeley. The mutagenicity testing was performed without the addition of a metabolizing system. Mutagenicity was expressed as revertants per microgram of particulate matter (slope of linear dose-response curve after linear regression).

GC-MS. A volume of 0.5-1 mL of each DCM extract was spiked with 3.24 µg of naphthalene-d8 (99%, Cambridge Isotope Laboratories, Woburn, MA). The volume of each extract was then reduced under a gentle stream of nitrogen.

10.1021/es000154e CCC: $20.00
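The Ames assay description above expresses mutagenicity as the slope of a linear dose-response curve. A minimal sketch of that calculation by ordinary least squares is given below; the function name and the dose and revertant values are hypothetical and chosen only to show the arithmetic, and the sketch assumes all doses fall in the linear range.

```python
def dose_response_slope(doses_ug, revertants):
    """Least-squares slope of revertant count versus dose,
    i.e. revertants per microgram of particulate matter."""
    n = len(doses_ug)
    mean_x = sum(doses_ug) / n
    mean_y = sum(revertants) / n
    sxx = sum((x - mean_x) ** 2 for x in doses_ug)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(doses_ug, revertants))
    return sxy / sxx

# Hypothetical plate counts: a background of 25 revertants plus 3 per microgram.
doses = [0.0, 10.0, 20.0, 40.0]        # micrograms of particulate matter per plate
counts = [25.0, 55.0, 85.0, 145.0]     # revertant colonies per plate

print(dose_response_slope(doses, counts))
```

The intercept (spontaneous revertants at zero dose) drops out of the slope, so only the dose-dependent increase enters the mutagenicity value used as the response in the regression model.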

© 2001 American Chemical Society. Published on Web 05/03/2001

FIGURE 1. Structure of the data matrixes obtained from the GC-MS data.

FIGURE 2. Schematic illustration of curve resolution and data matrix.

The samples were analyzed by GC-MS. A Fisons GC8000 equipped with a 100 m Petrocol DH Fused Silica Capillary Column (0.25 mm i.d., 0.5 µm film thickness, Supelco) was used for sample introduction into the mass spectrometer. The GC run program began at an initial temperature of 40 °C, ramped to a final temperature of 320 °C at 4 °C/min and held for 20 min. A Fisons MD800 quadrupole mass spectrometer operated in the EI mode (70 eV) was used to obtain mass spectra. The instrument was operated at 1.3 scans/s from m/z 40 to 450 (i.e., full scan mode) to obtain structural information from all important fragments. In addition to the 13 samples, a dilution series of 4 samples was analyzed on the GC-MS to verify the linearity of response factors in the concentration region of the samples (data not shown).

Data Matrix and Curve Resolution. Figure 1 illustrates the structure of the data matrixes obtained from the GC-MS data. Signals from compounds eluting before the internal standard (naphthalene-d8) were not used, as these were assumed to be nonmutagenic. The remaining matrixes were split into smaller ones, each one containing only one cluster of coeluting peaks. Mass numbers containing only background were deleted using a shape criterion for the masses. Finally, mass numbers with intensity