Challenges in the analysis of novel flame retardants in indoor dust

Article pubs.acs.org/est

Cite This: Environ. Sci. Technol. XXXX, XXX, XXX−XXX

Challenges in the Analysis of Novel Flame Retardants in Indoor Dust: Results of the INTERFLAB 2 Interlaboratory Evaluation
Lisa Melymuk,*,† Miriam L. Diamond,‡ Nicole Riddell,§ Yuchao Wan,∥ Šimon Vojta,†,# and Brock Chittim§




† Research Centre for Toxic Compounds in the Environment (RECETOX), Masaryk University, Kamenice 753/5, Brno, 62500 Czech Republic
‡ Department of Earth Sciences, University of Toronto, 22 Russell Street, Toronto, Ontario M5S 3B1, Canada
§ Wellington Laboratories Inc., 345 Southgate Drive, Guelph, Ontario N1G 3M5, Canada
∥ Department of Physical and Environmental Sciences, University of Toronto Scarborough, 1065 Military Trail, Toronto, Ontario M1C 1A4, Canada
Supporting Information

ABSTRACT: The Interlaboratory Study of Novel Flame Retardants (INTERFLAB 2) was conducted by 20 laboratories in 12 countries to test the precision and accuracy of the analysis of 24 “novel” flame retardants (NFRs). Laboratories analyzed NFRs in injection-ready test mixtures, in extracts of residential dust, and in residential dust to evaluate the influence of dust handling and extraction. For test mixtures, mean reported concentrations of PBT, PBEB, EH-TBB, TBBPA, TDBP-TAZTO, TBOEP, α-TBCO, β-DBE-DBCH, and total HBCDD differed by >25% relative to reference values. Coefficients of variation were higher in dusts/dust extracts than in test mixtures. Concentrations among laboratories ranged over 3−4 orders of magnitude for HBB, TBP-DBPE, TBP-AE, and TDCIPP in dust extracts and dusts. Most laboratories produced repeatable dust concentrations, but differences reported in the literature among laboratories of 50% relative standard deviations among measured values and a 25% difference between mean measured values and reference values for decabromodiphenylethane (DBDPE), tris(1,3-dichloropropyl)phosphate (TDCIPP), tetrabromobisphenol A (TBBPA), and gas chromatography-based measurements of hexabromocyclododecane (HBCDD), suggesting poor comparability of results among laboratories for these compounds. The goal of INTERFLAB 2 was to evaluate presumed improvements in analytical performance relative to INTERFLAB 1 and to test the lab performance for the handling and analysis of dust extracts, and sieved and unsieved dust.

native”, “replacement”, or “non-PBDE” flame retardants (referred to here as NFRs), typically consist of either halogenated or organophosphate-based organic compounds. These “alternative” compounds are increasingly the focus of indoor and environmental measurements.22 Analytical methods for the PBDEs were developed and optimized over the past 20 years, including through the use of interlaboratory comparison exercises.23−25 As such, analytical methods are now relatively well established. Validated routine analytical methods exist for PBDEs, such as EPA Method 1614A,26 and reference materials such as NIST SRM 2585: Organic Contaminants in House Dust are certified for PBDEs, facilitating analytical method development. The PBDEs consist of a relatively small group of congeners (typically only 8−10 PBDEs are reported) of the same base structure, enabling more straightforward development of methods. Furthermore, many 13C-labeled standards are available to improve PBDE quantification, and since PBDEs are relatively stable compounds, extracts can be subjected to a more intensive cleanup, resulting in lower chromatographic background, reduced noise/matrix interference, and less impairment to columns and other parts of the analytical instrumentation. None of the analytical advantages of PBDEs translate to the novel FRs. The replacement compounds are structurally heterogeneous, even within the subset of brominated FRs, with structures ranging from bromobenzenes to brominated cyanurates (Table 1). As expected, this has resulted in challenges when analyzing such a large set of diverse compounds using a single analytical method (as evidenced by the findings of INTERFLAB 1),27 and a much larger set than the 8−10 PBDEs is typically reported.
Moreover, many of the FRs are less stable than the PBDEs, leading to less cleanup on sample extracts and thus greater background noise in chromatograms, higher limits of detection, more challenges in identification of compounds, and greater wear on instrumentation, resulting in more frequent cleaning and replacement of parts.22,28,29 Labeled internal standards are lacking for many compounds, which can cause greater uncertainties in quantification. Overall, the development of methods and analysis of the emerging FRs requires more time, labor, and expense. Moreover, standardized methods have not been developed and quality assurance/quality control (QA/QC) parameters differ greatly among laboratories.28 Given the ongoing use of FRs and concerns about some novel FRs,30−33 interest in these compounds is increasing dramatically; as of May 2018 a Web of Science search for novel or emerging flame retardants in dust identified 219 publications, with more than 75% of these published in the



MATERIALS AND METHODS
The 13 laboratories that participated in INTERFLAB 1 and 11 additional laboratories joined the study, of which 20 laboratories from 12 countries submitted results. Participating laboratories were academic (62%), government (28%), and commercial/private (10%). The study used a tiered sample approach to investigate the causes of discrepancies in performance and comparability of the analysis of 24 NFRs (Tables 1 and S1). Injection-ready test mixtures were used to determine analytical performance for NFRs in the absence of any matrix effects or influence of sample processing and also to enable a comparison of

DOI: 10.1021/acs.est.8b02715 Environ. Sci. Technol. XXXX, XXX, XXX−XXX


Environmental Science & Technology

Figure 1. Laboratory-reported values and reference values for (a) high concentration and (b) low concentration test mixture. Red ×s show reference values, black boxes show medians and 25th/75th percentiles, whiskers show 5th/95th percentile, and black points show outlier values.

performance with INTERFLAB 1. Residential dust was provided to identify the influence of sieving and cleanup on data comparability. Each participant performed triplicate analysis on all test mixtures/samples with results submitted only for those NFRs for which the laboratory had established methods. Two injection-ready test mixtures (low and high concentrations, LC and HC, respectively) were provided by Wellington Laboratories Inc. (Guelph, Canada). Test mixtures were prepared as in INTERFLAB 1, described by Melymuk et al.27 with concentrations selected to reflect levels comparable to those typically reported for house dust (Table S2). Three types of dust samples were provided: (1) dust extract, (2) sieved dust, and (3) unsieved dust. All samples were subsampled from a larger dust sample that was a composite of residential dust obtained from household vacuum bags collected from homes in Toronto, Ontario, Canada. Dust was homogenized by 1 h of shaking, 30 min of homogenization with a Glas-Col 099C K54 Homogenizer, followed by 1 h of repeated shaking. The dust extract was obtained by sieving a subsample of the composite residential dust to 150 μm followed by extraction using accelerated solvent extraction with 1:1 n-hexane/acetone, followed by dilution and solvent exchange into toluene. The sieved dust sample was produced using a 150-μm sieve and the “unsieved” dust was preprocessed by sieving the collected material with a coarse 1-mm sieve to aid in homogenization. Dust was processed in bulk and

aliquots were removed from the main batch for distribution to each laboratory. Homogeneity of the dust subsampling was tested by analyzing triplicate samples for 15 PBDEs, which showed no significant differences among samples. The amounts received by the participating laboratories were 1.7 mL of dust extract aliquot representing ∼0.2 g of household dust in toluene, ∼1 g of sieved dust, and ∼2 g of unsieved dust.



STATISTICAL ANALYSES
Statistical analyses were based on ISO 13528.34 All statistical calculations were performed with either Microsoft Excel 2010 or Prism GraphPad v. 5.00. Laboratories are identified by letter or number codes to preserve their anonymity. Compounds reported by only one or two laboratories were omitted from the statistical analyses, other than the summary statistics.
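As an illustration of the kind of summary statistic involved, the following is a minimal sketch (not the study's actual workflow, which used Excel and Prism) of computing a between-laboratory mean and coefficient of variation per compound while skipping compounds reported by fewer than three laboratories. The function names, the data layout, and the example values are hypothetical, and no outlier exclusion is applied because the outlier test is not specified in this section.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100.0


def summarize(reported, min_labs=3):
    """Summarize between-laboratory results per compound.

    reported: {compound: {lab_code: mean concentration reported by that lab}}
    Compounds reported by fewer than `min_labs` laboratories are skipped,
    mirroring the omission of compounds reported by only one or two labs.
    Assumption: outliers are NOT removed here.
    """
    summary = {}
    for compound, by_lab in reported.items():
        if len(by_lab) < min_labs:
            continue
        values = list(by_lab.values())
        summary[compound] = {
            "n_labs": len(values),
            "mean": mean(values),
            "cv_percent": coefficient_of_variation(values),
        }
    return summary


# Hypothetical example data (not values from the study).
example = {
    "compound_X": {"A": 10.0, "B": 20.0, "C": 30.0},
    "compound_Y": {"A": 5.0, "B": 6.0},  # only two labs: skipped
}
result = summarize(example)
```

In this sketch `result` contains only `compound_X`, since `compound_Y` has too few reporting laboratories to be evaluated.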



RESULTS AND DISCUSSION
Summary of Methods. Out of 20 participating laboratories, only one submitted all 24 target FRs (29 individual compounds when considering isomers). Ninety-five percent of laboratories reported results for EH-TBB (Table S2). Other compounds frequently reported were BTBPE and PBEB (86% each), HBB and BEH-TEBP (82% each), PBT (77%), TBP-AE, PBBZ, TBP-DBPE, s-DDC-CO and a-DDC-CO (73% each). Isomer-specific results for α-, β-, and γ-HBCDD were included by 45% of laboratories, while




laboratories using GC-MS analysis. Means were 44% and 64% lower than the reference value in the low and high concentration test mixtures, respectively. Medians were similarly inaccurate: 49% lower in the low concentration test mixture and 66% lower in the high concentration test mixture. In contrast, laboratories using LC-MS/MS to quantify HBCDD isomers separately were very accurate (Figure S2). Poor performance in analysis of HBCDDs by GC-MS was identified in the first round of INTERFLAB;27 however, no obvious improvements have been made in this method in the intervening two years. As a clear recommendation, laboratories should avoid using GC-MS for analysis of HBCDDs, as it provides significantly low-biased concentrations, whereas LC-MS/MS methods perform well and provide accurate results. Poor accuracy and precision, even in the injection-ready test mixtures, for many of these NFRs suggest that there may be problems of comparability when combining results from different laboratories and/or studies. We further examined the pattern of poor precision and accuracy on an individual laboratory/compound basis to determine if the source of the errors was a few poorly performing laboratories, rather than a systematic phenomenon, by calculating percentage differences from the reference value for individual laboratory means (%diff_lab) (Tables S6 and S7). On average, laboratories had inaccurate results (i.e., %diff_lab > 25%) for ∼20% of compounds for the high concentration mixture and ∼35% for the low concentration mixture. All but two laboratories had inaccurate results for at least one compound in both test mixtures. Eight laboratories had poor performance for >50% of compounds; one laboratory had inaccurate results for 80% of reported compounds in the high concentration test mixture and 60% in the low concentration test mixture.
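The per-laboratory accuracy check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' calculation: it assumes the percentage difference is taken as an absolute value relative to the reference, the function names are hypothetical, and the compound labels and concentrations are invented for the example.

```python
def percent_diff_lab(lab_mean, reference):
    """Absolute percentage difference of a laboratory mean from the reference value.

    Assumption: the difference is expressed as an absolute value relative to
    the reference; the source does not state the sign convention here.
    """
    return abs(lab_mean - reference) / reference * 100.0


def inaccurate_compounds(lab_means, references, threshold=25.0):
    """Return (flagged compounds, fraction flagged) for one laboratory.

    lab_means:  {compound: mean of the laboratory's replicate results}
    references: {compound: reference value of the test mixture}
    A compound is flagged when its percentage difference exceeds `threshold`.
    """
    flagged = [
        compound
        for compound, value in lab_means.items()
        if percent_diff_lab(value, references[compound]) > threshold
    ]
    return flagged, len(flagged) / len(lab_means)


# Hypothetical example for one laboratory (values are illustrative only).
references = {"cmpd_1": 100.0, "cmpd_2": 100.0, "cmpd_3": 100.0, "cmpd_4": 100.0}
lab_means = {"cmpd_1": 100.0, "cmpd_2": 60.0, "cmpd_3": 130.0, "cmpd_4": 110.0}
flagged, fraction = inaccurate_compounds(lab_means, references)
```

Here two of the four hypothetical compounds exceed the 25% threshold, so this laboratory would count as inaccurate for half of its reported compounds.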
Thus, the %diff_lab indicates that a small number of laboratories contribute to the overall errors for many compounds, but the larger effect was poor performance for 2−4 compounds by many laboratories. However, the inaccurate compounds differed among laboratories, and only 4 compounds (TBP-BAE, BTBPE, and α- and β-HBCDD) had accurate results from all laboratories, and only in the high concentration mixture. This presents a challenge for comparability among data sets, as we cannot simply identify and exclude one or two inaccurate compounds or one inaccurate laboratory to improve comparability. Laboratories are reporting inaccurate results for many different compounds, making the problem more difficult to address. Comparison of Test Mixture Results between INTERFLAB 1 and INTERFLAB 2. We did not find significant improvements in precision between INTERFLAB 1 and 2 (IF1 and IF2, respectively). In fact, CVs were similar if not higher in IF2 vs IF1 for both test mixtures (Figure S3). This held even when considering the same subset of laboratories that participated in both studies (Figure S4). A notable exception was that CVs were 25−55% lower for DBDPE and TDCIPP in IF2 vs IF1. In terms of accuracy, TBBPA and DBDPE had the poorest accuracy in IF1, with percentage differences of 211% and 92%, respectively; this improved in IF2 to 32% and 14%, respectively. Whereas this is a positive outcome suggesting method improvements in these compounds, the same or poorer performance for many other compounds is concerning, as it suggests that insufficient attention is being given to maintaining or improving performance for other NFRs. For some compounds, e.g., OBTMPI, this may be due to waning interest in the compound, as it is infrequently detected in many

only 14% of laboratories reported non-isomer-specific values for HBCDD, interpreted as “total HBCDD”; thus, 59% of laboratories reported some form of HBCDD values. Wide variation was evident in the analytical and instrumental methods, with no two laboratories using identical methods (see Tables S3 and S4). However, some features were common to many methods: sonication was the most frequently used extraction method (65% of participating laboratories), and hexane/acetone (most often at 1:1) was the most common extraction solvent (50% of laboratories). Five laboratories applied different extraction and cleanup methods for different subgroups of FRs (Table S3), and ten laboratories split the target compounds among two or three instrumental methods, typically separating the analysis of HBCDD, TBBPA, TBOEP, and TDCIPP from the other target compounds (Table S4). Gas chromatography−mass spectrometry (GC-MS) was used by 81% of laboratories for most NFRs, with separation on some version of a 15-m 5% phenyl stationary phase column. TBOEP and TDCIPP were also typically analyzed by GC-MS but with a 30-m 5% phenyl column, while TBBPA and HBCDD were typically analyzed by liquid chromatography−tandem MS (LC-MS/MS) with a C18-type column. Evaluation of Test Mixtures. Figure 1 shows the distribution of reported values for the high and low concentration test mixtures, relative to the reference values provided by Wellington Laboratories Inc., and full summary statistics are given in Table S2. The test mixtures represent a “best case scenario” for analysis, as they require no processing by the laboratories and have no matrix effects or other potential interferences in the chromatography. The difference between the reference value of the test mixture and the mean of all laboratories (excluding outliers) was 1000% suggested misidentification of compounds or errors in standards used for calibration (e.g., TBP-AE, PBT, HBB).
If we consider the unsieved dust as most representative of a typical environmental sample, we expect that differences of 9000 ng/g for unsieved dust, and 10 of 42 values deviated from the mean by more than 50%. For ∑DDC-CO, the IQR spanned 65 ng/g, and 14 of 42 values deviated by more than 50%. For BEH-TEBP, the IQR spanned 270 ng/g, and 35 of 51 values deviated from the mean by more than 50%. Finally, the IQR of EH-TBB spanned 210 ng/g, and 16 of 54 values deviated by more than 50% from the mean. The size of the ranges reported by different laboratories for the INTERFLAB unsieved dust is comparable to the ranges of most reported dust concentrations in the scientific literature, particularly for TDCIPP and BEH-TEBP (Figure 4). Thus, it is

precluded evaluating the overall performance accuracy of the laboratories, so instead we evaluated the performance of individual laboratories as the deviation relative to the mean (excluding outliers) of all reporting laboratories, assuming that the mean was close to the “true” concentration. Figure 2 shows the distribution of reported values for the unsieved dusts, sieved dusts, and dust extracts, and full summary statistics are given in Table S8. All compounds were detected by at least one laboratory in at least one of the samples. However, TDBP-TAZTO, OBTMPI, α- and β-TBCO, and TBP-BAE were only detected by one or two laboratories; thus, performance was not evaluated for these compounds. In general, the range of concentrations reported for dust and dust extracts spanned up to 4 orders of magnitude; however, as with the test mixtures, some compounds and some laboratories had more results closer to the overall mean than others (Figure 2). For example, reported HBB and TBP-DBPE concentrations in dust extracts spanned >3 orders of magnitude (Figure 2a). In the sieved dust, TBP-AE concentrations ranged over 3 orders of magnitude and TDCIPP and HBB over 4 orders of magnitude; similar results were found for the unsieved dust samples. This variation was quantified using the coefficient of variation (CV), indicating the difference in the range of reported values from the mean of all laboratories that excluded outliers, and thus the influence of laboratories with significant errors in their analytical methods. CVs were >175% for HBB and α- and β-DBE-DBCH in dusts and dust extracts, and for PBT in the unsieved dust (Figure 3). Dusts had significantly higher CVs (∼40−220%) than the test mixtures (typically 100% in their analysis of EH-TBB in dusts had