Article pubs.acs.org/EF

Novel Data Abstraction Strategy Utilizing Gas Chromatography−Mass Spectrometry Data for Fuel Property Modeling

Jeffrey A. Cramer,*,† Mark H. Hammond,† Kristina M. Myers,‡ Thomas N. Loegel,‡ and Robert E. Morris†

† U.S. Naval Research Laboratory, Code 6181, 4555 Overlook Ave. S.W., Washington, District of Columbia, United States
‡ Nova Research, Inc., 1900 Elkin Street, Suite 230, Alexandria, Virginia 22308, United States



ABSTRACT: The high cost and limited availability of emerging alternative fuels and/or fuel blending stocks with unknown compositions are often major impediments to the certification of these materials as Fit-For-Purpose (FFP) for the U.S. Navy. A method was desired whereby a candidate fuel could be rapidly prescreened to determine if it would be suitable for further, more-extensive FFP testing. The goal of this research was to employ statistical analysis strategies to establish linkages between the chemical constituency of any given fuel or fuel stock, regardless of type or source, and the resultant performance and/or fuel properties. A chemical profiler developed during the course of this work has previously been used to quantify the constituencies of fuels using gas chromatography−mass spectrometry (GC-MS) data. These constituencies were then correlated to specification properties using partial least-squares regression modeling reconfigured into a multistep, iterative strategy. While this modeling strategy was shown to be successful at predicting the performance properties not only of the training data but also of uncalibrated alternative fuels, the underlying data abstraction strategy was determined to be inherently unsuitable for use with the disparate data from multiple GC-MS instruments due to instrument-based overfitting. The following report details a novel modeling strategy that makes use of normalized total ion chromatography (TIC) peak areas to both streamline the procedural complexity of the previous modeling strategy and more ably quantify chemical constituencies for the purposes of multi-instrument FFP fuel modeling.



Received: November 5, 2013. Revised: January 27, 2014. Published: February 3, 2014.
© 2014 American Chemical Society
dx.doi.org/10.1021/ef4021872 | Energy Fuels 2014, 28, 1781−1791

INTRODUCTION

The U.S. Naval Research Laboratory (NRL) has been engaged in the development of sensor-based technologies to replace the Combined Contaminated Fuel Detector (CCFD), which is used to perform fuel quality surveillance on board Navy ships. This approach offers significant advantages over the current state-of-the-art by reducing the time and manpower required to measure critical specification properties, which would also improve shipboard safety by minimizing the manual transport of fuel aboard naval vessels. An automated fuel analyzer provides the means to perform automated real-time fuel quality surveillance and product monitoring, without altering the fuel, and would also not require any supplies or generate any disposable waste. The Navy Fuel Property Monitor (NFPM)1 was developed during this research effort to predict various critical specification fuel properties with the partial least squares (PLS)2 regression modeling of near-infrared (NIR) spectroscopic data. This instrument was primarily trained on petrochemical fuels, but its capabilities were extended to accommodate a fuel population of Fischer−Tropsch (FT) synthetic fuels, fuels derived from biomass, and blends of these two fuel types with petrochemical fuels.3 In addition, a discriminant model was incorporated into the NFPM to identify ultralow sulfur diesel (ULSD) fuels.4

Despite what has previously been achieved with respect to fuel property predictions and generalized fuel quality and fitness monitoring, emerging fuels derived from biomass and other alternative sources present a significant challenge to predicting fuel properties from NIR spectra. Nonpetrochemical fuels, potentially originating from unknown and unpredictable sources, must be accommodated in the present research. Unfortunately, the process of manually adapting to individual nontraditional fuel types as they become available, as was achieved for the aforementioned population of FT and biomass fuel types, is simply untenable. This presents significant logistical difficulties in maintaining calibration sets and perpetually maintaining the fuel property models. In other words, a fuel modeling strategy that requires recalibration each time a new or emerging fuel is introduced is not practical, particularly in light of the wide compositional variability of alternative fuels and blending agents. This becomes especially important when attempting to predict if a new alternative fuel or blends of that alternative fuel with traditional petroleum fuels will be suitable for more-extensive Fit-For-Purpose (FFP) certification, and this situation becomes further compounded in those instances in which sufficient quantities of an emerging fuel are not available for full FFP certification. In such instances, a prescreening method that requires very small fuel quantities would assist in determining if it would be worthwhile to acquire larger quantities, often at significant cost, for such a certification.

Although some strategies have been developed to improve and automate spectroscopic fuel property modeling algorithms in the development of the NFPM,5,6 the present challenge simply cannot be adequately addressed with a spectrophotometric

approach that must be manually adjusted on a case-by-case basis. PLS functions, in this case, by deriving relationships between the analytical calibration data and measured fuel properties. In the case of the NFPM, these calibration data consist of NIR spectra, which themselves consist of chemically nonspecific combination and overtone bands. PLS models do not function precisely when confronted with data that are outside of the calibration space, and alternative and other unconventional fuels that are compositionally, and hence spectrally, divergent from the petroleum fuels in the calibration set fall outside of that space. In other words, the alternative fuels would be completely uncalibrated fuels, thus compromising the precision of the NIR-based PLS property models. A hypothesis was thus developed that, if the PLS property models were based directly on fuel compositional data, then these unconventional fuels would present, in a worst-case scenario, uncalibrated compounds in the overall compositional data space, which would still allow the calibrated compounds in unconventional fuels to inform fuel property predictions. In addition, even individually uncalibrated compounds could, if necessary, be further approximated with functionally similar calibration compounds found in the calibration set, with minimal impact on the resultant property predictions. Therefore, to test this hypothesis, a new analysis strategy that could more ably assess entire fuel compositions was deemed necessary to accommodate any and all future fuels, regardless of whether or not they were explicitly calibrated. Such a robust strategy would determine fuel performance parameters directly, based on fuel composition, to fulfill the requirements for prescreening fuels as candidates for further FFP7 certification, i.e., determining whether or not a fuel is fundamentally compositionally suitable for use in an engine, outside of procurement and cost concerns.

It should be stressed that this composition-based property modeling approach, in its present form, is intended only as a prescreening tool, since there are limitations inherent in any statistical modeling strategy, and some critical fuel properties are not easily correlated to composition. Moreover, the modeling of dynamic fuel properties,8 such as bulk modulus, specific heat, thermal conductivity, and interfacial tension can only be approached through first-order computational methods. However, these methods are generally limited to simple model systems, and are beyond the scope of this research effort.

It is already known that a great deal of information regarding fuel composition and performance can be obtained from gas chromatography−mass spectrometry (GC-MS)9−12 data, so this is a useful analytical technique upon which to base an in-depth analysis strategy. Compositional information can, of course, be derived from the analysis of GC data or GC×GC data without the benefit of MS hyphenation13−16 and from MS without the benefit of chromatography,17 as well as GC-MS data and GC×GC-MS data without the benefit of complete mass/charge ratio information.18−22 Nevertheless, fuel-based property modeling requires the discrimination of hundreds, if not thousands, of discrete compounds, and chromatography combined with fully mass analyzed MS data will provide this level of discrimination. Also, although algorithms such as Target Factor Analysis (TFA),23 instrumental modes such as selected ion monitoring,24 or comparative techniques requiring the use of internal standards25 can be used to interrogate GC-MS datasets for individual target compounds, attempting to target every compound that could potentially be found in a fuel sample is not a realistic operation. Ultimately, GC-MS data, or data that are at least as or more chemically informative than GC-MS data, must be modeled in a comprehensive fashion to properly pursue fuel property modeling.

Early multiway work26 performed in our laboratory focused upon the differences that could be found between sample classes. The data features quantified in this work were the same type of data features from which fuel constituencies would be derived for the present work, but a more direct focus upon fuel constituency was deemed necessary for the purposes of direct fuel property modeling. As PLS has already proven itself to be successful at modeling fuel properties, it was decided that PLS would be maintained as the central data modeling algorithm. This, of course, results in an initial incompatibility between the type of data collected through GC-MS (three-dimensional (3D), suitable for multiway analysis) and the type of analysis available using standard PLS (a two-dimensional (2D), multivariate analysis technique). To address this incompatibility, it was decided that the 3D GC-MS data could be reduced in dimensionality by translating each multivariate mass spectrum into a single, univariate component identity. A multistep data abstraction and modeling strategy was thus initially developed and effectively implemented, the results of which have been reported previously.27,28 In this initial strategy, the GC-MS chromatograms obtained from fuels were first reduced in dimensionality by using a GC-MS-based compositional profiler, designed in-house,29 to determine compound identities from mass spectra and collate these identities into 2D chemical component metaspectra for individual samples, with the number of appearances of each chemical component reported along the y-axis for each component listed, arbitrarily, along the x-axis.

The chemical component metaspectra resulting from this initial data abstraction strategy were quite complex. This is not only because they were obtained from compositionally complex fuel samples, but also because this data abstraction strategy, referred to as the time slice strategy herein, collected a great deal of compositionally uninformative information in addition to the compositionally informative information. Compound identifications were taken from regularly spaced retention times throughout any given GC-MS dataset using this strategy, regardless of the compositional information that may or may not have been found at said retention times. Thus, the strategy made use of an unchanging population of retention times for all fuel samples. Because these retention times were chosen for arbitrary, nonchemical reasons, this tended to overidentify certain instrument-specific data factors, such as column bleed,30 that exacerbated the presence of masking compounds, i.e., those compounds preventing the proper discrimination of desired underlying chemical factors.31−33 In addition, even a portion of data possessing no rightly identifiable chemical information, if said data portion falls within the set of analyzed retention times, will be assigned a spurious identity from the limited database, thereby further convoluting the desired chemical information with undesired masking effects. To compensate for this complexity, the component spectra used for model construction were comprehensively cleared of masking compounds during the overall modeling procedure. This need for unmasking was an integral part of this multistep, iterative modeling strategy in which minor masking compounds were slowly removed from a dataset to sequentially improve modeling and, incidentally, lessen the impact of major masking


compounds through these sequential model improvements. This approach was undertaken because, as reported previously, there was no straightforward method by which to remove major masking compounds from available fuel data based on more typical data assessment metrics. However, once again, this unmasking strategy, while effective, was only necessary because a form of metaspectral data was being used that was already predisposed to the initial identification of masking compounds during the construction of metaspectra. Furthermore, while, through unmasking, effective time slice models can be constructed for data collected from a single instrument, data collected from instruments other than the instrument from which the calibration data were initially collected, or data collected with the same instrument but with different experimental parameters, will likely not possess the same chemical information at the same retention times. Thus, the time slice strategy cannot be considered robust in the context of calibration transfer between multiple instruments and long-term instrument drift.

To compensate for the limitations inherent in the time slice strategy, a novel strategy was developed that fuses peak area assessments to a less complex population of component identity determinations. This is done so that the assumptions previously made as to which compounds would appear at which retention times would no longer be necessary. Instead, the peaks themselves, and their relative prominences, were used to define where individual compounds would be found in any given GC-MS dataset, and the use of these nonretention time metrics allowed chemically similar peaks appearing at different locations in various GC-MS datasets to be compared more effectively.

It is also important to note that this novel peak area modeling strategy, because of its more restricted and more quantitative portrayal of the chemical composition of a GC-MS dataset, no longer captures as many masking compounds during the course of metaspectral data construction. Therefore, this peak area strategy no longer requires the extensive, multistep data unmasking of the previous time slice strategy, and the peak area strategy’s fuel property modeling can thus be accomplished through the relatively simple application of uninformative variable elimination PLS (UVE-PLS)34 alone. A final advantage of using the peak area strategy arises from the fact that peak maxima and peak area detection threshold values can be used to limit the number of times an individual mass spectrum requires an identity to be assigned to it, thus accelerating computation times. The novelty of the proposed strategy does not arise from the fact that discrete compound identities, as determined through a database search, were coupled with peak areas. The novelty arises from the collation of these peak areas into area-based compositional metaspectra that are, in turn, subjected to multivariate modeling and correlated to fuel properties. The following work will discuss how this novel peak area fuel property modeling strategy was designed and how it functions with various datasets collected from various instruments.

EXPERIMENTAL SECTION

Fuel Samples. A group of over 1100 worldwide fuel samples was used for the fuel property modeling seen in the present study. This dataset consisted of JP-5, JP-8, Jet A, Jet A-1, F-76, marine gas oil (MGO), ultralow sulfur diesel (ULSD), fatty acid methyl ester (FAME) biodiesel fuels and blends, and FT fuels and blends. All fuel types were modeled as a single group, which, it should be noted, deviates from the approach of developing fuel-type-specific property models when modeling NIR data, as was the case with the NFPM. All fuel types were modeled together in the present work to ensure that compositional results, as determined through GC-MS, were as comprehensive as possible and, thus, applied to as many fuels and fuel types as possible, as this will ensure the long-term robustness of the technique. All of the fuel samples were provided with a list of specification properties collected during the course of assuring military specification compliance. Because of differences in fuel quality assurance requirements when the samples were collected, not every sample in the training set has an associated fuel property value available for every desired fuel property. However, every sample that has a useable fuel property value was included in every given model construction.

GC-MS Analysis. The property models were developed from GC-MS data acquired on an Agilent Model 5890 GC with an Agilent Model 5971 mass-selective detector. Injections (1.0 μL) of the neat fuels were made with an autoinjector into a split/splitless injector held at 250 °C at a split ratio of 200:1. The liner was an Agilent ultra inert, low-pressure-drop inlet liner with glass wool. A DB-1 bonded and cross-linked polysiloxane capillary column (50 m × 0.25 mm ID, 0.20 μm film thickness) was run in constant flow mode at 1.2 mL/min using a helium carrier gas, and was used with an oven temperature program that initiated data collection at a temperature of 40 °C and ramped the temperature at 10 °C/min to 290 °C, holding this temperature for the remainder of the data collection. Data were acquired at column retention times from 6.8 min to 36.1 min, at 2.8 mass spectra per second, from 40 to 279 m/z. It should be noted that the chromatographic data tended to show peak widths of between 7 and 9 variables, or ∼2−3 s, along the retention time axis.

With this configuration, the GC-MS could detect hydrocarbons up to a size of approximately C28, which was considered to provide a reasonably complete compositional characterization of processed fuels in the jet and diesel ranges. While it is entirely possible that some of the modeled fuel properties may be influenced by chemical constituents that would not pass through the described GC column and thus not be detected, this is not necessarily a significant limitation for those fuel properties for which those heavy constituents play a major role. This is because it is also possible, through statistical modeling, to indirectly detect or predict fuel characteristics that cannot be directly measured. One example of this is found in the identification of ULSD fuels by NIR spectra acquired over a wavelength range that contains no sulfur chromophores.4 In that instance, it was possible to indirectly identify ULSD fuels with a high-factor discriminant model by modeling the compositional changes brought about in diesel fuels because of the hydroprocessing used to remove the sulfur. Therefore, while it is recognized that one must exercise circumspection when interpreting the results of any statistical modeling, chromatographic resolution is not necessarily a limiting factor, as long as the appropriate performance metrics are established and the validity of a quantitative property model can be established. However, analytical resolution could limit the extent to which data can be used to infer specific compositional correlations to a modeled fuel property.

Calibration transfer of the PLS property models was assessed using data from a different, secondary instrument configured with a different GC oven temperature profile. GC-MS data were acquired from a subset of the fuel training set with an Agilent Model 6890 GC with an Agilent Model 5975 mass-selective detector. Injections (0.2 μL) of the neat fuels were made with an autoinjector into a split/splitless injector held at 300 °C with a split ratio of 50:1. An Agilent Model HP-5MS 5% phenyl methyl siloxane (30 m × 0.25 mm ID, 0.25 μm film thickness) GC column was run in constant flow mode at 1.3 mL/min using a helium carrier gas. The oven temperature program began at a temperature of 35 °C and ramped at 3 °C/min to 200 °C, followed by a ramp at 10 °C/min to 305 °C and holding at this temperature for 4.5 min for the remainder of the data collection. Data were collected at a rate of 20 Hz from 40 m/z to 700 m/z with no solvent delay. Both of these experimental methodologies were designed to allow for the injection of neat fuels. This was done to minimize variations in composition arising from the preparation of multiple dilutions, without the need to add an internal standard. It is well-known that the


injection of neat fuels can result in chromatographic35 and detector overloading, and care was taken in the present work to avoid sample overloading to the extent possible. It should be noted that an added but incidental benefit of the peak area methodology is that its focus on peak maxima makes the technique relatively insensitive to peak shape, as compared to the time-slice abstraction method.

Chemometric Calculations. GC-MS spectra, in an unprocessed form, are three-dimensional, which refers to the number and type of axes required for each individual GC-MS chromatogram (retention time versus mass/charge ratio versus instrument response).36 The GC-MS chromatograms collected in this study were first reduced in dimensionality by using the in-house GC-MS analysis tool described previously to determine compound identities. This profiler, whose effectiveness is not restricted to fuel analysis, does this by interpreting the mass spectral data found at assigned retention times and thus provides a list of chemical components to be found within the entire 3D GC-MS dataset. Identifications themselves are accomplished by matching individual mass spectra with archived library data through the NIST Mass Spectral Search Program for the NIST/EPA/NIH Mass Spectral Library (versions 2.0f and 2.0g).37 This translates the 3D instrument response to a 2D chemical component metaspectrum for each individual sample (number of appearances per chemical component). It should be noted here that the profiler is capable of additional result classifications and metrics that were not used in the present fuel property modeling work, but nonetheless appear in contemporary versions of the software application, because of their relevance in overall fuel characterization. As mentioned previously, the former, time slice metaspectral construction strategy utilized GC-MS data found at regularly spaced intervals throughout the retention time axis. 
However, this overly biases, or overfits, the resulting models to the specific data at the specific retention times being modeled. If a different instrument, or even different experimental settings on the same instrument, were to be used, then the data found at any given retention time are no longer necessarily going to provide the same compositional information found at the same retention times in the training dataset. Therefore, a novel peak area strategy, based on first combining Total Ion Chromatography (TIC) peak areas (after the TICs themselves are normalized to unit area) with identified compounds, and then correlating this area-based metaspectral data to fuel properties using a retention time-independent multivariate analysis, had to be developed to more accurately portray chemical composition across multiple instruments. The metaspectra, whether derived from the time slice or peak area strategies, and the calibration fuel property values, were computed and modeled using MATLAB38 (MathWorks, Inc., Natick, MA). The component spectra were assembled into matrices where each row represented the spectrum of a different fuel sample and each column represented a different compound. All analysis strategies were developed with algorithms and other functionalities provided by the PLS_Toolbox version 4.2 for MATLAB39 (Eigenvector Research, Inc.) and the Calibration and Standard Toolboxes40 (FABI/ChemoAC Consortium). Mean centering was considered vital for the proper use of subsequent chemometrics techniques and was performed prior to all other analysis steps. The technique of partial least-squares (PLS) is based on singular value decomposition (SVD),41,42 which mathematically transforms data based on the underlying linear variances that can be found within it. These linear variances are transformed, through SVD, into new variables, known as latent variables (LVs) because they are not directly observable in the original data. 
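The matrix layout, mean centering, and variance decomposition described above can be illustrated with a minimal NumPy sketch. This is not the MATLAB/PLS_Toolbox code used in the study: the matrix `X` holds made-up values, and the sketch shows only the SVD step. A full PLS model additionally uses the fuel property values when it builds its latent variables.

```python
import numpy as np

# Hypothetical metaspectral matrix (invented values):
# each row is one fuel sample, each column is one compound.
X = np.array([
    [0.30, 0.10, 0.00, 0.60],
    [0.25, 0.15, 0.05, 0.55],
    [0.10, 0.40, 0.20, 0.30],
    [0.05, 0.45, 0.25, 0.25],
])

# Mean centering: subtract each compound's mean across all samples.
X_centered = X - X.mean(axis=0)

# SVD of the centered data; the rows of Vt are directions of linear variance.
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Scores: coordinates of each sample along those directions.
scores = U * s

# Fraction of the total variance captured by each direction.
explained = s**2 / np.sum(s**2)
```

The `scores` columns are new variables that do not appear in the original compound list, which is the sense in which such variables are called latent.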
These LVs can then be further correlated to calibration data, such as that provided by fuel property measurements, using PLS data fitting. This fitting, in turn, produces multivariate prediction models that take a great deal of underlying linear data variance into account, providing a higher level of overall model robustness than can be afforded by simpler univariate prediction models. The number of LVs to be retained in each PLS fuel property model construction is determined using leave-one-out cross-validation (LOO−CV),43 which approximates model performance with uncalibrated data. In LOO−CV, the predicted fuel property value of each fuel sample in a given model is based on a submodel built from every other sample except for the sample being given a prediction value. This operation produces a single Root Mean Square Error of Cross-Validation (RMSECV) result for each number of LVs evaluated. Choosing the number of LVs that minimizes this RMSECV value theoretically maximizes the performance of a given model with uncalibrated data. However, RMSECV results are ultimately an imperfect metric to use to optimize the number of LVs in this type of modeling.44−46 This is because RMSECV results are still based on models that take almost all of the available training data into account and are, therefore, being created under the assumption that the training data are completely representative of all possible future data. This assumption may be valid when only modeling petrochemical fuels, but is not necessarily valid with the inclusion of alternative fuels of unknown compositions in sample populations.

To compensate for this reality, the number of LVs to be used for each fuel property prediction model produced was instead chosen automatically using a statistic called the F-test.47−49 The F-test, in the present case, was applied to the cross-validation’s cumulative predicted residual error sum of squares (CUMPRESS) results with an 85% confidence interval, using a maximum of 10 LVs. The use of the F-test tends to select a smaller number of LVs than the minimum RMSECV value would suggest, which, in turn, sacrifices the immediate quality of a model in order to better preserve its versatility and its potential utility in the presence of uncalibrated data. By limiting the number of LVs that can be incorporated into a model, the F-test protects against the overfitting that can produce models that are too biased toward a specific set of training data. 
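The LOO−CV and F-test selection described above can be sketched as follows. This is a rough NumPy illustration on synthetic data, not the PLS_Toolbox implementation used in the study. The helper names (`pls1_fit`, `loo_cumpress`, `f_cdf_equal_dof`, `select_n_lvs`) are hypothetical, and the particular F-test form shown, which compares each PRESS value against the minimum PRESS with equal degrees of freedom, is an assumption about how such a test is commonly applied rather than a detail confirmed by the text.

```python
import numpy as np

def pls1_fit(X, y, n_lv):
    """Fit a mean-centered PLS1 model (NIPALS); return (b, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)      # weight vector
        t = Xr @ w                     # scores for this latent variable
        p = Xr.T @ t / (t @ t)         # X loadings
        c = (yr @ t) / (t @ t)         # y loading
        Xr = Xr - np.outer(t, p)       # deflate X
        yr = yr - c * t                # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)
    return b, x_mean, y_mean

def loo_cumpress(X, y, max_lv):
    """Squared LOO prediction errors, summed over samples, per LV count."""
    n = len(y)
    press = np.zeros(max_lv)
    for i in range(n):
        keep = np.arange(n) != i       # leave sample i out
        for a in range(1, max_lv + 1):
            b, xm, ym = pls1_fit(X[keep], y[keep], a)
            press[a - 1] += (y[i] - ((X[i] - xm) @ b + ym)) ** 2
    return press

def f_cdf_equal_dof(x, n, grid=20001):
    """Numerical CDF of an F(n, n) variate (scipy.stats.f.cdf would also do)."""
    a = n / 2.0
    t = np.linspace(0.0, 1.0, grid)[1:-1]
    log_pdf = (a - 1.0) * (np.log(t) + np.log1p(-t))   # Beta(a, a) kernel
    cdf = np.cumsum(np.exp(log_pdf - log_pdf.max()))
    cdf /= cdf[-1]
    return float(np.interp(x / (1.0 + x), t, cdf))

def select_n_lvs(press, n_samples, confidence=0.85):
    """Smallest LV count whose PRESS is not significantly above the minimum."""
    best = int(np.argmin(press))
    for a in range(best + 1):
        if f_cdf_equal_dof(press[a] / press[best], n_samples) < confidence:
            return a + 1
    return best + 1

# Synthetic stand-in data (assumption): 30 samples, 8 compound variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=30)

cumpress = loo_cumpress(X, y, max_lv=6)
rmsecv = np.sqrt(cumpress / len(y))
n_lv = select_n_lvs(cumpress, n_samples=len(y))
```

By construction, `select_n_lvs` never returns more LVs than the count that minimizes PRESS, which mirrors the tendency noted above for the F-test to choose fewer LVs than the RMSECV minimum.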
Once the number of LVs was chosen using the F-test, each model was rebuilt using all possible calibration data to obtain the final Root Mean Square Error of Prediction (RMSEP) results.

Uninformative Variable Elimination PLS. A modified version of PLS known as UVE-PLS33 was also used extensively in this work. This technique removes those variables from the PLS model training data that do not contribute any relevant information toward the given modeling goal. In the present case, this results in the elimination of uninformative individual compounds, which focuses the subsequent construction of PLS models upon those compounds that actually describe the fuel property being modeled. It should, of course, be noted that the elimination of specific compounds is counterintuitive, given the stated goal of developing a compositionally comprehensive fuel property modeling strategy. However, it has previously been shown27,28 that the eliminated compounds deemed to be uninformative through UVE-PLS were not contributing constructively to model quality, and thus constituted only noise or interference, regardless of fuel composition. Also, as mentioned previously, the use of peak areas produces more accurately quantitative and less noise-hindered metaspectra, which are also less hindered by the potential compound misidentifications inherent in any analysis reliant on database comparisons. This provides an additional advantage to the peak area methodology in the fact that fewer masking compounds are maintained in the metaspectra in the first place. This means that the multistep, iterative unmasking strategy used previously could be discarded in favor of the simple and direct application of UVE-PLS, which is much easier to justify in the context of algorithmic design. In UVE-PLS, the amount of relevant information possessed by individual fuel components is determined by using regression coefficients derived from the overall stability of the modeling procedure found during LOO−CV. 
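As a rough illustration of the UVE-PLS idea introduced above and detailed in the remainder of this subsection, the sketch below appends random variables to a synthetic data matrix, collects the PLS regression coefficients from each LOO−CV submodel, and keeps only those real variables whose mean/standard-deviation ratio exceeds that of 85% of the random variables. The function names, the tiny scaling applied to the random variables, and the synthetic data are all assumptions made for illustration; this is not the toolbox routine used in the study.

```python
import numpy as np

def pls1_coefficients(X, y, n_lv):
    """Regression vector of a mean-centered PLS1 model (NIPALS)."""
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)
        t = Xr @ w
        p = Xr.T @ t / (t @ t)
        c = (yr @ t) / (t @ t)
        Xr = Xr - np.outer(t, p)
        yr = yr - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def uve_select(X, y, n_lv, n_random, cutoff_pct=85.0, seed=0):
    """Boolean mask of real variables judged informative."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Random variables, scaled down (assumption) so they barely affect the fit.
    noise = 1e-10 * rng.normal(size=(n, n_random))
    Xa = np.hstack([X, noise])
    # One regression vector per leave-one-out submodel.
    coefs = np.array([
        pls1_coefficients(np.delete(Xa, i, axis=0), np.delete(y, i), n_lv)
        for i in range(n)
    ])
    # Stability: |mean coefficient| / standard deviation across submodels.
    stability = np.abs(coefs.mean(axis=0) / coefs.std(axis=0))
    # Cutoff: the value exceeded by only (100 - cutoff_pct)% of random variables.
    threshold = np.percentile(stability[p:], cutoff_pct)
    return stability[:p] > threshold

# Synthetic stand-in data (assumption): only the first two variables matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.01 * rng.normal(size=60)

keep = uve_select(X, y, n_lv=2, n_random=10)
```

The number of random variables (`n_random`) is the experimental parameter mentioned in the text; the value of 10 here is arbitrary.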
To determine this stability in the case of the peak area modeling strategy, during an initial LOO−CV procedure, several randomized variables to be modeled are added to the actual variables being modeled. This number of randomized variables is an experimental parameter that was adjusted during the course of the peak area strategy’s development work, and the results of these adjustments can be found later in this report. After the randomized variables are added, the actual variables determined to be as inconsistently informative as their randomized counterparts during LOO−CV are eliminated from the final model construction. In this case, being inconsistently informative is defined as having a regression coefficient average/regression coefficient standard deviation ratio lower than that obtained for 85% of the random variables. The value of 85% was used to maintain consistency with the F-test’s pre-existing statistical parameter, which itself was finalized in a previous work.50 It should be noted that the number of LVs to be used for these UVE-PLS models is, as in the case of standard PLS, calculated from CUMPRESS results using the statistical F-test, as described previously.

Figure 1. Schematic representation of a total ion chromatography (TIC) profile and metaspectral construction strategies: (a) Time slice: evenly spaced mass spectra along retention time axis (i.e., black dots) identified. (b) Peak area: TIC peak areas (i.e., diagonally shaded area) quantified and assigned a single identity.

Peak Area Metaspectral Construction Strategy. The method by which the time slice strategy proceeds has been described in previous work27,28 and will not be described here. The details of the time slice strategy, nonetheless, will be revisited when necessary for the purposes of comparison during the course of describing how the novel peak area strategy functions. Primarily, it should again be noted that the time slice strategy proceeds by simply identifying the compounds that appear for a training dataset over an evenly spaced span of retention time values. These compound identifications effectively transform the 3D GC-MS data to 2D metaspectra that can then potentially undergo PLS, UVE-PLS, SVD, and other data processing procedures for the purposes of data unmasking and data modeling. The peak area strategy, as was the case with the time slice strategy, takes a range of retention time values into account, but instead of simply collating the compound identities found at each of a set of evenly spaced values, only the retention time values, for any given sample, found to possess total ion chromatography local maxima (i.e., TIC peaks) are allowed to contribute compound identities to the final metaspectrum. Furthermore, these contributions are not simply the number of appearances of a given compound, as was the case with the time slice strategy, but, instead, are the sums of all of the peak areas, after the TIC spectra are normalized to unit area, corresponding to the compounds found at each peak maximum. These peaks and their corresponding maxima were defined using methodologies based on previously established work.51 This use of peak areas not only adds a more refined quantification to the overall analysis strategy, but also eliminates the potentially misidentified compounds that could appear at the edges of TIC peaks. Figure 1 shows the differences between the time slice and peak area metaspectral model construction procedures pictorially. The use of peak areas in the peak area methodology also provides an additional peak area-based analysis parameter that can be used to fine-tune modeling results. Because very small TIC peaks can potentially represent trivial, noise-hindered mass spectral data, which would, in turn, be misidentified by database comparisons during the construction of metaspectra, a peak area threshold can be applied that only allows a normalized peak area to be recorded if it is a certain size or larger. However, the manipulation of this peak area threshold has a tradeoff with respect to analysis time that should not be discounted, especially when considering potential future high-throughput operations. Furthermore, the use of a peak area threshold might also hinder the quantification of chemical compounds that are both present at low quantities and legitimately affect fuel properties, so the need to detect these compounds must be balanced against the need to filter out masking compounds and other noise.
The results of using multiple peak area thresholds, and a decision as to which to use for the novel strategy, will be discussed in the present report.
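The peak area metaspectral construction described above can be summarized, under simplifying assumptions, by the following Python sketch. It assumes that peak detection and database identification have already been performed upstream, so that each TIC peak arrives as a hypothetical `(compound, raw_area)` pair; the function name, the input layout, and the example values are illustrative only and do not reproduce the software used in this work.

```python
from collections import defaultdict


def peak_area_metaspectrum(peaks, total_tic_area, threshold_percent):
    """Sum unit-area-normalized TIC peak areas per identified compound.

    peaks             : iterable of (compound_name, raw_peak_area) pairs.
    total_tic_area    : area of the whole TIC, used for unit-area normalization.
    threshold_percent : minimum normalized peak area, in percent of the total
                        TIC area, required for a peak to be recorded.
    """
    if total_tic_area <= 0:
        raise ValueError("total TIC area must be positive")
    threshold = threshold_percent / 100.0
    metaspectrum = defaultdict(float)
    for compound, raw_area in peaks:
        normalized = raw_area / total_tic_area
        if normalized >= threshold:
            # Several peaks may share one identity; their areas are summed.
            metaspectrum[compound] += normalized
    return dict(metaspectrum)


# Hypothetical sample: two peaks share an identity, one peak is below threshold.
peaks = [
    ("n-dodecane", 40.0),
    ("toluene", 10.0),
    ("n-dodecane", 30.0),
    ("trace peak", 0.0005),
]
meta = peak_area_metaspectrum(peaks, total_tic_area=100.0, threshold_percent=0.001)
```

Each sample yields one such dictionary; stacking them over a common compound list gives the 2D metaspectral matrix passed to PLS or UVE-PLS.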

RESULTS AND DISCUSSION

Elimination of Iterative, Multistep Modeling Strategy. One implicit goal of modifying the metaspectral analysis strategy was to eliminate the multistep, iterative data modeling strategy used to assess the time slice metadata. This is because the time slice strategy inherently chose more compounds than are necessary for compositionally relevant fuel property modeling, and the use of the iterative, multistep unmasking during this modeling strategy was simply a way to correct for a situation that should not have existed in the first place. However, it became apparent early in this new avenue of research that, even if the peak area strategy was capable of producing less masking-prone data, masking compounds would still be present. Figure 2 shows the linear correlation coefficients (R2) when the predicted property values from early models, produced in the present work for colligative and compositional fuel properties, were regressed against their corresponding measured values. These early models utilized a peak threshold of 0.006% and non-normalized peak areas, as the choice of normalization procedure was, at the time, still under investigation. The R2 results shown in this figure compare those values found using UVE-PLS versus the values produced using identical parameters with standard PLS as the core algorithm in the overall modeling strategy. As can be seen in the figure, while many of the model comparisons yield relatively small differences, the UVE-PLS results are generally more accurate than the PLS results. At the very least, the use of UVE-PLS improves overall modeling results in aggregate, because those few cases in which the PLS models are significantly more accurate than corresponding UVE-PLS models are centered around lower R2 values along both axes, i.e., those models that underperformed regardless of whether PLS or UVE-PLS is used. Based on these results and similar trends found during initial development work, it was decided early in the course of the present work that the peak area metadata would be modeled using UVE-PLS. The use of a single, noniterated UVE-PLS modeling procedure is a significant improvement over the more complex unmasking procedure necessary to make the time slice strategy workable. It should also be noted that the additional iterated modeling steps used in the time slice strategy were not evaluated in the context of the peak area strategy because these steps were only implemented based on their practical effectiveness at data unmasking within the strict context of the time slice strategy.

Figure 2. Correlation coefficient (R2) values, found for early modeling strategies, with the values found when using UVE-PLS as the central algorithm plotted versus the models produced using identical parameters save for the use of standard PLS as the central algorithm.

UVE-PLS Evaluation. As mentioned previously, UVE-PLS has a customizable parameter in the number of randomized variables to be included during the peak area model construction. During the present work, then, it was necessary to adequately define the value for this parameter. Because one of the long-term goals of the current research effort is to accommodate data originating from multiple instruments, another model evaluation was undertaken during initial model development using three GC-MS spectra, collected for the same petrochemical jet fuel sample. Two of these replicates were collected on the same instrument from which the calibration data were initially derived, taken over a year apart to monitor potential instrument drift and potential time-based compositional changes in stored fuels. The third sample was a part of the secondary data collection undertaken using a significantly different hardware configuration and analysis parameters, as described previously. The data evaluated by and used to calibrate these early models utilized a peak area threshold of 0.006%, as was seen in the work undertaken to eliminate the iterative strategy in the first place. At this point in the research, the unit area TIC normalization procedure had been finalized and was thus implemented during all metaspectral data constructions seen here and in the remainder of the report. Figure 3 shows the RMSEP results arising from these three jet fuel replicates when models constructed using various numbers of randomized UVE-PLS variables were used to predict colligative and compositional fuel property results. The goal of this exercise was to determine if the number of randomized variables affects the fuel property predictions either positively or negatively by decreasing or increasing the RMSEP values, respectively, of the results obtained from these fuels. The data shown in Figure 3 indicate that no clear systematic trends are present, although it should be noted that the use of larger numbers of randomized variables tends to avoid the occasional nonsystematic increases in RMSEP that were observed with some predicted fuel properties when using models constructed from lower numbers of randomized variables, such as can be seen in the pour point results. It was finally decided that such improvements were too sporadic to solely justify the larger calculation times associated with larger numbers of random variables. Because no overwhelming trends could be determined, the goal became to balance the mild trend of larger numbers of randomized UVE-PLS variables producing slightly more accurate models with the necessity of keeping the model constructions as fast as reasonably possible, especially during initial development work. The compromise reached was to set the number of randomized variables equal to one-third of the total number of variables, i.e., compounds that could be found, per fuel property, in the initial training data.

Figure 3. RMSEP values, found for three replicate fuel samples collected on two different instruments, obtained using various fuel property models (as indicated by x-axis) constructed using various numbers of randomized UVE-PLS variables (as indicated in the legend).

Peak Area Threshold. Once the use of the UVE-PLS modeling strategy and a method by which to select a reasonable number of randomized variables to be used in the context of this strategy were determined, the next task was to determine the peak area threshold that would allow for the most useful models in the context of the developing peak area strategy. This was a relatively straightforward operation, with a series of models being constructed using multiple peak area thresholds and their RMSEP/minimum RMSEP ratios subsequently compared. Those fuel properties showing ratios above 2 were eliminated based on the theory that changes of such magnitude were not as indicative of relative model quality as they were of uncharacteristic, random modeling errors. More-restrictive peak area thresholds would potentially eliminate more-spurious compound identifications, but would also eliminate more-legitimate compound identifications in the process. However, even the least restrictive peak area thresholds stand some probability of excluding low-area but nonetheless compositionally informative results from the metaspectra thus constructed. Thus, this examination of various peak area thresholds also inherently investigates the balance that must be struck between eliminating desired and undesired data phenomena.

Figure 4. RMSEP/minimum RMSEP ratios, collected from the training dataset, obtained using various fuel property models constructed using various peak area thresholds.
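The RMSEP/minimum RMSEP comparison used to examine peak area thresholds can be sketched as follows. This is a minimal illustration that assumes the RMSEP of every fuel property model has already been computed at every candidate threshold; the function name, the nested-dictionary layout, and the example numbers are hypothetical. The sketch only builds the ratios and removes properties whose ratios exceed 2; it does not select a threshold.

```python
def rmsep_ratios(rmsep_by_threshold, max_ratio=2.0):
    """Return {property: {threshold: RMSEP / minimum RMSEP}}.

    rmsep_by_threshold : {threshold: {property: rmsep}} for each candidate
                         peak area threshold.
    max_ratio          : properties with any ratio above this value are
                         dropped from the comparison.
    """
    thresholds = list(rmsep_by_threshold)
    if not thresholds:
        return {}
    # Only properties modeled at every threshold can be compared.
    shared = set.intersection(*(set(rmsep_by_threshold[t]) for t in thresholds))
    ratios = {}
    for prop in sorted(shared):
        values = [rmsep_by_threshold[t][prop] for t in thresholds]
        best = min(values)
        if best <= 0:
            continue  # a zero RMSEP makes the ratio undefined
        row = {t: value / best for t, value in zip(thresholds, values)}
        if max(row.values()) <= max_ratio:
            ratios[prop] = row
    return ratios


# Hypothetical RMSEP values for two thresholds (in percent of total TIC area).
example = {
    0.001: {"density": 0.010, "pour point": 4.0},
    0.005: {"density": 0.012, "pour point": 10.0},
}
result = rmsep_ratios(example)
```

Here the hypothetical pour point model is excluded because its RMSEP changes by a factor of 2.5 between thresholds, while the density ratios remain available for comparison.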


Table 1. Performance Metrics of Fuel Property Models Produced Using Both the Time Slice Strategy and Peak Area Strategy (a)

property model | ref source | primary data R2, time slice | primary data R2, peak area | primary data RMSEP, time slice | primary data RMSEP, peak area | secondary data RMSEP, time slice | secondary data RMSEP, peak area
acid number | 52 | 0.89 | 0.72 | 0.02 | 0.04 | 0.17 | 0.12
aromatics | 53 | 0.97 | 0.96 | 1.86 | 2.05 | 24.59 | 22.70
cetane index | 54 | 0.89 | 0.85 | 1.44 | 1.67 | 13.27 | 10.11
density | 55 | 0.93 | 0.92 | 0.01 | 0.01 | 0.07 | 0.05
distillation, 10% | 56 | 0.88 | 0.93 | 11.16 | 6.71 | 45.35 | 20.86
distillation, 20% | 56 | 0.94 | 0.98 | 7.32 | 4.37 | 73.45 | 29.55
distillation, 50% | 56 | 0.91 | 0.98 | 13.05 | 4.99 | 189.61 | 43.00
distillation, 90% | 56 | 0.86 | 0.91 | 18.90 | 13.62 | 63.91 | 71.03
distillation, FBP | 56 | 0.84 | 0.90 | 20.93 | 14.75 | 97.20 | 73.98
distillation, IBP | 56 | 0.82 | 0.79 | 10.33 | 8.41 | 77.76 | 11.42
flashpoint | 57 | 0.89 | 0.79 | 4.01 | 5.54 | 23.13 | 25.43
freeze point | 58 | 0.83 | 0.81 | 2.36 | 2.47 | 20.19 | 3.95
hydrogen | 59, 60 | 0.93 | 0.92 | 0.65 | 0.69 | 0.53 | 2.94
naphthalenes | 61 | 0.91 | 0.85 | 0.25 | 0.29 | 1.96 | 1.49
olefins | 62 | 0.96 | 0.99 | 0.10 | 0.21 | 2.45 | 1.61
particulates | 63 | 0.81 | 0.61 | 2.82 | 4.03 | 7.34 | 1.38
pour point | 64 | 0.97 | 0.93 | 4.65 | 6.68 | 23.77 | 50.72
saturates | 62 | 0.98 | 0.98 | 0.58 | 0.56 | 1.29 | 0.70
viscosity @ 40 °C | 65 | 0.95 | 0.66 | 0.86 | 1.03 | 19.40 | 9.33
FSII | 66 | 0.77 | 0.54 | 0.01 | 0.02 | 0.15 | 0.09
cloud point | 67 | 0.75 | 0.71 | 3.11 | 3.39 | 30.84 | 19.03
existent gum | 68 | 0.66 | 0.57 | 1.51 | 1.69 | 7.70 | 0.61
lubricity | 69 | 0.65 | 0.46 | 0.03 | 0.04 | 0.00 | 0.11
storage stability | 70 | 0.55 | 0.32 | 0.73 | 0.90 | 1.73 | 2.50
viscosity @ −20 °C | 65 | 0.55 | 0.34 | 0.75 | 0.90 | 5.42 | 2.19

(a) For each time slice/peak area pairing, the results indicating a higher level of modeling accuracy are highlighted in bold italic font.
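For readers who wish to reproduce metrics of the kind reported in Table 1, the following Python sketch shows one conventional way to compute them. It assumes that RMSEP is the plain root mean square of the prediction errors and that R2 is the squared linear correlation coefficient between predicted and measured values; the exact conventions used in this work (for example, any degrees-of-freedom correction) are not restated here, so both function definitions should be read as assumptions.

```python
import math


def rmsep(measured, predicted):
    """Root mean square error of prediction over paired values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("inputs must hold at least two paired values")
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant inputs")
    return cov * cov / (var_m * var_p)


# Hypothetical measured and predicted property values.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
error = rmsep(measured, predicted)
fit = r_squared(measured, predicted)
```

Because RMSEP carries the units of the property being modeled, values in different rows of the table are not directly comparable with one another.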

The results of this operation are shown in Figure 4, which, unfortunately, does not clearly and unambiguously inform the parameter decision that must be made. However, it can still be seen in Figure 4 that there are occasional large RMSEP ratios with respect to the highest, most-permissive area threshold of 0.005%. Therefore, in the present analysis, it was determined that using a peak area threshold of 0.001% was the best choice for the current modeling procedure. In other words, a normalized peak must possess at least 0.001% of the total area of the entire normalized TIC spectrum for a given sample at any given retention time before being collated into the desired metaspectra. This happens to be the same threshold value that was previously determined to be optimal for use with the complementary compositional profiling algorithms already developed.

Peak Area Modeling Results. The use of the novel peak area strategy produces models that, on average, underperform those models produced using the time slice strategy with respect to the RMSEP and R2 results obtained from the actual calibration data, as can be seen in Table 1. However, it must again be noted that the previous time slice strategy might be anticipated to outperform the novel peak area strategy, because of instrument-based overfitting. The peak area strategy, by making no assumptions regarding where in a dataset usable compositional information can be found, does not take into account the information inadvertently provided by modeling the data that appears at predetermined intervals along a single instrument’s retention time axis. Such inadvertent information might indeed inform predictive modeling results on a single instrument, but this same information is where possible instrument-based overfitting will originate. To test the hypothesis that instrumental overfitting may be contributing to the peak area strategy’s apparent, relative underperformance, data from another instrument must be subjected to the models constructed based on the first instrument’s data. The results of such an evaluation can also be found in Table 1 and, in fact, the lower RMSEP values obtained from six uncalibrated fuel samples from the secondary GC-MS instrument indicate that the peak area method was less sensitive to variations between instruments and chromatographic methods. R2 values are not provided for the secondary data only because there were not enough relevant secondary samples to produce meaningful correlation coefficient comparisons. One could, of course, make the argument that an analysis strategy that is overfit to a single instrument, if it outperforms a competing multi-instrument technique, is still the appropriate choice if only that single instrument is ever used for a given analysis challenge. However, the broad scope of fuel property modeling is better served by the construction of data modeling strategies that are broadly applicable across the multiple disparate instruments possessed by parties interested in exactly this type of modeling. While the peak-based abstraction modeling strategy may slightly underperform the time-slice methodology, the results shown in Table 1 indicate that the peak area methodology provides more-robust models that maintain calibration transfer across multiple instruments. It is also less computationally intensive, reducing computational times by almost an order of magnitude. This increases the potential applicability of GC-MS data to fuel modeling challenges when compared to the previous time slice strategy.

CONCLUSIONS

While statistical modeling cannot replace traditional property measurements, it does offer the advantage of predicting many critical fuel properties with very small sample sizes, when it is either impractical or too costly to obtain sufficient sample quantities for laboratory testing. Compositional property modeling also offers the potential to rapidly predict the likelihood of specification compliance of different blend ratios of fuels and alternative fuels in a relatively short time. In this context, statistical predictions of critical fuel properties could serve as a useful tool to prescreen fuels to determine if they would be suitable candidates for further Fit-For-Purpose (FFP) certification testing. A novel metaspectral modeling strategy has been developed to correlate the composition of a complex organic mixture, in this case fuels, with measured properties and characteristics. This strategy is a refinement upon previous work that was also designed to maintain this type of relevant chemical information in fuel property models. Both this work and the previous work were performed to produce fuel property correlations, derived as directly as possible from available chemical information, which would accurately reflect real and fundamental relationships between fuel composition and performance parameters. This is important because of the widely varying chemical compositions of fuels, both petrochemical and alternative, that cannot be reliably predicted. With no assumptions available as to what might be in a fuel, composition must be modeled directly. The key to performing the partial least squares (PLS) statistical analysis of complex gas chromatography−mass spectrometry (GC-MS) data is the reduction of the dimensionality of the data with a suitable abstraction strategy. The initial comprehensive, but computationally intensive, time slice abstraction approach provided accurate models, but required a separate application of iterated unmasking steps to remove statistically and chemically irrelevant chemical constituents from the training data. In addition, the time slice abstraction methodology was tied to a given GC retention time profile and was therefore not robust with respect to variations, either within a particular instrument, or between different instruments. A peak-based abstraction strategy was thus developed and was shown to be effective in modeling the chemical constituencies of fuel properties. Although the time slice strategy was more effective in terms of immediate model quality, it was also shown that the peak area models are significantly more capable with respect to accommodating data from varied experimental parameters and instrumentation, which provides a more robust and practical application of the models. This new abstraction strategy also provides for the use of peak area thresholds, by which noisy data can be excluded, thereby directly reducing the number of overall masking compounds to be subsequently analyzed. The use of this novel peak area metaspectral construction strategy consequently simplified the resulting fuel property modeling, which can now proceed using a self-contained uninformative variable elimination PLS (UVE-PLS) algorithm with no iterations, as opposed to the multistep, iterative modeling procedure necessary for the previous time slice strategy. This provided significant improvements in computational speed, thus allowing for the practical implementation of these modeling algorithms in stand-alone software applications for fuel certification screening. The present work serves as further validation of our underlying research hypothesis that a composition-based property modeling approach will overcome the inherent limitations of less comprehensive chemometric fuel property modeling in accommodating fuels that are not a part of the calibration dataset. This is a critical requirement for any process used to accommodate novel alternative fuels that do not have known physical and chemical properties.

The next phase of this research will be to extend the present modeling strategy to both different types of GC-MS data and comprehensive two-dimensional gas chromatography with time-of-flight mass spectrometric detection (GC×GC-TOFMS). In the case of GC×GC-TOFMS, it is envisioned that the additional dimension of separation would allow for greater resolution of possible components from which to derive accurate and robust PLS models. Conceptually, this could also present a greater multi-instrument calibration transfer challenge than encountered in GC-MS, and may require extending existing peak and/or landmark alignment methods to the 2D data set. While GC×GC-TOFMS can be analyzed to produce lists of compounds in much the same manner as GC-MS, comprehensive chemometric data analysis and modeling may also require the development of data compression methodologies in order to accommodate the larger data sizes. Another avenue of research to be investigated is the implementation of mass spectral response factors into the overall analysis scheme. Because different compound classes (e.g., aromatics versus aliphatics) do not possess the same electron impact ionization efficiencies, the resulting peak areas do not necessarily represent equal compound quantities. Response factor calibration across all compound classes found in mobility fuels would improve the quantitative accuracy of the PLS property models.



AUTHOR INFORMATION

Corresponding Author

*Tel.: 202-404-3419. Fax: 202-767-1716. E-mail: jeffrey. [email protected].

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

The authors would like to thank the Office of Naval Research (ONR) for supporting this work. The authors also acknowledge the Naval Air Systems Command (NAVAIR) for providing assistance in collecting secondary GC-MS data.



REFERENCES

(1) Morris, R. E.; Hammond, M. H.; Cramer, J. A.; Johnson, K. J.; Giordano, B. C.; Kramer, K. E.; Rose-Pehrsson, S. L. Energy Fuels 2009, 23, 1610−1618.
(2) Geladi, P.; Kowalski, B. R. Anal. Chim. Acta 1986, 185, 1−17.
(3) Cramer, J. A.; Morris, R. E.; Giordano, B.; Rose-Pehrsson, S. L. Energy Fuels 2009, 23, 894−902.
(4) Cramer, J. A.; Morris, R. E.; Hammond, M. H.; Rose-Pehrsson, S. L. Energy Fuels 2009, 23, 1132−1133.
(5) Cramer, J. A.; Kramer, K. E.; Johnson, K. J.; Morris, R. E.; Rose-Pehrsson, S. L. Chemom. Int. Lab. Syst. 2008, 92, 13−21.
(6) Cramer, J. A.; Morris, R. E.; Rose-Pehrsson, S. L. Energy Fuels 2010, 24, 5560−5572.
(7) Harvey, L.; Green, D. Assessment and Evaluation in Higher Education 1993, 18, 9−34.
(8) Van Nostrand, K. P.; Harrison, J. A. Prepr. Pap. - Am. Chem. Soc., Div. Energy Fuels Prepr. 2013, 58 (1), 863.
(9) Liu, G.; Wang, L.; Qu, H.; Shen, H.; Zhang, X.; Zhang, S.; Mi, Z. Fuel 2007, 86, 2551−2559.
(10) Hupp, A. M.; Marshall, L. J.; Campbell, D. L.; Smith, R. W.; McGuffin, V. L. Anal. Chim. Acta 2008, 606, 159−171.
(11) Fernandez-Varela, R.; Andrade, J. M.; Muniategui, S.; Prada, D.; Ramirez-Villalobos, F. Water Res. 2009, 43, 1015−1026.
(12) Sun, X.; Zimmerman, C. M.; Jackson, G. P.; Bunker, C. E.; Harrington, P. B. Talanta 2011, 83, 1260−1268.
(13) Pedroso, M. P.; Fonseca de Godoy, L. A.; Ferreira, E. C.; Poppi, R. J.; Augusto, F. J. Chromatogr. A 2008, 1201, 176−182.
(14) Zeng, Z.-D.; Hugel, H. M.; Marriott, P. J. Anal. Bioanal. Chem. 2011, 401, 2373−2386.
(15) Zorzetti, B. M.; Harynuk, J. J. Anal. Bioanal. Chem. 2011, 401, 2423−2431.
(16) Niu, Y.; Zhang, X.; Xiao, Z.; Song, S.; Eric, K.; Jia, C.; Yu, H.; Zhu, J. J. Chromatogr. B 2011, 879, 2287−2293.
(17) Yang, L.; Bennett, R.; Strum, J.; Ellsworth, B. B.; Hamilton, D.; Tomlinson, M.; Wolf, R. W.; Housley, M.; Roberts, B. A.; Welsh, J.; Jackson, B. J.; Wood, S. G.; Banka, C. L.; Thulin, C. D.; Linford, M. R. Anal. Bioanal. Chem. 2009, 393, 643−654.
(18) Jalali-Heravi, M.; Parastar, H.; Sereshti, H. Anal. Chim. Acta 2008, 623, 11−21.
(19) Amador-Muñoz, O.; Villalobos-Pietrini, R.; Aragón-Piña, A.; Tran, T. C.; Morrison, P.; Marriott, P. J. J. Chromatogr. A 2008, 1201, 161−168.
(20) Huang, X.; Shao, L.; Gong, Y.; Mao, Y.; Liu, C.; Qu, H.; Cheng, Y. J. Chromatogr. B 2008, 870, 178−185.
(21) Marshall, L. J.; McIlroy, J. W.; McGuffin, V. L.; Smith, R. W. Anal. Bioanal. Chem. 2009, 394, 2049−2059.
(22) Song, S.; Zhang, X.; Hayat, K.; Jia, C.; Xia, S.; Zhong, F.; Xiao, Z.; Tian, H.; Niu, Y. Sens. Actuators B 2010, 147, 660−668.
(23) Miao, L.; Cai, W.; Shao, X. Talanta 2011, 83, 1247−1253.
(24) Bernabei, M.; Reda, R.; Galiero, R.; Bocchinfuso, G. J. Chromatogr. A 2003, 985, 197−203.
(25) Kaspar, H.; Dettmer, K.; Gronwald, W.; Oefner, P. J. J. Chromatogr. B 2008, 870, 222−232.
(26) Cramer, J. A.; Begue, N. J.; Morris, R. E. J. Chromatogr. A 2011, 1218, 824−832.
(27) Cramer, J. A.; Morris, R. E.; Begue, N. J. Proc. Int. Assoc. Stab. Handl. Use Liq. Fuels (IASH) 2011.
(28) Cramer, J. A.; Begue, N. J.; Morris, R. E. Developing Fundamental Relationships Between Chemical Composition and Chemical Effects: GC-MS Profiling and Chemometrics. Presented at the Chemical and Biological Defense Science and Technology Conference, Las Vegas, NV, Nov. 14−18, 2011; Poster No. M01-002.
(29) Begue, N. J.; Cramer, J. A.; Bargen, C. V.; Myers, K. M.; Johnson, K. J.; Morris, R. E. Energy Fuels 2011, 25, 1617−1623.
(30) Teutenberg, T.; Tuerk, J.; Holzhauser, M.; Kiffmeyer, T. K. J. Chromatogr. A 2006, 1119, 197−201.
(31) Pell, R. J. Chemom. Int. Lab. Syst. 2000, 52, 87−104.
(32) Rousseeuw, P. T.; van Zomeren, B. C. J. Am. Stat. Assoc. 1990, 85, 633−639.
(33) Li, B.; Martin, E.; Morris, J. Chemom. Int. Lab. Syst. 2008, 94, 104−111.
(34) Centner, V.; Massart, D. L.; de Noord, O. E.; de Jong, S.; Vandeginste, B. M.; Sterna, C. Anal. Chem. 1996, 68, 3851−3858.
(35) Gonzalez, F. R.; Romero, L. M. J. Chromatogr. A 2006, 1128, 203−207.
(36) Booksh, K. S.; Kowalski, B. R. Anal. Chem. 1994, 66, 782A−791A.
(37) http://www.nist.gov/srd/nist1a.cfm, last accessed Nov. 2013.
(38) http://www.mathworks.com/products/matlab/, last accessed Nov. 2013.
(39) http://www.eigenvector.com/software/pls_toolbox.htm, last accessed Nov. 2013.
(40) http://www.vub.ac.be/fabi/publiek/index.html, last accessed Nov. 2013.
(41) Golub, G. H.; Reinsch, C. Numer. Math. 1970, 14, 403−420.
(42) http://mathworld.wolfram.com/SingularValueDecomposition.html, last accessed Nov. 2013.
(43) Beebe, K. R.; Pell, R. J.; Seasholtz, M. B. Chemometrics: A Practical Guide; Wiley: New York, 1998; pp 93−94.
(44) Anderssen, E.; Dyrstad, K.; Westad, F.; Martens, H. Chemom. Int. Lab. Syst. 2006, 84, 69−74.
(45) Gidskehaug, L.; Anderssen, E.; Alsberg, B. K. Chemom. Int. Lab. Syst. 2008, 93, 1−10.
(46) Esbensen, K. H.; Geladi, P. J. Chemom. 2010, 24, 168−187.
(47) Haaland, D. M.; Thomas, E. V. Anal. Chem. 1988, 60, 1193−1202.
(48) Thomas, E. V. J. Chemom. 2003, 17, 653−659.
(49) Lin, W. Q.; Jiang, J. H.; Shen, Q.; Shen, G. L.; Yu, R. Q. J. Chem. Inf. Model. 2005, 45, 486−493.
(50) Kramer, K. E.; Morris, R. E.; Rose-Pehrsson, S. L.; Cramer, J. A.; Johnson, K. J. Energy Fuels 2008, 22, 523−534.
(51) Stein, S. E. J. Am. Soc. Mass Spectrom. 1999, 10, 770−781.
(52) ASTM Standard D3242. Standard Test Method for Acidity in Aviation Turbine Fuel. In 2011 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2011; DOI: 10.1520/D3242-11. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(53) ASTM Standard D6379. Standard Test Method for Determination of Aromatic Hydrocarbon Types in Aviation Fuels and Petroleum Distillates - High Performance Liquid Chromatography Method with Refractive Index Detection. In 2011 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2011; DOI: 10.1520/D6379-11. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(54) ASTM Standard D976. Standard Test Method for Calculated Cetane Index of Distillate Fuels. In 2011 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2011; DOI: 10.1520/D0976-06R11. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(55) ASTM Standard D4052. Standard Test Method for Density, Relative Density, and API Gravity of Liquids by Digital Density Meter. In 2011 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2011; DOI: 10.1520/D4052-11. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(56) ASTM Standard D86. Standard Test Method for Distillation of Petroleum Products at Atmospheric Pressure. In 2012 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2012; DOI: 10.1520/D0086-12. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(57) ASTM Standard D93. Standard Test Methods for Flash Point by Pensky-Martens Closed Cup Tester. In 2013 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2013; DOI: 10.1520/D0093. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(58) ASTM Standard D5972. Standard Test Method for Freezing Point of Aviation Fuels (Automatic Phase Transition Method). In 2010 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2010; DOI: 10.1520/D5972-05R10. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(59) ASTM Standard D3701. Standard Test Method for Hydrogen Content of Aviation Turbine Fuels by Low Resolution Nuclear Magnetic Resonance Spectrometry. In 2012 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2012; DOI: 10.1520/D3701-01R12. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(60) ASTM Standard D3343. Standard Test Method for Estimation of Hydrogen Content of Aviation Fuels. In 2010 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2010; DOI: 10.1520/D3343-05R10. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(61) ASTM Standard D1840. Standard Test Method for Naphthalene Hydrocarbons in Aviation Turbine Fuels by Ultraviolet Spectrophotometry. In 2013 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2013; DOI: 10.1520/D1840. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(62) ASTM Standard D1319. Standard Test Method for Hydrocarbon Types in Liquid Petroleum Products by Fluorescent Indicator Adsorption. In 2013 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2013; DOI: 10.1520/D1319. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(63) ASTM Standard D5452. Standard Test Method for Particulate Contamination in Aviation Fuels by Laboratory Filtration. In 2012 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2012; DOI: 10.1520/D5452-12. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(64) ASTM Standard D5949. Standard Test Method for Pour Point of Petroleum Products (Automatic Pressure Pulsing Method). In 2010 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2010; DOI: 10.1520/D5949-10. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(65) ASTM Standard D445. Standard Test Method for Kinematic Viscosity of Transparent and Opaque Liquids (and Calculation of Dynamic Viscosity). In 2012 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2012; DOI: 10.1520/D0445-12. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(66) ASTM Standard D5006. Standard Test Method for Measurement of Fuel System Icing Inhibitors (Ether Type) in Aviation Fuels. In 2011 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2011; DOI: 10.1520/D5006-11. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(67) ASTM Standard D2500. Standard Test Method for Cloud Point of Petroleum Products. In 2011 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2011; DOI: 10.1520/D2500-11. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(68) ASTM Standard D381. Standard Test Method for Gum Content in Fuels by Jet Evaporation. In 2012 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2012; DOI: 10.1520/D0381-12. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(69) ASTM Standard D5001. Standard Test Method for Measurement of Lubricity of Aviation Turbine Fuels by the Ball-on-Cylinder Lubricity Evaluator (BOCLE). In 2012 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2010; DOI: 10.1520/D5001-10. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(70) ASTM Standard D5304. Standard Test Method for Assessing Middle Distillate Fuel Storage Stability by Oxygen Overpressure. In 2011 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2011; DOI: 10.1520/D5304-11. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)
(71) ASTM Standard D4294. Standard Test Method for Sulfur in Petroleum and Petroleum Products by Energy Dispersive X-ray Fluorescence Spectrometry. In 2010 ASTM Annual Book of Standards; ASTM International: West Conshohocken, PA, 2010; DOI: 10.1520/D4294-10. (Available via the Internet at www.astm.org, last accessed Jan. 2014.)


dx.doi.org/10.1021/ef4021872 | Energy Fuels 2014, 28, 1781−1791