Rapid Prediction of Camelina Seed Oil Content Using Near-Infrared Spectroscopy


Article

Rapid prediction of oil content of camelina seeds using near-infrared spectroscopy
Ke Zhang, Zhenglin Tan, Chengci Chen, Xiuzhi Susan Sun, and Donghai Wang
Energy Fuels, Just Accepted Manuscript • DOI: 10.1021/acs.energyfuels.6b02762 • Publication Date (Web): 04 Apr 2017


Energy & Fuels is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Energy & Fuels

Rapid prediction of camelina seed oil content using near-infrared spectroscopy

Ke Zhang a,#, Zhenglin Tan a,b,#, Chengci Chen c, Xiuzhi Susan Sun d, and Donghai Wang a,*

a Department of Biological and Agricultural Engineering, Kansas State University, Manhattan, KS 66506, USA
b College of Tour and Hotel Management, HuBei University of Economics, Wuhan, 430205, P.R. China
c Central Agricultural Research Center, Montana State University, Moccasin, MT 59462, USA
d Department of Grain Science and Industry, Kansas State University, Manhattan, KS 66506, USA

# The first two authors contributed equally to this work.
* Corresponding author. Telephone: 785-5322919, Fax: 785-5325825. E-mail: [email protected].

Abstract

Camelina is a promising feedstock because it can provide high-quality edible oil and jet fuel. However, measuring its oil content currently requires time- and labor-intensive analysis. Near-infrared (NIR) spectroscopy provides a rapid, low-cost approach to oilseed characterization. The objective of this study was to develop an NIR model to predict camelina oil content using 200 camelina seed samples. Partial least squares (PLS) regression and principal component regression (PCR) were compared for calibration models built on the full spectral range (4,000-10,000 cm-1). PLS regression showed better prediction performance than PCR. The optimal model provided an excellent fit, with an R2 of 0.94 and a root mean square error of prediction (RMSEP) of 0.495%, making the model useful in various applications, including quality assurance and screening. This study confirmed that the NIR method significantly reduces the time (from 60 to 1 min) and cost (from 20 to 1 USD) required to determine camelina seed oil content. The study thus provides a high-throughput, cost-effective method of predicting camelina oil content to facilitate plant breeding and genetics studies. Future work on the NIR model should focus on developing a model system to achieve rapid analysis of genomics at low cost to assist plant feedstock improvement.

Keywords: Camelina; oil content; near-infrared; chemometric analysis; prediction model

1. Introduction

Camelina sativa (L.), commonly known as false flax, is an underutilized oilseed crop belonging to the Brassicaceae family. Camelina oil is a promising candidate for high-quality edible oils because it contains up to 90% unsaturated fatty acids with a high α-linolenic acid (18:3, omega-3) content [1]. It also reduces cholesterol and provides resistance to stress [2]. Recent studies have shown that camelina is a superior feedstock for biodiesel and jet fuel, specifically in fuel performance. Commercial airline and military fighter jet testing showed that jet fuel made from camelina oil reduces greenhouse gas emissions by up to 80% compared to petroleum-based jet fuel [3]. Camelina also offers many agronomic advantages over traditional commodity oilseed crops, such as low requirements for water, fertilizer, and land; good tolerance to adverse environmental conditions; and high resistance to alternaria black spot and other diseases and pests [4, 5]. In general, camelina contains 29.9% to 38.3% oil, 23% to 30% protein, 10% carbohydrates, and 6.6% ash (% w/w), depending on variations in breeding conditions [6]. Many camelina breeding programs have been established to improve camelina characteristics such as oil content, fatty acid composition, seed size, disease resistance, and yield [7-10]. These programs require large numbers of camelina seeds to be rapidly and cost-effectively screened for multiple traits, but the traditional wet chemical method of measuring oil content is time-consuming. Near-infrared (NIR) spectroscopy, by contrast, provides a reliable and efficient prediction tool for food, pharmaceutical, and agricultural applications [11-13].

NIR is a fast approach based on the absorption of radiation by molecular overtone and combination vibrations within a sample. NIR offers advantages such as high throughput, minimal sample preparation, and non-destructive prediction, and it can be used in harsh industrial settings [14, 15]. Previous research has shown that NIR is an effective method for motor oil classification in terms of base stock and viscosity-based classes [16]. Azizian and Kramer [17] developed several useful FT-NIR models to classify 55 oil, fat, and oil/fat mixtures. In addition, NIR discriminant analysis has been conducted on edible oils [18], and many oil parameters (acidity and peroxide index) have been analyzed using NIR [19]. Wang et al. [20] used NIR to discriminate soybean oil adulteration in camellia oils with an R2 of 0.992 and a root mean square error of cross-validation (RMSECV) of 1.79 for a PLS model. Weinstock et al. [21] built an NIR model that predicted oil and oleic acid concentrations in individual corn kernels with a root mean square error of prediction (RMSEP) of 0.7% and 14%, respectively. NIR has also been successfully employed to model the oil contents of peanut [22], soybean seed [23], fish oil [24], and groundnuts [25].

However, no published information exists on modeling the oil content of camelina seed using NIR. The objective of this research was to develop NIR models coupled with chemometrics to rapidly and cost-effectively predict camelina oil content.

2. Materials and Methods

2.1 Materials

The study used two hundred camelina whole-seed samples from different fertility trials, crop rotation studies, and nitrogen treatment studies, provided by the Central Agricultural Research Center, Montana State University, Bozeman, Montana, United States. These camelina seed samples consisted of four cultivars planted in 2013 in unique environments in Montana with various fertility levels, resulting in large variations in oil content.

2.2 Camelina oil content

Camelina seeds were ground using a Micro-Mill grinder (SP Scienceware, NJ, USA) to < 0.5 mm powder and sealed in a plastic bag in a desiccator to homogenize moisture content. Oil content measurement of camelina seeds was based on AOAC method 2003.05 [26]. Briefly, a 2.0 g ground camelina seed sample was weighed in a single-thickness cotton cellulose thimble (80 mm external length by 33 mm internal diameter) and extracted in a Soxtec HT2 apparatus comprising a 1045 Extraction Unit and a 1046 Service Unit (Tecator, Hoganas, Sweden) with 60 ml of hexane (boiling period: 20 min; rinsing period: 40 min). When almost all the hexane had been collected, the extraction cup was dried for 30 min at 103 ℃. The cup was cooled to room temperature and weighed. Extracted oil was collected in a 2-ml centrifuge tube and stored at 7 ℃ for further use. Oil content was calculated as:

\[ \text{Oil content (\%)} = \frac{W_{\text{cup + residue}} - W_{\text{cup}}}{W_{\text{sample}}} \times 100 \tag{1} \]

where W_cup+residue is the weight of the extraction cup plus the extracted residue after drying, W_cup is the weight of the empty cup, and W_sample is the weight of the sample.
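The gravimetric calculation in Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration only: the function name, argument names, and the example weights are hypothetical and do not come from the study.

```python
def oil_content_percent(cup_plus_residue_g: float, cup_g: float, sample_g: float) -> float:
    """Gravimetric oil content (% w/w) from extraction-cup weights, following Eq. (1)."""
    if sample_g <= 0:
        raise ValueError("sample weight must be positive")
    return (cup_plus_residue_g - cup_g) / sample_g * 100.0


# Hypothetical weights (not measured data): a 2.0 g sample leaving about 0.65 g of oil.
example = oil_content_percent(cup_plus_residue_g=45.65, cup_g=45.00, sample_g=2.0)
print(f"oil content: {example:.1f}% w/w")
```

With these illustrative weights the result is roughly 32.5% w/w, which falls inside the range reported for the sample set.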

2.3 NIR spectra

An Antaris II FT-NIR analyzer (Thermo Scientific Incorporated, Madison, WI, U.S.) was used to collect NIR spectra in reflectance mode. Intact camelina seeds were measured using a sample cup spinner combined with an integrating sphere, which rapidly and accurately captures bulk information by spinning the sample cup. Each spectrum was the average of 32 scans at a resolution of 4 cm-1 over the wavenumber range of 4,000-10,000 cm-1.

2.4 Spectra treatment and chemometrics

TQ Analyst 8.6.12 (Thermo Scientific Incorporated, Madison, WI, U.S.) was used to pretreat and analyze the spectra. Path length was corrected using standard normal variate (SNV), which scales each spectrum to a mean of 0 and a standard deviation of 1, and multiplicative signal correction (MSC). MSC is another spectrum processing method, which regresses a measured spectrum against a reference spectrum and then corrects the measured spectrum using the slope. The Savitzky-Golay (SG) filter was applied to the digital data points to increase the signal-to-noise ratio and reduce random noise. First and second derivatives were used as a treatment method to resolve spectral peak overlap and eliminate linear baseline drift. The first derivative is the rate of change of absorbance with respect to wavelength, whereas the second derivative corresponds to the curvature or concavity of the spectrum. First- and second-derivative spectra were compared based on the RMSEP, R2, and RPD of the models.
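To make the pretreatments above concrete, the following is a minimal pure-Python sketch of SNV, MSC, and a first derivative. It is not the TQ Analyst implementation. The function names and toy spectra are hypothetical, the MSC shown is the common textbook form that removes the fitted offset as well as dividing by the slope (an assumption beyond the description above), and the derivative is a plain central difference without Savitzky-Golay smoothing.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum to mean 0 and scale to SD 1."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(a - m) / s for a in spectrum]


def msc(spectrum, reference):
    """Multiplicative signal correction against a reference spectrum.

    Fits spectrum = intercept + slope * reference by least squares, then
    subtracts the intercept and divides by the slope (assumed textbook form).
    """
    mx, my = mean(reference), mean(spectrum)
    sxx = sum((x - mx) ** 2 for x in reference)
    sxy = sum((x - mx) * (y - my) for x, y in zip(reference, spectrum))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [(y - intercept) / slope for y in spectrum]


def first_derivative(spectrum, step=1.0):
    """Central-difference first derivative on interior points (no smoothing)."""
    return [(spectrum[i + 1] - spectrum[i - 1]) / (2.0 * step)
            for i in range(1, len(spectrum) - 1)]


# Toy data (not real spectra): the "measured" spectrum is the reference with
# an additive offset and a multiplicative scatter factor applied.
reference = [0.10, 0.20, 0.40, 0.30, 0.20]
measured = [0.05 + 1.5 * r for r in reference]
corrected = msc(measured, reference)   # recovers the reference shape
scaled = snv(measured)                 # mean 0, standard deviation 1
slope_spectrum = first_derivative(measured)
```

In practice a smoothing derivative would normally be used instead of the raw central difference; in Python, `scipy.signal.savgol_filter` with its `deriv` argument is one option.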

In order to eliminate bias in the subsets and obtain a calibration subset and a prediction subset with a ratio of 3 to 1, all 200 camelina seed samples were arranged in descending order based on measured value. One in every four spectra was randomly selected for the prediction subset (50), with the remaining spectra used as calibration samples (150). Both the full spectral range (4,000 to 10,000 cm-1) and a reduced spectral range (7500-7000, 5600-5000, 4700-4250 cm-1) were used. PLS and PCR methods described in previous publications [27, 28] were applied for model building. Model performance was evaluated in terms of R2, RMSEP, and the ratio of the standard deviation of the calculated subset (SDy) to RMSEP (RPD), calculated with the following equations:

\[ \mathrm{RMSEP} = \sqrt{\frac{1}{n_p} \sum_{i=1}^{n_p} (\hat{y}_i - y_i)^2} \tag{2} \]

\[ \mathrm{RPD} = \frac{SD_y}{\mathrm{RMSEP}} \tag{3} \]

where n_p is the number of samples in the prediction subset, and y_i and ŷ_i are the measured value and predicted value of the i-th sample, respectively.
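Equations (2) and (3) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and SDy is taken here as the sample standard deviation of the measured reference values in the prediction subset, which is an assumption about how SDy was computed.

```python
import math
from statistics import stdev


def rmsep(measured, predicted):
    """Root mean square error of prediction, Eq. (2)."""
    n_p = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n_p)


def rpd(measured, predicted):
    """Ratio of the reference-value standard deviation to RMSEP, Eq. (3)."""
    return stdev(measured) / rmsep(measured, predicted)


# Consistency check using values reported in Tables 1 and 2:
# prediction-subset SD of 2.0% and best-model RMSEP of 0.495%.
reported_rpd = 2.0 / 0.495
print(f"RPD from reported SD and RMSEP: {reported_rpd:.2f}")
```

The ratio of the reported standard deviation to the reported RMSEP is about 4.04, in line with the RPD of 4 listed for the optimal model.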

The Chauvenet test was applied to remove outliers, defined as points at distances greater than 3.0 in the principal component (PC) space. The predicted residual error sum of squares (PRESS) diagnostic function was used to determine the number of factors necessary for calibration.
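For readers unfamiliar with the Chauvenet test, the following sketch shows the classical one-dimensional criterion: a value is rejected when the expected number of equally extreme observations in a sample of size n falls below 0.5. This is a textbook form, assumed here for illustration; it is not necessarily how TQ Analyst applies the test to distances in PC space, and the function name and data are hypothetical.

```python
import math
from statistics import mean, stdev


def chauvenet_outliers(values):
    """Return indices of values rejected by the classical Chauvenet criterion."""
    n = len(values)
    m, s = mean(values), stdev(values)
    flagged = []
    for i, v in enumerate(values):
        z = abs(v - m) / s
        # Two-sided probability of a normal deviate at least this extreme.
        tail = math.erfc(z / math.sqrt(2.0))
        if n * tail < 0.5:
            flagged.append(i)
    return flagged


# Hypothetical distances (arbitrary units); only the last one is far from the rest.
distances = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 15.0]
outlier_idx = chauvenet_outliers(distances)
```

Applying the function to per-sample distances from the cluster centre in score space would be one way to mimic the screening step described above.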

3. Results and Discussion

3.1 Sample statistics

The oil content range, mean, and standard deviation of the camelina seed samples, as measured using reference methods, are summarized in Table 1. The results showed a good range and distribution: 25.9 to 37.7% oil content with a mean of 32.3% and standard deviation of 2.1% for the full set, and a mean of 32.2% and standard deviation of 2.09% for the calibration subset. The range of the calibration subset covered that of the prediction subset, which ran from 28.3 to 36.6% with a mean of 32.5% and standard deviation of 2.0%. The oil content range was consistent with published information [1, 6]. In addition, both subsets had similar and consistent distribution patterns, which supports successful model development.
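The 3:1 calibration/prediction split described in Section 2.4 is what keeps the two subsets similarly distributed. A minimal sketch follows, assuming that "one in every four" means one random draw from each consecutive block of four ranked samples; the function name, seed, and simulated oil contents are hypothetical.

```python
import random


def split_calibration_prediction(values, block=4, seed=0):
    """Return (calibration_idx, prediction_idx) for a rank-stratified 3:1 split.

    Samples are ranked by reference value (descending); within each
    consecutive block of `block` ranked samples, one is drawn at random
    for the prediction subset and the rest go to calibration.
    """
    rng = random.Random(seed)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    calibration, prediction = [], []
    for start in range(0, len(order), block):
        group = order[start:start + block]
        pick = rng.choice(group)
        prediction.append(pick)
        calibration.extend(i for i in group if i != pick)
    return calibration, prediction


# Simulated oil contents (% w/w) for 200 samples, for illustration only.
sim = random.Random(1)
oil = [sim.uniform(25.9, 37.7) for _ in range(200)]
cal_idx, pred_idx = split_calibration_prediction(oil)
print(len(cal_idx), len(pred_idx))
```

Because every block of four neighbouring ranks contributes exactly one prediction sample, the prediction subset spans nearly the same range as the calibration subset.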

3.2 Sample spectra

NIR spectra of the 200 camelina seed samples over the full range (4000-10000 cm-1) are shown in Fig. 1. In general, absorption peaks were observed at 8250, 6800, 5830, and 5765 cm-1, with several small peaks between 4800 and 4550 cm-1. The first peak, around 8250 cm-1, was related to the second overtones of CH stretching vibrations [24]. The second peak was relatively large, from 7200 to 6600 cm-1, resulting from the OH stretch overtone [29]. Peaks at 5830 and 5765 cm-1 corresponded to the first overtones of CH stretching vibrations of -CH3, -CH2, and -HC=CH- [30]. Small peaks between 4800 and 4550 cm-1 were attributed to the C=C and C-H stretch combination tone of cis unsaturated fatty acids (C18:1 and C18:2) [24].

3.3 NIR model development for camelina oil content

Principal component analysis (PCA), employed prior to NIR model development, derived 10 PCs from the spectral data in order to analyze the relevant and interpretable structure in the data. In addition, five outliers were identified based on the Chauvenet test, which defines a probability band that contains all the statistically acceptable data. Although several points in Fig. 2 lie at some distance from the main cluster, these samples were not strictly considered outliers because they represented real variation in industrial conditions. Both PCR and PLS were used to build calibration models; however, PLS consistently outperformed PCR in terms of R2 and RPD, indicating greater quantitative analytical power (Table 2).

The PCA scores showed how each spectrum in the PLS model was represented by each PC during model calibration. Each PC represented an independent factor of spectral variation in the data. Fig. 2 plots the PC analysis scores (PC1 versus PC2) of camelina oil content. PC1 accounted for 98.73% of the variation and described most of the variation in the calibration spectra. PC2 represented 1.06% of the variation in the spectra. Together, the first two PCs accounted for more than 99% of the variation among samples. The pattern of PC score points appeared reasonably random, suggesting good representation of all camelina seed samples. In order to identify a useful spectral region that captured sufficient variation in the data, the PC1 loading was analyzed, as shown in Fig. 3. The orthogonal loading spectra showed strong loadings in PC1 from 7500 to 7000 cm-1 and from 5600 to 5000 cm-1. These spectral regions corresponded to the first overtone of OH and the CH2 overtone, both of which are representative absorptions of fatty acids in oil [29].

Spectral processing steps including SNV, MSC, SG, and derivatives were applied to optimize the models. The optimal model was developed by comparing the performance of every combination of derivative treatments, spectral processing steps, and quantitative methods [31]. SNV enhanced prediction performance compared to MSC and SG, suggesting that SNV corrected the scatter effects caused by light scattering in a slightly non-uniform pile of camelina seeds. First and second derivatives were used to further improve the model, and their performances are summarized in Table 2. The first derivative significantly enhanced the performance of the calibration and prediction models, but no improvement was found for the second derivative, possibly because the second derivative introduced more false information from mathematical artifacts when identifying smaller absorption signals [32]. A reduced spectral range was also applied in order to enhance the goodness-of-fit of the models.

In Fig. 4, the mean spectrum (red line) shows the average of the 200 camelina seed spectra. The variance spectrum (blue line), which shows the variance in all spectra, was calculated from the square root of the spectral variance for each oil content across all spectra. The mean and variance spectra of the 200 camelina seed samples revealed a similar trend, suggesting that the calibration model could be robust. The regression coefficient line as a function of wavelength (black line) indicates wavelengths with large weights in the calibration model. In order to improve the model, three wavelength ranges (7500-7000, 5600-5000, 4700-4250 cm-1) were selected because of their high coefficients and high variance, which indicate high correlation and large variation. The reduced range demonstrated better goodness-of-fit for both calibration and prediction models than the full-range models, with a calibration R2 of 0.97 and, for the prediction sample subset, an R2 of 0.94, RMSEP of 0.495%, and RPD of 4 (Table 2), suggesting that this model provides an accurate prediction and could be used in most applications [33]. The reduced wavelength region correlated best with the samples' oil content and contained less random noise. Previous studies have also reported that similar wavelength regions corresponded closely to the oil unsaturation index and the cis/trans ratio of unsaturated fatty acids [34, 35].
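Restricting a spectrum to the three selected regions amounts to a simple mask on the wavenumber axis. The sketch below assumes inclusive range boundaries and a hypothetical 50 cm-1 sampling grid chosen only to keep the example small; the function and variable names are illustrative.

```python
# Reduced ranges reported in the text, written as (low, high) in cm-1.
REDUCED_RANGES_CM1 = [(7000.0, 7500.0), (5000.0, 5600.0), (4250.0, 4700.0)]


def select_ranges(wavenumbers, absorbances, ranges=REDUCED_RANGES_CM1):
    """Keep only the points whose wavenumber lies in one of the ranges (inclusive)."""
    kept = [(w, a) for w, a in zip(wavenumbers, absorbances)
            if any(lo <= w <= hi for lo, hi in ranges)]
    return [w for w, _ in kept], [a for _, a in kept]


# Hypothetical flat spectrum sampled every 50 cm-1 from 10000 down to 4000 cm-1.
wavenumbers = [10000.0 - 50.0 * i for i in range(121)]
absorbances = [0.5] * len(wavenumbers)
w_sel, a_sel = select_ranges(wavenumbers, absorbances)
print(f"kept {len(w_sel)} of {len(wavenumbers)} points")
```

The same mask would be applied to every spectrum before calibration so that the regression only sees the regions with high coefficients and high variance.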

The PRESS function was used to avoid overfitting the models. In Fig. 5, the R2 value increased sharply up to three factors, reached its peak at five factors, and then leveled off at a constant value. Fig. 6 plots chemical reference data against NIR-predicted values in the calibration and prediction subsets of the final model. This plot illustrates the relationship between the chemical data and the NIR calibration models, as well as the presence of outlier samples, and suggests that the optimal model performed well in predicting camelina seed oil content. The conventional method of measuring the oil content of oilseeds requires grinding the sample and extracting the oil with a chemical solvent, which is time-consuming and labor-intensive. Our NIR method, which scans whole oilseeds, significantly reduced the time (from an estimated 60 to 1 min) and cost (from an estimated 20 to 1 USD) required to determine camelina oil content.

4. Conclusion

This study successfully developed models to predict the oil content of camelina seeds. The optimal model was achieved with PLS quantitative analysis, SNV, and the first derivative, with a prediction R2 of 0.94, RMSEP of 0.495%, and RPD of 4. NIR models could improve the speed of oil content determination and can assist in crop management with increased efficiency.

Acknowledgment

This work was supported by the Biomass Research and Development Initiative Program with grant number 2012-10006-20230 from the U.S. Department of Agriculture National Institute of Food and Agriculture.

Table 1. Oil content of 200 camelina seed samples measured using reference method.

Oil content (%)       Range        Mean    SD
Full set              25.9-37.7    32.3    2.1
Calibration subset    25.9-37.7    32.2    2.1
Prediction subset     28.3-36.6    32.5    2.0

Table 2. NIR calibration and prediction statistics for camelina oil content prediction.

Full range: 10000-4000 cm-1. Reduced range: 7500-7000, 5600-5000, 4700-4250 cm-1.

                  Full range                          Reduced range
                  Raw       First       Second        Raw       First       Second
                  spectra   derivative  derivative    spectra   derivative  derivative
PLS
  R2, cal.        0.83      0.89        0.79          0.93      0.97        0.87
  R2, val.        0.80      0.88        0.67          0.91      0.94        0.81
  RMSEP           0.886     0.668       1.19          0.606     0.495       0.952
  RPD             2.23      2.96        1.66          3.27      4           2.08
  Factors used    2         4           1             5         5           4
PCR
  R2, cal.        0.79      0.78        0.67          0.81      0.83        0.77
  R2, val.        0.67      0.68        0.67          0.85      0.88        0.74
  RMSEP           1.21      1.17        1.15          0.732     0.666       0.974
  RPD             1.63      1.69        1.72          2.71      2.97        2.03
  PCs used        2         10          2             2         2           2

Fig. 1. NIR spectra of 200 camelina seed samples.

Fig. 2. Principal component analysis scores of camelina seed sample oil content for PLS model. [Axes: PC1 score (98.73%) versus PC2 score × 10 (1.06%).]

Fig. 3. Loading for the first principal component of camelina seed sample oil content. [Axes: wavenumbers (cm-1), 10000 to 4000, versus absorbance; trace: PC1 loading.]

Fig. 4. Regression coefficient, mean, and variance spectra of oil content in PLS model. [Axes: wavenumbers (cm-1), 10000 to 4000, versus absorbance and regression coefficient; traces: regression coefficient spectra, mean spectra, variance spectra.]

Fig. 5. PRESS and R2 vs. calculated factors in the prediction model. [Axes: factor versus PRESS and R2.]

Fig. 6. Comparison of calculated versus measured camelina seed sample oil contents. [Axes: measured value (%) versus calculated value (%); series: Calculation, Prediction.]

References

1. Budin, J. T.; Breene, W. M.; Putnam, D. H., Some compositional properties of camelina (Camelina sativa L. Crantz) seeds and oils. Journal of the American Oil Chemists' Society 1995, 72, (3), 309-315.
2. Fu, C.; Zhou, P., Camellia oil: a new special type of plant oil. Food Science and Technology 2003, 2, 19-21.
3. Shonnard, D. R.; Williams, L.; Kalnes, T. N., Camelina-derived jet fuel and diesel: Sustainable advanced biofuels. Environmental Progress & Sustainable Energy 2010, 29, (3), 382-392.
4. Francis, A.; Warwick, S., The biology of Canadian weeds. 142. Camelina alyssum (Mill.) Thell.; C. microcarpa Andrz. ex DC.; C. sativa (L.) Crantz. Canadian Journal of Plant Science 2009, 89, (4), 791-810.
5. Razeq, F. M.; Kosma, D. K.; Rowland, O.; Molina, I., Extracellular lipids of Camelina sativa: Characterization of chloroform-extractable waxes from aerial and subterranean surfaces. Phytochemistry 2014, 106, 188-196.
6. Sampath, A. Chemical characterization of camelina seed oil. Rutgers University-Graduate School-New Brunswick, 2009.
7. Vollmann, J.; Eynck, C., Camelina as a sustainable oilseed crop: Contributions of plant breeding and genetic engineering. Biotechnology Journal 2015.
8. Groeneveld, J. H.; Klein, A. M., Pollination of two oil-producing plant species: Camelina (Camelina sativa L. Crantz) and pennycress (Thlaspi arvense L.) double-cropping in Germany. GCB Bioenergy 2014, 6, (3), 242-251.
9. Guy, S. O.; Wysocki, D. J.; Schillinger, W. F.; Chastain, T. G.; Karow, R. S.; Garland-Campbell, K.; Burke, I. C., Camelina: Adaptation and performance of genotypes. Field Crops Research 2014, 155, 224-232.
10. Matei, F.; Sauca, F.; Dobre, P.; Jurcoane, S., Breeding low temperature resistant Camelina sativa for biofuel production. New Biotechnology 2014, 31, S93.
11. Benito, J.; Ojeda, B.; Rojas, S., Process analytical chemistry: Applications of near infrared spectrometry in environmental and food analysis: An overview. Applied Spectroscopy Reviews 2008, 43, (5), 452-484.
12. Stark, E.; Luchter, K.; Margoshes, M., Near-infrared analysis (NIRA): A technology for quantitative and qualitative analysis. Applied Spectroscopy Reviews 1986, 22, (4), 335-399.
13. Cen, H.; He, Y., Theory and application of near infrared reflectance spectroscopy in determination of food quality. Trends in Food Science & Technology 2007, 18, (2), 72-83.
14. Lestander, T. A.; Johnsson, B.; Grothage, M., NIR techniques create added values for the pellet and biofuel industry. Bioresource Technology 2009, 100, (4), 1589-1594.
15. Cai, W.; Gouveia, L. L., Modeling and Simulation of Maximum Power Point Tracker in Ptolemy. Journal of Clean Energy Technologies 2013, 1, (1).
16. Balabin, R. M.; Safieva, R. Z., Motor oil classification by base stock and viscosity based on near infrared (NIR) spectroscopy data. Fuel 2008, 87, (12), 2745-2752.
17. Azizian, H.; Kramer, J. K., A rapid method for the quantification of fatty acids in fats and oils with emphasis on trans fatty acids using Fourier transform near infrared spectroscopy (FT-NIR). Lipids 2005, 40, (8), 855-867.
18. Yang, H.; Irudayaraj, J.; Paradkar, M. M., Discriminant analysis of edible oils and fats by FTIR, FT-NIR and FT-Raman spectroscopy. Food Chemistry 2005, 93, (1), 25-32.
19. Armenta, S.; Garrigues, S.; De la Guardia, M., Determination of edible oil parameters by near infrared spectrometry. Analytica Chimica Acta 2007, 596, (2), 330-337.
20. Wang, L.; Lee, F. S.; Wang, X.; He, Y., Feasibility study of quantifying and discriminating soybean oil adulteration in camellia oils by attenuated total reflectance MIR and fiber optic diffuse reflectance NIR. Food Chemistry 2006, 95, (3), 529-536.
21. Weinstock, B. A.; Janni, J.; Hagen, L.; Wright, S., Prediction of oil and oleic acid concentrations in individual corn (Zea mays L.) kernels using near-infrared reflectance hyperspectral imaging and multivariate analysis. Applied Spectroscopy 2006, 60, (1), 9-16.
22. Sundaram, J.; Kandala, C. V.; Holser, R. A.; Butts, C. L.; Windham, W. R., Determination of in-shell peanut oil and fatty acid composition using near-infrared reflectance spectroscopy. Journal of the American Oil Chemists' Society 2010, 87, (10), 1103-1114.
23. Esteve Agelet, L.; Armstrong, P. R.; Romagosa Clariana, I.; Hurburgh, C. R., Measurement of single soybean seed attributes by near-infrared technologies. A comparative study. Journal of Agricultural and Food Chemistry 2012, 60, (34), 8314-8322.
24. Cozzolino, D.; Murray, a. I.; Chree, A.; Scaife, J., Multivariate determination of free fatty acids and moisture in fish oils by partial least-squares regression and near-infrared spectroscopy. LWT-Food Science and Technology 2005, 38, (8), 821-828.
25. Misra, J. B.; Mathur, R. S.; Bhatt, D. M., Near-infrared transmittance spectroscopy: A potential tool for non-destructive determination of oil content in groundnuts. Journal of the Science of Food and Agriculture 2000, 80, (2), 237-240.
26. Thiex, N. J.; Anderson, S.; Gildemeister, B., Crude fat, diethyl ether extraction, in feed, cereal grain, and forage (Randall/Soxtec/submersion method): collaborative study. Journal of AOAC International 2003, 86, (5), 888-898.
27. Xu, F.; Yu, J.; Tesso, T.; Dowell, F.; Wang, D., Qualitative and quantitative analysis of lignocellulosic biomass using infrared techniques: a mini-review. Applied Energy 2013, 104, 801-809.
28. Xu, F.; Zhou, L.; Zhang, K.; Yu, J.; Wang, D., Rapid determination of both structural polysaccharides and soluble sugars in sorghum biomass using near-infrared spectroscopy. BioEnergy Research 2015, 8, (1), 130-136.
29. Murray, I. In The NIR spectra of homologous series of organic compounds, Proceedings of the international NIR/NIT conference, 1986; Akademiai Kiado: Budapest, Hungary: 1986; pp 13-28.
30. Miller, C. E., Chemical principles of near-infrared technology. Near-infrared technology in the agricultural and food industries 2001, 2.
31. Ren, G.; Chen, F., Simultaneous Quantification of Ginsenosides in American Ginseng (Panax quinquefolium) Root Powder by Visible/Near-Infrared Reflectance Spectroscopy. Journal of Agricultural and Food Chemistry 1999, 47, (7), 2771-2775.
32. Shenk, J. S.; Workman, J.; Westerhaus, M. O., Application of NIR spectroscopy to agricultural products. Practical Spectroscopy Series 2001, 27, 419-474.
33. Williams, P. C., Implementation of near-infrared technology. Near-infrared technology in the agricultural and food industries 2001, 2, 143.
34. Ismail, A.; Van de Voort, F.; Emo, G.; Sedman, J., Rapid quantitative determination of free fatty acids in fats and oils by Fourier transform infrared spectroscopy. Journal of the American Oil Chemists' Society 1993, 70, (4), 335-341.
35. Li, H.; Van de Voort, F.; Sedman, J.; Ismail, A., Rapid determination of cis and trans content, iodine value, and saponification number of edible oils by Fourier transform near-infrared spectroscopy. Journal of the American Oil Chemists' Society 1999, 76, (4), 491-497.