Identification of Different Bile Species and Fermentation Times of Bile


Article

Identification of Different Bile Species and Fermentation Time of Bile Arisaema Based on An Intelligent Electronic Nose and Least Squares Support Vector Machine Chaoqun Tan, Dashuai Xie, Yujie Liu, Chun-Jie Wu, Chuanbiao Wen, Xiwei Huang, and Jinhong Guo Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.7b05189 • Publication Date (Web): 03 Feb 2018 Downloaded from http://pubs.acs.org on February 5, 2018


Analytical Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Analytical Chemistry

Identification of Different Bile Species and Fermentation Time of Bile Arisaema Based on An Intelligent Electronic Nose and Least Squares Support Vector Machine

Chaoqun Tan1,2, Dashuai Xie1, Yujie Liu1, Wei Peng1, Xinyi Li1, Li Ai1, Chunjie Wu1,*, Chuanbiao Wen2, Xiwei Huang3,* and Jinhong Guo2,4,5,*

1. College of Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, 611731, P. R. China.
2. Institute of Digital Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, 611731, P. R. China.
3. Ministry of Education Key Lab of RF Circuits and Systems, Hangzhou Dianzi University, Hangzhou 310018, China
4. School of Information and Communication Engineering, University of Electronic Science and Technology of China, 611731, P. R. China.
5. Institute of Medical Equipment, University of Electronic Science and Technology of China, 611731, P. R. China.

* Corresponding Authors
Contact: +86 15802894860 (J. Guo)
E-mail: [email protected] (J. Guo), [email protected] (X. Huang)


ACS Paragon Plus Environment


Abstract

Fermentation is one of the most traditional methods for processing the raw materials of traditional Chinese medicine (TCM). Bile Arisaema (BA) is produced by fermenting the roots of Arisaema heterophyllum with bile. Fermentation time and bile species are the key factors in producing BA. This study aimed to develop a new and rapid method for identifying the fermentation time and bile species of BA. The polysaccharide content (PC), protease activity (PA) and amylase activity (AC) of BA were determined; their changes were significant indicators of fermentation time. Based on the odor data of BA obtained with an electronic nose (E-nose), principal component analysis (PCA) was used to identify the bile species, and the results were further verified by a least squares support vector machine (LS-SVM). The trained LS-SVM was also used to predict the PC, PA and AC of the samples in order to identify fermentation time. The results indicate that the E-nose combined with LS-SVM can effectively predict the PC, PA and AC of the samples and identify the bile species and fermentation time of BA, making it a useful strategy for quality control of fermented TCM products.

Keywords: Bile Arisaema; Electronic nose; LS-SVM; Identification


1. Introduction

Fermentation is a traditional processing technology commonly used for herbal medicines to enhance efficacy, produce new bioactivities and alleviate toxicity.[1,2] Bile Arisaema (BA), the product of fermenting raw Arisaema heterophyllum (AH) with bile, has traditionally been used for clearing heat and reducing phlegm in traditional Chinese medicine (TCM) theory.[3] A growing body of research has demonstrated that BA has various pharmacological activities, especially analgesic and sedative effects.[4] Three bile species are commonly used to produce BA: bovine, sheep and pig bile. Fermentation time and bile species have been reported to be the most important factors in processing BA, and they directly affect its efficacy.[5] The changes in polysaccharide content (PC), protease activity (PA) and amylase activity (AC) of BA are significant indicators of fermentation time. So far, however, there is no convenient or rapid method for evaluating the best fermentation time of BA, and a method for identifying the bile species is lacking. Evaluation still relies mainly on human experience, which is easily influenced by subjective and external environmental factors and therefore lacks objectivity and authenticity.[6] In the existing quality evaluation system, thin layer chromatography and physicochemical identification [7-12] are the most commonly used identification methods. However, these methods are time-consuming and tedious, and they require specific references.[13] Thus, a novel method for identifying the bile species and fermentation time is urgently needed.

The electronic nose (E-nose) is a type of analytical testing equipment that simulates the human olfactory system; it examines the overall odor characteristics of a sample and analyzes them qualitatively or quantitatively.[14] Because it is rapid, inexpensive, non-destructive, efficient, precise, repeatable and consistent, the E-nose has been widely used to discriminate the origin and quality of herbs and processed products.[15-18] In this paper, the E-nose is used to measure the odor of BA, and the bile species and fermentation time of BA are then identified using machine learning techniques, namely principal component analysis (PCA) and the least squares support vector machine (LS-SVM).


Machine learning techniques have recently been widely used in biomedical data analytics, and the LS-SVM model has been applied in many other studies. Segovia used data extracted by the least squares method in a computer-aided diagnosis system for Alzheimer's disease (AD); a support vector machine was then used to classify the images for early diagnosis of the disease.[19] Yeganeh used the least squares method as a data selection tool, combined with a support vector machine, to predict the daily CO concentration and thereby forecast daily air pollution.[20] Yang et al. proposed using the least squares method for nonlinear feature selection and correlation analysis of financial indicators, combined with support vector machines for information fusion, to forecast bankruptcy from the relevant financial indicators.[21] An LS-SVM-based method using E-nose technology and the Particle Swarm Optimization (PSO) algorithm has also been proposed.[22] The idea of LS-SVM is to map multidimensional, nonlinear data into a high-dimensional space and obtain the corresponding regression formula there. This method avoids the problem of local optima.[23] It has been widely used in water quality evaluation, classification, short-term load prediction, natural gas consumption and other fields.[24-30] However, there have been no reports on modeling the fermentation of BA based on LS-SVM. Therefore, the present study was designed to investigate an effective and rapid strategy for discriminating the fermentation time and bile species of BA based on odor. This is an innovative combination of TCM and computer technology.

2. Materials and methods

2.1 Sample preparation

A total of 36 batches of BA samples were provided by Sichuan auxiliary Pharmaceutical Co., Ltd. (Ziyang, China), Sichuan Qianfang Traditional Chinese Medicine (Chengdu, China), and Sichuan Baisheng Pharmaceutical Co., Ltd. (Neijiang, China). Ethanol, sodium chloride solution and sodium hydroxide solution were obtained from the Chengdu Jinshan Chemical Reagent Co. (Chengdu, China); maltose was purchased from the Food and Drug Administration Research Institute of China (Chengdu, China); casein solution was purchased from the Chengdu Kelon Chemical Reagent Factory (Chengdu, China). In general, the experimental data were recorded from day 0 to day 9 at one-day intervals in a constant temperature and humidity incubator (HWS-150, Beijing Zhongxing Weiye Instrument Co., Ltd.). The volume of the fermentation broth was six times that of the BA.[31]

2.1.1 Polysaccharide content (PC)

Polysaccharide content (PC) was determined according to a previously described method.[32] For sample extraction, 50 mL of ethanol was added to 2 g of fine BA powder and kept at room temperature for 12 h. The mixture was then refluxed with deionized (DI) water for 1 h. After filtration, 100 mL of DI water was added to the residue, which was extracted for 2 h to obtain the final liquid. Absorbance was determined at 490 nm using a microplate reader.

2.1.2 Amylase activity (AC)

Amylase activity (AC) was determined using previously reported methods.[32] In the experiments, 1% NaCl solution (20 μL) was mixed with 1% soluble starch solution (50 μL) and incubated for 10 min in a 37 °C DI water bath; 10 μL of DI water was then added, the mixture was incubated for a further 15 min, and it was immediately mixed with 10 μL of sodium hydroxide. Finally, 10 μL of DNS (3,5-dinitrosalicylic acid) solution was added and the mixture was boiled for 5 min. The absorbance of the mixture was determined at 540 nm using a microplate reader.

2.1.3 Protease activity (PA)

Protease activity (PA) was determined according to a previously described method.[32] Briefly, the 0.4 μL/g diluted enzyme solution was added to 2.5 mL of 1% (w/v) casein solution in 50 mM phosphate buffer (pH 7.5). The mixture was incubated at 40 °C for 10 min, and then 2.5 mL of a 44 mM trichloroacetic acid solution was added. One milliliter of the filtrate was mixed with 1 mL of three-fold diluted Folin reagent. The solution was incubated at 40 °C for 20 min and the absorbance values


were recorded at 660 nm. One unit (U) of protease activity was defined as the amount of enzyme required to release 1 μg of tyrosine per minute under the above conditions.

2.2 Electronic nose assay

The E-nose (FOX-4000, Alpha M.O.S., France) is an analytical instrument that uses a gas sensor array to detect the overall odor characteristics of a sample and analyze the sample qualitatively or quantitatively. It consists of a sampling apparatus, a sensor array, air generator equipment, an HS-100 autosampler and pattern recognition software (Alpha M.O.S., Version 2012.45).[33-35] The sensor array comprised 18 metal oxide semiconductor (MOS) chemical sensors, namely T30/1, T40/2, T40/1, TA/2, and T70/2; P10/1, P10/2, P40/1, PA/2, P30/1, P40/2, and P30/2; and LY2/LG, LY2/G, LY2/AA, LY2/GH, LY2/gCTL, and LY2/gCT.[37, 38] Each sensor responds to different typical analytes, as shown in Table 1. To increase the detection range for oxidizing gases, organic compounds and hydrocarbons, six sensors (P30/2, T40/2, T40/1, TA/2, P30/1 and P40/2) were added to the traditional 12-sensor array.[36]

Table 1. Typical analytes detected by each E-nose sensor

Sensor      Analytes                                   Sensor      Analytes
LY2/LG      Oxidizing gas                              TA/2        Organic compounds
LY2/G       Ammonia/organic amines, carbon monoxide    LY2/gCTL    Hydrogen sulfide
LY2/AA      Ethanol                                    LY2/gCT     Propane/Butane
LY2/GH      Ammonia/organic amines                     T30/1       Organic solvents
P10/2       Methane                                    P10/1       Hydrocarbons, methane
P40/1       Fluorine                                   P40/2       Oxidizing gas
T70/2       Aromatic compounds                         P30/2       Ethanol
PA/2        Ethanol, ammonia/organic amines            T40/2       Oxidizing gas
P30/1       Hydrocarbons                               T40/1       Oxidizing gas

The data of the 18 sensors were measured specifically for the odor of BA. Samples were crushed, passed through a 50-mesh sieve (inside diameter 355 μm ± 13 μm), accurately weighed (1.0 g) and placed in 20 mL sealed headspace vials before loading into the autosampler tray. During the measurement, synthetic dry air was pumped into the sensor chambers at a constant rate of 150 mL/min via an air transformer connected to a syringe. Then, 1500 μL of headspace air was automatically injected into the E-nose by a syringe and flow-injected into the carrier gas flow. The injection rate was 1500 μL/s, and the incubation temperature was maintained at 50 °C. The incubation time was set to 1080 s, and the time between injections was set to 600 s. The repeatability of this method was investigated and found to be good, as shown in Table 2. The relative standard deviation (RSD) was defined as follows:

$$\mathrm{RSD} = \frac{\mathrm{STDEV}}{\mathrm{AVERAGE}} \times 100\%$$

where STDEV and AVERAGE are the standard deviation and the mean of the repeated measurements.

Table 2. Repeatability of the E-nose detection method (n = 6)

Sensor      RSD (%)    Sensor      RSD (%)
LY2/LG      1.92       TA/2        2.06
LY2/G       2.15       LY2/gCTL    1.77
LY2/AA      0.72       LY2/gCT     1.76
LY2/GH      1.9        T30/1       2.17
P10/2       1.97       P10/1       1.25
P40/1       1.92       P40/2       2.33
T70/2       1.3        P30/2       2.17
PA/2        2.42       T40/2       2.87
P30/1       0.43       T40/1       2.32
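The RSD definition used for the repeatability check can be illustrated with a short Python sketch. The function name `relative_standard_deviation` and the replicate values are hypothetical and are not measured data; the sketch also assumes that STDEV denotes the sample standard deviation (`ddof=1`) and takes the absolute value so that sensors with negative responses still give a positive RSD.

```python
import numpy as np

def relative_standard_deviation(replicates):
    """Return the RSD (%) per sensor for an array of shape (n_replicates, n_sensors)."""
    replicates = np.asarray(replicates, dtype=float)
    mean = replicates.mean(axis=0)
    # Assumption: STDEV is the sample standard deviation (ddof=1).
    std = replicates.std(axis=0, ddof=1)
    # Absolute value keeps the RSD positive for sensors with negative responses.
    return np.abs(std / mean) * 100.0

# Hypothetical peak responses: 6 repeated injections x 3 sensors (not measured data).
demo = np.array([
    [0.512, 0.208, -0.301],
    [0.505, 0.211, -0.297],
    [0.498, 0.205, -0.305],
    [0.509, 0.209, -0.299],
    [0.501, 0.207, -0.303],
    [0.507, 0.210, -0.300],
])
rsd = relative_standard_deviation(demo)
print(np.round(rsd, 2))
```

Applied to the six repeated injections of each sensor, a calculation of this kind would yield one RSD value per sensor, which is the quantity reported in Table 2.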

2.3 Analytical methods

2.3.1 Particle Swarm Optimization algorithm (PSO)

The E-nose data analysis method based on LS-SVM uses the Particle Swarm Optimization (PSO) algorithm. PSO is an evolutionary computing technique inspired by the foraging behavior of bird flocks, and it has proved to be a good optimization method. In this algorithm, each candidate solution of the optimization problem is a bird (particle) in the search space.[37] Every particle has its own velocity, position, and a fitness value determined by the objective function. Through repeated iterations, particles update their velocity and position by tracking the individual extremum and the global extremum. Let M denote a particle swarm in which the position of the ith particle is $X_{id}$ and its flight velocity is $V_{id}$; the best position found so far by that particle is $P_{id}$, and the best position found by the whole swarm is $G_{id}$. The velocity and position are updated as:

$$V_{id}^{k+1} = \omega V_{id}^{k} + c_1 r_1 \left(P_{id} - X_{id}^{k}\right) + c_2 r_2 \left(G_{id} - X_{id}^{k}\right)$$

$$X_{id}^{k+1} = X_{id}^{k} + V_{id}^{k+1} \tag{1}$$

where $k$ is the iteration index; $\omega$ is the inertia weight, which controls the impact of the previous velocity on the current velocity; $c_1$ and $c_2$ are the learning factors; and $r_1$ and $r_2$ are random numbers within [0,1].[38]

2.3.2 Principal Component Analysis (PCA) and Pearson Correlation Analysis (PCC)

Principal Component Analysis (PCA) is a commonly used data analysis method. It transforms the original data into a set of linearly uncorrelated variables through a linear transformation and can be used to extract the main features of high-dimensional data.[39] Pearson Correlation Analysis (PCC) was performed using SPSS software to assess the correlation between two variables.[40]

2.3.3 LS-SVM theory

LS-SVM is a common machine learning algorithm. Its basic idea is as follows. For a training sample set $\{(x_i, y_i)\}$, $x_i$ is the input vector, which represents the PC, PA, AC and odor data for different fermentation times and bile species, whereas $y_i$ is the corresponding output, which refers to the fermentation time and bile species. The input space is mapped to a high-dimensional feature space by a nonlinear mapping, and linear regression in that feature space realizes the regression in the original space. This establishes the regression function:

$$f(x) = w^{T}\varphi(x) + b \tag{2}$$

where $\varphi(x)$ is the nonlinear mapping function, and $w$ and $b$ are the weight vector and the bias, respectively, which are calculated from the training data set.[41, 42] Therefore, the model can be converted into the following optimization problem:


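$$\min\; J(w, \zeta) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{n} \zeta_i^{2}$$

$$\text{s.t.}\; \begin{cases} y_i - w^{T}\varphi(x_i) - b \le \theta + \zeta_i \\ -y_i + w^{T}\varphi(x_i) + b \le \theta + \zeta_i^{*} \\ \zeta_i \ge 0,\ \zeta_i^{*} \ge 0, \quad i = 1, 2, \dots, n \end{cases} \tag{3}$$

The cost function $J$ measures the regression error over all training data, $\zeta_i$ and $\zeta_i^{*}$ are the relaxation factors of the constraints, and $\gamma$ is the penalty factor that balances model complexity against training error. In LS-SVM the constraints are imposed as equalities on the error variables $e_i$, which take the place of the relaxation factors. The Lagrange multiplier method is introduced to solve the optimization problem for the training data points, which gives:

$$L(w, b, e, a) = J(w, e) - \sum_{i=1}^{n} a_i \left[ w^{T}\varphi(x_i) + b + e_i - y_i \right] \tag{4}$$

where $a_i$ are the Lagrange multipliers. Setting the corresponding partial derivatives to zero shows that the weights are a linear combination of the training samples:

$$\begin{cases} \dfrac{\partial L}{\partial w} = 0 \rightarrow w = \sum_{i=1}^{n} a_i \varphi(x_i) \\ \dfrac{\partial L}{\partial b} = 0 \rightarrow \sum_{i=1}^{n} a_i = 0 \\ \dfrac{\partial L}{\partial e_i} = 0 \rightarrow a_i = \gamma e_i, \quad i = 1, 2, \dots, n \\ \dfrac{\partial L}{\partial a_i} = 0 \rightarrow w^{T}\varphi(x_i) + b + e_i - y_i = 0 \end{cases} \tag{5}$$

Thus $w$ can be written as a linear combination of the Lagrange multipliers and the mapped training data. Eliminating $w$ and $e$ yields a set of linear equations for $a$ and $b$:

$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ a \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} \tag{6}$$

where $\Omega$ is the kernel matrix with entries $\Omega_{ij} = \varphi(x_i)^{T}\varphi(x_j) = K(x_i, x_j)$. The parameters $a$ and $b$ can be calculated through (6). Finally, the regression function of LS-SVM is:

$$y(x) = \sum_{i=1}^{n} a_i K(x, x_i) + b \tag{7}$$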

where $K(x, x_i)$ is the kernel function. In this paper, the radial basis function (RBF) is selected.

2.3.4 Establishment of the LS-SVM model

The odor data were obtained from the BA samples with the electronic nose, and the PA, PC and AC of the samples were obtained with a microplate reader. Of the odor data, 75% are used as the input to train the model. The LS-SVM algorithm is used to build the identification model, and the PSO algorithm is used to optimize the model parameters. The error generator represents the errors between the values calculated by the model and the actual values. The best kernel function parameter (G) and penalty parameter (C) are determined by repeatedly analyzing the training sets to reduce this error. The remaining 25% of the samples are used to test the model and predict the fermentation times and bile species. The prediction model based on LS-SVM is shown in Fig. 1.
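The training and prediction steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for E-nose readings, the function names (`rbf_kernel`, `lssvm_fit`, `lssvm_predict`) are hypothetical, the RBF kernel is assumed to be parameterized by a width `sigma`, and the values of `gamma` and `sigma` are fixed by hand rather than tuned by PSO. The solver follows the standard LS-SVM linear system of Eq. (6) and the regression function of Eq. (7).

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """K(x, z) = exp(-||x - z||^2 / (2 * sigma^2)) for all pairs of rows of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    """Solve the linear system of Eq. (6) for the bias b and the multipliers a."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                                   # first row:    [0, 1^T]
    A[1:, 0] = 1.0                                   # first column: [0, 1]^T
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]                 # b, a

def lssvm_predict(X_train, b, a, X_new, sigma):
    """Eq. (7): y(x) = sum_i a_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ a + b

# Synthetic stand-in for the odor data: 36 samples x 18 sensors (NOT the paper's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(36, 18))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

# 75% / 25% split as in the text: 27 training samples and 9 testing samples.
idx = rng.permutation(36)
train, test = idx[:27], idx[27:]

# gamma (penalty) and sigma (kernel width) are hand-picked here for illustration only.
b, a = lssvm_fit(X[train], y[train], gamma=10.0, sigma=3.0)
y_hat = lssvm_predict(X[train], b, a, X[test], sigma=3.0)
rmse = float(np.sqrt(np.mean((y_hat - y[test]) ** 2)))
print("test RMSE on synthetic data:", rmse)
```

Because training reduces to one linear solve, an optimizer such as PSO can call `lssvm_fit` repeatedly with different parameter pairs and keep the pair that gives the smallest error.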


Fig. 1. The scheme designed for prediction of the chemical attributes, bile and fermentation period of samples using LS-SVM.
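The parameter optimization step of this scheme relies on PSO. The sketch below shows the standard PSO velocity and position updates (inertia term, pull toward each particle's own best position, pull toward the swarm's best position) on a toy objective. Everything specific here is an assumption for illustration: the function names `pso_minimize` and `toy_error`, the swarm size, the number of iterations, and the values of the inertia weight `w` and learning factors `c1` and `c2`. In the setting of this paper, the objective would instead be the model error as a function of the penalty and kernel parameters.

```python
import numpy as np

def pso_minimize(objective, lower, upper, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` inside the box [lower, upper] with a basic particle swarm."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    x = rng.uniform(lower, upper, size=(n_particles, dim))   # positions X
    v = np.zeros_like(x)                                      # velocities V
    p_best = x.copy()                                         # individual best positions P
    p_val = np.array([objective(p) for p in x])
    g_best = p_best[p_val.argmin()].copy()                    # global best position G
    g_val = float(p_val.min())
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia + cognitive term + social term.
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        # Position update: old position plus new velocity, kept inside the bounds.
        x = np.clip(x + v, lower, upper)
        val = np.array([objective(p) for p in x])
        improved = val < p_val
        p_best[improved] = x[improved]
        p_val[improved] = val[improved]
        if p_val.min() < g_val:
            g_val = float(p_val.min())
            g_best = p_best[p_val.argmin()].copy()
    return g_best, g_val

# Toy stand-in for a model error surface, with its minimum at (2, -1).
def toy_error(params):
    return (params[0] - 2.0) ** 2 + (params[1] + 1.0) ** 2

best, best_val = pso_minimize(toy_error, lower=[-5.0, -5.0], upper=[5.0, 5.0])
print(best, best_val)
```

Clipping positions to the search box is a design choice of this sketch; other boundary-handling rules are equally common.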

3. Results and discussion

3.1 The responses of E-nose for BA

The typical E-nose responses to BA samples were first characterized, as shown in Fig. 2, which presents the response of the sensor array within 120 s. The sensors respond to the sample gas through redox reactions, and each sensor gives a different positive or negative response to the multiple gas components.[43, 44] The LY sensors (LY2/LG, LY2/G, LY2/AA, LY2/gCTL, and LY2/gCT), T sensors (T30/1 and T70/2) and P sensors (P10/1, P10/2, P40/1, PA/2, and P30/1) respond positively to these compounds, whereas the sensors T40/2, T40/1, TA/2, P40/2, and P30/2 respond negatively. The sensor resistance changed gradually, first increasing and then decreasing, and finally reached a steady state. All samples were measured three times, and the maximum response of each of the 18 sensors was recorded automatically.

Fig. 2 The typical sensor responses of E-nose.
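As a rough illustration of how response curves like these can be reduced to one feature vector per sample, the sketch below takes the peak of each sensor curve and averages it over replicate measurements. It is an assumption-laden sketch: in the study the maximum response was recorded by the instrument software, whereas here the peak is defined as the signed value of largest magnitude (so that negatively responding sensors keep their sign), the curves are synthetic, and the function names `peak_response` and `sample_feature_vector` are hypothetical.

```python
import numpy as np

def peak_response(curve):
    """curve: (n_timepoints, n_sensors). Return, per sensor, the signed value of largest magnitude."""
    curve = np.asarray(curve, dtype=float)
    idx = np.abs(curve).argmax(axis=0)
    return curve[idx, np.arange(curve.shape[1])]

def sample_feature_vector(replicate_curves):
    """Average the per-sensor peak responses over replicate measurements of one sample."""
    peaks = np.stack([peak_response(c) for c in replicate_curves])
    return peaks.mean(axis=0)

# Synthetic curves: 3 replicates, 120 time points, 2 sensors
# (one responding positively, one negatively); not measured data.
t = np.linspace(0.0, 120.0, 120)
shape = (t / 20.0) * np.exp(1.0 - t / 20.0)      # rises to about 1 near t = 20 s, then decays
replicates = [np.column_stack([s * shape, -0.5 * s * shape]) for s in (0.9, 1.0, 1.1)]
features = sample_feature_vector(replicates)
print(features)
```

With 18 sensors, the same procedure would give an 18-dimensional vector per sample, which is the kind of input used for PCA and LS-SVM.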


3.2 Identification of three bile species

3.2.1 Identification of bile species based on PCA

According to the E-nose parameters shown in Fig. 2, the samples are divided into three groups that represent pig bile, sheep bile and bovine bile, respectively. Fig. 3 shows a three-dimensional scores plot of the first three principal components (PC1 = 60.275%; PC2 = 37.825%; PC3 = 1.104%). As can be seen in Fig. 3, the three groups are clearly discriminated and well separated.
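The percentages quoted for PC1 to PC3 are explained-variance ratios. A minimal sketch of how scores and explained-variance ratios can be computed with a singular value decomposition is given below. The data are synthetic three-group stand-ins (not the measured E-nose data), the function name `pca_scores` is hypothetical, and the sketch assumes mean-centering without scaling, which may differ from the preprocessing used by the instrument software.

```python
import numpy as np

def pca_scores(X, n_components=3):
    """Return the PCA scores and explained-variance ratios of the leading components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                          # center each sensor column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)              # fraction of variance per component
    scores = U[:, :n_components] * S[:n_components]  # sample coordinates on the PCs
    return scores, explained[:n_components]

# Synthetic stand-in: 3 groups x 12 samples x 18 sensors (NOT the paper's data).
rng = np.random.default_rng(1)
centers = rng.normal(scale=5.0, size=(3, 18))
X = np.vstack([c + rng.normal(scale=0.5, size=(12, 18)) for c in centers])
labels = np.repeat(["pig", "sheep", "bovine"], 12)

scores, ratio = pca_scores(X)
print(np.round(100.0 * ratio, 3))                    # percent variance of PC1..PC3
```

Plotting the three columns of `scores` colored by `labels` would give a three-dimensional scores plot of the kind shown in Fig. 3.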

Fig. 3. PCA scores plots for discriminating three bile species.

3.2.2 Validation of bile species based on LS-SVM

LS-SVM was used to identify the bile species. For each group, 27 samples were used as the calibration group to train the model, and the remaining 9 unknown samples were input into the model as the testing group. The RMSE and R² values obtained by LS-SVM for the different bile species are listed in Table 3. The R² values of the training set, the testing set and all samples for pig bile, sheep bile and bovine bile were larger than 0.9, which indicates good agreement between the analytical and experimental results. These results are consistent with the identification based on PCA and indicate that LS-SVM can be used effectively to identify bile species.


Table 3. RMSE and R² values of different fermentation bile

               Training (n = 27)    Testing (n = 9)     All samples (n = 36)
Species        RMSE     R²          RMSE     R²         RMSE     R²
Pig bile       0.822    0.997       0.841    0.981      0.830    0.991
Sheep bile     0.831    0.990       0.850    0.989      0.851    0.989
Bovine bile    0.843    0.987       0.859    0.982      0.854    0.988
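For reference, the two metrics reported in Table 3 can be computed as in the sketch below. The paper does not state which definition of R² was used; this sketch assumes the coefficient of determination (one minus the ratio of residual to total sum of squares), and the function names and example values are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r_squared(y_true, y_pred):
    """Coefficient of determination (assumed definition of R²)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)

# Hypothetical measured and predicted values (not the paper's data).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(measured, predicted), r_squared(measured, predicted))
```

Evaluating both functions separately on the training samples, the testing samples and all samples would give the three column pairs of Table 3.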

3.3 Identification of different fermentation time by PC, PA, and AC

3.3.1 PC, PA, and AC in the processing period of BA

The PC, PA and AC of BA at different fermentation times are shown in Fig. 4. Polysaccharides are one type of active ingredient of BA, and protease and amylase are important enzymes that affect the ingredients and material conversion. The changes of PC, PA and AC are intrinsic indicators for evaluating the fermentation time and quality of BA.[45, 46] At the beginning of fermentation, the ambient temperature rose because of the growth of microorganisms, and the production of metabolites promoted enzyme catalysis. From day 0 to day 3, the PC values gradually decreased while the AC and PA values increased. As fermentation proceeded, the principal characteristics such as temperature, environment and pH changed, and the types of microorganisms changed correspondingly; the values of PC, PA and AC changed slowly from day 4 to day 6. In the final fermentation period, the principal characteristics remained steady, so the values of PC, PA and AC tended to be stable from day 7 to day 9. Overall, the AC and PA curves rise with increasing fermentation time, which contributes significantly to the odor. Therefore, the variation tendency of PC, PA and AC can reflect the fermentation time of BA.


Fig. 4. PC, PA and AC values over the fermentation days

3.3.2 PC, PA, and AC and their relationships with E-nose data

Each group was measured three times from day 0 to day 9. In total, 360 sets of data containing the values of the 18 sensors and 120 sets of data containing the values of AC, PC and PA were obtained for 24 batches. First, the original E-nose data were converted to mean values. Then, the values of PC, PA and AC from day 0 to day 9 were compiled. Finally, Pearson’s correlation test was used to investigate the correlation between the PC, PA and AC values and the electronic sensory values for the 24 sets of data; the results are shown in Table 4. One factor was extracted for the E-nose. The P values for PC, PA and AC were less than 0.05, which confirmed that the E-nose data correlated well with the PC, PA and AC values. The variation tendency of PC, PA and AC can therefore be used to reflect the different fermentation times.

Table 4 Pearson’s correlations between components and E-nose data
E-nose

PC

PA

AC

Coefficient

-0.921

Coefficient

0.699

Coefficient

0.825

P