Letter Cite This: Anal. Chem. XXXX, XXX, XXX−XXX

pubs.acs.org/ac

Deep Learning-Assisted Three-Dimensional Fluorescence Difference Spectroscopy for Identification and Semiquantification of Illicit Drugs in Biofluids

Li Ju,†,∥ Aihua Lyu,†,∥ Hongxia Hao,§ Wen Shen,*,† and Hua Cui*,†




†CAS Key Laboratory of Soft Matter Chemistry, iChEM (Collaborative Innovation Center of Chemistry for Energy Materials), Department of Chemistry, University of Science and Technology of China, Hefei, Anhui 230026, P. R. China
§Collaborative Innovation Center of Judicial Civilization and Key Laboratory of Evidence Science, China University of Political Science and Law, Beijing 100088, P. R. China

ABSTRACT: The fast identification and quantification of illicit drugs in biofluids are of great significance in clinical detection. However, existing drug detection strategies cannot fully meet clinical needs, and the on-site identification and quantification of various illicit drugs in biofluids remain a great challenge. Here, we report the development of a deep learning-assisted three-dimensional (3D) fluorescence difference spectroscopy for rapid identification and semiquantification of illicit drugs in biofluids. This strategy introduces highly fluorescent silver nanoclusters as signal sources into biofluids containing illicit drugs. The interaction between the silver nanoclusters and the drug molecules changes the fluorescence performance of the mixture. Deep learning methods were applied to extract the subtle fingerprint information from the 3D fluorescence difference spectra in order to identify and semiquantify various illicit drugs in biofluids, including codeine, 3,4-methylene-dioxy amphetamine, 3,4-methylene dioxy methamphetamine, meperidine, and methcathinone. This approach can achieve a high prediction accuracy rate of 88.07% and a broad detection range from 2 μg/mL to 100 mg/mL. It opens up a new way for the detection of small molecules with or without fluorescence in complicated matrixes.

Received: March 14, 2019; Accepted: June 11, 2019; Published: June 11, 2019
Analytical Chemistry, DOI: 10.1021/acs.analchem.9b01315
© XXXX American Chemical Society

Illicit drug dependence/addiction has been an increasingly grim problem around the world,1,2 due to its threat to healthcare, family relations, and social stability.3 Therefore, it is of vital importance to detect illicit drugs in a rapid, sensitive, and accurate way. Generally, drugs need to be detected in two situations. First, rapid scanning for solid drugs in public areas is needed in order to prevent drug trafficking and transport.4−6 Second, the identification of drug addiction in a suspected population by analyzing their biofluids is needed.1 Two general approaches that can avoid the interference of complicated biofluids, namely, separation-based and recognition element-based strategies, have been reported.7−13 However, these strategies are time-consuming, are limited by the presence of recognition elements, and can only be operated by professionals.13 In recent years, direct label/separation-free matrix analysis based on surface-enhanced Raman spectrometry (SERS) has attracted much attention for the determination of illicit drugs in biofluids due to its rapidity and sensitivity.14−16 However, most SERS-based strategies are mainly used for qualitative determination of illicit drugs in biofluids. Therefore, novel drug detection strategies that enable on-site screening to identify and quantify various illicit drugs in biofluids are still needed.

The emergence of artificial intelligence (AI) technologies provides new avenues for scientific research. As a subfield of AI, deep learning (DL) can reproduce known relations and even extract currently unknown relations between different physical quantities on the atomic scale if trained using enough data and a rule-discovery algorithm. Such techniques are particularly suitable for discovering complex relations in complex systems, which is often beyond the capacity of conventional procedures. In recent years, impressive progress has been made in the application of AI in chemical domains.17 Particularly in the field of analytical chemistry, an astonishing wave of developments has recently demonstrated the power of deep learning in image analysis for the risk stratification of cancer patients18 and in image transformation from bright-field input to predicted fluorescence micrograph output to facilitate multiplexed imaging.19,20 More recently, He’s group reported an AI-assisted SERS method to discriminate tumor suppressor genes in a label-free way, which represented a pioneering AI-based sensing application.21 Nevertheless, the powerful capability of deep learning is far from being well exploited, especially in improving identification and quantification of analytes in biofluids.

In this work, inspired by information-rich three-dimensional (3D) fluorescence (FL) spectroscopy, which yields a contour plot of excitation wavelength vs emission wavelength vs fluorescence intensity, we propose an efficient deep learning-assisted 3D FL difference spectroscopy approach (FL-DL) for the identification and quantification of drugs in biofluids. This strategy introduces highly fluorescent silver nanoclusters (AgNCs) as signal sources into the biofluids containing illicit drugs, and the drugs present in the matrixes might alter the signal output to different extents due to the interactions between AgNCs and drug molecules in sample matrixes. A distinct difference spectrum is obtained by subtracting the fluorescence excitation emission matrix (EEM) of pure AgNCs from that of the samples containing AgNCs. We demonstrate that, with the help of DL, a 3D fluorescence difference spectrum pattern can be used to identify the category of a drug and quantify its concentration without any separation or specific target recognition processes, even in the case where we do not know the exact mechanism of the interactions. To the best of our knowledge, this is the first deep learning study done for the identification and quantification of various illicit drugs in biofluids. The deep learning-assisted 3D fluorescence difference spectroscopy for detection of illicit drugs is schematically described in Figure 1.

Figure 1. Schematic illustration of deep learning-assisted 3D fluorescence difference spectroscopy for detection of illicit drugs.

The AgNCs with an average diameter of about 2 nm were prepared by reducing aqueous AgNO3 with glutathione (GSH) under vigorous stirring in an alkaline solution.22 The as-prepared AgNCs were characterized by several methods, including high-resolution transmission electron microscopy, UV−vis spectra, and fluorescence emission and excitation spectra, as described in Supporting Information 1. The highly fluorescent AgNCs acted as a strong signal source. 3D FL spectra of AgNCs were obtained as described in Supporting Information 1. The contour pattern of the AgNC EEM was altered after mixing with drug-containing biofluids. Figure 2 shows five 3D FL difference spectra acquired by mixing AgNCs with five kinds of illicit drugs, including codeine, 3,4-methylene-dioxy amphetamine (MDA), 3,4-methylene dioxy methamphetamine (MDMA), meperidine, and methcathinone. We found that every individual illicit drug could produce a distinct output signal in the difference spectrum, which might be related to the differences in the functional groups of the drug molecules. We used supervised deep learning to extract vital fingerprint information on target analytes from the difference spectrum for 5 kinds of illicit drugs by filtering out the interference information. We set the difference spectra as model inputs and the category and concentration of the illicit drugs as the model outputs. Surprisingly, strong correlations between the 3D difference spectrum pattern and the category and concentration of the illicit drugs were observed.

Figure 2. 3D fluorescence spectra of (a) Ag nanoclusters and 3D fluorescence difference spectra by subtracting the spectrum of the pure AgNCs from the spectra of a certain drug-interacted AgNC solution: (b) methcathinone, (c) MDA, (d) MDMA, (e) codeine, and (f) meperidine.

We first demonstrate the capability of this method to identify different drugs in human urine. The whole procedure of the model for the qualitative analysis of illicit drugs is described in detail in Figure S3. For each sample, its 3D FL difference spectrum was presented as an excitation emission matrix, which is hard for the model to process. Thus, data preprocessing must be done. Since there are about 1300 samples in the input data set, simply unfolding each EEM into a long sequence of numbers would transform the input data into a matrix with more than 8000 feature dimensions and only about 1300 samples, which invites the curse of dimensionality. When the number of feature dimensions greatly exceeds the number of samples in the input data, the model is prone to overfitting and may even fail to converge. Therefore, dimensional reduction must be done before modeling to make the model efficient and robust to overfitting.

After preprocessing, a generative adversarial network (GAN) classifier was used to correlate the categories of illicit drugs with the spectral signatures in the first identification step. The GAN was constructed from a two-layer generator and a three-layer discriminator as shown in Figure 3a. The left panel is the generator and the right panel is the discriminator. Detailed descriptions of the architecture of our GAN are provided in Supporting Information 3.5. As a newly developed type of neural network, the GAN contains both a discriminator and a generator and is trained in a semisupervised manner. The discriminator was able to learn the decision boundary between different classes of illicit drugs, while the generator could model the distribution of the features of individual classes, making full use of the information in the training data set at the same time. The generator’s training objective was to decrease the error rate of the discriminator (“fool” the discriminator by producing novel synthesized instances that appear to have come from the true data distribution), while the discriminator was trying to figure out if the spectra received are simulated or real and to identify the category of the sample. In this adversarial way, the discriminator could greatly improve its classification and generalization abilities due to the generation of more realistic 3D spectra by the trained generator, and finally, the enhanced discriminator was capable of identifying the category of the testing samples.

Figure 3. Identification performance of deep learning-assisted 3D fluorescence difference spectroscopy for illicit drugs in human urine. (a) Architecture of GAN: left panel, generator; right panel, discriminator; (b) comparison of prediction accuracies for the drug category by employing different models including NB, SVM, KNN, ANNC, CNN, and GAN on the training set and test set; (c) confusion matrix of the GAN prediction for the test set; (d) comparison of prediction accuracies for the drug category by using different models including NB, SVM, KNN, ANNC, CNN, and GAN on the external test set.

In order to obtain the best performance of the GAN model, we optimized the hyperparameters of the model, including dimensional reduction methods, batch sizes, and epoch numbers, with the cross validation method. The optimization information and results are provided in Supporting Information 3.4, 3.5, and 4 and Figures S4, S5, and S6. The optimal GAN model was used for the following evaluation. Furthermore, we found that the prediction accuracy of GAN was hardly affected by the increase in the species number of illicit drugs as shown in Figure S7, indicating the robust analytical performance of GAN as long as the training data set is adequate.

We compared our model with other commonly used supervised classifiers, including Naive Bayes (NB), Support Vector Machine (SVM), K-nearest neighbors (KNN), general Artificial Neural Network classifier (ANNC), and convolutional neural network (CNN). Among the 6 classifiers, ANNC, CNN, and GAN have the most flexible capability for nonlinear description, while the others are easier to train with less computational cost. Compared with ANNC, GAN is like an enhanced ANNC with a generator, which can increase its classification ability by using extra simulated data. Therefore, GAN should have better generalization capability than ANNC because of the additional generator, in particular for those adversarial cases. While all classifiers perform well on the training set, only GAN, CNN, and ANNC possess high accuracy (above 85%) on the test set (Figure 3b). The confusion matrix for the test set is summarized in Figure 3c. The calculation results demonstrate that GAN has both high true positive and true negative rates at 99.6% and 95.1%, respectively, if we label samples with drugs as positive and samples without drugs as negative. Moreover, in order to further prove that this model could be applied to analyze any other urine matrix samples, an external test set was introduced to further evaluate the real analytical performance by testing more urine matrix samples that were completely different from those used in the training set and test set. Accordingly, the 6 classifiers were also evaluated with the external test set (Figure 3d). GAN outperforms ANNC and CNN due to its better generalization capability. A high accuracy of 88.07% is achieved by using GAN with a standard deviation of 2.43%, while the other five methods all possess poor accuracy, lower than 78%. To summarize, GAN provides substantially greater confidence than the other five algorithms for drug identification.

The quantification of the illicit drugs in biofluids is challenging. After identifying the category of drugs, models that can quantitatively determine the concentrations of different drugs were developed. Without prior information about the interaction between AgNCs and illicit drugs, no strong assumption about the function used in the quantification model could be made. Among all the commonly used quantification models, the ANN regressor (ANNR) possesses flexible nonlinear modeling capability and does not need ad hoc assumptions about the features used in the regressor. Here, a three-layer ANNR with a quantitative output was used (Figure 4a). The detailed architecture of the model is illustrated in Supporting Information 3.5.
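To make the shape of such a regressor concrete, the following is a minimal sketch of a forward pass through two hidden layers followed by a single-node output, as the Figure 4a caption describes. It is not the authors' implementation: the ReLU activation, the layer widths, the hand-picked weights in `PARAMS`, and the choice of a log10 concentration as the output are all assumptions made for illustration, and the actual architecture is given in Supporting Information 3.5.

```python
def dense(inputs, weights, biases):
    """Fully connected layer: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise rectified linear unit (an assumed activation)."""
    return [max(0.0, v) for v in values]


def annr_predict(features, params):
    """Two hidden layers, then a single linear node for regression."""
    h1 = relu(dense(features, params["W1"], params["b1"]))
    h2 = relu(dense(h1, params["W2"], params["b2"]))
    (output,) = dense(h2, params["W3"], params["b3"])
    return output


# Hypothetical weights for a 3-feature input; these are not trained values.
PARAMS = {
    "W1": [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]],
    "b1": [0.0, 0.1],
    "W2": [[1.0, -1.0], [0.5, 0.5]],
    "b2": [0.0, 0.0],
    "W3": [[2.0, 1.0]],
    "b3": [-1.0],
}

# Hypothetical reduced feature vector for one difference spectrum.
log10_concentration = annr_predict([1.0, 2.0, 3.0], PARAMS)
concentration = 10 ** log10_concentration  # back to a linear scale
```

Because the last layer is a single linear node, the network returns one unbounded real number per sample, which suits a regression target spanning several orders of magnitude.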
We compared the predicted concentrations with the actual concentrations of the testing samples and analyzed the errors of our results (Figure 4b,c). The slopes of the predicted concentration fitted as a function of the actual concentration are close to 1, while the bias is around 0. The mean absolute errors (MAE) of the results are 0.16. The histogram of the error distribution shows that most prediction errors are within ±1. All the models have small p-values of less than 2.2 × 10^−16. Therefore, our model can produce convincing results with little bias and low deviation. We further validated our model on the external test set as shown in Figure 4e,f. On this external test set, most errors are still within ±1 and the average errors are also around 0, indicating that our test can determine the order of magnitude of the concentration of illicit drugs in human urine with a satisfactory accuracy of 92.6%. The detection range of our method is 2 μg/mL to 100 mg/mL, covering nearly 5 orders of magnitude. All the steps can be done within 3 min.

Finally, we compared our FL-DL method with the liquid chromatography−mass spectrometry (LC-MS) method for the determination of illicit drugs in the urine samples as shown in Table 1. The results demonstrate that the concentrations of illicit drugs measured by the FL-DL method are of the same order of magnitude as the concentrations obtained by LC-MS analysis. In real situations, FL-DL could be used to rapidly scan a large number of suspected urine samples; then, LC-MS could be further used to accurately qualify and quantify the illicit drugs in suspected urine samples, which can be used as forensic evidence. Therefore, this work provides a novel FL-DL method for the rapid scan of illicit drugs in suspected urine samples for forensic analysis.
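The error statistics discussed above can be sketched in a few lines. The example below assumes that the prediction errors are differences of log10 concentrations, which is consistent with Figure 4 plotting concentrations "in logarithm" but is not stated explicitly in the text. The function names and the sample values are hypothetical and are not data from the paper.

```python
import math


def log10_errors(actual, predicted):
    """Signed error of each prediction on the log10 concentration scale."""
    return [math.log10(p) - math.log10(a) for a, p in zip(actual, predicted)]


def summarize(errors, tolerance=1.0):
    """Mean absolute error, mean signed error, and share within +/- tolerance."""
    n = len(errors)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "bias": sum(errors) / n,
        "fraction_within": sum(1 for e in errors if abs(e) <= tolerance) / n,
    }


# Hypothetical concentrations in ug/mL, for illustration only.
actual = [2.0, 20.0, 200.0, 2000.0]
predicted = [4.0, 10.0, 200.0, 40000.0]
stats = summarize(log10_errors(actual, predicted))
```

Under this reading, an error within ±1 means the prediction falls within one order of magnitude of the true concentration, so `fraction_within` plays the role of the order-of-magnitude accuracy quoted for the external test set.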

Figure 4. Semiquantification performance of deep learning-assisted 3D fluorescence difference spectroscopy for illicit drugs in human urine. (a) Architecture of ANN: left two layers, normal neural network; right, a single-node layer for regression; (b) actual vs predicted concentration in logarithm for the test set and (c) its corresponding prediction error distribution; (d) four key indexes of semiquantification accuracy for codeine, MDA, MDMA, meperidine, and methcathinone on the test set; (e) actual vs predicted concentration in logarithm for the external set and (f) its corresponding prediction error distribution.

Table 1. Comparison of the FL-DL Method with the LC-MS Method for Determination of Illicit Drugs in the Urine Samples

illicit drug      LC-MS analysis: concentration (μg/mL)    FL-DL analysis: concentration (μg/mL)
codeine           3.12 ± 0.10                              2.95 ± 0.81
MDA               2.55 ± 0.05                              2.13 ± 1.15
MDMA              6.96 ± 0.11                              8.91 ± 4.47
meperidine        2.13 ± 0.08                              3.24 ± 1.39
methcathinone     8.36 ± 0.11                              11.22 ± 3.36
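As a quick check of the "same order of magnitude" statement, the mean values in Table 1 can be compared on a log10 scale. The sketch below uses the tabulated means only and ignores the reported uncertainties. The criterion that two values agree in order of magnitude when the absolute log10 ratio is below 1 is an assumption chosen for illustration, not a definition taken from the paper.

```python
import math

# Mean concentrations (ug/mL) transcribed from Table 1: (LC-MS, FL-DL).
TABLE_1 = {
    "codeine": (3.12, 2.95),
    "MDA": (2.55, 2.13),
    "MDMA": (6.96, 8.91),
    "meperidine": (2.13, 3.24),
    "methcathinone": (8.36, 11.22),
}


def log10_ratio(lc_ms, fl_dl):
    """Log10 of the FL-DL to LC-MS ratio; 0 means perfect agreement."""
    return math.log10(fl_dl / lc_ms)


ratios = {drug: log10_ratio(*values) for drug, values in TABLE_1.items()}

# Assumed criterion: same order of magnitude if |log10 ratio| < 1.
same_order = {drug: abs(r) < 1.0 for drug, r in ratios.items()}
largest_gap = max(ratios, key=lambda drug: abs(ratios[drug]))
```

With these values every drug satisfies the assumed criterion, and the largest relative deviation between the two methods is for meperidine.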


In summary, we have demonstrated that the proposed deep learning-assisted 3D FL difference spectroscopy approach is able to identify and semiquantify illicit drugs in biofluids. With AgNCs as signal sources, the interaction between AgNCs and drug molecules led to a change in fluorescence performance, and deep learning methods were applied to extract the subtle fingerprint information from the difference spectra to identify and semiquantify various illicit drugs in biofluids. This recognition element-free and separation-free strategy is rapid (3 min per sample), simple, low-cost, and accurate, with a high prediction accuracy rate of 88.07% and a broad detection range from 2 μg/mL to 100 mg/mL. The combination of DL with the information-rich 3D fluorescence difference spectrum derived from subtle interactions between fluorescent nanoclusters and small molecule targets opens up a new way for the detection of small molecules in complex matrixes. It avoids a major obstacle in fluorescence assays, i.e., spectral overlap, especially in complex matrixes. In addition, it can be used to identify and semiquantify nonfluorescent targets, which greatly expands the applicable scope of fluorescence spectroscopy. Moreover, though the interaction mechanism between AgNCs and target molecules is unknown, DL methods can nonetheless determine the unknown correlations between optical signals and the category and concentration of analytes and achieve satisfactory analytical performance. Thus, this work can shed light on the development of novel analytical methods for the detection of substances in complex environments.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.analchem.9b01315. Experimental methods, three-dimensional fluorescence difference spectra of different illicit drugs, computational methods, optimization of the computational algorithm, and effect of species number of illicit drugs on analytical performance of GAN (PDF)

AUTHOR INFORMATION

Corresponding Authors

*E-mail: [email protected].
*E-mail: [email protected].

ORCID

Hua Cui: 0000-0003-4769-9464

Author Contributions

∥L.J. and A.L. contributed equally to this work. L.J. designed the project, performed the calculations, and wrote the original draft. A.L. conducted most experimental works and discussed the results. H.H. discussed the trend in illicit drug detection and results. W.S. contributed to the design of the study and revision of the manuscript. H.C. directed the study and wrote the manuscript.

Notes

The authors declare no competing financial interest.

ACKNOWLEDGMENTS

The support of this research by the National Key Research and Development Program of China (Grant No. 2016YFA0201300), the National Natural Science Foundation of China (Grant Nos. 21475120, 21527807, and 81871523), and the Program for Young Innovative Research Team in China University of Political Science and Law (Grant No. 18CXTD09) is gratefully acknowledged. We also thank Dr. D. Chen for the discussion on the deep learning strategy and revision of the manuscript.

REFERENCES

(1) Mahmoudi, M.; Pakpour, S.; Perry, G. ACS Chem. Neurosci. 2018, 9 (10), 2288−2298.
(2) Lester, B. M.; LaGasse, L. L.; Seifer, R. Science 1998, 282 (5389), 633.
(3) Lu, H.; Miethe, T. D.; Liang, B. China’s drug practices and policies: regulating controlled substances in a global context; Routledge: 2016.
(4) Liu, K.; Shang, C.; Wang, Z.; Qi, Y.; Miao, R.; Liu, K.; Liu, T.; Fang, Y. Nat. Commun. 2018, 9 (1), 1695.
(5) Hernández, P. T.; Naik, A. J. T.; Newton, E. J.; Hailes, S. M. V.; Parkin, I. P. J. Mater. Chem. A 2014, 2 (23), 8952−8960.
(6) Guerra-Diaz, P.; Gura, S.; Almirall, J. R. Anal. Chem. 2010, 82 (7), 2826−2835.
(7) Weinmann, W.; Renz, M.; Vogt, S.; Pollak, S. Int. J. Legal Med. 2000, 113 (4), 229−235.
(8) Clauwaert, K. M.; Van Bocxlaer, J. F.; De Letter, E. A.; Van Calenbergh, S.; Lambert, W. E.; De Leenheer, A. P. Clin Chem. 2000, 46 (12), 1968−1977.
(9) Basilicata, P.; Pieri, M.; Settembre, V.; Galdiero, A.; Della Casa, E.; Acampora, A.; Miraglia, N. Anal. Chem. 2011, 83 (22), 8566−8574.
(10) Shlyahovsky, B.; Li, D.; Weizmann, Y.; Nowarski, R.; Kotler, M.; Willner, I. J. Am. Chem. Soc. 2007, 129 (13), 3814−3815.
(11) Gaillard, Y. P.; Cuquel, A. C.; Boucher, A.; Romeuf, L.; Bevalot, F.; Prevosto, J. M.; Menard, J. M. J. Forensic Sci. 2013, 58 (1), 263−269.
(12) Freeman, R.; Sharon, E.; Tel-Vered, R.; Willner, I. J. Am. Chem. Soc. 2009, 131 (14), 5028−5029.
(13) Baker, B. R.; Lai, R. Y.; Wood, M. S.; Doctor, E. H.; Heeger, A. J.; Plaxco, K. W. J. Am. Chem. Soc. 2006, 128 (10), 3138−3139.
(14) Dong, R.; Weng, S.; Yang, L.; Liu, J. Anal. Chem. 2015, 87 (5), 2937−2944.
(15) Siddhanta, S.; Wróbel, M. S.; Barman, I. Chem. Commun. 2016, 52 (58), 9016−9019.
(16) Weng, S.; Dong, R.; Zhu, Z.; Zhang, D.; Zhao, J.; Huang, L.; Liang, D. Spectrochim. Acta, Part A 2018, 189, 1−7.
(17) Butler, K. T.; Davies, D. W.; Cartwright, H.; Isayev, O.; Walsh, A. Nature 2018, 559 (7715), 547.
(18) Manak, M. S.; Varsanik, J. S.; Hogan, B. J.; Whitfield, M. J.; Su, W. R.; Joshi, N.; Steinke, N.; Min, A.; Berger, D.; Saphirstein, R. J.; Dixit, G.; Meyyappan, T.; Chu, H.-M.; Knopf, K. B.; Albala, D. M.; Sant, G. R.; Chander, A. C. Nat. Biomed Eng. 2018, 2 (10), 761−772.
(19) Strack, R. Nat. Methods 2019, 16 (1), 17.
(20) Ounkomol, C.; Seshamani, S.; Maleckar, M. M.; Collman, F.; Johnson, G. R. Nat. Methods 2018, 15 (11), 917−920.
(21) Shi, H.; Wang, H.; Meng, X.; Chen, R.; Zhang, Y.; Su, Y.; He, Y. Anal. Chem. 2018, 90 (24), 14216−14221.
(22) Zhao, K.; Shen, W.; Cui, H. J. Mater. Chem. C 2018, 6 (24), 6549−6555.