Support Vector Machine Classification of Nonmelanoma Skin Lesions


Article

Support Vector Machine Classification of Non-melanoma Skin Lesions based on Fluorescence Lifetime Imaging Microscopy Bingling Chen, Yuan Lu, Wenhui Pan, Jia Xiong, Zhigang Yang, Wei Yan, Liwei Liu, and Junle Qu Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.9b01866 • Publication Date (Web): 17 Jul 2019 Downloaded from pubs.acs.org on July 19, 2019


Analytical Chemistry

Support Vector Machine Classification of Non-melanoma Skin Lesions based on Fluorescence Lifetime Imaging Microscopy

Bingling Chen1, Yuan Lu2, Wenhui Pan1, Jia Xiong1, Zhigang Yang1,*, Wei Yan1, Liwei Liu1, and Junle Qu1,*

1 Key Laboratory of Optoelectronic Devices and Systems of Ministry of Education and Guangdong Province, College of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen 518060, China

2 Department of Dermatology, The Sixth People’s Hospital of Shenzhen, Guangdong 518052, China

Corresponding email: [email protected]; [email protected]

ABSTRACT

Early diagnosis of malignant skin lesions is critical for prompt treatment and clinical prognosis of skin cancers. However, it is difficult to precisely evaluate the developing stage of non-melanoma skin cancers because they derive from the same tissue, arising from uncontrolled growth of abnormal squamous keratinocytes in the epidermis layer of the skin. In the present study, we developed a linear-kernel support vector machine (LSVM) model to distinguish basal cell carcinoma (BCC) from actinic keratosis (AK) and Bowen’s disease (BD). The input parameters of the LSVM model consist of appropriate lifetime components and entropy values, which were extracted from two-photon fluorescence lifetime imaging of hematoxylin and eosin (H&E)-stained biopsy sections. Different features used as inputs for SVM training were compared and evaluated. In constructing the SVM models, features obtained from the lifetime of the second component (τ2) were found to be significantly more predictive than those from the average fluorescence lifetime (τm) in terms of diagnostic accuracy, sensitivity, and specificity. These findings were confirmed by the receiver operating characteristic (ROC) curves of the diagnostic models. Shannon entropy was added as an independent feature into the SVM models to further improve the diagnostic accuracy. Therefore, fluorescence lifetime analysis and entropy calculation can provide highly informative features for accurate detection of skin neoplasm disorders. In summary, fluorescence lifetime imaging microscopy (FLIM) combined with SVM classification exhibited great potential for developing effective computer-aided diagnostic criteria and accurate cancer detection in dermatology.
Keywords: FLIM; support vector machine; Shannon entropy; differential evolution

Early diagnosis of malignant skin lesions, including malignant melanoma (MM) and non-melanoma skin cancer (NMSC), is critical for prompt treatment and clinical prognosis of conditions linked to excessive sun exposure, especially in light-skinned individuals.1 Unlike MM, NMSC is further classified into sub-types, such as basal cell carcinoma (BCC), squamous cell carcinoma (SCC), its precursor actinic keratosis (AK), and Bowen’s disease (BD). Their developing stage is difficult to evaluate because they originate from the same uncontrolled growth of abnormal squamous keratinocytes in the epidermis layer of the skin.2 AK lesions are the most common premalignant form of SCC and are difficult to distinguish from BCC and BD. Neither AK nor BD is life-threatening at the beginning, but both are locally destructive and give rise to malignant invasion upon metastasizing into the dermis. Therefore, early and accurate diagnosis of such cancers and effective differential diagnosis between benign and malignant lesions are crucial for the delivery of suitable treatment and to ensure patient survival. Excisional biopsy is currently the routine method for diagnosing the stage and extent of skin cancer. However, the accuracy of this method is highly dependent on the experience of dermatologists in analyzing pathological sections,3 which could lead to misdiagnosis and further unnecessary biopsies because of poor specificity. Recently, various optical techniques, such as optical coherence tomography (OCT), Fourier transform infrared spectroscopy (FTIR), and Raman spectroscopy, have been developed to replace pathological biopsy.4,5 However, complex algorithms are required to analyze the multidimensional datasets,6 making these techniques challenging for both clinicians and computer-aided diagnostics.
Therefore, the development of a new optical microscopic imaging method is required to facilitate both traditional histopathology and computer-aided interpretation for automated classification. The emergence of automated techniques can significantly help clinicians develop reliable point-of-care diagnostics of cutaneous lesions and provide accurate diagnosis with high detection probability and low false-alarm probability. However, efficient extraction of distinguishable features at the cellular or tissue level is required. Color-based H&E staining can enhance the image contrast and reveal abundant structural information with specific functional implications. The use of fluorescence intensities and spectra from H&E staining has been previously explored for disease diagnosis.7 However, the non-specificity of protein staining and spectral overlap of H&E fluorophores can lead to inaccurate morphological and diagnostic analysis. Fluorescence lifetime is known to be sensitive to local microenvironment conditions and has important advantages compared to intensity-based methods. The feasibility of the lifetime method has been demonstrated by quantification of various analytes and conditions, e.g., polarity, oxygen abundance, viscosity, pH, and temperature.8 Fluorescence lifetime research usually involves molecules that are dispersed in the sample. It is well known that the properties of the sample affect the lifetime of the molecules,9 which can potentially be correlated with the state of cells and tissues during physiological processes.10 The fluorescence lifetime is closely related to the instability of the molecular transition, which is sensitive to a great variety of internal factors defined by the fluorophore structure and to external factors that include acid-base properties, hypoxia, and the presence of fluorescence quenchers.11 In cancer diagnosis, tumor cells secrete proteins that are very different from those of normal cells; these proteins interact with the dye and alter the fluorescence lifetime.
Therefore, fluorescence lifetime imaging microscopy (FLIM) can serve as a new tool for quantitative analyses of digital histopathological imaging and could be used to design efficient automated diagnostic tools.12,13 A phasor approach was further employed for fluorescence lifetime analysis to enhance pathological characteristics.14 In phasor analysis, however, the plotted coordinates reflect the overall nature of the fluorescence decay and are effectively governed by the average fluorescence lifetime (τm), which can reduce the accuracy of diagnosis or classification. In the present study, support vector machine (SVM)15-17 was used for automatic diagnosis of three subcategories of cutaneous neoplastic lesions (AK, BD, and BCC). The establishment of the SVM training model involved the extraction of fluorescence lifetime features and calculation of the information entropy. The fluorescence lifetime features were extracted by fitting the FLIM data from the epithelial cells (ECs) with a triple-exponential function at each pixel of the image. Then, the histogram of the second component lifetime (τ2) of a small region of interest (ROI) was fitted using a Gaussian distribution, and the fitted median (μ) and width were used as input features for the SVM model. Shannon’s entropy was incorporated into the SVM model as an independent feature to indicate the extent of differentiation of melanocytes. The clinically fatal tumor variants (e.g., morpheaform BCC and spindle cell SCC) were expected to have lower entropy values; by contrast, normal tissues were expected to have higher entropy values because of highly structured and more informative features.18 The present study aimed to explore the application of SVM combined with FLIM in computer-aided automated cutaneous carcinoma diagnosis based on images derived from H&E staining of biopsy sections.
The SVM classifier was found to be useful in distinguishing between benign lesions (AK and BD) and malignant lesions (BCC), and between the AK and BD subcategories within the benign group. The parameters for all SVM models were optimized by leave-one-out five-fold cross-validation on the training datasets. The predictive accuracies of the models were evaluated using the testing datasets.

EXPERIMENT SECTION

Sample Preparation

Fresh human skin specimens were obtained from nine patients undergoing skin biopsies as part of a routine diagnostic procedure in the Department of Dermatology at the Sixth People’s Hospital of Shenzhen. The samples were placed in a standard pathologic transport container, covered with ice, and then sent to the Department of Pathology. Three consecutive sections were cut from each paraffin block using a cryostat microtome following standard histology procedures. A total of 27 H&E-stained skin tissue sections were examined by a senior pathologist, and the corresponding types of cutaneous lesions were identified as AK (two patients), BD (two patients), and BCC (two patients). The present study was performed according to a protocol approved by the research ethics committee of the Shenzhen Sixth People’s Hospital. All patients provided informed consent for the use of their tissues for medical research.

Hematoxylin and Eosin (H&E) Staining

Hematoxylin-eosin (H&E) staining is the most commonly used method in histopathology. The method uses a combination of hematoxylin and eosin for microscopic inspection of nuclear and cytoplasmic inclusions in clinical specimens. Hematoxylin itself is not a dye because the molecule possesses no chromophore (leuco form, shown in Supplementary Figure S1 a1). The oxidation product of hematoxylin is haematin, which acts as the active ingredient in the staining solution. Haematin is blue and less soluble under aqueous alkaline conditions, and turns red and more soluble in acidic alcoholic solution. Haematin readily binds to nuclear chromatin when incubated with tissues or cells. Undesirable coloration is removed by treating the tissues with acid alcohol, and the cell nuclei stain light blue on return to an alkaline environment. Whole-cell coloration is accomplished by counterstaining with the eosin mixture, which stains the cytoplasm pink. The structure of eosin is shown in Supplementary Figure S1 b1. The specific steps of the staining procedure are given in the H&E Staining Protocol section of the Supplementary Information.

Two-photon Fluorescence Lifetime Imaging and Analysis

The identified H&E-stained sections were then sent to the optical laboratory at Shenzhen University for fluorescence lifetime imaging using a homemade two-photon FLIM system, described in detail in the Supplementary Materials. The minimum time channel width of the TCSPC module was 813 fs, and the response time of the whole system was < 9 ps. Lifetime calculations and fitting were performed using the SPCImage software (Becker & Hickl GmbH, Germany). A triple-exponential model was used to perform least-squares fitting of the fluorescence decay histogram at each pixel of a 256×256 image to calculate the fluorescence lifetime distribution of all H&E-stained tissue sections. A pseudocolored lifetime image was generated for each image by assigning a color to the average lifetime (τm) at each pixel.
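The per-pixel decay model can be illustrated with a short sketch. It only evaluates a triple-exponential decay and an average lifetime; it does not reproduce the least-squares fitting performed in SPCImage. The amplitude-weighted definition of τm used here is an assumption (it is a common convention in TCSPC-FLIM analysis, but the text does not state which definition was applied), and the amplitudes and lifetimes are hypothetical values chosen for illustration.

```python
import math


def triexp_decay(t_ps, amplitudes, lifetimes_ps):
    """Evaluate I(t) = sum_i a_i * exp(-t / tau_i) for the three components."""
    return sum(a * math.exp(-t_ps / tau) for a, tau in zip(amplitudes, lifetimes_ps))


def mean_lifetime(amplitudes, lifetimes_ps):
    """Amplitude-weighted average lifetime (assumed definition of tau_m)."""
    return sum(a * tau for a, tau in zip(amplitudes, lifetimes_ps)) / sum(amplitudes)


# Hypothetical pixel: illustrative values only, not data from the paper.
amps = [0.5, 0.3, 0.2]        # a1, a2, a3 (relative amplitudes)
taus = [40.0, 200.0, 900.0]   # tau1, tau2, tau3 in picoseconds

tau_m = mean_lifetime(amps, taus)          # 0.5*40 + 0.3*200 + 0.2*900 = 260 ps
intensity_at_zero = triexp_decay(0.0, amps, taus)
print(f"tau_m = {tau_m:.1f} ps, I(0) = {intensity_at_zero:.2f}")
```

In this toy pixel the long component contributes most of τm even though it has the smallest amplitude, which shows how an averaged lifetime can mask the behavior of the medial component.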
The phasor plot method19,20 was used to segment the image and calculate the lifetime distribution of the regions of interest (ROIs) corresponding to stratum corneum (SC), epithelial cells (ECs), and dermal connective tissue (CT) for the lesion types actinic keratosis (AK), Bowen’s disease (BD), and basal cell carcinoma (BCC), as shown in Supplementary Figures S3-S5, respectively. The fluorescence lifetimes of the segmented ROIs for SC, ECs, and CT were extracted for statistical analysis.21

Lifetime Feature Extraction and Entropy Calculation

After the ROIs were extracted in the segmentation procedure, meaningful features were extracted from the epithelial cell (EC) layer. For each segmented image, 15 small windows (28×28 pixels) were randomly selected from the EC layer for lifetime feature extraction and entropy calculation. The lifetime feature extraction uses the triple-exponential model to exclude the effects of the long- and short-lifetime components while retaining the medial lifetime component τ2 at each pixel. Then, the lifetime histogram of τ2 for each 28×28-pixel patch was fitted by a Gaussian distribution, and the median (μ) and width of lifetime τ2 were determined. The entropy value was calculated from the lifetime histogram of τ2 following Shannon’s definition, as described in detail in the Supplementary Materials.

Support Vector Machine

Support vector machine (SVM) is a supervised learning model and is considered to have superior performance over traditional linear approaches because it can perform binary classification with a nonlinear boundary by using the kernel trick to map the inputs into higher-dimensional feature spaces.22 The SVM training algorithm builds a model that maps the inputs as scattered points in the feature space and classifies examples by finding the largest gap between the distinct categories. New examples are then mapped to the same space, and the classification is predicted based on the side of the gap on which they fall. These intrinsic properties make SVM an excellent non-probabilistic binary classifier, and it has been successfully applied in various fields. Compared to other multivariate statistical methods, SVM has several advantages. First, it is a powerful way to classify small datasets. Second, the performance of SVMs can be controlled by the regularization parameter, which can be used to avoid overfitting. Third, SVMs can deal with complex class boundaries using the kernel trick, so that users can incorporate expert knowledge by engineering the kernel.
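The three patch-level features described in this section (center, width, and Shannon entropy of the τ2 histogram) can be sketched as follows. This is a simplified illustration under stated assumptions: the sample mean and standard deviation stand in for the Gaussian-fit parameters, the 10 ps bin width is arbitrary, and the entropy is computed in bits; the paper's exact fitting routine, width definition, and binning are given in its Supplementary Materials and are not reproduced here. The function name `patch_features` and the synthetic patch are hypothetical.

```python
import math
import random
from collections import Counter


def patch_features(tau2_ps, bin_width_ps=10.0):
    """Return (mu, width, entropy) for the tau2 values of one image patch."""
    n = len(tau2_ps)
    mu = sum(tau2_ps) / n
    # Sample standard deviation, used here as a stand-in for the fitted Gaussian width.
    width = math.sqrt(sum((x - mu) ** 2 for x in tau2_ps) / (n - 1))
    # Histogram the lifetimes, then apply H = -sum(p * log2(p)) over non-empty bins.
    counts = Counter(int(x // bin_width_ps) for x in tau2_ps)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return mu, width, entropy


# Synthetic 28x28 patch with tau2 drawn around 200 ps (illustrative only).
rng = random.Random(0)
patch = [rng.gauss(200.0, 30.0) for _ in range(28 * 28)]
mu, width, entropy = patch_features(patch)
print(f"mu = {mu:.1f} ps, width = {width:.1f} ps, entropy = {entropy:.2f} bits")
```

A patch whose lifetimes fall into a single bin has zero entropy, while a patch split evenly between two bins has exactly one bit, so the entropy feature grows with the spread and irregularity of the τ2 histogram.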

RESULTS AND DISCUSSION

Bright-field images of H&E-stained biopsy sections of actinic keratosis (AK), Bowen’s disease (BD), and basal cell carcinoma (BCC) were first captured using a standard digital camera (Leica), as shown in Figure 1a. Then, fluorescence lifetime images were captured from the same stained area using a homemade two-photon excited FLIM system. H&E staining was ineffective at quenching fluorescence in formalin-fixed and paraffin-embedded sections.23 Hematoxylin is non-fluorescent, whereas eosin (λabs = 527 nm, λem = 550 nm in ethanol) exhibits intense fluorescence under two-photon excitation (785 nm).21,24 Thus, the fluorescence emissions of H&E-stained sections were primarily attributed to eosin under two-photon excitation.25 Figure 1 shows the fluorescence intensity images (b and d) and the corresponding color-encoded FLIM images (c and e) of H&E-stained sections for the three different types of skin neoplastic lesions. The pseudocolors of the FLIM images were encoded using the average fluorescence lifetime τm. In general, the detected fluorescence signals were attributed to H&E dyes and the autofluorescence (AF) of bioactive substances, such as elastin (420-460 nm), collagen (370-440 nm), NADH (450-500 nm), and keratin (450-550 nm).26 In terms of excitation efficiency and detection sensitivity, fluorescence detection of H&E-stained sections with single- or two-photon excitation was found to be more informative than endogenous autofluorescence and second harmonic generation (SHG).27,28 In the fluorescence intensity images shown in Figure 1 (b and d), we observed many hollow regions with very weak fluorescence in the nuclei of epithelial cells (ECs), which was attributed to the strong absorption of melanin synthesized by melanocytes. In addition, relatively bright fluorescence signals were observed in the stratum corneum (SC), cytoplasm, and dermal connective tissue (CT). Intensity-based images were insufficient to evaluate morphometric malignancy in cases of neoplastic growth, given that the nonspecific staining of H&E renders fluorescence signals from different tissue components indistinguishable.
Fluorescence lifetime is generally not affected by factors such as excitation intensity, fluorophore concentration, and photobleaching, but depends on the microenvironment surrounding the fluorophores.8,29 Therefore, fluorescence lifetime can provide quantitative in situ biochemical information and reveal cellular changes in the microstructure caused by malignant transformation.9,11 Pseudocolor-encoded fluorescence lifetime imaging (Figure 1 c and e) revealed that the stromal region is fairly distinct from the cell-rich epithelium. The dissimilarity in the orientation of dermal CT can be clearly detected in the pseudocolor-encoded lifetime images. The CT structures were observed as long and straight linear fibers in AK; however, these fine structures were absent in BD and BCC. The epithelial cells in all lesion types showed high variability in karyotype and a multilayered structure with signs of cellular proliferation. Given the high sensitivity and specificity of fluorescence lifetime imaging, clinicians can use it to derive efficient and reliable diagnoses by analyzing the features and patterns in the FLIM images. Compared to fluorescence intensity-based imaging, fluorescence lifetime imaging can provide extra contrast in tissue staining. As shown in Figure 2, panels a1-a4 (red) display the SC as a thin layer of the upper cortex. The green panels b1-b4 show the ECs from the epithelium. The blue panels c1-c4 represent the CT layer in the dermis. The merged channels combine the above three channels and are shown in panels d1-d4. Fluorescence lifetime analysis of the three layers in each lesion type shows that the lifetime distributions of all layers are clearly distinguishable among the three lesion types. All layers in the BD samples showed longer τm values, with ranges of 100-300 ps in SC, 50-250 ps in ECs, and 200-450 ps in CT.
On the other hand, the histograms of the three layers showed a medial τm range of 80-250 ps in AK and a shorter range of 30-180 ps in BCC. Consistent with the merged images in Figure 2 d1-d3, the τm histogram in Figure 2 d4 has two peaks for each of the AK, BD, and BCC samples. The peaks with shorter τm are derived from cell-rich epithelia, while the longer peaks are derived from H&E-stained CT. The long lifetime of autofluorescence (typically ~1.2-3.0 ns)11 therefore did not markedly alter the lifetime distribution of H&E-stained samples in the range of 0-450 ps. In addition, the τm histograms of the three lesion types showed highly distinct distributions, although small overlaps were observed between the τm histogram of BCC and those of AK and BD. The separated τm histograms produce clear visual contrast in the pseudocolor lifetime images and provide additional features for histopathological investigation. The Welch t-test revealed statistically significant differences in the mean value of the τm histograms (τmμ) among the AK, BD, and BCC samples (data shown in our previous work).21 The τmμ values of the three types of lesions are distinct and can serve as ideal features for identifying skin neoplastic lesions. The differences in τmμ values in the intercellular and interstitial microenvironments of the AK, BD, and BCC lesions can be easily distinguished in the corresponding τm pseudocolor images (Figure 1 c and e), demonstrating that fluorescence lifetime imaging of H&E-stained samples is sensitive to the cutaneous microenvironment surrounding the dye molecules. Therefore, fluorescence lifetime imaging of H&E staining is a promising method for early diagnosis of skin cancer. The results of the above analyses are consistent with our previous findings.
Triple-exponential fitting was used herein to replace the biexponential fitting of our previous work.21 The phasor plot is a fit-free method for numerical analysis of FLIM data and can potentially be used to identify the three lesion types (AK, BD, and BCC). Even pixels in close proximity that are indistinguishable in the fluorescence lifetime image can be well separated based on their clustering distribution in the phasor plot (Supplementary Figures S3-S5). Finally, the phasor points in the divergent clustering distributions can be utilized to distinguish BCC from AK and BD.21 The identification of different skin neoplastic lesions usually relies on histochemical staining and on the expertise of the pathologist, and is therefore not conducive to automated screening of early cancer. So far, the signatures of phasor clustering, including center-of-mass position, shape, and angle of the phasor distribution, have been employed to quantitatively distinguish diseased and healthy tissues.30 However, tissue samples are complex and structurally heterogeneous and produce multi-component fluorescence decays. In such a complex system, individual lifetime components are difficult to resolve using the phasor plot. The coordinates (g and s) in the phasor plot are equivalent to the average fluorescence lifetime (τm) in fitting analysis and can hardly reflect the true fluorescence lifetime components of the complex system. Furthermore, optimizing the fit for all pixels in the FLIM images using a biexponential model is difficult because of signal contamination from SHG and autofluorescence.26-28 The fluorescence lifetimes of H&E-stained samples are mainly distributed in the range of 50-450 ps, whereas the lifetime of autofluorescence is typically distributed in the range of 1.2-3 ns.11 When using triple-exponential fitting, a sharp descent at the beginning of the decay profile was observed, revealing an ultra-fast decay contributed by dye aggregation or SHG, which cannot be captured by the biexponential fitting method.21 SHG derived from noncentrosymmetric collagen is inevitable during two-photon excitation, considering that collagen is the major structural protein that maintains the mechanical stability of human skin.31,32 Equipped with the bh-GmbH SPC150 modules, our two-photon TCSPC-FLIM system delivers an instrument response function (IRF) of 6.7-6.9 ps FWHM (full width at half maximum).33 Thus, a triple-exponential model was employed to precisely measure the lifetime of eosin in the
microenvironment of tissue sections and consequently to better fit all FLIM data. Gaussian fitting of the lifetime histograms in Supplementary Figures S3-S5 showed the lifetime distributions for the short component τ1 (0-80 ps, width ~30 ps), the medial component τ2 (100-320 ps, width ~90 ps), and the long component τ3 (350-1500 ps, width ~400 ps), which are expected to correspond to the SHG signal, eosin, and endogenous autofluorescence, respectively. Therefore, the τ2 of the cell-rich epithelium layer (ECs) was selected for feature extraction in the subsequent SVM classification analysis.

After selecting a set of feature parameters, an effective diagnostic SVM classifier was constructed to evaluate the neoplastic lesions from AK, BD, and BCC. The diagnostic process can be divided into two steps, namely, distinguishing malignant BCC from the benign lesions AK and BD using a conventional SVM, and then detecting the difference between AK and BD. In detail, a total of 270 instances (90 for each type) from the three different types of lesions (AK, BD, and BCC) were randomly split into training (135 instances) and testing sets (135 instances). In the training stage, the SVM model was built by the leave-one-out five-fold cross-validation method to determine the optimal hyperplane; a linear kernel function was then used to clearly separate the classes. For the linear-kernel support vector machine (LSVM) algorithm, the penalty parameter C should be optimized to achieve the best trade-off between the training error and generalization ability, and values of log2 C in the range from -25 to 0 were evaluated (Figure 3). Cross-validation accuracy improved with higher C values and reached its maximum at a certain position for the different models.
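The search over the penalty parameter C can be sketched as a small grid search with cross-validation. The block below is a minimal pure-Python illustration, not the authors' implementation: it trains a soft-margin linear SVM by subgradient descent on 0.5·||w||² + C·Σ hinge loss, assumes standardized two-dimensional features, uses plain five-fold cross-validation (the text's "leave-one-out five-fold" scheme is not specified further), samples log2 C from -25 to 0 in steps of 5, and runs on synthetic two-class data. All function names and data are hypothetical.

```python
import random


def train_linear_svm(X, y, C, epochs=200, lr=0.01):
    """Subgradient descent on 0.5*||w||^2 + C * sum_i max(0, 1 - y_i*(w.x_i + b))."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0          # gradient of the 0.5*||w||^2 term
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:                   # hinge loss is active for this sample
                grad_w = [g - C * yi * xj for g, xj in zip(grad_w, xi)]
                grad_b -= C * yi
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Class label (+1 or -1) from the side of the hyperplane on which x falls."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def cv_accuracy(X, y, C, k=5, seed=0):
    """Mean held-out accuracy over k folds for a given C."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        w, b = train_linear_svm([X[i] for i in train], [y[i] for i in train], C)
        correct += sum(predict(w, b, X[i]) == y[i] for i in fold)
    return correct / len(X)


# Synthetic, standardized (mu, width) features for two classes (illustrative only).
rng = random.Random(1)
X, y = [], []
for label, center in ((1, 1.0), (-1, -1.0)):
    for _ in range(60):
        X.append([rng.gauss(center, 0.3), rng.gauss(center, 0.3)])
        y.append(label)

grid = [2.0 ** e for e in range(-25, 1, 5)]      # C = 2^-25, 2^-20, ..., 2^0
scores = {C: cv_accuracy(X, y, C) for C in grid}
best_C = max(grid, key=lambda C: scores[C])      # ties resolve to the smallest C
best_acc = scores[best_C]
print(f"best C = {best_C:g}, cross-validation accuracy = {best_acc:.3f}")
```

On well-separated synthetic clusters most values of C give similar accuracy; with overlapping classes, as for the τm features, the choice of C would matter more.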
Finally, the receiver operating characteristic (ROC) curves and the area under the curve (AUC) values of the well-trained models in predicting the labeled testing datasets were used to evaluate the performance of the SVM classifiers. The LSVM models were compared based on the median (μ) and the width of τm or τ2, respectively (Figure 3 a1-a2). In the scatter distribution in the 2D feature space of μ-width, data points of τm from AK, BD, and BCC cannot be fully distinguished (Figure 3 a1), while data points of τ2 corresponding to the three clusters were easily separated (Figure 3 a2). Using the τm feature, the LSVM model (C = 2⁻¹¹) effectively distinguished BCC from the precancerous lesions (AK and BD), with a prediction accuracy of 90.4% (122/135), sensitivity of 84.3%, and specificity of 97.6%; however, this model failed to distinguish between the AK and BD subcategories, giving a prediction accuracy of 66.7% (57/90), sensitivity of 63.6%, and specificity of 71.4%; in addition, tuning the penalty parameter C did not effectively increase the performance (Figure 3 b1). The AUC values of these two models were calculated to be 0.9316 and 0.7294, respectively (Figure 3 c1). By contrast, the LSVM model using the lifetime component τ2 as the training feature achieved better classification performance. In Figure 3 a2, the scatter point distributions of AK, BD, and BCC can be clearly separated with prediction accuracy [BCC vs. (AK & BD) 95.6% (129/135) or AK vs. BD 91.1% (84/90)], sensitivity [BCC vs. (AK & BD) 89.8% or AK vs. BD 89.4%], and specificity [BCC vs. (AK & BD) 98.8% or AK vs. BD 93.0%], by a narrower margin and fewer support vectors at C = 2⁻¹⁴. The AUC values of these two models were calculated to be 0.9904 and 0.9482, respectively (Figure 3 c2).
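For reference, the evaluation metrics quoted above can be computed from predicted labels and decision scores as in the following sketch. Accuracy, sensitivity, and specificity follow their standard confusion-matrix definitions, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half). The labels, scores, and threshold are made-up illustrative values, not results from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true-positive rate), and specificity (true-negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = malignant, 0 = benign; scores are hypothetical SVM decision values.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.45 else 0 for s in scores]

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, f"AUC = {auc:.4f}")
```

Because the AUC depends only on the ranking of the scores, it summarizes classifier performance across all decision thresholds, whereas accuracy, sensitivity, and specificity describe a single threshold.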


To further improve the predictive accuracy of the SVM model, the information entropy was incorporated as an independent feature to build the 3D feature space of μ-width-entropy derived from lifetime τ2 (Figure 3 a3). The projection of the scatter point distribution on the μ-width plane indicated a degree of crosstalk among AK, BD, and BCC, but the projections on the entropy-μ and entropy-width planes can be completely separated. The resulting models showed prediction accuracy [BCC vs. (AK & BD) 99.3% (134/135) or AK vs. BD 96.7% (87/90)], sensitivity [BCC vs. (AK & BD) 100% or AK vs. BD 97.7%], and specificity [BCC vs. (AK & BD) 98.9% or AK vs. BD 95.7%] at C = 2⁻¹⁴. The corresponding AUC values of the two models were calculated to be 0.9988 and 0.9941, respectively (Figure 3 c3).

CONCLUSION

In conclusion, fluorescence lifetime imaging of H&E-stained biopsy samples can significantly enhance the image contrast for the detection of microstructural changes in premalignant lesions or cutaneous cancerous lesions. An SVM model was successfully implemented for the classification of skin neoplastic lesions from AK, BD, and BCC patients based on FLIM imaging of H&E-stained sections. The classifier constructed using the features extracted from the medial lifetime component τ2 was found to be superior to that constructed from the mean lifetime τm. In addition, Shannon entropy was introduced into the SVM model training as an independent feature and was found to markedly improve the classification accuracy. The diagnostic performances were comprehensively evaluated for lifetime feature extraction. The optimal features were determined to be the median, width, and entropy of lifetime τ2, which allowed a linear kernel function to be used, simplifying calculations without sacrificing performance. Therefore, FLIM images of H&E-stained tissue sections combined with SVM exhibit great potential for effective and accurate diagnosis of NMSC.

Supporting Information

Details of the H&E staining protocol; details of the chemical structure and optical properties of hematoxylin and eosin, including absorption and emission spectra, quantum yield determination, and two-photon absorption cross-section calculation; details of Shannon’s definition of information entropy; details of the support vector machine principle and its accuracy verification; details of image segmentation and lifetime statistics based on phasor-FLIM analysis.

ACKNOWLEDGEMENTS

This work has been partially supported by the National Natural Science Foundation of China (61525503 / 61875131 / 61620106016 / 61835009 / 81727804); the (Key) Project of Department of Education of Guangdong Province (2015KGJHZ002 / 2016KCXTD007); the Guangdong Natural Science Foundation Innovation Team (2014A030312008); and the Shenzhen Basic Research Project (JCYJ20170818100931714 / JCYJ20150930104948169 / JCYJ20160328144746940 / JCYJ20170412105003520).



FIGURE LEGENDS

Figure 1. Fluorescence lifetime imaging of routine H&E-stained sections. From top to bottom, panels a1-a3 show bright-field histological slides of actinic keratosis (AK), Bowen’s disease (BD), and basal cell carcinoma (BCC) samples stained with H&E. Images were captured at 40× magnification using a Leica TCS SP2 fluorescence microscope equipped with a digital camera. Panels b1-b3 show fluorescence intensity images of H&E-stained sections upon two-photon excitation. Panels c1-c3 show fluorescence lifetime pseudocolor images of H&E-stained sections captured at 63× magnification using the same microscope equipped with a TCSPC module (B&H GmbH). Panels d1-d3 show fluorescence intensity images, and panels e1-e3 show fluorescence lifetime images, of H&E-stained sections from another patient.


Figure 2. Distinction among actinic keratosis (AK), Bowen’s disease (BD), and basal cell carcinoma (BCC) using image segmentation based on the phasor plot of fluorescence lifetime imaging of H&E-stained sections. From left to right, fluorescence lifetime images of H&E-stained sections from AK, BD, and BCC; red, green, and blue colors indicate the image segments from the stratum corneum (SC) (a1-a3), epithelial cells (ECs) (b1-b3), and dermal connective tissue (CT) (c1-c3). Panels d1-d3 show the color-merged channels. The last column (a4, b4, c4, and d4) shows the histograms of average lifetime (τm), fitted with a triple-exponential decay, from the SC, ECs, CT, and merged channels, respectively.


Figure 3. Linear-kernel support vector machine (LSVM) for binary classification among actinic keratosis (AK), Bowen’s disease (BD), and basal cell carcinoma (BCC) using fluorescence lifetime of H&E-stained sections. Panels a1-a2 show the 2D feature space (μ and width) and panel a3 shows the 3D feature space (μ, width, and entropy), with the decision boundary between precancerous lesions (AK & BD) and skin carcinoma (BCC) or between the AK and BD subcategories, using τm (a1) or τ2 (a2-a3); panels b1-b3 show the dependence of five-fold cross-validation accuracy on the penalty parameter C for the LSVM models shown in a1-a3, respectively; panels c1-c3 show the receiver operating characteristic (ROC) curves corresponding to the binary classifications in a1-a3, respectively.


Table of Contents


A two-photon FLIM system was used to acquire fluorescence lifetime images of H&E-stained sections of skin lesion biopsies. Small ROI windows of 28×28 pixels were randomly selected from the EC layers for lifetime feature extraction and entropy calculation. Finally, an SVM model was built using the 3D feature space of μ-width-entropy for classification of AK, BD, and BCC.
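The ROI feature-extraction step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the 32-bin histogram, the base-2 logarithm, and the use of the population standard deviation as the distribution "width" are all assumptions (the paper's own definitions are in its Supporting Information). μ is taken as the median, following the Conclusion, and restricting the windows to the epithelial-cell layer is not modeled here.

```python
import math
import random
import statistics


def shannon_entropy(values, n_bins=32):
    """Shannon entropy (in bits) of the histogram of `values` over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # every pixel identical: a single occupied bin
    counts = [0] * n_bins
    for v in values:
        # Map v to a bin index; the maximum value falls in the last bin.
        idx = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def roi_features(image, top, left, size=28):
    """Return (median, width, entropy) for a size x size window of a 2D lifetime map."""
    pixels = [image[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    return (statistics.median(pixels),
            statistics.pstdev(pixels),  # ASSUMPTION: width = standard deviation
            shannon_entropy(pixels))


def random_roi_features(image, rng, size=28):
    """Pick a random window that lies fully inside the image and extract its features."""
    top = rng.randrange(len(image) - size + 1)
    left = rng.randrange(len(image[0]) - size + 1)
    return roi_features(image, top, left, size)


# Synthetic stand-in for a tau2 lifetime map; the values (in ns) are made up.
rng = random.Random(0)
demo_image = [[rng.gauss(1.0, 0.1) for _ in range(64)] for _ in range(64)]
mu, width, entropy = random_roi_features(demo_image, rng)
print(f"median={mu:.3f} ns, width={width:.3f} ns, entropy={entropy:.3f} bits")
```

Each ROI then yields one point in the μ-width-entropy feature space, which is the input a linear-kernel SVM would be trained on.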
