Machine Learning for Analysis of Time-resolved Luminescence Data
Nikola Dordević, Joseph Samuel Beckwith, Maksym Yarema, Olesya Yarema, Arnulf Rosspeintner, Nuri Yazdani, Juerg Leuthold, Eric Vauthey, and Vanessa Wood
ACS Photonics, Just Accepted Manuscript • DOI: 10.1021/acsphotonics.8b01047 • Publication Date (Web): 12 Nov 2018
Machine Learning for Analysis of Time-resolved Luminescence Data
Nikola Dordevic*▲, Joseph S. Beckwith○, Maksym Yarema*, Olesya Yarema*, Arnulf Rosspeintner○, Nuri Yazdani*, Juerg Leuthold▲, Eric Vauthey○, Vanessa Wood*
*Materials and Device Engineering Group, ETH Zurich, 8092 Zurich, Switzerland
▲Institute of Electromagnetic Fields, ETH Zurich, 8092 Zurich, Switzerland
○Physical Chemistry Department, University of Geneva, 1211 Geneva, Switzerland
e-mail: [email protected]
Abstract
Time-resolved photoluminescence is one of the standard techniques used to understand and systematically optimize the performance of optical materials and optoelectronic devices. Here, we present a machine learning code to analyze time-resolved photoluminescence data and determine the decay rate distribution of an arbitrary emitter without any a priori assumptions. To demonstrate and validate our approach, we analyze computer-generated time-resolved photoluminescence datasets and show its benefits for studying the photoluminescence of novel semiconductor nanocrystals (quantum dots), where it quickly provides insight into the possible physical mechanisms of luminescence without the need for educated guessing and fitting.
Keywords
machine learning, luminescence, radiative rate, quantum emitters, nanocrystals
Buried in luminescence dynamics is information about the electronic structure of the emitter (including the presence of defect states and electron-phonon interactions), the interaction of the electron-hole pair with free or bound charges, and deviations of the photon density of states caused, for instance, by the location of the material in a complex dielectric, near a metallic interface, or in a cavity. Luminescence dynamics are usually recorded in photon-counting experiments, which result in time-resolved photoluminescence (TRPL) data sets1. To study TRPL data, one can follow either ‘model-dependent’ or ‘model-free’ strategies. In ‘model-dependent’ analysis, the TRPL data are fit with a multi-exponential2, stretched exponential3-4, or log-normal distribution5 model. In contrast, ‘model-free’ analysis aims to recover the distribution of lifetimes, with the most notable approaches being the exponential sampling method (ESM)6, the maximum entropy method (MEM)7, and lifetime density analysis (LDA)8.
The example in Figure 1 highlights the problems with fitting TRPL data when the decay model is assumed a priori. A TRPL measurement of Ag-In-Se nanocrystals3 (gray line) is fit with three different models: a biexponential decay (Figure 1a), a triexponential decay (Figure 1b), and a combination of a single exponential and a stretched exponential decay (Figure 1c). The biexponential model does not describe the emission after 300 ns well; however, statistical analysis of the results (see Supporting Information) suggests that the triexponential fit and the combination of a single exponential and a stretched exponential are equally good. However, these two fits describe two very different models of the underlying physics. The triexponential fit means that the decay rate distribution is a sum of three Dirac delta functions, a scenario that could result from three distinct emission pathways. The mixture model assumes one well-resolved emission process and another emission channel that features a broad distribution of decay rates. Because the electronic structure of Ag-In-Se nanocrystals and the optical coupling between levels are not well understood, it is not possible to know in advance which model is correct. This example illustrates the need for an a priori-free, data-driven analysis of TRPL curves9-10.
Figure 1. Time-resolved photoluminescence of Ag-In-Se nanocrystals. The measured data in gray is the same for (a - d). (a) Bi-exponential decay fit (left) and resulting decay rate distribution (right). (b) Tri-exponential fit (left) and resulting decay rate distribution (right). (c) Fit with an exponential decay and a stretched exponential decay (left) and the resulting decay rate distribution (right). The decay rate of the exponential decay is located at ~4.8×10-3 ns-1, while the stretched exponential decay has a peak at ~2×10-2 ns-1. (d) The model obtained using machine learning (left) and the resulting histogram of the probability distribution of decay rates depopulating the system (right). The machine learning model indicates a narrow distribution centered at 5×10-3 ns-1 and a broader distribution centered around 3×10-2 ns-1.
The concept of determining a decay rate distribution instead of trying multi-exponential fits to describe photophysical processes has been previously applied to analyze particle size distributions of scatterers in polydisperse samples6, 11-12, spin and charge transport dynamics in doped semiconductors13, nanoparticle photoluminescence in inverse opal photonic crystals14, and the influence of the external refractive index on fluorescence kinetics in organics10, to name a few. McWhirter and Pike15 wrote extensively about the numerical inversion of the Laplace transform, which was later applied to other invariant transforms for applications in photon correlation spectroscopy and Fraunhofer diffraction16-17. Ostrowsky et al.6 introduced the exponential sampling method (ESM) for numerical inversion of the Laplace transform to sample the decay rate distribution of polydisperse scatterers. The ESM was then extended by introducing the histogram method, which does not need additional normalization of the parameters and offers better resolution of the decay rate distribution11-12. The works by Santra et al.18-19, Kim et al.20, and Maus et al.21 demonstrate that the use of maximum likelihood estimation combined with Poisson instead of Gaussian statistics makes it possible to reliably analyze low-signal data. Regularization combined with the decay rate distribution approach was applied to TRPL data by Petrov et al.10 for the analysis of fluorescence dynamics of perylene molecules, while Slavov et al.8 used regularization in combination with LDA to obtain stable recovery of the lifetime distribution. Another way of using regularization to obtain the decay rate distribution is the maximum entropy method, which uses the Shannon-Jaynes entropy as a smoothing function (Kumar et al.7 and references therein). This method has been previously applied to study the decay kinetics of fluorescent molecules22 and proteins7, 23.
Here, we build upon this previous literature and implement and validate a machine learning approach to calculate the probability distribution of decay rates with no a priori assumptions. Our method works by solving the discrete Laplace transform using Poisson statistics and applying an even more
robust regularization approach (k-fold cross-validation), which works by minimizing the prediction error and which is standard in the machine learning community24. As highlighted in Figure 1d, this approach shows that the decay rate distribution of the Ag-In-Se nanocrystals described above is best described by the mixture of an exponential and a stretched exponential (i.e., the case in Figure 1c). This fit can be used by the research community while working toward a better understanding of the PL mechanism in I-III-VI nanocrystals3, 25.
We provide our software (LumiML) open source26. After describing the details of our approach, we validate it using computer-generated TRPL data. We then apply it to analyze energy-resolved, ultrafast emission from CsPbBr3 perovskite nanocrystals. The machine learning approach confirms the complex dynamics of neutral excitons (X), charged excitons (X*), and multiexcitons (XX)27-31, as well as their binding energies32, highlighting the utility of this approach for quickly developing an understanding of new materials.
Theory and Approach
Determining the Decay Rate Distribution with Machine Learning. The time-resolved photoluminescence (TRPL) of the excited state of an emitter characterized by a distribution of decay rates can be written as:

I(t) = ∫₀^∞ ρ(Γ) e^(−Γt) dΓ ,  (1)
where 𝐼(𝑡) is the intensity of the photoluminescence in time, and 𝜌(Γ) represents the underlying distribution of decay rates14.
As described above, the analysis of a time-resolved emission is usually performed by assuming the underlying distribution 𝜌(Γ). However, the distribution of decay rates can be calculated by solving (1) for 𝜌(Γ), which is equivalent to taking the inverse Laplace transform of the measured signal 𝐼(𝑡).
Our approach to determine ρ(Γ) without any a priori assumptions is shown in Figure 2. Eq. (1) is a well-known Fredholm integral equation of the first kind15, and it can be transformed into a matrix equation by expanding the decay rate distribution into a set of basis functions φ_i and coefficients β_i:

ρ(Γ) = Σ_{i=1}^{n} β_i · φ_i(Γ) .  (2)

We implement three simple basis sets:

φ_i(Γ) = δ(Γ − Γ_i),   φ_i(Γ) = θ_h(Γ − Γ_i) − θ_h(Γ − Γ_{i+1}),   φ_i(Γ) = χ_i(Γ) = { (Γ − Γ_{i−1})/(Γ_i − Γ_{i−1}),  Γ ∈ (Γ_{i−1}, Γ_i] ;  (Γ_{i+1} − Γ)/(Γ_{i+1} − Γ_i),  Γ ∈ (Γ_i, Γ_{i+1}] } ,  (3)
where δ(Γ) is the Dirac delta function (delta-basis6, 16), θ_h(Γ) is the Heaviside step function (histogram-basis11-12), and χ_i(Γ) represents the triangular function defined on the region [Γ_{i−1}, Γ_{i+1}] and centered on Γ_i (triangular-basis). The decay rate axis Γ ∈ [Γ_min, Γ_max] is discretized using exponential sampling16-17, which places the individual Γ values equidistantly on a logarithmic scale: Γ_k = Γ_0 exp(kπ/ω), where ω is a parameter that defines the resolution of the sampling. In this way, our method is fundamentally different from the standard fitting process: here we define a range of decay rates (lifetimes) based on the limits of the actual measurement and then calculate the decay rate distribution over the given range using the previously chosen basis. With the chosen basis expansion φ_i(Γ) and array of decay rates Γ_k, the matrix equivalent of Eq. (1) is

y = X · β ,  (4)
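As a concrete illustration, the exponential sampling of the decay rate axis and the construction of the kernel matrix X for the delta basis can be sketched in a few lines of NumPy. This is our own minimal sketch, not necessarily the LumiML API; the function names and the choice ω = 5 are illustrative assumptions.

```python
import numpy as np

def decay_rate_axis(gamma_min, gamma_max, omega):
    # Gamma_k = Gamma_0 * exp(k * pi / omega): points equidistant on a log scale
    n = int(np.floor(np.log(gamma_max / gamma_min) * omega / np.pi)) + 1
    return gamma_min * np.exp(np.arange(n) * np.pi / omega)

def kernel_matrix_delta(t, gammas):
    # Delta basis: the integral defining x_ij collapses to exp(-Gamma_j * t_i)
    return np.exp(-np.outer(t, gammas))

t = np.linspace(0.0, 500.0, 2000)         # ns, time axis of the TRPL trace
gammas = decay_rate_axis(1e-3, 1.0, 5.0)  # ns^-1, sampled decay rates
X = kernel_matrix_delta(t, gammas)        # one column per sampled decay rate
```

Each row of X corresponds to one measured time bin and each column to one candidate decay rate, so that y ≈ X · β reproduces Eq. (4).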
where y is a vector of the measured TRPL data points (y_i = I(t_i) [counts]) and can be considered a Poisson random variable in photon-counting experiments, the elements x_ij of the matrix X are obtained by evaluating the integral ∫_{Γ_min}^{Γ_max} φ_i(Γ) exp(−Γ·t_j) dΓ, and β are the expansion coefficients. In general, Eq. (4) is an overdetermined system of equations, since standard TRPL experiments yield a few thousand data points y (the number of rows of the matrix X), while a reasonable number of elements in the decay rate array Γ_k is usually less than 50. In addition, as matrix equation (4) is derived from a Fredholm integral equation of the first kind, Eq. (4) is an ill-posed inverse problem (i.e., there is no unique solution for the inverse of X).
To find the optimal vector of coefficients β, we implement machine learning algorithms. The library (which we name LumiML) is written in Python 3 and based on the established machine learning library scikit-learn33. We provide LumiML open source26. Concretely, the optimal coefficients β are found using linear regression, which consists of minimizing an objective function. Here, our objective function is the sum of a cost function L that accounts for the underlying statistics of the experiment and a penalty term P that penalizes certain values of the coefficients24. The presence of a penalty term is known as regularization and is used to obtain a stable solution to Eq. (4) and to avoid overfitting. Here, we choose as the cost function the negative Poisson log-likelihood function20:

L(β, y) = Σ_{i=1}^{n} [ ŷ_i(β) − y_i log ŷ_i(β) ] ,  (5)
where ŷ_i is the estimate calculated from Eq. (4) and y_i is the measured data. L is minimized when the estimated values ŷ_i are closest to the measured data. This is known as maximum likelihood estimation. We use Poisson statistics20-21, since data points in photon-counting experiments can be considered samples drawn from a Poisson distribution.
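The cost function of Eq. (5) is straightforward to express in code. The sketch below is our own minimal version, not the LumiML implementation; it clips the model prediction to a small positive value to guard the logarithm.

```python
import numpy as np

def poisson_nll(beta, X, y, eps=1e-12):
    # Eq. (5): sum_i  y_hat_i - y_i * log(y_hat_i), with y_hat = X @ beta
    y_hat = np.clip(X @ beta, eps, None)  # guard against log(0)
    return np.sum(y_hat - y * np.log(y_hat))
```

By construction, the function is smallest when the prediction X @ beta matches the measured counts.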
In order to prevent overfitting, we use elastic net regularization34 with a penalty term

P(β; α, θ) = L2(‖β‖_2) + L1(‖β‖_1) = (α(1 − θ)/2) ‖β‖_2² + α θ ‖β‖_1 ,  (6)
where α and θ are so-called hyperparameters that need to be determined and which control the amount of regularization (α) and the ratio between the L1 and L2 penalties (θ). The L1 penalty term (also known as the Lasso penalty24, 35) adds the absolute value of the magnitude of the coefficients ‖β‖_1 and drives the smallest coefficients to zero. The L2 penalty term (used in ridge regression in machine learning) adds the sum of the squared amplitudes of the coefficients ‖β‖_2, but does not set any values strictly to zero24. Both penalty terms constrain the maximum value of the norm of the coefficients, which is inversely proportional to the amount of regularization (i.e., a large value of α drives all coefficients to zero). A more detailed discussion of the effect of the parameter α on the norm of the calculated coefficients β is presented in the Supplementary Information. The main difference between the L1 and L2 penalties is that the L1 penalty can ‘pre-select’ coefficients: the coefficients with the least significance for explaining the data are shrunk to zero, resulting in a sparse solution (only a few nonzero coefficients). A problem with the L1 penalty occurs when coefficients are correlated, which can be the case when the decay rate distribution is broad. In that case, the L1 penalty will randomly pick only one of two correlated coefficients. Adding the L2 penalty circumvents this issue, while still allowing coefficients with minimal influence on the model to be shrunk to zero.
Thus, the final solution is obtained as the minimizer of the sum of Eqs. (5) and (6):

β̂ = argmin_β [ L(β, y) + P(β; α, θ) ] .  (7)
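Under the non-negativity constraint β ≥ 0 that a decay rate distribution implies, ‖β‖_1 reduces to Σ_i β_i and the objective of Eq. (7) becomes smooth on the feasible set, so a bound-constrained quasi-Newton solver can minimize it directly. The following is a minimal SciPy sketch of this idea (our own construction, not the LumiML API; the function names, initial guess, and the scikit-learn-style convention of θ as the L1 ratio are assumptions).

```python
import numpy as np
from scipy.optimize import minimize

def objective(beta, X, y, alpha, theta, eps=1e-12):
    # Eq. (7): Poisson negative log-likelihood (Eq. 5) + elastic net penalty (Eq. 6)
    y_hat = np.clip(X @ beta, eps, None)
    nll = np.sum(y_hat - y * np.log(y_hat))
    penalty = (alpha * theta * np.sum(np.abs(beta))
               + 0.5 * alpha * (1.0 - theta) * np.dot(beta, beta))
    return nll + penalty

def fit_distribution(X, y, alpha, theta):
    # Non-negative coefficients: a decay rate distribution cannot be negative,
    # and on beta >= 0 the |beta| term is smooth, so L-BFGS-B applies.
    beta0 = np.full(X.shape[1], max(y.max(), 1.0) / X.shape[1])
    res = minimize(objective, beta0, args=(X, y, alpha, theta),
                   method="L-BFGS-B", bounds=[(0.0, None)] * X.shape[1])
    return res.x
```

Because the Poisson NLL is convex in β and the penalty is convex, the bound-constrained minimizer is the global solution of Eq. (7) on the feasible set.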
The optimal regularization parameter, α, is found using k-fold cross-validation (Figure 2, step 3b). First, a range of regularization parameters is defined. For every regularization parameter, the single TRPL dataset under investigation (y) is split randomly k times into training and validation sets.
From each of the k training sets (X_train, y_train), we determine temporary coefficients β by solving Eq. (7), and use the validation sets (X_val, y_val) to evaluate the prediction error (PE), given by Eq. (5). The mean value of the PE is calculated across the k folds, and we take the α_opt that minimizes the PE. Alternatively, one takes the negative of the prediction error to form a score, which is then maximized. The ratio θ between the L1 and L2 penalties can also be chosen using cross-validation, but it is usually satisfactory to start the analysis by setting it to θ = 0.5 in the case of a broad underlying distribution of decay rates, or to θ = 1 when the signal-to-noise ratio is low or the underlying distribution is narrow. The effect of choosing different values of the parameter θ is shown in the Supplementary Information.
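The cross-validation loop described above can be sketched as follows. Here `fit` and `prediction_error` stand for hypothetical user-supplied callables (in our setting, a solver of Eq. (7) and the Poisson prediction error of Eq. (5)); the function name and signature are our assumptions, not the LumiML API.

```python
import numpy as np
from sklearn.model_selection import KFold

def best_alpha(X, y, alphas, fit, prediction_error, k=5, seed=0):
    # For each candidate alpha, average the held-out prediction error over
    # k random train/validation splits, then keep the minimizing alpha.
    kf = KFold(n_splits=k, shuffle=True, random_state=seed)
    mean_pe = []
    for alpha in alphas:
        fold_pe = [prediction_error(fit(X[tr], y[tr], alpha), X[va], y[va])
                   for tr, va in kf.split(X)]
        mean_pe.append(np.mean(fold_pe))
    return alphas[int(np.argmin(mean_pe))]
```

Equivalently, one can negate the prediction error to form a score and maximize it, which matches the scoring convention of scikit-learn.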
Figure 2. Workflow schematics for obtaining decay rate distributions from a measured TRPL dataset (step 1). A delta, histogram, or triangular basis is selected (step 2). With the basis, the kernel matrix X can be constructed and solved (step 3) using regression and regularization techniques (step 3a). In order to optimize regularization, we employ k-fold cross-validation and search for the regularization parameter which minimizes the prediction error (step 3b). Lastly, we use this optimal regularization to solve the matrix equation (step 3c) and obtain the optimal coefficients and decay rate distribution.
Results and Discussion
Validation of Machine Learning Approach with Computer Generated Datasets
To show the effectiveness of the method, we apply our decay rate analysis to two computer-generated datasets: a biexponential decay (Figure 3) and a stretched exponential decay (Figure 4). For the biexponential decay, we assume that the underlying distribution is a sum of two delta functions centered at 0.01 and 0.2 ns-1, respectively, and add background noise, corrupting the signal with Poisson noise so that the peak number of counts is 1000 times higher than the mean counts of the noise. The predicted distributions of decay rates using the delta-basis (Figure 3a) and histogram-basis (Figure 3b) (blue) match the input decay rates and their magnitudes (orange), with the residuals randomly distributed around zero.
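A dataset of this kind can be generated in a few lines. The sketch below assumes an equal amplitude split between the two decay channels and a peak of ~1000 counts; these specific choices are our illustrative assumptions, not necessarily those used for Figure 3.

```python
import numpy as np

rng = np.random.default_rng(42)

t = np.linspace(0.0, 500.0, 4096)   # ns, time axis
rates = np.array([0.01, 0.2])       # ns^-1: the two delta functions
amps = np.array([500.0, 500.0])     # assumed equal split, peak ~1000 counts

signal = np.exp(-np.outer(t, rates)) @ amps   # noiseless biexponential decay
background = signal.max() / 1000.0            # peak / mean background = 1000
y = rng.poisson(signal + background)          # Poisson-corrupted TRPL counts
```

Drawing the counts from a Poisson distribution (rather than adding Gaussian noise) reproduces the statistics of a photon-counting experiment, consistent with the cost function of Eq. (5).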
Figure 3. Calculated decay rate distribution for simulated bi-exponential decay. (a) Left: generated data and model prediction (bottom) and residuals weighted with data variance (top); right: Simulated distribution (top) and calculated distribution using delta basis (bottom). (b) Left: simulated data and model prediction (bottom) and weighted residuals (top); right: Simulated distribution (top) and calculated distribution using histogram basis.
Figure 4 shows results obtained from a simulated stretched exponential decay y_an(t) = A exp(−(t/τ)^κ) with τ = 25 ns and κ = 0.6, for which the distribution of decay rates can be computed analytically36. The noise conditions are the same as previously described. By using either a delta-basis (Figure 4a) or a histogram-basis (Figure 4b), we obtain either a properly sampled probability distribution of decay rates or a histogram thereof, respectively.
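For general κ the analytic decay rate distribution of a stretched exponential requires a series expansion36, but for the special case κ = 1/2 a closed form is known, which provides a convenient numerical sanity check of Eq. (1). This is our own illustration (note the simulation above uses κ = 0.6, for which no such simple closed form exists).

```python
import numpy as np
from scipy.integrate import quad

tau = 25.0  # ns

def rho(gamma):
    # Closed-form decay rate distribution for kappa = 1/2:
    # its Laplace transform is exp(-sqrt(t / tau))
    return (gamma ** -1.5 * np.exp(-1.0 / (4.0 * tau * gamma))
            / (2.0 * np.sqrt(np.pi * tau)))

def intensity(t):
    # Reconstruct I(t) through the integral of Eq. (1)
    val, _ = quad(lambda g: rho(g) * np.exp(-g * t), 0.0, np.inf)
    return val

# the reconstructed decay matches the stretched exponential with kappa = 1/2
for t in (1.0, 10.0, 100.0):
    assert abs(intensity(t) - np.exp(-np.sqrt(t / tau))) < 1e-6
```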
Figure 4. Calculated decay rate distribution for simulated stretched exponential decay. (a) Left: generated data and model prediction (bottom) and weighted residuals (top); right: Simulated decay rate distribution and calculated decay rate distribution using delta basis. (b) Left: simulated data and model prediction (bottom) and weighted residuals (top); right: Simulated decay rate distribution and calculated decay rate distribution using histogram basis.
These results on simulated datasets confirm that our machine learning approach is applicable to systems that exhibit narrow or broad decay rate distributions. More details on the analysis of the simulated datasets presented above, as well as on the analysis of additional simulated datasets, can be found in the Supporting Information.
Analyzing the Emission of CsPbBr3 Nanocrystals.
To highlight the practicality of our machine learning approach, we turn to perovskite nanocrystals, which have generated significant excitement as optically-active materials for low-threshold, optically pumped lasers37, LEDs38, and solar cells39. We work with in-house synthesized40 cubic CsPbBr3 nanocrystals with narrow linewidth (FWHM 22 nm), high quantum yield (76%), and average cube edge length of (9.7 ± 1.1) nm (Supporting Information). To study energy-resolved photoluminescence at short time scales, we perform femtosecond broadband fluorescence upconversion spectroscopy41 (FLUPS). This technique relies on measuring the upconverted signal, which is created by the sum frequency mixing of the fluorescence light with 1340 nm gate pulses in a thin β-barium borate crystal. The broadband (energy-resolved) fluorescence signal can thus be probed at different time delays after the excitation of the sample with ~170 fs time resolution. The excitation pulse wavelength is 400 nm and its temporal
width is ~100 fs. The excitation pulse energy is varied from 5 nJ to 100 nJ, which, assuming an excitation spot of ~70 µm diameter, corresponds to fluences of 0.13 – 2.6 mJ/cm2.
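The quoted fluence range follows directly from the pulse energies under the stated spot size assumption, as a quick arithmetic check shows (a uniform circular spot is assumed here):

```python
import math

def fluence_mj_per_cm2(pulse_energy_nj, spot_diameter_um=70.0):
    # Fluence = pulse energy / spot area
    area_cm2 = math.pi * (spot_diameter_um * 1e-4 / 2.0) ** 2
    return pulse_energy_nj * 1e-9 / area_cm2 * 1e3   # J/cm^2 -> mJ/cm^2

print(round(fluence_mj_per_cm2(5.0), 2))    # 0.13
print(round(fluence_mj_per_cm2(100.0), 1))  # 2.6
```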
Figure 5. (a) Two-dimensional contour plot of energy- and time-resolved emission intensity from the FLUPS experiment with an excitation fluence of 2.6 mJ/cm2. Different regimes of emission are indicated: (I) carrier cooling, (II) multiexciton and charged exciton emission, and (III) single exciton emission. (b) Calculated energy-resolved decay rate distribution for the measurement shown in (a). Three different luminescent species are apparent from the figure: single excitons (green line), trions (blue line), and biexcitons (red line).
Figure 5a shows the energy- and time-resolved fluorescence intensity (blue color scale) map for CsPbBr3 nanocrystals with an excitation pulse energy of 100 nJ (excitation fluence of ~2.6 mJ/cm2). We can identify three emission regimes. In agreement with previous studies27-29, at early times (