
Article

Machine Learning Algorithms for Liquid Crystal-Based Sensors
Yankai Cao, Huaizhe Yu, Nicholas L. Abbott, and Victor M. Zavala
ACS Sens., Just Accepted Manuscript • DOI: 10.1021/acssensors.8b00100 • Publication Date (Web): 05 Oct 2018
Downloaded from http://pubs.acs.org on October 9, 2018





Machine Learning Algorithms for Liquid Crystal-Based Sensors
Yankai Cao, Huaizhe Yu, Nicholas L. Abbott, and Victor M. Zavala∗
Department of Chemical and Biological Engineering, University of Wisconsin-Madison, 1415 Engineering Dr, Madison, WI 53706, USA

Abstract

We present a machine learning (ML) framework to optimize the specificity and speed of liquid crystal (LC)-based chemical sensors. Specifically, we demonstrate that ML techniques can uncover valuable feature information from surface-driven LC orientational transitions triggered by the presence of different gas-phase analytes (and the corresponding optical responses) and can exploit such feature information to train accurate and automatic classifiers. We demonstrate the utility of the framework by designing an experimental LC system that exhibits similar optical responses to a stream of nitrogen containing either 10 ppmv dimethyl methylphosphonate (DMMP) or 30% relative humidity (RH). The ML framework is used to process and classify thousands of images (optical micrographs) collected during the LC responses, and we show that classification (sensing) accuracies of over 99% can be achieved. For the same experimental system, we demonstrate that traditional feature information used in characterizing LC responses (such as average brightness) can only achieve sensing accuracies of 60%. We also find that high accuracies can be achieved by using time snapshots collected early in the LC response, thus providing the ability to create fast sensors. We also show that the ML framework can be used to systematically analyze the quality of information embedded in LC responses and to filter out noise that arises from imperfect LC designs and from sample variations. We evaluate a range of classifiers and feature extraction methods and conclude that linear support vector machines are preferred and that high accuracies can only be achieved by simultaneously exploiting multiple sources of feature information.

Keywords: liquid crystals; chemical sensors; automated; machine learning; fast

Security, health, and environmental monitoring requires the development of chemical sensing technologies that can be used in situ with limited equipment and human intervention. The impact of such technologies when deployed at a large scale can be significant; for instance, the U.S. Department of Energy Savannah River laboratory analyzes over 40,000 groundwater samples per year at a cost of $1,000 per sample ($40 million per year) [1]. LCs are fluid phases with preferred molecular orientations (a so-called director) that undergo surface-driven ordering transitions in the presence of chemical species such as organophosphonates [2, 3, 4, 5, 6], chlorine, ammonia, and hydrogen sulfide [7]. The optical characteristics (features) of the LC transitions can be tailored and exploited to design

∗ Corresponding Author: [email protected]



chemical sensors. For instance, LCs can be designed to assume homeotropic (perpendicular) orientations on surfaces decorated with metal salts [2, 4, 8]. Chemical species that diffuse into the LCs and bind more strongly to the metal cations than the LC functional groups will trigger a transition of the LC orientation from homeotropic to planar (see Figure 1) [9, 10, 11]. It is possible to manipulate the selectivity and response characteristics (e.g., dynamics) of the LC by tuning the binding energies of the LC functional groups (e.g., nitrile and pyridine groups) to the surface (e.g., Fe3+, La3+) [9, 10]. For instance, ordering transitions of LC sensors fabricated using a nematic LC called 4-cyano-4′-pentylbiphenyl (5CB) and surfaces presenting aluminum perchlorate salts have been studied in [12, 13, 14].

Inspection of Figure 1 reveals that this particular LC system responds in a similar manner to a gaseous stream of nitrogen containing either DMMP vapor (10 ppmv) or water vapor (30% RH). DMMP and water vapors are of particular relevance; DMMP is an organophosphonate that is used to simulate nerve agents and pesticides, while water vapor is a common, potentially interfering, compound. In past designs of LC sensors, the inability to distinguish similar LC responses has traditionally been addressed through the selection of surface chemistry, LC chemistry, and other design variables. This process has typically involved extensive and laborious experimentation, although recent guidance has been obtained from the use of computational chemistry methods [9]. In addition, even if chemically optimized systems are designed, the initial state of the LC and of the surface will inherently exhibit variations from sample to sample, and this will induce variations in the LC response (introducing sensor noise). Recent research on LCs demonstrates that there is a perceptible difference in the response time of mean luminosity under different analytes and surface designs [8]. However, the application of systematic techniques to analyze the limits of using simple features such as response times to guide sensor design is lacking. Moreover, the use of additional response features in sensor design has not been explored.

In this work we use machine learning (ML) algorithms to optimize the selectivity and speed of LC sensors. Specifically, we use advanced feature extraction techniques to characterize complex space-time patterns observed in the response of the LCs. We combine feature information on average brightness, oriented brightness gradients, and deep neural nets to capture such patterns. We find that the combination of multiple sources of feature information enables high classification accuracies. However, the large amount of feature information and the number of training samples lead to computationally challenging classification models. Consequently, we also explore different classification paradigms and software implementations. These computational capabilities help us assess the accuracy limits of different LC-based chemical sensors. We demonstrate our capabilities using a case study in which we classify the responses of LCs anchored to surfaces decorated with metal cations (to detect DMMP). We conclude that the proposed LC-ML framework enables systematic examinations of synergies between molecular interactions, LC design, information content, and sensor accuracy. This framework can help identify critical aspects of LC designs and molecular behavior that maximize sensor accuracy.




Figure 1: Optical micrographs of LCs supported on surfaces decorated with Al3+. (Top) initial LC state with homeotropic orientation, (Middle) LC transition to planar orientation under N2-DMMP vapor, and (Bottom) transition under N2-water vapor. A schematic illustration of the orientation of the LC is shown on the right.

Experimental Materials and Methods

We recorded six videos that show the response of LCs to DMMP-N2 at 10 ppmv (the length of each video ranges from 4 to 13 minutes) and six videos that show the response of LCs to water-N2 (the length of each video ranges from 7 to 30 minutes). The experimental system is sketched in Figure 2. Each video tracks the dynamic evolution of multiple independent micro-wells (the total number of micro-wells recorded was 391). We captured a frame (micrograph) from each video every 3.3 seconds. We split each frame into several images, each containing a single micro-well at a specific time. The total number of micro-well snapshots generated was 75,081 and each image was resized to 60 × 60 pixels (see Figure 3 for example micrographs). The experimental procedure followed to obtain the LC response data involved the following steps.

Materials. 5CB was purchased from HCCH (Jiangsu Hecheng Display Technology Co., Ltd.). The SU-8 2050 and SU-8 developers were purchased from MicroChem (Westborough, MA). Absolute ethanol (anhydrous, 200 proof) and aluminum(III) perchlorate salt in its highest purity form were purchased from Sigma-Aldrich. (Tridecafluoro-1,1,2,2-tetrahydrooctyl)-trichlorosilane was purchased from Pfaltz & Bauer (Waterbury, CT). DMMP in nitrogen at a concentration of 10 ppmv was obtained from Airgas (Radnor, PA) and used as received. Fisher's finest glass slides were purchased from Fisher Scientific (Hampton, NH). All chemicals and solvents were of analytical reagent grade and were used as received without further purification. All deionized water used in the study possessed a


resistivity of at least 18.2 MΩ·cm.

Fabrication of microwells. Polymeric wells with diameters of 200 µm were fabricated by photolithography to create LC films supported on metal-salt surfaces (as detailed below). SU-8 2005, which contains 45 wt% bisphenol A novolac epoxy, was made by adding cyclopentanone to SU-8 2050, which contains 71.65 wt% bisphenol A novolac epoxy, to decrease the viscosity of the photoresist. Then, a thin film of SU-8 2005 was deposited on a cleaned glass surface by spin-coating at 500 rpm for 10 seconds followed by 3,000 rpm for 30 seconds. The polymer-coated surface was subsequently prebaked on a hot plate at 95 °C for 5 min and then cooled to room temperature for 10 min. After prebaking, a photomask with 200 µm-diameter dark circular patterns was placed on the polymer-coated surface and exposed to UV light for 70 s (λ = 254 nm, UV crosslinker, Spectronics, Westbury, NY). After UV exposure, the sample was post-baked for 7 min at 95 °C. The SU-8 film was exposed to an oxygen plasma (250 W RF power, 50 cm³/min oxygen) and subsequently placed into a desiccator to which 25 µL of (tridecafluoro-1,1,2,2-tetrahydrooctyl)-trichlorosilane was added (adjacent to the SU-8 film). A vacuum was then pulled in the desiccator for 20 min, during which time the organosilane formed a vapor and reacted with the surface of the SU-8 film. After the surface treatment, the sample was placed in an SU-8 developer (1-methoxy-2-propyl acetate) and sonicated for 15 seconds to dissolve the regions of the SU-8 film that were not exposed to UV light. The sample was then washed with a copious amount of isopropanol and dried under a gaseous flow of nitrogen. The depth of the polymeric microwells fabricated using this procedure was determined to be 5 µm by surface profilometry.

Formation of thin films of LC supported on metal-salt-decorated surfaces.
Aluminum perchlorate salts were dissolved in dry ethanol to form a 10 mM solution, and 50 µL of the solution was deposited by spin-coating (3,000 rpm for 30 seconds) onto the glass surfaces at the bottom of the polymeric microwells. Next, the microwells were filled with LC by depositing 2 µL of LC onto each array of microwells using a micropipette. The excess LC was removed from the array by wicking into a microcapillary.

Exposure of LC films to DMMP and humid N2. The LC-filled microwells were exposed to a stream of dry N2 containing DMMP (10 ppmv) within a flow cell [15] with glass windows that permitted characterization of the optical appearance of the LC using a polarized optical microscope. The gas containing DMMP was delivered to the flow cell at 300 mL/min using a rotameter (Aalborg Instruments and Control, Orangeburg, NY). For experiments performed to evaluate the response of the LCs to water vapor, nitrogen containing 30% relative humidity (RH) was delivered to the flow cell at 300 mL/min with the same rotameter. The RH of the gas was controlled using a portable dew point generator (LI-610, LI-COR Biosciences, Lincoln, NE). To generate the 30% RH gas stream, the temperature of the gas fed to the generator was controlled at 25 °C and the dew point was set at 6.2 °C. The optical appearance of the LC film was recorded using an Olympus camera (Olympus C2040Zoom, Melville, NY) and WinTV software (Hauppauge, NY).


Optical characterization of LC films. The optical appearance of the LC was characterized using an Olympus BX-60 polarizing light microscope in transmission mode (Olympus, Japan). Conoscopic imaging of the LC films was performed by inserting a Bertrand lens into the optical path of a polarized-light microscope to distinguish between homeotropic and isotropic films [12].

Figure 2: Sketch of the experimental system used for collecting LC response data (flow controller, 10 ppm DMMP in N2 supply, TDP air and RH condenser, flow cell containing the LC sample between cross polarizing films, light source, and detector).

Computational Methods

The working hypothesis motivating this work is that modern ML algorithms can be used to automatically extract information from LC responses in order to optimize the specificity and speed of LC sensors. ML algorithms can also enable sensing with limited human intervention and can help reduce on-site hardware needs. In particular, classification models can be pre-trained (calibrated) against large numbers of experimental samples, and new samples can then be classified in real time using the pre-trained model. ML techniques also provide the ability to quantify the impact of LC design characteristics on the information content of the response signals and on the sensor accuracy. ML can thus help to better guide and reduce experimental effort. An ML classification framework involves three fundamental computational tasks, which we now describe: feature extraction, model training (learning), and model testing (prediction).

Feature Extraction. This task distills (extracts) useful information from raw data to drive the classification model. In the context of LC responses, the input data are typically time sequences of images that track the response of the LC after a sample is introduced. The quality (information content) and number of the extracted features have strong impacts on the classification accuracy. In other words, if non-informative features are used, the model will not be able to distinguish among different LC responses. For instance, if aggregate metrics such as average brightness are used as the only feature



defining an image, the classification can be inaccurate (different images can give the same average value). Measuring the information content of LC responses is key because experimental researchers often have strong physical insights on what features can best describe an LC response but they often lack the means to quantify the quality of such features. Moreover, information can remain hidden to experienced observers due to pattern complexity and the large amounts of data generated. Selecting appropriate features is a complex and delicate task and, as a result, diverse techniques have been reported in the literature. Popular feature extraction techniques used in computer vision include histograms of oriented gradients (HOG) and deep neural networks (Alexnet). HOG features analyze the gradient orientations in localized portions of an image, and this information can be used to detect persistent spatial patterns (see Figure 3). Alexnet is a classification model (a convolutional neural network) that has been pre-trained against millions of different images found on the internet. Interestingly, the neurons of Alexnet can be used as image features because these implicitly seek to classify a new image based on previous knowledge of other images (even if these are not necessarily related to the application at hand). This approach mimics how humans categorize new objects based on prior information. For instance, Alexnet features have recently been used to train melanoma classifiers that achieve accuracies similar to those obtained by trained dermatologists [16].
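The HOG idea can be conveyed with a short, self-contained sketch. Note that this is not the Matlab extractHOGFeatures implementation used in this work (which adds overlapping, normalized 2×2-cell blocks and is how a 60×60 image yields 900 features); it is a minimal NumPy gradient-orientation histogram for intuition only, with an arbitrary random image standing in for a micrograph.

```python
import numpy as np

def hog_like_features(img, cell=10, bins=9):
    """Per-cell histogram of gradient orientations, weighted by gradient
    magnitude (no block normalization, unlike full HOG)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientations
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0.0, 180.0), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

rng = np.random.default_rng(0)
img = rng.random((60, 60))      # stand-in for one 60x60 micro-well image
f = hog_like_features(img)
print(f.shape)                  # (324,): 6x6 cells x 9 orientation bins
```

Bright/dark ring boundaries in a micro-well produce consistently oriented gradients, so such histograms capture the persistent spatial patterns described above.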

Figure 3: LC images under N2-DMMP and N2-water (left) and HOG features for an image (right).

Model Learning. Diverse classification models have been explored in the literature. Popular methods include support vector machines (SVMs), logistic regression, and deep neural networks. Here, we only discuss the specifics of SVM because this is the technique that has worked best in our preliminary work on LC response classification. SVM seeks to find a multi-dimensional hyperplane that effectively separates the training samples (described in terms of their features) into two or more classes. For simplicity, we only discuss algorithms for two classes (binary classification). The hyperplane is described by a weighted function of the features. The training (learning) of an SVM model thus consists of solving an optimization problem that finds the feature weights that achieve a maximum separation among the training samples. The classification problem has the following



mathematical form:

    min_{w, γ, ξ}   (1/|S|) ∑_{s∈S} ξ_s + λ wᵀw                      (1a)
    s.t.  y_s · (wᵀϕ(x_s) − γ) ≥ 1 − ξ_s,   s ∈ S                    (1b)
          ξ_s ≥ 0,   s ∈ S.                                          (1c)

Here, s ∈ S is the index of the sample in the training set S (containing |S| samples), xs is the vector of features of sample s with associated classification label ys (e.g., ys = 1 if a sample contains N2-DMMP and ys = −1 if a sample contains N2-water), ξs is the classification error, w is the weight vector, γ is the hyperplane offset, λ is a regularization parameter that prevents overfitting (when many features are needed), and ϕ(·) is the feature mapping function (ϕ(xs) = xs for linear classification). The solution of the SVM problem gives the optimal model parameters w∗, γ∗ that define the trained classification model. The computational complexity of the SVM problem (1) is high and is related to the number of training samples and to the number of features used (the dimension of vector xs, which can reach thousands in our LC context). A wide range of algorithmic techniques and software implementations have been proposed to tackle such computational complexity [17, 18]. A scalable and flexible approach consists of using interior-point algorithms. These algorithms can achieve high accuracies and can exploit underlying mathematical structure at the linear algebra level. Effective structure exploitation strategies are critical to leverage high-performance (parallel) computing capabilities [19]. The authors have developed solvers such as IPCluster [20] and PIPS-NLP [21] to solve large-scale structured optimization problems that have the same mathematical structure as SVM problems. In particular, such solvers exploit the following arrowhead structure of the linear algebra system:

    ⎡ K_1                    B_1 ⎤ ⎡ q_1 ⎤   ⎡ r_1 ⎤
    ⎢       K_2              B_2 ⎥ ⎢ q_2 ⎥   ⎢ r_2 ⎥
    ⎢             ⋱          ⋮  ⎥ ⎢  ⋮  ⎥ = ⎢  ⋮  ⎥               (2)
    ⎢                 K_S    B_S ⎥ ⎢ q_S ⎥   ⎢ r_S ⎥
    ⎣ B_1ᵀ  B_2ᵀ  ⋯  B_Sᵀ   K_0 ⎦ ⎣ q_0 ⎦   ⎣ r_0 ⎦

In the context of SVM, q_0 is the search step associated with the feature weights and offset, and q_s is the search step for the dual variables of the classification constraints and for the classification errors in sample s ∈ S. The diagonal blocks K_s are sparse matrices associated with each training sample (in stochastic programming these are random scenarios). The arrowhead system can be solved on parallel computers by using a Schur complement decomposition:

    (K_0 − ∑_{s∈S} B_sᵀ K_s⁻¹ B_s) q_0 = r_0 − ∑_{s∈S} B_sᵀ K_s⁻¹ r_s,
    K_s q_s = r_s − B_s q_0,   s ∈ S,                                (3)

where Z := K_0 − ∑_{s∈S} B_sᵀ K_s⁻¹ B_s denotes the Schur complement.

This approach parallelizes operations associated with each individual block K_s, but scalability is limited by operations with the Schur complement matrix Z (which is a dense matrix of the same dimension as the number of features). This limits the use of SVM to classification problems with a



few thousand features. The emergence of dense linear algebra operations is also an obstacle that prevents scalability of existing dual algorithms that operate on the sample (kernel) space [17].
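The Schur complement decomposition in (3) can be checked numerically on a small synthetic arrowhead system. The sketch below (plain NumPy, with block sizes chosen arbitrarily for illustration) reproduces the dense solution of (2); the per-sample back-substitutions are the operations that parallelize.

```python
import numpy as np

rng = np.random.default_rng(0)
S, nk, n0 = 4, 3, 2   # sample blocks, per-sample block size, feature-space size

def spd(n, shift):    # random symmetric positive-definite block
    M = rng.standard_normal((n, n))
    return M @ M.T + shift * np.eye(n)

Ks = [spd(nk, nk) for _ in range(S)]
Bs = [rng.standard_normal((nk, n0)) for _ in range(S)]
K0 = spd(n0, 20.0)
rs = [rng.standard_normal(nk) for _ in range(S)]
r0 = rng.standard_normal(n0)

# Schur step of (3): Z q0 = r0 - sum_s Bs^T Ks^{-1} rs
Z = K0 - sum(B.T @ np.linalg.solve(K, B) for K, B in zip(Ks, Bs))
q0 = np.linalg.solve(Z, r0 - sum(B.T @ np.linalg.solve(K, r)
                                 for K, B, r in zip(Ks, Bs, rs)))
# Back-substitution, independent (parallelizable) across samples
qs = [np.linalg.solve(K, r - B @ q0) for K, B, r in zip(Ks, Bs, rs)]

# Verify against a dense solve of the assembled arrowhead matrix (2)
A = np.zeros((S * nk + n0, S * nk + n0))
for i, (K, B) in enumerate(zip(Ks, Bs)):
    A[i*nk:(i+1)*nk, i*nk:(i+1)*nk] = K
    A[i*nk:(i+1)*nk, S*nk:] = B
    A[S*nk:, i*nk:(i+1)*nk] = B.T
A[S*nk:, S*nk:] = K0
q = np.linalg.solve(A, np.concatenate(rs + [r0]))
print(np.allclose(np.concatenate(qs + [q0]), q))   # True
```

Note that Z is dense of dimension n0 (the number of features), which is exactly the scalability bottleneck discussed above.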


Figure 4: Illustration of the adaptive clustering techniques.

Scalability bottlenecks of the Schur decomposition can be overcome by IPCluster by using adaptive clustering techniques (see Figure 4). These techniques identify data redundancies in the training samples and exploit these redundancies to compress the samples into a small set of clusters C = {c_1, ..., c_C}. The compressed set is then used to create a sparse preconditioner for (2) of the form:

    ⎡ (1/s_1) K_{c_1}                          B_{c_1} ⎤ ⎡ q_{c_1} ⎤   ⎡ r_{c_1} ⎤
    ⎢         (1/s_2) K_{c_2}                  B_{c_2} ⎥ ⎢ q_{c_2} ⎥   ⎢ r_{c_2} ⎥
    ⎢                    ⋱                      ⋮     ⎥ ⎢    ⋮    ⎥ = ⎢    ⋮    ⎥     (4)
    ⎢                 (1/s_C) K_{c_C}          B_{c_C} ⎥ ⎢ q_{c_C} ⎥   ⎢ r_{c_C} ⎥
    ⎣ B_{c_1}ᵀ  B_{c_2}ᵀ  ⋯  B_{c_C}ᵀ          K_0    ⎦ ⎣   q̄_0   ⎦   ⎣   r̄_0   ⎦

    K_s q_s = r_s − B_s q̄_0,   s ∈ S.

Numerical experiments with different applications have shown that the size of the compressed sample set C is usually less than 10% of the size of the original sample set S [20]. The advantages of the clustering approach are that no dense linear algebra operations are needed (thus avoiding scalability issues in the number of features) and that the approach is parallelizable (thus scaling linearly in the number of training samples). This approach is inspired by hierarchical multi-grid preconditioners used in the solution of partial differential equations and can also be generalized to perform hierarchical clustering. We also highlight that sample compression is performed at the linear algebra level, but the original SVM problem (containing all the training samples) is actually solved. The clusters are constructed by minimizing the distortion metric ∑_{s∈S} ∑_{i∈C} κ_{s,i} ‖γ̄_i − γ_s‖, where γ_s are the features of sample s. The quality of the preconditioner (its spectral properties) is tightly related to the distortion metric, which implies that strong data redundancies can yield efficient preconditioners.

Model Prediction. This task uses the optimal trained parameters of the classification model (w∗, γ∗) to predict the classification label y_s given the feature vector x_s of a new test sample (not contained in the training set). This task involves relatively minor computing operations (extract features using the data of the given sample and predict the category of the sample). Such operations can be performed on the cloud (remotely) and in real time to keep in situ hardware requirements to a minimum. The sensor accuracy is measured in terms of the number of correct predictions and, in the case of binary classification, we are also often interested in the proportion of false positives and negatives.
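The learning and prediction tasks for a linear SVM of the form (1) can be sketched end to end. The snippet below uses a plain hinge-loss subgradient method on synthetic two-class data (a stand-in for the LC features); production interior-point solvers such as those discussed above solve (1) far more robustly, and the data and hyperparameters here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic two-class feature vectors standing in for extracted LC features
X = np.vstack([rng.standard_normal((50, 2)) + 3.0,
               rng.standard_normal((50, 2)) - 3.0])
y = np.array([1] * 50 + [-1] * 50)        # +1: N2-DMMP, -1: N2-water

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize (1/|S|) sum_s max(0, 1 - y_s (w^T x_s - g)) + lam w^T w
    by subgradient descent (g plays the role of the offset gamma in (1))."""
    w, g = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        active = y * (X @ w - g) < 1.0    # samples violating the margin
        grad_w = -(y[active, None] * X[active]).sum(axis=0) / len(X) + 2 * lam * w
        grad_g = y[active].sum() / len(X)
        w -= lr * grad_w
        g -= lr * grad_g
    return w, g

w, g = train_linear_svm(X, y)             # model learning
pred = np.sign(X @ w - g)                 # model prediction
accuracy = (pred == y).mean()
print(accuracy)
```

On this well-separated toy set the trained hyperplane classifies essentially every sample correctly; the prediction step is a single inner product per sample, which is why it is cheap enough to run remotely and in real time.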

Results and Discussion

Data Processing

We used the available LC response data to investigate the prediction accuracy of an SVM classifier for LC responses to N2-DMMP and N2-water vapors. We train the classifier using different types (and combinations) of feature information. All feature extraction tasks were performed using existing capabilities in Matlab (version R2015a). Classification tasks were performed using tools available in Matlab and advanced optimization solvers such as Ipopt, PIPS-NLP, and IPCluster. We consider the following feature information extracted from the images:

I. Average Intensity of RGB Channels: An RGB image (micrograph) collected at a given time instance has three channels: red (R), green (G), and blue (B). The brightness field in each channel is represented as a matrix, each element of which denotes one pixel that captures spatial patterns. We average the spatial field for each channel to obtain a feature per channel (the average RGB intensities), thus obtaining three features at each point in time (every 3.3 seconds). We also consider the average of the three average intensities to obtain the average total intensity.

II. HOG Features: We use the Matlab function extractHOGFeatures with cell size [10, 10] to extract these features (see Figure 3). The total number of features extracted with this method was 900 at each point in time.

III. Deep Learning Features: We apply the Matlab implementation of Alexnet to each image and use the values of the neurons in the last hidden layer as features. The total number of features extracted with this method was 4,096 at each point in time.

IV. Grayscale Pixels: We convert the original RGB (true color) image into a grayscale image represented as a matrix of size 60 × 60 and then use the entire matrix (pixels) as features, which allows us to capture spatial patterns. The total number of features extracted with this method was 3,600.
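Feature types I and IV can be illustrated in a few lines. This is a NumPy analogue of the Matlab pipeline, not the code used in the study: the image is a random stand-in for a micrograph, and the grayscale conversion uses the standard luminosity weights (the convention followed by Matlab's rgb2gray).

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(60, 60, 3)).astype(np.uint8)  # stand-in micrograph

# I. Average intensity of each RGB channel (3 features) and their overall mean
rgb_means = img.reshape(-1, 3).mean(axis=0)
total_mean = rgb_means.mean()

# IV. Grayscale pixels as features (60 * 60 = 3600), via luminosity weighting
gray = img @ np.array([0.2989, 0.5870, 0.1140])
pixel_features = gray.ravel()

features = np.concatenate([rgb_means, [total_mean], pixel_features])
print(features.shape)   # (3604,)
```

The averages (I) discard spatial structure, while the raw grayscale pixels (IV) retain it, which is precisely the trade-off discussed in the feature list above.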
The total number of features extracted with the four methods (I-IV) at each point in time was 8,599. Spatial patterns of the RGB channels were not captured because this would raise the number of features by nearly a factor of three. Instead, we seek to capture spatial patterns by using the HOG (II) and grayscale pixel (IV) features. We process the data using two different training strategies, which we call dynamic and static. The dynamic strategy classifies a response based on average RGB feature information (I) that is accumulated during the evolution of the LC response. This strategy is motivated by the observation


that LC responses to N2-water tend to be slower than responses to DMMP. For example, Figure 5 shows that, when exposed to DMMP, the LCs take around 100-150 seconds to change appearance from a ring to a full moon. A similar transition in response to N2-water takes over 250 seconds. Consequently, the evolution of average intensity differs between DMMP and water. Figure 6 shows that the evolution of the average RGB channels in response to N2-water is smoother than the response to DMMP. Consequently, it is expected that the shape of the dynamic profiles of the RGB channels provides valuable information to perform classification. However, from the nature of the dynamic responses we can also see that the DMMP responses exhibit high variability from sample to sample. This is because the responses are tightly linked to the initial conditions of the LC and to variations in the sample and surface, which are difficult to control experimentally (as seen in the initial states of the RGB channels displayed in Figure 6). We can also see that, for a given sample, the differences in the evolution of the average RGB intensities are not as marked (suggesting that significant redundancy exists in these features). Another important limitation of the dynamic classification strategy is that its accuracy is inherently tied to the slow dynamics of the LC response. We also highlight that the dynamic strategy has severe bottlenecks in the amount of feature information that it can handle. In particular, if we were to accumulate all feature information (RGB, HOG, deep learning, and grayscale pixels) during the entire response, each training sample would contain 524,539 features, leading to intractable classification models and to overfitting (overparameterization) issues. The static strategy seeks to overcome the limitations of the dynamic counterpart by classifying LC responses based only on instantaneous time snapshots.
This strategy is based on the hypothesis that differences in spatial patterns are sufficient to identify the presence of DMMP or N2-water (see Figure 1), although such differences are difficult for a human observer to detect (particularly early in the responses). The static strategy has the key practical advantage that it does not require running a lengthy experiment to conduct classification, thus accelerating sensing. Moreover, this strategy does not accumulate feature information over time (if all sources of feature information are used, we only have 8,599 features per sample). For the static strategy, we considered two training set selection cases in order to evaluate how the quality of the training data affects sensor accuracy. In the first case (which we call static (a)), we partition the entire image population at random to create training and testing sets. In the second case (which we call static (b)), we partition the entire image population by micro-wells: we use data from a subset of micro-wells for training and data from an independent set of micro-wells for testing. The static (b) strategy is expected to have more spatially correlated training data (containing more redundancy and less information) compared to the static (a) counterpart. This comparison allowed us to quantify the impact of the quality of the training data.
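The difference between the static (a) and static (b) training-set selections can be sketched as follows. The well and frame counts are invented for illustration (the study used 391 wells and 75,081 snapshots); the point is only how the two splits differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n_wells, frames_per_well = 20, 50                 # illustrative counts
well_id = np.repeat(np.arange(n_wells), frames_per_well)

# static (a): split individual images at random
idx = rng.permutation(len(well_id))
train_a, test_a = idx[:500], idx[500:]

# static (b): split by micro-well, so test wells are unseen during training
wells = rng.permutation(n_wells)
train_b = np.flatnonzero(np.isin(well_id, wells[:10]))
test_b = np.flatnonzero(np.isin(well_id, wells[10:]))

# Fraction of test images whose micro-well also appears in training:
shared_a = np.isin(well_id[test_a], well_id[train_a]).mean()
shared_b = np.isin(well_id[test_b], well_id[train_b]).mean()
print(shared_a, shared_b)   # (a) shares wells across the split; (b) does not
```

Under split (a) nearly every test image shares a micro-well (and hence spatially correlated content) with some training image, whereas under split (b) none do, which is why (b) gives the more demanding assessment of sensor accuracy.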

Classification Using Dynamic Information

Each training sample captures the cumulative feature information of one micro-well up to a given time t. We experiment with different times to analyze how increasing information impacts classification accuracy. The total number of samples (micro-wells) available is 391. We randomly selected 80% of the micro-wells as the training set and left the remaining 20% as the test set; the random selection procedure is repeated five times to ensure reproducibility.

Figure 5: Response of LCs to DMMP (top 2 rows) and N2-water (bottom 2 rows) after t = 0, 3.3, 50, 100, 150, 200, 250 seconds.

Figure 6: Dynamic evolution of the average red (top), green (middle), and blue (bottom) intensities in the presence of DMMP (left) and N2-water (right). Each line represents a different sample.

For each micro-well, the features are the average RGB intensities of multiple images recorded up to time t. For example, for a response lasting 200 seconds, we use feature information from 61 images (collected every 3.3 seconds), and each image has three average RGB intensities. The total number of features used for a response up to time t = 3.3 seconds is 2, for t = 100 seconds it is 91, and for t = 200 seconds it is 183. For this strategy we use a linear SVM classifier. We do not use the other feature information (HOG, deep learning, and grayscale pixels) in this strategy because this would lead to an extremely large number of features. Instead, our objective with this strategy is to verify that the dynamic response of the LC indeed contains valuable information for classification. Moreover, the results obtained with this strategy serve as a baseline to benchmark the static strategies.

Figure 7 shows that the testing classification accuracy achieved after t = 3.3 seconds is only 78%, and after t = 200 seconds it reaches 97%. We also note that the training accuracy is 100% for t = 200 seconds, which indicates that the RGB feature information is sufficient to perfectly categorize the images. Table 1 shows that there is perceptible variability in the testing classification accuracy when using different training sets (ranging from 95 to 100%). This is because the dynamic responses of LCs to DMMP vary significantly from sample to sample, and thus the selection of the training set is important. We use the term DMMP Accuracy to denote the correct classification of DMMP presence and the term N2-Water Accuracy to denote the correct classification of water intrusion. In statistical terms, 1 - DMMP Accuracy is the false negative rate, while 1 - N2-Water Accuracy is the false positive rate. We have found that the DMMP Accuracy ranges from 90-100% while the N2-Water Accuracy is consistently 100% (indicating that water is more easily detectable than DMMP).
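The dynamic strategy above (cumulative average-RGB features up to time t, fed to a linear SVM) can be sketched as follows. This is an illustrative scikit-learn version on synthetic data, not the authors' Matlab implementation; the toy response generator and all names are assumptions.

```python
# Sketch of the dynamic classification strategy: each sample's features are
# the average RGB intensities of all images recorded up to time t,
# concatenated into one vector (61 images x 3 channels = 183 features).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def synthetic_rgb_series(n_samples, n_images, analyte):
    """Toy stand-in for per-image average RGB intensities (n_images x 3)."""
    base = 20.0 if analyte == "DMMP" else 40.0
    rate = 3.0 if analyte == "DMMP" else 1.0   # DMMP-like responses rise faster
    t = np.linspace(0.0, 1.0, n_images)
    curve = base * (1.0 - np.exp(-rate * t))
    # Noise mimics sample-to-sample variability in initial conditions.
    return curve[None, :, None] + rng.normal(0.0, 2.0, (n_samples, n_images, 3))

n_images = 61  # one frame every 3.3 s over ~200 s
X = np.concatenate([
    synthetic_rgb_series(100, n_images, "DMMP").reshape(100, -1),
    synthetic_rgb_series(100, n_images, "water").reshape(100, -1),
])
y = np.array([1] * 100 + [0] * 100)  # 1 = DMMP, 0 = N2-water

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LinearSVC(max_iter=5000).fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")
```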

Figure 7: Average classification accuracy (for test set) for dynamic and static strategies.

Table 1: Classification accuracy using RGB information (after 200 seconds) and a dynamic strategy.

Round   Training Accuracy   Test Accuracy   DMMP Accuracy   N2-Water Accuracy
1       100.00              97.56           95.12           100.00
2       100.00              96.34           92.68           100.00
3       100.00              95.12           90.24           100.00
4       100.00              97.56           95.12           100.00
5       100.00              100.00          100.00          100.00
Avg.    100.00              97.32           94.63           100.00

Classification Using Static Information

The results obtained with the dynamic strategy show that accuracies of 78% can be reached after just 3.3 seconds, indicating that a non-trivial amount of information is embedded in the early response of the LC that can be used to classify the samples more quickly. This insight is reinforced when applying the static classification strategies. In the static (a) strategy, we randomly selected 80% of the total image population as the training set and left the remaining 20% as the test set. Our base study uses data from features I, II, and III and a linear SVM classifier. The results are summarized in Table 2. The main finding is that the overall classification accuracy reaches levels of 99.95%. We also note that the training accuracy is 100%, which indicates that the feature information (I, II, and III) provided is sufficient to perfectly categorize the images. From Figure 7 we see that high prediction accuracies of 99% can be achieved using images captured after just 3.3 seconds of tracking the LC response. This has important practical implications because it suggests that classification can be achieved nearly instantaneously (without the need to run a lengthy experiment).

Table 2: Classification accuracy with static (a) strategy (data partitioned over entire population).

Round   Training Accuracy   Test Accuracy   DMMP Accuracy   N2-Water Accuracy
1       100.00              99.96           99.97           99.95
2       100.00              99.95           99.93           99.96
3       100.00              99.97           99.95           99.98
4       100.00              99.97           99.95           99.98
5       100.00              99.93           99.90           99.95
Avg.    100.00              99.95           99.93           99.96

Table 3: Classification accuracy with static (b) strategy (data partitioned by micro-wells).

Round   Training Accuracy   Test Accuracy   DMMP Accuracy   N2-Water Accuracy
1       99.48               95.39           95.97           94.95
2       99.41               94.79           95.72           94.20
3       99.42               96.36           97.40           95.63
4       99.52               92.51           94.74           90.92
5       99.49               96.02           94.96           96.76
Avg.    99.46               95.00           95.75           94.45

In the static (b) strategy, we randomly selected 80% of the micro-wells as training wells and used the remaining wells as test wells. The random selection process is repeated five times, and the results are summarized in Table 3. We can see that the training classification accuracy is also high (99.46%), providing additional evidence that features I, II, and III are highly informative. The test classification accuracy, however, reaches only 95% for images at t = 200 seconds and only 91% for images at t = 3.3 seconds (see Figure 7). The decreased accuracy is the result of using more correlated (and thus less informative) data. We can thus see that the proposed ML framework allows us to directly quantify the effects of using lower quality data in the learning (training) process. To emphasize this issue further, we also explored the effect of the number of training samples on classification accuracy. The images are also partitioned by micro-wells in these experiments (static (b)). Table 4 shows that an accuracy of 95% can be achieved if 80% of the data is used for training (312 micro-wells or 60,064 images). In comparison, if only 20% of the data (79 micro-wells or 15,017 images) is used for training, the accuracy drops to 87.47%. Our results are thus consistent.

Table 4: Effect of number of training samples on classification accuracy.

% Training Samples   Test Accuracy   DMMP Accuracy   N2-Water Accuracy
20                   87.47           87.15           87.68
40                   91.41           91.03           91.65
60                   93.08           93.41           92.85
80                   95.00           95.75           94.45
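The difference between the static (a) and static (b) training-set designs is exactly the difference between a random split over images and a grouped split over micro-wells. A minimal scikit-learn sketch (the well layout and image counts below are illustrative, not the paper's exact data):

```python
# static (a): random split over the entire image population.
# static (b): split by micro-well, so no well contributes images to both
# the training set and the test set (avoids spatially correlated leakage).
import numpy as np
from sklearn.model_selection import GroupShuffleSplit, ShuffleSplit

n_wells, images_per_well = 391, 192           # illustrative, ~75,000 images
wells = np.repeat(np.arange(n_wells), images_per_well)
X = np.arange(wells.size)                     # placeholder per-image index

# static (a): images assigned to train/test at random
train_a, test_a = next(ShuffleSplit(test_size=0.2, random_state=0).split(X))

# static (b): whole wells assigned to train/test
gss = GroupShuffleSplit(test_size=0.2, random_state=0)
train_b, test_b = next(gss.split(X, groups=wells))

# Under static (b), the training and test wells are disjoint by construction.
print(set(wells[train_b]).isdisjoint(wells[test_b]))  # True
```

Because images from the same well are highly correlated, the random split in static (a) lets near-duplicates of test images appear in training, which is consistent with its higher reported accuracy.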


Impact of Feature Information

In Table 5 we report the performance of the SVM classifier using different types of feature information. In these experiments we use the static (b) strategy, and 80% of the wells are used for training. We see that, while the number of features provided by IV (grayscale pixels) is high (3,600), the accuracy achieved with this feature alone is only 80.12%. This indicates that spatial pixel patterns alone are insufficient to achieve high accuracy. With a similar number of features (4,907), the combination of features I + III (average total RGB intensity and deep learning) reaches an accuracy of 93.56%. Combining these features with feature II (HOG) yields the maximum accuracy of 95%. We have also found that combining all of the features I-IV does not improve this performance, and that the use of average total RGB intensity alone provides very low accuracies (of around 60%). This last observation is important because average brightness has been widely used by experimental researchers to classify LC responses. The benefits of using HOG information are attributed to the fact that these features capture spatial patterns that develop early in the LC responses. Such patterns, however, are not sufficient to achieve high accuracies. We attribute the high accuracies obtained with AlexNet to the fact that this technique generates highly evolved features, such as textures, edges, and blobs. We note, however, that such evolved features do not have direct physical interpretations. The interpretation of such features is an interesting topic that will be explored in future work.

Table 5: Effect of feature information on classification accuracy.

Feature Info   Test Accuracy   DMMP Accuracy   N2-Water Accuracy
I + II         87.15           85.42           88.24
I + III        93.56           94.83           92.73
I + II + III   95.00           95.75           94.45
IV             80.12           76.68           82.40
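Extracting the HOG (feature II) and grayscale-pixel (feature IV) descriptors for one micro-well image can be sketched with scikit-image. The paper does not state its HOG parameters or image size; the values below are assumptions chosen so that a 60x60 image (matching the 3,600 grayscale-pixel count) yields 900 HOG features, as reported.

```python
# Hypothetical HOG parameterization: 10x10-pixel cells and 2x2-cell blocks
# on a 60x60 image give 6x6 cells, 5x5 sliding block positions, and
# 5*5*2*2*9 = 900 features -- consistent with the count in the paper.
import numpy as np
from skimage.feature import hog

image = np.random.default_rng(0).random((60, 60))  # stand-in grayscale image

hog_features = hog(image, orientations=9,
                   pixels_per_cell=(10, 10), cells_per_block=(2, 2))
pixel_features = image.ravel()  # feature IV: raw grayscale pixels

print(hog_features.size)    # 900
print(pixel_features.size)  # 3600
```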

Performance of Classification Techniques

We also explored the performance of different classification techniques. Specifically, we compare linear SVM (used in our previous results) against logistic classification, nonlinear SVM, and artificial neural networks (ANN). To achieve consistency, we only compare classification methods available in Matlab. In this study, images are partitioned by micro-wells (static (b) case) and 80% of the wells are used for training. For nonlinear SVM we used a Gaussian kernel, and for the ANN we designed the input layer to be the RGB (true color) image and the hidden layers to be a sequence of convolution2dLayer, maxPooling2dLayer, fullyConnectedLayer, reluLayer, and softmaxLayer. The times required to solve the classification problems with the different methods differ significantly: logistic classification takes about 10 minutes to train, linear SVM takes about 30 minutes, nonlinear SVM takes about 2 hours, and the ANN takes 2 days. We also highlight that classification predictions can be obtained in seconds. Table 6 summarizes the classification accuracies obtained with the different methods. We have found that linear SVM is the superior method, achieving classification accuracies above 95%. The performance of nonlinear SVM is close to that of linear SVM but requires significantly more time to train the model (due to the presence of nonlinear functions). The results indicate that capturing nonlinear effects is not beneficial and in fact slightly deteriorates performance (due to reduced numerical accuracy). Logistic regression only achieves accuracies of 92%, while the ANN only achieves 83% (training of the ANN was stopped after 2 days of computation). We attribute the superior performance of SVM over logistic regression to the fact that logistic regression is in effect an approximate scheme for SVM (such an approximation is often used to reduce computational complexity, as reflected in the solution times obtained in our experiments). We also highlight that improved classification performance might be obtained with the ANN by better tuning the network layout, but this would require significant computational effort.

Table 6: Performance of classification techniques on accuracy.

Method                    Test Accuracy   DMMP Accuracy   N2-Water Accuracy
Linear SVM                95.00           95.75           94.45
Nonlinear SVM             94.64           95.64           94.00
Logistic classification   92.99           93.98           92.35
ANN                       81.86           83.36           81.51
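The three non-neural methods compared above have direct scikit-learn counterparts; the sketch below runs the comparison on a synthetic dataset. This is illustrative only (the paper's study used Matlab and the LC image features), and the dataset parameters are arbitrary.

```python
# Illustrative comparison of linear SVM, Gaussian-kernel SVM, and logistic
# classification on a synthetic two-class problem.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=2000, n_features=50, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "linear SVM": LinearSVC(max_iter=5000),
    "nonlinear SVM (Gaussian kernel)": SVC(kernel="rbf"),
    "logistic classification": LogisticRegression(max_iter=5000),
}
accs = {}
for name, model in models.items():
    accs[name] = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {accs[name]:.3f}")
```

On real data, the ranking of these methods depends on the features; the paper's finding that the linear and Gaussian-kernel SVMs perform similarly suggests the LC features are already close to linearly separable.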

In Table 7 we compare the computational performance of different high-performance optimization solvers on a large linear SVM classification instance comprising 37,540 training samples and 4,997 features. These experiments are run on a multi-core computing server with 32 cores. As can be seen, the off-the-shelf interior-point solver Ipopt cannot solve the problem because it runs out of memory (the linear algebra system is too large to be handled all at once). The parallel Schur decomposition strategy implemented in PIPS-NLP bypasses the memory obstacle but requires more than twelve hours to solve the problem to a tolerance of 1 × 10−5 (due to dense linear algebra operations). The clustering-based preconditioner used in IPCluster reduces the solution time to 4.8 minutes and achieves the same tolerance of 1 × 10−5. Notably, the preconditioner only uses 1% of the training samples, indicating that high redundancy exists in the feature information. Another surprising result is that the SMO algorithm (tailored to SVM problems) can only reach a tolerance of 1 × 10−1 after an hour. Our results suggest that drastic reductions in computing time, together with improved solution accuracy, can be achieved with advanced optimization solvers. In future work we will explore the use of other advanced software libraries such as libSVM.

Table 7: Computational performance of optimization solvers on a large SVM instance.

Solver                                      Classification Time (sec)   Tolerance
Interior-Point (Ipopt)                      Out-of-Memory               -
Schur Decomposition (PIPS-NLP)              ≥43,200                     10−5
Sequential Minimal Optimization (SMO)       4,087                       10−1
Cluster-Based Preconditioning (IPCluster)   281                         10−5
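The optimization problem these solvers target is SVM training. In its standard soft-margin primal form (a textbook formulation, not copied from the paper), the problem for training pairs $(x_i, y_i)$, $y_i \in \{-1, +1\}$, is the quadratic program

```latex
\min_{w,\, b,\, \xi} \;\; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i
\qquad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, n,
```

where $C > 0$ trades margin width against training error. With $n = 37{,}540$ samples and 4,997 features, the interior-point linear algebra becomes large, which is what motivates the Schur decomposition and clustering-based preconditioning strategies compared in Table 7.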

Conclusions

We presented a computational framework that uses ML techniques to optimize the selectivity and speed of LC-based chemical sensors. The proposed framework can correctly classify up to 99% of images captured during the response of LCs to the presence of DMMP at 10 ppmv, in contrast to the response in the presence of a nitrogen stream containing 30% RH. The framework is also used to show that standard feature information used in the classification of LC responses (such as average brightness) can only achieve classification accuracies of 60%. It is also demonstrated that the framework can correctly classify images collected after short LC response times of around three seconds, thus enabling fast sensing capabilities. The results demonstrate the potential utility of ML-based methods for distinguishing chemical analytes that trigger anchoring transitions of LCs that are indistinguishable to human observers (or to a photodetector that measures average brightness). As part of future work, we will investigate strategies to classify the responses of multiple analytes, as reported in [22]. In addition, we will investigate the use of space-time statistics to extract feature information from complex response patterns of LCs. Direct classification using deep learning techniques with larger amounts of experimental data will also be explored.

Acknowledgements

The work of Y. Cao and V.M. Zavala was supported by the U.S. Department of Energy under grant DE-SC0014114. The work of H. Yu and N.L. Abbott was partially supported by the Army Research Office (W911NF-14-1-0140) and the National Science Foundation (DMREF grant DMR-1435195). Facilities of the Wisconsin MRSEC were used in this work (DMR-1121288).

REFERENCES

[1] C. K. Ho, A. Robinson, D. R. Miller, and M. J. Davis, "Overview of sensors and needs for environmental monitoring," Sensors, vol. 5, no. 1, pp. 4-37, 2005.

[2] K.-L. Yang, K. Cadwell, and N. L. Abbott, "Contact printing of metal ions onto carboxylate-terminated self-assembled monolayers," Advanced Materials, vol. 15, no. 21, pp. 1819-1823, 2003.

[3] K.-L. Yang, K. Cadwell, and N. L. Abbott, "Mechanistic study of the anchoring behavior of liquid crystals supported on metal salts and their orientational responses to dimethyl methylphosphonate," The Journal of Physical Chemistry B, vol. 108, no. 52, pp. 20180-20186, 2004.

[4] K. D. Cadwell, M. E. Alf, and N. L. Abbott, "Infrared spectroscopy of competitive interactions between liquid crystals, metal salts, and dimethyl methylphosphonate at surfaces," The Journal of Physical Chemistry B, vol. 110, no. 51, pp. 26081-26088, 2006.

[5] M. L. Bungabong, P. B. Ong, and K.-L. Yang, "Using copper perchlorate doped liquid crystals for the detection of organophosphonate vapor," Sensors and Actuators B: Chemical, vol. 148, no. 2, pp. 420-426, 2010.

[6] S. K. Pal, C. Acevedo-Velez, J. T. Hunter, and N. L. Abbott, "Effects of divalent ligand interactions on surface-induced ordering of liquid crystals," Chemistry of Materials, vol. 22, no. 19, pp. 5474-5482, 2010.

[7] S. E. Robinson, B. A. Grinwald, L. L. Bremer, K. A. Kupcho, B. R. Acharya, and P. D. Owens, "A liquid crystal-based passive badge for personal monitoring of exposure to hydrogen sulfide," Journal of Occupational and Environmental Hygiene, vol. 11, no. 11, pp. 741-750, 2014.

[8] J. T. Hunter and N. L. Abbott, "Adsorbate-induced anchoring transitions of liquid crystals on surfaces presenting metal salts with mixed anions," ACS Applied Materials & Interfaces, vol. 6, no. 4, pp. 2362-2369, 2014.

[9] T. Szilvási, L. T. Roling, H. Yu, P. Rai, S. Choi, R. J. Twieg, M. Mavrikakis, and N. L. Abbott, "Design of chemoresponsive liquid crystals through integration of computational chemistry and experimental studies," Chemistry of Materials, vol. 29, no. 8, pp. 3563-3571, 2017.

[10] H. Yu, T. Szilvási, P. Rai, R. J. Twieg, M. Mavrikakis, and N. L. Abbott, "Computational chemistry-guided design of selective chemoresponsive liquid crystals using pyridine and pyrimidine functional groups," Advanced Functional Materials, vol. 28, no. 13, p. 1703581, 2018. doi: 10.1002/adfm.201703581.

[11] L. T. Roling, J. Scaranto, J. A. Herron, H. Yu, S. Choi, N. L. Abbott, and M. Mavrikakis, "Towards first-principles molecular design of liquid crystal-based chemoresponsive systems," Nature Communications, vol. 7, p. 13338, 2016.

[12] D. S. Miller, R. J. Carlton, P. C. Mushenheim, and N. L. Abbott, "Introduction to optical methods for characterizing liquid crystals at interfaces," Langmuir, vol. 29, no. 10, pp. 3154-3169, 2013.

[13] A. Hassanzadeh and R. G. Lindquist, "Liquid crystal sensor microchip," IEEE Sensors Journal, vol. 12, no. 5, pp. 1536-1544, 2012.

[14] R. R. Shah and N. L. Abbott, "Orientational transitions of liquid crystals driven by binding of organoamines to carboxylic acids presented at surfaces with nanometer-scale topography," Langmuir, vol. 19, no. 2, pp. 275-284, 2003.

[15] J. T. Hunter and N. L. Abbott, "Dynamics of the chemo-optical response of supported films of nematic liquid crystals," Sensors and Actuators B: Chemical, vol. 183, pp. 71-80, 2013.

[16] A. Esteva, B. Kuprel, R. A. Novoa, J. Ko, S. M. Swetter, H. M. Blau, and S. Thrun, "Dermatologist-level classification of skin cancer with deep neural networks," Nature, vol. 542, no. 7639, pp. 115-118, 2017.

[17] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, no. 3, p. 27, 2011.

[18] P. Bradley and O. Mangasarian, "Massive data discrimination via linear support vector machines," Optimization Methods and Software, vol. 13, no. 1, pp. 1-10, 2000.

[19] M. C. Ferris and T. S. Munson, "Interior-point methods for massive support vector machines," SIAM Journal on Optimization, vol. 13, no. 3, pp. 783-804, 2002.

[20] Y. Cao, C. D. Laird, and V. M. Zavala, "Clustering-based preconditioning for stochastic programs," Computational Optimization and Applications, vol. 64, no. 2, pp. 379-406, 2016.

[21] J. Kang, N. Chiang, C. D. Laird, and V. M. Zavala, "Nonlinear programming strategies on high-performance computers," in Decision and Control (CDC), 2015 IEEE 54th Annual Conference on, pp. 4612-4620, IEEE, 2015.

[22] M. S. Wiederoder, E. C. Nallon, M. Weiss, S. K. McGraw, V. P. Schnee, C. J. Bright, M. P. Polcha, R. Paffenroth, and J. R. Uzarski, "Graphene nanoplatelet-polymer chemiresistive sensor arrays for the detection and discrimination of chemical warfare agent simulants," ACS Sensors, vol. 2, no. 11, pp. 1669-1678, 2017. doi: 10.1021/acssensors.7b00550.
