
Article pubs.acs.org/Langmuir

Surface Adsorbed Antibody Characterization Using ToF-SIMS with Principal Component Analysis and Artificial Neural Networks

Nicholas G. Welch,†,‡ Robert M. T. Madiona,†,‡ Thomas B. Payten,† Robert T. Jones,† Narelle Brack,† Benjamin W. Muir,‡ and Paul J. Pigram*,†

† Centre for Materials and Surface Science and Department of Chemistry and Physics, School of Molecular Sciences, La Trobe University, Melbourne, VIC 3086, Australia
‡ CSIRO Manufacturing, Clayton, VIC 3168, Australia

Supporting Information

ABSTRACT: Artificial neural networks (ANNs) form a class of powerful multivariate analysis techniques, yet their routine use in the surface analysis community is limited. Principal component analysis (PCA) is more commonly employed to reduce the dimensionality of large data sets and highlight key characteristics. Herein, we discuss the strengths and weaknesses of PCA and ANNs as methods for investigation and interpretation of a complex multivariate sample set. Using time-of-flight secondary ion mass spectrometry (ToF-SIMS) we acquired spectra from an antibody and its proteolysis fragments with three primary-ion sources to obtain a panel of 72 spectra and a characteristic peak list of 775 fragment ions. We describe the use of ANNs as a means to interpret the ToF-SIMS spectral data, highlight the optimal neural network design and computational parameters, and discuss the technique limitations. Further, employing Bi3+ as the primary-ion source, ANNs can accurately classify antibody fragments from the parent antibody based on ToF-SIMS spectra.



INTRODUCTION

Principal component analysis (PCA) is a widely employed multivariate analysis technique, utilized in a range of disciplines to ask compelling questions of complex data. PCA describes and classifies variance in a data set in terms of independent linear combinations of the variables making up that set. These linear combinations are known as “principal components” (PCs) and comprise a positive or negative contribution from each variable. The first principal component (often noted as PC1) captures the greatest variance. Subsequent principal components (PC2, PC3, and so on) capture variance in descending order. Each PC is uncorrelated with (orthogonal to) all the others. On analysis, PCA produces three new matrices: the scores, the loadings, and the residuals. The scores show the relationship of the sample set with the PCs, the loadings show how the variables contribute to the PCs, and the residuals represent random noise in the set. PCA is employed to reduce the dimensionality of large data sets and highlight key characteristics, representing the data set with a much reduced number of PCs in comparison with the number of variables. Graham et al. provide an excellent review4 of PCA and its use in a range of scenarios, particularly in ToF-SIMS. PCA has also been used extensively to investigate protein adsorption.5−8
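As an illustration of the three matrices named above, the following is a minimal sketch of PCA computed through a singular value decomposition. It is not the PLS_Toolbox implementation; the function name `pca_svd` and the random 72 × 775 matrix are assumptions introduced purely for demonstration.

```python
import numpy as np

def pca_svd(X, n_components):
    """Mean-centre X (samples x variables) and decompose it with an SVD.

    Returns scores, loadings, residuals and the fraction of variance
    captured by each retained principal component.
    """
    Xc = X - X.mean(axis=0)                          # mean-centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # samples in PC space
    loadings = Vt[:n_components].T                   # variable contributions to each PC
    residuals = Xc - scores @ loadings.T             # what the retained PCs do not explain
    explained = s**2 / np.sum(s**2)                  # sorted in descending order by the SVD
    return scores, loadings, residuals, explained[:n_components]

# Synthetic stand-in with the same shape as the matrix described in the abstract
# (72 spectra x 775 peaks); the values are random, not measured data.
rng = np.random.default_rng(0)
X = rng.random((72, 775))
scores, loadings, residuals, explained = pca_svd(X, n_components=4)
print(scores.shape, loadings.shape, residuals.shape)
```

Because the singular values are returned in descending order, PC1 always captures the most variance, and the orthonormal rows of `Vt` give the mutually orthogonal loadings described in the text.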

Time-of-flight secondary ion mass spectrometry (ToF-SIMS) provides valuable insight into the structure, elemental and chemical composition of surface chemical species.1 In contemporary ToF-SIMS instruments, the sample surface is typically bombarded with short pulses of energetic ions or ion clusters. Bismuth and its clusters are primary-ions in common use.2 The primary-ion pulses deliver energy to the surface and disrupt its structure, producing positively charged, negatively charged, and neutral fragments and radicals representative of the surface chemical species.3 The secondary ions emerging from the outermost layers are extracted, mass sorted by a time-of-flight mass analyzer and their populations counted. The resulting spectra are information-rich, comprising thousands of peaks, each corresponding to discrete molecular species and fragments. ToF-SIMS is intrinsically multivariate, as components of a sample may be defined by a single molecular species or by multiple species, and vice versa, where one species may be produced from multiple components of the sample. For example, the ToF-SIMS spectrum from a hydrocarbon-containing polymer substrate with a bound protein will contain a CH2+ mass fragment attributed to both polymer and protein components. Thus, to distinguish between samples exhibiting similar chemical properties, the use of multivariate techniques has become common.

© 2016 American Chemical Society

Received: June 21, 2016 Revised: August 1, 2016 Published: August 5, 2016

DOI: 10.1021/acs.langmuir.6b02312 Langmuir 2016, 32, 8717−8728


Figure 1. Visual representation of ANNs and associated toroidal topology. (a) 2D plot of the resulting map showing neurons (individual cells, hexagons), component planes (that contain individual variable information), and output class planes (one for each individual class). (b) Toroidal geometry employed for topological mapping.

Artificial neural networks (ANNs) form another class of powerful multivariate analysis techniques that is much less widely employed in the investigation and interpretation of ToF-SIMS data. The strength of ANNs is their ability to produce a 2D plot of multivariate space and group data based on similarities in variables. There are unsupervised ANNs, popularized by Kohonen9−11 and his self-organizing maps (SOMs), and supervised ANNs (such as counter propagation ANNs, CP-ANNs) that allow an extra layer of information to be added to the ANN to assist with minimizing computation requirements and posing more defined questions. Brereton12 has presented an excellent review with nonmathematical descriptions of ANNs. The ANN 2D plot presented after computation is defined by a series of component planes, with each plane corresponding to a single input variable. The final map typically has the topology of a sphere or toroid, though toroidal topology is generally preferred as the same number of neighbors exists for all points.13 The map is continuous, with the upper row connecting to the bottom row, and similarly, the left column is adjacent to the right column, wrapping around the toroid. This continuous map allows each neuron to have the same number of neighbors, thereby generating a more accurate model. A neuron in the final (or resulting) map is the sum of the neurons in the component planes behind that map (Figure 1). Initially, the neurons in the component planes are assigned values (known as weights) randomly between 0 and 1. A sample (xi) is then randomly selected and presented to the network. The neuron most similar to the sample, based on the Euclidean distance, is called the “winning neuron” or best matching unit (BMU). The weights of this neuron (wr) and the surrounding neurons (scaled by topological distance, dr) are changed (Δwr) as a function of the difference between their values and the values of the sample.
This is done either using sequential training, in which one sample is presented to the network and the weights are updated based on the winner neuron, or batch training, in which the whole set of the samples is presented to the network, the winner neurons are found, and then the weight maps updated with the effect of all samples. The batch method is typically computationally faster, because the weights are less frequently updated.
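The sequential variant can be sketched as follows: one sample is presented, the BMU is found by Euclidean distance, and the winner and its neighbors on the wrapped map are moved toward the sample using the neighborhood-scaled update rule stated in the text. This is a minimal sketch under stated assumptions, not the Kohonen and CP-ANN Toolbox code: the function names are hypothetical, the grid is treated as square rather than hexagonal (so neighbor rings are counted differently), and neurons farther than `d_max` from the winner are assumed to be left unchanged.

```python
import numpy as np

def toroidal_distance(shape, winner):
    """Ring distance of every neuron from the winner on a wrapped square grid."""
    rows, cols = shape
    r = np.abs(np.arange(rows) - winner[0])
    c = np.abs(np.arange(cols) - winner[1])
    r = np.minimum(r, rows - r)   # top row wraps onto bottom row
    c = np.minimum(c, cols - c)   # left column wraps onto right column
    # Assumption: square-grid rings (Chebyshev distance), not hexagonal rings.
    return np.maximum(r[:, None], c[None, :])

def sequential_update(weights, x, eta, d_max):
    """Present one sample x and update the weight map in place.

    weights has shape (rows, cols, n_variables). Returns the BMU position.
    """
    rows, cols, _ = weights.shape
    dist = np.linalg.norm(weights - x, axis=2)            # Euclidean distance to x
    winner = np.unravel_index(np.argmin(dist), (rows, cols))
    d = toroidal_distance((rows, cols), winner)
    # Neighbourhood factor 1 - d / (d_max + 1); zero outside the neighbourhood.
    factor = np.where(d <= d_max, 1.0 - d / (d_max + 1.0), 0.0)
    weights += eta * factor[:, :, None] * (x - weights)
    return winner

rng = np.random.default_rng(0)
weights = rng.random((10, 10, 775))   # weights initialised randomly between 0 and 1
sample = rng.random(775)              # synthetic stand-in for one normalised spectrum
winner = sequential_update(weights, sample, eta=0.5, d_max=3)
```

In batch training the same neighborhood factor would be accumulated over all samples before the weights are changed once per epoch, which is why fewer weight updates are needed.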

The change of the neuron weights (Δwr) can be determined from

$$\Delta w_r = \eta \left(1 - \frac{d_r}{d_{\max} + 1}\right)\left(x_i - w_r^{\mathrm{old}}\right)$$

where η is the learning rate and dmax is the size of the considered neighborhood, which decreases during the training phase as the number of computational iterations increases.13−15 CP-ANNs feature the addition of output planes to a traditional ANN to allow for classification. The number of classification planes (weights) is identical to the number of user-defined classes to be modeled. The output layer is modified according to the same algorithm shown above. At the end of the network training, each neuron is assigned a class on the basis of the output weights and, subsequently, the samples in that neuron are assigned that class.

ANNs have been used successfully in chemometrics since the 1970s as a method, for example, to interpret chromatography signals and to deconvolute spectra.12 Since then, the dramatic increase in computing power and the emergence of user-friendly interfaces have greatly improved access to the technique within both the chemometrics and analytical communities, significantly increasing the uptake of ANNs as a multivariate analysis approach. Computational environments such as MATLAB (The MathWorks Inc.) have become an engine to host the “front-end” graphical user interface and toolboxes, such as the Kohonen and CP-ANN Toolbox by Milano Chemometrics.14,15 Thus, users can now explore their complex data sets without a specialist knowledge of mathematics or chemometrics code.

While ANNs have been used recently for the spectral analysis of elements16 and FTIR spectroscopy of microorganisms,17 there are few examples of their application to ToF-SIMS.6,18 NeuroSpectraNet was an ANN mechanism for sorting and classifying samples based on their static SIMS spectra. NeuroSpectraNet was able to determine differences between a variety of proteins and determine the significant ions representative of each protein.6 ANNs have also found application in the conceptually adjacent field of surface enhanced laser desorption ionization (SELDI),19−21 where important cancer biomarkers have been assessed. There is a


significant opportunity for ANNs to be widely applied to ToF-SIMS analysis, if optimal neural network and computational parameters can be established for utilization with modern computing hardware. Protein analysis using ToF-SIMS is particularly difficult, as there are a finite number of elements (C, H, N, O) combined in a highly specific manner to produce a limited number of amino acids and thus unique mass peaks.22 Moreover, the secondary ions observed may be formed by the ionization of more than one amino acid (e.g., CH2, NH2, etc.). In the case of a 150 kDa antibody (Immunoglobulin G, IgG), the structure is mirrored about the central axis, giving 75 kDa of recurring protein sequences. Hence, even when the antibody is dissected into its proteolytic fragments, the number of unique peaks particular to a given fragment is very small relative to the total number of peaks. However, as ToF-SIMS is highly surface sensitive, and the populations of the secondary ions are dependent on neighboring protein chains, we believe ANNs can resolve differences even in singular secondary ion variations. In this work, we describe the systematic analysis of an antibody (IgG) and its proteolytic fragments to exemplify the power of ToF-SIMS and ANNs. We compare PCA and ANNs in detail and discuss the important user-defined factors in building ANNs from a MATLAB toolbox.

Figure 2. Representation of the anti-EGFR IgG antibody and its proteolytic fragments used in this work.

EXPERIMENTAL SECTION

Preparation and Purification of Fab and Fc Fragments. An amount of 0.24 mg of papain (Sigma-Aldrich, activated with DTT immediately prior to use) was added to 11.8 mg of anti-epidermal growth factor receptor (anti-EGFR23) IgG antibody, and the digestion allowed to proceed for 16 h at 37 °C before being stopped with the addition of iodoacetamide. The reaction mixture was applied to a 1 mL Protein A FF column (GE Healthcare). The unbound fraction containing the Fab fragment was subjected to gel filtration (Superdex S200 10−30, GE Healthcare) in tris-buffered saline (TBS, 150 mM NaCl, pH 8.0) while the bound fraction containing the Fc fragment was eluted with 0.1 M citrate, 0.15 M NaCl, pH 3.0 and again subjected to gel filtration polishing.

Preparation and Purification of Deglycosylated Antibody. The anti-EGFR IgG antibody (4.7 mg) was digested with 0.25 mg of PNGase F24 at 37 °C for 48 h, with addition of a further 0.02 mg of PNGase F after 24 h. The reaction mixture was purified by gel filtration as above.

Preparation and Purification of F(ab′)2, Fab′, and Fab′-IAA Fragments. The anti-EGFR was digested with pepsin (50 μg per mg antibody) in 0.1 M citrate (pH 3.6) for 45 min at 37 °C before the F(ab′)2 fragment was isolated by gel filtration. The F(ab′)2 fragment was reduced to Fab′ with TCEP and a portion put aside. The remainder was blocked by reaction with iodoacetamide, forming the Fab′-IAA, before gel filtration purification as above.

Silicon Wafer Preparation and Protein Adsorption. Silicon wafers were cut into 1 cm × 1 cm squares and sonicated for 30 min in a solution of 2% RBS, 2% ethanol, and Milli-Q water (18.2 MΩ·cm). Wafers were rinsed first with ethanol and then with Milli-Q water, and dried with nitrogen gas. Antibody and antibody fragments were prepared in separate solutions of TBS (150 mM NaCl, pH 8.0) at 7 nM. Volumes of 400 μL of protein solutions were added to each well containing a silicon wafer and incubated at room temperature for 1 h. Samples were further rinsed with TBS and 0.05% Tween 20 (removing unbound protein) and Milli-Q water to remove excess protein and salt.

Time-of-Flight Secondary Ion Mass Spectrometry (ToF-SIMS). SIMS data were acquired using a TOF.SIMS 5 instrument (ION-TOF GmbH, Münster, Germany) equipped with a Bi/Mn liquid-metal ion source operating at 30 kV in pulsed, bunched mode. All spectra were collected in positive polarity from a 100 μm × 100 μm region under static conditions (total ion dose < 1 × 10¹² ions cm⁻²) using three primary-ions: Mn+, Bi+, and Bi3+. Spectra were acquired from three locations for each primary-ion. Charge compensation was achieved by flooding the sample surface with low-energy electrons, resulting in a mass resolution (m/Δm) for the C2H3+ (27.023 u) peak of greater than 7100 for Bi+, 6600 for Bi3+, and 7000 for Mn+. Data were acquired for 150 μs between primary-ion pulses, providing an accessible mass range of up to 2000 u. The mass scale was calibrated using peaks assigned to the C+, CH+, CH2+, CH3+, C2H3+, C3H3+, C4H3+, C5H3+, C6H5+, and C7H7+ ions. The pressure in the analysis chamber during data acquisition was 2 × 10⁻⁹ mbar. Peak lists used for multivariate analysis were generated using the SurfaceLab 6 software (ION-TOF GmbH, Münster, Germany; version 6.5).

Multivariate Analysis (MVA). Multivariate analysis (MVA) was carried out using principal component analysis (PCA) and artificial neural networks (ANN). For MVA, a list of 782 discrete mass spectral peaks, each with maximum intensity greater than 10², was selected without binning over the m/z range 1−300. Peak areas were normalized by total ion intensity per spectrum to take account of differences in beam current with each primary-ion source. Seven peaks (Si+, K+, SiH+, SiOH+, Na+, Al+, Cs+) were removed from the peak list to minimize data skewing due to their strong intensities. The resulting matrix consisted of 72 samples (3 spectra per source ion per sample) with 775 mass spectral peaks. A smaller matrix was prepared consisting of 35 known mass spectral peaks attributed to amino acids. Peak areas were normalized within their set of mass spectral peaks and mean-centered in PLS_Toolbox before each data analysis to mitigate systematic differences between samples. For PCA, no cross-validation


Figure 3. Principal component analysis of antibody fragments on silicon wafers PC 1 vs PC 2 as labeled by either protein fragment (left) or primary-ion source (right). ToF-SIMS peak list of 775 mass fragments was utilized. Ellipses show a 95% confidence level.

methods were applied and 4 principal components were selected. For ANNs, two types of models were utilized: an unsupervised Kohonen network (UKN) and a counter propagation artificial neural network (CP-ANN). Hexagonal networks were constructed from map sizes 4 × 4, 6 × 6, 8 × 8, 10 × 10, and 16 × 16 neurons. Up to 100 000 epochs were employed for network training using toroidal boundary conditions and batch training. PCA and ANNs were conducted using PLS_Toolbox (version 8.1) (Eigenvector Research, Manson, WA) and the Kohonen and CP-ANN Toolbox (version 3.8) (Milano Chemometrics and QSAR Research Group, Milano, Italy), respectively, utilizing MATLAB R2015b (version 8.6) (The MathWorks Inc.). ANN calculations were undertaken on standard Dell desktop computers, featuring Intel Xeon 8 Core processors and 16 GB of RAM, under Windows 7 (64 bit). Calculation times varied greatly depending on the size of the network (number of neurons) and the number of epochs (iterations). The calculations for 10 × 10 NNs and 100 000 epochs typically required less than 8 h to complete.
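The preprocessing described in the Multivariate Analysis section (removal of the seven dominant peaks, normalization, and mean-centering) can be sketched as below. This is a minimal sketch rather than the PLS_Toolbox workflow: the function name `preprocess`, the placeholder peak labels, and the random intensities are assumptions, and "normalized within their set of mass spectral peaks" is interpreted here as dividing each spectrum by the summed area of the retained peaks.

```python
import numpy as np

# The seven peaks named in the text as removed from the 782-peak list.
EXCLUDED = {"Si+", "K+", "SiH+", "SiOH+", "Na+", "Al+", "Cs+"}

def preprocess(peak_areas, peak_names, excluded=EXCLUDED):
    """Drop excluded peaks, normalise each spectrum, then mean-centre each peak."""
    keep = np.array([name not in excluded for name in peak_names])
    X = peak_areas[:, keep]                    # remove strongly skewing peaks
    X = X / X.sum(axis=1, keepdims=True)       # each spectrum sums to 1 (assumed scheme)
    X = X - X.mean(axis=0)                     # mean-centre every variable
    kept_names = [n for n, k in zip(peak_names, keep) if k]
    return X, kept_names

# Synthetic stand-in: 72 spectra x 782 peaks with placeholder labels.
rng = np.random.default_rng(1)
names = sorted(EXCLUDED) + [f"peak_{i}" for i in range(775)]
areas = rng.random((72, 782)) + 1.0
X, kept = preprocess(areas, names)
print(X.shape)
```

Normalizing before mean-centering matters: it removes differences in total signal between primary-ion sources first, so that the centered matrix reflects relative peak intensities rather than beam current.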

RESULTS AND DISCUSSION

Silicon wafers were incubated in solutions of the whole anti-EGFR IgG antibody, the deglycosylated whole antibody, and proteolytic fragments of the whole antibody: F(ab′)2, Fab′, Fab′-IAA (with iodoacetamide blocking), Fab, and Fc (Figure 2). As each of the seven proteins was produced from the same whole antibody, not only do they contain the same 20 (overall) amino acids, they also have peptide sequences that are identical to the whole IgG. In the case of the Fab, Fab′, and Fab′-IAA, these are each very close to 50 kDa, with very few chemical species representative of their differences. ToF-SIMS data were collected with three primary-ion sources, Bi3+, Bi+, and Mn+, to investigate the benefits of combining different fragmentation patterns in multivariate analysis. In addition, this was to provide objective information on the analytical capabilities of heavier primary-ions, in particular with antibody systems and training of artificial neural networks.

Principal Component Analysis (PCA). PCA was used to interrogate the 72 sample by 775 fragment ion variable matrix to reveal the first four principal components (PCs). PC1 (which encompasses 53.7% of the variance in the data set) was plotted against PC2 (29.2%) and the samples labeled by the protein fragment or the primary-ion used for analysis (Figure 3). It is immediately evident that PC1 separates the samples based on the primary-ion in ascending order of the mass of the ion, i.e., positive loading for the lighter Mn+ ion and negative loading for the heavier Bi3+. As a comprehensive nonbiased input matrix was implemented, this is as expected, as heavier primary-ion sources are known to make heavier mass fragment populations more pronounced.2,25 PC2 identifies the Fc fragment as the most positively loading, regardless of primary-ion, and groups the F(ab′)2, Fab′-IAA, and Fab′ next. The remaining four fragments are closely spaced and at times overlapping. This is not surprising as the Fc fragment has a different protein structure compared to the other more closely related fragments. In the case of F(ab′)2, Fab′-IAA and Fab′, each contains a sulfur group that is perhaps more available to analysis with ToF-SIMS than the native internal disulfide present in the IgG, IgG(De) and Fab. Considering the four fragments specifically, the most negatively loading are the Bi+ samples, followed by Mn+ and Bi3+, indicating that PC2 is not primary-ion dependent. This result is expected, as it is a condition of PCA that PC2 is orthogonal to PC1. PC3 plotted against PC4 (Figure S1) separated the control samples from the protein fragments; however, further interpretation of these data was complicated by the use of multiple primary-ions. The loadings plots for the first four PCs are shown in Figure S2.

As the Bi3+ primary-ion alone gave good separation between different protein fragments in Figure 3, individual PCA was performed on the samples collected with Bi3+ alone. In the new PCA, PC1 (92.8%) was plotted against PC2 (4.0%) and yielded excellent separation between all but the IgG and IgG(De) samples (Figure 4). The clustering of samples across this new PC1 was very similar to the PC2 of the whole data set. It can be argued that the interpretation of the data set in PCA is largely dictated by the researcher’s ability to identify similarities or ask specific questions of the data based on assumptions or prior knowledge to gain further detailed information. Analyzing a mass fragment (variable) list of over 700 peaks is a time-consuming exercise, and simply identifying the top contributors in each PC list is generally not sufficient to understand the whole data set and at times may be misleading. This is particularly evident at the higher PCs where the component may be defined by relatively minor contributions from many hundreds of peaks.

Amino Acid PCA. For comparison, we prepared a reduced peak list comprising only known mass peaks attributed to amino acids. The final list of 35 peaks was compiled from multiple sources,26−29 and only peaks present in our whole


Figure 4. Principal component analysis of antibody fragments on silicon wafers PC 1 vs PC 2 as labeled by protein fragment. ToF-SIMS peak list of 775 mass fragments was utilized. Data collected with Bi3+.

Figure 6. Principal component analysis of antibody fragments on silicon wafers PC 1 vs PC 2 as labeled by protein fragment. ToF-SIMS peak list of 35 amino acid mass fragments was utilized. Data collected with Bi3+.

matrix were utilized (see Table S1). The remaining discarded amino acid peaks did not have sufficient ion intensity (greater than 10² counts) across all samples to be selected in the original matrix. PCA was used to interrogate this new 72 sample by 35 fragment ion variable matrix to reveal the first four principal components. PC1 (73.2%) was plotted against PC2 (17.3%) and showed good separation between the primary-ion sources (Figure 5). Unlike the 775 mass peak list, this 35 mass peak list required two principal components to separate the primary-ions from one another. PC2 displays good separation of the protein fragments, particularly for Bi3+. PC3 (5.2%) and PC4 (1.5%) appeared not to contribute any further information (Figure S3). The loadings plots for all amino acid PCs can be found in Figure S4. In order to investigate the trend in the scores plot and the amino acid distribution, we further simplified the system to only contain samples analyzed with Bi3+. Using this set, PC1 (54.6%) was plotted against PC2 (30.9%), as shown in Figure 6. Unsurprisingly, the new PC plot is very similar to the Bi3+ component of the all primary-ion set (Figure 5). PC1 accounted for the presence and absence of protein, and PC2 showed good separation for the protein fragments. Interestingly, there was significant overlap for the closely related Fab and Fab′-IAA fragments but not the Fab′. The Fab′ was most strongly loading in PC2 and the loadings plot, shown in Figure 7, included amino acid fragments C3H6N+ and

Figure 7. Loadings plot for PC2 of Figure 6. ToF-SIMS peak list contained 35 amino acid mass fragments. Loadings are ordered in increasing mass to charge ratio.

Figure 5. Principal component analysis of antibody fragments on silicon wafers PC 1 vs PC 2 as labeled by either protein fragment (left) or primary-ion source (right). ToF-SIMS peak list of 35 amino acid mass fragments was utilized. Ellipses show a 95% confidence level.


Table 1. Amino Acid Composition of F(ab′)2 and Fc Fragments from the Anti-EGFR Antibody and Their Corresponding F(ab′)2/Fc Ratio

                     F(ab′)2                           Fc                                ratio of F(ab′)2/Fc
amino acid           no. of groups  composition (%)    no. of groups  composition (%)    composition
alanine (A)          38             4.19               8              1.91               2.19
arginine (R)         34             3.74               14             3.35               1.12
asparagine (N)       34             3.74               24             5.74               0.65
aspartate (D)        40             4.41               24             5.74               0.77
cysteine (C)         26             2.86               8              1.91               1.50
glutamate (E)        32             3.52               30             7.18               0.49
glutamine (Q)        34             3.74               14             3.35               1.12
glycine (G)          72             7.93               14             3.35               2.37
histidine (H)        20             2.20               14             3.35               0.66
isoleucine (I)       36             3.96               18             4.31               0.92
leucine (L)          62             6.83               24             5.74               1.19
lysine (K)           48             5.29               32             7.66               0.69
methionine (M)       10             1.10               12             2.87               0.38
phenylalanine (F)    30             3.30               14             3.35               0.99
proline (P)          64             7.05               30             7.18               0.98
serine (S)           124            13.66              36             8.61               1.59
threonine (T)        86             9.47               30             7.18               1.32
tryptophan (W)       18             1.98               8              1.91               1.04
tyrosine (Y)         40             4.41               14             3.35               1.32
valine (V)           60             6.61               50             11.96              0.55
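The composition percentages and F(ab′)2/Fc ratios in Table 1 follow directly from the residue counts, as the short sketch below illustrates. The counts are transcribed from the table; the function name `composition_ratios` is hypothetical, the totals (908 and 418 residues) are sums of the tabulated counts rather than values stated in the text, and the ratio is assumed to be taken from unrounded percentages.

```python
# Residue counts transcribed from Table 1 as (F(ab')2, Fc), keyed by one-letter code.
COUNTS = {
    "A": (38, 8),   "R": (34, 14),  "N": (34, 24), "D": (40, 24), "C": (26, 8),
    "E": (32, 30),  "Q": (34, 14),  "G": (72, 14), "H": (20, 14), "I": (36, 18),
    "L": (62, 24),  "K": (48, 32),  "M": (10, 12), "F": (30, 14), "P": (64, 30),
    "S": (124, 36), "T": (86, 30),  "W": (18, 8),  "Y": (40, 14), "V": (60, 50),
}

def composition_ratios(counts):
    """Return the two residue totals and, per amino acid,
    (F(ab')2 %, Fc %, ratio of the two percentages)."""
    total_fab2 = sum(a for a, _ in counts.values())
    total_fc = sum(b for _, b in counts.values())
    table = {}
    for aa, (n_fab2, n_fc) in counts.items():
        pct_fab2 = 100.0 * n_fab2 / total_fab2
        pct_fc = 100.0 * n_fc / total_fc
        table[aa] = (pct_fab2, pct_fc, pct_fab2 / pct_fc)
    return total_fab2, total_fc, table

total_fab2, total_fc, table = composition_ratios(COUNTS)
for aa in ("Y", "M", "P", "R"):   # residues behind the peak ratios discussed in the text
    pct_fab2, pct_fc, ratio = table[aa]
    print(f"{aa}: {pct_fab2:.2f}% vs {pct_fc:.2f}%, ratio {ratio:.2f}")
```

A ratio above 1 marks a residue that is relatively enriched in F(ab′)2, which is the basis for choosing tyrosine (1.32) against methionine (0.38) as a discriminating peak pair.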

CH4N+, representative of many amino acids,29 though commonly glutamate, phenylalanine, lysine and methionine. We prepared the amino acid composition for the F(ab′)2 and Fc fragments from our antibody sequence (Table 1). By extension, the Fab, Fab′, and Fab′-IAA fragments have the same percentage composition of each amino acid, but contain minor differences in overall chemical composition. It was found that the F(ab′)2 (and subsequently Fab′) composition contained an equivalent prevalence of phenylalanine to the Fc fragment, but a lower composition of lysine, complicating analysis. The peak ratio method has been shown to give good resolution between F(ab′)2 and Fc fragments immobilized on gold.26 Thus, we identified strongly loading mass fragments from PC2 that could be attributed to singular amino acids and produced their F(ab′)2/Fc ratio (Table 2). A ratio greater than 1 is an indicator of amino acids that have prevalence in F(ab′)2 (and subsequently Fab, Fab′ and Fab′-IAA) over Fc. Thus, it would be expected that a higher ratio of the intensities of the 107.05 m/z peak (tyrosine) to the 61.01 m/z peak (methionine) and a lower ratio of the intensities of the 80.05 m/z peak (proline) to the 43.03 m/z peak (arginine) would be representative of Fab′. In addition, the converse should be true for the Fc. The peak intensity ratios for each sample type are shown in Figure 8. The peak ratio of 107.05/61.01 m/z is most

Figure 8. Peak intensity ratios for individual protein samples from ToF-SIMS. 107.05/61.01 ratio should be high for Fab′, and 80.05/ 43.03 ratio should be high for Fc. Average of triplicate shown with error bars ± standard deviation.
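A peak intensity ratio with its triplicate average and standard deviation, as plotted in Figure 8, could be computed along the following lines. This is a hedged sketch: the function name `peak_ratio_stats` and the intensity values are invented for illustration, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not state which estimator was used.

```python
import numpy as np

def peak_ratio_stats(numerator, denominator):
    """Ratio of two peak intensities per replicate, then mean and sample SD."""
    ratios = np.asarray(numerator, dtype=float) / np.asarray(denominator, dtype=float)
    return ratios.mean(), ratios.std(ddof=1)   # ddof=1 is an assumed choice

# Hypothetical normalised intensities for one sample type (three replicate spectra):
i_107 = [2.0, 4.0, 6.0]   # 107.05 m/z peak, attributed to tyrosine in the text
i_61 = [1.0, 2.0, 2.0]    # 61.01 m/z peak, attributed to methionine in the text
mean_ratio, sd_ratio = peak_ratio_stats(i_107, i_61)
print(f"107.05/61.01 ratio: {mean_ratio:.2f} ± {sd_ratio:.2f}")
```

Taking the ratio within each replicate before averaging keeps the error bar tied to spot-to-spot variation rather than to differences in total signal.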

Table 2. Comparison of Amino Acid Prevalence in ToF-SIMS Spectra to Their Composition Ratio F(ab′)2/Fc from the Protein Structure

amino acids more prevalent in F(ab′)2        F(ab′)2/Fc composition ratio
from PCA of ToF-SIMS spectra                 from protein structure
arginine                                     1.12
tyrosine                                     1.32
valine                                       0.55

amino acids more prevalent in Fc             F(ab′)2/Fc composition ratio
from PCA of ToF-SIMS spectra                 from protein structure
methionine                                   0.38
phenylalanine                                0.99
proline                                      0.98
threonine                                    1.32

intense for Fab′ and least intense for the Fc fragment. The intensities vary across fragments, indicating that this is perhaps a good peak ratio choice for differentiating between samples. Interestingly, the intensity of the control sample (protein-free) is quite high. This is most likely due to overlaps with silicon-hydrocarbon fragments such as SiCH5O+ (61.01), or overall small ion intensity variation that results in seemingly large ratios. For the 80.05/43.03 ratio, the intensity for the Fc is similar to most samples and the intensity for the F(ab′)2 is marginally higher, indicating that this ratio may be ineffective at identifying the differences between F(ab′)2 and Fc. The Fab′ has low intensity for this ratio implying that the intensity derived from 80.05 or 43.03 m/z may not be only due to the C5H6N+ and CH3N2+ respectively. Other possible mass


Figure 9. Comparison of (a) Unsupervised Kohonen Network (UKN, false-colored) to (b) Counter-Propagation Artificial Neural Networks (CP-ANNs) where the class has been defined by the primary-ion source (Mn+, blue; Bi+, red; Bi3+, green). 10 × 10 NN with 100 000 epochs was used in both cases. ToF-SIMS peak list of 775 mass fragments was utilized. Numbers represent individual samples; full details are provided in Table S2.

Figure 10. Comparison of CP-ANNs where class has been defined (a) by primary-ion source (Mn+, blue; Bi+, red; Bi3+, green) or (b) by the antibody fragment (control wafer, dark purple; whole antibody IgG, red; deglycosylated antibody IgG(De), light green; F(ab′)2, yellow; Fab′, gray; iodoacetamide blocked Fab′, Fab′-IAA, pink; Fab, light blue; Fc, light purple). 10 × 10 NN with 100 000 epochs was used in both cases. ToF-SIMS peak list of 775 mass fragments was utilized. Numbers represent individual samples; full details are provided in Table S2.

fragment overlaps could be C2H8O3+, or C2H3O+, for 80.05 and 43.03 m/z, respectively.

Artificial Neural Networks. UKN vs CP-ANN. Artificial neural networks (ANNs) provide a means to analyze large and complex data sets in significant detail, even when little information is known about the sample set. However, there are important factors to be considered when choosing the parameters for analysis. We compared the unsupervised Kohonen network (UKN) with the counter-propagation ANN (CP-ANN), with the class denoted as the primary-ion source (Figure 9). We chose a 10 × 10 neural network (NN) with 100 000 epochs in all cases to ensure adequate convergence. Given that the CP-ANN is essentially a UKN with output layers added one per class, it follows that the two maps will be similar if the output layers have no influence. This scenario is the case we are presented with when the CP-ANN classes are denoted as the primary-ion source. If we refer to the PCA of the data, PC1 is largely characterized by those differences in Mn+, Bi+, and Bi3+. Thus, the CP-ANN is not influenced by effectively defining PC1 as a class due to its primary role in separation of the samples. It is rare that two reconstructions will produce visually identical maps using random seed iteration; however, this is the case we observe and is a good indicator that our model has converged to our data set.
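The relationship between a CP-ANN and a UKN can be made concrete with a sketch of the classification step: the input (Kohonen) layer locates the winning neuron exactly as in the unsupervised map, and the output layer only supplies the class label. The code below is an illustration under stated assumptions rather than the toolbox implementation: the function names are hypothetical, the arrays are toy values, and each neuron is assumed to take the class with the largest output weight.

```python
import numpy as np

def assign_neuron_classes(output_weights):
    """output_weights has shape (rows, cols, n_classes), one plane per class.
    Assumed rule: a neuron takes the class whose output weight is largest."""
    return np.argmax(output_weights, axis=2)

def classify(input_weights, neuron_classes, x):
    """Find the winning neuron for sample x in the input layer
    and return the class assigned to that neuron."""
    dist = np.linalg.norm(input_weights - x, axis=2)
    winner = np.unravel_index(np.argmin(dist), dist.shape)
    return int(neuron_classes[winner])

# Toy 2 x 2 map with two input variables and three classes.
input_weights = np.array([[[0.0, 0.0], [0.0, 1.0]],
                          [[1.0, 0.0], [1.0, 1.0]]])
output_weights = np.array([[[0.9, 0.1, 0.0], [0.2, 0.7, 0.1]],
                           [[0.1, 0.1, 0.8], [0.3, 0.4, 0.3]]])
neuron_classes = assign_neuron_classes(output_weights)
label = classify(input_weights, neuron_classes, np.array([0.9, 0.1]))
print(neuron_classes, label)
```

If the class labels merely restate the dominant structure the input layer already finds, the output planes add little, which is consistent with the similar UKN and CP-ANN maps reported when the class is the primary-ion source.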

Our goal is to explore the variation in the antibody fragments specifically, so instead of defining the classes with the primary-ion source, we can define the classes as each individual fragment (Figure 10). The new CP-ANN is visually different to the previous CP-ANN and UKN due to the reordering of the map based on the new output layers (classes). In our sample set, there are large repeating units such as the Fab portion of the antibody that occurs in six of the eight fragments analyzed, thus making the separation of the map based on the fragment more challenging. However, the Fc portion (shown in light purple in Figure 10b) is distinctly separate from the other classes, indicating that the method can indeed distinguish at least the Fc portion.

Effect of Map Size and Iterations. We have systematically varied user-defined parameters of a CP-ANN used to model our antibody fragment data based on the ability to separate our eight classes and to provide greater insight into the samples. In terms of the map size, as a general rule,12 at least three times as many neurons (cells) as unique sample types provides good separation. In our study, we have 24 unique samples (with 3 replicates) giving a minimum map size of 72 neurons. The toolbox employed offers hexagonal neurons displayed in square-shaped networks with side lengths in multiples of 2; thus, the most suitable size map should be 10 × 10. To investigate this effect, we produced ANNs of size 4 × 4, 10 × 10, and 16 × 16 and varied the number of iterations from 1 to 100 000 (Figure 11).
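The map-size reasoning above is simple arithmetic and can be sketched as follows. The helper name `smallest_square_map` is hypothetical, and the assumption that only even side lengths are available is an interpretation of the toolbox constraint described in the text.

```python
import math

def smallest_square_map(n_unique, neurons_per_type=3, step=2):
    """Apply the rule of thumb (at least `neurons_per_type` neurons per unique
    sample type) and return (minimum neurons, side of smallest square map)."""
    min_neurons = n_unique * neurons_per_type
    side = math.ceil(math.sqrt(min_neurons))
    side += (-side) % step            # round up to the next multiple of `step`
    return min_neurons, side

min_neurons, side = smallest_square_map(24)
print(min_neurons, f"{side} x {side}")
```

For 24 unique samples this gives 72 neurons; an 8 × 8 map (64 neurons) falls short, so 10 × 10 (100 neurons) is the smallest even-sided map that satisfies the rule.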

weightings for other classes. This trend continues with increasing epochs where one class (Fab′-IAA, pink) is missing until we reach the 100 000 epoch map. At this point, at least one neuron is assigned to each color though multiple samples from different classes are observed in single neurons. This indicates that a 4 × 4 map is inadequate for uniquely resolving this data set. Increasing the map size to 10 × 10 NN, the map with 10 epochs immediately distinguishes all 8 classes and begins to accurately classify samples into the correct classes. To assist with our interpretation, we prepared the class identification (misclassification) error rate for each of the Figure 11 maps (and others) as an average of 5 individual maps trained with these conditions (Figure S5). Ten epochs had a low misclassification rate (9%); however, this result could be improved by increasing the number of epochs. The limiting factor of ANNs can be the computation time which is influenced most heavily by the number of epochs, the size of the map, and the complexity of the sample set. For our computer system and data set, increasing the number of epochs for a 10 × 10 NN varied from a few minutes (10−100 epochs) to multiple hours (100,000 epochs). To achieve an accurate (i.e., converged) model using an ANN, large amounts of data with a large number of epochs should be used. Realistically, this is dependent on computation time available and as such parameters should be chosen that achieve comprehensible data. A 16 × 16 NN is effectively too large for this data set and, subsequently, is inefficient in terms of data interpretation despite requiring less computational time. In these maps, the majority are empty neurons without samples present and the sample triplicates are quite spread independent of the number of epochs. We expect our triplicates to be very similar and the purpose of our interpretation is to identify differences in classes. 
As each neuron carries loadings from each variable, having more neurons per class requires extra information to be interpreted to gain an understanding of the class. One use for a map of this size would be to explore the reproducibility of data collection to assist in understanding factors contributing negatively to consistency. It is also observed that with excessively large maps, the misclassification error rate is often zero due to the presence of many sample-free neurons in each class. Consequently, it is of utmost importance that the map size is selected based on the data to be extracted and that researchers understand that a trial-and-error process is required to exercise the power of ANNs.

Reproducibility. To investigate the reproducibility of CP-ANNs we used the 10 × 10 NN with 100 000 epochs and performed three independent reconstructions for our whole data set using identical parameters (Figure 12). It is clear that the maps are visually different; however, the data contained are similar. As outlined earlier, this visual difference is one of the major factors that have limited the widespread acceptance of ANNs. These differences are due to the random seed approach, and thus the reconstruction varies based on sampling the input data. In our case, the error rate varied between 2.8 and 9.7% with an average of 6.0%, indicating a good fit. This error is representative of how well the samples with user-defined classes match the predicted classes in the CP-ANN. If we focus on a specific aspect of the data, for example the black circles in Figure 12, we can see that in each reconstruction, the triplicate 28, 29, 30 (whole IgG, Bi3+) falls on the same neuron and is immediately adjacent to sample 43 (Fab, Bi3+), indicating similarity. The remainder of the

Figure 11. Changes to the Artificial Neural Network user-defined parameters. Map size and number of epochs (iterations) were varied to investigate convergence of model about eight classes: Control wafer, dark purple; whole antibody IgG, red; deglycosylated antibody IgG(De), light green; F(ab′)2, yellow; Fab′, gray; iodoacetamide blocked Fab′, Fab′-IAA, pink; Fab, light blue; Fc, light purple. ToF-SIMS peak list of 775 mass fragments was utilized. Red dots represent individual samples to demonstrate clustering differences.
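To make the roles of map size, number of epochs, and random seed concrete, the following is a minimal sketch of an unsupervised Kohonen map in plain Python. It is not the Kohonen and CP-ANN Toolbox used in this work; the function names (`train_som`, `best_matching_unit`), the linear decay schedules, and the Gaussian neighbourhood are illustrative assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return (row, col) of the neuron closest to x in squared Euclidean distance."""
    best, best_d = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(x, w))
            if d < best_d:
                best, best_d = (r, c), d
    return best


def train_som(samples, rows, cols, epochs, seed=0):
    """Train a rows x cols map for a number of epochs (hypothetical helper)."""
    rng = random.Random(seed)
    n_vars = len(samples[0])
    # Random initial weights: this is where the seed enters the result.
    weights = [[[rng.random() for _ in range(n_vars)] for _ in range(cols)]
               for _ in range(rows)]
    max_radius = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / max(epochs - 1, 1)
        rate = 0.5 * (1.0 - frac) + 0.01          # learning rate decays linearly
        radius = max_radius * (1.0 - frac) + 0.5  # neighbourhood shrinks
        order = list(range(len(samples)))
        rng.shuffle(order)                        # random presentation order
        for i in order:
            x = samples[i]
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    d = math.hypot(r - br, c - bc)
                    if d <= radius:
                        # Pull the neuron toward the sample, weighted by distance.
                        h = rate * math.exp(-(d * d) / (2 * radius * radius))
                        w = weights[r][c]
                        for k in range(n_vars):
                            w[k] += h * (x[k] - w[k])
    return weights
```

Calling `train_som` twice with the same seed returns identical weights, whereas changing the seed generally yields a different arrangement of the same data on the map, which is the behaviour discussed for the reconstructions.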

Given that the number of epochs required to allow the model to truly converge on the data set is related to the number of variables it contains, it is clear that a single epoch cannot resolve a complex data set such as this. In the first row of Figure 11, we demonstrate the CP-ANN attempting this. The first and most obvious finding is that the color of a single class is maintained regardless of the size of the map; this is due to no class being defined for any neuron. A single iteration of the random seed is insufficient to begin to define a class; however, the information revealed may be important. The ANN has simply revealed part of the solution: the variance in a small number of variables. As we increase the number of epochs performed on the 4 × 4 map, we begin to see the formation of distinct classes. For 10 epochs, 4 of the 16 neurons have been assigned a single class, and the remainder share classes. Further, only 6 of the 8 classes are evident in the color plot as there is insufficient space in the map to spread the sample data points. The remaining 2 classes are not defined as they are shared on neurons that have higher

DOI: 10.1021/acs.langmuir.6b02312 Langmuir 2016, 32, 8717−8728


Figure 12. Three separate CP-ANNs constructed using identical parameters; 10 × 10 NN with 100 000 epochs. Error (misclassification) rates are 5.6%, 9.7%, and 2.8% for reconstructions 1, 2, and 3, respectively. Eight classes: Control wafer, dark purple; whole antibody IgG, red; deglycosylated antibody IgG(De), light green; F(ab′)2, yellow; Fab′, gray; iodoacetamide blocked Fab′, Fab′-IAA, pink; Fab, light blue; Fc, light purple. ToF-SIMS peak list of 775 mass fragments was utilized. Numbers represent individual samples; full details are provided in Table S2. Black marks serve as a guide to the eye.
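As a consistency check on the error rates quoted for Figure 12, the short calculation below assumes that each rate is the fraction of the 72 spectra that were misclassified. The implied counts are inferred here for illustration and are not stated in the text.

```python
N_SPECTRA = 72                         # panel size given in the abstract
reported_rates_pct = [5.6, 9.7, 2.8]   # reconstructions 1, 2, and 3

# Assumption: each rate is (misclassified spectra) / 72, rounded to 0.1%.
implied_counts = [round(r / 100 * N_SPECTRA) for r in reported_rates_pct]

# Recompute the percentages from the inferred integer counts.
recomputed_pct = [round(100 * n / N_SPECTRA, 1) for n in implied_counts]

# Average error over the three reconstructions, and the matching success rate.
mean_error_pct = round(
    100 * sum(implied_counts) / (len(implied_counts) * N_SPECTRA), 1
)
mean_success = round(1 - mean_error_pct / 100, 2)

print(implied_counts, recomputed_pct, mean_error_pct, mean_success)
```

Under this assumption the three rates correspond to whole numbers of spectra, and their mean reproduces the 6.0% average error and the 0.94 success rate reported for the combined data set.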

Table 3. Correlation of the User-Defined Class (Input) with the ANN Assigned Class (Output) Averaged across the Three Reconstructions (from Figure 12) and Sorted by Primary-Ion Source: (a) Total Summed Average (AVG), (b) Mn+, (c) Bi+, and (d) Bi3+ᵃ

ᵃ ToF-SIMS peak list of 775 mass fragments was utilized.
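The success rate read from a class-correlation table can be sketched as follows. The 3 × 3 matrix and its labels are hypothetical and are not values from Table 3. The sketch assumes that rows are normalized to sum to one and that every class holds the same number of samples, so that the mean of the diagonal equals one minus the misclassification rate.

```python
def success_rate(matrix):
    """Mean of the diagonal of a row-normalized class-correlation matrix."""
    n = len(matrix)
    return sum(matrix[i][i] for i in range(n)) / n


def most_confused(matrix, labels):
    """Return the (input class, assigned class) pair with the largest off-diagonal entry."""
    n = len(matrix)
    value, i, j = max(
        (matrix[i][j], i, j) for i in range(n) for j in range(n) if i != j
    )
    return labels[i], labels[j]


# Hypothetical example: rows are the user-defined class (input),
# columns are the class assigned by the network (output).
labels = ["A", "B", "C"]
example = [
    [1.00, 0.00, 0.00],
    [0.00, 0.70, 0.30],
    [0.00, 0.25, 0.75],
]

rate = success_rate(example)
error = 1.0 - rate
pair = most_confused(example, labels)
```

The off-diagonal search is the tabular counterpart of identifying which two sample types are most often mistaken for one another.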

triplicates (44, 45) are identified in the same class as 43 as would be expected, but vary in the number of neurons

separating them. The neuron spacing is not a complete representation of the similarity between samples as the


had poor covering (misclassified as control), may have had some residual glycosylation (misclassified as IgG), or simply was similar to the loading for Fab with the lighter source ion Mn+. Equivalently, we prepared the misclassification tables for the reconstructions produced with the amino acid mass fragments (Table S6). A similar trend was observed where the heavier Bi3+ gave the best classification results at 0.92. However, for the lighter primary-ions, Mn+ fared better than Bi+ with success rates of 0.88 and 0.83, respectively. On average, this method had a success rate of 0.85, lower than the 0.94 obtained with the comprehensive 775 peak list.

Bi3+ Source. We employed the optimum parameters to demonstrate the power of ANNs for resolving antibody fragment classes. The Bi3+ primary-ion source, which produced improved ion intensity at higher masses, was selected, reducing the data set to 8 unique sample types. The 6 × 6 map was chosen as the minimum map size because, based on the general rule, the map size should be greater than 24 (3 × 8). The 6 × 6 CP-ANN was trained for 100 000 epochs, to ensure convergence, using two of the triplicate data from each sample type. Figure 13 shows the final CP-ANN map where the third point of the

Euclidean distance may in fact be large (indicating dissimilar samples) but the map size confines the spread of the samples. In our case, we wish to define the difference between classes, not necessarily the difference between our triplicates; thus, our map size is appropriate. For the reconstructions with the lower error rates (1 and 3), samples 44 and 45 fall on the same neuron, indicating that they are similar. Overall, the similarity of the whole IgG class to the Fab class is unsurprising, given that the IgG contains this identical Fab component. The IgG(De), Fab′, and F(ab′)2 classes are also in close proximity to the IgG. However, the Fc class is not directly adjacent, despite the IgG containing this unaltered component. This may indicate that the IgG is perhaps oriented with its Fab region facing away from the surface. This is consistent with observations in PC2 in Figure 3 where the Fc is well separated from the other samples. We also prepared the equivalent reconstructions for the amino acid mass fragment list and found that the misclassification rate was slightly higher (Figure S6). In this case, the rate varied from 8.3% to 15.3% with an average of 11.1%.

Effect of Primary-Ion Source. The sensitivity of ToF-SIMS is such that populations of individual secondary ion species will contribute to the identification of antibody fragments. These populations are defined by the sample and vary according to the primary-ion source. As discussed earlier, the heavy primary-ions have the ability to enhance the intensity of the heavier mass ion species, thus changing the ratio of the populations.2 Furthermore, we identified in our PCA and in our early UKN and CP-ANN that the primary-ion source was a significant contributor to the difference in the samples and mass spectra.
Since it is known that different primary-ions will produce different protein fragmentation patterns,25 it was inferred that the classification ability of the CP-ANN may be reduced by the use of multiple primary-ions in a single data set. We compared the user-defined class with the CP-ANN predicted class assignments for each of our sample types, averaged across three reconstructions, and then compared the difference between each of the primary-ion sources as averaged across the three reconstructions (Table 3). The resulting heat map is an indicator of both the primary-ions’ ability to resolve differences in the sample types and also to identify incorrectly assigned common fragments. The average across the diagonal indicates the success rate, so in the case of the AVG, this is 0.94 (equivalent to the 6.0% misclassification error rate of the CP-ANNs). Overall, this indicates that the use of data from three different primary-ions results in a very good classification ability of 94%. This should not be understated, given that the use of multiple primary-ions in a single multivariate data set is uncommon, yet the CP-ANN analysis gave good classification. For analysis with the Mn+, Bi+, and Bi3+ primary-ion sources, the success rates are 0.84, 0.97, and 1.00, respectively. This is a clear indicator that Bi3+ is the most suitable primary-ion source for determining the difference between closely matching antibody fragments. We also produced the correlation tables for each reconstruction individually to compare the difference in primary-ion sources (Tables S3−S5) and found that, only for Reconstruction 3, Bi+ also had a success rate of 1.00. Conversely, it is apparent that the majority of the error simply arises from the use of Mn+, where the population of heavier secondary ions is limited. Interestingly, we observed that the Fab′-IAA and the F(ab′)2 samples were the most frequently misclassified, being mistaken for one another.
The IgG(De) was also readily misclassified with Mn+ as IgG, the control sample, and the Fab. This indicates that the IgG(De) may have

Figure 13. 6 × 6 CP-ANN trained for 100 000 epochs with 2 of the 3 Bi3+ replicate samples with class denoted as fragment type. The final triplicate point was used as test data (T). ToF-SIMS peak list of 775 mass fragments was utilized. Control wafer, dark purple; whole antibody IgG, red; deglycosylated antibody IgG(De), light green; F(ab′)2, yellow; Fab′, gray; iodoacetamide blocked Fab′, Fab′-IAA, pink; Fab, light blue; Fc, light purple.
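The hold-out scheme used for Figures 13 and 14 (two replicates of each sample type for training, the third for testing) can be sketched as below. The record layout, the helper name `split_triplicates`, and the choice of replicate 3 as the test point are assumptions for illustration; the spectra themselves are omitted.

```python
CLASSES = ["Control", "IgG", "IgG(De)", "F(ab')2", "Fab'", "Fab'-IAA", "Fab", "Fc"]


def split_triplicates(samples, held_out=3):
    """Split records into training and test sets by replicate number."""
    train = [s for s in samples if s["replicate"] != held_out]
    test = [s for s in samples if s["replicate"] == held_out]
    return train, test


# Placeholder records for the Bi3+ subset: 8 sample types x 3 replicates.
samples = [
    {"label": label, "replicate": rep, "spectrum": None}
    for label in CLASSES
    for rep in (1, 2, 3)
]
train, test = split_triplicates(samples)

# Rule quoted in the text: the number of neurons should exceed
# 3 x the number of classes (3 x 8 = 24); a 6 x 6 map has 36 neurons.
rows, cols = 6, 6
map_is_large_enough = rows * cols > 3 * len(CLASSES)
```

With this split, every sample type contributes exactly one test spectrum, so a classification error rate of zero means that all eight held-out points were assigned to the correct class.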

triplicate was used for testing (T). The ANN had a classification error rate of zero, indicating that the model was complete and could accurately assign each of the fragment triplicate data to its class. As a comparison, we produced a CP-ANN under the same conditions using the reduced list of 35 amino acid related mass fragments (Figure 14). In a similar fashion, the CP-ANN had an error rate of zero and correctly assigned each of the fragment triplicates to its class. This result occurred despite the indication that the Bi3+ primary-ion applied to the amino acid list was less successful than the use of the 775 mass fragment list. In a similar manner to loadings in PCA, CP-ANNs produce class weights that identify the contribution that each variable has to the class. We prepared the class weights for both the 775 mass fragment CP-ANN and the 35 amino acid mass fragment CP-ANN (Figures S7 and S8, respectively). For the 775 mass fragment class weights, the complexity of each variable contribution is evident. In the case of the amino acid class


complex system. Artificial neural networks have been shown to be better suited to classifying biomolecular species due to their ability to incorporate higher order PCs and plot to a 2D map as an aid to visualizing complex relationships in the data. CP-ANNs can be utilized to produce class weights, akin to the loadings plots of PCA, to investigate in fine detail the contributions from all variables. PCA employs a linear methodology, where each PC is removed from the data set before the computation of the next, forcing independent orthogonal analysis. Conversely, ANNs allow the incorporation of each new PC on the data set in a nonlinear fashion. ANN class weights allow the contribution of variables to each sample type to be assessed simultaneously, a process that requires time-consuming one-to-one sequential analysis with PCA. We have shown that employing ToF-SIMS with a Bi3+ primary-ion source can create ion intensities at heavier masses that allow ANNs to separate antibody fragments with recurring segments, which is a remarkable finding. Ultimately, data analysis of large and complex systems is improved by utilization of both PCA and ANN methods due to their complementary nature.

Figure 14. 6 × 6 CP-ANN trained for 100 000 epochs with 2 of the 3 Bi3+ replicate samples with class denoted as fragment type. The final triplicate point was used as test data (T). ToF-SIMS peak list of 35 amino acid mass fragments was utilized. Control wafer, dark purple; whole antibody IgG, red; deglycosylated antibody IgG(De), light green; F(ab′)2, yellow; Fab′, gray; iodoacetamide blocked Fab′, Fab′-IAA, pink; Fab, light blue; Fc, light purple.



ASSOCIATED CONTENT

Supporting Information

weights, it is considerably easier to assess which amino acids contribute strongly to each class. Interestingly, the contributions for F(ab′)2 and Fab′-IAA were most similar, which is consistent with the high misclassification rates in Table 3 for all 775 mass fragments. The Fc class weights were particularly strong with amino acid variables 7 and 8, corresponding to 60.05 and 61.01 m/z (serine and methionine, respectively). This is consistent with the peak ratio composition, where methionine occurs predominantly in the Fc region, and also with the PCA. We have demonstrated the ability to resolve an antibody from its proteolytic fragments that contain not only similar amino acids but also identical short-range order (sequence) and long-range order (whole Fab segment), and that may vary by only a few chemical species (Fab compared to Fab′, free thiol). This is a significant finding, as it demonstrates that complex data sets can be decoupled with neural network data analysis using a large peak list derived from ToF-SIMS with a heavy primary-ion source. This was further verified by the use of an amino acid related fragment list. This represents a significant step forward from the earlier ToF-SIMS neural network analysis approaches, and implies that the use of heavier primary-ion sources (Bi3+, 627 u) over the earlier generation of sources (such as Cs+, 133 u) is better suited for ToF-SIMS protein analysis and differentiation.

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.langmuir.6b02312. Extra PCA scores and loadings plots, amino acid fragment list, sample list, error (misclassification) rates for various map sizes and number of epochs, amino acid mass fragment reconstructions, and tables outlining effect of primary-ion source for reconstructions and CP-ANN class weights (PDF)



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected].

Notes

The authors declare no competing financial interest.



ACKNOWLEDGMENTS

This work was performed in part at the Victorian Node of the Australian National Fabrication Facility (ANFF) through the La Trobe University Centre for Materials and Surface Science. Special thanks to Judith Scoble and Luisa Pontes-Braz from CSIRO for important discussion and preparation of the antibody fragments. We acknowledge the Milano Chemometrics and QSAR Research Group for developing the Kohonen and CP-ANN Toolbox.



CONCLUSION

We have demonstrated the use of ANNs as a valuable interpretation method for complex ToF-SIMS antibody spectra. We have shown that selecting a large number of mass peaks from the spectrum, regardless of whether they are known, can be ultimately beneficial in providing an improved level of detail to allow multivariate separation of a complex sample set: an antibody and its proteolytic fragments. It is well established that principal component analysis is a useful tool in the multivariate analysis of ToF-SIMS data from biomolecules. However, large and complex data can be difficult to interpret without in-depth and time-consuming analysis; some biomolecular species and relationships remain unresolved. The use of an amino acid only peak list and the peak ratio method yields inconsistent results in this



REFERENCES

(1) Graham, D. J.; Wagner, M. S.; Castner, D. G. Information from complexity: Challenges of TOF-SIMS data interpretation. Appl. Surf. Sci. 2006, 252 (19), 6860−6868.
(2) Touboul, D.; Kollmer, F.; Niehuis, E.; Brunelle, A.; Laprevote, O. Improvement of biological time-of-flight-secondary ion mass spectrometry imaging with a bismuth cluster ion source. J. Am. Soc. Mass Spectrom. 2005, 16 (10), 1608−1618.
(3) Benninghoven, A. Chemical Analysis of Inorganic and Organic Surfaces and Thin Films by Static Time-of-Flight Secondary Ion Mass Spectrometry (TOF-SIMS). Angew. Chem., Int. Ed. Engl. 1994, 33 (10), 1023−1043.
(4) Graham, D. J.; Castner, D. G. Multivariate Analysis of ToF-SIMS Data from Multicomponent Systems: The Why, When, and How. Biointerphases 2012, 7, 1−4.
(5) Wagner, M. S.; Castner, D. G. Characterization of adsorbed protein films by time-of-flight secondary ion mass spectrometry with principal component analysis. Langmuir 2001, 17 (15), 4649−4660.
(6) Sanni, O. D.; Wagner, M. S.; Briggs, D.; Castner, D. G.; Vickerman, J. C. Classification of adsorbed protein static ToF-SIMS spectra by principal component analysis and neural networks. Surf. Interface Anal. 2002, 33 (9), 715−728.
(7) Park, J.-W.; Cho, I.-H.; Moon, D. W.; Paek, S.-H.; Lee, T. G. ToF-SIMS and PCA of surface-immobilized antibodies with different orientations. Surf. Interface Anal. 2011, 43 (1−2), 285−289.
(8) Foster, R. N.; Harrison, E. T.; Castner, D. G. ToF-SIMS and XPS Characterization of Protein Films Adsorbed onto Bare and Sodium Styrenesulfonate-Grafted Gold Substrates. Langmuir 2016, 32 (13), 3207−3216.
(9) Kohonen, T. Self-Organized Formation of Topologically Correct Feature Maps. Biological Cybernetics 1982, 43 (1), 59−69.
(10) Kohonen, T. The Self-Organizing Map. Proc. IEEE 1990, 78 (9), 1464−1480.
(11) Kohonen, T. Essentials of the self-organizing map. Neural Networks 2013, 37, 52−65.
(12) Brereton, R. G. Self organising maps for visualising and modelling. Chem. Cent. J. 2012, 6, S1.
(13) Zupan, J.; Novic, M.; Ruisanchez, I. Kohonen and counterpropagation artificial neural networks in analytical chemistry. Chemom. Intell. Lab. Syst. 1997, 38 (1), 1−23.
(14) Ballabio, D.; Consonni, V.; Todeschini, R. The Kohonen and CP-ANN toolbox: A collection of MATLAB modules for Self Organizing Maps and Counterpropagation Artificial Neural Networks. Chemom. Intell. Lab. Syst. 2009, 98 (2), 115−122.
(15) Ballabio, D.; Vasighi, M. A MATLAB toolbox for Self Organizing Maps and supervised neural network learning strategies. Chemom. Intell. Lab. Syst. 2012, 118, 24−32.
(16) Tokutaka, H.; Obu-Cann, K.; Fujimura, K.; Ikeda, Y.; Yoshihara, K. Metal Mat Grp, S. Application of self-organizing maps (SOMs) to chemical spectral analysis of elements in the Periodic Table. Surf. Interface Anal. 2002, 34 (1), 610−614.
(17) Mariey, L.; Signolle, J. P.; Amiel, C.; Travert, J. Discrimination, classification, identification of microorganisms using FTIR spectroscopy and chemometrics. Vib. Spectrosc. 2001, 26 (2), 151−159.
(18) Kalegowda, Y.; Harmer, S. L. Classification of time-of-flight secondary ion mass spectrometry spectra from complex Cu−Fe sulphides by principal component analysis and artificial neural networks. Anal. Chim. Acta 2013, 759, 21−27.
(19) Ball, G.; Mian, S.; Holding, F.; Allibone, R. O.; Lowe, J.; Ali, S.; Li, G.; McCardle, S.; Ellis, I. O.; Creaser, C.; Rees, R. C. An integrated approach utilizing artificial neural networks and SELDI mass spectrometry for the classification of human tumours and rapid identification of potential biomarkers. Bioinformatics 2002, 18 (3), 395−404.
(20) Chen, Y. D.; Zheng, S.; Yu, J. K.; Hu, X. Artificial neural networks analysis of surface-enhanced laser desorption/ionization mass spectra of serum protein pattern distinguishes colorectal cancer from healthy population. Clin. Cancer Res. 2004, 10 (24), 8380−8385.
(21) Grus, F. H.; Podust, V. N.; Bruns, K.; Lackner, K.; Fu, S. Y.; Dalmasso, E. A.; Wirthlin, A.; Pfeiffer, N. SELDI-TOF-MS ProteinChip Array profiling of tears from patients with dry eye. Invest. Ophthalmol. Visual Sci. 2005, 46 (3), 863−876.
(22) Castner, D. G.; Ratner, B. D. Biomedical surface science: Foundations to frontiers. Surf. Sci. 2002, 500 (1−3), 28−60.
(23) Sato, J. D.; Kawamoto, T.; Le, A. D.; Mendelsohn, J.; Polikoff, J.; Sato, G. H. Biological effects in vitro of monoclonal antibodies to human epidermal growth factor receptors. Mol. Biol. Med. 1983, 1 (5), 511−29.
(24) Grueninger-Leitch, F.; D'Arcy, A.; D'Arcy, B.; Chene, C. Deglycosylation of proteins for crystallization using recombinant fusion protein glycosidases. Protein Sci. 1996, 5 (12), 2617−2622.
(25) Muramoto, S.; Graham, D. J.; Wagner, M. S.; Lee, T. G.; Moon, D. W.; Castner, D. G. ToF-SIMS Analysis of Adsorbed Proteins: Principal Component Analysis of the Primary Ion Species Effect on the Protein Fragmentation Patterns. J. Phys. Chem. C 2011, 115 (49), 24247−24255.
(26) Wang, H.; Castner, D. G.; Ratner, B. D.; Jiang, S. Probing the Orientation of Surface-Immobilized Immunoglobulin G by Time-of-Flight Secondary Ion Mass Spectrometry. Langmuir 2004, 20 (5), 1877−1887.
(27) Samuel, N. T.; Wagner, M. S.; Dornfeld, K. D.; Castner, D. G. Analysis of Poly(amino acids) by Static Time-of-Flight Secondary Ion Mass Spectrometry (TOF-SIMS). Surf. Sci. Spectra 2001, 8 (3), 163−184.
(28) Kosobrodova, E.; Jones, R. T.; Kondyurin, A.; Chrzanowski, W.; Pigram, P. J.; McKenzie, D. R.; Bilek, M. M. M. Orientation and conformation of anti-CD34 antibody immobilised on untreated and plasma treated polycarbonate. Acta Biomater. 2015, 19, 128−137.
(29) Mantus, D. S.; Ratner, B. D.; Carlson, B. A.; Moulder, J. F. Static secondary-ion mass-spectrometry of adsorbed proteins. Anal. Chem. 1993, 65 (10), 1431−1438.
