Modeling Biological Activities of Nanoparticles - Nano Letters (ACS


Communication

Modelling biological activities of nanoparticles Vidana Chandana Epa, Frank R Burden, Carlos Tassa, Ralph Weissleder, Stanley Y. Shaw, and David Alan Winkler Nano Lett., Just Accepted Manuscript • Publication Date (Web): 05 Oct 2012 Downloaded from http://pubs.acs.org on October 8, 2012


Nano Letters is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Modelling biological activities of nanoparticles V. Chandana Epa#, Frank R. Burden#, Carlos Tassa†, Ralph Weissleder†, Stanley Shaw†¶, and David A. Winkler#∞*

#CSIRO Materials Science and Engineering, 343 Royal Parade, Parkville, Victoria 3052, Australia. ∞Monash Institute of Pharmaceutical Sciences, 381 Royal Parade, Parkville 3052, Australia. †Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts. ¶Broad Institute of Harvard and MIT, Cambridge, Massachusetts.

E-mail: [email protected]


Abstract: Products are increasingly incorporating nanomaterials, but we have a poor understanding of their adverse effects. To assess risk, regulatory authorities need more experimental testing of nanoparticles. Computational models play a complementary role in allowing rapid prediction of potential toxicities of new and modified nanomaterials. We generated quantitative, predictive models of cellular uptake and apoptosis induced by nanoparticles for several cell types. We illustrate the potential of computational methods to make a contribution to nanosafety.

Keywords: Nanoparticle toxicity, model prediction, apoptosis, cellular uptake, Bayesian methods

Many products are now exploiting the novel properties of nanomaterials, but their potential harmful effects are incompletely understood, a critical issue for regulatory authorities. Experimental testing of all potential nanomaterials is impractical; computational approaches such as machine learning methods can help assess the potential risk of new and modified nanomaterials, and prioritize nanomaterials for experimental testing. Puzyn et al.1 and Fourches et al.2 recently reported models of nanoparticle properties that demonstrated proof of concept for this approach. Here we report the use of quantitative structure–activity relationship (QSAR) methods and sparse nonlinear methods3-6 to generate robust models for two complex, systematically acquired datasets: (i) 31 nanoparticles consisting of 11 different metal core/coating combinations; and (ii) 109 nanoparticles on one predominant core platform but bearing different surface modifications. These datasets of nanoparticle bioactivity illustrate two levels of complexity: activity differences across divergent materials; and activity differences arising from surface modifications to a common core platform. We generated quantitative, predictive, and informative models that describe nanostructure-activity relationships for cellular uptake and apoptosis induced by nanomaterials.


Our approach provides important advantages over previous methods and can provide guidance for nanoparticle regulation and the future design of safe nanomaterials.

Shaw et al.7 systematically profiled the effects of fifty different nanoparticles (encompassing 11 different metal core/coating combinations plus different surface modifications) in several biological contexts: in four cell lines, using four biological assays in each cell line, at four concentrations per assay (see Methods). These data were used for our first computational modelling study. Of the possible combinations of biological assays and cell types, only the apoptosis assays exhibited a dose-response relationship. Of these, only the smooth muscle cell apoptosis assay generated statistically significant models. We initially investigated the dependence of the apoptosis response on the relaxivities (R1 and R2) and the zeta potential (available for 32 of the nanoparticles). We found a very significant relationship between the relaxivity R1 and the apoptosis assay results. However, as the relaxivities correlated almost completely with the type of iron oxide core, it is very likely that the type of core material, not the relaxivities, was influencing the smooth muscle apoptosis. Consequently, we developed models using three indicator variables (taking the values of 1 or 0 when the condition is present or absent) for core material, surface coating, and surface charge. This yielded a simple but statistically significant nano-QSAR equation that predicted smooth muscle apoptosis (SMA) induced by the metal oxide nanoparticles:

$$\mathrm{SMA} = 2.26\,(\pm 0.72) - 10.73\,(\pm 1.05)\, I_{\mathrm{Fe_2O_3}} - 5.57\,(\pm 0.98)\, I_{\mathrm{dextran}} - 3.53\,(\pm 0.54)\, I_{\mathrm{surf.chg}}$$

Model statistics were as follows: squared regression coefficient r² (train) = 0.81; test set regression coefficient r² (test) = 0.86; standard error of estimation (SEE) = 3.6; and standard error of external prediction (SEP) = 3.3. We also derived highly statistically significant nonlinear models using these descriptors. Model statistics were as follows: r² (train) = 0.80; r² (test) = 0.90; SEE = 2.8; and SEP = 2.9. There were 6 effective weights in the nonlinear model, indicating that the model had not overfitted the data. This nonlinear neural network model generated smaller prediction errors for both training and test data than the linear model. The dominant contributor to the structure-activity relationship was the nature of the core material. Interestingly, the nature of the surface charge, or the zeta potential of the nanoparticles, provided a smaller contribution to the resulting simple QSAR model. A plot of experimentally determined versus predicted values of SMA for the nonlinear model is presented in Fig. 1. The agreement between the observed apoptosis values and those predicted by the nano-QSAR model is satisfactory for the metal oxides from the training set (black circles) and those from the test set (red triangles).


Figure 1. Performance of the smooth muscle apoptosis assay nonlinear model derived from data for 31 nanoparticles. Each point represents one type of nanoparticle. Axes are in units of smooth muscle apoptosis response as defined in the Methods section.

Features of the model shed light on biological effects of the nanomaterials. The weak dependence of the cell apoptosis on the surface charge indicator variable or zeta potential of the particles was initially puzzling. However, in the presence of plasma or serum (as is present in cell culture media), nanoparticles adsorb proteins.8, 9 This protein coat or corona causes the zeta potential to converge to a relatively constant value of -10 to -20 mV, so it is not unexpected that surface charge-related descriptors of pristine nanoparticles are not strong descriptors in the model. The strongest factor affecting smooth muscle apoptosis was the nanoparticle core material, which may relate to the ability of iron oxide nanoparticles to generate reactive oxygen species (ROS), and cellular cytotoxic10 and proinflammatory responses.11 Hierarchical clustering analysis of in vitro data by Shaw et al. also suggested that for some particles, core composition exerted a discernible effect on assay responses.

Weissleder et al.12 reported cellular uptake of 109 nanoparticles sharing a superparamagnetic core and dextran coating, but bearing different small molecules conjugated to their surface. This data set was used to generate nano-QSAR models of cellular uptake. Of the five cell lines tested, only the pancreatic cancer (PaCa2) and human umbilical vein endothelial cell (HUVEC) lines showed significant variation in uptake of surface modified nanoparticles (Figure 2). The three macrophage or macrophage-like cell lines showed negligible variation in uptake with changes in surface chemistry. Table 1 summarizes the nano-QSAR models generated for these cell types.

Figure 2. Uptake of 109 types of surface-modified nanoparticles by HUVEC and pancreatic cancer (PaCa2) cells.


Table 1. Statistics for the best nano-QSAR models for the nanoparticle uptake by each cell type

| Cell type | Model | Number of descriptors | r² (train) | SEE | r² (test) | SEP |
|---|---|---|---|---|---|---|
| HUVEC | linear | 11 | 0.74 | 0.34 | 0.63 | 0.36 |
| HUVEC | nonlinear | 11 | 0.70 | 0.30 | 0.66 | 0.33 |
| PaCa2 | linear | 19 | 0.76 | 0.19 | 0.79 | 0.24 |
| PaCa2 | nonlinear | 19 | 0.77 | 0.15 | 0.54 | 0.28 |

The linear and nonlinear models of nanoparticle uptake had similar statistical power. Figure 3 shows the excellent agreement between experimentally determined and predicted log nanoparticle uptake in PaCa2 cells for the training and test sets for the linear model. Uptake by HUVEC and PaCa2 cells could be predicted to within a factor of two.
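Because uptake was modelled as its log10 transform, an additive error of s log units corresponds to a multiplicative error of 10^s in the untransformed uptake. The sketch below makes that conversion explicit for the SEP values in Table 1. It assumes that the tabulated SEP values are expressed in log10 uptake units, which the text implies but does not state outright; `fold_error` is a hypothetical helper name.

```python
# Convert a standard error expressed in log10 units into a fold error.
# fold_error is a hypothetical helper name, not taken from the paper.
# Assumption: the SEP values in Table 1 are in log10 uptake units.

def fold_error(log10_error: float) -> float:
    """Multiplicative error factor for an additive error in log10 units."""
    return 10.0 ** log10_error


# SEP values for the Table 1 uptake models.
sep_values = {
    ("HUVEC", "linear"): 0.36,
    ("HUVEC", "nonlinear"): 0.33,
    ("PaCa2", "linear"): 0.24,
    ("PaCa2", "nonlinear"): 0.28,
}

fold_errors = {key: fold_error(sep) for key, sep in sep_values.items()}

for (cell, model), factor in fold_errors.items():
    print(f"{cell:6s} {model:9s} fold error for one SEP: {factor:.2f}")
```

On this reading, one SEP corresponds to a fold error of roughly 1.7 to 2.3, which is in line with the "factor of two" description.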


Figure 3. Performance of PaCa2 nanoparticle uptake model for the training set (black dots) and test set (red triangles). Each point represents a different type of surface-modified nanoparticle.

There is almost no overlap in the identity of the 11 descriptors in the HUVEC model and the 19 descriptors in the PaCa2 model, suggesting that the two cell types respond quite differently to the nanoparticle surface chemistry. For example, there are two π electronegativity autocorrelation descriptors in the HUVEC model while there are three σ electronegativity autocorrelation descriptors in the PaCa2 model. This suggests that our models capture important aspects of how nanoparticle surface chemistry controls uptake in different cell types.


The value of a QSAR model lies in its ability to guide the design and synthesis of novel materials having optimum properties, for example, uptake by pancreatic cancer cells. To implement this in practice, an ideal QSAR model should have good predictive performance while using molecular descriptors that are easy for chemists to interpret. Consequently, we also generated nano-QSAR models of nanoparticle uptake by HUVEC and PaCa2 cells using optimal subsets of descriptors derived from a set of 124 chemically interpretable descriptors (Table 4).

Table 4. Statistics for the nanoparticle uptake models with interpretable descriptors

| Cell type | Model | Descriptors | r² (train) | SEE | r² (test) | SEP |
|---|---|---|---|---|---|---|
| HUVEC | BRANNLP | 7 | 0.55 | 0.38 | 0.72 | 0.30 |
| PaCa2 | MLREM | 8 | 0.64 | 0.26 | 0.62 | 0.32 |

There was only a slight deterioration in the statistical quality of the PaCa2 uptake model. Uptake could be predicted to within a factor of 2 when using these chemically interpretable descriptors.

According to the reference criteria, the difference between training and test set r² values should not exceed 0.3.13 Our models meet this criterion and also pass external validation, and so could be used to predict the toxicity of new, untested surface-modified nanoparticles. However, reliable predictions can only be made within the optimum prediction space (applicability domain) of the model.
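The 0.3 criterion can be checked directly against the tabulated statistics. The following is a minimal sketch using the r² values reported in Tables 1 and 4; the names `R2_GAP_LIMIT`, `r2_gap`, `gaps`, and `passes` are illustrative and not from the paper.

```python
# Check the criterion that training and test r2 should differ by at most 0.3.
# All names here are illustrative; the values come from Tables 1 and 4.

R2_GAP_LIMIT = 0.3

# (cell type, model): (r2 train, r2 test)
models = {
    ("HUVEC", "linear"): (0.74, 0.63),
    ("HUVEC", "nonlinear"): (0.70, 0.66),
    ("PaCa2", "linear"): (0.76, 0.79),
    ("PaCa2", "nonlinear"): (0.77, 0.54),
    ("HUVEC", "BRANNLP, interpretable descriptors"): (0.55, 0.72),
    ("PaCa2", "MLREM, interpretable descriptors"): (0.64, 0.62),
}


def r2_gap(r2_train: float, r2_test: float) -> float:
    """Absolute difference between training and test r2."""
    return abs(r2_train - r2_test)


gaps = {key: r2_gap(*values) for key, values in models.items()}
passes = {key: gap <= R2_GAP_LIMIT for key, gap in gaps.items()}

for key, gap in gaps.items():
    status = "ok" if passes[key] else "exceeds limit"
    print(f"{key[0]:6s} {key[1]:36s} gap = {gap:.2f} ({status})")
```

The largest gap among these models is 0.23 (the nonlinear PaCa2 model), so all six satisfy the criterion.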

Fourches et al.2 reported biological classification models for these data sets. They combined Shaw et al.’s data for all cell types, assays, and concentrations into a single biological response variable used to define active/inactive classes of activity. They did not dissect out that part of the data containing structure-activity information, and only generated classification (active/inactive) models. These models correctly predicted the class 73% of the time in spite of the inclusion of data for cell types and assays unaffected by the presence of nanoparticles. A similar approach was also adopted to model the data reported by Weissleder et al. Fourches et al. reported classification models having accuracies of 0.65-0.80 for only one of the five cell lines and for a subset of the 109 nanoparticles. In contrast, our models used only cell types and assays where a structure-activity relationship existed, and generated quantitative rather than qualitative models whose predictions were within a factor of two of the observed uptake values. These models were also derived from larger, more chemically diverse sets of nanoparticles with a broader range of biological properties than the model reported by Puzyn et al.1

Fourches et al.2 also analyzed the types of properties that successfully generated models for the PaCa2 uptake of nanoparticles. They found that lipophilicity, molecular polarizability, and hydrogen bonding were important molecular properties to discriminate between high and low uptake of nanoparticles. Our nanoparticle uptake model for HUVEC cells selected molecular size and shape, hydrogen bonding capacity, acidity, and hydrophilicity as the most relevant descriptors (see Supplementary Table 3). The descriptors relating to hydrogen bonding capacity contributed positively to the model, meaning they enhanced uptake of functionalized nanoparticles, while molecular shape (largely described by mass weighted second order mass moment) was negatively correlated with uptake by HUVEC cells. The model for uptake by PaCa2 cells used descriptors for broadly similar molecular properties (molecular size and shape, and hydrogen bonding) to those selected for the HUVEC models, although the balance between these properties was different (Supplementary Table 3). The hydrogen bonding properties of the surface functionalization again contributed positively to nanoparticle uptake by PaCa2 cells, and molecular shape again was negatively correlated with uptake. Although the mechanisms by which functionalized nanoparticles are taken up by cells are varied, complex, and still largely unclear, the positive contribution of hydrogen bonding to uptake is consistent with the cellular targeting studies of Vincent et al.14 and the general mechanism of interaction of these materials with cell surface proteins with subsequent activation of internalization mechanisms.15

Apart from using interpretable descriptors to build models, nanoparticles with desired values of cellular uptake may also be identified in large virtual libraries of surface modified nanoparticles by predicting the uptake of library members using the nano-QSAR models.

In conclusion, the present study uses two large sets of experimental nanoparticle data and novel computational modelling methods to generate robust models of cellular uptake and induction of apoptosis by metal oxide nanoparticles in several types of cells. We have shown how nano-QSAR models can provide an in silico estimation of these biological properties in untested nanomaterials, particularly metal oxides, when they employ chemically interpretable descriptors. The models can also be used to identify useful nanoparticle modifications in large virtual libraries when interpretable descriptors are not available. Ours is one of the first reports of quantitative modelling and prediction of important nanoparticle properties. Although based on limited data, the results show that machine learning modelling techniques hold considerable promise for analysis of the biological effects of nanoparticles. They may also be useful for modelling the effects of different bodily environments such as serum, plasma, or lung fluids on nanoparticle composition, as well as nanoparticle cellular uptake and interaction with cellular biochemical systems. Furthermore, analysis of nanoparticle interactions with cells can inform the development of novel and exciting modes of targeted delivery of therapeutics or diagnostics to diseased tissues.16, 17


Methods

Biological data. Two recent papers reported relatively large, high throughput studies of the biological interactions of engineered nanoparticles. Shaw et al.7 reported a study on the effects of fifty different nanoparticles in four cell lines (endothelial and smooth muscle cells, monocytes, and hepatocytes), using four biological assays (ATP content, reducing equivalents, caspase-mediated apoptosis, and mitochondrial membrane potential) in each cell line, at four concentrations per assay. The compositions of the nanoparticles, their surface coatings, surface functionalization, and measured properties are summarized in Supplementary Table 1. These experiments generated potentially sixty-four biological response variables for each of the fifty nanoparticles (3200 data points). As the data did not allow an EC50 or similar parameter to be calculated from the dose-response data, we used the slope of the dose-response curve (effectively the rate of change of biological response with concentration of the nanoparticles) as a dependent variable in the analyses. We generated models either using all of the data in the model, or splitting the data into a training set of 26 and a test set of 6 nanoparticles using a k-means clustering method.
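As an illustration of using a dose-response slope as the dependent variable, the sketch below fits a straight line through the responses at the assay concentrations by ordinary least squares and returns its slope. The text does not state how the slope was estimated, so the fitting method, the function name `dose_response_slope`, and the example numbers are all assumptions made for illustration.

```python
# Estimate the slope of a dose-response curve by ordinary least squares.
# Assumption: a straight-line fit on untransformed concentrations; the paper
# does not specify the fitting procedure. Example values are hypothetical.

def dose_response_slope(concentrations, responses):
    """Least-squares slope of response versus concentration."""
    n = len(concentrations)
    if n != len(responses) or n < 2:
        raise ValueError("need at least two matched (concentration, response) pairs")
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    if sxx == 0:
        raise ValueError("concentrations must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, responses))
    return sxy / sxx


# Four concentrations per assay, as in the bioassay data set (arbitrary units).
example_concentrations = [1.0, 2.0, 3.0, 4.0]
example_responses = [0.5, 1.0, 1.5, 2.0]

slope = dose_response_slope(example_concentrations, example_responses)
print(f"slope = {slope:.3f} response units per concentration unit")
```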

Weissleder et al.12 screened a library of 109 fluorescent nanoparticles sharing a common superparamagnetic iron oxide core and dextran coating, whose surfaces were conjugated with diverse small molecules. Nanoparticles were evaluated for cellular uptake (in human umbilical vein endothelial cells (HUVEC), primary resting human macrophages (RestMph), granulocyte macrophage colony stimulating factor–stimulated human macrophages (GMCSF_Mph), a U937 human macrophage-like cell line (U937), and human pancreatic ductal adenocarcinoma cells (PaCa2)), measured by well fluorescein isothiocyanate (FITC) concentrations. Experimental data were used as their log10 transform, and clustering was used to divide the data into a test set of 21 molecules (i.e. 20% of the data) and a training set of 87 molecules (80% of the data).

In assessing whether assays contain useful biological information, the z-scored data were used, where

$$Z_{\mathrm{NP}} = \frac{\mu_{\mathrm{NP}} - \mu_{\mathrm{PBS}}}{\sigma_{\mathrm{PBS}}}$$

Here μ and σ are the mean and standard deviation of assay replicates, respectively, and the NP and PBS subscripts denote assays in the presence of nanoparticles and of PBS buffer controls, respectively. Assays where most of the dose-response curves fell within a z-score of ±2 were considered to demonstrate negligible effect.
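The z-score screen described above can be sketched as follows. Two choices here are assumptions rather than details from the text: the sample standard deviation is used for the PBS controls, and "most" is read as more than half of the z-scores. The function names and replicate values are hypothetical.

```python
# z-score of a nanoparticle assay relative to PBS buffer controls, and a
# simple screen for assays showing negligible effect.
# Assumptions: sample standard deviation; "most" means more than half.
from statistics import mean, stdev


def z_score(np_replicates, pbs_replicates):
    """(mean of NP replicates - mean of PBS replicates) / SD of PBS replicates."""
    return (mean(np_replicates) - mean(pbs_replicates)) / stdev(pbs_replicates)


def negligible_effect(z_scores, threshold=2.0, min_fraction=0.5):
    """True when more than min_fraction of z-scores lie within +/- threshold."""
    inside = sum(1 for z in z_scores if abs(z) <= threshold)
    return inside / len(z_scores) > min_fraction


# Hypothetical replicate readings for one nanoparticle and the PBS controls.
pbs = [10.0, 12.0, 11.0, 9.0, 13.0]
nanoparticle = [15.0, 16.0, 14.0]

z = z_score(nanoparticle, pbs)
print(f"z = {z:.2f}")
print("negligible effect:", negligible_effect([0.5, -1.2, 2.5, 1.9]))
```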

Nanoparticle characterization. The data from Shaw et al.7 (bioassay data set) largely investigated the effects of the nanoparticle core on biological responses. The nanoparticle cores consisted of Fe2O3 and Fe3O4 metal oxides and CdSe quantum dots. The nanoparticle coatings were largely cross-linked dextran, polyvinyl alcohol, or amphiphilic polymers such as polyethylene glycol (PEG). Surface modification in most cases resulted in basic (amine) or acidic (carboxylate) functional groups (Supplementary Table 1). The sizes of the nanoparticles were very similar (approximately 30 nm) except for one nanoparticle that was approximately double the size. Consequently, nanoparticle size was excluded as a model parameter. The quantum dot nanoparticles were used at 1000-fold lower concentration than the iron oxide nanoparticles and so were excluded from the study.

The data set reported by Weissleder et al.12 (cellular uptake data set) used chemical reactions to conjugate a range of small organic molecules to the nanoparticle surface. Scheme 1 of the Supplementary Information depicts the reactions and final products involved. We removed one molecule, the multifunctional diethylenetriaminepentaacetic dianhydride, due to uncertainty as to the final reaction product conjugated to the nanoparticles. The covalently bound molecules generated are listed in Table 2 of the Supplementary Information. The surface-conjugated molecules are of varied chemical structures belonging to the anhydride (mostly cyclic), amine, and amino acid classes. Amino acids were considered to have undergone the same chemistry as amines. In the case of asymmetrically substituted cyclic anhydrides there is some ambiguity as to the unique final product. We resolved this ambiguity for the anhydrides by truncating the capping amide group, i.e. having a carboxylate functional group at each end of the molecule. The structures of the molecules were constructed with Sybyl-X v.1.2 (Tripos, Inc.) and their three-dimensional geometries optimized with Concord v 6.1.3 (Tripos, Inc.).

Modelling. We employed both linear modelling methods (simple multiple linear regression, MLR, and sparse linear modelling and feature selection, MLREM5) and two nonlinear Bayesian regularized artificial neural network methods to construct nano-QSAR models of biological effects of nanoparticles. The nonlinear modelling methods comprised feed-forward, fully connected networks with single input, hidden, and output layers. The complexity of the nonlinear models was controlled by Bayesian regularization, using Gaussian3 and Laplacian4 priors (BRANNGP and BRANNLP methods, respectively). Sparse Laplacian priors automatically prune irrelevant descriptors and network weights, leading to sparse, robust models. In modelling the data with the BRANNGP method, we employed the same sparse, optimal descriptor set that had been selected by the MLREM protocol.
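To make the role of the two priors concrete, the sketch below shows a single-hidden-layer forward pass and the penalised cost that a regularized network minimises. It relies on the standard correspondence in which a Gaussian prior on the weights contributes a sum-of-squares penalty and a Laplacian prior contributes a sum-of-absolute-values penalty. The tanh hidden units, linear output, fixed hyperparameters `alpha` and `beta`, and all function names are assumptions; this is not the BRANNGP or BRANNLP implementation, which also estimates the hyperparameters from the data.

```python
# Single-hidden-layer network forward pass and a regularized cost.
# Assumptions: tanh hidden units, linear output, fixed alpha and beta.
# Illustrative only; not the BRANNGP/BRANNLP implementation.
from math import tanh


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Network output for one input vector x."""
    hidden = [tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
              for weights, bias in zip(hidden_weights, hidden_biases)]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


def penalised_cost(errors, weights, alpha, beta, prior="gaussian"):
    """beta * data misfit + alpha * weight penalty implied by the prior."""
    data_term = 0.5 * sum(e * e for e in errors)
    if prior == "gaussian":
        weight_term = 0.5 * sum(w * w for w in weights)  # sum of squares
    elif prior == "laplacian":
        weight_term = sum(abs(w) for w in weights)       # sum of magnitudes
    else:
        raise ValueError("prior must be 'gaussian' or 'laplacian'")
    return beta * data_term + alpha * weight_term


# Tiny hypothetical network: two inputs, one hidden unit, one output.
output = forward([0.0, 0.0], [[1.0, 1.0]], [0.0], [2.0], 0.5)
cost_gaussian = penalised_cost([1.0, -1.0], [3.0, -1.0], alpha=1.0, beta=1.0)
cost_laplacian = penalised_cost([1.0, -1.0], [3.0, -1.0], alpha=1.0, beta=1.0,
                                prior="laplacian")
print(output, cost_gaussian, cost_laplacian)
```

The absolute-value penalty has a constant gradient magnitude however small a weight becomes, which is why it can drive weights exactly to zero, whereas the squared penalty only shrinks them.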

For the smooth muscle apoptosis model the size of the data set was relatively small, so only a limited number of descriptors could be screened for inclusion in the model without the risk of chance correlations or overfitting of the model. Consequently, we used indicator variables to describe the nanoparticles. These took values of 0 or 1 depending on whether a particular nanoparticle feature was absent or present. Features encoded in this way were the nature of the nanoparticle core (+1 for Fe2O3 and 0 for Fe3O4), type of coating (+1 for dextran and 0 for other coatings), and nature of surface functionality (encoded as +1 (basic), -1 (acidic) or 0 (neutral)). We also used measured nanoparticle diameter, relaxivities R1 and R2, and zeta potential as descriptors (see Supplementary Table 1). One nanoparticle (NP19) was a very large outlier in the models and was excluded (reported models were generated from data for 31 nanoparticles). Development of nanoparticle-specific descriptors is an important research need in this field.18
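The indicator encoding can be combined with the linear nano-QSAR equation for SMA reported earlier in the paper. The sketch below uses the published point estimates of the coefficients and ignores their uncertainties; the function and dictionary names, and the label strings used for cores, coatings, and surface groups, are hypothetical.

```python
# Encode a nanoparticle with indicator variables and evaluate the linear
# SMA equation reported in the paper (point estimates only).
# Names and label strings are hypothetical.

COEFFICIENTS = {
    "intercept": 2.26,
    "fe2o3": -10.73,     # core indicator: 1 for Fe2O3, 0 for Fe3O4
    "dextran": -5.57,    # coating indicator: 1 for dextran, 0 otherwise
    "surf_chg": -3.53,   # surface functionality: +1 basic, -1 acidic, 0 neutral
}

SURFACE_CODES = {"basic": 1, "acidic": -1, "neutral": 0}


def encode(core, coating, surface):
    """Map nanoparticle features onto the indicator variables."""
    return {
        "fe2o3": 1 if core == "Fe2O3" else 0,
        "dextran": 1 if coating == "dextran" else 0,
        "surf_chg": SURFACE_CODES[surface],
    }


def predict_sma(core, coating, surface):
    """Predicted smooth muscle apoptosis response from the linear model."""
    indicators = encode(core, coating, surface)
    return COEFFICIENTS["intercept"] + sum(
        COEFFICIENTS[name] * value for name, value in indicators.items())


print(predict_sma("Fe2O3", "dextran", "basic"))
print(predict_sma("Fe3O4", "PEG", "neutral"))
```

As with any QSAR model, such predictions are only meaningful for particles that resemble those in the training data.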

To model the cellular uptake of nanoparticles, we calculated a set of 691 molecular descriptors, drawn from different sources, for each molecular species on the nanoparticle surface. Descriptors included constitutional, topological, path counts, connectivity indices, information indices, edge adjacency indices, topological charge, eigenvalue-based indices, 2D binary fingerprints, and 2D frequency fingerprints classes from DRAGON v 5.5 (ref 19); 2D autocorrelation descriptors from DRAGON (ref 19) and ADRIANA v 2.2 (ref 20); and atomistic, Burden index, and binned charge descriptors (refs 6, 21, 22) computed with an in-house modelling software package. We also used a smaller set of chemically interpretable descriptors of the following descriptor classes: constitutional, functional group counts, topological, geometrical, and atom-centered fragments (see Supplementary Table 3 for descriptions of these descriptors).

We used the multiple linear regression with expectation maximization (MLREM) sparse feature reduction method to select a small set of the relevant descriptors from the pool of 691 descriptors5. The MLREM method was applied repeatedly at increasing levels of sparsity by varying the values of the control hyperparameters5. The minimal and optimum set of molecular descriptors was taken at the level of sparsity beyond which the quality of the model starts to deteriorate significantly (deterioration of the model at an increased level of sparsity signifies that some relevant descriptors have been removed).
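The idea of refitting at increasing levels of sparsity and watching which descriptors survive can be illustrated with a stand-in method. The sketch below uses L1-penalised least squares solved by coordinate descent, which is a different algorithm from the expectation-maximization procedure of MLREM; it is shown only because it exhibits the same qualitative behaviour. The toy descriptor matrix, penalty values, and function names are hypothetical.

```python
# Stand-in for a sparsity sweep: L1-penalised least squares by coordinate
# descent. This is NOT the MLREM expectation-maximization algorithm; it only
# illustrates how descriptors drop out as the sparsity setting increases.

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_fit(X, y, penalty, n_sweeps=100):
    """Minimise (1/2n) * sum((y - Xw)^2) + penalty * sum(|w|) over w."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0   # covariance of descriptor j with the partial residual
            norm = 0.0  # sum of squares of descriptor j
            for row, target in zip(X, y):
                others = sum(row[k] * w[k] for k in range(p) if k != j)
                rho += row[j] * (target - others)
                norm += row[j] ** 2
            w[j] = soft_threshold(rho / n, penalty) / (norm / n) if norm else 0.0
    return w


# Hypothetical, centred toy descriptors: the response depends strongly on
# descriptor 0, weakly on descriptor 1, and not at all on descriptor 2.
X = [[-2.0, 1.0, 1.0], [-1.0, -1.0, 0.0], [0.0, 0.0, -2.0],
     [1.0, -1.0, 0.0], [2.0, 1.0, 1.0]]
y = [3.0 * row[0] + 0.5 * row[1] for row in X]

penalties = [0.01, 0.1, 0.5, 7.0]  # increasing sparsity
selected = {}
for penalty in penalties:
    weights = sparse_fit(X, y, penalty)
    selected[penalty] = [j for j, w in enumerate(weights) if w != 0.0]
    print(penalty, selected[penalty])
```

In this toy case descriptors 0 and 1 survive at the two smallest penalties, only descriptor 0 survives at 0.5, and none survive at 7.0. Losing descriptor 1, which carries real signal, is the kind of deterioration that marks the point beyond which sparsity should not be increased.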


The quality of the models was assessed by the values of the (squared) correlation coefficient, r², and by the standard error of estimation (SEE) for the training set and the standard error of external prediction (SEP) for the test set.
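These statistics can be computed as in the sketch below. The squared Pearson correlation between observed and predicted values is used for r². For the standard error, the text does not give the degrees-of-freedom convention, so the `n_params` argument is an assumption: with the default of 0 the function returns the root mean squared error. The same function would serve for SEE (training set) and SEP (test set). Function names and example values are hypothetical.

```python
# Squared correlation coefficient and standard error between observed and
# predicted values. Assumption: the degrees-of-freedom correction (n_params)
# is left to the caller because the paper does not state its convention.
from math import sqrt


def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def standard_error(observed, predicted, n_params=0):
    """sqrt(sum of squared residuals / (n - n_params))."""
    n = len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return sqrt(sse / (n - n_params))


# Hypothetical observed and predicted values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r2 = squared_correlation(observed, predicted)
see = standard_error(observed, predicted)
print(f"r2 = {r2:.3f}, standard error = {see:.3f}")
```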

Supporting Information Available: Reaction scheme for surface modification, measured properties of nanoparticles in the bioassay data set, small molecule reagents used to functionalize the nanoparticle surfaces, and descriptors used to generate cellular uptake models. This material is available free of charge via the Internet at http://pubs.acs.org.

Acknowledgement The authors acknowledge support of the CSIRO Advanced Materials Transformational Capability Platform, and assistance from Dr. Richard Evans on chemical reaction products.

Author contributions S.S., C.T., and R.W. generated the experimental data; V.E., F.B., and D.W. selected optimal structural descriptors, developed and validated the QSAR models, and discussed the results. V.E. and D.W. wrote the paper.

Additional information The authors declare no competing financial interests. Correspondence and requests for materials should be addressed to DAW.


References

1) Puzyn, T.; Rasulev, B.; Gajewicz, A.; Hu, X. K.; Dasari, T. P.; Michalkova, A.; Hwang, H. M.; Toropov, A.; Leszczynska, D.; Leszczynski, J. Nature Nanotech. 2011, 6, (3), 175-178.
2) Fourches, D.; Pu, D. Q. Y.; Tassa, C.; Weissleder, R.; Shaw, S. Y.; Mumper, R. J.; Tropsha, A. ACS Nano 2010, 4, (10), 5703-5712.
3) Burden, F. R.; Winkler, D. A. J. Med. Chem. 1999, 42, (16), 3183-3187.
4) Burden, F. R.; Winkler, D. A. QSAR Comb. Sci. 2009, 28, (10), 1092-1097.
5) Burden, F. R.; Winkler, D. A. QSAR Comb. Sci. 2009, 28, (6-7), 645-653.
6) Winkler, D. A.; Burden, F. R. Mol. Simulat. 2000, 24, (4-6), 243-+.
7) Shaw, S. Y.; Westly, E. C.; Pittet, M. J.; Subramanian, A.; Schreiber, S. L.; Weissleder, R. Proc. Natl. Acad. Sci. USA 2008, 105, (21), 7387-7392.
8) Rezwan, K.; Meier, L. P.; Rezwan, M.; Voros, J.; Textor, M.; Gauckler, L. J. Langmuir 2004, 20, (23), 10055-10061.
9) Rezwan, K.; Studart, A. R.; Voros, J.; Gauckler, L. J. J. Phys. Chem. B 2005, 109, (30), 14469-14474.
10) Shubayev, V. I.; Pisanic, T. R.; Jin, S. H. Adv. Drug Deliv. Rev. 2009, 61, (6), 467-477.
11) Monteiller, C.; Tran, L.; MacNee, W.; Faux, S.; Jones, A.; Miller, B.; Donaldson, K. Occup. Environ. Med. 2007, 64, (9), 609-615.
12) Weissleder, R.; Kelly, K.; Sun, E. Y.; Shtatland, T.; Josephson, L. Nature Biotech. 2005, 23, (11), 1418-1423.
13) Eriksson, L.; Jaworska, J.; Worth, A. P.; Cronin, M. T.; McDowell, R. M.; Gramatica, P. Environ. Health Persp. 2003, 111, (10), 1361-75.
14) Vincent, A.; Babu, S.; Heckert, E.; Dowding, J.; Hirst, S. M.; Inerbaev, T. M.; Self, W. T.; Reilly, C. M.; Masunov, A. E.; Rahman, T. S.; Seal, S. ACS Nano 2009, 3, (5), 1203-1211.
15) Albanese, A.; Tang, P. S.; Chan, W. C. Ann. Rev. Biomed. Eng. 2012, 14, 1-16.
16) Khemtong, C.; Kessinger, C. W.; Gao, J. M. Chem. Commun. 2009, (24), 3497-3510.
17) Tan, S. J.; Kiatwuthinon, P.; Roh, Y. H.; Kahn, J. S.; Luo, D. Small 2011, 7, (7), 841-856.
18) Clark, K. A.; White, R. H.; Silbergeld, E. K. Regul Toxicol Pharm 2011, 59, (3), 361-363.
19) Mauri, A.; Consonni, V.; Pavan, M.; Todeschini, R. MATCH-Comm. Math. Co. 2006, 56, (2), 237-248.
20) Sadowski, J.; Wagener, M.; Gasteiger, J. Angew. Chem. Int. Edn. 1995, 34, (23-24), 2674-2677.
21) Burden, F. R.; Polley, M. J.; Winkler, D. A. J. Chem. Inf. Mod. 2009, 49, (3), 710-715.
22) Winkler, D. A.; Burden, F. R.; Watkins, A. J. R. Quant. Struct.-Act. Rel. 1998, 17, (1), 14-19.


TOC graphic
