Article

Comparative proteomics of human and macaque milk reveals species-specific nutrition during post-natal development Kristen Lina Beck, Darren Weber, Brett S Phinney, Jennifer T. Smilowitz, Katie Hinde, Bo Lönnerdal, Ian Korf, and Danielle G. Lemay J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/pr501243m • Publication Date (Web): 11 Mar 2015 Downloaded from http://pubs.acs.org on March 16, 2015

Just Accepted “Just Accepted” manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides “Just Accepted” as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. “Just Accepted” manuscripts appear in full in PDF format accompanied by an HTML abstract. “Just Accepted” manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI®). “Just Accepted” is an optional service offered to authors. Therefore, the “Just Accepted” Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the “Just Accepted” Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these “Just Accepted” manuscripts.

Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

Journal of Proteome Research

Comparative proteomics of human and macaque milk reveals species-specific nutrition during postnatal development

Kristen L. Beck1, Darren Weber2, Brett S. Phinney2, Jennifer T. Smilowitz3, Katie Hinde4, Bo Lönnerdal5, Ian Korf1, Danielle G. Lemay1*

*Corresponding author

1 Genome Center, University of California Davis, 451 Health Sciences Drive, Davis, CA 95616

2 Proteomics Core, University of California Davis, 451 Health Sciences Drive, Davis, CA 95616

3 Department of Food Science and Technology, University of California Davis, One Shields Avenue, Davis, CA 95616

4 Department of Human Evolutionary Biology, Harvard University, 11 Divinity Avenue, Cambridge, MA 02138

5 Department of Nutrition, University of California Davis, One Shields Avenue, Davis, CA 95616

KEYWORDS Lactation, milk, proteome, LC-MS/MS, human, macaque, development, nutrition, comparative biology, infant

ABSTRACT

Milk has been well established as the optimal nutrition source for infants, yet there is still much to be understood about its molecular composition. Therefore, our objective was to develop and compare comprehensive milk proteomes for human and rhesus macaques to highlight differences in neonatal nutrition. We developed a milk proteomics technique that overcomes previous technical barriers including pervasive post-translational modifications and limited sample volume. We identified 1,606 and 518 proteins in human and macaque milk, respectively. During analysis of detected protein orthologs, we identified 88 differentially abundant proteins. Of these, 93% exhibited increased abundance in human milk relative to macaque and include lactoferrin, polymeric immunoglobulin receptor, alpha-1 antichymotrypsin, vitamin D-binding protein, and haptocorrin. Furthermore, proteins more abundant in human milk compared to macaque are associated with development of the gastrointestinal tract, the immune system, and the brain. Overall, our novel proteomics method reveals the first comprehensive macaque milk proteome and 524 newly identified human milk proteins. The differentially abundant proteins observed are consistent with the perspective that human infants, compared to non-human primates, are born at a slightly earlier stage of somatic development and require additional support through higher quantities of specific proteins to nurture human infant maturation.

INTRODUCTION

Milk serves as the first source of nutrition for mammalian infants, and it provides numerous health benefits. In humans, breast-feeding has been shown to decrease the rate of asthma,1 diabetes,2,3 and morbidity or mortality due to diarrheal diseases.4 Despite these advantages, only 18.8% of mothers achieve the recommendation of the American Academy of Pediatrics to exclusively breast-feed for the first six months.5 While infant formula is designed to meet the infant’s nutrient requirements,6 breast milk alternatives do not provide the multitude of complex molecules found in breast milk that confer various bioactivities.

Milk proteins serve a variety of functions in addition to their role as the primary source of amino acids. For example, bioactive peptides resulting from proteolysis of milk decrease the growth of E. coli and S. aureus in antibacterial assays.7 In addition, milk proteins contribute substantially to the development of the infant’s immune system.8 Several proteins such as lactoferrin (LF), lysozyme (LYZ), and immunoglobulin A (IgA) are more abundant in human milk than in bovine milk and serve to protect the infant and signal to its immune system.8-10 By better characterizing the milk proteome, we aim to learn more about factors potentially affecting early development of infants and to use these data to inform the composition of infant formula.

In addition to serving a crucial role in infant development and nutrition, milk proteins have recently been shown to have relevance to two disease states. Arcaro et al. identified five proteins to be differentially abundant in milk of mothers with pregnancy-associated breast cancer.11 Furthermore, Mange et al. found that the presence of two proteins in breast milk correlates with viral transmission from HIV-infected mothers to the infant.12 These findings provide an avenue for non-invasive diagnostics for two diseases that affect mothers and infants globally. However, further exploration of these discoveries as a potential diagnostic tool will require research in a non-human animal model.

The rhesus macaque (Macaca mulatta) is the most frequently used non-human primate in biomedical research13 because it has numerous physiological and developmental similarities with humans. First, the developmental and hormonal responses of mammary glands in macaques show a high degree of similarity to those of the human breast.14 Second, the casein to whey ratio (60:40) is comparable in human and macaque milk,15 much more so than in bovine milk (casein to whey ratio16 of 82:18), indicating a higher level of conservation between the two primate species. Third, infant rhesus macaques are the only animal model that can be fed human infant formula long-term without any nutritional manipulations while still meeting all of their nutrient requirements.17 Fourth, human and macaque infants have been shown to have similar physical, microbiological,18,19 and metabolic20 developmental trajectories. Due to these similarities, the insights gained from the macaque milk proteome are more likely to be applicable to improvements in human infant nutritional strategies.

While the macaque milk proteome can provide many insights into human health from a nutritional, diagnostic, and even evolutionary perspective,21 the current understanding of macaque milk proteins is very limited. Over twenty years ago, Kunz and Lönnerdal characterized the macaque milk proteome using SDS gel electrophoresis and identified 9 proteins.15 Higher yield tandem mass spectrometry and newer technologies have not been applied to the macaque milk proteome until now. This is particularly important because milk proteomics studies in other species22-25 indicate there should be orders of magnitude more proteins in macaque milk than currently described in the literature.

Furthermore, even the most advanced published human milk proteomes do not provide a comprehensive catalog of all milk proteins. Work by our lab and others has shown that over 10,000 genes are expressed in the mammary gland during lactation (human, rhesus macaque, and bovine26,27). Yet the most detailed proteomes for any species15,22-25,28,29 have identified fewer than 20% of the resultant protein products. While not every mRNA will yield a protein product in milk, studies have shown mRNA and protein concentrations to be clearly correlated (R² = 0.47–0.71),30-32 indicating that only a fraction of all possible proteins in milk have been identified. Therefore, we improved current milk proteomics techniques and applied them to a high throughput investigation of human and macaque milks. Additionally, we compared the protein contents of the two species to better understand evolutionary patterns in composition and abundance.

EXPERIMENTAL METHODS

Milk Sample Collection

Samples were collected as previously described.26 In summary, human milk samples (n=3) were collected from women enrolled in the UC Davis Foods for Health Institute Lactation Study (additional information on maternal characteristics in Additional File 1). All aspects of this study were approved by the UC Davis institutional review board, and all women provided their written informed consent. This trial was registered on clinicaltrials.gov (ClinicalTrials.gov Identifier: NCT01817127).

Rhesus macaque milk samples (n=5) were collected at the California National Primate Research Center using methods approved by the Animal Care and Use Committee at the University of California, Davis. Healthy mothers of both species were selected based on the following: multiparous, no incidence of metabolic diseases, delivered a male infant, and at peak lactation. Macaque milk was collected at approximately 90 days postpartum; human milk was collected at approximately 180 days postpartum. Milk samples were collected between 7:00 am – 9:00 am and stored at -80°C until needed. Samples from each donor were analyzed independently without pooling.

Sample Preparation

Equal parts thawed whole milk (15 µl) and NuPAGE LDS Sample Buffer 4x (Invitrogen, Grand Island NY) were centrifuged at 10,000 g for 1 min. Gradient gel electrophoresis was then completed on 3 human and 5 macaque milk samples (Novex 10-20% Tris-Glycine Mini Gel, Grand Island NY). The cathodic and anodic compartments were filled with 5% MOPS/SDS running buffer (Teknova, Hollister CA) and the gel was run to completion. Protein bands were stained for 20 min with InstantBlue protein stain (Expedeon, San Diego CA) and imaged on a FujiFilm LAS-4000 digital imaging system. The gel lane for each sample was separated, and 8 gel slices were cut from it. In-gel reduction with dithiothreitol (DTT) and alkylation with iodoacetamide were completed, followed by enzymatic cleavage of N-linked glycans using Glycerol Free PNGase F (New England Biolabs, Ipswich MA). Gel pieces were incubated with 250 µl ammonium bicarbonate and 1.5 µl of the enzyme for 4 h at 37°C. After incubation, the liquid was discarded, and the gel slices were rinsed 2 times with acetonitrile and dried under vacuum. A tryptic digest and peptide extraction were then completed as previously described.33-35 Samples were stored at -80°C until ready for analysis with LC-MS/MS.

Protein Identification by Liquid Chromatography and Tandem Mass Spectrometry

LC-MS/MS analysis was performed with a standard top 10 method using a Thermo Scientific QExactive Orbitrap mass spectrometer in conjunction with a Paradigm MG4 HPLC (Michrom Bioresources, Auburn CA). The digested peptides were loaded onto a Michrom C18 trap and desalted before they were separated using a Michrom 200 µm x 150 mm Magic C18AQ reverse phase column. A flow rate of 2 µL/min was used. Peptides were eluted using a 120 min gradient with 2% B to 35% B over 100 min, 35% B to 80% B for 7 min, 80% B for 2 min, and then a decrease from 80% to 5% B in 1 min, and held at 98% A for 10 min (A = 0.1% formic acid, B = 100% acetonitrile). The instrument was run in a data-dependent mode using the following settings: spray voltage, 2.2 kV; ion transfer capillary temperature, 200°C; scan range, 300–1600 m/z; MS maximum injection time, 2 ms; MS automatic gain control, 1e6; MS/MS maximum injection time, 250 ms; MS/MS automatic gain control, 5e4; normalized collision energy, 27; dynamic exclusion, 60 s. Precursor resolution was set to 70,000 and product ion resolution was set to 17,500. The intensity threshold was set to 4.3e3 and the underfill ratio was set to 1%. Two technical LC-MS/MS analyses were carried out for each gel slice.
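The gradient program described above can be written down as a small table and checked for internal consistency. The following is a minimal sketch, assuming a simple list-of-segments representation that is ours and not part of the instrument method file; it confirms that the stated segments add up to the 120 min run and interpolates %B at a given time. The final hold is entered as 2% B because the text states 98% A.

```python
# Gradient segments as (description, duration in min, start %B, end %B),
# transcribed from the method text. The data layout is an illustrative assumption.
GRADIENT = [
    ("ramp 2% B to 35% B", 100, 2, 35),
    ("ramp 35% B to 80% B", 7, 35, 80),
    ("hold at 80% B", 2, 80, 80),
    ("drop 80% B to 5% B", 1, 80, 5),
    ("hold at 98% A (2% B)", 10, 2, 2),
]


def total_minutes(segments):
    """Sum the durations of all gradient segments."""
    return sum(duration for _, duration, _, _ in segments)


def percent_b_at(segments, t):
    """Linearly interpolate %B at time t (minutes from the start of the run)."""
    elapsed = 0.0
    for _, duration, start_b, end_b in segments:
        if t <= elapsed + duration:
            fraction = (t - elapsed) / duration
            return start_b + fraction * (end_b - start_b)
        elapsed += duration
    raise ValueError("t is beyond the end of the gradient")
```

For example, `total_minutes(GRADIENT)` gives the overall run length, and `percent_b_at(GRADIENT, 50)` gives the solvent composition halfway through the main ramp.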

Database Searches and Protein Assignment

Tandem mass spectra with charge states greater than +1 were analyzed using X! Tandem from The GPM36 (thegpm.org, version CYCLONE 2013.02.01). X! Tandem searched the UniProt human complete proteome set (2013_02_25; 70,136 entries) and a custom database for macaque (36,410 entries). To build a better-annotated custom macaque database, we first retrieved peptide sequences from Ensembl (assembly MMUL_1, release 73, pep.all), and then used NCBI BLAST to query the UniProtKB/Swiss-Prot (swissprot) and Non-redundant protein sequences (nr) databases. Protein descriptions for the 20 highest scoring hits were aggregated using BLAST2GO and incorporated into our custom database as FASTA headers. For both species, in addition to these protein databases, 110 non-human common laboratory contaminants from the common repository of adventitious proteins database (thegpm.org) and an equal number of reverse sequences assuming tryptic digest were included in the database searching. X! Tandem searches used a precursor and fragment mass error of 20 ppm. Carbamidomethyl was specified as a complete modification. Oxidation of tryptophan and methionine; deamidation of asparagine and glutamine (higher due to PNGase F treatment); and dioxidation of methionine were specified as potential modifications. Scaffold (version 4, Proteome Software Inc.,37 Portland OR) was used to validate MS/MS-based peptide and protein identifications. Peptide identifications were accepted if they had less than 1% peptide decoy False Discovery Rate (FDR), and protein identifications were accepted with a minimum of one identified peptide and an approximate protein decoy FDR of 5%. The data have been made available at the MassIVE proteome repository (http://massive.ucsd.edu, MassIVE ID: MSV000079045).
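To illustrate the decoy-based FDR thresholds mentioned above, the sketch below estimates FDR as the ratio of decoy (reversed-sequence) hits to target hits above a score cutoff. This is the generic target-decoy estimate, not necessarily the exact calculation performed by Scaffold; the function names and the `(score, is_decoy)` input format are assumptions made for illustration.

```python
def decoy_fdr(matches, threshold):
    """Estimate FDR as (#decoy hits) / (#target hits) at or above a score threshold.

    `matches` is a list of (score, is_decoy) pairs; higher scores are better.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0 if decoys == 0 else float("inf")
    return decoys / targets


def lowest_threshold_for_fdr(matches, max_fdr):
    """Return the most permissive score threshold whose estimated FDR is <= max_fdr."""
    best = None
    for threshold in sorted({score for score, _ in matches}, reverse=True):
        if decoy_fdr(matches, threshold) <= max_fdr:
            best = threshold
    return best
```

Under this scheme, a 1% peptide-level cutoff corresponds to calling `lowest_threshold_for_fdr(matches, 0.01)` on the scored peptide-spectrum matches.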

Homolog Identification and Quantitative Analysis

Human and macaque FASTA databases containing the proteins detected within the filtered thresholds described above were used in a reciprocal best BLAST approach and in-paralog clustering using InParanoid38 (version 4.1 for command line, http://inparanoid.sbc.su.se). The default search parameters were used. This generated ortholog clusters with high sequence identity (775/792 protein orthologs were over 98% identical), and therefore our comparative analysis could be completed on spectral counts. To complete relative abundance comparisons for each ortholog cluster, raw spectral counts for human and macaque proteins were normalized using a 5% trimmed mean.39-43 The resulting quantitative count data were fit using a quasi-Poisson model and a log2 fold change was calculated.44,45 A run effect was also applied, as one sample each of human and macaque milk was run several months prior to the remaining samples. Using the Student’s t-test to compute p-values and applying the Benjamini-Hochberg method to correct for multiple hypothesis testing, we considered proteins to be differentially abundant at an adjusted p-value < 0.05.
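The quantitative steps above can be sketched in a few lines. The code below is a simplified illustration under stated assumptions: it scales each sample by its 5% trimmed mean, computes a log2 fold change from group means (with a pseudocount that is our addition), and applies the Benjamini-Hochberg adjustment. It does not fit the quasi-Poisson model or the run effect used in the study, and all function names are hypothetical.

```python
import math


def trimmed_mean(values, trim=0.05):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)


def normalize_counts(samples, trim=0.05):
    """Scale each sample so its trimmed mean equals the average trimmed mean.

    `samples` maps sample name -> list of spectral counts (same protein order).
    Assumes every sample has a non-zero trimmed mean.
    """
    means = {name: trimmed_mean(counts, trim) for name, counts in samples.items()}
    target = sum(means.values()) / len(means)
    return {name: [c * target / means[name] for c in counts]
            for name, counts in samples.items()}


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of group means; the pseudocount avoids division by zero."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log2((mean_a + pseudocount) / (mean_b + pseudocount))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A protein would then be called differentially abundant when its adjusted p-value from `benjamini_hochberg` falls below 0.05.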

Functional Enrichment Analysis

Orthologs were subjected to Gene Ontology (GO) term analysis with DAVID: Functional Annotation Tool (http://david.abcc.ncifcrf.gov/).46 GO_BP_FAT (biological process), GO_MF_FAT (molecular function), and KEGG_PATHWAY were the selected settings. Tissue specificity was queried using the UP_TISSUE (UniProt Tissue) and GNF_U133A_QUARTILE (Genomics Institute of the Novartis Research Foundation) categories.47 Functional annotation clustering and functional annotation charts were reviewed for significant enrichments (p-value < 0.05 corrected for multiple hypothesis testing using the Benjamini-Hochberg procedure). For GO analysis of macaque proteins, UniProt IDs were queried against Ensembl’s Biomart (MMUL_1) and the external attributes “GO Accession”, “GO Name”, and “GO Evidence Code” were selected. Duplicate entries were excluded.
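As a rough illustration of what a term enrichment p-value measures, the sketch below computes a one-sided hypergeometric tail probability. This is an assumption-laden stand-in: DAVID reports a modified Fisher exact statistic (the EASE score), which is more conservative than the plain hypergeometric test shown here, and the argument names are ours.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for a protein list of size n drawn from a background of N,
    where K background proteins carry the annotation term and k list
    proteins carry it."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total
```

The resulting p-values for all tested terms would then be corrected with the Benjamini-Hochberg procedure before applying the 0.05 cutoff.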

RESULTS

Novel proteomics method aids detection of proteins in human and macaque milk

For the proteomics investigation of human and rhesus macaque milk, we developed a small-volume, high throughput proteomics technique (Figure 1) that was found to yield a low peptide and protein decoy False Discovery Rate (FDR). For this method, 15 µl of whole milk from human (n=3) and macaque (n=5) undergoes a one-step fractionation using gradient SDS-PAGE (Figure 2) and is then physically separated into multiple gel slices. Each slice is reduced, alkylated, and subjected to two enzymatic digests: PNGase F and trypsin. PNGase F serves to cleave N-linked glycans from the protein chain. The resulting deglycosylated milk proteins undergo a tryptic digest in preparation for detection by LC-MS/MS with a QExactive Orbitrap mass spectrometer. Detected spectra are assigned to protein databases as described in Methods. In preliminary testing with human milk, PNGase F digested samples exhibited a 3-fold increase in the total number of detected spectra compared to an undigested control (Additional File 2).

Characterization of macaque milk proteins

The concentration of macaque milk proteins has been characterized in numerous studies,15,48,49 but here we provide a much more comprehensive catalog of protein sequences. We applied the aforementioned technique to whole milk from Macaca mulatta mothers and identified 518 proteins with a protein and peptide decoy FDR of 5.7% and 0.16%, respectively (Additional File 3). Some of the most abundant proteins (Table 1) are involved in transport, cellular iron homeostasis, lactose biosynthesis, innate immune response, and protein binding. The most abundant protein in macaque milk, glycodelin [Ensembl: ENSMMUP00000040154], yielded 9,977 detected exclusive spectra, yet it is not present in human milk (Additional File 4) or bovine milk.25 Glycodelin was first observed in macaque milk using immunoblotting in 199450 and is confirmed in this study with sequence level characterization and quantification. We identified 189 unique peptides to support its presence, and 77% of the total amino acid sequence was detected using LC-MS/MS. Glycodelin is involved in glandular morphogenesis51 and may have other functions. It is identified in other non-human primates such as the crab-eating macaque (Macaca fascicularis, UniProt: G7PR69) and yellow baboon (Papio cynocephalus, UniProt: O77511) with sequence identity > 95%. Yet for human and bovine, the two most similar proteins exhibit 50% or less sequence identity (glycodelin/progestagen associated endometrial protein, UniProt: P09466 and beta-lactoglobulin, UniProt: P02754, respectively). These two proteins have functional similarity and a putative common origin,52,53 but have not been associated with glycodelins found in non-human primates.

Some of the other highly abundant proteins, including beta-casein [Ensembl: ENSMMUP00000006446], kappa-casein [Ensembl: ENSMMUP00000007920], and lactoferrin [Ensembl: ENSMMUP00000018727], have also been identified as predominant milk proteins in other species, namely bovine25,54 and human milk.8,28,29 Furthermore, there are several proteins involved in lipid synthesis and processing, including fatty acid synthase [Ensembl: ENSMMUP00000016836] and butyrophilin subfamily 1 member a1 [Ensembl: ENSMMUP00000000385], as identified in the milk of other species.
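The sequence coverage figure quoted for glycodelin (77% of the amino acid sequence) is the fraction of residues that fall inside at least one identified peptide. A minimal sketch of that calculation follows, assuming exact string matching of peptides to the protein sequence; the function name is hypothetical and search engines may handle modifications and ambiguous matches differently.

```python
def sequence_coverage(protein, peptides):
    """Fraction of residues in `protein` covered by at least one peptide match."""
    covered = [False] * len(protein)
    for peptide in peptides:
        start = protein.find(peptide)
        while start != -1:
            # Mark every residue spanned by this occurrence of the peptide.
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)
```

For a toy 10-residue sequence `"MKTAYIAKQR"` with peptides `"MKTAY"` and `"AKQR"`, nine of ten residues are covered.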

Expansion of the human milk proteome

Additionally, we applied our small volume proteomics method to human milk and identified 1,606 proteins with protein and peptide decoy FDRs of 5.4% and 0.24%, respectively (Additional File 4). The most abundant proteins present (Table 2) are, as expected, milk proteins such as lactoferrin [UniProt: P02788], β-casein [UniProt: P05814], α-lactalbumin [UniProt: P00709], and κ-casein [UniProt: P07498].

There are also several proteins involved in immune function including lysozyme C [UniProt: P61626], polymeric immunoglobulin receptor [UniProt: P01833], immunoglobulin kappa [UniProt: P01834], immunoglobulin alpha [UniProt: P01876], and complement C3 [UniProt: P01024]. In addition, our milk proteome provides an expansion to the catalog of previously known human milk proteins. To demonstrate this, we compiled a list of milk proteins from ten different high throughput studies completed over the last eight years: Gao et al.,29 Hettinga et al.,8 Liao et al.,33-35 Mange et al.,12 Molinari et al.,55 Palmer et al.,56 Picariello et al.,57 and Zhang et al.28 These combined milk proteomes result in 2,032 proteins, each with a unique UniProt identifier (Additional File 5). There were 1,082 proteins detected using our proteomics method that were also identified by one or more of these previous studies. In addition, we found 524 proteins that have never been identified in human milk previously (Figure 3A). Furthermore, we identified the consensus of these previously published proteomes to be 34 proteins (Additional File 6). Our method was able to detect all of these consensus milk proteins (Figure 3B). We combined our proteome with the aforementioned publications to create the comprehensive human milk proteome which can be used as a tool for other milk biologists (Additional File 5).
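The comparison with previously published proteomes reduces to set operations on UniProt identifiers: the 1,606 proteins from this study split into 1,082 shared with earlier studies and 524 novel ones, and the consensus is the intersection of the earlier proteomes. The sketch below illustrates this under the assumption that each proteome is available as a Python set of accessions; the function name and dictionary keys are hypothetical.

```python
def compare_proteomes(this_study, previous_studies):
    """Summarise overlap between one proteome and a list of earlier proteomes.

    `this_study` is a set of accessions; `previous_studies` is a non-empty
    list of sets, one per earlier study.
    """
    previously_reported = set().union(*previous_studies)
    consensus = set.intersection(*previous_studies)
    return {
        "shared": this_study & previously_reported,
        "novel": this_study - previously_reported,
        "consensus": consensus,
        "consensus_detected": consensus & this_study,
        "combined": this_study | previously_reported,
    }
```

By construction, the sizes of `shared` and `novel` always sum to the size of `this_study`, which matches 1,082 + 524 = 1,606 in the reported data.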

Comparative analysis of protein contents in human and macaque milk

We completed a cross-species comparison between human and macaque milk protein contents to identify similarities and differences in composition. Using InParanoid58 (http://inparanoid.sbc.su.se/cgi-bin/index.cgi, version 4.1 for command line), we clustered orthologous proteins and their in-paralogs for both human and macaque milk. There were 396 clusters after selection for high scoring reciprocal best hits (Additional File 7). While this leaves 1,189 human and 121 macaque proteins without an ortholog in the other species’ milk, we did not consider these entries to be uniquely present in that single species’ milk. The peptide-to-protein assignment process requires a database to positively identify a protein, and because these databases are often incomplete (especially for less studied species like macaque), a protein that is actually present can go unidentified.

To complete our comparative analysis, we used this orthologous protein list to determine differences in protein abundance between the species. Exclusive spectra could be used in our relative comparison of protein abundance because 97.9% of proteins shared more than 98% sequence identity with their ortholog in the other species. Although from different species, orthologous proteins with high sequence identity will yield nearly identical tryptic peptides, allowing comparable detection with MS/MS. Exclusive spectra were normalized across all replicates in both species using the trimmed mean method as previously described,39 and the difference in abundance between species was determined using a quasi-Poisson model.44,45 This yielded 88 proteins with differential abundance and p-value < 0.05 after adjusting for multiple hypothesis testing with the Benjamini-Hochberg procedure (Table 3). There were 82 proteins that were more abundant in human milk compared to macaque milk (Figure 4).

To determine possible functions of these proteins, we used the DAVID Bioinformatics Toolkit46 (http://david.abcc.ncifcrf.gov) to conduct functional enrichment analyses. Milk proteins uniquely abundant in human milk were significantly enriched for 27 GO terms including translational elongation, acute inflammatory response, defense response, and hexose metabolic process (Figure 5).

We next investigated these proteins for enrichment in expression in various human tissues. For the proteins more abundant in human milk, 10 tissues with statistically significant enrichment were identified: caudate nucleus, adrenal cortex, parietal lobe, heart, medulla oblongata, globus pallidus, superior cervical ganglion, lung and thyroid (Figure 6).

There were six ortholog clusters that were more abundant in macaque milk. Due to the small size of this protein list, functional enrichment with the DAVID Bioinformatics Toolkit was not feasible. We therefore queried Ensembl’s Biomart59 for GO terms associated with each protein (Table 4). Protein binding, extracellular region, and proteolysis were the most prevalent terms. Calcium ion transport and calcium ion binding were also annotated for β-casein and this protein may potentially be more abundant due to the higher concentration of calcium in macaque milk.49,60 No tissues were enriched in the subset of ortholog clusters that were more abundant in macaque milk.

DISCUSSION

Advances in milk proteomics techniques

Developing effective milk proteomics techniques does not come without challenges, including interfering macronutrients (e.g. phospholipids and oligosaccharides),61,62 a small number of highly abundant casein and whey proteins,63 and pervasive post-translational modifications.64 Typical protein purification techniques necessary to overcome these barriers result in an overall depletion in concentration and diversity of identifiable proteins. Many methods compensate by requiring a large sample volume (500–2000 µL), but sample availability is often much smaller, especially for non-human primates, limiting the species that can be investigated and the number of biological replicates available for testing. We therefore developed a method that requires only 15 µl of whole milk and uses a simplified sample preparation method and one-step gel fractionation.

The major advantage of our method lies in the combination of three key sample preparation techniques. (1) During electrophoresis, the denatured proteins are bound to the gradient gel, allowing lipids and sugars to be released. (2) Highly abundant proteins are isolated in size-specific gel slices and run on the LC-MS/MS individually. Using this strategy, they are less likely to flood the ion detector and mask detection of lower abundance proteins. However, this may alter the rank order of absolute protein abundance and should be considered when interpreting the data relative to other studies with alternative sample preparation techniques. (3) The most predominant post-translational modification in milk is removed. It is estimated that ~70% of milk proteins are glycosylated.64 Therefore, we incorporated an N-linked glycan cleavage with PNGase F into our protocol. This serves to expose more of the peptide chain and aid its detection by mass spectrometry. To our knowledge, this is the first published use of PNGase F in high throughput milk proteomics, and together with the other sample preparation steps it appears to contribute to the higher number of proteins identified in this study.


In addition, it is important to highlight that proteomes were compared across equal volumes of milk. The same volume (15 µL) of whole human and macaque milk was loaded in each gel lane. Macaque milk is known to contain a higher concentration of protein (2.1% mean crude protein) than human milk (1.3% mean crude protein).65 However, to avoid artificially introducing bias, we did not adjust for concentration differences prior to gel fractionation or mass spectrometry and instead report the abundance found in this volume of milk. Although the protein concentration in macaque milk is higher, the majority of proteins with differential abundance have more spectra in human milk. Thus, relative to total protein content, the over-abundance of human milk proteins may be even greater than what we have calculated. In the context of nutrition, it is also important to note that human infants consume larger volumes of milk; our estimates of the proteins that are more abundant in human milk, relative to macaque milk, are therefore conservative.

As previously mentioned, over 10,000 genes have been identified as expressed in the mammary gland during lactation in human, rhesus macaque, and bovine.26,27,63 The human and macaque milk proteomes included in this study continue to expand our understanding of milk's composition and build the catalog of milk proteins. This macaque milk proteome is a 62-fold expansion of the previous study,50 and our work serves as the first high-throughput catalog of macaque milk proteins. In addition, the human proteome presented here identifies 524 proteins that have never before been detected in human milk. These newly identified proteins are most likely due to (1) variation in sample preparation methods, including glycan cleavage, (2) the stochastic nature of mass spectrometry, and (3) biological variation in mothers' milk composition, which has previously been shown to affect the total number of proteins present in milk.12,66 Together, the proteins experimentally confirmed here will serve to greatly improve publicly available databases and annotations.
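The equal-volume comparison described above can be made concrete with a little arithmetic. The sketch below is illustrative only and is not part of the authors' analysis: it assumes the crude protein percentages are weight/volume (g per 100 mL), which the text does not state, and the function names are hypothetical. It estimates the protein mass in a 15 µL lane for each species and shows how a per-volume fold difference would be rescaled if it were expressed per unit of total protein.

```python
# Illustrative sketch; NOT the authors' calculation.
# Assumption: crude protein percentages are weight/volume (g per 100 mL).

HUMAN_CRUDE_PROTEIN_PCT = 1.3    # mean crude protein, human milk (from the text)
MACAQUE_CRUDE_PROTEIN_PCT = 2.1  # mean crude protein, macaque milk (from the text)
LOADED_VOLUME_UL = 15.0          # whole milk loaded per gel lane (from the text)


def protein_loaded_mg(volume_ul: float, crude_protein_pct: float) -> float:
    """Approximate protein mass (mg) in a milk volume, assuming % w/v."""
    # 1% w/v = 1 g per 100 mL = 0.01 mg per microliter
    return volume_ul * crude_protein_pct * 0.01


def per_protein_mass_fold(per_volume_fold: float) -> float:
    """Rescale a human/macaque fold difference from per-volume to per-protein-mass.

    Hypothetical helper: a simple linear rescaling by the concentration ratio.
    """
    return per_volume_fold * (MACAQUE_CRUDE_PROTEIN_PCT / HUMAN_CRUDE_PROTEIN_PCT)


human_mg = protein_loaded_mg(LOADED_VOLUME_UL, HUMAN_CRUDE_PROTEIN_PCT)
macaque_mg = protein_loaded_mg(LOADED_VOLUME_UL, MACAQUE_CRUDE_PROTEIN_PCT)

print(f"human lane:   ~{human_mg:.3f} mg protein")
print(f"macaque lane: ~{macaque_mg:.3f} mg protein")
print(f"macaque/human loading ratio: {macaque_mg / human_mg:.2f}")
```

Under these assumptions a macaque lane carries roughly 1.6 times the protein mass of a human lane, which is why a per-volume excess in human milk would be an underestimate on a per-protein basis. The linear rescaling is a simplification, since spectral counts do not scale strictly with protein mass.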

Implications for human nutrition from comparative milk proteomics


Comparing the relative abundance of orthologous proteins in the milk of the two species generates novel insights into the primate milk proteome and into divergences between species that were possibly shaped by natural selection. Bile salt-stimulated lipase (BSSL), also known as carboxyl ester lipase (CEL), was much more abundant in human milk than in macaque milk. In fact, significant BSSL activity has not been detected in macaque milk.67 BSSL has been shown to facilitate lipolysis during the early neonatal period, a time when pancreatic co-lipase expression is very low in human infants.68 A clinical cross-over study showed that when pre-term infants were fed pasteurized human breast milk (inactive BSSL), fat utilization was considerably lower and growth was poorer than when they were fed unpasteurized breast milk (active BSSL).69 Although it is unknown why macaques lack BSSL activity, it may be that their earlier maturation includes earlier expression of pancreatic enzymes involved in lipid digestion, thereby limiting the need for an exogenous lipase.

Haptocorrin (also called transcobalamin-1 and vitamin B12-binding protein) was also found at high concentrations in human milk. Haptocorrin in human milk has previously been shown to be remarkably stable against in vitro proteolysis.70 Haptocorrin has anti-bacterial properties and can inhibit the growth of enteropathogenic E. coli even at low concentrations.70 It binds most of the vitamin B12 in breast milk and binds to human intestinal cells in vitro.70 Haptocorrin therefore likely facilitates the uptake of vitamin B12 at an age when expression of intrinsic factor is low. We speculate that the vitamin B12 requirement of the macaque infant is lower than that of the human infant, or that the macaque's earlier maturation includes earlier expression of intrinsic factor.

Several components of immunoglobulins were also higher in human milk than in macaque milk. For example, the polymeric immunoglobulin receptor is much more abundant in human milk (3.4-fold difference). This Fc receptor is expressed on several glandular epithelia, including those of liver and breast, and facilitates the secretion of immunoglobulin A and immunoglobulin M.71 These findings may explain the high concentrations of secretory IgA (SIgA) in human milk and also illustrate the need for molecules in human milk that support immunity and the development of immune function.


Lactoferrin is considerably higher in human than in macaque milk (2.5-fold difference), which may have implications for both iron nutrition and early development of the infant.72,73 Breast milk is very low in iron, yet breast-fed infants usually have adequate iron status even after 6 months of exclusive breast-feeding.74 Lönnerdal and colleagues have shown that there is a specific receptor for lactoferrin in the human small intestine and that this receptor facilitates uptake of iron into the mucosa.75 Lactoferrin enters the mucosal cell by an endocytic process and binds to the nucleus, where it acts as a transcription factor.75,76 Lactoferrin thereby enhances the expression of genes involved in cellular growth and proliferation as well as in immune function (e.g., cytokine expression). Because macaque milk is higher in iron and the macaque infant is more mature, the need for facilitated iron absorption and stimulated early development may be reduced, diminishing the need for lactoferrin in macaque milk.

Human milk was also found to be higher in alpha-1-antitrypsin (2.4-fold difference), a protease inhibitor. Several human milk proteins (among them lactoferrin and SIgA) survive gastrointestinal digestion to some extent and are found intact in the stool of exclusively breast-fed infants.77 This may be due in part to the relative stability of these milk proteins, but also to the presence of high concentrations of alpha-1-antitrypsin in human milk.78 We have shown in vitro that the presence of alpha-1-antitrypsin during simulated digestion limits digestive capacity.78,79 Thus, alpha-1-antitrypsin may serve as a natural "brake" on digestion in the newborn human infant, thereby helping to preserve some bioactive proteins.

Human milk is substantially higher (5.3-fold difference) in vitamin D-binding protein than macaque milk, a difference that has not been reported previously. Newborn infants have a high requirement for vitamin D, and it may be speculated that human infants require more vitamin D-binding protein to transport, and possibly also to facilitate the uptake of, vitamin D during the newborn period. Little is known about vitamin D absorption in infants, and further studies are needed to explore this and the significance of its binding protein in milk.


Milk proteomics from an evolutionary perspective

Although similar relative to non-primate mammals, rhesus macaque and human neonates differ in some aspects of their neurodevelopment, nutritional ecology, and disease risk. These different selective pressures may explain the seemingly derived features of the protein content of human milk compared to that of rhesus macaques. Humans are born relatively less developed than other primates, including rhesus macaques.80 Humans also have much greater postnatal brain growth and development than other primates. For example, the brain of the rhesus macaque neonate has attained ~50% of adult brain mass, whereas the human neonate's brain is a quarter of the mass of an adult human's.81 Indeed, in humans the period of infancy and childhood is marked by substantial neurodevelopment and glucose demand.82 Despite being born relatively less developed, human infants, even among non-industrialized populations, begin to consume complementary foods at a seemingly younger developmental stage83 and are possibly weaned relatively young84 compared to other primates. Moreover, changes in our recent past, namely the transitions to subsistence agriculture and animal domestication, increased sedentism and population density, altered diet and nutrition, and intensified zoonotic and person-to-person disease transmission.85,86 Taken together, these features of the human condition reveal an early life period in which mother's milk provides essential support for neurodevelopment and immune protection during a seemingly truncated period of infancy. That the gross concentration of protein in human milk is lower than in rhesus macaque milk,87 while particular proteins implicated in neurodevelopment and immune function are seemingly enriched in human milk, is compelling evidence that natural selection has shaped human milk for the specific developmental priorities of the human neonate. However, this interpretation remains speculative until proteomic analyses are conducted in similar detail with additional primate species, particularly other closely related hominoids: chimpanzee, bonobo, gorilla, and orangutan.


CONCLUSIONS

Human milk is optimal for infant nutrition during post-natal development. In this study, we discovered additional ingredients in milk by investigating the comprehensive proteomes of both human and macaque milk. To allow a fair comparison of proteomes despite unevenly annotated protein databases, we conservatively restricted our analyses to those proteins positively identified in the milk of both species. Even with this restriction, we identified 82 proteins that were significantly more abundant in human milk. These human milk proteins were enriched for neurodevelopment and immune functions. Specific milk proteins that are more abundant in human milk suggest many areas in which human infants need additional nutritional support: increased absorption of iron, vitamin B12, and vitamin D; assisted digestion of lipids; reduced proteolysis of bioactive proteins/peptides; and increased immune defense. These findings suggest new priorities for the improvement of infant formula so that even infants without access to mother's milk are optimally nourished during early life.

ASSOCIATED CONTENT

Additional File 1.docx: Summary of maternal and sample characteristics
Additional File 2.xlsx: Treatment with PNGase F compared to an undigested control
Additional File 3.xlsx: Proteins identified in macaque whole milk using LC-MS/MS
Additional File 4.xlsx: Proteins identified in human whole milk using LC-MS/MS
Additional File 5.xlsx: Comprehensive human milk proteome
Additional File 6.xlsx: Consensus of previously published human milk proteomes
Additional File 7.xlsx: Orthologous protein clusters defined using InParanoid

This material is available free of charge via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*Correspondence can be sent by email to dglemay@ucdavis.edu, by phone at 530-752-7411, or by fax at 530-752-0436.

Author Contributions

DGL conceived of the research. DGL and KLB designed the study. KLB, JTS, BSP, DW, and DGL developed the proteomics method. KLB and DW completed the proteomics experiments. KLB completed the bioinformatic analyses. All authors analyzed the data. KLB, DGL, BL, and KH wrote the manuscript. All authors read and approved the final manuscript.

Funding Sources

This project was supported by Award Number P51RR000169 from the National Center for Research Resources to the California National Primate Research Center (Pilot Award to DGL). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Center for Research Resources or the National Institutes of Health. KLB was supported by Grant Number T32-GM008799 from NIGMS-NIH.

ACKNOWLEDGEMENT

We thank Blythe Durbin-Johnson for her informative discussion of statistical practices for this manuscript and Shantel Chand for her diligent work parsing data for the consensus proteome.

ABBREVIATIONS FDR, false discovery rate; GO, gene ontology; LC-MS/MS, liquid chromatography followed by tandem mass spectrometry; PTM, post-translational modification

TABLES

Table 1. Most Abundant Proteins in Whole Macaque Milk

| Protein Name (a) | Ensembl Protein ID | Exclusive Spectra (b) | # Unique Peptides (c) | Percent Coverage (d) | Associated GO Terms (e) |
|---|---|---|---|---|---|
| glycodelin, progestagen-associated endometrial protein | ENSMMUP00000040154 | 9977 | 189 | 77.00% | transport, extracellular region |
| beta-casein | ENSMMUP00000006446 | 8251 | 367 | 92.00% | transport, extracellular region, transporter activity |
| lactotransferrin, lactoferrin | ENSMMUP00000018727 | 5634 | 665 | 89.00% | cellular iron ion homeostasis, iron ion transport, regulation of cytokine production, innate immune response in mucosa, antibacterial humoral response |
| serum albumin | ENSMMUP00000005100 | 2312 | 328 | 89.00% | transport, negative regulation of apoptotic process, maintenance of mitochondrion location, extracellular vesicular exosome |
| xanthine dehydrogenase oxidase | ENSMMUP00000000757 | 1929 | 424 | 73.00% | oxidation-reduction process, electron carrier activity, iron ion binding |
| kappa-casein | ENSMMUP00000007920 | 1405 | 106 | 55.00% | lactation, extracellular region |
| butyrophilin subfamily 1 member a1 | ENSMMUP00000030294 | 1328 | 207 | 73.00% | integral to membrane, membrane, protein binding |
| alpha-lactalbumin | ENSMMUP00000000385 | 1262 | 87 | 80.00% | lactose biosynthetic process, calcium ion binding, protein binding |
| polymeric immunoglobulin receptor | ENSMMUP00000014705 | 1215 | 196 | 59.00% | protein binding |
| fatty acid synthase | ENSMMUP00000016836 | 1059 | 419 | 56.00% | methylation, metabolic process, cellular response to interleukin-4, biosynthetic process, catalytic activity |
| lysozyme c | ENSMMUP00000011770 | 897 | 83 | 75.00% | defense response to bacterium, metabolic process, cell wall macromolecule catabolic process, cytolysis, hydrolase activity |
| ig alpha-1 chain c region | ENSMMUP00000003711 | 648 | 106 | 64.00% | protein binding |
| perilipin-2 | ENSMMUP00000015913 | 499 | 110 | 61.00% | none listed |
| monocyte differentiation antigen cd14 | ENSMMUP00000013085 | 459 | 107 | 75.00% | positive regulation of endocytosis, response to molecule of bacterial origin, innate immune response, inflammatory response, membrane raft |
| lactoperoxidase | ENSMMUP00000003022 | 436 | 142 | 58.00% | response to oxidative stress, oxidation-reduction process, heme binding, peroxidase activity |
| lipocalin, alpha-2-microglobulin-related subunit of mmp-9 | ENSMMUP00000031347 | 354 | 50 | 73.00% | transport, extracellular region, small molecule binding, transporter activity |
| ovostatin homolog 2 | ENSMMUP00000019717 | 348 | 141 | 37.00% | negative regulation of endopeptidase activity, extracellular space, endopeptidase inhibitor activity |
| immunoglobulin lambda-like polypeptide 5 | ENSMMUP00000037212 | 342 | 76 | 54.00% | protein binding |
| serotransferrin | ENSMMUP00000011690 | 334 | 153 | 55.00% | cellular iron ion homeostasis, iron ion transport, endocytic vesicle |
| ig gamma-1 chain c region | ENSMMUP00000030878 | 216 | 84 | 48.00% | integral to membrane, protein binding |

(a) Protein name is generated using BLAST searches to other organism databases for high protein sequence identity. Synonymous names are separated by commas.
(b) Summed exclusive spectrum above threshold percentages for all replicates (n=5)
(c) Total peptides identified uniquely to this protein summed for all replicates (n=5)
(d) Maximum percent of total amino acid length for the protein in a single replicate
(e) GO terms associated with the indicated proteins

Table 2. Most Abundant Proteins in Whole Human Milk

| Protein Name | UniProt Accession | UniProt ID | Exclusive Spectra (a) | # Unique Peptides (b) | Percent Coverage (c) | Associated GO Terms (d) |
|---|---|---|---|---|---|---|
| Lactotransferrin | P02788 | TRFL_HUMAN | 11611 | 601 | 94.00% | response to host immune response, iron assimilation by chelation and transport, antibacterial humoral response, interaction with host, cellular ion homeostasis, iron ion transport, proteolysis |
| Serum albumin | P02768 | ALBU_HUMAN | 3861 | 350 | 89.00% | response to platinum ion, sodium-independent organic anion transport, hemolysis by symbiont of host erythrocytes, platelet activation |
| Polymeric immunoglobulin receptor | P01833 | PIGR_HUMAN | 3403 | 216 | 69.00% | extracellular region, integral to plasma membrane, protein binding |
| Bile salt-activated lipase | P19835 | CEL_HUMAN | 2437 | 169 | 57.00% | ceramide catabolic process, intestinal lipid catabolic process, protein esterification, lipid digestion, triglyceride metabolic process |
| Xanthine dehydrogenase/oxidase | P47989 | XDH_HUMAN | 2373 | 376 | 85.00% | negative regulation of vasculogenesis, xanthine catabolic process, purine nucleotide catabolic process, positive regulation of p38MAPK cascade, negative regulation of endothelial cell differentiation |
| Fatty acid synthase | P49327 | FAS_HUMAN | 2337 | 479 | 79.00% | pantothenate metabolic process, positive regulation of cellular metabolic process, water-soluble vitamin metabolic process, fatty acid biosynthetic process, oxidation-reduction process |
| Beta-casein | P05814 | CASB_HUMAN | 2010 | 235 | 93.00% | negative regulation of catalytic activity, calcium ion transport, calcium ion binding, transporter activity |
| Butyrophilin subfamily 1 member A1 | Q13410 | BT1A1_HUMAN | 1982 | 158 | 70.00% | extracellular region, integral to plasma membrane, receptor activity, protein binding |
| Lactadherin | Q08431 | MFGM_HUMAN | 1775 | 151 | 91.00% | positive regulation of apoptotic cell clearance, phagocytosis, modulation by virus of host morphology or physiology, cell adhesion, angiogenesis |
| Kappa-casein | P07498 | CASK_HUMAN | 1692 | 77 | 68.00% | lactation, extracellular region |
| Alpha-lactalbumin | P00709 | LALBA_HUMAN | 1390 | 99 | 84.00% | lactose biosynthetic process, cell-cell signaling, defense response to bacterium, signal transduction, apoptotic process, calcium ion binding |
| Macrophage mannose receptor 1 | P22897 | MRC1_HUMAN | 1204 | 270 | 64.00% | receptor-mediated endocytosis, endosome membrane, mannose binding, protein binding, integral to plasma membrane |
| Lysozyme C | P61626 | LYSC_HUMAN | 1025 | 87 | 71.00% | cytolysis, cell wall macromolecule catabolic process, defense response to bacterium, inflammatory response |
| Perilipin-2 | Q99541 | PLIN2_HUMAN | 970 | 100 | 77.00% | long-chain fatty acid transport, lipid storage, response to organic cyclic compound, small molecule metabolic process |
| Actin cytoplasmic 1 | P60709 | ACTB_HUMAN | 815 | 20 | 58.00% | cell junction assembly, de novo posttranslational protein folding, adherens junction organization, cell-cell junction organization, innate immune response |
| Complement C3 | P01024 | CO3_HUMAN | 803 | 292 | 73.00% | regulation of triglyceride biosynthetic process, regulation of complement activation, positive regulation of lipid storage, positive regulation G-protein coupled receptor protein, inflammatory response |
| Ig kappa chain C region | P01834 | IGKC_HUMAN | 782 | 42 | 92.00% | immune response, regulation of immune response, Fc-gamma receptor signaling pathway, complement activation, antigen binding |
| Keratin type I cytoskeletal 9 | P35527 | K1C9_HUMAN | 782 | 89 | 67.00% | intermediate filament organization, epidermis development, structural constituent of cytoskeleton |
| Ig alpha-1 chain C region | P01876 | IGHA1_HUMAN | 760 | 168 | 83.00% | protein-chromophore linkage, immune response, antigen binding, extracellular region, protein binding |
| Monocyte differentiation antigen CD14 | P08571 | CD14_HUMAN | 729 | 86 | 84.00% | toll-like receptor signaling pathways, positive regulation of cytokine secretion, positive regulation of endocytosis, phagocytosis, inflammatory response |

(a) Summed exclusive spectrum above threshold percentages for all replicates (n=3)
(b) Total peptides identified uniquely to this protein summed for all replicates (n=3)
(c) Maximum percent of total amino acid length for the protein in a single replicate
(d) GO terms associated with the indicated proteins

Table 3. Differentially abundant ortholog clusters

| Protein Name | Human UniProt Accession | Macaque Ensembl Protein ID | log2 fold change | Adjusted p-value |
|---|---|---|---|---|
| Cathepsin Z | Q9UBR2 | ENSMMUP00000020183 | -4.565 | 2.505E-02 |
| Beta-casein | P05814 | ENSMMUP00000006446 | -1.960 | 2.747E-02 |
| Histone H3.1 | P68431 | ENSMMUP00000001925 | -1.086 | 8.583E-22 |
| Semaphorin-4B | Q9NPR2 | ENSMMUP00000013543 | -1.086 | 9.378E-22 |
| Suppressor of tumorigenicity 14 protein | Q9Y5Y6 | ENSMMUP00000023696 | -1.086 | 9.378E-22 |
| Leukocyte elastase inhibitor | P30740 | ENSMMUP00000015382 | -0.086 | 5.423E-17 |
| Complement C1q tumor necrosis factor-related protein 1 | Q9BXJ1 | ENSMMUP00000007247 | 0.236 | 2.919E-19 |
| Ig lambda chain V-VI region EB4 | P06319 | ENSMMUP00000018087 | 0.236 | 2.919E-19 |
| Calpain-1 catalytic subunit | P07384 | ENSMMUP00000020074 | 0.438 | 9.378E-22 |
| Ras GTPase-activating-like protein IQGAP2 | Q13576 | ENSMMUP00000030270 | 0.499 | 2.815E-21 |
| Cytoplasmic aconitate hydratase | P21399 | ENSMMUP00000010735 | 0.603 | 3.089E-02 |
| Elongation factor 1-alpha 1 | P68104 | ENSMMUP00000006451 | 0.954 | 4.092E-02 |
| Ig lambda chain V-I region NEWM | P01703 | ENSMMUP00000009757 | 0.990 | 3.324E-02 |
| Heat shock cognate 71 kDa protein | P11142 | ENSMMUP00000006240 | 1.065 | 3.058E-02 |
| Monocyte differentiation antigen CD14 | P08571 | ENSMMUP00000013085 | 1.094 | 3.324E-02 |
| Alpha-1-antitrypsin | P01009 | ENSMMUP00000026879 | 1.282 | 3.089E-02 |
| Fatty acid synthase | P49327 | ENSMMUP00000016836 | 1.294 | 1.155E-02 |
| UTP--glucose-1-phosphate uridylyltransferase | Q16851 | ENSMMUP00000011490 | 1.305 | 2.747E-02 |
| Lactotransferrin | P02788 | ENSMMUP00000018727 | 1.311 | 4.092E-02 |
| Beta-1,4-galactosyltransferase 1 | P15291 | ENSMMUP00000002785 | 1.313 | 4.361E-02 |
| Ras-related protein Rab-1A | P62820 | ENSMMUP00000028603 | 1.675 | 4.220E-02 |
| Polymeric immunoglobulin receptor | P01833 | ENSMMUP00000014705 | 1.754 | 1.915E-02 |
| Antithrombin-III | P01008 | ENSMMUP00000000842 | 1.827 | 1.519E-02 |
| Alpha-1B-glycoprotein | P04217 | ENSMMUP00000016357 | 1.918 | 1.915E-02 |
| Lanosterol synthase | P48449 | ENSMMUP00000029398 | 2.030 | 4.180E-02 |
| 78 kDa glucose-regulated protein | P11021 | ENSMMUP00000006906 | 2.072 | 6.361E-03 |
| Clathrin heavy chain 1 | Q00610 | ENSMMUP00000013774 | 2.090 | 3.324E-02 |
| Tissue alpha-L-fucosidase | P04066 | ENSMMUP00000003870 | 2.133 | 3.089E-02 |
| Peptidyl-prolyl cis-trans isomerase A | P62937 | ENSMMUP00000036877 | 2.199 | 4.361E-02 |
| Elongation factor 2 | P13639 | ENSMMUP00000029973 | 2.283 | 3.078E-02 |
| Elongation factor 1-gamma | P26641 | ENSMMUP00000008675 | 2.329 | 2.694E-02 |
| Vitamin D-binding protein | P02774 | ENSMMUP00000030587 | 2.418 | 1.662E-02 |
| Galectin-3-binding protein | Q08380 | ENSMMUP00000025835 | 2.423 | 3.324E-02 |
| Synaptic vesicle membrane protein VAT-1 homolog | Q99536 | ENSMMUP00000021043 | 2.476 | 3.324E-02 |
| Alpha-enolase | P06733 | ENSMMUP00000030227 | 2.504 | 1.155E-02 |
| Ig heavy chain V-I region V35 | P23083 | ENSMMUP00000029141 | 2.549 | 2.735E-02 |
| 60S ribosomal protein L30 | P62888 | ENSMMUP00000026408 | 2.598 | 3.995E-02 |
| Ras-related protein Rab-7a | P51149 | ENSMMUP00000002391 | 2.626 | 3.078E-02 |
| Glyceraldehyde-3-phosphate dehydrogenase | P04406 | ENSMMUP00000007714 | 2.819 | 4.860E-02 |
| Protein disulfide-isomerase | P07237 | ENSMMUP00000012107 | 2.823 | 1.662E-02 |
| Pyruvate kinase isozymes M1/M2 | P14618 | ENSMMUP00000030876 | 2.868 | 3.634E-02 |
| 60S ribosomal protein L18 | Q07020 | ENSMMUP00000029169 | 2.914 | 3.324E-02 |
| Transthyretin | P02766 | ENSMMUP00000021922 | 3.036 | 4.656E-02 |
| Alpha-2-HS-glycoprotein | P02765 | ENSMMUP00000023130 | 3.041 | 2.638E-02 |
| Synaptosomal-associated protein 23 | O00161 | ENSMMUP00000018453 | 3.113 | 4.860E-02 |
| Protein disulfide-isomerase A6 | Q15084 | ENSMMUP00000016897 | 3.269 | 3.078E-02 |
| Nucleobindin-1 | Q02818 | ENSMMUP00000005203 | 3.372 | 3.780E-02 |
| Proactivator polypeptide | P07602 | ENSMMUP00000017025 | 3.455 | 2.909E-02 |
| Alpha-actinin-4 | O43707 | ENSMMUP00000014104 | 3.455 | 2.505E-02 |
| Perilipin-3 | O60664 | ENSMMUP00000031951 | 3.499 | 2.209E-02 |
| Ezrin | P15311 | ENSMMUP00000020536 | 3.503 | 4.202E-03 |
| Polyadenylate-binding protein 1 | P11940 | ENSMMUP00000027420 | 3.504 | 3.078E-02 |
| Complement component C9 | P02748 | ENSMMUP00000026132 | 3.511 | 1.396E-02 |
| Endoplasmin | P14625 | ENSMMUP00000030897 | 3.564 | 1.710E-02 |
| ATP-citrate synthase | P53396 | ENSMMUP00000024909 | 3.578 | 7.690E-03 |
| Procollagen-lysine,2-oxoglutarate 5-dioxygenase 1 | Q02809 | ENSMMUP00000009121 | 3.604 | 4.167E-02 |
| Phosphoglycerate kinase 1 | P00558 | ENSMMUP00000040963 | 3.612 | 2.505E-02 |
| C-C motif chemokine 28 | Q9NRJ3 | ENSMMUP00000000253 | 3.631 | 2.505E-02 |
| Clusterin | P10909 | ENSMMUP00000032169 | 3.660 | 4.167E-02 |
| Gelsolin | P06396 | ENSMMUP00000017095 | 3.710 | 1.662E-02 |
| 40S ribosomal protein SA | P08865 | ENSMMUP00000030248 | 3.751 | 3.583E-02 |
| Protein phosphatase 1 regulatory subunit 7 | Q15435 | ENSMMUP00000022354 | 3.757 | 1.760E-02 |
| 40S ribosomal protein S4 X isoform | P62701 | ENSMMUP00000004386 | 3.837 | 4.220E-02 |
| 60S ribosomal protein L8 | P62917 | ENSMMUP00000017579 | 3.844 | 2.638E-02 |
| Beta-2-glycoprotein 1 | P02749 | ENSMMUP00000019820 | 3.867 | 2.098E-02 |
| Sulfhydryl oxidase 1 | O00391 | ENSMMUP00000015804 | 3.873 | 1.155E-02 |
| Protein disulfide-isomerase A3 | P30101 | ENSMMUP00000033719 | 4.037 | 1.662E-02 |
| Dolichyl-diphosphooligosaccharide--protein glycosyltransferase subunit 1 | P04843 | ENSMMUP00000024084 | 4.084 | 1.155E-02 |
| Keratin type I cytoskeletal 9 | P35527 | ENSMMUP00000024535 | 4.117 | 3.420E-02 |
| Annexin A5 | P08758 | ENSMMUP00000012054 | 4.151 | 4.220E-02 |
| Ribosome-binding protein 1 | Q9P2E9 | ENSMMUP00000018132 | 4.351 | 3.792E-02 |
| Myosin-9 | P35579 | ENSMMUP00000008018 | 4.377 | 5.299E-03 |
| Kininogen-1 | P01042 | ENSMMUP00000024880 | 4.404 | 2.560E-02 |
| T-complex protein 1 subunit gamma | P49368 | ENSMMUP00000038740 | 4.445 | 1.237E-02 |
| Heat shock protein HSP 90-alpha | P07900 | ENSMMUP00000029862 | 4.463 | 1.915E-02 |
| Filamin-B | O75369 | ENSMMUP00000000958 | 4.499 | 9.794E-25 |
| Transcobalamin-1 | P20061 | ENSMMUP00000037096 | 4.541 | 2.952E-02 |
| Calreticulin | P27797 | ENSMMUP00000005869 | 4.707 | 1.155E-02 |
| 60S ribosomal protein L27 | P61353 | ENSMMUP00000010595 | 4.741 | 1.915E-02 |
| Bile salt-activated lipase | P19835 | ENSMMUP00000001700 | 5.317 | 7.690E-03 |
| Phosphoglucomutase-1 | P36871 | ENSMMUP00000026371 | 5.597 | 7.898E-03 |
| 4-trimethylaminobutyraldehyde dehydrogenase | P49189 | ENSMMUP00000016657 | 5.808 | 1.829E-02 |
| Fibronectin | P02751 | ENSMMUP00000016182 | 5.876 | 3.229E-02 |
| Cofilin-1 | P23528 | ENSMMUP00000025002 | 6.018 | 1.445E-02 |
| Gamma-glutamyltranspeptidase 1 | P19440 | ENSMMUP00000027631 | 6.021 | 4.165E-02 |
| Ig kappa chain V-III region VG (Fragment) | P04433 | ENSMMUP00000039651 | 6.160 | 4.167E-02 |
| Apolipoprotein A-IV | P06727 | ENSMMUP00000021207 | 6.769 | 4.092E-02 |
| Cytosolic non-specific dipeptidase | Q96KP4 | ENSMMUP00000011718 | 7.415 | 1.860E-02 |
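
The log2 fold changes in Table 3 relate to the fold differences quoted in the Discussion through fold = 2^(log2 fold change). The short sketch below converts a few Table 3 values. It assumes that a positive value means higher abundance in human milk and a negative value means higher abundance in macaque milk; that sign convention is inferred from the Discussion (e.g., lactotransferrin at 1.311 corresponds to the quoted 2.5-fold difference) and is not stated in the table itself. The function names are hypothetical.

```python
# Illustrative conversion of Table 3 log2 fold changes to fold differences.
# Assumption: positive = higher in human milk, negative = higher in macaque milk.


def log2fc_to_fold(log2_fc: float) -> float:
    """Convert a log2 fold change to a linear fold change."""
    return 2.0 ** log2_fc


def describe(name: str, log2_fc: float) -> str:
    """Summarize the magnitude and (assumed) direction of a fold change."""
    if log2_fc == 0:
        return f"{name}: no difference"
    fold = log2fc_to_fold(abs(log2_fc))
    higher_in = "human" if log2_fc > 0 else "macaque"
    return f"{name}: {fold:.1f}-fold higher in {higher_in} milk"


# Values copied from Table 3
table3_examples = {
    "Polymeric immunoglobulin receptor": 1.754,
    "Lactotransferrin": 1.311,
    "Alpha-1-antitrypsin": 1.282,
    "Vitamin D-binding protein": 2.418,
    "Beta-casein": -1.960,
}

for protein, log2_fc in table3_examples.items():
    print(describe(protein, log2_fc))
```

Rounded to one decimal place, the first four conversions reproduce the 3.4-, 2.5-, 2.4-, and 5.3-fold differences cited in the Discussion.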

Table 4. Gene Ontology Terms for Proteins More Abundant in Macaque Milk

| GO Term Name | GO Term ID | GO Term Evidence Code | Human UniProt/SwissProt Accession | Macaque Ensembl Protein ID |
| --- | --- | --- | --- | --- |
| negative regulation of catalytic activity | GO:0043086 | TAS | CASB_HUMAN | ENSMMUP00000006446 |
| calcium ion transport | GO:0006816 | TAS | CASB_HUMAN | ENSMMUP00000006446 |
| transport | GO:0006810 | IEA | CASB_HUMAN | ENSMMUP00000006446 |
| extracellular region | GO:0005576 | TAS | CASB_HUMAN | ENSMMUP00000006446 |
| extracellular region | GO:0005576 | IEA | CASB_HUMAN | ENSMMUP00000006446 |
| enzyme inhibitor activity | GO:0004857 | TAS | CASB_HUMAN | ENSMMUP00000006446 |
| calcium ion binding | GO:0005509 | TAS | CASB_HUMAN | ENSMMUP00000006446 |
| transporter activity | GO:0005215 | IEA | CASB_HUMAN | ENSMMUP00000006446 |
| angiotensin maturation | GO:0002003 | TAS | CATZ_HUMAN | ENSMMUP00000020183 |
| cellular protein metabolic process | GO:0044267 | TAS | CATZ_HUMAN | ENSMMUP00000020183 |
| epithelial tube branching involved in lung morphogenesis | GO:0060441 | IEA | CATZ_HUMAN | ENSMMUP00000020183 |
| proteolysis | GO:0006508 | IEA | CATZ_HUMAN | ENSMMUP00000020183 |
| lysosome | GO:0005764 | IEA | CATZ_HUMAN | ENSMMUP00000020183 |
| endoplasmic reticulum | GO:0005783 | IDA | CATZ_HUMAN | ENSMMUP00000020183 |
| extracellular space | GO:0005615 | IDA | CATZ_HUMAN | ENSMMUP00000020183 |
| plasma membrane | GO:0005886 | TAS | CATZ_HUMAN | ENSMMUP00000020183 |
| cysteine-type peptidase activity | GO:0008234 | IEA | CATZ_HUMAN | ENSMMUP00000020183 |
| regulation of gene silencing | GO:0060968 | IDA | H31_HUMAN | ENSMMUP00000001925 |
| blood coagulation | GO:0007596 | TAS | H31_HUMAN | ENSMMUP00000001925 |
| nucleosome assembly | GO:0006334 | IEA | H31_HUMAN | ENSMMUP00000001925 |
| nucleoplasm | GO:0005654 | TAS | H31_HUMAN | ENSMMUP00000001925 |
| nucleosome | GO:0000786 | IEA | H31_HUMAN | ENSMMUP00000001925 |
| extracellular region | GO:0005576 | TAS | H31_HUMAN | ENSMMUP00000001925 |
| protein heterodimerization activity | GO:0046982 | IEA | H31_HUMAN | ENSMMUP00000001925 |
| DNA binding | GO:0003677 | IEA | H31_HUMAN | ENSMMUP00000001925 |
| protein binding | GO:0005515 | IPI | H31_HUMAN | ENSMMUP00000001925 |
| protein binding | GO:0005515 | IEA | H31_HUMAN | ENSMMUP00000001925 |
| nervous system development | GO:0007399 | IEA | SEM4B_HUMAN | ENSMMUP00000013543 |
| multicellular organismal development | GO:0007275 | IEA | SEM4B_HUMAN | ENSMMUP00000013543 |
| cell differentiation | GO:0030154 | IEA | SEM4B_HUMAN | ENSMMUP00000013543 |
| membrane | GO:0016020 | IEA | SEM4B_HUMAN | ENSMMUP00000013543 |
| integral to membrane | GO:0016021 | IEA | SEM4B_HUMAN | ENSMMUP00000013543 |
| receptor activity | GO:0004872 | IEA | SEM4B_HUMAN | ENSMMUP00000013543 |
| protein binding | GO:0005515 | IEA | SEM4B_HUMAN | ENSMMUP00000013543 |
| proteolysis | GO:0006508 | IDA | ST14_HUMAN | ENSMMUP00000023696 |
| proteolysis | GO:0006508 | IEA | ST14_HUMAN | ENSMMUP00000023696 |
| extrinsic to plasma membrane | GO:0019897 | IEA | ST14_HUMAN | ENSMMUP00000023696 |
| membrane | GO:0016020 | IEA | ST14_HUMAN | ENSMMUP00000023696 |
| extracellular region | GO:0005576 | IEA | ST14_HUMAN | ENSMMUP00000023696 |
| integral to plasma membrane | GO:0005887 | TAS | ST14_HUMAN | ENSMMUP00000023696 |
| plasma membrane | GO:0005886 | TAS | ST14_HUMAN | ENSMMUP00000023696 |
| basolateral plasma membrane | GO:0016323 | IEA | ST14_HUMAN | ENSMMUP00000023696 |
| extracellular space | GO:0005615 | IEA | ST14_HUMAN | ENSMMUP00000023696 |
| serine-type peptidase activity | GO:0008236 | IDA | ST14_HUMAN | ENSMMUP00000023696 |
| catalytic activity | GO:0003824 | IEA | ST14_HUMAN | ENSMMUP00000023696 |
| serine-type endopeptidase activity | GO:0004252 | IEA | ST14_HUMAN | ENSMMUP00000023696 |
| protein binding | GO:0005515 | IEA | ST14_HUMAN | ENSMMUP00000023696 |
| serine-type peptidase activity | GO:0008236 | IEA | ST14_HUMAN | ENSMMUP00000023696 |


