
Technical Note

Intelligent Mixing of Proteomes (iMixPro) for elimination of false positives in affinity purification-mass spectrometry Sven Eyckerman, Francis Impens, Emmy Van Quickelberghe, Noortje Samyn, Giel Vandemoortele, Delphine De Sutter, Jan Tavernier, and Kris Gevaert J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.6b00517 • Publication Date (Web): 19 Sep 2016 Downloaded from http://pubs.acs.org on September 19, 2016


Journal of Proteome Research is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

Intelligent Mixing of Proteomes (iMixPro) for elimination of false positives in affinity purification-mass spectrometry

Sven Eyckerman1,2, Francis Impens1,2,3, Emmy Van Quickelberghe1,2, Noortje Samyn1,2, Giel Vandemoortele1,2, Delphine De Sutter1,2, Jan Tavernier1,2, Kris Gevaert1,2,*

1 VIB Medical Biotechnology Center, VIB, A. Baertsoenkaai 3, B-9000 Ghent, Belgium

2 Department of Biochemistry, Ghent University, A. Baertsoenkaai 3, B-9000 Ghent, Belgium

3 VIB Proteomics Expertise Center, VIB, A. Baertsoenkaai 3, B-9000 Ghent, Belgium

KEYWORDS: affinity purification-mass spectrometry, protein complex, SILAC

*Correspondence to Kris Gevaert VIB Medical Biotechnology Center A. Baertsoenkaai 3, B-9000 Gent, Belgium Tel: +32-9-264.92.74 Fax: +32-9-264.94.90 Email: [email protected]


ABSTRACT Protein complexes are essential in all organizational and functional aspects of the cell. Different strategies currently exist to analyze such protein complexes by mass spectrometry, including affinity purification (AP-MS) and proximal labeling based strategies. However, the high sensitivity of current mass spectrometers typically results in extensive protein lists mainly consisting of non-specifically copurified proteins. Finding the true positive interactors in these lists remains highly challenging. Here, we report a powerful design based on differential labeling with stable isotopes combined with non-equal mixing of control and experimental samples to discover bona fide interaction partners in AP-MS experiments. We apply this intelligent Mixing of Proteomes (iMixPro) concept to overexpression experiments for RAF1, RNF41 and TANK, and also to engineered cell lines expressing epitope-tagged endogenous PTPN14, JIP3 and IQGAP1. For all baits, we confirmed known interactions and found a number of novel interactions. The results for RNF41 and TANK were compared to a classical affinity purification experiment which demonstrated the efficiency and the specificity of the iMixPro approach.


INTRODUCTION The highly heterogeneous nature of protein interactions drives the expanding arsenal of protein interaction technologies. Two major fields can be distinguished, based either on targeted or on unbiased detection of interaction partners. In the targeted or binary approaches, a test is performed between a bait protein and a candidate prey protein, which results ultimately in a reporter activity (1). The yeast two-hybrid system (2) and the mammalian protein-protein interaction trap (3) are examples that have matured further to allow arrayed proteome-wide implementation (4). So-called co-complex methods allow unbiased detection of protein partners by using enrichment strategies to capture the bait protein under conditions where the associations in the protein complex are preserved. Classical approaches include the affinity purification-mass spectrometry (AP-MS) approaches where an epitope tagged version of the bait is expressed and purified from lysates (5) with similar proteome-wide applications (6). Variations include tandem affinity purifications (TAP)-MS with two consecutive purifications (7, 8), and the direct purifications using specific antibodies (IP-MS; e.g. (9)). The lysis-independent approaches are a recent development wherein the variations in purifications due to lysis conditions (10) are tackled by proximal labeling using an enzymatic activity. BioID uses a promiscuous biotin ligase (11) for adding biotin groups to adjacent proteins, while APEX uses a peroxidase for this purpose (12). Another recently developed lysis-independent approach sorts protein complexes into virus-like particles under native conditions (13). The co-complex methods invariably lead to long protein candidate lists upon MS analysis of the samples. 
The main reason for this is the association of a huge number of proteins with the purification matrix, which is driven by weak, non-specific associations, further augmented by the high abundance of some proteins and by avidity effects, as multiple sites of association can be present on the matrix. Other proteins can in turn bind to these background proteins, which, combined with the binding of several chaperones, results in lists that easily exceed 1,000 proteins (14). This clearly poses extreme challenges for the analysis of such data and, accordingly, a number of strategies have been described to tackle this particular challenge. On the experimental side, the protocols can be further improved (15), standardized (16) or even combined (10), or, alternatively, the background can be reduced by consecutive purifications on different matrices (e.g. TAP-MS; (7, 17)). On the computational side, many filtering strategies such as SAINT (18) and SFINX (19) are available with their respective strengths and weaknesses, and their particular application windows (20, 21). Another important opportunity for tackling the complexity issue of co-complex analysis lies in experimental design. Indeed, the design of co-complex experiments is an essential aspect for successful downstream analysis. Clearly, the number of repeat and control experiments critically affects the outcome and thus the candidate interactor protein lists in label-free studies. The affinity enrichment-mass spectrometry (AE-MS) approach is a recent label-free implementation of AP-MS based on low stringency purifications using biological repeats of control and experiment samples under rigorous protocol routine (22). In this approach, the lack of peptide/protein quantification in control samples is solved by imputation of missing values in the underlying MAXQUANT/PERSEUS analysis pipeline. AE-MS has recently been applied to a large set of GFP-tagged human proteins expressed from endogenous promoters (23). Other experimental design options are available including the QUICK approach where differential labeling via Stable Isotope Labeling by Amino acids in Cell Culture (SILAC) is combined with RNAi-based knockdown of bait proteins to define interaction partners based on their difference from an equimolar ratio (24). Of note in this approach is the fact that RNAi-based knockdown is rarely complete, thus resulting in residual expression of the target protein.
In a differential SILAC analysis, this leads to a doublet MS peak with a high intensity difference between the light and heavy peptide forms. Although analysis software is improving, accurate determination of the ratio values from such highly regulated peptide doublets remains challenging and typically requires manual curation (25). Another use for SILAC in co-complex research relates to the mixing protocol used for combining the light and heavy labeled samples. In Purification After Mixing (PAM)-SILAC, the labeled samples (i.e. control and specific pull down samples) are mixed and analyzed at different time points. The dynamic interactions are readily exchanged, resulting in a gradual loss of the differential signal. In Mixing After Purification (MAP)-SILAC, the samples are only mixed right before the trypsin digest, providing a reference for PAM-SILAC (26). In the present study we apply an intelligent design to segregate specific binding partners from the bulk of background proteins by introducing different ratio-tags on proteins. By mixing a pair of light- and heavy-labeled bait samples together with 3 additional light control samples, we obtain a 1/1 ratio-tag for the bait protein and its associated partners, while all background proteins have a 4/1 light/heavy (L/H) ratio. We previously applied a similar design scheme to delineate proteolytic events in protease degradome studies (25). We explore the application of our intelligent Mixing of Proteomes (iMixPro) concept to AP-MS experiments using both (over)expressed and endogenous proteins. In addition, for some of these baits, the results were compared to a classical AP-MS design.
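The ratio-tag arithmetic behind this design can be illustrated with a minimal sketch. It assumes, as a simplification that is not stated in the text, that every purification contributes the same amount of each background protein and that bait-specific proteins occur only in the two bait purifications. The function name `expected_ratios` and its parameters are hypothetical and chosen for illustration only.

```python
import math


def expected_ratios(light_bait=1, heavy_bait=1, light_control=3):
    """Return the expected L/H ratios (bait-specific, background).

    Simplifying assumption: each purification contributes an equal amount
    of every background protein, and bait-specific proteins are present
    only in the bait purifications.
    """
    # The bait and its partners come only from the bait purifications.
    specific_lh = light_bait / heavy_bait
    # Background proteins bind the matrix in every purification.
    background_lh = (light_bait + light_control) / heavy_bait
    return specific_lh, background_lh


specific, background = expected_ratios()
print(specific, background)  # 1.0 4.0
# The same ratio-tags expressed as log2(H/L) values.
print(math.log2(1 / specific), math.log2(1 / background))  # 0.0 -2.0
```

Under these assumptions the two populations sit at log2(H/L) values of 0 and -2, so a cut-off of -1 lies midway between them on the log2 scale.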


EXPERIMENTAL SECTION Cell lines and reagents The pMET7-FLAG-RNF41 C34S H36Q construct was described elsewhere (27). The coding sequence for human cRAF/RAF1 was cloned in the pMET7-FLAG vector (28). The coding sequence for TANK was obtained from the ORFEOME 8.1 collection and was transferred to a pMET7-FLAG-GW destination vector. Engineering of the HCT116 cells for N-terminal tagging of JIP3 was performed according to (29) using a rAAV targeting vector containing a floxed neomycin resistance cassette followed by the translation initiation site and the sequence for the 3xFLAG-tag in frame with the encoded JIP3 protein. After virus production, HCT116 cells were infected and selected with G418. PCR screening allowed selection of single cell clones with the correct integration. Clones were then treated with CRE recombinase by adenoviral vector delivery and screened using genomic PCR for correct elimination of the neomycin cassette. Clones were further verified using Southern blot, Western blot and targeted Sanger sequencing of the altered region. The HCT116 cell line expressing FLAG-tagged IQGAP1 protein is described elsewhere (30). The DLD-1 cell line expressing FLAG-tagged endogenous PTPN14 was obtained from Horizon Discovery (UK).

SILAC labeling, cell lysis, pull down conditions and sample processing HEK293T cells were cultured using standard conditions in DMEM (GIBCO, Life Technologies) supplemented with 10% FBS in an 8% CO2 humidified atmosphere. For SILAC labeling, cells were grown in either light medium or heavy medium containing respectively 12C6 Lys and 12C6 Arg, or 13C6 Lys and 13C6 Arg at concentrations that prevent proline conversion. For transient transfections, 2.2 × 10^7 HEK293T cells were seeded in a 145 cm² dish (PD145) the day before transfection. The seeded cells were transfected for 8 h using linear polyethylene imine (PEI) with 14.5 µg bait expression vector or mock plasmid. The bait expression plasmid was transfected in 1 PD145 dish which was light labeled and 1 PD145 dish expressing heavy labeled proteins. The mock plasmid was transfected in cells cultured in 3 additional light-labeled PD145 dishes. Cell lysis and pull down conditions were based on the protocol described by Kean and colleagues (15). The cells were lysed after 32 h using lysis buffer (50 mM HEPES-KOH pH 8.0, 100 mM KCl, 2 mM EDTA, 0.1% NP40, 10% glycerol, 1 mM DTT, 0.5 mM PMSF, protease inhibitor cocktail tablet [Complete™, Roche], 0.25 mM sodium orthovanadate, 50 mM glycerophosphate and 10 mM NaF) (15). One additional freeze-thaw treatment was performed to ensure complete homogenization. After centrifugation (11,000 × g) for 15 min at 4°C, insoluble material was removed and the supernatant was incubated for 2 h at 4°C with 20 µl paramagnetic MyOne beads™ (Dynal, Thermo) loaded with 2 µg biotinylated anti-FLAG antibody (clone M2, Sigma Aldrich) for each of the samples (5 samples for each iMixPro analysis). All samples were washed once with 1 ml lysis buffer, and 200 µl FLAG magnetic rinsing buffer (20 mM TRIS-HCl pH 8.0 and 2 mM CaCl2) was added to each of the samples. Samples were then combined in a single reaction tube. The total supernatant was removed from the beads, and 750 ng sequencing-grade trypsin (Promega) in 5 µl TRIS-HCl pH 8.0 was added to digest the protein complexes overnight on the matrix. Another 3 h incubation with an additional 250 ng trypsin was performed after removal of the beads. After acidification of the samples by 2% formic acid, samples were directly analyzed on a Q Exactive mass spectrometer (PTPN14, JIP3 and IQGAP1), or were further fractionated using capillary HPLC (RAF1, RNF41 and TANK) prior to LC-MS/MS analysis. For further fractionation, we loaded 2.5 µl of the peptide mixture on a RP-HPLC system (C18-HD; 3 µm beads, 12 cm column with 250 µm internal diameter (I.D.)).
After 10 min of isocratic pumping of solvent A (10 mM ammonium acetate in water/acetonitrile (98/2, v/v) at pH 5.5), the peptides were separated using a linear gradient from 100% solvent A to 100% solvent B (10 mM ammonium acetate in acetonitrile/water (70/30, v/v), at pH 5.5) for 30 min at a flow rate of 3 µl/min. Peptides eluting between 0 and 80 min were collected at a time interval of 1 min each, further pooled into 20 fractions, vacuum dried and each fraction was re-dissolved in 12 µl of 2% acetonitrile and 0.1% TFA. We introduced the peptide mixtures from the 20 fractions into an LC-MS/MS system through an Ultimate 3000 RSLC nano LC (Thermo Scientific, Bremen, Germany) in-line connected to a LTQ-Orbitrap Velos mass spectrometer (RAF1, RNF41; Thermo Fisher Scientific).

Mass spectrometry and data analysis The sample mixture was first loaded on a trapping column (made in-house, 100 μm I.D. × 20 mm, 5 μm beads C18 Reprosil-HD, Dr. Maisch, Ammerbuch-Entringen, Germany). After flushing from the trapping column, the sample was loaded on an analytical column (made in-house, 75 μm I.D. × 150 mm, 5 μm beads C18 Reprosil-HD, Dr. Maisch) packed in the nanospray needle (PicoFrit SELF/P PicoTip emitter, PF360-75-15-N-5, NewObjective, Woburn, USA). Peptides were loaded with loading solvent (0.1% TFA in water) and separated with a linear gradient from 98% solvent A′ (0.1% formic acid in water) to 40% solvent B′ (0.08% formic acid in water/acetonitrile, 20/80 (v/v)) in 30 min at a flow rate of 300 nl/min. This was followed by a 15 min wash reaching 99% solvent B′. The mass spectrometer was operated in data-dependent mode, automatically switching between MS and MS/MS acquisition for the ten most abundant peaks in a given MS spectrum. In the LTQ-Orbitrap Velos, full scan MS spectra were acquired in the Orbitrap at a target value of 1E6 with a resolution of 60,000. The ten most intense ions were then isolated for fragmentation in the linear ion trap, with a dynamic exclusion of 20 s. Peptides were fragmented after filling the ion trap at a target value of 1E4 ion counts. The Q Exactive instrument was operated in data-dependent, positive ionization mode, automatically switching between MS and MS/MS acquisition for the 10 most abundant peaks in a given MS spectrum. The source voltage was 3.4 kV, and the capillary temperature was 275°C. One MS1 scan (m/z 400−2000, AGC target 3 × 10^6 ions, maximum ion injection time 80 ms) acquired at a resolution of 70,000 (at 200 m/z) was followed by up to 10 tandem MS scans (resolution 17,500 at 200 m/z) of the most intense ions fulfilling the defined selection criteria (AGC target 5 × 10^4 ions, maximum ion injection time 60 ms, isolation window 2 Da, fixed first mass 140 m/z, spectrum data type: centroid, underfill ratio 2%, intensity threshold 1.7E4, exclusion of unassigned, 1 and >5 charged precursors, peptide match preferred, exclude isotopes on, dynamic exclusion time 20 s). The HCD collision energy was set to 25% Normalized Collision Energy and the polydimethylcyclosiloxane background ion at 445.120025 Da was used for internal calibration (lock mass). MAXQUANT searches (version 1.5.3.30) were performed against the human SwissProt database (March 2016), with 4.5 ppm and 20 ppm tolerance on precursor and fragment mass respectively, with trypsin/P settings allowing up to 2 missed cleavages, and with methionine oxidation, N-terminal acetylation and pyroglutamate formation as variable modifications. Arg6 and Lys6 settings were used for labels. Minimum peptide length was set to 7, maximum peptide mass was 4,600 Da. PSM FDR and protein FDR were set to 0.01. Min. peptides and min. razor + unique peptides were set to 1. Contaminants and identifications against the REVERSE database were removed in the PERSEUS (version 1.5.3.2) analysis before the log2 transformation of the non-normalized protein and peptide ratios. When repeat experiments were available, the searches were performed together with the original sample to allow matching of MS spectra between runs. All raw data and MAXQUANT combined files were uploaded to PRIDE through ProteomeXchange (31). Candidate proteins were obtained by evaluating the peptide ratios of all peptides attributed to a protein. Proteins were only retained if at least two peptides were quantified, and if the majority of the peptides (≥75%) had a heavy over light (H/L) log2 ratio surpassing -1.
For completeness, the proteins identified and quantified by a single peptide with an H/L ratio over 0.5 (log2 ratio above -1) were added to the previous list of proteins in Supplementary Table 2. AP-MS data were searched using the Mascot search engine ((32); MatrixScience, www.matrixscience.com) by creating Mascot Generic Files from the MS/MS data in each LC run using the Distiller software (version 2.4.3.3, Matrix Science, www.matrixscience.com/Distiller). While generating these peak lists, grouping of spectra was allowed in Distiller with a maximal intermediate retention time of 30 s, and a maximal intermediate scan count of 5 was used where possible. Grouping was done with 0.005 Da precursor ion tolerance. A peak list was only generated when the MS/MS spectrum contained more than 10 peaks. There was no de-isotoping and the relative signal to noise limit was set at 2. These peak lists were then searched using the Mascot search engine with the Mascot Daemon interface (version 2.4.1, Matrix Science). Spectra were searched against the human protein entries in the Swiss-Prot database (SP2014_07; 20284 sequence entries). Variable modifications were set to methionine oxidation, pyro-glutamate formation of amino terminal glutamine and acetylation of the protein N-terminus. The mass tolerance on precursor ions was set to 10 ppm (with Mascot’s C13 option set to 1) and on fragment ions to 20 mmu. The instrument setting was put on ESI-QUAD. Enzyme was set to trypsin, allowing for one missed cleavage. Only peptides that scored above the threshold score, set at 99% confidence, and that were ranked first were retained. RAW files and identification lists were uploaded to PRIDE via ProteomeXchange and carry PXD004246 as dataset identifier (31). Spectral count files for the different baits in combination with three EGFP bait control experiments were uploaded and analysed by standard SAINT through the CRAPOME website (www.crapome.org).
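The iMixPro candidate selection rule given in this section (at least two quantified peptides, with at least 75% of them showing a log2 H/L ratio above -1) can be sketched as follows. This is an illustrative reimplementation, not the PERSEUS workflow used in the study; the function name `is_candidate`, its parameters and the example ratios are hypothetical.

```python
def is_candidate(log2_hl_ratios, cutoff=-1.0, min_peptides=2, min_fraction=0.75):
    """Apply the selection rule to the peptide log2(H/L) ratios of one protein."""
    n = len(log2_hl_ratios)
    # Proteins quantified with fewer than two peptides are not retained.
    if n < min_peptides:
        return False
    # Count the peptides whose log2(H/L) ratio surpasses the cut-off.
    n_above = sum(1 for ratio in log2_hl_ratios if ratio > cutoff)
    return n_above / n >= min_fraction


# Hypothetical peptide ratios, for illustration only (not data from the study).
proteins = {
    "interactor_like": [-0.1, 0.2, -0.4, -0.3],  # all peptides near 1/1
    "background_like": [-2.1, -1.9, -2.0],       # all peptides near 4/1 L/H
    "mixed": [-0.2, -1.8, -2.0, -0.1],           # only half above the cut-off
    "single_peptide": [0.0],                     # too few quantified peptides
}
candidates = [name for name, ratios in proteins.items() if is_candidate(ratios)]
print(candidates)  # ['interactor_like']
```

The single-peptide case is rejected here by design; in the text such proteins are reported separately in Supplementary Table 2.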


RESULTS AND DISCUSSION The iMixPro design strategy for AP-MS Intelligent mixing of labeled control and experimental samples was previously successfully applied for the discovery of protease sites on a proteome-wide scale (25). We reasoned that a similar design could also aid in the analysis of co-complex experiments where a high number of false positives typically hampers the identification of specific and unique interaction partners. Figure 1 shows the concept of the iMixPro application for AP-MS experiments. In iMixPro, a total of 5 samples are prepared; in the example presented in Figure 1, these include two affinity purifications for conditions where the bait is present, one from light and one from heavy stable isotope-labeled cells, while three additional control purifications are performed from conditions without any bait expression in light labeled cells. All purifications are performed separately, and samples are only mixed right before trypsin digestion. The combined trypsin-digested sample is then presented for MS analysis. This results in two ratio distributions when all obtained ratios are considered: a large distribution around a 4/1 L/H (light to heavy) ratio, containing all contaminant proteins as the mixing of samples leads to a skewed increase towards light peptides for these proteins; and a much smaller distribution around the 1/1 L/H ratio for those proteins that are unique for the bait samples. Peptides generated from contaminant proteins introduced in the sample by cell culture conditions (e.g. serum proteins), sample handling or preparation (e.g. keratins) or the matrix (e.g. antibody or protein A or G) carry no heavy label, will therefore have a very high light to heavy ratio, and can be easily removed from the analysis.

Application of iMixPro to samples upon transient transfection of bait proteins To explore the iMixPro concept, we started by transient expression of a FLAG-tagged version of the MAPK pathway kinase RAF1 (cRAF) in HEK293T cells. The iMixPro labeling scheme discussed in the previous section was used for identifying RAF1 interactors. The MS/MS data were analyzed by MAXQUANT (33) and PERSEUS (22). The numbers of total and quantified peptide to spectrum matches (PSMs), peptides and proteins for the different experiments in this study are shown in Supplementary Table 1. In the upper panels of Figure 2, we show the ratio distributions of the identified and quantified proteins. RAF1 resides near a 1/1 L/H ratio, while the majority of identified and quantified proteins resides around a 4/1 L/H ratio. According to the iMixPro principle, interacting proteins are expected to segregate with the bait protein around similar ratios. Indeed, when assessing the proteins having ratios close to the bait ratio (log2 H/L ratios above -1), the selected protein lists contain several known interaction partners as evident from the BioGRID database (version 3.4 (34), Table 1). The protein lists were manually curated to assess consistent ratios for the different quantified peptides from the same protein (thus considering variation of the distinct peptides). A high confidence protein list was derived by only retaining the proteins identified with more than one quantified peptide, with the majority of the quantified peptides having log2 ratios > -1 (Table 1). In this way we identified 14-3-3 proteins as interaction partners, as well as the chaperones CDC37 and HSP90. Other identified proteins include two members of the ESCRT-III complex, CHMP4A and CHMP4B. Supplementary Table 2 shows the additional proteins identified with a single quantified peptide in the relevant ratio range. When assessing the ratios for the identified and quantified peptides, we see a similar picture wherein most of the peptide ratios are indeed observed around 4/1, reflecting the high number of contaminating proteins in the sample. The distribution is slightly skewed towards 1/1 ratio values.
The peptide ratios for known binding partner families (14-3-3 proteins and chaperones) are highlighted and show that these reside around 1 (Figure 2, lower panels). The association of RAF1 with the ESCRT-III complex has not been shown before. It will be interesting to see if there are functional implications to this association, such as the sorting or recruitment to multivesicular bodies of activated RAF1 complexes.


We repeated this iMixPro strategy for two additional bait proteins, Really Interesting New Gene (RING) Finger protein 41 (RNF41; also called Neuregulin Receptor Degradation Protein-1/NRDP1) and TRAF Family Member-Associated NF-κB Activator (TANK or I-TRAF). Protein ratios for the experiments are shown in Supplementary Figure 1 (bait proteins are highlighted). As for RAF1, we removed proteins with variable peptide ratios. For RNF41 the list seems comprehensive, including interaction partners reported in BioGRID (BIRC6/BRUCE, SOGA1, HOMER2, MARK2, and CACYCB). The interactions with NAV1 and KIAA1598/SHOOTIN1A were found using the lysis-independent Virotrap approach and were also confirmed by classical AP-MS ((13); see also below). NAV1 and KIAA1598 are involved in axon guidance and neuronal polarization respectively (35, 36). A number of candidate partners were not reported before, including isoforms from known interactors (HOMER1, SOGA2), as well as other proteins such as FHOD3, LRRFIP2 and FLII (Table 1). It will be interesting to see if and how these novel protein associations add to the receptor sorting function of RNF41 (37). For TANK, we identified its known interaction partners BIRC2, TBK1, TRAF2 and TRAF3. DIABLO was identified with consistent peptide ratios around the chosen cut-off value (Table 1). Peptide ratio plots for these experiments are shown in Supplementary Figure 2. Although the results clearly indicate that iMixPro allows for robust identification of known interaction partners of different bait proteins, we repeated the iMixPro experiment with TANK to evaluate reproducibility of the strategy. Supplementary Figure 3 shows the ratio distributions for the identified and quantified proteins and their peptides. The resulting protein lists are shown in Supplementary Table 2. Both experiments show strong overlap in the high confidence identifications.
To further evaluate the overall reproducibility, we visualized the ratios of the peptides identified in both experiments in a scatter plot and assessed the correlation between the two experiments. Of note, the majority of peptides linked to the known and novel interaction partners of TANK showed very similar behavior in the two experiments (Supplementary Figure 4). Nevertheless, the lists of proteins identified and quantified with a single peptide are significantly different, underscoring the fact that validation efforts should be ideally focused on the proteins identified and quantified with at least two different peptides.

Comparison of iMixPro to classical AP-MS We further compared the data obtained with iMixPro to a classical label-free AP-MS approach. To this end, we performed triplicate pull-down experiments on the RNF41 and TANK baits in combination with three EGFP control pull-down experiments to allow analysis by SAINT (18). The protein interactions obtained by SAINT for SAINT probabilities exceeding 0.8 are compared with iMixPro in Figure 3. The complete SAINT output is shown in Supplementary Table 3. For RNF41, the candidate list for SAINT is short when the recommended 0.8 SAINT probability score is considered (7 candidate partners) with only 4 overlapping proteins. Dropping the SAINT probability score to 0.5 reveals another 3 overlapping proteins, but increases the number of candidates, resulting in a total of 86 candidate interaction partners. Based on its powerful design, a single iMixPro experiment thus seems to reveal all relevant associations, while 3 control experiments and 3 replicates were required in the classical AP-MS approach used here. In addition, it can be expected that many non-specific associations are still present in the AP-MS lists. Proteins such as ACTB, SPTNB, IRS4, DBN, SAFB, and ACACA are found with high frequencies in the CRAPOME table (www.CRAPOME.org; (14)) and are therefore more likely to be contaminants.
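The threshold-dependent comparison described above can be sketched as a simple set operation. The snippet below is an illustration only: the prey names and probabilities are placeholders and not results from this study, and `overlap_at_threshold` is a hypothetical helper, not part of SAINT or the CRAPOME website.

```python
def overlap_at_threshold(saint_probs, imixpro_hits, threshold):
    """Return (SAINT candidates, overlap with iMixPro) at a probability cut-off."""
    # Keep preys whose SAINT probability exceeds the threshold.
    saint_hits = {prey for prey, prob in saint_probs.items() if prob > threshold}
    return saint_hits, saint_hits & imixpro_hits


# Placeholder values for illustration only (not data from the study).
saint_probs = {"preyA": 0.95, "preyB": 0.85, "preyC": 0.60, "preyD": 0.55, "preyE": 0.20}
imixpro_hits = {"preyA", "preyC", "preyF"}

for threshold in (0.8, 0.5):
    saint_hits, shared = overlap_at_threshold(saint_probs, imixpro_hits, threshold)
    print(threshold, len(saint_hits), sorted(shared))
# 0.8 2 ['preyA']
# 0.5 4 ['preyA', 'preyC']
```

As in the comparison above, lowering the cut-off recovers additional shared candidates at the cost of a longer SAINT list.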

iMixPro on engineered endogenous proteins

Transient expression typically leads to high expression levels of proteins, which can affect different aspects of the protein such as localization and folding, causing possible aberrant associations or even toxicity (38). Inducible expression systems allow better control over protein expression (e.g. Flp-In T-REx™ system; (39)) but still result in a considerable amount of protein in the target cells and may require extensive cell line manipulations (e.g. only few Flp-In T-REx™ cell lines are currently available). With the advances in the field of genome engineering, it became feasible and efficient to introduce epitope tags on endogenous proteins (40). To explore iMixPro in the context of tagged endogenous proteins, we introduced a 3xFLAG tag on the N-terminus of the JIP3 protein. The cell line was derived from the HCT116 colon carcinoma cell line and generated by rAAV-based homologous recombination. A targeting vector containing homology arms flanking the N-terminus of JIP3 was designed to insert an N-terminal triple FLAG tag sequence. A neomycin resistance cassette flanked by LOX sites allowed selection of cells with stably integrated tag sequences. PCR screening and Southern analysis were used to verify clones before removal of the selection cassette, while additional targeted sequencing and Western blotting were performed to select and verify the correct cell clones after removal of the cassette by virus-assisted delivery of Cre recombinase. As for the transient expression experiments, we also labeled the engineered cells by light and heavy SILAC labeling using Arg and Lys residues. The parental HCT116 cells that remained unmodified were light labeled and added as controls to move peptides from contaminants to 4/1 light over heavy ratios. The 5 samples were lysed and processed separately and only mixed during the final washing step. The combined sample was then on-bead digested with trypsin and analyzed by MS without fractionation prior to injection. After MaxQuant and Perseus analysis, we plotted the distribution of the protein and the peptide ratios (Figure 4). We found SPAG9/JIP4 to segregate very strongly with JIP3, in addition to a number of other proteins (KIF5B, FAM32A, MYH10, RBM34 and WDR1). The interaction with SPAG9 was also reported in an ongoing high-throughput AP-MS study (6).
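The ratio arithmetic behind this mixing scheme can be made explicit with a small sketch. It assumes an idealized situation that the text does not guarantee in practice: every one of the 5 samples contributes an equal amount of each contaminant, and the light and heavy bait-expressing samples contribute equal amounts of each true partner. The function name is hypothetical.

```python
import math
from fractions import Fraction

def expected_heavy_over_light(light_sources, heavy_sources):
    """Idealized H/L ratio when each contributing sample adds an equal amount."""
    return Fraction(heavy_sources, light_sources)

# Design described in the text: 3 light-labeled control samples plus
# 1 light-labeled and 1 heavy-labeled sample carrying the tagged bait.
N_LIGHT_CONTROLS = 3

# Contaminants bind in all 5 pull-downs: 4 light sources, 1 heavy source.
background = expected_heavy_over_light(N_LIGHT_CONTROLS + 1, 1)

# True partners are only captured in the 2 bait-expressing samples.
specific = expected_heavy_over_light(1, 1)

for label, ratio in (("background", background), ("bait complex", specific)):
    print(f"{label}: H/L = {ratio}, log2(H/L) = {math.log2(ratio):.1f}")
```

Under these assumptions contaminants are expected near a log2 H/L of -2 and bait partners near 0, so a cut-off at a log2 ratio of -1 lies halfway between the two populations.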


We additionally tested another in-house generated cell line expressing C-terminally FLAG-tagged endogenous IQGAP1, and a commercially available cell line expressing a FLAG-tagged Protein Tyrosine Phosphatase Non-receptor type 14 (PTPN14) protein. The endogenously tagged PTPN14 contains the Calmodulin Binding Peptide (CBP) and a triple FLAG tag on the C-terminus of the protein. The tag was introduced in both alleles of the gene in the colon carcinoma DLD-1 cell line. Generation of the HCT116 cell line expressing C-terminally tagged endogenous IQGAP1 was described before (30). We performed iMixPro experiments as described for the JIP3 cell line, resulting in the expected protein and peptide ratios (Supplementary Figure 5 and Supplementary Figure 6, respectively). Upon detailed analysis of candidate interacting proteins for PTPN14 with ratios close to 1/1, we found WWC3, a close homolog of the known interaction partner WWC1. Additional candidate interaction partners are the nuclear lamina protein TMPO, the cytoskeletal protein VIL1 and SEPT11, a member of the septin family (Table 1). The link to the nuclear lamina was not described before but fits the role of this phosphatase in cell proliferation (41). For IQGAP1, the list of high-confidence interacting proteins is limited to the known association with CDC42 in addition to a number of putative interaction partners (Table 1).


CONCLUSIONS

With the ever-increasing sensitivity of MS instruments, the lists of identified proteins resulting from AP-MS experiments get significantly longer, easily exceeding 1,000 proteins ((14), this report). Most of these proteins bind non-specifically to the matrix or the purified complex and are thus false positives. Finding the true positives in these lists is a considerable analytical challenge, which is currently addressed at different levels: experimental design, improved or parallelized pull-down conditions, and data analysis. With iMixPro we present a simple, yet powerful approach to isolate true positive candidates from AP-MS experiments. The principle relies on an experimental design scheme wherein differentially labeled samples are mixed to obtain a specific and unique ratio-tag for the bona fide interaction partners. By comparing iMixPro to a classical AP-MS filter approach, we show the efficiency and specificity obtained with this design. Further improvements of iMixPro can focus on reducing (biased) user manipulation by implementing distribution fitting software and by automatically assessing peptide variation for candidate proteins. We currently use a rather arbitrary lower-end cut-off of a 0.5 heavy over light ratio (log2 ratio > -1) to select the candidate proteins. The underlying design of iMixPro implies two overlapping distributions (Figure 1). Fitting software for mixture distributions could hence provide a more accurate estimate for this cut-off, and would allow an estimate of the chance of false discoveries (false discovery rate) at this value. It will be interesting to see whether the iMixPro concept can be transferred to other co-complex approaches such as BioID (11, 39) or Virotrap (13), or even to other MS-based proteomics experiments where unique events occur upon treatment. For example, unique post-translational modifications that follow upon a specific stimulus should be detectable with an iMixPro design scheme.
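As an illustration of what such mixture fitting could look like, the sketch below fits a two-component Gaussian mixture to log2 H/L ratios by expectation-maximization and derives a model-based false discovery rate at a chosen cut-off. Everything here is an assumption made for illustration: the text does not specify the distribution family or the fitting software, the data are synthetic, and all function names are hypothetical.

```python
import math
import random

def normal_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def normal_sf(x, mu, sd):
    """Survival function P(X > x) of a normal distribution."""
    return 0.5 * math.erfc((x - mu) / (sd * math.sqrt(2.0)))

def fit_two_gaussians(values, mu_init=(-2.0, 0.0), sd_init=0.5, n_iter=100):
    """EM fit; component 0 = background, component 1 = bait complex.

    Means start at the positions expected from the 4/1 and 1/1 design ratios.
    """
    mu, sd, w = list(mu_init), [sd_init, sd_init], [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for every value.
        resp = []
        for x in values:
            p = [w[k] * normal_pdf(x, mu[k], sd[k]) for k in range(2)]
            total = max(p[0] + p[1], 1e-300)  # guard against underflow
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weight, mean and spread of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, values)) / nk
            sd[k] = max(math.sqrt(var), 1e-3)
            w[k] = nk / len(values)
    return w, mu, sd

def estimated_fdr(cutoff, w, mu, sd):
    """Expected background fraction among all values above the cut-off."""
    bg = w[0] * normal_sf(cutoff, mu[0], sd[0])
    sp = w[1] * normal_sf(cutoff, mu[1], sd[1])
    return bg / (bg + sp)

# Synthetic log2(H/L) values: a large background population and a small
# bait-complex population (illustrative only, not data from this study).
rng = random.Random(0)
log2_ratios = [rng.gauss(-2.0, 0.35) for _ in range(950)]
log2_ratios += [rng.gauss(0.0, 0.30) for _ in range(50)]

w, mu, sd = fit_two_gaussians(log2_ratios)
fdr_at_minus_one = estimated_fdr(-1.0, w, mu, sd)
print(f"means = {mu[0]:.2f}, {mu[1]:.2f}; FDR at log2 > -1 = {fdr_at_minus_one:.3f}")
```

Real ratio distributions may be skewed or heavy-tailed, in which case a Gaussian mixture would misjudge the overlap region; the sketch only shows how a fitted mixture turns the fixed cut-off into a quantity with an attached error estimate.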


ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI:

Supplementary Figure 1: Ratio histograms for identified and quantified proteins for RNF41 and TANK experiments.
Supplementary Figure 2: Ratio histograms for identified and quantified peptides for RNF41 and TANK experiments.
Supplementary Figure 3: Ratio histograms for identified and quantified proteins and peptides for a TANK repeat experiment.
Supplementary Figure 4: Scatter plot showing the H/L ratios in a log2 scale for peptides identified in both repeats.
Supplementary Figure 5: Histogram plots for the protein ratios obtained after iMixPro AP-MS for endogenous PTPN14 and IQGAP1.
Supplementary Figure 6: Histogram plots for the peptide ratios for endogenous PTPN14 and IQGAP1.
Supplementary Table 1: Relevant statistics for the iMixPro AP-MS experiments performed in this study showing the number of MS/MS spectra recorded, the number of peptide to spectra matches (PSMs) or identified MS/MS spectra, the number of peptides and quantified peptides, and the associated number of proteins and quantified proteins.
Supplementary Table 2: All proteins identified and quantified with consistent Heavy over Light (H/L) peptide log2 ratios higher than -1.
Supplementary Table 3: SAINT analysis for the AP-MS experiments performed for RNF41 and TANK.


ACKNOWLEDGEMENTS

The authors wish to thank Jonathan Vandenbussche and Evy Timmerman for MS operation and Kathleen Moens for technical assistance. The project was funded by the ‘Fund for Scientific Research-Flanders’ (FWO-Vlaanderen; grants G011312N and G050913N, and a personal grant to FI), by the GROUP-ID Multidisciplinary Research Program, by a Methusalem grant to JT for support of SE, and by a VIB Techwatch New Technologies Grant. JT holds an ERC Advanced Grant (CYRE, 340941).

AUTHOR CONTRIBUTIONS

SE managed the project and wrote the manuscript with the help of FI, KG and JT. FI devised the iMixPro concept and performed initial experiments with the help of SE. NS and DDS performed iMixPro experiments. EVQ and GV assisted with data analysis and generation of cell lines.

PROTEOMICS DATA

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE (31) partner repository with the dataset identifier PXD004246.


FIGURE LEGENDS

Figure 1: Schematic representation of the iMixPro design scheme for affinity purification-mass spectrometry experiments. A total of 5 cell culture conditions are prepared: 3 samples are labeled by light R and K residues and serve as controls, while 2 samples contain bait protein expression. Cells are lysed and the complexes are purified on beads separately for each sample. Purified complexes are then combined and processed by trypsin on-bead digestion. Downstream MS analysis of the peptides roughly results in 2 states: light and heavy peptides from bait complexes approach equimolar ratios, while background proteins have a 4/1 light over heavy ratio. When assessing the global proteome histogram, the majority of proteins will segregate around the 4/1 ratio while a smaller population of proteins corresponding to the bait complex will have ratios that approach 1/1 light over heavy. Unlabeled proteins coming from serum or capture reagents will have extreme light ratios and can be easily removed.

Figure 2: Ratio histograms for identified and quantified proteins and peptides using RAF1 as bait in an iMixPro design scheme. The bulk of the proteins (upper panels) and peptides (lower panels) reside between the log2 values of -5 and -2, corresponding to the contaminating proteins. The RAF1 protein (black) and its peptides (red) can be found near the 1/1 ratio (right panels for zoomed plots). Associating proteins have protein and peptide ratios (green for peptides derived from 14-3-3 proteins and blue for peptides coming from chaperones) that approximate equimolar ratios. The heavy over light (H/L) ratios on the X-axis are shown in a log2 scale.

Figure 3: Comparison between iMixPro and classical AP-MS for RNF41 (upper panel) and TANK (lower panel). For iMixPro we selected the proteins identified and quantified with at least 2 peptides, with the majority of quantified peptides having H/L ratios over -0.5. For AP-MS experiments, we performed 3 biological repeat experiments for the baits and combined the data with 3 EGFP bait control experiments. SAINT analysis was performed using RNF41 or TANK bait experiments combined with EGFP controls. Only candidate proteins with a SAINT probability score (SP) of at least 0.8 were used in this comparison. Full SAINT lists can be found in Supplementary Table 3.

Figure 4: Histogram plots for the protein (upper panels) and peptide ratios (lower panels) obtained after iMixPro AP-MS for JIP3. The bait proteins are highlighted in black (right panels for zoomed regions). The peptide ratios for JIP3 are shown in red while peptide ratios for the strongly co-segregating SPAG9/JIP4 protein are shown in green. The heavy over light (H/L) ratios on the X-axis are shown in a log2 scale.


REFERENCES

1. Stynen, B.; Tournu, H.; Tavernier, J.; Van Dijck, P., Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiol Mol Biol Rev 2012, 76, (2), 331-82.
2. Fields, S.; Song, O., A novel genetic system to detect protein-protein interactions. Nature 1989, 340, (6230), 245-6.
3. Eyckerman, S.; Verhee, A.; der Heyden, J. V.; Lemmens, I.; Ostade, X. V.; Vandekerckhove, J.; Tavernier, J., Design and application of a cytokine-receptor-based interaction trap. Nat Cell Biol 2001, 3, (12), 1114-9.
4. Rolland, T.; Tasan, M.; Charloteaux, B.; Pevzner, S. J.; Zhong, Q.; Sahni, N.; Yi, S.; Lemmens, I.; Fontanillo, C.; Mosca, R.; Kamburov, A.; Ghiassian, S. D.; Yang, X.; Ghamsari, L.; Balcha, D.; Begg, B. E.; Braun, P.; Brehme, M.; Broly, M. P.; Carvunis, A. R.; Convery-Zupan, D.; Corominas, R.; Coulombe-Huntington, J.; Dann, E.; Dreze, M.; Dricot, A.; Fan, C.; Franzosa, E.; Gebreab, F.; Gutierrez, B. J.; Hardy, M. F.; Jin, M.; Kang, S.; Kiros, R.; Lin, G. N.; Luck, K.; MacWilliams, A.; Menche, J.; Murray, R. R.; Palagi, A.; Poulin, M. M.; Rambout, X.; Rasla, J.; Reichert, P.; Romero, V.; Ruyssinck, E.; Sahalie, J. M.; Scholz, A.; Shah, A. A.; Sharma, A.; Shen, Y.; Spirohn, K.; Tam, S.; Tejeda, A. O.; Trigg, S. A.; Twizere, J. C.; Vega, K.; Walsh, J.; Cusick, M. E.; Xia, Y.; Barabasi, A. L.; Iakoucheva, L. M.; Aloy, P.; De Las Rivas, J.; Tavernier, J.; Calderwood, M. A.; Hill, D. E.; Hao, T.; Roth, F. P.; Vidal, M., A proteome-scale map of the human interactome network. Cell 2014, 159, (5), 1212-26. doi: 10.1016/j.cell.2014.10.050.
5. Gingras, A. C.; Gstaiger, M.; Raught, B.; Aebersold, R., Analysis of protein complexes using mass spectrometry. Nat Rev Mol Cell Biol 2007, 8, (8), 645-54.
6. Huttlin, E. L.; Ting, L.; Bruckner, R. J.; Gebreab, F.; Gygi, M. P.; Szpyt, J.; Tam, S.; Zarraga, G.; Colby, G.; Baltier, K.; Dong, R.; Guarani, V.; Vaites, L. P.; Ordureau, A.; Rad, R.; Erickson, B. K.; Wuhr, M.; Chick, J.; Zhai, B.; Kolippakkam, D.; Mintseris, J.; Obar, R. A.; Harris, T.; Artavanis-Tsakonas, S.; Sowa, M. E.; De Camilli, P.; Paulo, J. A.; Harper, J. W.; Gygi, S. P., The BioPlex Network: A Systematic Exploration of the Human Interactome. Cell 2015, 162, (2), 425-40. doi: 10.1016/j.cell.2015.06.043.
7. Gavin, A. C.; Bosche, M.; Krause, R.; Grandi, P.; Marzioch, M.; Bauer, A.; Schultz, J.; Rick, J. M.; Michon, A. M.; Cruciat, C. M.; Remor, M.; Hofert, C.; Schelder, M.; Brajenovic, M.; Ruffner, H.; Merino, A.; Klein, K.; Hudak, M.; Dickson, D.; Rudi, T.; Gnau, V.; Bauch, A.; Bastuck, S.; Huhse, B.; Leutwein, C.; Heurtier, M. A.; Copley, R. R.; Edelmann, A.; Querfurth, E.; Rybin, V.; Drewes, G.; Raida, M.; Bouwmeester, T.; Bork, P.; Seraphin, B.; Kuster, B.; Neubauer, G.; Superti-Furga, G., Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 2002, 415, (6868), 141-7.
8. Van Leene, J.; Eeckhout, D.; Cannoot, B.; De Winne, N.; Persiau, G.; Van De Slijke, E.; Vercruysse, L.; Dedecker, M.; Verkest, A.; Vandepoele, K.; Martens, L.; Witters, E.; Gevaert, K.; De Jaeger, G., An improved toolbox to unravel the plant cellular machinery by tandem affinity purification of Arabidopsis protein complexes. Nat Protoc 2015, 10, (1), 169-87. doi: 10.1038/nprot.2014.199.
9. Malovannaya, A.; Lanz, R. B.; Jung, S. Y.; Bulynko, Y.; Le, N. T.; Chan, D. W.; Ding, C.; Shi, Y.; Yucer, N.; Krenciute, G.; Kim, B. J.; Li, C.; Chen, R.; Li, W.; Wang, Y.; O'Malley, B. W.; Qin, J., Analysis of the human endogenous coregulator complexome. Cell 2011, 145, (5), 787-99.
10. Hakhverdyan, Z.; Domanski, M.; Hough, L. E.; Oroskar, A. A.; Oroskar, A. R.; Keegan, S.; Dilworth, D. J.; Molloy, K. R.; Sherman, V.; Aitchison, J. D.; Fenyo, D.; Chait, B. T.; Jensen, T. H.; Rout, M. P.; LaCava, J., Rapid, optimized interactomic screening. Nat Methods 2015, 12, (6), 553-60.
11. Roux, K. J.; Kim, D. I.; Raida, M.; Burke, B., A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. J Cell Biol 2012, 196, (6), 801-10.


12. Lam, S. S.; Martell, J. D.; Kamer, K. A.-O.; Deerinck, T. J.; Ellisman, M. H.; Mootha, V. K.; Ting, A. Y., Directed evolution of APEX2 for electron microscopy and proximity labeling. Nat Methods 2015, 12, (1), 51-4. doi: 10.1038/nmeth.3179.
13. Eyckerman, S.; Titeca, K.; Van Quickelberghe, E.; Cloots, E.; Verhee, A.; Samyn, N.; De Ceuninck, L.; Timmerman, E.; De Sutter, D.; Lievens, S.; Van Calenbergh, S.; Gevaert, K.; Tavernier, J., Trapping mammalian protein complexes in viral particles. Nat Commun 2016, 7, 11416. doi: 10.1038/ncomms11416.
14. Mellacheruvu, D.; Wright, Z.; Couzens, A. L.; Lambert, J. P.; St-Denis, N. A.; Li, T.; Miteva, Y. V.; Hauri, S.; Sardiu, M. E.; Low, T. Y.; Halim, V. A.; Bagshaw, R. D.; Hubner, N. C.; Al-Hakim, A.; Bouchard, A.; Faubert, D.; Fermin, D.; Dunham, W. H.; Goudreault, M.; Lin, Z. Y.; Badillo, B. G.; Pawson, T.; Durocher, D.; Coulombe, B.; Aebersold, R.; Superti-Furga, G.; Colinge, J.; Heck, A. J.; Choi, H.; Gstaiger, M.; Mohammed, S.; Cristea, I. M.; Bennett, K. L.; Washburn, M. P.; Raught, B.; Ewing, R. M.; Gingras, A. C.; Nesvizhskii, A. I., The CRAPome: a contaminant repository for affinity purification-mass spectrometry data. Nat Methods 2013, 10, (8), 730-6.
15. Kean, M. J.; Couzens, A. L.; Gingras, A. C., Mass spectrometry approaches to study mammalian kinase and phosphatase associated proteins. Methods 2012, 57, (4), 400-8.
16. Varjosalo, M.; Sacco, R.; Stukalov, A.; van Drogen, A.; Planyavsky, M.; Hauri, S.; Aebersold, R.; Bennett, K. L.; Colinge, J.; Gstaiger, M.; Superti-Furga, G., Interlaboratory reproducibility of large-scale human protein-complex analysis by standardized AP-MS. Nat Methods 2013, 10, (4), 307-14. doi: 10.1038/nmeth.2400.
17. Oeffinger, M., Two steps forward--one step back: advances in affinity purification mass spectrometry of macromolecular complexes. Proteomics 2012, 12, (10), 1591-608. doi: 10.1002/pmic.201100509.
18. Choi, H.; Larsen, B.; Lin, Z. Y.; Breitkreutz, A.; Mellacheruvu, D.; Fermin, D.; Qin, Z. S.; Tyers, M.; Gingras, A. C.; Nesvizhskii, A. I., SAINT: probabilistic scoring of affinity purification-mass spectrometry data. Nat Methods 2011, 8, (1), 70-3. doi: 10.1038/nmeth.1541.
19. Titeca, K.; Meysman, P.; Gevaert, K.; Tavernier, J.; Laukens, K.; Martens, L.; Eyckerman, S., SFINX: Straightforward Filtering Index for Affinity Purification-Mass. J Proteome Res 2016, 15, (1), 332-8. doi: 10.1021/acs.jproteome.5b00666.
20. Meysman, P.; Titeca, K.; Eyckerman, S.; Tavernier, J.; Goethals, B.; Martens, L.; Valkenborg, D.; Laukens, K., Protein complex analysis: From raw protein lists to protein interaction networks. Mass Spectrom Rev 2015, 28, (10), 21485.
21. Pu, S.; Vlasblom, J.; Turinsky, A.; Marcon, E.; Phanse, S.; Trimble, S. S.; Olsen, J.; Greenblatt, J.; Emili, A.; Wodak, S. J., Extracting high confidence protein interactions from affinity purification data: at the crossroads. J Proteomics 2015, 118, 63-80. doi: 10.1016/j.jprot.2015.03.009.
22. Keilhauer, E. C.; Hein, M. Y.; Mann, M., Accurate protein complex retrieval by affinity enrichment mass spectrometry (AE-MS) rather than affinity purification mass spectrometry (AP-MS). Mol Cell Proteomics 2015, 14, (1), 120-35.
23. Hein, M. Y.; Hubner, N. C.; Poser, I.; Cox, J.; Nagaraj, N.; Toyoda, Y.; Gak, I. A.; Weisswange, I.; Mansfeld, J.; Buchholz, F.; Hyman, A. A.; Mann, M., A Human Interactome in Three Quantitative Dimensions Organized by Stoichiometries and Abundances. Cell 2015, 163, (3), 712-23.
24. Selbach, M.; Mann, M., Protein interaction screening by quantitative immunoprecipitation combined with knockdown (QUICK). Nat Methods 2006, 3, (12), 981-3.
25. Impens, F.; Colaert, N.; Helsens, K.; Ghesquiere, B.; Timmerman, E.; De Bock, P. J.; Chain, B. M.; Vandekerckhove, J.; Gevaert, K., A quantitative proteomics design for systematic identification of protease cleavage events. Mol Cell Proteomics 2010, 9, (10), 2327-33. doi: 10.1074/mcp.M110.001271.


26. Wang, X.; Huang, L., Identifying dynamic interactors of protein complexes by quantitative mass spectrometry. Mol Cell Proteomics 2008, 7, (1), 46-57.
27. De Ceuninck, L.; Wauman, J.; Masschaele, D.; Peelman, F.; Tavernier, J., Reciprocal cross-regulation between RNF41 and USP8 controls cytokine receptor sorting and processing. J Cell Sci 2013, 126, (Pt 16), 3770-81.
28. Eyckerman, S.; Broekaert, D.; Verhee, A.; Vandekerckhove, J.; Tavernier, J., Identification of the Y985 and Y1077 motifs as SOCS3 recruitment sites in the murine leptin receptor. FEBS Lett 2000, 486, (1), 33-7.
29. Khan, I. F.; Hirata, R. K.; Russell, D. W., AAV-mediated gene targeting methods for human cells. Nat Protoc 2011, 6, (4), 482-501. doi: 10.1038/nprot.2011.301.
30. Vandemoortele, G.; Staes, A.; Gonnelli, G.; Samyn, N.; De Sutter, D.; Vandermarliere, E.; Timmerman, E.; Gevaert, K.; Martens, L.; Eyckerman, S., An extra dimension in protein tagging by quantifying universal proteotypic peptides using targeted proteomics. Sci Rep 2016, 6, 27220. doi: 10.1038/srep27220.
31. Vizcaino, J. A.; Deutsch, E. W.; Wang, R.; Csordas, A.; Reisinger, F.; Rios, D.; Dianes, J. A.; Sun, Z.; Farrah, T.; Bandeira, N.; Binz, P. A.; Xenarios, I.; Eisenacher, M.; Mayer, G.; Gatto, L.; Campos, A.; Chalkley, R. J.; Kraus, H. J.; Albar, J. P.; Martinez-Bartolome, S.; Apweiler, R.; Omenn, G. S.; Martens, L.; Jones, A. R.; Hermjakob, H., ProteomeXchange provides globally coordinated proteomics data submission and dissemination. Nat Biotechnol 2014, 32, (3), 223-6. doi: 10.1038/nbt.2839.
32. Perkins, D. N.; Pappin, D. J.; Creasy, D. M.; Cottrell, J. S., Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 1999, 20, (18), 3551-67.
33. Cox, J.; Mann, M., MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 2008, 26, (12), 1367-72. doi: 10.1038/nbt.1511.
34. Chatr-Aryamontri, A.; Breitkreutz, B. J.; Oughtred, R.; Boucher, L.; Heinicke, S.; Chen, D.; Stark, C.; Breitkreutz, A.; Kolas, N.; O'Donnell, L.; Reguly, T.; Nixon, J.; Ramage, L.; Winter, A.; Sellam, A.; Chang, C.; Hirschman, J.; Theesfeld, C.; Rust, J.; Livstone, M. S.; Dolinski, K.; Tyers, M., The BioGRID interaction database: 2015 update. Nucleic Acids Res 2015, 43, (Database issue), D470-8. doi: 10.1093/nar/gku1204.
35. Maes, T.; Barcelo, A.; Buesa, C., Neuron navigator: a human gene family with homology to unc-53, a cell guidance gene from Caenorhabditis elegans. Genomics 2002, 80, (1), 21-30.
36. Toriyama, M.; Shimada, T.; Kim, K. B.; Mitsuba, M.; Nomura, E.; Katsuta, K.; Sakumura, Y.; Roepstorff, P.; Inagaki, N., Shootin1: A protein involved in the organization of an asymmetric signal for neuronal polarization. J Cell Biol 2006, 175, (1), 147-57.
37. Wauman, J.; De Ceuninck, L.; Vanderroost, N.; Lievens, S.; Tavernier, J., RNF41 (Nrdp1) controls type 1 cytokine receptor degradation and ectodomain shedding. J Cell Sci 2011, 124, (Pt 6), 921-32. doi: 10.1242/jcs.078055.
38. Gibson, T. J.; Seiler, M.; Veitia, R. A., The transience of transient overexpression. Nat Methods 2013, 10, (8), 715-21.
39. Couzens, A. L.; Knight, J. D.; Kean, M. J.; Teo, G.; Weiss, A.; Dunham, W. H.; Lin, Z.-Y.; Bagshaw, R. D.; Sicheri, F.; Pawson, T.; Wrana, J. L.; Choi, H.; Gingras, A.-C., Protein interaction network of the mammalian Hippo pathway reveals mechanisms of. Sci Signal 2013, 6, (302), rs15. doi: 10.1126/scisignal.2004712.
40. Vandemoortele, G.; Gevaert, K.; Eyckerman, S., Proteomics in the genome engineering era. Proteomics 2016, 16, (2), 177-87. doi: 10.1002/pmic.201500262.


41. Wadham, C.; Gamble, J. R.; Vadas, M. A.; Khew-Goodall, Y., Translocation of protein tyrosine phosphatase Pez/PTPD2/PTP36 to the nucleus is associated with induction of cell proliferation. J Cell Sci 2000, 113, (Pt 17), 3117-23.


Table 1: Proteins identified by iMixPro for the different baits used in this study. Only proteins identified and quantified with at least two peptides are shown. The majority of the quantified peptides for these proteins have heavy over light (H/L) log2 ratios higher than -1. The full list, which also contains proteins identified and quantified with a single peptide, is provided as Supplementary Table 2. Proteins in bold were described before as interaction partners for these baits (BioGRID 3.4; (34)). WWC3 is a close homolog of WWC1, which was described as an interaction partner for PTPN14.

RAF1: BAG2, CDC37, CHMP4A, CHMP4B, HSP90AA1, HSP90AB1, HSP90AB4P, HSPA8, IQGAP2, PDCD6IP, USP7, YWHAB, YWHAE, YWHAG, YWHAH, HSPA1B/HSPA1A
RNF41: BIRC6, CACYBP, FLII, HOMER1, HOMER2, KDM3B, KIAA1598, LIMCH1, LRRFIP2, MARK2, MARK3, MTCL1, NAV1, NAV2, SOGA1
TANK: BIRC2, DIABLO, TBK1, TRAF2, TRAF3
PTPN14: SEPT11, GPATCH8, TMPO, VIL1, WWC3
JIP3: FAM32A, KIF5B, MYH10, RBM34, SPAG9, WDR1
IQGAP1: CCDC47, CDC42, DECR2, EPB41
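The selection rule stated in the Table 1 legend can be written down as a short filter. This is a sketch under stated assumptions rather than the procedure used to build the table: "majority" is interpreted as strictly more than half of the quantified peptides, and the protein names and H/L ratios are hypothetical placeholders.

```python
import math

LOG2_CUTOFF = -1.0  # corresponds to the 0.5 heavy over light ratio in the text
MIN_PEPTIDES = 2    # proteins need at least two quantified peptides

def is_candidate(peptide_ratios, log2_cutoff=LOG2_CUTOFF, min_peptides=MIN_PEPTIDES):
    """Apply the peptide-count and majority-above-cut-off criteria to one protein."""
    if len(peptide_ratios) < min_peptides:
        return False
    above = sum(1 for ratio in peptide_ratios if math.log2(ratio) > log2_cutoff)
    # Assumption: "majority" means strictly more than half of the peptides.
    return above > len(peptide_ratios) / 2

# Hypothetical per-protein peptide H/L ratios (illustrative values only).
peptide_ratios_by_protein = {
    "PARTNER_X": [0.95, 1.10, 0.80],            # all peptides near 1/1
    "CONTAMINANT_Y": [0.24, 0.27, 0.22, 0.90],  # one outlier peptide only
    "SINGLE_PEPTIDE_Z": [1.00],                 # fails the peptide-count rule
}

candidates = sorted(
    name for name, ratios in peptide_ratios_by_protein.items() if is_candidate(ratios)
)
print(candidates)
```

Requiring a majority of peptides above the cut-off, rather than the protein-level ratio alone, keeps a single outlier peptide from promoting a contaminant into the candidate list.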


Figure 1


Figure 2


Figure 3


Figure 4 (image not reproduced). Panels: Proteins and Peptides for JIP3. Y-axis: Frequency. X-axis: Ratio H/L (Log2).


FOR TOC ONLY

Image courtesy of Sven Eyckerman, Copyright 2016.
