
Article

Mapping the Ku interactome using proximity-dependent biotin identification in human cells
Sanna Abbasi and Caroline Schild-Poulter
J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.8b00771 • Publication Date (Web): 26 Dec 2018


Journal of Proteome Research

Mapping the Ku interactome using proximity-dependent biotin identification in human cells

Sanna Abbasi and Caroline Schild-Poulter*

Robarts Research Institute and Department of Biochemistry, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario N6A 5B7, Canada.

*To whom correspondence should be addressed:

519 519-931-5777 (x24164) [email protected]


ABSTRACT

The Ku heterodimer, composed of Ku70 and Ku80, is best characterized for its role in repairing double-stranded DNA breaks, but is also known to participate in other regulatory processes. Despite our understanding of Ku protein interplay during DNA repair, the extent of Ku’s protein interactions in other processes has never been fully determined. Using proximity-dependent biotin identification (BioID) and affinity purification coupled to mass spectrometry (AP-MS) with wild-type Ku70, we identified candidate proteins that interact with the Ku heterodimer in HEK293 cells, in the absence of exogenously induced DNA damage. BioID analysis identified approximately 250 nuclear proteins, appearing in at least two replicates, including known Ku-interacting factors such as MRE11A, WRN, and NCOA6. Meanwhile, AP-MS analysis identified approximately 50 candidate proteins. Of the novel protein interactors identified, many were involved in functions already suspected to involve Ku, such as transcriptional regulation, DNA replication, and DNA repair, while several others suggest that Ku may be involved in additional functions such as RNA metabolism, chromatin remodeling, and microtubule dynamics. Using a combination of BioID and AP-MS, this is the first report that comprehensively characterizes the Ku protein interaction landscape, revealing new cellular processes and protein complexes involving the Ku complex.

KEYWORDS: protein-protein interactions, proteomics, BioID, AP-MS, Ku, Ku70, DNA repair


INTRODUCTION

Ku is a highly abundant protein heterodimer in humans, composed of two subunits, Ku70 and Ku80, that closely intertwine to form a ring-shaped complex [1]. Ku is well known for its strong, sequence-independent affinity for nucleic acids, particularly double-stranded DNA ends [2]. Ku is best characterized for its role in repairing DNA double-stranded breaks (DSBs) through the non-homologous end-joining (NHEJ) repair pathway [3–5]. Seconds after the formation of a DSB, two Ku molecules circumscribe the DNA on either side of the break, recruiting additional NHEJ repair factors [2]. Ku is also required for the repair of programmed DSBs in B- and T-cells during V(D)J recombination [6].

Aside from repairing DSBs, Ku has been implicated in various other cellular processes, including maintaining telomere integrity [7,8] and length [9,10], subtelomeric gene silencing [11], the DNA damage response [12], transcriptional regulation [13], and DNA replication [14]. Incidentally, many of these roles were uncovered by identifying proteins that interact with Ku. While some interactions with Ku are transient and only occur in specific cell types [15] or under specific conditions [16], others associate in a constitutive manner, implicating Ku in various protein complexes [17,18]. Though select studies have mapped subunit-specific protein interactions [5], most proteins have been shown to interact with Ku as a whole, as Ku is suspected to exist as an obligate heterodimer.

Until recently, yeast two-hybrid screens and affinity-based purifications coupled to mass spectrometry were the primary methods used to identify factors that interact with Ku [19–21]. Through these methods, several protein interactions specific to Ku80 (DNA-PKcs [22], C-terminus of WRN [23], etc.) and Ku70 (MRE11A [19], N-terminus of WRN [23], MSH6 [24], etc.) have been identified; however, these techniques only offer a subset of the potential Ku-interactors.
Furthermore, these techniques are limited by their poor ability to identify transient interactors, preventing the identification of the full spectrum of Ku-interacting proteins.

Recently, proximity-dependent biotin identification (BioID) was reported as a new biochemical technique for identifying candidate proteins that may interact in vivo [25]. One of the greatest advantages of this technique is its ability to capture transient protein interactions. Briefly, the protein of interest is fused to an HA-tagged, mutant biotin ligase (denoted BirA*) that covalently biotinylates the lysine residues of proximal proteins within a ~10 nm radius [25]. Once candidate proteins have been biotinylated, they are isolated using streptavidin-conjugated beads or resins [26]. Another advantage of this technique is its use of the biotin-streptavidin bond, one of the strongest non-covalent bonds in nature [27]. The strength of this bond sidesteps the difficulties associated with preserving weak or transient interactions. Candidate proteins are then identified by mass spectrometry (MS). BioID can be used to identify candidate proteins that interact directly, indirectly, or are in the vicinity of a protein of interest [25]. To date, BioID has been used with great success to identify novel interactors for a variety of proteins [25,28–30]. To complement and/or substantiate the candidate proteins identified by BioID, AP-MS is also frequently employed.

The Ku heterodimer is known to interact with a wide variety of proteins in many cellular pathways, though the extent of these interactions is unknown. Here, we report the first use of BioID to identify proteins that interact with Ku in human cells. Using a Ku70-BirA* fusion protein, we identified known Ku interactors, validating the use of this system, as well as many novel candidate proteins. To complement the BioID analysis, we used AP-MS and found that, although some proteins were unique to AP-MS, many were also identified in the BioID screen. Using this compiled data, we were able to establish the first comprehensive Ku protein landscape, consisting of a large network of factors that may work closely and/or function directly with Ku.


MATERIALS AND METHODS

Plasmid Expression Constructs

The wild-type human Ku70 gene, XRCC6, was PCR amplified from a previously cloned plasmid, pMSCVpuro [31], using primers that altered the stop codon and added restriction enzyme cut sites, HpaI and BamHI, to the 5’ overhangs. Primers used for PCR were the HpaI forward primer (5’ ATCGTGGTTAACCGGATGTCAGGGTGGGAGTCATATTACAAAACCGAG) and the BamHI reverse primer (5’ TACGATGGATCCCGCGCAGTCCTGGAAGTGCTTGGTGAGGGC). Ku70 was cloned into pcDNA3.1-MCS-BirA(R118G)-HA (Addgene) using the HpaI and BamHI sites, thus fusing the C-terminus of Ku70 to the mutant, promiscuous biotin ligase (BirA*). Successful clones were confirmed by sequencing analysis performed at the DNA Sequencing Facility at Robarts Research Institute, London, Canada.

Cell Culturing and Biotinylation

The following human cell lines were used in this study: HeLa, HEK293, and Phoenix-AMPHO (all purchased from ATCC). Cells were cultured in high-glucose Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% fetal bovine serum (FBS) at 37°C in 5% CO2. For biotinylation, cells were incubated for 24 hr with media supplemented with 50 μM biotin at 37°C in 5% CO2. HEK293 stable cell lines were maintained in 450 μg/mL geneticin (G418).

Transfections and Generation of Stable Cell Line

Cells were transiently transfected with Ku70-BirA* in BioID using jetPRIME transfection reagent (Polyplus) and harvested after 24 hr. For stable cell line creation, 48 hr after transfection, HEK293 cells were selected for 8 days in 450 μg/mL G418 to encourage random integration of the vector, thereby conferring resistance to G418. Upon colony formation, monoclonal cells were isolated, grown, and screened for stable expression by Western blotting. Monoclonal cells expressing the fusion protein were then pooled together to create a heterogeneous, polyclonal stable cell line.

Immunofluorescence

Immunofluorescence was conducted as previously described [32].
Cells were incubated overnight at 4°C with the primary mouse antibody against HA (H9658, Sigma, 1:1000). After washing with PBS, slides were incubated for 1 hr in the dark while shaking at RT with the Alexa 488 anti-mouse secondary antibody (Invitrogen, 1:1000). After three more PBS washes, coverslips were mounted onto glass slides using ProLong Gold containing 4’,6-diamidino-2-phenylindole (DAPI) (Invitrogen). Cell images were taken with an Olympus BX51 microscope at 20X magnification and Image-Pro Plus software (Media Cybernetics, Inc.).

Preparation of Extracts and Immunoblotting

Whole cell extracts were prepared using either Whole Cell Extract (WCE) buffer (50 mM HEPES pH 7.4, 150 mM NaCl, 1 mM EDTA, 0.5% NP-40, 10% glycerol) or RIPA lysis buffer (0.1% SDS, 0.5% sodium deoxycholate, 1% NP-40, 50 mM Tris-HCl, 150 mM NaCl). Nuclear protein extracts were prepared as described previously [31]. For Western blot analysis, extracts were resolved by SDS-PAGE (8%) before transfer onto a polyvinylidene difluoride (PVDF) membrane and blocking in 5% skim milk in TBST solution. Membranes were hybridized overnight with the following antibodies: mouse anti-HA (H3663, Sigma, 1:1000), mouse anti-Ku70 (N3H10; Neomarkers, 1:1000), goat anti-Ku80 (M-20; Santa Cruz, 1:250), and mouse anti-α-tubulin (T5168, Sigma, 1:1000). Biotinylated proteins were detected similarly, with the following modifications. Following transfer, PVDF membranes were blocked overnight in 2.5% bovine serum albumin in TBST solution and incubated for at least 1 hr in the same solution with HRP-conjugated streptavidin (Pierce™ High Sensitivity Streptavidin-HRP, Thermo Fisher, 1:20,000). All Western blots were developed using the Clarity Western ECL substrate (Bio-Rad, Hercules, CA) and imaged on the Molecular Imager® ChemiDoc™ XRS system (Bio-Rad).

Coimmunoprecipitation of Ku80

Whole cell extracts for Ku80 coimmunoprecipitation using Ku70-BirA*-HA were adjusted to below 0.5% NP-40 and pre-cleared before incubating at 4°C overnight with anti-HA antibody (H3663, Sigma).
Immunoprecipitated proteins were isolated with Pierce™ Protein G magnetic beads (ThermoFisher Scientific, Rockford, IL) and washed using Wash buffer (60 mM KCl, 25 mM HEPES pH 7.9, 0.5 mM EDTA pH 8, 0.5% NP-40, 12% glycerol) before analysis by Western blot.

Double-stranded DNA (dsDNA) Pull-Down Assay


A 30 bp, blunt dsDNA probe (5'biotin/CCAGCTGATCGACGACTATGGAACTCACTC) with one 5’ end biotinylated was acquired from Integrated DNA Technologies. Nuclear protein extracts of the Ku70-BirA* HEK293 stable cell line were diluted to ~25 mM NaCl using DNA-Ku Binding buffer (25 mM HEPES pH 7.9, 20% glycerol, 0.2 mM EDTA pH 8, supplemented with 0.2 mM PMSF and 2 mM DTT). A 30-fold excess of dsDNA probe over the magnetic Dynabeads (MyOne Streptavidin C1; Invitrogen) was incubated at 4°C with the beads for 30 min in Binding buffer with head-over-head rotation. The diluted nuclear extracts were added to the DNA-beads mixture before rotating for 15 min at RT. Beads were washed three times with Binding buffer, and proteins were released using SDS and boiled for 10 min at 95°C before Western blotting.

Streptavidin Pull-Down of Biotinylated Proteins

For small-scale pull-downs, to be run only on SDS-PAGE gels, two confluent 10 cm plates of cells were used. For large-scale pull-downs used for MS, six 15 cm plates were seeded with cells and grown to 70-80% confluence before incubation for 24 hr in complete media supplemented with 50 μM biotin. After three PBS washes, cells were lysed at RT in 1.5 mL RIPA lysis buffer (0.1% SDS, 0.5% sodium deoxycholate, 1% NP-40, 50 mM Tris-HCl, 150 mM NaCl, supplemented with protease inhibitors 0.2 mM phenylmethane sulfonyl fluoride (PMSF), 1 mM dithiothreitol (DTT), 1 μg/mL leupeptin, 10 μg/mL aprotinin, 1 μg/mL pepstatin; inhibitors obtained from BioShop, Burlington, Ontario, Canada) by vortexing and then sonication. Following lysis, samples were centrifuged at 4°C at 17,968 × g for 20 min. Supernatants were transferred to low-retention microcentrifuge tubes and incubated with magnetic Dynabeads (MyOne Streptavidin C1; Invitrogen) overnight. Beads were collected and washed once in Strep-biotin wash buffer (50 mM Tris-HCl pH 8, 1% SDS (w/v), 150 mM NaCl) at RT, rotating for 5 min.
Next, beads were washed twice with RIPA lysis buffer, followed by three washes in TAP lysis buffer (10% glycerol, 0.1% NP-40, 2 mM EDTA pH 8, 50 mM HEPES pH 7.9, 100 mM KCl). Finally, beads were washed three times using 50 mM NH4HCO3 (ammonium bicarbonate, ABC) solution. Approximately 10% of the total sample was reserved exclusively for Western blot analysis to check for protein biotinylation; for this, proteins were released from the beads with SDS and boiling for 10 min at 95°C. Meanwhile, the rest of the sample was kept bound to beads for on-bead digestion before submission for MS.


Affinity Purification Coupled to Mass Spectrometry (AP-MS)

Affinity purification was conducted using approximately 15 mg of HEK293 nuclear extracts for one large-scale immunoprecipitation experiment. Magnetic Dynabeads (Dynabeads™ Protein G; Invitrogen) were incubated with mouse Ku70 primary antibody (N3H10; Neomarkers) or normal mouse IgG (sc-2025; Santa Cruz Biotechnology) for 2 hours. Meanwhile, nuclear HEK293 extracts were adjusted to below 150 mM NaCl using Nuclear Protein Binding buffer (25 mM HEPES pH 7.4, 60 mM KCl, 0.5 mM EDTA, 0.05% NP-40, 12% glycerol), pre-cleared for two hours, and simultaneously treated with Benzonase® Nuclease (E1014-5KU, Sigma Aldrich). Pre-cleared extracts were incubated with the antibody-bound beads overnight at 4°C with head-over-head rotation. Beads were washed gently three times using Nuclear Protein Binding buffer before elution with 1% NP-40. To each sample, four volumes of methanol and one volume of chloroform were added before mixing by vortex and centrifugation at 14,000 × g for 10 min, after which the top, aqueous phase was removed. Next, three volumes of methanol were added to wash the protein sample before spinning at 14,000 × g for 10 min and removing the supernatant without disturbing the protein pellet. After drying, the pellet was resuspended in 50 mM NH4HCO3 solution and sonicated briefly.

On-bead and In-solution Trypsin Digests

Proteins bound to the magnetic beads were first reduced using 100 mM DTT for 30 min and then alkylated at RT for 30 min in the dark with 1 M iodoacetamide (IAA). Proteins bound to beads were digested for 4 hr with shaking at 37°C with LysC (Wako), overnight with shaking at 37°C with Tryp-LysC (Promega), and finally for an additional 4 hr with shaking at 37°C using mass spectrometry-grade trypsin (Promega). For all samples, the missed cleavage rate was, on average, around 6%. Magnetic beads were centrifuged and pelleted with a magnet before the supernatant was transferred to fresh tubes.
Samples were dried by speed vacuum for 2.5 hr before re-suspension in 0.1% trifluoroacetic acid. Samples were cleaned using C18 ZipTips (Merck Millipore Ltd, Cork, IRL) before drying by speed vacuum again and re-suspending in 0.1% formic acid.
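The text reports an average missed cleavage rate of around 6% but does not say how it was calculated. As an illustration only, the Python sketch below counts missed trypsin sites using the common rule that trypsin cuts after K or R unless the next residue is P, and reports the fraction of peptides that contain at least one missed site. The function names and example peptides are hypothetical and are not taken from the study.

```python
def count_missed_cleavages(peptide: str) -> int:
    """Count internal K/R residues that trypsin would normally have cut.

    Assumed rule: cleavage occurs after K or R, except when followed by P.
    The C-terminal residue is skipped because it marks the cut that
    produced the peptide.
    """
    return sum(
        1
        for i, residue in enumerate(peptide[:-1])
        if residue in "KR" and peptide[i + 1] != "P"
    )


def missed_cleavage_rate(peptides: list[str]) -> float:
    """Fraction of peptides with at least one missed cleavage (assumed definition)."""
    if not peptides:
        return 0.0
    with_missed = sum(1 for p in peptides if count_missed_cleavages(p) > 0)
    return with_missed / len(peptides)


# Hypothetical peptides, for illustration only.
example_peptides = ["AAAK", "AAKAAR", "AAKPAAR", "GGGR"]
rate = missed_cleavage_rate(example_peptides)
print(f"{rate:.0%} of example peptides contain a missed cleavage")
```

Search engines may define this rate differently, for example per cleavage site rather than per peptide, so the definition here is an assumption.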


Liquid Chromatography Electrospray Ionization Tandem Mass Spectrometry (LC-ESI-MS/MS)

For large-scale analysis, following on-bead digests for BioID or in-solution digests for AP-MS, samples were submitted to the UWO Biological Mass Spectrometry Laboratory / Dr. Don Rix Protein Identification Facility (London, CA) for analysis of peptides by high-resolution LC-ESI-MS/MS. Peptides were identified using an ACQUITY M-Class UHPLC system (Waters) connected to an Orbitrap Elite mass spectrometer (Thermo Scientific). Solution A consisted of water/0.1% formic acid (FA), while Solution B was acetonitrile (ACN)/0.1% FA. Peptides (~1 μg) were injected onto an ACQUITY UPLC M-Class Symmetry C18 Trap Column, 5 μm, 180 μm × 20 mm, and trapped for 6 min at a flow rate of 5 μL/min at 99% Solution A/1% Solution B. Peptides were separated on an ACQUITY UPLC M-Class Peptide BEH C18 Column, 130 Å, 1.7 μm, 75 μm × 250 mm, operating at a flow rate of 300 nL/min at 35°C using a non-linear gradient consisting of 1–7% Solution B over 1 min, 7–23% Solution B over 179 min, and 23–35% Solution B over 60 min before increasing to 95% Solution B and washing. Samples were run in positive ion mode.

Mass Spectrometry Data Analysis

Maximum missed cleavages were set to 3. Fragment mass deviation was left at 20 ppm. Fragment mass error tolerance was set to 0.8 Da. Protein and peptide False Discovery Rate (FDR) was set to 0.01 (1%). Cysteine carbamidomethylation was set as a fixed modification, while oxidation (M), N-terminal deamidation (NQ), and acetylation (protein) were set as variable modifications (maximum number of modifications per peptide = 5). All other settings were left as default. All raw MS files were searched in PEAKS Studio version 8.5 (Bioinformatics Solutions Inc., Waterloo, ON, Canada) using the Human Uniprot database (reviewed only; updated May 2014 with 40,550 entries) [33]. Raw MS files were also searched against the contaminant database, Common Repository of Adventitious Proteins (cRAP).
Note that for simplicity, the majority of proteins are referenced throughout by their gene name. Mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the data set identifiers PXD010930 and PXD010931.

Network Modelling and Clustering Analysis


Network and group figures for both BioID and AP-MS experiments were created using a filtered list of candidate proteins. Non-specific proteins identified from the controls, cytoplasmic and mitochondrial proteins, and contaminant/common MS background proteins (Supplemental Table 1) identified from the CRAPome [34] were subtracted from the initial list of identified candidate proteins, leaving behind nuclear proteins, including those with documented Ku interaction evidence. Proteins identified in at least two biological replicates were the primary focus (Supplemental Table 2). The Ku interaction network, consisting of candidate proteins identified by BioID and AP-MS, was modelled using Cytoscape version 3.6.1 [35]. Candidate interaction strength was determined using the STRING database; protein interactions were set to high confidence (0.7) and included text mining, experimental, and database information. The strength of interaction between two proteins was indicated by line thickness, where thicker lines indicated greater confidence in the interaction (not necessarily physical). For clustering analysis of the interaction data, the ClusterONE plug-in for Cytoscape was used to group proteins by potential overlapping complexes [36].

Identification of High Confidence Proximity Protein Partners

For both BioID and AP-MS experiments, Scaffold version 4.8.7 (Proteome Software Inc., Portland, OR) was used to validate the MS/MS-based peptide and protein identifications, based on the Peptide Prophet algorithm [37] with Scaffold delta-mass correction and the Protein Prophet algorithm [38], respectively. Peptides had to be identified with at least 95% probability, and only proteins identified with a minimum 95% probability (resulting in a protein FDR < 1%), using at least 2 unique peptides, were analyzed further (Supplemental Table 3).
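The identification thresholds just described (peptide probability of at least 95%, protein probability of at least 95%, and at least 2 unique peptides) can be written as a simple filter. The Python sketch below is a simplified illustration, not a reproduction of Scaffold's internal logic. The `ProteinHit` record, the gene names, and all probability values are hypothetical.

```python
from dataclasses import dataclass, field

MIN_PROBABILITY = 0.95    # peptide and protein probability cutoff stated in the text
MIN_UNIQUE_PEPTIDES = 2   # minimum number of unique peptides stated in the text


@dataclass
class ProteinHit:
    """Hypothetical record for one identified protein."""
    gene: str
    protein_probability: float
    # One probability per unique peptide assigned to the protein.
    peptide_probabilities: list[float] = field(default_factory=list)


def passes_identification(hit: ProteinHit) -> bool:
    """Keep a protein only if it and at least two unique peptides meet the cutoff."""
    confident_peptides = [
        p for p in hit.peptide_probabilities if p >= MIN_PROBABILITY
    ]
    return (
        hit.protein_probability >= MIN_PROBABILITY
        and len(confident_peptides) >= MIN_UNIQUE_PEPTIDES
    )


# Made-up values, for illustration only.
hits = [
    ProteinHit("GENE_A", 0.99, [0.99, 0.97, 0.80]),  # two confident peptides: kept
    ProteinHit("GENE_B", 0.99, [0.99, 0.60]),        # one confident peptide: dropped
    ProteinHit("GENE_C", 0.90, [0.99, 0.98]),        # protein probability too low: dropped
]
retained = [h.gene for h in hits if passes_identification(h)]
print(retained)
```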
Spectral counts were exported from Scaffold and formatted according to the guidelines on inputting data for SAINTexpress analysis [39,40], a computational tool integrated into the CRAPome version 1.1 interface (available at crapome.org). For the BioID analysis, CRAPome controls CC532 and CC533 (HEK293 cells with BirA*-FLAG) were also run with the biological replicates. For AP-MS, CRAPome controls CC156 and CC165 (HEK293 cells with GFP, nuclear fraction) were used. The average of all biological replicates was taken, and known interaction data from the iRefIndex database was incorporated into the SAINTexpress algorithm. After subtracting contaminant/MS background proteins (Supplemental Table 1), the remaining proteins with a SAINTexpress score ≥ 0.95 were deemed high confidence interactors.

Experimental Design

BioID. For the identification of wild-type Ku70 interactors using BioID, three biological replicates (n=3) were used, in which each replicate was from the same polyclonal stable HEK293 cell line, but grown, harvested, and processed independently of each other. The HEK293 negative control sample contained no biotin ligase (BirA*), thus listing non-specific proteins. From the BioID results, only nuclear proteins that were deemed high confidence interactors by SAINTexpress analysis and appeared in at least two out of three replicates were selected to be the focus for the discussion. SAINTexpress analysis, which included additional negative controls, was conducted as described above.

AP-MS. To identify proteins that interact with wild-type Ku70 using AP-MS, three biological replicates (n=3) were used, in which each replicate was from the same HEK293 cell line, but grown, harvested, and processed independently of each other. The HEK293 negative control sample identified proteins that interacted with mouse IgG, thus listing non-specific interactors. The final list of high confidence protein interactors was created using SAINTexpress as described above for BioID.
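The selection criteria in this section (background subtraction, nuclear localization, a SAINTexpress score of at least 0.95, and detection in at least two of three replicates) can be sketched as one filter in Python. This is only a sketch: the `Candidate` record, the gene names, the scores, and the background set are invented, and SAINTexpress itself is not run here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary of one candidate protein after scoring."""
    gene: str
    saint_score: float     # SAINTexpress score between 0 and 1
    replicates_seen: int   # number of the three biological replicates containing it
    nuclear: bool          # whether the protein is annotated as nuclear


def high_confidence(candidates, background, min_score=0.95, min_replicates=2):
    """Return gene names that pass every criterion described in the text."""
    return [
        c.gene
        for c in candidates
        if c.gene not in background          # contaminant/MS background removed
        and c.nuclear                        # nuclear proteins only
        and c.saint_score >= min_score       # SAINTexpress score cutoff
        and c.replicates_seen >= min_replicates
    ]


# Made-up values, for illustration only.
candidates = [
    Candidate("GENE_A", 0.98, 3, True),
    Candidate("GENE_B", 0.99, 1, True),    # seen in one replicate only
    Candidate("GENE_C", 0.80, 3, True),    # score below cutoff
    Candidate("GENE_D", 1.00, 3, True),    # listed as background
    Candidate("GENE_E", 0.96, 2, False),   # not nuclear
    Candidate("GENE_F", 0.95, 2, True),    # exactly at both cutoffs
]
background = {"GENE_D"}
selected = high_confidence(candidates, background)
print(selected)
```

Keeping the cutoffs as keyword arguments makes it easy to see how the list changes under a stricter rule, such as requiring all three replicates.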


RESULTS

Effective biotinylation by Ku70-BirA* proteins.

Wild-type human Ku70 was fused to the mutant biotin ligase (BirA*), added to the C-terminus of Ku70 (henceforth referred to as Ku70-BirA*). The expression of the fusion protein was observed following transient transfection (Fig. 1A). To test for biotinylation of proteins by BirA*, HEK293 cells transiently expressing the fusion protein were incubated with 50 μM biotin for 24 hours, after which 1 mg of whole cell extract was used to isolate biotinylated proteins using streptavidin-conjugated beads. Biotinylation by Ku70-BirA* was indicated by the presence of multiple bands of differing sizes (Fig. 1B). As expected, more biotinylation was seen in cells expressing Ku70-BirA* (Fig. 1B).

Ku70-BirA* co-immunoprecipitates with Ku80 and still associates with DNA.

Ku70 exists as a heterodimer with Ku80, and together they associate with DNA ends [1,41]. While it has been suggested that Ku70 can act independently of Ku80 [41], Ku70 and Ku80 stabilize each other and are suspected to exist as an obligate heterodimer [3]. To test if Ku70-BirA* could still heterodimerize with Ku80, Ku80 was co-immunoprecipitated using HA-tagged Ku70-BirA* (Fig. 1C). Ku80 was successfully co-immunoprecipitated with the fusion protein, implying that BirA* does not prevent association between Ku70 and Ku80. This is consistent with the fact that BirA* is attached at the C-terminus of Ku70, a region not essential for heterodimerization with Ku80 [1]. In addition to associating with Ku80, Ku70-BirA* was also capable of binding double-stranded DNA ends (Fig. 1D).

Ku70-BirA* protein localizes to the nucleus.

Ku70 contains a nuclear localization signal within its C-terminus and is reported to reside almost exclusively in the nucleus [41], although some have observed extranuclear localization [42]. Preliminary testing with Ku70-BirA* was conducted to ensure that the BirA* addition (~35 kDa) did not alter the localization of Ku70.
Previous studies using Ku70 fused to GFP have not reported any interference with Ku70 localization or function [22,43]. Using immunofluorescence microscopy, the fusion protein demonstrated nuclear localization (Fig. 1E), while the ~35 kDa BirA* alone appeared both nuclear and cytoplasmic, likely due to its small size permitting diffusion between the cytoplasm and nucleus.

[Figure 1 image not reproduced. Panels A–E: Western blots probed with anti-HA, anti-Ku70, anti-Ku80, anti-α-tubulin, and HRP-streptavidin (lanes: Control, BirA*, Ku70-BirA*), and immunofluorescence images (DAPI, anti-HA) of cells expressing BirA* or Ku70-BirA*.]

Figure 1. Characterization of the expression, localization, and functionality of Ku70-BirA* in human cells. A) Western blot analysis of transiently transfected HeLa cells, 24 hr after transfection, using HA, Ku70, and α-tubulin antibodies. B) Western blot analysis, using a horseradish peroxidase (HRP)-conjugated streptavidin probe, of a small-scale pull-down of biotinylated proteins in transfected HEK293 cells after 24 hours of incubation with supplemental biotin (50 µM). C) Western blot analysis, using Ku80 and HA antibodies, of co-immunoprecipitated Ku80 using Ku70-BirA* transiently transfected into Phoenix cells. D) Western blot analysis, using Ku70 and HA antibodies, of a DNA pull-down in HEK293 cells stably expressing Ku70-BirA*, using a blunt, 5’ biotinylated 30 bp double-stranded DNA probe. Negative control lane contains biotin only. E) Cellular localization of fusion proteins in HeLa cells, visualized by immunofluorescence microscopy using HA antibody. White bar denotes 50 µm.

Stable expression of Ku70-BirA* in HEK293 cell line.

In order to conduct BioID, we created a stable cell line that constitutively expresses Ku70-BirA* in HEK293 cells. HEK293 cells have been widely employed to conduct BioID, although other cell lines have also been used [26,28]. BioID has been successfully conducted using transient [44], stable [25], or inducible [25] expression of the fusion proteins. We chose to employ stable expression to maximize the formation of the Ku70-BirA*-Ku80 dimer. Following transfection of HEK293 cells with Ku70-BirA*, monoclonal colonies were selected and isolated (Supplemental Fig. 1A) before pooling the clones that showed stable expression to create polyclonal stable cell lines. The cellular localization and expression of BirA* and Ku70-BirA* polyclonal cell lines were visualized using immunofluorescence, revealing that the majority of cells expressed Ku70-BirA* (Supplemental Fig. 1B).

Identification of Ku candidate protein interactors using BioID and AP-MS.

Using the established HEK293 cell line that stably expresses Ku70-BirA*, we conducted three large-scale BioID replicates, testing for biotinylation prior to MS submission (Supplemental Fig. 2). Upon MS analysis, as anticipated, pyruvate carboxylase, acetyl-CoA carboxylase 1 and 2, propionyl-CoA carboxylase α chain, and methylcrotonoyl-CoA carboxylase subunit α were identified in every sample, as these enzymes are naturally biotinylated in human cells through post-translational modification [25]. Biotinylation is a fairly rare modification in mammalian cells, with the exception of the aforementioned carboxylases [25]. The presence of the carboxylases in the MS results validates the successful capture of biotinylated proteins.

In order to complement the BioID analysis, we employed a second approach, AP-MS, to identify Ku protein interactors. For the AP-MS experiment, nuclear cell extracts collected from HEK293 cells were used with a Ku70 antibody to isolate proteins that coimmunoprecipitated with Ku70. After washing, coimmunoprecipitated proteins were eluted with 1% NP-40, while retaining the antibody to prevent antibody bleed-through during MS [45]. After subtracting the non-specific proteins that were identified in the negative controls and common contaminants/MS background proteins (Supplemental Table 1), the three replicates for each experiment were pooled.
For BioID, approximately 250 proteins appeared in at least two replicates, while 152 proteins appeared in all three replicates (Fig. 2A). For AP-MS, 53 proteins were identified in at least two replicates, while 17 were identified in all three replicates (Fig. 2A). All proteins that appeared in at least two replicates of either BioID or AP-MS were sorted by protein class; the majority were nucleic acid-binding proteins, and transcription factors formed the second largest class (Fig. 2B). Among the proteins that appeared in at least two replicates, 22 were shared between BioID and AP-MS (Table 1).
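The replicate filter and cross-method overlap described above can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' pipeline: the function name `reproducible_hits`, the `min_replicates` parameter, and the protein identifiers (`P1`, `BG1`, and so on) are hypothetical placeholders and do not come from the study's data.

```python
from collections import Counter

def reproducible_hits(replicates, background, min_replicates=2):
    """Return proteins seen in at least `min_replicates` replicates
    after removing background/contaminant proteins."""
    counts = Counter()
    for rep in replicates:
        # Subtract negative-control and contaminant proteins first.
        counts.update(set(rep) - set(background))
    return {protein for protein, n in counts.items() if n >= min_replicates}

# Placeholder identifiers only; these are NOT the study's proteins.
background = {"BG1"}
bioid_reps = [{"P1", "P2", "P3", "BG1"}, {"P1", "P2", "BG1"}, {"P1", "P4"}]
apms_reps = [{"P1", "P5"}, {"P1", "P5", "BG1"}, {"P6"}]

bioid_hits = reproducible_hits(bioid_reps, background)
apms_hits = reproducible_hits(apms_reps, background)
shared = bioid_hits & apms_hits  # candidates found by both techniques

print(sorted(bioid_hits), sorted(apms_hits), sorted(shared))
```

Setting `min_replicates=3` in the same function would give the "all three replicates" counts instead.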


Table 1. Ku interaction candidates identified by both BioID and AP-MS

ALYREF    LBR      PARP1      TOP2B
CALM1     LIG3     SNRNP200   U2AF2
DDX17     MATR3    SRSF3      XRCC5
DDX5      NCL      SRSF7      XRCC6
DHX9      NONO     SSRP1
ILF3      NOP2     TOP2A

Candidates were identified in ≥ 2 replicates of either technique and exclude negative control and common contaminants/background proteins.

The majority of proteins identified were novel candidates, not previously shown to interact with Ku (Fig. 2C). However, approximately 9% of the identified proteins were known Ku interactors, validating the use of BioID with Ku70. Notable detected proteins that are known to interact with Ku70 include Ku80, MRE11A, MSH6, WRN, and NCOA6 (refs 19, 23, 24, 46–48). Because the Ku70-BirA* fusion protein heterodimerizes with Ku80, the candidate proteins identified could be interacting with Ku70, Ku80, or both.

[Figure 2. Panel A: Venn diagrams of candidates per replicate (Rep 1, Rep 2, Rep 3) for BioID and for AP-MS, and of candidates found in ≥ 2 replicates by BioID, AP-MS, or both. Panel B: candidates by protein class (transporter, membrane traffic protein, chaperone, hydrolase, oxidoreductase, enzyme modulator, transferase, transcription factor, nucleic acid binding, ligase, receptor, defense/immunity protein, calcium-binding protein, isomerase, cytoskeletal protein, signaling molecule, extracellular matrix protein). Panel C: known (9%) versus unknown (91%) Ku interactors.]

Figure 2. Analysis of protein interaction candidates that appeared in at least two biological replicates from experiments using BioID and AP-MS with wild-type Ku70. A) Comparison of the total number of candidates that appeared in at least two replicates, after excluding common contaminant and background proteins, for BioID (left), AP-MS (right), or both (middle). For both BioID and AP-MS experiments, the comparison for all biological replicates is also indicated. B) Categorization by protein class of the identified candidates that appeared in at least two biological replicates of the BioID or AP-MS experiments. Analysis was completed using the PANTHER classification system web server. C) Approximation of previously known Ku-interacting candidate proteins versus novel protein candidates.

Identification of high confidence protein interactors using SAINTexpress analysis. Using label-free quantitation, the SAINTexpress algorithm determines the likelihood of each protein interacting with the bait protein, while also incorporating interaction data from the literature40. We identified 27 proteins from BioID and/or AP-MS experiments, which appeared in at least two biological replicates (unless otherwise indicated), with a SAINTexpress score ≥ 0.95. These candidates were considered to be high confidence Ku-protein interactors (Table 2). Eight of the identified proteins were already known interactors based on the iRefIndex database and are indicated by asterisks (Table 2). Only Ku70 (XRCC6) was shared between the two techniques, demonstrating variability between the methods and supporting the use of both techniques to acquire the complete Ku interactome.

Table 2. High confidence BioID and AP-MS Ku interactors identified using SAINTexpress analysis

Method   Candidate              SAINTexpress Score
BioID    XRCC6*                 1
         IFI16                  1
         PPIL2                  1
         SPEN                   1
         ZNF512                 1
         FHL1                   1
         RPA1                   1
         ANP32E                 0.99
         LIG3*                  0.99
         RALY                   0.98
         TRMT1L                 0.98
         WRN*                   0.97
         XRCC1                  0.97
         DACH1                  0.97
         NSFL1C                 0.96
AP-MS    XRCC6*                 1
         XRCC5*                 1
         ST13* (1 replicate)    1
         SEPT7                  1
         PIP                    1
         PRKDC*                 1
         BASP1                  1
         MAP4                   1
         APEX1* (1 replicate)   1
         C9orf142/PAXX          1
         SEPT2                  1
         ANXA1* (1 replicate)   1
         PHB2                   0.99

Candidates were identified in ≥ 2 replicates, unless otherwise indicated, and exclude common contaminants/background proteins. Previously identified Ku interactors, based on the consolidated iRefIndex database, are denoted with an asterisk (*).
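As a consistency check on the counts quoted in the text (27 high confidence proteins, eight known interactors, one protein shared between methods), the entries of Table 2 can be re-tallied with a short script. This sketch only applies the stated ≥ 0.95 cutoff to the tabulated scores; it does not reproduce the SAINTexpress scoring itself, and the variable names are our own.

```python
# (method, candidate, SAINTexpress score, known interactor per iRefIndex)
table2 = [
    ("BioID", "XRCC6", 1.00, True), ("BioID", "IFI16", 1.00, False),
    ("BioID", "PPIL2", 1.00, False), ("BioID", "SPEN", 1.00, False),
    ("BioID", "ZNF512", 1.00, False), ("BioID", "FHL1", 1.00, False),
    ("BioID", "RPA1", 1.00, False), ("BioID", "ANP32E", 0.99, False),
    ("BioID", "LIG3", 0.99, True), ("BioID", "RALY", 0.98, False),
    ("BioID", "TRMT1L", 0.98, False), ("BioID", "WRN", 0.97, True),
    ("BioID", "XRCC1", 0.97, False), ("BioID", "DACH1", 0.97, False),
    ("BioID", "NSFL1C", 0.96, False),
    ("AP-MS", "XRCC6", 1.00, True), ("AP-MS", "XRCC5", 1.00, True),
    ("AP-MS", "ST13", 1.00, True), ("AP-MS", "SEPT7", 1.00, False),
    ("AP-MS", "PIP", 1.00, False), ("AP-MS", "PRKDC", 1.00, True),
    ("AP-MS", "BASP1", 1.00, False), ("AP-MS", "MAP4", 1.00, False),
    ("AP-MS", "APEX1", 1.00, True), ("AP-MS", "C9orf142/PAXX", 1.00, False),
    ("AP-MS", "SEPT2", 1.00, False), ("AP-MS", "ANXA1", 1.00, True),
    ("AP-MS", "PHB2", 0.99, False),
]

THRESHOLD = 0.95  # cutoff stated in the text

# Keep rows at or above the cutoff, then tally unique proteins.
high_conf = [row for row in table2 if row[2] >= THRESHOLD]
unique = {name for _, name, _, _ in high_conf}
known = {name for _, name, _, is_known in high_conf if is_known}
bioid = {name for method, name, _, _ in high_conf if method == "BioID"}
apms = {name for method, name, _, _ in high_conf if method == "AP-MS"}
shared = bioid & apms

print(len(unique), len(known), sorted(shared))
```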

Network and grouping analysis of BioID- and AP-MS-identified protein candidates. For both BioID and AP-MS, candidates that appeared in at least two replicates were organized into a protein interaction network created using the STRING database and visualized using Cytoscape (Fig. 3). The network highlights the extent of connections between the identified proteins (nodes) and the interaction confidence, indicated by line (edge) thickness: a thicker line suggests greater confidence in the interaction, which is not necessarily a physical interaction. This confidence is a combined score generated by STRING that includes text mining, experiments, and databases; the minimum required interaction score was set to high confidence (0.7).
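The edge filtering and thickness mapping described for the network can be illustrated as follows. This is a hedged sketch rather than the STRING or Cytoscape implementation: the node names and scores are invented placeholders, and the linear score-to-width mapping in `edge_width` is an assumption chosen for illustration; only the 0.7 cutoff comes from the text.

```python
def filter_edges(edges, min_score=0.7):
    """Keep only edges whose combined score meets the confidence cutoff."""
    return [(a, b, score) for a, b, score in edges if score >= min_score]

def edge_width(score, min_score=0.7, min_width=1.0, max_width=5.0):
    """Map a combined score in [min_score, 1.0] linearly onto a line width
    (assumed mapping; higher confidence gives a thicker line)."""
    fraction = (score - min_score) / (1.0 - min_score)
    return min_width + fraction * (max_width - min_width)

# Placeholder nodes and scores; these are NOT values from STRING.
edges = [("A", "B", 0.95), ("A", "C", 0.70), ("B", "C", 0.40)]

kept = filter_edges(edges)
widths = {(a, b): edge_width(score) for a, b, score in kept}

for (a, b), width in widths.items():
    print(f"{a} - {b}: width {width:.2f}")
```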

Next, we used the ClusterONE plug-in for Cytoscape to identify densely interconnected, overlapping regions, which represent sub-networks and potential multi-protein complexes. ClusterONE uses ‘cohesiveness’ between proteins to search for potential protein complexes by grouping together densely connected proteins36. Select proteins were grouped into clusters based on their degree of interconnection (Fig. 3). From the global Ku interactome, 6 clusters with P