An Interface-Driven Design Strategy Yields a Novel, Corrugated Protein Architecture


Article

An interface-driven design strategy yields a novel corrugated protein architecture Mohammad ElGamacy, Murray Coles, Patrick Ernst, Hongbo Zhu, Marcus D Hartmann, Andreas Plückthun, and Andrei N. Lupas ACS Synth. Biol., Just Accepted Manuscript • DOI: 10.1021/acssynbio.8b00224 • Publication Date (Web): 27 Aug 2018 Downloaded from http://pubs.acs.org on August 29, 2018



ACS Synthetic Biology

An interface-driven design strategy yields a novel, corrugated protein architecture
Mohammad ElGamacy, Murray Coles, Patrick Ernst†, Hongbo Zhu, Marcus D. Hartmann, Andreas Plückthun† and Andrei N. Lupas*

Dept. of Protein Evolution, Max-Planck-Institute for Developmental Biology, 72076 Tübingen, Germany
† Dept. of Biochemistry, University of Zurich, 8057 Zurich, Switzerland
Keywords: alternating handedness, novel fold, protein design, protein structure, repeat protein

ABSTRACT: Designing proteins with novel folds remains a major challenge, as the biophysical properties of the target fold are not known a priori and no sequence profile exists to describe its features. Therefore, most computational design efforts so far have been directed towards creating proteins that recapitulate existing folds. Here we present a strategy centered upon the design of novel intramolecular interfaces that enables the construction of a target fold from a set of starting fragments. This strategy effectively reduces the amount of computational sampling necessary to achieve an optimal sequence, without compromising the level of topological control. The solenoid architecture has been a target of extensive protein design efforts, as it provides a highly modular platform of low topological complexity. However, none of the previous efforts has attempted to depart from the natural form, which is characterized by a uniformly handed superhelical architecture. Here we aimed to design a more complex platform, abolishing the superhelicity by introducing internally alternating handedness, resulting in a novel, corrugated architecture. We employed our interface-driven strategy, designing three proteins and confirming the design by solving the structure of two examples.

INTRODUCTION
Computational design has thus far been very successful in diversifying the geometries and sequences of existing folds. This has been largely assisted by the presence of one or more starting structures for redesigning a particular fold, and the associated data that underpin its sequence determinants. In contrast, designing novel folds with a predetermined backbone blueprint, while offering a vast new range of designable folds, renders the gross sequence and rotamer sampling problem intractable. A fragment-based approach can greatly reduce this search space, as starting building blocks already carry intrinsic folding information. In addition to maintaining control over the level of adherence to a target fold, this approach also offers the possibility of coarse-graining the assembly problem by choice of the building block sizes. The latter may range from secondary structural elements to large subdomain or domain-sized fragments 1. This effectively decomposes the problem into searching for optimal inter-fragment interfaces and loops. This promises to focus the available computing resources on accurately and exhaustively exploring restricted spaces, instead of sparsely exploring much larger ones. Here we demonstrate the capacity of this interface-driven approach as an efficient means for novel fold design.

For many years repeat proteins – in particular solenoids – have been a central topic of protein design. Unlike globular proteins, their low contact order and compositional uniformity have made them excellent platforms for investigating sequence-structure relationships and dissecting the energetics of protein folding 2. They have also served a wide range of applications as antibody-like tailored synthetic binding proteins selected from libraries, and some have even progressed to late-stage clinical trials 3. Because of their favorable biophysical properties, they have also been developed into crystallization chaperones 4. Initially, design efforts on solenoids were aimed at generating more robust variants through sequence idealization 5. More recently, the vast potential of solenoid proteins as tunable scaffolds has motivated computational design aimed at expanding the available repertoire of solenoid configurations with atomic accuracy 6. These controlled geometries have included previously unobserved forms. However, despite this considerable success, to date the general solenoid architecture has not been altered. Here we aim to move beyond the solenoid, exploiting an incremental increase in the topological complexity to create a corrugated arrangement so far not observed in nature.
Solenoid proteins are characterized by a uniform connectivity between repeat units and thus wind into a continuous superhelix 7. This implies a single inter-unit junction type (defined as an interface and a connecting loop), where the units are bound to possess the same handedness. A waveform description of this periodic

ACS Paragon Plus Environment


fold can be made by plotting the change in dihedral angle around the superhelical axis versus sequence position, which takes the form of a sawtooth wave (Fig. 1A and Fig. S9). The next step in complexity involves the alternating use of two junction types with opposite handedness; under the waveform description this topology adopts the form of a triangle wave (Fig. 1B and Fig. S9). Such a bi-handed topology can be obtained by doubling the size of the building block and introducing a new junction of the opposite handedness to that of the starting block. In contrast to solenoids, the alternating handedness eliminates supercoiling. Here we have taken this approach to construct the corrugated target architecture, using an interface-driven design strategy that builds upon existing, simpler structural blocks and minimizes the amount of sampling required to achieve a target fold.
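The waveform description can be illustrated numerically. The following is a minimal Python sketch, not the authors' analysis: it accumulates a per-junction rotation about the long axis, assuming a hypothetical junction twist of 100° and hypothetical function names. The values for the idealized templates are those reported in Fig. S9.

```python
from itertools import cycle, islice


def cumulative_twist(junction_twists, n_junctions):
    """Running rotation (degrees) about the long axis after each junction.

    junction_twists is the repeating pattern of per-junction rotations:
    one value for a solenoid, two values of opposite sign for a
    bi-handed (corrugated) repeat.
    """
    angles = [0.0]
    for step in islice(cycle(junction_twists), n_junctions):
        angles.append(angles[-1] + step)
    return angles


def wrap_degrees(angle):
    """Wrap an angle into the interval [0, 360)."""
    return angle % 360.0


# Hypothetical twist per junction; chosen only for illustration.
TWIST = 100.0

# Single junction type: the angle keeps growing and wraps around the
# superhelical axis, giving a sawtooth profile.
solenoid = [wrap_degrees(a) for a in cumulative_twist([TWIST], 8)]

# Two junction types of opposite handedness: the angle oscillates
# between two bounds, giving a triangle profile and no net supercoil.
corrugated = cumulative_twist([TWIST, -TWIST], 8)
```

In this toy model the solenoid profile climbs and wraps, while the corrugated profile returns to its starting value after every second junction, which is the sense in which alternating handedness eliminates supercoiling.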

Figure 1. Solenoid repeat, corrugated repeat and design strategy. (A) Natural solenoids have repeats with a single junction type (blue circle) and a uniform handedness; they can thus be described by a sawtooth wave. The wave represents the torsion angle along the superhelical axis between the first and the nth residue. (B) A corrugated fold, represented by a triangle wave, would entail bi-handed repeats, and thus require two junction types; Figure S9 shows actual values for idealized templates. (C) A two-stage strategy of interface design and loop construction; Geo: geometric filtering calculations, PMF: potential of mean force energy calculations, RFD: rotational force dissipation simulations. The target fold is built from a four-helix-bundle (orange) and its translational symmetry image (purple); top panel. The interface was then spanned by a grafted loop (purple); bottom panel.

RESULTS and DISCUSSION
Design strategy

To construct the target topology, two unique helical hairpins, two unique interfaces, and two unique loops are required. The use of an up-down four-helix-bundle as a starting point provides two hairpins, a single interface and a single loop. The design of a second interface with the translated image of the bundle and of a second connecting loop is then sufficient to complete the target fold (Fig. 1C). For this purpose, we employ a two-stage strategy. The first stage is aimed at designing an intramolecular interface between two arbitrarily posed building blocks. This arbitrary docking step is only constrained by the N- to C-terminal distance between the two blocks, a distance that can be defined by the allowed loop length. The second stage is aimed at constructing a loop across this interface. We began by compiling a set of four-helix-bundles from the Protein Data Bank (PDB) that satisfied a set of geometric criteria defining regularity, bundle height range, and internal hairpin similarity. Initial poses were built between the bundle backbones and their translation images. The relative orientations were made to minimise the twist and curvature at the connecting interface along the central axis. This step was followed by the main sampling routine, where a combination of sequence sampling, sidechain rotamer sampling, backbone refinement, and rigid-body docking was performed with an initially softened steric repulsion term. For efficient


sampling, different iterations of the main Monte Carlo sampling loops were interlaced with a geometric filter. The filter eliminates solutions with poor residue packing quality before further rounds of design are resumed. The sequence sampling was restricted to the interface positions, while conformational refinement was performed globally (Materials and Methods – Fig. S1). With the goal of further filtering the generated decoys by estimating the interaction free energy of the designed interfaces, more expensive potential-of-mean-force calculations were conducted through variable-velocity, variable-force steered molecular dynamics (SMD) simulations. The interaction free energy between the building blocks was calculated from the convolution of the velocity and force functions, and was used to rank the candidates accepted for the loop design stage. The described simulation setup is a more adaptive variant of a constant-velocity pulling scheme that we previously benchmarked against a subset of a protein-protein affinity dataset 8 (Materials and Methods). This accelerated form of free energy estimation has been shown to be particularly suited for protomers that do not undergo major conformational changes upon unbinding 9, which was assumed here given the nature of our building blocks.
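The ranking quantity described above can be sketched in a few lines. This is a hedged illustration rather than the authors' code: it assumes that combining the force and velocity functions amounts to the time integral of their dot product (the mechanical work of pulling), uses a simple trapezoidal rule, and all function and variable names are hypothetical.

```python
def pulling_work(times, forces, velocities):
    """Trapezoidal estimate of W = integral of F(t) . v(t) dt.

    times      : increasing sample times
    forces     : one force vector (tuple) per sample
    velocities : one velocity vector (tuple) per sample
    Units are whatever the inputs use; the result is force * length.
    """
    if not (len(times) == len(forces) == len(velocities)):
        raise ValueError("all series must have the same length")
    # Instantaneous power delivered by the pulling spring at each sample.
    power = [
        sum(f_i * v_i for f_i, v_i in zip(f, v))
        for f, v in zip(forces, velocities)
    ]
    work = 0.0
    for k in range(1, len(times)):
        work += 0.5 * (power[k] + power[k - 1]) * (times[k] - times[k - 1])
    return work


def rank_by_work(candidates):
    """Sort (name, work) pairs so that the hardest-to-separate come first."""
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Toy check: a constant force of 2 along x moving at a speed of 3 for one
# time unit performs a work of 6.
example_work = pulling_work(
    [0.0, 0.5, 1.0],
    [(2.0, 0.0, 0.0)] * 3,
    [(3.0, 0.0, 0.0)] * 3,
)
```

Ranking by the largest unbinding work is an assumption about the direction of the ordering; the text states only that the calculated quantity was used to rank candidates.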

Figure 2. All three designs were folded. The first column shows the designed models as cartoon representation. The second column shows the respective CD spectra of the designs. The third column shows the melting curves of the designs, where BRIC1 and BRIC3 exhibit monophasic unfolding, while BRIC2 does not thermally unfold below 100 °C.

The next stage was to construct the loop that connects the newly designed interface. For this we searched the PDB for loop configurations that could serve as initial templates. The search routine scanned structures with a gapped sliding window, based on a generic description of the geometry defined by the ending and starting segments of adjacent repeat units (Fig. S2). This description was obtained using the dihedral profile, the axial vectors of the relevant segments and their orientations (Materials and Methods). The grafted loops were then subjected to combined sequence and conformer sampling, and all resulting loop compositions were evaluated using


an accelerated molecular dynamics scheme. The routine applies a linear ramp of rotational force across the peptide bond at the center of the loop in a crankshaft fashion. This linear force titration results in a non-linear rotational response, and the resulting rotational kinetic energy is evaluated across the simulation time. The loop compositions that required the highest force magnitudes to induce rotational motion were selected for experimental evaluation (Materials and Methods – Fig. S3).
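The gapped sliding-window search for loop templates can be sketched as follows. This is an illustrative simplification under stated assumptions: it scores only the dihedral-profile component of the landing-site description (the axial vectors and relative orientations mentioned in the text are omitted), it uses a mean angular difference as a hypothetical similarity measure, and all names are invented for the example.

```python
def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def profile_distance(profile, query):
    """Mean angular difference between two equal-length (phi, psi) lists."""
    total = sum(
        angular_diff(phi_p, phi_q) + angular_diff(psi_p, psi_q)
        for (phi_p, psi_p), (phi_q, psi_q) in zip(profile, query)
    )
    return total / (2 * len(query))


def gapped_search(chain, n_site, c_site, gap_lengths=range(4, 9)):
    """Slide two landing-site windows, separated by a gap, along a chain.

    chain       : list of (phi, psi) tuples for a candidate structure
    n_site      : query dihedrals of the segment preceding the loop
    c_site      : query dihedrals of the segment following the loop
    gap_lengths : loop lengths to try (4 to 8 residues here)
    Returns (score, start_index, gap) tuples, best match first.
    """
    hits = []
    for gap in gap_lengths:
        span = len(n_site) + gap + len(c_site)
        for i in range(len(chain) - span + 1):
            j = i + len(n_site) + gap
            score = 0.5 * (
                profile_distance(chain[i:i + len(n_site)], n_site)
                + profile_distance(chain[j:j + len(c_site)], c_site)
            )
            hits.append((score, i, gap))
    return sorted(hits)


# Toy chain: helix, five loop residues, helix (idealized dihedrals).
HELIX = (-60.0, -45.0)
LOOP = (-90.0, 120.0)
toy_chain = [HELIX] * 4 + [LOOP] * 5 + [HELIX] * 4
best_hit = gapped_search(toy_chain, [HELIX] * 4, [HELIX] * 4)[0]
```

In the toy chain the only placement in which both windows fall entirely on helical residues is the one starting at index 0 with a five-residue gap, so that hit scores zero and sorts first.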

Figure 3. Experimental structures of BRIC1 and BRIC2 confirm the design. (A) The crystal structure of the 3D domain-swapped dimer, with individual protomers (coloured by sequence position from cyan to blue) superimposed on the design (yellow). (B) The low-resolution NMR model of the BRIC1 monomer superimposed on the design (colours as in panel A). (C) The crystal structure of BRIC2 (blue) superimposed on the design (orange).

Choice of building blocks

Three starting template bundles were adopted from three different natural proteins, to evaluate the generality of the approach and the choice of purely geometric criteria for template inclusion. The first design, BRIC1 (for Bi-handed Repeat with Internal Corrugation), was constructed from a template bundle from the CheA histidine phosphotransfer domain (PDB: 1I5N) 10; the second, BRIC2, from the DRNN four-helix-bundle, which had previously undergone a total computational redesign of its hydrophobic core (PDB: 2LCH) 11; and the third, BRIC3, from a focal adhesion targeting domain (PDB: 3B71) 12. While 2LCH is a monomeric solution structure, 1I5N and 3B71 do not possess any crystallographic arrangement similar to that proposed in Figure 1C. In the design process, BRIC1 underwent 12 mutations on the N-terminal face of the designed interface and 13 on the C-terminal face. BRIC2 underwent 12 mutations on the N-terminal face and 12 mutations on the C-terminal face, in addition to 2 mutations in the core of each bundle. BRIC3 underwent 20 mutations on the N-terminal face and 17 mutations on the C-terminal face. Structure-based sequence alignments of the designs to their respective starting templates are shown in Table S1. The interface for BRIC1 was bridged by a 5-residue loop, for BRIC2 by a 7-residue loop, and for BRIC3 by a 6-residue loop containing a disulphide bridge. For each of the designs, we explored experimentally the minimal form consisting of two repeat units. For BRIC1, we retained the native C-terminal helix of the phosphotransfer domain as a C-terminal capping helix.

Biophysical characterization

We expressed the proteins in Escherichia coli and purified them using immobilized metal ion affinity chromatography (Materials and Methods). All three were primarily monomeric by analytical size-exclusion chromatography, although they all showed pH-dependent oligomerization. Their well-dispersed 1D NMR spectra were consistent with folded proteins (Figs S6 and S7) and their circular dichroism (CD) spectra with predominantly helical secondary structure (Figure 2). In thermal unfolding experiments, BRIC1 and BRIC3 showed single-phase equilibrium unfolding at 86 °C and 67 °C, respectively, while BRIC2 did not exhibit any melting transition. The monophasic melting transition of BRIC1 corresponds to that of a single, compact domain. This emphasizes the success of our interface design, as it implies that the enthalpy of the designed junction matches that of the native one. The three constructs underwent crystallization screening, but only BRIC1 readily yielded diffracting crystals. BRIC2 was therefore fused to a crystallization chaperone, while BRIC3 did not express in sufficient yield in M9 minimal medium or in fusion with the crystallization chaperone.

Crystal structure of BRIC1


Crystallization screens yielded well-diffracting BRIC1 crystals, for which we obtained data to 2.5 Å resolution in space group C2. The crystals contained one BRIC1 monomer in the asymmetric unit, which was unambiguously located in a molecular replacement trial searching with the full design model, in the first attempt and with high contrast. However, after initial refinement, it became apparent that the connectivity between the two four-helical halves of the protein differed from the design. Clear electron density showed the linker in an extended conformation, resulting from a domain-swapped dimeric assembly. In this assembly, two elongated BRIC1 protomers, related by the crystallographic twofold symmetry, are associated in an antiparallel fashion, such that the N-terminal four-helix bundle of one protomer interfaces with the C-terminal bundle of the other protomer, and vice versa (Fig. 3 and Table S2). Given that BRIC1 also shows a minor dimeric form in solution (Fig. S4), it appears that this form was selectively crystallized. As a result, the inter-repeat interface has entirely retained the designed interface features. The swapped assembly had a backbone RMSD to the design of 1.82 Å (all-atom RMSD was 2.1 Å) across the entire structure, excluding the loop.

NMR structure of BRIC1

To address the nature of the monomeric form of BRIC1, we prepared isotope-labelled samples for solution NMR. Diffusion coefficients measured on freshly prepared samples were consistent with the designed monomer (Fig. S5). However, dimeric and higher oligomeric forms accumulated over time, impacting on the quality of spectra. This feature, combined with the ambiguity intrinsic to repeat sequences, precluded full resonance assignment and thus high-resolution structure determination. We therefore adopted a strategy aimed at creating a low-resolution model, using a sample selectively 13C-labelled on methionine methyl groups to define inter-helical contacts (Materials and Methods). An initial observation was the similarity of chemical shifts between the repeats, indicating that both adopt very similar structures. Inter-helical contacts then defined intra- and inter-repeat junctions very similar to those observed in the crystal, with the C-terminal repeat identified by contacts to the unambiguously assigned C-terminal capping helix. The compiled data were sufficient to define the monomer structure, using the domain-swapped crystallographic protomer as a starting point (Figure 3B and Table S3). The calculated monomer ensemble agrees well with the design, with an average backbone RMSD of 1.8 Å (all-atom RMSD ranged from 2.5 to 2.9 Å, excluding the capping helix).

Crystal structure of BRIC2

In contrast to BRIC1, BRIC2 did not yield well-diffracting crystals in the first attempt. For this reason, a rigid shared helix fusion to DARPin D12 (designed ankyrin repeat protein D12) was constructed. DARPin D12 had been previously identified as crystallizing well under many different conditions and thus serves as a crystallization chaperone when rigidly fused to other repeat proteins 4a. An N-terminal fusion of the DARPin was built in silico and both the shared helix and residues within 5 Å proximity were sequence-optimized using Rosetta fixed backbone design, as previously described 4a. Crystals appeared after 25 days and diffracted to 3.0 Å resolution, and the data were integrated in space group P1. For the molecular replacement, a model of the DARPin was used as a search model and the design models of BRIC2 were manually fitted into the density (refinement statistics are provided in Table S4). Out of the four molecules of the asymmetric unit, chains B and D looked as designed, and a slight bend of the shared helix was observed for chains A and C, due to crystal forces (Figure S8). Clear electron density was visible for the whole BRIC2 domain, and the designed loop connecting the two repeats could be built. In contrast to BRIC1, BRIC2 was monomeric and no domain swap was observed in any of the four chains, demonstrating the successful interface design between the two helical bundles (Figure 3C). The overall backbone RMSD ranged from 2.27 to 3.0 Å (all-atom RMSD ranged from 2.8 to 3.4 Å), and confirmed both the design and the potential of DARPin D12 as a crystallization chaperone.

Architectural uniqueness and interface design precision

To contrast the BRIC architecture to the nearest existing folds, we conducted structure searches against the entire PDB using PDBeFOLD 13 and DALI 14, and against the ECOD database 15 using TM-align 16. No folds were found that structurally align along the full length of our designed structures. Any similarity detected was largely localized to a four-helix-bundle substructure. PDBeFOLD did not recover any significantly related hits, while the best TM-align hits had TM-scores < 0.55 and did not share significant similarity with our BRICs. For the DALI searches we selected three structures based on their alignment lengths and secondary structure arrangements. Figure 4A shows the structures and idealized topologies of these hits contrasted against the BRIC topology. Two of these hits (3D19 and 4AKK) were topologically similar to each other, but with opposite chain paths. These were composed of two uniformly handed four-helix-bundles with N- and C-termini abutting each other at the connecting interface; a close-ended configuration that results from the parallel orientation of the helical hairpins to the main axis. The third hit (3AY5) consisted of two dissociated antiparallel helical domains, with one being a right-handed, side-connected bundle, and the other a left-handed, diagonally connected bundle.

To evaluate the interface design precision, we measured the polar and Cartesian error across the interface (Fig. 4B). The polar deviations of the interface between designs and experimental structures were defined in terms of tilt (θ), bend (β) and curvature (κ). Polar deviations in BRIC1 were lower in the solution structure than in the crystal structure, with the highest deviation in the κ dimension. BRIC2, however, exhibited a larger range of deviations between the asymmetric unit protomers, particularly along the β and κ dimensions. As these large inter-protomer deviations would potentially build up in a multi-repeat scenario, we carried out molecular dynamics simulations for a five-bundle-repeat model of BRIC2. The deviations in the solvated simulations were of smaller magnitude, averaging below 8.0°, 6.6° and 3.0° for |θ|, |β| and |κ|, respectively, at the four designed interfaces. We therefore expect most of these deviations to stem from the crystal packing.

For estimating the Cartesian precision at the interface, we calculated the evaluation criteria used in the CAPRI interaction prediction competition: Lrms, Irms and fnat 17. The three structures ranked medium on the Lrms score. The BRIC1 solution structure ranked medium on the Irms score, while the two crystal structures ranked acceptable. All three structures ranked high on the fnat score. In spite of the asymmetric nature of the two-sided interface design, the intramolecular four-helix-bundle backbone RMSD was minor: 0.8 Å within design and 1.1 Å within structure for BRIC1, and 0.9 Å within design and 1.3 Å within structure for BRIC2. The design vs. structure values were 0.6 Å for both respective bundles of BRIC1, and 1.2 Å and 1.3 Å for the first and second bundles of BRIC2, which affirms the rigid incorporation of the building blocks.
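Of the three CAPRI criteria, fnat (the fraction of design-model interface contacts that are recovered in the experimental structure) is the simplest to illustrate. The sketch below is a simplified, hypothetical version: it represents each residue by a single point and uses a 5 Å cutoff as an assumed contact definition, whereas the criteria actually applied are those defined in the Materials and Methods. Function names and the toy coordinates are invented for the example.

```python
from math import dist


def interface_contacts(side_a, side_b, cutoff=5.0):
    """Residue pairs across an interface that lie within `cutoff` angstroms.

    side_a, side_b : dicts mapping residue number -> (x, y, z); one
                     representative point per residue is a simplification.
    """
    return {
        (i, j)
        for i, p in side_a.items()
        for j, q in side_b.items()
        if dist(p, q) <= cutoff
    }


def fnat(design_contacts, structure_contacts):
    """Fraction of design contacts that also occur in the structure."""
    if not design_contacts:
        raise ValueError("design has no interface contacts")
    return len(design_contacts & structure_contacts) / len(design_contacts)


# Toy example: two residues on each side of the interface.
design_a = {1: (0.0, 0.0, 0.0), 2: (0.0, 4.0, 0.0)}
design_b = {10: (4.0, 0.0, 0.0), 11: (4.0, 4.0, 0.0)}
# In the "experimental" structure residue 11 has moved away.
struct_b = {10: (4.0, 0.0, 0.0), 11: (4.0, 9.0, 0.0)}

design_set = interface_contacts(design_a, design_b)
struct_set = interface_contacts(design_a, struct_b)
toy_fnat = fnat(design_set, struct_set)
```

Here the design has two cross-interface contacts and the structure retains one of them, so the toy fnat is 0.5.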

Figure 4. The architectural uniqueness and interface accuracy of the BRIC designs. (A) A comparison between the idealized BRIC architecture and the closest architectures through structural similarity searches. The structures are colored by chain path from blue to yellow. (B) Polar and Cartesian disparity between the designs and experimental structures of BRIC1 and BRIC2. The top panel shows the angular deviation from design values for the tilt (θ), bend (β) and curvature (κ) across the designed interface in green, teal and cyan, respectively. Each dot represents either an NMR model or one of the asymmetric unit chains (the single chain of the BRIC1 crystal structure is represented by orange crosses). The bottom panel shows the CAPRI evaluation criteria (Lrms, Irms and fnat) for the designed interfaces (defined in the Materials and Methods). The red, green and blue dashed horizontal lines mark the high, medium and acceptable ranks, respectively. Error bars represent the standard error across asymmetric unit chains or NMR models (some error bars are within the dot diameter).

CONCLUSIONS
At the frontier of protein design is the aim to provide new scaffolds for functionalisation; this potential has made repeat architectures attractive design targets. Internal cross-alignments in our experimental structures show that minimal structural perturbation has been introduced to the starting building blocks, leading to the possibility of constructing longer repeats. With this architecture, the


large-sized building blocks can harbor functional features from selected parent blocks, or afford more extensive engineering owing to their expanded sub-structural diversity, as compared to repeats with smaller building blocks. Protein design efforts have so far been biased towards assembling idealized secondary structure elements. As such they do not reflect natural proteins, where structural deformities are common and often associated with functional motifs. The difficulty of sampling such deformities is an inherent barrier to the designability of these motifs 18. A successful design strategy that sidesteps this problem, and even creates new topologies, has been to combine natural sub-structures by directly using structurally similar fragments as overlapping connectors 19. This, however, does not allow strict control of target topologies, since it is contingent upon the existence of overlaps that yield viable intramolecular interfaces. In contrast to this connectivity-driven approach, we introduce an interface-driven approach that is capable of delivering novel topologies from an arbitrary arrangement of building blocks. Moreover, this strategy employs sequence and conformational sampling focused only on the junctions between building blocks, and separates interface optimization from loop design, thus adding to the overall efficiency.

MATERIALS and METHODS
Computational design

The interface sequence design stage was performed in multiple consecutive rounds, filtering the top 10-20 candidates from each round and feeding them as input to the next. Each round was performed using a RosettaScripts 20 protocol comprising two generic Monte Carlo loops separated by packstat 21 and total energy (talaris2013 scoring function 22) filters. Each loop executed a protocol comprising soft-repulsion sequence sampling, backbone optimization 23, docking and conformational refinement. Between the consecutive rounds, under- or over-packing was evaluated by calculating the average deviation from high-resolution structures packing density probability. The last round output was filtered through an accelerated SMD routine that aims at approximately assessing the potential of mean force of unbinding across the designed interface. The free energy of unbinding ($\Delta G_{b \to u}$) was evaluated as

$$\Delta G_{b \to u} = \int \vec{F}(t) \cdot \vec{v}(t)\, dt$$

where $\vec{F}(t)$ and $\vec{v}(t)$ are the pulling force and velocity vectors at time $t$, respectively. One partner was fixed and aligned against a reference orientation while the other was pulled along a single dimension through a loose spring to achieve a variable-velocity, variable-force SMD setup that yields the free energy profile along the unbinding path. The protein was modelled using the CHARMM36 force field 24; the simulations were performed in explicit solvent (TIP3P water model) and 0.15 M sodium chloride as NPT ensembles at 310 K and 1 atmosphere using a Langevin thermostat and a Langevin barostat as implemented in the NAMD engine 25. A Particle Mesh Ewald electrostatics grid of 1 Å resolution was used with a long-range cutoff set at 12 Å (switching at 10 Å) and a timestep of 2 fs. The reference pulling velocity ($v$) was calibrated to 2.5 Å/ns with a spring constant ($k$) of 20 kcal·mol⁻¹·Å⁻², where the applied force ($\vec{F}$) was computed as $-\nabla\left[\tfrac{1}{2}k\left[vt - (\vec{r} - \vec{r}_0)\cdot\vec{n}\right]^2\right]$ ($\vec{r}$ being the position vector of the steered atom group, $\vec{r}_0$ its initial position, and $\vec{n}$ being the pulling direction vector). The systems underwent 2000 steps of conjugate gradient minimization before random initialization of atom velocities and force application on the backbone carbonyl carbon atoms. The calculated work was used to rank designs for the next stage.

The loop design stage begins with a structural search using a gapped sliding window across the whole PDB, where the landing sites are defined by two N-to-C vectors and a single (φ, ψ) array. Given the latter representation, every subject landing site was compared to the query geometry by means of dihedral profile similarity, landing site length similarity and landing site relative orientation similarity. Loop lengths from 4 to 8 residues were searched for, with landing sites of lengths ranging from 4 to 8 residues. The best matches according to the previous metrics were then grafted onto the top-ranking interface designs and subjected to loop mutagenesis using a Rosetta script that performs sequence sampling, backrub refinement, and sidechain refinement in a Monte Carlo looper. The designed loops were evaluated by applying a reciprocating crankshaft force across the peptide bond at the centre of the loop with a reciprocation frequency of 20 fs⁻¹. A 60 ps span of equilibration was followed by equal torques applied to the peptide bond hydrogen and oxygen atoms around the peptide bond axis, starting with an angular acceleration of 2 rad·ps⁻². The latter rotational acceleration was incrementally ramped up every 40 fs by a value of 2 rad·ps⁻² using the updated atomic positions every 20 fs so as not to apply any forces against the peptide axis itself. The simulation was performed in triplicate for durations of 300 ps with similar parameters to the SMD described above. The distributions of the loop atoms' root-mean-squared fluctuation and rotational kinetic energy were assessed to choose the designs with the lowest mean and standard deviation of these variables. The top designs at this point were directly taken to the laboratory.

Expression and purification

The genes were acquired from Synbio Technologies, already cloned into pET-28a(+) using the NcoI and NdeI cloning sites, in-frame with an N-terminal hexa-His tag and a thrombin cleavage site; the vector carries a kanamycin resistance gene as a selection marker. The plasmids were used to transform chemically competent E. coli BL21(DE3) by heat shock. For expression, the cells were grown in LB medium and induced with IPTG at an OD600 of 0.5-1, followed by overnight expression at 25 °C. For expression of labelled protein, a preculture was grown in LB medium, and the cells were collected and resuspended in M9 minimal medium (240 mM Na2HPO4, 110 mM KH2PO4, 43 mM NaCl), supplemented with 10 µM FeSO4, 0.4 µM H3BO3, 10 nM CuSO4, 10 nM ZnSO4, 80 nM MnCl2, 30 nM CoCl2 and 38 µM kanamycin sulfate, to an OD600 of 0.5-1. After 40 minutes of incubation at 25 °C, 2.0 g 15N-labelled ammonium chloride (Sigma-Aldrich cat.nr. 299251) and 6.25 g 13C D-glucose (Cambridge Isotope Laboratories, Inc. cat.nr. CLM-1396) were added, or 100 mg methyl-13C L-methionine (Sigma-Aldrich cat.nr. 299146) in the case of selective labelling in a 2.5 L culture. After another 40 minutes, IPTG was added to a final concentration of 1 mM for overnight expression. Cells were collected by centrifugation at 5,000 × g for 15 minutes and lysed with a Branson Sonifier S-250 (Fisher Scientific) in hypotonic 50 mM Tris-HCl buffer supplemented with one tablet of the cOmplete protease inhibitor cocktail (Sigma-Aldrich cat.nr. 4693159001) and 3 mg of lyophilized DNase I (5200 U/mg; Applichem cat.nr. A3778). The insoluble fraction was pelleted by centrifugation at 25,000 × g for 50 minutes, and the soluble fraction was filtered (0.45 µm pore size) and applied directly to a Ni-NTA column. A 5 mL HisTrap FF immobilized nickel column (GE Healthcare Life Sciences cat.nr. 17-5255-01) was used for this purpose, washed consecutively with 30 mL of 150 mM NaCl, 30 mM Tris buffer (pH 8.5) containing 0, 30 and 60 mM imidazole. Fractions were collected by gradient elution at > 60 mM imidazole. The eluate was concentrated using 10 kDa MWCO centrifugal filters (Merck Millipore cat.nr. UFC901024) and loaded onto an equilibrated Superdex 75 gel filtration column (GE Healthcare Life Sciences cat.nr. 17517401).
The gel filtration buffer was always 100 mM sodium phosphate (for NMR and CD transparency) adjusted to the target pH: BRIC1 was eluted at pH 8.5, and BRIC2 and BRIC3 at pH 5.5. An ÄktaFPLC system (GE Healthcare Life Sciences) was used for all chromatography runs.

For the D12-BRIC2 chimera, the shared helix between D12 and BRIC2 was introduced by assembly PCR and the resulting fragment was cloned into a pQE30LIC_3C (Qiagen) based plasmid via the BamHI and HindIII restriction sites. Chemically competent BL21(DE3) cells were transformed with the plasmid and the protein was expressed in auto-induction medium 26 at 25 °C for 16 h. Cells were resuspended in 50 mM Tris/HCl pH 8, 500 mM NaCl, 20 mM imidazole and lysed by sonication. Insoluble material was removed by centrifugation for 30 min at 30,000 × g and the supernatant was loaded onto 5 mL Ni-NTA resin equilibrated with resuspension buffer. The column was washed with 25 mL resuspension buffer and the protein was eluted with 15 mL resuspension buffer containing 250 mM imidazole. The elution fraction was dialysed overnight against 50 mM Tris/HCl pH 8, 300 mM NaCl, and the N-terminal 10xHis-tag was removed by cleavage with 3C protease (2 % w/w). Following a second Ni-NTA step to remove the protease and the His-tag, the protein solution was concentrated to 5 mL and further purified by gel filtration on an S200 16/600 column (GE Healthcare) equilibrated with 10 mM Tris/HCl pH 8, 100 mM NaCl.

Biophysical characterization

The analytical gel filtration experiments were all performed on a Superdex 200 10/300 GL column (GE Healthcare Life Sciences cat.nr. 17517501), and the fractions collected from the eluate were used for CD or NMR measurements directly afterwards. 1H NMR spectra were collected on a Bruker AVIII-800. NMR diffusion-ordered spectroscopy experiments were performed on a Bruker AVIII-600 using the relevant functionality in the TopSpin software, running the analysis over multiple aliphatic proton peaks. The structure-based prediction of the diffusion coefficient was done using the HYDROPRO software 27, setting the temperature to 310 K and the viscosity to 0.007 P. CD spectra were recorded on a Jasco J-810 spectrometer over a spectral window of 200-240 nm in steps of 0.1 nm, averaging over 5 scans. Melting curves were measured from 20 to 100 °C, recording the ellipticity at 222 nm every 0.5 °C while heating at a rate of 1 °C/min.

X-ray crystallography

For BRIC1 crystallization, the protein was concentrated to 13 mg/mL in 25 mM Tris buffer, pH 8.5, 150 mM NaCl. The D12-BRIC2 fusion was concentrated to 40 mg/mL in 10 mM Tris buffer, pH 8, 100 mM NaCl. Sitting-drop vapour diffusion crystallization trials were performed in 96-well format, equilibrating drops containing 300 nL of protein solution and 300 nL of reservoir solution against 50 µL of reservoir solution. For D12-BRIC2, the drop size was 150 nL + 150 nL and the reservoir contained 75 µL of mother liquor. The best-diffracting BRIC1 crystals were obtained with a reservoir solution containing 20 % v/v PEG 500 MME, 10 % w/v PEG 20,000, 30 mM MgCl2, 30 mM CaCl2 and 100 mM Tris-BICINE pH 8.5; they were loop-mounted and flash-cooled in liquid nitrogen. For D12-BRIC2, an initial hit was found in 0.2 M (NH4)2SO4, 25 % w/v PEG 3350 and 100 mM Bis-Tris pH 5.5. A fine screen with two perpendicular gradients of the PEG concentration and the pH was set up to yield diffracting crystals, which were flash-frozen in mother liquor containing 20 % v/v ethylene glycol. Data were collected at beamline X10SA at the Swiss Light Source, at 100 K with an X-ray wavelength of 1 Å and a PILATUS 6M-F detector (Dectris) for BRIC1 or an EIGER X 16M detector (Dectris) for D12-BRIC2. Data for BRIC1 were indexed, integrated and scaled to a resolution of 2.5 Å in space group C2, using XDS 28. For D12-BRIC2, two crystals were indexed and integrated in space group P1. After merging the two datasets, the data were scaled to 3 Å. According to the unit cell dimensions, one BRIC1 monomer was expected in the asymmetric unit with a solvent content of 50%. Molecular replacement was carried out using MOLREP 29, with the designed coordinates as a search model. A unique solution with high contrast was found in the first attempt. After rigid-body refinement with Refmac5 30, a different conformation of the designed connecting loop became apparent and was manually rebuilt in Coot 31. The structure was completed and finalized in cycles of manual modelling in Coot and refinement with BUSTER or phenix.refine 32. Data processing and refinement statistics are summarized in Tables S2 and S4.

NMR structure determination

Spectra were recorded at 310 K on Bruker AVIII-600 and AVIII-800 spectrometers. Backbone sequential assignments were made using standard triple-resonance experiments and by tracing strong NOESY contacts between sequential amide protons in helical segments. Aliphatic sidechain assignments were completed with TOCSY-based experiments, while partial aromatic assignments were made by linking aromatic spin systems to unambiguously assigned aliphatic groups in NOESY spectra. The oligomeric purity of samples was checked with diffusion-ordered (DOSY) spectra. These confirmed that fresh samples used in diffusion experiments were predominantly monomeric. To identify inter-helical contacts, we exploited the uneven distribution of methionine residues observed in the dimeric crystal structure. The 16 methionine residues in this structure fall into three broad clusters, one within each repeat and a third at the inter-repeat interface. To assign these, we produced a sample selectively 13C-labelled on methionine methyl groups on a 12C, 15N-labelled background. Members of each cluster could be identified by contacts between the labelled methyl groups in a 3D CCH-NOESY experiment 33. Contacts to unambiguously assigned protons in a 13C-HSQC-NOESY spectrum then allowed the assignment of all members within each cluster. Thus assigned, these methyl groups were effective probes of the inter-helical interfaces, providing 34 long-range distance restraints. These were applied in simulated annealing calculations, together with other unambiguously assigned contacts and TALOS-based dihedral restraints. A summary of the input data and final structure statistics is given in Supplementary Table S3. Structures were calculated with XPLOR-NIH (version 2.9.4) using a monomer extracted from the domain-swapped dimer as a starting structure, i.e. an open structure with no inter-unit interface.
Simulated annealing runs were first aimed at closing this interface by treating the four-helix bundles as pseudo-rigid bodies. The resulting set of 50 structures defined an interface very similar to that observed in the crystal structure. Refinement was performed using atomistic molecular dynamics computations in isothermal-isobaric ensembles to accommodate large conformational changes, where the overall explicit-solvent simulation setup was similar to that described above. A total of 135 ns were collected while applying the NMR-derived dihedral and distance restraints using the harmonic restraint terms $k_{\mathrm{dih}}(\theta_t - \theta_0)^2$ and $k_{\mathrm{dist}}(d_t - d_0)^2$, respectively. Here $k_{\mathrm{dih}}$ and $k_{\mathrm{dist}}$ are the dihedral and distance spring constants (set at 1 and 0.1, respectively), $\theta_t$ is the φ or ψ angle at time $t$, $d_t$ is the atom pair distance at time $t$, while $\theta_0$ and $d_0$ are the NMR-derived values. Fifty frames from these runs were picked on the basis of agreement with distance restraints and minimized under restraints in XPLOR-NIH to regularize covalent geometry. The final ensemble consisted of 26 structures chosen on the basis of lowest restraint violations.

Structural analysis

Existing structures were searched for similar folds using three different methods. The PDBeFOLD 13 and DALI 14 servers were used to search the entire PDB for folds similar to the experimental structures of BRIC1 and BRIC2. The resulting hits were sorted by their alignment lengths, and the top 100 hits were manually inspected for similar topologies. Additionally, the ECOD database 15 (ECOD40 subset) was searched using TM-align 16 for the same purpose. Only hits with a TM-score equal to or above 0.5 were manually inspected for potential similarity. Polar precision at the designed interface was assessed by calculating the deviation between the designs and the experimental structures for three quantities: the tilt (θ), bend (β) and curvature (κ) across the designed interface. The three quantities represent the plane-projected angular change between the two helical hairpins across the designed interface, along the three mutually orthogonal planes. The accuracy of the designed interface in Cartesian and qualitative terms was assessed using the CAPRI interface criteria: Lrms, Irms and fnat 17. The Lrms represents the backbone RMSD of the protein units downstream of the designed interface, after structurally aligning the pair by their upstream units. The Irms was calculated as the backbone RMSD between the residues at the designed interface (defined by a distance cutoff of 10 Å). The fnat is the number of contacts across the designed interface that are common to the design and the experimental structure, divided by the total number of contacts in the experimental structure. A contact is defined by the existence of any interatomic distance within 5 Å between two residues on either side of the interface (the designed loop residues were not considered).
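The fnat definition above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the analysis code used in this work: each structure is represented as a pair of dictionaries (one per side of the interface) mapping a residue identifier to a list of atom coordinates in Å, residue numbering is assumed to be identical in the design and the experimental structure, and the function names and toy coordinates are hypothetical.

```python
from itertools import product
from math import dist


def interface_contacts(side_a, side_b, cutoff=5.0):
    """Return the set of (res_a, res_b) pairs that have at least one
    interatomic distance within `cutoff` Å across the interface."""
    contacts = set()
    for (res_a, atoms_a), (res_b, atoms_b) in product(side_a.items(), side_b.items()):
        if any(dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
            contacts.add((res_a, res_b))
    return contacts


def fnat(design, experiment, cutoff=5.0):
    """Contacts shared by design and experiment, divided by the number of
    contacts in the experimental structure."""
    native = interface_contacts(*experiment, cutoff=cutoff)
    model = interface_contacts(*design, cutoff=cutoff)
    if not native:
        raise ValueError("experimental structure has no interface contacts")
    return len(native & model) / len(native)


# Toy example with one atom per residue: the experimental structure has two
# contacts (10-50 and 11-51); in the design, residue 51 has moved away.
experiment = (
    {10: [(0.0, 0.0, 0.0)], 11: [(0.0, 6.0, 0.0)]},
    {50: [(4.0, 0.0, 0.0)], 51: [(4.0, 6.0, 0.0)]},
)
design = (
    {10: [(0.0, 0.0, 0.0)], 11: [(0.0, 6.0, 0.0)]},
    {50: [(4.0, 0.0, 0.0)], 51: [(9.0, 6.0, 0.0)]},
)
print(fnat(design, experiment))  # 0.5
```

In practice the designed loop residues would be left out of both dictionaries before the call, in line with the exclusion stated in the text.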

ASSOCIATED CONTENT


Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI:.
Supplementary Figures S1-S9 (PDF)
Supplementary Tables S1-S3 (PDF)

AUTHOR INFORMATION

Corresponding Author

[email protected]

Funding Sources

This work was supported by institutional funds of the Max Planck Society and by grant 310030B_166676 from the Swiss National Science Foundation to AP. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Notes

The crystal structure of the dimeric BRIC1 has been deposited in the RCSB Protein Data Bank under the code 6FF6. The crystal structure of the D12-BRIC2 fusion has been deposited in the RCSB Protein Data Bank under the code 6FES.

ACKNOWLEDGMENT

We thank Reinhard Albrecht and Vincent Truffault from the Max Planck Institute for Developmental Biology for technical assistance in structure determination experiments. We also thank Beat Blattmann from the UZH Protein Crystallization Center for setting up the crystallisation experiments for the D12-BRIC2 chimera.

REFERENCES

1. Woolfson, D. N.; Bartlett, G. J.; Burton, A. J.; Heal, J. W.; Niitsu, A.; Thomson, A. R.; Wood, C. W., De novo protein design: how do we expand into the universe of possible protein structures? Current Opinion in Structural Biology 2015, 33, 16-26.
2. (a) Kloss, E.; Courtemanche, N.; Barrick, D., Repeat-protein folding: new insights into origins of cooperativity, stability, and topology. Archives of Biochemistry and Biophysics 2008, 469 (1), 83-99; (b) Rowling, P. J. E.; Sivertsson, E. M.; Perez-Riba, A.; Main, E. R. G.; Itzhaki, L. S., Dissecting and reprogramming the folding and assembly of tandem-repeat proteins. Biochemical Society Transactions 2015, 43 (5), 881-888; (c) Kajander, T.; Cortajarena, A. L.; Main, E. R. G.; Mochrie, S. G. J.; Regan, L., A New Folding Paradigm for Repeat Proteins. Journal of the American Chemical Society 2005, 127 (29), 10188-10190; (d) Mello, C. C.; Barrick, D., An experimentally determined protein folding energy landscape. Proceedings of the National Academy of Sciences of the United States of America 2004, 101 (39), 14102-14107; (e) Lowe, A. R.; Itzhaki, L. S., Rational redesign of the folding pathway of a modular protein. Proceedings of the National Academy of Sciences 2007, 104 (8), 2679-2684; (f) Wetzel, S. K.; Settanni, G.; Kenig, M.; Binz, H. K.; Plückthun, A., Folding and Unfolding Mechanism of Highly Stable Full-Consensus Ankyrin Repeat Proteins. Journal of Molecular Biology 2008, 376 (1), 241-257; (g) Wetzel, S. K.; Ewald, C.; Settanni, G.; Jurt, S.; Plückthun, A.; Zerbe, O., Residue-Resolved Stability of Full-Consensus Ankyrin Repeat Proteins Probed by NMR. Journal of Molecular Biology 2010, 402 (1), 241-258.


3. Plückthun, A., Designed Ankyrin Repeat Proteins (DARPins): Binding Proteins for Research, Diagnostics, and Therapy. Annual Review of Pharmacology and Toxicology 2015, 55 (1), 489-511.
4. (a) Wu, Y.; Batyuk, A.; Honegger, A.; Brandl, F.; Mittl, P. R. E.; Plückthun, A., Rigidly connected multispecific artificial binders with adjustable geometries. Scientific Reports 2017, 7 (1), 11217; (b) Batyuk, A.; Wu, Y.; Honegger, A.; Heberling, M. M.; Plückthun, A., DARPin-Based Crystallization Chaperones Exploit Molecular Geometry as a Screening Dimension in Protein Crystallography. Journal of Molecular Biology 2016, 428 (8), 1574-1588.
5. (a) Main, E. R. G.; Jackson, S. E.; Regan, L., The folding and design of repeat proteins: reaching a consensus. Current Opinion in Structural Biology 2003, 13 (4), 482-489; (b) Parmeggiani, F.; Huang, P.-S.; Vorobiev, S.; Xiao, R.; Park, K.; Caprari, S.; Su, M.; Seetharaman, J.; Mao, L.; Janjua, H.; Montelione, G. T.; Hunt, J.; Baker, D., A General Computational Approach for Repeat Protein Design. Journal of Molecular Biology 2015, 427 (2), 563-575; (c) Binz, H. K.; Stumpp, M. T.; Forrer, P.; Amstutz, P.; Plückthun, A., Designing Repeat Proteins: Well-expressed, Soluble and Stable Proteins from Combinatorial Libraries of Consensus Ankyrin Repeat Proteins. Journal of Molecular Biology 2003, 332 (2), 489-503; (d) Parmeggiani, F.; Pellarin, R.; Larsen, A. P.; Varadamsetty, G.; Stumpp, M. T.; Zerbe, O.; Caflisch, A.; Plückthun, A., Designed Armadillo Repeat Proteins as General Peptide-Binding Scaffolds: Consensus Design and Computational Optimization of the Hydrophobic Core. Journal of Molecular Biology 2008, 376 (5), 1282-1304.
6. (a) Brunette, T. J.; Parmeggiani, F.; Huang, P.-S.; Bhabha, G.; Ekiert, D. C.; Tsutakawa, S. E.; Hura, G. L.; Tainer, J. A.; Baker, D., Exploring the repeat protein universe through computational protein design. Nature 2015, 528, 580; (b) Parmeggiani, F.; Huang, P.-S., Designing repeat proteins: a modular approach to protein design. Current Opinion in Structural Biology 2017, 45 (Supplement C), 116-123; (c) Doyle, L.; Hallinan, J.; Bolduc, J.; Parmeggiani, F.; Baker, D.; Stoddard, B. L.; Bradley, P., Rational design of α-helical tandem repeat proteins with closed architectures. Nature 2015, 528, 585; (d) Park, K.; Shen, B. W.; Parmeggiani, F.; Huang, P.-S.; Stoddard, B. L.; Baker, D., Control of repeat protein curvature by computational protein design. Nature Structural & Molecular Biology 2015, 22 (2), 167-174.
7. Kobe, B.; Kajava, A. V., When protein folding is simplified to protein coiling: the continuum of solenoid protein structures. Trends in Biochemical Sciences 2000, 25 (10), 509-515.
8. Kastritis, P. L.; Moal, I. H.; Hwang, H.; Weng, Z.; Bates, P. A.; Bonvin, A. M. J. J.; Janin, J., A structure-based benchmark for protein–protein binding affinity. Protein Science 2011, 20 (3), 482-491.
9. Chen, P.-C.; Kuyucak, S., Accurate Determination of the Binding Free Energy for KcsA-Charybdotoxin Complex from the Potential of Mean Force Calculations with Restraints. Biophysical Journal 2011, 100 (10), 2466-2474.
10. Mourey, L.; Da Re, S.; Pédelacq, J.-D.; Tolstykh, T.; Faurie, C.; Guillet, V.; Stock, J. B.; Samama, J.-P., Crystal Structure of the CheA Histidine Phosphotransfer Domain that Mediates Response Regulator Phosphorylation in Bacterial Chemotaxis. Journal of Biological Chemistry 2001, 276 (33), 31074-31082.
11. Murphy, G. S.; Mills, J. L.; Miley, M. J.; Machius, M.; Szyperski, T.; Kuhlman, B., Increasing Sequence Diversity with Flexible Backbone Protein Design: The Complete Redesign of a Protein Hydrophobic Core. Structure 2012, 20 (6), 1086-1096.
12. Garron, M.-L.; Arthos, J.; Guichou, J.-F.; McNally, J.; Cicala, C.; Arold, S. T., Structural Basis for the Interaction between Focal Adhesion Kinase and CD4. Journal of Molecular Biology 2008, 375 (5), 1320-1328.
13. Krissinel, E.; Henrick, K., Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallographica Section D 2004, 60 (12 Part 1), 2256-2268.
14. Holm, L.; Laakso, L. M., Dali server update. Nucleic Acids Research 2016, 44 (Web Server issue), W351-W355.
15. Cheng, H.; Schaeffer, R. D.; Liao, Y.; Kinch, L. N.; Pei, J.; Shi, S.; Kim, B.-H.; Grishin, N. V., ECOD: An Evolutionary Classification of Protein Domains. PLOS Computational Biology 2014, 10 (12), e1003926.
16. Zhang, Y.; Skolnick, J., TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Research 2005, 33 (7), 2302-2309.
17. Méndez, R.; Leplae, R.; De Maria, L.; Wodak, S. J., Assessment of blind predictions of protein–protein interactions: Current status of docking methods. Proteins: Structure, Function, and Bioinformatics 2003, 52 (1), 51-67.
18. (a) Kim, D. E.; Blum, B.; Bradley, P.; Baker, D., Sampling bottlenecks in de novo protein structure prediction. Journal of Molecular Biology 2009, 393 (1), 249-260; (b) Huang, P.-S.; Boyken, S. E.; Baker, D., The coming of age of de novo protein design. Nature 2016, 537, 320.
19. Jacobs, T. M.; Williams, B.; Williams, T.; Xu, X.; Eletsky, A.; Federizon, J. F.; Szyperski, T.; Kuhlman, B., Design of structurally distinct proteins using strategies inspired by evolution. Science 2016, 352 (6286), 687.
20. Fleishman, S. J.; Leaver-Fay, A.; Corn, J. E.; Strauch, E.-M.; Khare, S. D.; Koga, N.; Ashworth, J.; Murphy, P.; Richter, F.; Lemmon, G.; Meiler, J.; Baker, D., RosettaScripts: A Scripting Language Interface to the Rosetta Macromolecular Modeling Suite. PLoS ONE 2011, 6 (6), e20161.
21. Sheffler, W.; Baker, D., RosettaHoles: Rapid assessment of protein core packing for structure prediction, refinement, design, and validation. Protein Science 2009, 18 (1), 229-239.
22. Leaver-Fay, A.; O’Meara, M. J.; Tyka, M.; Jacak, R.; Song, Y.; Kellogg, E. H.; Thompson, J.; Davis, I. W.; Pache, R. A.; Lyskov, S.; Gray, J. J.; Kortemme, T.; Richardson, J. S.; Havranek, J. J.; Snoeyink, J.; Baker, D.; Kuhlman, B., Scientific Benchmarks for Guiding Macromolecular Energy Function Improvement. Methods in Enzymology 2013, 523, 109-143.
23. Smith, C. A.; Kortemme, T., Backrub-like backbone simulation recapitulates natural protein conformational variability and improves mutant side-chain prediction. Journal of Molecular Biology 2008, 380 (4), 742-756.
24. Best, R. B.; Zhu, X.; Shim, J.; Lopes, P. E. M.; Mittal, J.; Feig, M.; MacKerell, A. D., Optimization of the Additive CHARMM All-Atom Protein Force Field Targeting Improved Sampling of the Backbone ϕ, ψ and Side-Chain χ1 and χ2 Dihedral Angles. Journal of Chemical Theory and Computation 2012, 8 (9), 3257-3273.
25. Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kalé, L.; Schulten, K., Scalable Molecular Dynamics with NAMD. Journal of Computational Chemistry 2005, 26 (16), 1781-1802.
26. Studier, F. W., Protein production by auto-induction in high-density shaking cultures. Protein Expression and Purification 2005, 41 (1), 207-234.
27. Ortega, A.; Amorós, D.; García de la Torre, J., Prediction of Hydrodynamic and Other Solution Properties of Rigid Proteins from Atomic- and Residue-Level Models. Biophysical Journal 2011, 101 (4), 892-898.
28. Kabsch, W., Automatic processing of rotation diffraction data from crystals of initially unknown symmetry and cell constants. Journal of Applied Crystallography 1993, 26 (6), 795-800.
29. Vagin, A.; Teplyakov, A., An approach to multi-copy search in molecular replacement. Acta Crystallographica Section D 2000, 56 (12), 1622-1624.
30. Murshudov, G. N.; Vagin, A. A.; Lebedev, A.; Wilson, K. S.; Dodson, E. J., Efficient anisotropic refinement of macromolecular structures using FFT. Acta Crystallographica Section D 1999, 55 (1), 247-255.
31. Emsley, P.; Cowtan, K., Coot: model-building tools for molecular graphics. Acta Crystallographica Section D 2004, 60 (12 Part 1), 2126-2132.
32. (a) Bricogne, G.; Blanc, E.; Brandl, M.; Flensburg, C.; Keller, P.; Paciorek, W.; Roversi, P.; Sharff, A.; Smart, O. S.; Vonrhein, C.; Womack, T. O., BUSTER version 2.10.3; Global Phasing Ltd: Cambridge, United Kingdom, 2017; (b) Adams, P. D.; Afonine, P. V.; Bunkoczi, G.; Chen, V. B.; Davis, I. W.; Echols, N.; Headd, J. J.; Hung, L.-W.; Kapral, G. J.; Grosse-Kunstleve, R. W.; McCoy, A. J.; Moriarty, N. W.; Oeffner, R.; Read, R. J.; Richardson, D. C.; Richardson, J. S.; Terwilliger, T. C.; Zwart, P. H., PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallographica Section D 2010, 66 (2), 213-221.
33. Diercks, T.; Coles, M.; Kessler, H., An efficient strategy for assignment of cross-peaks in 3D heteronuclear NOESY experiments. Journal of Biomolecular NMR 1999, 15 (2), 177-180.

For Table of Contents Use Only

An interface-driven design strategy yields a novel, corrugated protein architecture
Mohammad ElGamacy, Murray Coles, Patrick Ernst†, Hongbo Zhu, Marcus D. Hartmann, Andreas Plückthun† and Andrei N. Lupas*


Table of Contents graphic
