Unique Conformation in a Natural Interruption Sequence of Type XIX


Article

Unique conformation in a natural interruption sequence of type XIX collagen revealed by high-resolution crystal structure Tingting Xu, Congzhao Zhou, Jianxi Xiao, and Jinsong Liu Biochemistry, Just Accepted Manuscript • DOI: 10.1021/acs.biochem.7b01010 • Publication Date (Web): 27 Jan 2018 Downloaded from http://pubs.acs.org on January 31, 2018


Biochemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036 Published by American Chemical Society. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Biochemistry

Unique conformation in a natural interruption sequence of type XIX collagen revealed by high-resolution crystal structure

Tingting Xu (a,b), Cong-Zhao Zhou (a), Jianxi Xiao (c,*), Jinsong Liu (b,d,*)

a School of Life Sciences, University of Science and Technology of China, Hefei 230026, China

b State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China

c State Key Laboratory of Applied Organic Chemistry, Key Laboratory of Nonferrous Metal Chemistry and Resources Utilization of Gansu Province, College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, China

d Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China

*Correspondence: [email protected] (J.X.), [email protected] (J.L.)


ACS Paragon Plus Environment


Abstract

Naturally occurring interruptions in non-fibrillar collagens play key roles in molecular flexibility, collagen degradation and ligand binding. The structural features of the interruption sequences and the molecular basis for their functions have not been well studied. Here, we focused on a G5G-type natural interruption sequence, G-POALO-G, from human type XIX collagen, a homotrimeric collagen, because this sequence has properties distinct from those of a similar pathological Gly mutation sequence in collagen mimic peptides. We determined the crystal structures of the host-guest peptide (GPO)3-GPOALO-(GPO)4 to 1.03 Å resolution in two crystal forms. In these structures, the interruption zone brings localized disruptions to the triple helix and introduces a slight 6-8° bend with the same directional preference to the whole molecule, which may correspond structurally to the first physiological kink site in type XIX collagen. Furthermore, at the G5G interruption site, the presence of Ala and Leu residues, both with free N-H groups, allows the formation of more direct and water-mediated interchain hydrogen bonds than in the related Gly→Ala structure. These could partly explain the difference in thermal stability between the different interruptions. In addition, our structures provide a detailed view of the dynamic properties of such an interrupted zone in terms of hydrogen-bonding topology, torsion angles and helical parameters. Our results also identify, for the first time, zinc binding to the end of a triple helix. These findings shed light on how the interruption sequence influences the conformation of the collagen molecule and provide a structural basis for further functional studies.

Keywords: Crystal structure; peptide; triple helix; collagen; collagen interruptions


Introduction

Collagens can be classified into fibril-forming collagens (I, II, III, V and XI) and non-fibrillar collagens.1,2 They all contain a unique triple-helical conformation formed by three polypeptide chains, in which glycine is present at every third position and the other two positions have a high content of imino acids, resulting in a repetitive (Gly-X-Y)n sequence pattern. In vertebrates, the X and Y positions are frequently occupied by proline and 4-hydroxyproline (4-Hyp), respectively.3,4 The repetitive pattern is especially strict in fibrillar collagens. Breaking the repeat by substituting a single Gly with a bulkier amino acid often causes heritable connective tissue disorders such as osteogenesis imperfecta (OI).5,6 In contrast, the repetitive pattern is not strictly conserved in non-fibrillar collagens, where more than 350 naturally occurring interruptions have been observed in human collagens alone.7 Missense mutations of Gly residues, as well as deletions or insertions between Gly residues, can all be considered interruptions because they break the repetitive sequence. Naturally occurring interruptions can play critical functional roles in molecular flexibility, collagen degradation and ligand binding.8-10 Interruptions can be classified by the number of amino acids between the flanking Gly residues of the repeating Gly-X-Y pattern.11 For instance, -Gly-X-Gly-, which lacks one residue, is termed a G1G interruption. Interestingly, a single Gly→Z missense mutation in the middle of a collagen sequence, as in Gly-X-Y-Z-X-Y-Gly, is equivalent to a G5G natural interruption. Therefore, natural interruptions may share very similar sequence patterns with some pathological Gly substitutions. Owing to the lack of 3D structures, there is no clear explanation at the molecular level for the structural and functional consequences of these interruptions.
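The GnG naming scheme described above can be illustrated with a short sketch. It assumes a one-letter sequence in which O stands for Hyp and in which Gly occurs only at the Gly position of each triplet; the function name `find_interruptions` is hypothetical and is introduced here only for illustration.

```python
from typing import List, Tuple


def find_interruptions(seq: str) -> List[Tuple[int, str, str]]:
    """Return (index of first Gly, label, segment) for each Gly-to-Gly gap that is not 2.

    Simplifying assumption: every 'G' in the sequence sits at a Gly position
    of the Gly-X-Y repeat (no Gly at X or Y positions).
    """
    gly_positions = [i for i, aa in enumerate(seq) if aa == "G"]
    hits = []
    for start, end in zip(gly_positions, gly_positions[1:]):
        gap = end - start - 1  # residues strictly between two consecutive Gly
        if gap != 2:           # a regular Gly-X-Y triplet has exactly 2
            hits.append((start, f"G{gap}G", seq[start:end + 1]))
    return hits


# Host-guest peptide from the text: (GPO)3-GPOALO-(GPO)4
host_guest = "GPO" * 3 + "GPOALO" + "GPO" * 4
print(find_interruptions(host_guest))

# A Gly-X-Gly stretch lacks one residue and is reported as G1G
print(find_interruptions("GPOGPGPOGPO"))
```

For the host-guest peptide the only irregular gap is the five-residue stretch POALO between two Gly residues, which is why a Gly→Ala substitution and a G5G natural interruption look identical at the sequence level.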
So far, only two crystal structures of collagen peptides with interruptions have been determined: one G1G natural interruption (PDB ID: 1EI8, Gly-Pro-Gly interruption, also called Hyp-) and one Gly→Ala substitution with a Gly-POAPO-Gly sequence (PDB ID: 1CAG).12,13 The latter is usually found as a disease-related mutation in fibrillar collagens.14 Type XIX collagen, a non-fibrillar collagen, contains four major interruptions separating five distinct collagenous subdomains and several small interruptions within these subdomains. These interruptions have not been well characterized. In previous studies based on sequence analysis, we designed a peptide to model a natural G5G interruption (POALO) at site 386-390 in the α(1) chain of type XIX collagen.14 This peptide, which contains the natural interruption, shows a high degree of stability and a high folding rate while introducing alterations in the local triple-helical conformation. Sequence analysis showed that, in the natural interruption G5G[Ala] sequence (Gly-AA1-AA2-Ala-AA4-AA5-Gly), there is a very high percentage of hydrophobic residues (63%, of which more than half are Leu) and few Pro at position AA4, whereas Gly→Ala mutations found in the α(1) chain of type I collagen have more Pro (39%) and very few hydrophobic amino acids (12%) at the same position. Apparently, the actual sequence preference at or around the interruption site in collagen plays an important role in its conformation, stability and other functional aspects.

To understand the structural features of the residue at the X position, the structure of the peptide LOG1, with a Gly-Leu-Hyp sequence, was determined in a previous study (PDB ID: 2DRT).15 Hydrophobic Leu is one of the most frequently occurring residues at the X position.16 In this structure, additional water-mediated hydrogen bonds were introduced at the N-H group of each Leu. In addition, Leu residues are involved in intermolecular hydrophobic interactions, a packing pattern that may also exist in native collagen. However, this structure does not provide information on the effects of Leu at the X position next to a Gly→Ala substitution.


To unravel the role of the X-site residue in natural interruptions, especially the G5G[Ala] type, we determined the structure of a host-guest peptide with this interruption in the middle, (GPO)3-GPOALO-(GPO)4 (POALO for short), to 1.03 Å resolution in two crystal forms. These structures reveal the structural consequences of a Leu residue in such an interruption sequence at high resolution and provide structural details of the G5G natural interruption sequence, which will help further functional studies of natural interruption sequences in non-fibrillar collagens.

Materials and Methods

Crystallization and data collection

The host-guest peptide (Gly-Pro-Hyp)3-Gly-Pro-Hyp-Ala-Leu-Hyp-(Gly-Pro-Hyp)4 was synthesized by GenScript (Nanjing, China). TFA was exchanged for acetic acid in the last step of the synthesis. Before crystallization screening, the peptide powder was dissolved in water to 20 mg/mL. Crystal screening was performed at 20 °C using the sitting-drop vapor diffusion method. After one week, thin plate-like and rod-like crystals had grown. The best crystals came from two conditions: 0.2 M zinc acetate dihydrate, 20% w/v polyethylene glycol 3,350 (condition C2 of the PEG/Ion HT kit, Hampton Research) and 0.1 M HEPES pH 7.0, 30% v/v Jeffamine® M-600® pH 7.0 (condition D2 of the Index kit, Hampton Research). Prior to data collection, the crystals from the Index D2 condition were soaked in a mixture of paratone oil and paraffin oil as cryoprotectant, while the crystals from the PEG/Ion C2 condition did not require any cryoprotectant. Complete diffraction data sets for the best crystals were collected at 100 K at beamline 19U1 of the National Center for Protein Science Shanghai and Shanghai Synchrotron Radiation Facility. Crystals from both conditions yielded diffraction data to a resolution of 1.03 Å.

Structure determination and refinement

Diffraction images were indexed and integrated using Mosflm, and scaled and merged with Aimless from the CCP4 program suite.17,18 Five percent of the data were randomly selected for R-free calculation. Both crystals belong to the P2₁ space group, but with different cell parameters. The crystal from the PEG/Ion condition (Form I) was estimated to contain one triple helix per asymmetric unit by Matthews coefficient analysis.19 The structure of this crystal form was first determined with ACORN.20 In the ACORN procedure, the single random-atom search method was chosen to find an initial set of estimated phases. The initial model was built manually based on the ACORN map, then further refined using REFMAC5 and rebuilt with Coot.21,22 Water molecules were added to the model manually based on the Fo-Fc electron density map, reasonable hydrogen-bonding coordination and distance from other water molecules. Anisotropic B-factor refinement was used at the very end of the refinement. The crystal from the Index D2 condition (Form II) contains two triple helices per asymmetric unit. The structure of this Form II crystal was solved using the “search and phase with starting coordinates” method in ACORN, with the Form I structure as the starting model. In this ACORN-MR procedure, standard rotation and translation functions were used for the MR search. In the first stage of the rotation function search, the parameters were set to use a 0.5° rotation step with 20% of the reflections at 1.5 Å resolution. The structure building and refinement steps were the same as those performed for the Form I structure. Structure figures were prepared using PyMOL.23 Data collection and final refinement statistics are shown in Table 1.
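The Matthews coefficient estimate mentioned above can be reproduced approximately with the following sketch. The unit-cell parameters are those listed in Table 1; the residue masses are approximate average values assumed here (they are not given in the manuscript), free peptide termini are assumed, and two asymmetric units per cell are assumed for space group P2₁. The function names are hypothetical.

```python
import math

# Approximate average residue masses in Da (assumed values, not from the paper).
# "O" denotes 4-hydroxyproline.
RESIDUE_MASS = {"G": 57.05, "P": 97.12, "O": 113.12, "A": 71.08, "L": 113.16}
WATER_MASS = 18.02  # one water per chain for the free N- and C-termini


def chain_mass(seq):
    """Approximate mass (Da) of one peptide chain with free termini."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def monoclinic_volume(a, b, c, beta_deg):
    """Unit-cell volume (Å^3) for a monoclinic cell (alpha = gamma = 90°)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, mass_per_asu, n_asu_per_cell):
    """V_M in Å^3/Da: cell volume divided by the total macromolecular mass in the cell."""
    return cell_volume / (n_asu_per_cell * mass_per_asu)


peptide = "GPO" * 3 + "GPOALO" + "GPO" * 4
trimer_mass = 3 * chain_mass(peptide)

# Form I: one triple helix per asymmetric unit; Form II: two.
vm_form1 = matthews_coefficient(monoclinic_volume(20.7, 25.3, 54.0, 93.1), 1 * trimer_mass, 2)
vm_form2 = matthews_coefficient(monoclinic_volume(13.8, 45.4, 77.6, 90.7), 2 * trimer_mass, 2)
print(f"Form I  V_M = {vm_form1:.2f} Å^3/Da")
print(f"Form II V_M = {vm_form2:.2f} Å^3/Da")
```

Trying a different number of triple helices per asymmetric unit in this sketch changes V_M proportionally, which is how such an analysis helps to choose a plausible content for the asymmetric unit.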


Results

1.1 Overall structure

We obtained two crystal forms of POALO (Form I and Form II), both diffracting to 1.03 Å resolution (Table 1). Both structures adopt a triple-helical conformation with a rod-like shape consistent with the classical collagen structure (Fig. 1A). Each triple helix consists of three chains (A, B, C or D, E, F), which are designated sequentially as the leading, middle and trailing chains (L, M, T for short) and are staggered by one residue. In the Form I crystal structure, the asymmetric unit contains one triple-helical molecule, whereas the asymmetric unit of Form II contains two antiparallel triple-helical molecules (IIA, IIB). Interestingly, in the Form I structure, two zinc ions from the crystallization condition were found at the N-terminal end, which has not been reported before. In both structures, the interruption site (POALO, colored yellow in Fig. 1A) shows a slightly relaxed conformation in the central zone of the molecule and interrupts the helical symmetry. The top view of the molecules shows that the helical symmetry of the first two N-terminal triplets is close to the ideal 7₅ helix presented in the (GPO)9 structure (Fig. 1A),24 while the symmetry is broken around the central region.

The three molecules in the two forms are similar in shape but not identical. Structure alignment based on the whole molecule gives root-mean-square deviation (RMSD) values for Cα atoms of 1.20 Å and 1.43 Å between Form I and the two molecules in Form II, respectively, and 0.64 Å between the two molecules in Form II. Between the two forms, the chains deviate most at the two ends (Fig. 1B and Supplementary Fig. S1A). In addition, the high resolution allows us to observe many residues with alternate conformations, whose distributions vary among the three molecules. In Form I, 7 residues have a second conformation and 5 of them are from the M chain. In Form II, alternate conformations are seen in 16 residues for molecule A and 6 residues for molecule B, and most of them appear consecutively, especially residues Hyp9 to Hyp15 in the L chain of the Form IIA molecule (Fig. 1A), which causes the main-chain trace to drift slightly. Interestingly, although more alternate conformations are found around the interruption site than in the flanking triplet regions, the B-factor values of the interruption residues suggest that the interruption site is still less flexible than the two ends (Supplementary Fig. S1B).
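For readers unfamiliar with the RMSD values quoted above, a minimal sketch of the calculation follows. It assumes the two sets of Cα coordinates have already been superposed and paired atom by atom; the superposition step itself is not shown, and the coordinates below are hypothetical toy values, not taken from the deposited structures.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation (Å) between two paired, pre-superposed coordinate sets."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates (Å) for three paired Cα atoms
mol_1 = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.9), (0.0, 0.0, 5.8)]
mol_2 = [(0.0, 0.0, 0.0), (0.3, 0.0, 2.9), (0.6, 0.0, 5.8)]
print(f"RMSD = {rmsd(mol_1, mol_2):.2f} Å")
```

Because the deviations are squared before averaging, a few strongly displaced atoms, such as those at the chain ends, dominate the value.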

Figure 1. Overall structures of POALO and structure superposition. (A) Side view (N-terminus at the top) and top view of the overall structures of Form I (cyan, left) and Form IIA (green, right) of (GPO)3-GPOALO-(GPO)4. (B) Structure alignment of all Cα atoms between the two forms. The interruption sequence “POALO” is colored yellow. Bending angles are measured using the centers of the main-chain atoms of residues 3-12, 13-15 and 16-24 from the three chains. In the top view (lower panel), all residues are presented as lines except the Hyp residues in the first two triplets, which are presented as sticks. The 2Fo-Fc and Fo-Fc electron density maps of the alternate conformation of the interruption site in the L chain of Form IIA are shown in the rectangle, contoured at 1.0 σ and 3.0 σ and colored blue and green, respectively. (C) Cartoon representation of the structural superposition of Form I, Form IIA and other similar host-guest peptides, LOG1 (PDB ID: 2DRT, orange) and the Gly→Ala peptide (PDB ID: 1CAG, magenta), onto the model structure (GPO)9 (PDB ID: 3B0S, red). To show the deviation, alignments are performed by superposition of the N-terminal flanking region (residues 1-12 in POALO; 2-11 in LOG1; 3-14, 33-44 and 63-74 in Gly→Ala; 1-12 in (GPO)9).

Table 1. Data collection and refinement statistics

                                       Form I                      Form II
Data collection
  Wavelength (Å)                       0.9778                      0.9778
  Space group                          P2₁                         P2₁
  Unit cell a, b, c (Å)                20.7, 25.3, 54.0            13.8, 45.4, 77.6
  Unit cell α, β, γ (°)                90, 93.1, 90                90, 90.7, 90
  Molecules per asymmetric unit        1                           2
  Resolution range (Å) ᵃ               26.95-1.03 (1.05-1.03)      45.44-1.03 (1.05-1.03)
  Mosaicity (°)                        0.12                        0.57
  Unique reflections                   27461 (1401)                44384 (2163)
  Completeness (%)                     99.5 (99.6)                 94.4 (91.1)
  Mean I/σ(I) [label missing in source] 19.5 (11.4)                6.8 (2.2)
  Rpim ᵇ (%)                           3.1 (5.9)                   4.6 (21.0)
  Average redundancy                   5.9 (5.8)                   5.8 (6.0)
Refinement statistics
  Resolution range (Å)                 26.95-1.03                  45.44-1.03
  R-factor ᶜ / R-free ᵈ (%)            10.64 / 12.99               14.41 / 17.54
  RMSD ᵉ bond lengths (Å)              0.016                       0.016
  RMSD bond angles (°)                 2.485                       2.553
  Mean B factors (Å²)
    Protein                            4.04                        8.75
    Water                              11.00                       15.76
    Other solvent molecules ᶠ          5.28
  Ramachandran plot ᵍ
    Most favored (%)                   100                         100
  PDB entry                            5Y46                        5Y45

ᵃ The values in parentheses refer to statistics in the highest resolution bin.

ᵇ Rpim is the precision-indicating merging R factor.

ᶜ R-factor = Σ_h |F_o(h) - F_c(h)| / Σ_h F_o(h), where F_o and F_c are the observed and calculated structure-factor amplitudes, respectively.

ᵈ R-free was calculated with 5% of the data excluded from the refinement.

ᵉ Root-mean-square deviation from ideal values.

ᶠ Other solvent molecules include the zinc ion and acetic acid.
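The R-factor definition in footnote c and the 5% R-free test set in footnote d can be sketched as follows. This is a simplified illustration with hypothetical amplitudes; it omits the scaling between observed and calculated amplitudes that refinement programs apply, and the helper names are not taken from any crystallographic package.

```python
import random


def r_factor(f_obs, f_calc):
    """R = sum_h |F_o(h) - F_c(h)| / sum_h F_o(h) over paired amplitudes."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def pick_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly choose reflection indices to exclude from refinement (R-free set)."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * fraction))
    return set(rng.sample(range(n_reflections), n_free))


# Hypothetical observed and calculated amplitudes for four reflections
f_obs = [100.0, 50.0, 25.0, 80.0]
f_calc = [90.0, 55.0, 20.0, 80.0]
print(f"R = {100 * r_factor(f_obs, f_calc):.1f} %")

# Size of a 5% test set for the 27461 unique reflections of Form I
free_set = pick_free_set(27461)
print(len(free_set), "reflections set aside for R-free")
```

R-free is the same ratio evaluated only over the excluded reflections, so it is not biased by the refinement and is usually somewhat higher than the working R-factor, as in Table 1.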

1.2 Unique conformation at the interruption site

The interruption sequence POALO disrupts the twist of the triple helix, leading to an untwisted region. The overall structures of POALO show a slight bend at the interrupted site, which is more obvious in Form II, with an 8° angle, than in Form I, with a 6° angle (Fig. 1A). Interestingly, even though the two structures come from different crystal packing environments, all M chains point toward the convex side at the midpoint of the bend, demonstrating that the bend introduced by this natural interruption sequence has a directional preference. Viewed from the top, the Leu residue on the middle chain is much more easily identified than the Leu residues on the other two chains. To examine the extent of the structural deviation, we aligned the structures of POALO (Form I and Form IIA), Gly→Ala, LOG1 with its Gly-Leu-Hyp sequence, and the reference (GPO)9 structure (PDB ID: 3B0S). We superposed the N-terminal flanking region of each molecule and observed the deviation of the C-terminal flanking region (Fig. 1C). The results demonstrate that all three chains of POALO deviate starting from the interruption site, whereas the other peptides, in particular the Gly→Ala peptide with the same G5G-type interruption, do not show the same deviation.
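The Figure 1 legend states that the bend is measured from the centers of the main-chain atoms of three residue ranges (3-12, 13-15 and 16-24). One plausible way to turn three such centers into a bend angle is sketched below; the definition used here (the angle between the two center-to-center vectors, so that a straight rod gives 0°) is an assumption, since the text does not spell out the formula, and the coordinates are hypothetical.

```python
import math


def centroid(points):
    """Geometric center of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def bend_angle(c_nterm, c_middle, c_cterm):
    """Deviation from a straight rod (degrees) defined by three segment centers."""
    v1 = [m - n for n, m in zip(c_nterm, c_middle)]   # N-terminal flank -> interruption
    v2 = [c - m for m, c in zip(c_middle, c_cterm)]   # interruption -> C-terminal flank
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    cos_t = max(-1.0, min(1.0, dot / (norm1 * norm2)))  # guard against rounding
    return math.degrees(math.acos(cos_t))


# Hypothetical segment centers (Å): the third center is tilted by 8° off the rod axis
tilt = math.radians(8.0)
c1 = (0.0, 0.0, 0.0)
c2 = (0.0, 0.0, 10.0)
c3 = (10.0 * math.sin(tilt), 0.0, 10.0 + 10.0 * math.cos(tilt))
print(f"bend = {bend_angle(c1, c2, c3):.1f}°")
```

In practice each center would come from `centroid` applied to the main-chain atoms of the corresponding residues in all three chains.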

1.3 Super-helical twist

In general, the super-helical twist angle κ for an imino-rich collagen triple helix is about -102.9°. This angle defines a single super-helical step in a 7₅ superhelix and provides a direct view of local changes in triple-helical conformation.25 It describes the angle by which three residues from different chains at the same axial position (a super-helical triplet) rotate around the helical axis. The helical twist of the POALO peptide varies over a broad range between the terminal and central zones. In Form I, the average κ value of the N-terminal flanking region (GPO)3 is -102.7°, close to that of the ideal 7₅ helix. This is followed by a “W”-shaped profile of large negative κ values in the interruption region (covering six super-helical triplets), which indicates untwisting (Fig. 2). At the C-terminal end, there is another untwisting peak. Excluding this peak, the C-terminal flanking region (GPO)4 has an average κ value of -102.6°. In Form II, the overall profile of κ values is similar to that of Form I, but the untwisting region has drifted slightly towards the C-terminus and there is no large untwisting peak in the C-terminal flanking region. Taken together, these observations show some degree of plasticity in how the POALO sequence adapts to the interruption. Furthermore, the average κ values of the C-terminal flanking region (GPO)4 in the two molecules of Form II are -100.5° and -101.0°, indicating a small overtwisting. Compared with the Gly→Ala substitution structure, the “untwisting” region of POALO is wider, suggesting a more severe impact of the ALO sequence on the local triple-helix conformation.
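The way κ values are read in this section (more negative than the ideal value means untwisting, less negative means overtwisting) can be made explicit with a small sketch. The ideal value and the regional averages are those quoted in the text; the 1° tolerance and the function name are assumptions introduced only for illustration.

```python
IDEAL_KAPPA = -102.9  # degrees; value quoted in the text for the ideal helix
                      # (numerically close to -360 * 2 / 7)


def classify_twist(kappa, tolerance=1.0):
    """Label a kappa value relative to the ideal twist.

    The tolerance (degrees) is an arbitrary illustrative threshold.
    """
    delta = kappa - IDEAL_KAPPA
    if delta > tolerance:
        return "overtwisted"   # less negative than ideal
    if delta < -tolerance:
        return "untwisted"     # more negative than ideal
    return "near-ideal"


# Average kappa values of the flanking regions as reported in the text
regions = {
    "Form I, N-terminal (GPO)3": -102.7,
    "Form I, C-terminal (GPO)4": -102.6,
    "Form II molecule 1, C-terminal (GPO)4": -100.5,
    "Form II molecule 2, C-terminal (GPO)4": -101.0,
}
for name, kappa in regions.items():
    print(f"{name}: {kappa:.1f}° -> {classify_twist(kappa)}")
```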

Figure 2. Variation of the triple-helical twist (κ) for the three molecules. The molecule in Form I and the two molecules in Form II are shown as cyan, light green and green lines, respectively. The corresponding region in Gly→Ala (PDB: 1CGD), calculated with the same method, is shown in red. The sequence of each peptide is shown three times to represent the three chains staggered by one residue. The triplet pair is defined as reported,25 and the κ values are obtained from the spherical polar coordinates after Cα coordinate superposition of those triplet pairs using the SUPERPOSE program in CCP4.18,26

1.4 Dihedral angle of the interruption site

The average main-chain dihedral angles (φ, ψ) of the entire POALO structure are close to those of the idealized 7₅ helix (Table 2A). However, at the substitution site, the torsion angles of Ala vary widely. All Ala residues on the M chain of POALO deviate significantly in φ (about -120°) from the average value of -66.2° for the idealized 7₅ helix (Table 2B). This large deviation is also found in the Gly→Ala structure, but to a smaller degree, which indicates a common feature introduced by Ala substitution at the Gly site. Apart from this similarity, significant differences are observed between the dihedral angles of Ala followed by Leu and of Ala followed by Pro, especially in the ψ angle of Ala on the T chain (Table 2B, POALO and Gly→Ala). In contrast, there is no significant difference between the dihedral angles of Gly followed by either Pro or Leu (Table 2B, (GPO)9 and LOG1).

Table 2. Main chain conformational angle statistics ᵃ

A. Average main chain conformational angles

                         POALO ᵇ             Idealized 7₅ helix 27
Gly (Ala)      φ         -70.6 (11.6)        -66.2
               ψ ᶜ       169.2 (10.5)        175.9
X position     φ         -71.7 (8.9)         -79.3
               ψ         159.8 (10.1)        159.4
Y position     φ         -57.7 (4.7)         -59.5
               ψ         150.0 (6.0)         146.3

B. Ala (Gly) position in the interruption region. Significant deviations were marked in bold red in the original table.

                         POALO                             Gly→Ala     (GPO)9     LOG1
                         Form I    Form IIA   Form IIB     1CGD 28     Gly13      Gly12
Ala/Gly (L)    φ         -69       -65        -51          -74         -73        -67
               ψ         166       148        146          164         174        173
Ala/Gly (M)    φ         -126      -113       -120         -102        -66        -71
               ψ         168       169        174          159         179        180
Ala/Gly (T)    φ         -82       -70        -72          -67         -75        -61
               ψ         163       158        165          135         173        171

ᵃ Data were calculated with Discovery Studio 3.5 (Accelrys, San Diego, CA).

ᵇ Calculated using alternate conformation A.

ᶜ Excluding residues at the terminal ends and B7, B19, C7, C19 in Form I and C25, E19, E25 in Form II, which have negative values.
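The φ and ψ values in Table 2 are dihedral angles defined by four consecutive backbone atoms (C'-N-Cα-C' for φ, N-Cα-C'-N for ψ). The authors computed them with Discovery Studio; the sketch below is only a generic illustration of how a signed dihedral can be obtained from four points, using hypothetical coordinates, followed by the φ deviations of the M-chain Ala from the idealized value quoted in the text.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Components of b0 and b2 perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical points: a +90° twist about the z axis
print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))

# Deviation of the M-chain Ala phi (Table 2B) from the idealized helix value
IDEAL_GLY_PHI = -66.2
for form, phi in (("Form I", -126), ("Form IIA", -113), ("Form IIB", -120)):
    print(form, f"phi deviation = {phi - IDEAL_GLY_PHI:.1f}°")
```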


1.5 Interchain hydrogen bonds

Interchain hydrogen bonds are critical for maintaining the conformation and stability of collagen. Usually, in a regular consecutive G-X-Y sequence, the NH group of every Gly is involved in a direct interchain hydrogen bond. In the structures of POALO, some of these hydrogen bonds are lost in the interruption region (Fig. 3A-C) and replaced by water-mediated hydrogen bonds at the NH of Ala. In Form I, the direct hydrogen bonds are lost for Ala on the M chain and T chain, while in Form II the direct interaction is lost for Ala on the M chain only. However, two kinds of interchain hydrogen bond are introduced in the disrupted region when Leu occupies the X position. First, the NH groups of some Leu residues participate in water-mediated hydrogen bonds, as Ramachandran once proposed.29 Water-mediated hydrogen bonds are also observed in the Gly→Ala structure. Although the hydrogen bonds differ considerably between POALO and Gly→Ala, the two structures share the common feature of incorporating water molecules to compensate for the disruption induced by the sequence irregularity. Second, a direct hydrogen bond forms between Leu on the T chain and the hydroxyl group of Hyp15 on the M chain in Form II. Hyp15 at that position has two conformations, and conformation B swings toward the T chain and is involved in the direct hydrogen bond (Fig. 3F). The occupancies of conformation B of Hyp15 in the two molecules are 0.58 and 0.46, respectively.

In addition, the distances between the methyl group of some Ala residues and a carbonyl group from another chain are short in the POALO structures (Table 3). Their distance and angle parameters are consistent with weak hydrogen bonds, based on the ideal geometry of a modeled Cβ-H.30-32 The weak hydrogen bond of the Cα-H⋯O=C type in collagen was first reported in the Gly→Ala structure.33 The Cβ-H⋯O=C bond length observed in the POALO structure is shorter than that of the Cα-H⋯O=C bond.33 It is also significantly shorter than the distance between Cβ-H and O=C in the Gly→Ala structure (Table 3). Owing to the structural flexibility in the interrupted region, this kind of bond varies among the three molecules. The Form I and Form IIA molecules both contain two such bonds, but with different combinations of donor and acceptor residues. Moreover, in Form IIA, one such bond, found in the second conformation of Ala on the L chain, has a distance as short as 2.89 Å. In Form IIB, two Ala residues are involved in weak hydrogen bonds, with Ala on the L chain interacting with Pro (from both the M chain and the T chain) in a bifurcated manner.
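The geometric quantities discussed above (donor-acceptor distance for direct hydrogen bonds; H⋯O distance, Cβ⋯O distance and Cβ-H⋯O angle for weak hydrogen bonds) can be computed from atomic coordinates as in the following sketch. The 3.3 Å cutoff is the one quoted in the Figure 3 legend for direct hydrogen bonds; no cutoff is assumed for the weak bonds, and the coordinates are hypothetical.

```python
import math

DIRECT_HBOND_CUTOFF = 3.3  # Å; donor-acceptor limit quoted in the Figure 3 legend


def distance(p, q):
    """Euclidean distance (Å) between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, measured at b."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (distance(a, b) * distance(c, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def is_direct_hbond(donor_n, acceptor_o):
    """True if the N...O distance is below the direct hydrogen-bond cutoff."""
    return distance(donor_n, acceptor_o) < DIRECT_HBOND_CUTOFF


def weak_hbond_geometry(c_beta, h, o):
    """Return (H...O distance, C-beta...O distance, C-beta-H...O angle)."""
    return distance(h, o), distance(c_beta, o), angle_deg(c_beta, h, o)


# Hypothetical, perfectly linear C-beta-H...O arrangement (coordinates in Å)
c_beta = (0.0, 0.0, 0.0)
h_atom = (1.09, 0.0, 0.0)
o_atom = (3.33, 0.0, 0.0)
h_o, cb_o, ang = weak_hbond_geometry(c_beta, h_atom, o_atom)
print(f"H...O = {h_o:.2f} Å, Cb...O = {cb_o:.2f} Å, angle = {ang:.1f}°")
```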

Figure 3. Interchain hydrogen bonds around the interruption site in Form I (A), Form IIA (B), Form IIB (C), LOG1 (D) and Gly→Ala (E). (F) Details of the interchain hydrogen bonds formed between the hydroxyl of Hyp15 and the main-chain NH, and of the weak hydrogen bond between the methyl group of Ala and the main-chain C=O. Direct hydrogen bonds shorter than 3.3 Å are marked as blue dashed lines; hydrogen bonds mediated by one water molecule and weak hydrogen bonds are shown as green and orange dashed lines, respectively. All chains are designated as in previous figures. In F, the residues are presented as sticks and the alternate conformations of Ala13/L, Leu14/M and Hyp15/M are shown. The 2Fo-Fc and Fo-Fc electron density maps of these residues are shown as in Figure 1A.

Table 3. Ala Cβ-H⋯O=C interchain weak hydrogen bond parameters ᵃ

                                               Bond length (Å)         Bond angle (°)
                   Cβ-H⋯O=C                    H⋯O        Cβ⋯O         Cβ-H⋯O
Form I             Ala (M)⋯Ala (L)             2.46       3.33         149.8
                   Ala (T)⋯Leu (L)             2.37       3.23         148.2
Form IIA           Ala (L/A)⋯Pro11 (T)         1.92       2.89         179.9
                   Ala (L/B)⋯Pro11 (T)         2.11       3.08         175.3
                   Ala (T)⋯Ala (M)             2.32       3.26         162.1
Form IIB           Ala (L)⋯Pro11 (M)           2.21       3.15         163.0
                   Ala (L)⋯Pro11 (T)           2.02       2.99         175.0
                   Ala (T)⋯Ala (M)             2.31       3.22         155.6
                   Average value               2.22       3.14         164.0
Gly→Ala (1CGD)     Ala (L)⋯Gly12 (T)           2.67       3.64         179.0
                   Ala (L)⋯Pro13 (T)           2.71       3.59         151.0
                   Ala (T)⋯Pro76 (M)           2.23       3.20         177.1
                   Ala (T)⋯Pro46 (L)           2.73       3.63         154.5
                   Average value               2.59       3.52         165.0

ᵃ Values of distance and bond angle are predicted from methyl hydrogen atoms modeled in the optimal orientation for binding.
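As a consistency check on Table 3, the average rows can be recomputed from the individual entries. The sketch below assumes that the eight POALO entries and the four Gly→Ala entries are grouped as we read them from the table; the variable names are hypothetical.

```python
from statistics import mean

# (H...O in Å, C-beta...O in Å, C-beta-H...O angle in degrees), values from Table 3
POALO = [
    (2.46, 3.33, 149.8), (2.37, 3.23, 148.2),                       # Form I
    (1.92, 2.89, 179.9), (2.11, 3.08, 175.3), (2.32, 3.26, 162.1),  # Form IIA
    (2.21, 3.15, 163.0), (2.02, 2.99, 175.0), (2.31, 3.22, 155.6),  # Form IIB
]
GLY_ALA = [
    (2.67, 3.64, 179.0), (2.71, 3.59, 151.0),
    (2.23, 3.20, 177.1), (2.73, 3.63, 154.5),
]


def column_means(rows):
    """Mean of each column across all rows."""
    return tuple(mean(col) for col in zip(*rows))


poalo_avg = column_means(POALO)
gly_ala_avg = column_means(GLY_ALA)
print("POALO    H...O %.2f  Cb...O %.2f  angle %.1f" % poalo_avg)
print("Gly->Ala H...O %.2f  Cb...O %.2f  angle %.1f" % gly_ala_avg)
```

The recomputed means agree with the tabulated averages to within rounding and show the shorter H⋯O and Cβ⋯O distances in POALO that the text describes.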

1.6 Lateral packing via Leu residue

In the crystal lattice of Form I, the molecules pack in an antiparallel manner along the “b” axis (Fig. 4A, B). In this packing arrangement, Leu on the L chain makes a hydrophobic contact with Leu on the M chain of the neighboring antiparallel molecule, with a distance of 3.7 Å and buried surface areas of 90.7 Å² and 61.2 Å² on the interface, respectively (calculated by PISA34). In Form II, there is a similar hydrophobic contact between the Leu residues on the L chain and M chain, but from two parallel molecules A or B (Fig. 5A). Another form of lateral packing, involving the Leu residue from the T chain, generates a “continuous hydrophobic packing zone” between molecules A and B (Fig. 5B, C). Although the distances between the Leu residues in this packing arrangement are as long as 4.2 Å, there is no water molecule in this zone (Fig. 5C). With this hydrophobic interaction, the packing in Form II has a much shorter inter-axial distance (12-13 Å) than that in Form I (13-14 Å).

Figure 4. Crystal packing of POALO in Form I. (A) Lateral arrangement of Form I structures along the crystallographic “b” axis. Details of the hydrophobic contact between Leu on the L and M chains are shown in the right box. The unit cell is shown as gray lines. (B) Top view of A. Leu residues involved in the hydrophobic contact are shown as red sticks. Zinc ions are presented as gray spheres.

Figure 5. Crystal packing of POALO in Form II. (A) Lateral arrangement of the Form II structure along the crystallographic “a” axis. (B) View of A along the crystallographic “b” axis. Details of the hydrophobic contact between Leu on the L and M chains are shown in the box. (C) Top view of the arrangement of A along the crystallographic “c” axis (Form IIA shown in green; Form IIB shown in blue). The “continuous hydrophobic packing zone” is marked with two dashed lines. Details of the contacts between Leu on the T chains of two antiparallel molecules are shown in the right box. The water molecules around the zone are represented as gray spheres. Leu residues involved in the hydrophobic contact are shown as red sticks. Unit cells are shown as green lines.

1.7 Zn binding site

In the Form I structure, two zinc ions are found in the end-to-end interstitial space (Fig. 4A). They connect two molecules through strong coordinate bonding networks. Such binding has not previously been reported in a collagen structure. In the Zn1 binding site, the zinc atom is coordinated by four water molecules and by the NH2 and carbonyl groups of the Gly1 residue on the L chain (Fig. 6A). Among the four water molecules, two are involved in interactions with the C-terminal ends of the neighboring molecule, one forms hydrogen bonds with Hyp27/L and Pro26/M, and another is involved in interactions with Gly1/M and the C-terminal ends of the M and T chains of another molecule. For the Zn2 binding site (Fig. 6B), the tetrahedral coordination is completed by one acetate molecule, the carbonyl group of Hyp27/T, the NH2 and carbonyl groups of Gly1/M, and the NH2 group of Gly1/T.

Figure 6. Zinc binding in the Form I structure. The hydrogen bonding network of Zn1 (A) and Zn2 (B). Residues are shown as sticks. Water molecules and zinc ions are presented as red and gray spheres, respectively. Coordinating bonds are marked as magenta dashed lines.

Discussion

Herein, we determined the structures of a G5G natural interruption sequence peptide, POALO, in two crystal forms at high resolution. They share some structural features with an earlier reported Gly→Ala peptide, which may be common features of G5G type interruptions.13 For example, the central zone shows a slightly untwisted conformation, a localized disruption with minor impact on the helical twist of the N- and C-termini. Ala on the middle chain shows a large deviation of its φ value toward more negative values. Moreover, both sequences display relatively lower temperature factor values in the interrupted zone than at the two ends, which is significantly different from the G1G interruption sequence (PDB ID: 1EI8, Gly-Pro-Gly interruption, called Hyp-). This further confirms that preservation of the phase of the repeating tripeptide pattern is important to the stability of the triple helix conformation.35

However, introducing a Leu after the substitution site brings some important differences to the overall structure. A bend is observed at the interrupted site of the POALO structure. Although the degree of bending differs slightly between the two forms, the bending direction is the same, with the M chains all protruding outwards. The corresponding sequence in type XIX collagen is located at the 6/9 site of the COL5 domain, which is around the P1 kink observed by electron microscopy in type XIX collagen.36 Therefore, this bend in the POALO structure is unlikely to be a crystallization artifact and probably has physiological significance. These structures also reveal that the bending is variable, which may be consistent with the different conformations observed in type XIX collagen. This bending is not reported in Hyp- and Gly→Ala. Therefore, the bend found in POALO is likely a result of the wider untwisting in POALO than in Hyp- and Gly→Ala. This untwisting pattern exists in all three structures, even in Form I, whose structure may be affected by zinc binding at the ends. Therefore, this untwisting pattern should be a general structural feature brought by the POALO interruption sequence and may be related to its function,37 but it is not clear how such sequences affect the flexibility and bending of collagen. Our structures illustrate that a sequence like POALO, with a high content of imino acids, could introduce bending to collagen and that this bending may have a specific orientation.

In previous studies, a longer version of POALO had a much higher Tm value than a peptide of the same length with the sequence LOAPO, indicating a relatively higher stability for POALO.14 In the Gly→Ala structure, loss of hydrogen bonds and the presence of interstitial water induce destabilization when the temperature is increased. Comparing the hydrogen bonding patterns at the interruption sites, two or three more direct interchain hydrogen bonds are regained in the POALO structure. In addition, the hydroxyl group of Hyp15 in the Form II structure participates in a direct hydrogen bonding interaction, which agrees well with a previous finding that the last position in POALO prefers Hyp.35 This is the first report of a direct interchain hydrogen bond formed by the hydroxyl group of Hyp, thus expanding our understanding of the roles Hyp plays in collagen. Furthermore, interchain weak hydrogen bonds between the Cβ group of Ala and carbonyl oxygen may contribute to the structural stability. On the other hand, variability in the hydrogen bonding network and dihedral angles of Ala among the three structures suggests dynamic characteristics of POALO, indicating that a new stable state can be reached through rearrangement of the hydrogen bonds. Therefore, this type of interruption structure may tolerate function-related conformational variability well, such as the bending we observed in this structure.
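For readers who wish to reproduce a backbone dihedral comparison of this kind, the sketch below computes a signed torsion angle from four atom positions using the standard atan2 formulation. For φ of residue i the four atoms are C(i-1), N(i), Cα(i) and C(i). The coordinates are hypothetical test points, not atoms from the POALO structures, and `dihedral_deg` is an illustrative name.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees (range -180 to 180)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = _dot(b2, _cross(n1, n2))
    x = math.sqrt(_dot(b2, b2)) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical points: the fourth atom is rotated by -70° about the
# p1-p2 axis relative to the first atom.
t = math.radians(-70.0)
p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
p3 = (math.cos(t), math.sin(t), 1.0)

phi = dihedral_deg(p0, p1, p2, p3)
print(f"torsion = {phi:.1f}°")
```

Using atan2 rather than acos keeps the sign of the angle, which matters when comparing shifts toward more negative φ values.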
Sequence analysis demonstrates that Leu accounts for more than half of the hydrophobic residue occurrences at the X site in G5G[Ala] sequences.14 The side chain of Leu may therefore have a significant impact on structural stability or may be beneficial for biological function. At the X site, all side chains of Leu point out to the solvent. Our structures show that Leu residues are involved in the lateral arrangement in the different crystal forms. In contrast to the arrangement in LOG1, the buried surface area of Leu residues in the POALO structure is much larger. Although close contact between molecules is favorable for structural stability, it may not be as important in non-fibrillar collagen as in fibrillar collagen. In some studies, Leu has been found at an enzyme recognition site or protein binding site.38 To better understand the sequence preference for Leu at this site, more physiological studies on interruption sequences similar to POALO are needed.

Interestingly, two zinc binding sites are found in the Form I structure. The zinc binding here is more likely a crystallization effect, but it may occur in native collagen containing exposed triple helix structure. Our structures suggest that zinc binding alters the arrangement between the two ends of the triple helix and interferes with the twist of the triple helix to a certain degree. Our finding might be helpful for future crystal engineering of collagen peptides.

In summary, our structures of POALO are the highest-resolution structures reported to date for a collagen-like peptide containing an interruption sequence. Our results demonstrate that POALO has a perturbed local conformation, which introduces a bend to the peptide. Even with the bending, this peptide is more stable than Gly→Ala, as Leu at the X site could release the tension brought by the Ala substitution. The structures present atomic-level details of molecular flexibility and the hydrogen bonding pattern at the interruption site, and reveal a new role that Hyp can play in the stability of collagen. The ALO sequence has also been found in two type IV collagens, so our structures will provide general structural features for natural interruption sequences in collagen and facilitate further functional studies of collagen in degradation, aggregation and ligand binding.


Supporting Information

Variation of the Cα-Cα distances in structural superposition and average temperature factor comparison between two crystal forms of POALO

Accession numbers: All coordinates and structure factors have been deposited in the Protein Data Bank with accession numbers: 5Y45, 5Y46.

Competing interests

The authors declare that there are no competing interests associated with the manuscript.

Funding

This work was partially supported by grants from the National Natural Science Foundation of China (31570759, 21305056, 31300670).

Acknowledgements

We thank the staff members of 19U1 beamline at National Center for Protein Science Shanghai and the Shanghai Synchrotron Radiation Facility, People's Republic of China, for assistance during the diffraction data collection. The authors are also grateful for the support from the Guangzhou Branch of the Supercomputing Center of CAS. The authors would like to thank Drs. Jordi Bella and Barbara Brodsky for helpful discussions and critical reading of the manuscript.


References

1. van der Rest, M., and Garrone, R. (1991) Collagen family of proteins, FASEB J 5, 2814-2823.

2. Gordon, M. K., and Hahn, R. A. (2010) Collagens, Cell and tissue research 339, 247-257.

3. Ramachandran, G. N., and Kartha, G. (1955) Structure of collagen, Nature 176, 593-595.

4. Rich, A., and Crick, F. H. (1955) The structure of collagen, Nature 176, 915-916.

5. Myllyharju, J., and Kivirikko, K. I. (2001) Collagens and collagen-related diseases, Ann Med 33, 7-21.

6. Marini, J. C., Forlino, A., Cabral, W. A., Barnes, A. M., San Antonio, J. D., Milgrom, S., Hyland, J. C., Korkko, J., Prockop, D. J., De Paepe, A., Coucke, P., Symoens, S., Glorieux, F. H., Roughley, P. J., Lund, A. M., Kuurila-Svahn, K., Hartikka, H., Cohn, D. H., Krakow, D., Mottes, M., Schwarze, U., Chen, D., Yang, K., Kuslich, C., Troendle, J., Dalgleish, R., and Byers, P. H. (2007) Consortium for osteogenesis imperfecta mutations in the helical domain of type I collagen: regions rich in lethal mutations align with collagen binding sites for integrins and proteoglycans, Hum Mutat 28, 209-221.

7. Thiagarajan, G., Li, Y., Mohs, A., Strafaci, C., Popiel, M., Baum, J., and Brodsky, B. (2008) Common interruptions in the repeating tripeptide sequence of non-fibrillar collagens: sequence analysis and structural studies on triple-helix peptide models, J Mol Biol 376, 736-748.

8. Mohs, A., Popiel, M., Li, Y., Baum, J., and Brodsky, B. (2006) Conformational features of a natural break in the type IV collagen Gly-X-Y repeat, J Biol Chem 281, 17197-17202.

9. Miles, A. J., Skubitz, A. P., Furcht, L. T., and Fields, G. B. (1994) Promotion of cell adhesion by single-stranded and triple-helical peptide models of basement membrane collagen alpha 1(IV)531-543. Evidence for conformationally dependent and conformationally independent type IV collagen cell adhesion sites, J Biol Chem 269, 30939-30945.

10. Miles, A. J., Knutson, J. R., Skubitz, A. P., Furcht, L. T., McCarthy, J. B., and Fields, G. B. (1995) A peptide model of basement membrane collagen alpha 1 (IV) 531-543 binds the alpha 3 beta 1 integrin, J Biol Chem 270, 29047-29050.

11. Bella, J. (2014) A first census of collagen interruptions: collagen's own stutters and stammers, J Struct Biol 186, 438-450.

12. Bella, J., Liu, J., Kramer, R., Brodsky, B., and Berman, H. M. (2006) Conformational effects of Gly-X-Gly interruptions in the collagen triple helix, J Mol Biol 362, 298-311.

13. Bella, J., Eaton, M., Brodsky, B., and Berman, H. M. (1994) Crystal and molecular structure of a collagen-like peptide at 1.9 Å resolution, Science 266, 75-81.

14. Sun, X., Chai, Y., Wang, Q., Liu, H., Wang, S., and Xiao, J. (2015) A Natural Interruption Displays Higher Global Stability and Local Conformational Flexibility than a Similar Gly Mutation Sequence in Collagen Mimic Peptides, Biochemistry 54, 6106-6113.

15. Okuyama, K., Narita, H., Kawaguchi, T., Noguchi, K., Tanaka, Y., and Nishino, N. (2007) Unique side chain conformation of a Leu residue in a triple-helical structure, Biopolymers 86, 212-221.

16. Ramshaw, J. A., Shah, N. K., and Brodsky, B. (1998) Gly-X-Y tripeptide frequencies in collagen: a context for host-guest triple-helical peptides, J Struct Biol 122, 86-91.

17. Leslie, A. G. W., and Powell, H. R. (2007) Processing diffraction data with MOSFLM, Nato Sci Ser Ii-Math 245, 41-51.

18. Collaborative Computational Project, Number 4 (1994) The CCP4 suite: programs for protein crystallography, Acta Crystallogr D Biol Crystallogr 50, 760-763.

19. Kantardjieff, K. A., and Rupp, B. (2003) Matthews coefficient probabilities: Improved estimates for unit cell contents of proteins, DNA, and protein-nucleic acid complex crystals, Protein Sci 12, 1865-1871.

20. Yao, J. X. (2002) ACORN in CCP4 and its applications, Acta Crystallogr D Biol Crystallogr 58, 1941-1947.

21. Vagin, A. A., Steiner, R. A., Lebedev, A. A., Potterton, L., McNicholas, S., Long, F., and Murshudov, G. N. (2004) REFMAC5 dictionary: organization of prior chemical knowledge and guidelines for its use, Acta Crystallogr D Biol Crystallogr 60, 2184-2195.

22. Emsley, P., Lohkamp, B., Scott, W. G., and Cowtan, K. (2010) Features and development of Coot, Acta Crystallogr D Biol Crystallogr 66, 486-501.

23. DeLano, W. (2008) The PyMOL molecular graphics system, Palo Alto, CA: DeLano Scientific LLC.

24. Okuyama, K., Miyama, K., Mizuno, K., and Bachinger, H. P. (2012) Crystal structure of (Gly-Pro-Hyp)9: implications for the collagen molecular model, Biopolymers 97, 607-616.

25. Bella, J. (2010) A new method for describing the helical conformation of collagen: dependence of the triple helical twist on amino acid sequence, J Struct Biol 170, 377-391.

26. Krissinel, E., and Henrick, K. (2004) Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions, Acta Crystallogr D Biol Crystallogr 60, 2256-2268.

27. Kramer, R. Z., Bella, J., Mayville, P., Brodsky, B., and Berman, H. M. (1999) Sequence dependent conformational variations of collagen triple-helical structure, Nat Struct Biol 6, 454-457.

28. Bella, J., Brodsky, B., and Berman, H. M. (1995) Hydration structure of a collagen peptide, Structure 3, 893-906.

29. Ramachandran, G. N., and Chandrasekharan, R. (1968) Interchain hydrogen bonds via bound water molecules in the collagen triple helix, Biopolymers 6, 1649-1658.

30. Desiraju, G. R., and Steiner, T. (2001) The Weak Hydrogen Bond in Structural Chemistry and Biology, Oxford University Press, Oxford.

31. Horowitz, S., and Trievel, R. C. (2012) Carbon-oxygen hydrogen bonding in biological structure and function, J Biol Chem 287, 41576-41582.

32. Yesselman, J. D., Horowitz, S., Brooks, C. L., 3rd, and Trievel, R. C. (2015) Frequent side chain methyl carbon-oxygen hydrogen bonding in proteins revealed by computational and stereochemical analysis of neutron structures, Proteins 83, 403-410.

33. Bella, J., and Berman, H. M. (1996) Crystallographic evidence for Cα-H⋯O=C hydrogen bonds in a collagen triple helix, J Mol Biol 264, 734-742.

34. Krissinel, E., and Henrick, K. (2007) Inference of macromolecular assemblies from crystalline state, J Mol Biol 372, 774-797.

35. Hwang, E. S., Thiagarajan, G., Parmar, A. S., and Brodsky, B. (2010) Interruptions in the collagen repeating tripeptide pattern can promote supramolecular association, Protein Sci 19, 1053-1064.

36. Myers, J. C., Li, D., Amenta, P. S., Clark, C. C., Nagaswami, C., and Weisel, J. W. (2003) Type XIX collagen purified from human umbilical cord is characterized by multiple sharp kinks delineating collagenous subdomains and by intermolecular aggregates via globular, disulfide-linked, and heparin-binding amino termini, J Biol Chem 278, 32047-32057.

37. Walker, K. T., Nan, R., Wright, D. W., Gor, J., Bishop, A. C., Makhatadze, G. I., Brodsky, B., and Perkins, S. J. (2017) Non-linearity of the collagen triple helix in solution and implications for collagen function, Biochem J 474, 2203-2217.

38. Manka, S. W., Carafoli, F., Visse, R., Bihan, D., Raynal, N., Farndale, R. W., Murphy, G., Enghild, J. J., Hohenester, E., and Nagase, H. (2012) Structural insights into triple-helical collagen cleavage by matrix metalloproteinase 1, Proceedings of the National Academy of Sciences 109, 12461-12466.


For Table of Contents Use Only

Unique conformation in a natural interruption sequence of type XIX collagen revealed by high-resolution crystal structure

Tingting Xu (a,b), Cong-Zhao Zhou (a), Jianxi Xiao (c)*, Jinsong Liu (b,d)*