Proteolytic Cleavage—Mechanisms, Function, and “Omic

Granelli-Piperno , A.; Reich , E. A Study of Proteases and Protease-Inhibitor Complexes in Biological Fluids J. Exp. Med. 1978, 148, 223– 234 DOI: ...
12 downloads 0 Views 5MB Size
Review Cite This: Chem. Rev. 2018, 118, 1137−1168

pubs.acs.org/CR

Proteolytic CleavageMechanisms, Function, and “Omic” Approaches for a Near-Ubiquitous Posttranslational Modification Theo Klein,†,⊥ Ulrich Eckhard,†,§ Antoine Dufour,†,¶ Nestor Solis,† and Christopher M. Overall*,†,‡ †

Life Sciences Institute, Department of Oral Biological and Medical Sciences, and ‡Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada ABSTRACT: Proteases enzymatically hydrolyze peptide bonds in substrate proteins, resulting in a widespread, irreversible posttranslational modification of the protein’s structure and biological function. Often regarded as a mere degradative mechanism in destruction of proteins or turnover in maintaining physiological homeostasis, recent research in the field of degradomics has led to the recognition of two main yet unexpected concepts. First, that targeted, limited proteolytic cleavage events by a wide repertoire of proteases are pivotal regulators of most, if not all, physiological and pathological processes. Second, an unexpected in vivo abundance of stable cleaved proteins revealed pervasive, functionally relevant protein processing in normal and diseased tissuefrom 40 to 70% of proteins also occur in vivo as distinct stable proteoforms with undocumented N- or Ctermini, meaning these proteoforms are stable functional cleavage products, most with unknown functional implications. In this Review, we discuss the structural biology aspects and mechanisms of catalysis by different protease classes. We also provide an overview of biological pathways that utilize specific proteolytic cleavage as a precision control mechanism in protein quality control, stability, localization, and maturation, as well as proteolytic cleavage as a mediator in signaling pathways. Lastly, we provide a comprehensive overview of analytical methods and approaches to study activity and substrates of proteolytic enzymes in relevant biological models, both historical and focusing on state of the art proteomics techniques in the field of degradomics research.

CONTENTS 1. Introduction 2. Different Classes of Proteases and Proteolytic Mechanisms 2.1. Metallopeptidases 2.2. Aspartic Peptidases 2.3. Glutamate Peptidases 2.4. Asparagine Peptide Lyases 2.5. Serine Peptidases 2.6. Threonine Peptidases 2.7. Cysteine Peptidases 3. Biological Processes Regulated by Proteolytic Cleavage 3.1. Maturation of Translated Proteins 3.2. Signal Peptide Removal 3.3. Proprotein Convertases 3.4. Processing by Carboxypeptidases 3.5. Ectodomain Shedding 3.6. Modulation of Biochemical Pathways and Signaling through Proteolytic Cleavage 3.7. Precision Proteolysis in Immunity 4. Analytical Approaches for Proteases and Proteolytic Processing 4.1. Monitoring Substrate Cleavage 4.2. Substrate- and Activity-Based Probes 4.3. Proteomics-Based Approaches for Substrate Discovery 4.3.1. N-terminal Proteomics © 2017 American Chemical Society

4.3.2. C-terminal Proteomics 5. Future Perspectives Author Information Corresponding Author Present Addresses Notes Biographies Acknowledgments References

1137 1138 1138 1140 1141 1141 1142 1143 1144

1156 1157 1158 1158 1158 1158 1158 1159 1159

1. INTRODUCTION Regulatory pathways interface to create a system capable of selectively amplifying and integrating signals in a coordinated manner to achieve homeostasis. Defects in one pathway can destabilize the entire system, causing neoplastic, fibrotic, inflammatory, autoimmune, and immunodeficiency diseases. The biological function of proteins is regulated by an intricate network of posttranslational chemical and enzymatic modifications such as phosphorylation, glycosylation, ubiquitination, and proteolysis. Proteolytic cleavage or proteolysis is the enzymatic hydrolysis of a peptide bond in a peptide or protein substrate by a family of specialized enzymes termed proteases.1 This irreversible posttranslational modification is often

1145 1145 1145 1145 1146 1147 1148 1150 1151 1151 1152

Special Issue: Posttranslational Protein Modifications

1152 1153

Received: March 1, 2017 Published: December 21, 2017 1137

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

Figure 1. (a) Active-site view of the human metzincin ADAM17 (pdb entry: 1bkc). The side chains of Glu406, His405, His409, His415, and Met435 are shown as stick models (sequence numbering following Uniprot entry P78536). Zn, zinc. (b) Active-site view of the Clostridium tetani gluzincin collagenase T (pdb entry: 4ar9). The side chains of Glu466, Glu499, His465, and His469 are shown as stick models (sequence numbering following Uniprot entry Q899Y1). Zn, zinc. (c) Active-site view of E. coli methionine aminopeptidase (pdb entry: 2mat). The side chains of Glu204, Glu235, Asp97, Asp108, and His171 are shown as stick models (sequence numbering following Uniprot entry P0AE18). Co, Cobalt. All figures were created within PyMOL with carbon depicted in cyan, oxygen in red, nitrogen in blue, and sulfur in yellow.

(i) hydrophobic and electrostatic interactions with the substrate amino acid residue side chains in commonly well-defined substrate-binding pockets and (ii) hydrogen bonding with the peptide backbone of a substrate in an extended conformation.15 Whereas endopeptidases (EC 3.4.21−25) cleave substrates within the protein molecule, aminopeptidases (EC 3.4.11−14) and carboxypeptidases (EC 3.4.15−18) remove up to three amino acids from the N- or C-terminus, respectively, and are jointly referred to as exopeptidases.16 A special case represents the so-called omega peptidase family (EC 3.4.19), which releases terminal amino acids that are either chemically modified (e.g., acetylated or formylated), cyclized (e.g., pyroglutamate), or linked by an isopeptide bond. A more systematic classification follows the nature of the active site and the amino acid or metal ion catalyzing the peptide-bond cleavage. Thereby seven distinct classes of proteases can be identified: the aforementioned and well-established five major classes, the serine- (EC 3.4.21), cysteine- (EC 3.4.22), aspartic(EC3.4.23), metallo- (EC 3.4.24), and threonine peptidases (EC 3.4.25), and two classes without known representatives in mammals: the glutamic peptidases (currently included in EC 3.4.23) and the asparagine peptide lyases (EC 4.3.2).17 The latter were established only in 2011 and represent true outliers in the world of proteolytic enzymes as their reaction mechanism does not include the addition of a water molecule. Thus, they do not qualify as hydrolases or peptidases and have to be considered as lyases instead, as they utilize an elimination reaction where an internal asparagine residue first attacks its own carbonyl carbon bond and cyclizes, followed by a nucleophilic attack, leading to a succinimide-ring formation and peptide-bond cleavage.18

regarded solely as a destructive mechanism, for example, in the degradation of extracellular matrix in inflammation and intracellular proteins by the proteasome or in apoptosis, but recent decades of research in degradomics have highlighted specific, limited precision proteolysis, termed processing, to be a crucial regulator of protein function in virtually all biological pathways.2 The complete repertoire of proteases (the degradome) in man comprises 588 peptidases, which can be grouped in 5 catalytic classes: 21 aspartyl-, 164 cysteine-, 192 metallo-, 184 serine-, and 27 threonine peptidases,3 even outnumbering the large kinase family (kinome) with its 518 members.4 This means that nearly 3.1% of the 19 587 human protein encoding genes5,6 encode for proteases, making these hydrolases one of the most abundant classes of enzymes and promoting them to center stage in a wide range of biological processes and diseases. Proteolytic processing is widespread, is pervasive, and generates stable protein fragments as shown by targeted Nterminal proteomics analysis in numerous biological samples. Unexpectedly, the majority of identifiable protein N-termini are within the protein sequence, i.e., unannotated, neo N-termini 77% in platelets,7 68% in erythrocytes,8 50−60% in normal and inflamed murine skin,9 and 78% in human dental pulp.10 The interplay of interconnections between proteases and their endogenous inhibitor proteins, termed the protease web, remains understudied to date, leaving many unanswered questions open for investigation.11 Although proteolytic degradation is vital for maintaining homeostasis, e.g., in protein turnover through the ubiquitinproteasome system,12 this Review will focus on targeted, specific processing as a near-ubiquitous post-translational modification. Processing can profoundly alter the biological properties of a protein, its localization, and stability. We discuss the repertoire of proteases, their different hydrolytic mechanisms, the biological function of proteolytic cleavage, and approaches for the analysis of proteases and their substrates.

2.1. Metallopeptidases

The metalloproteases (EC 3.4.24) represent a highly diverse group of endo- and exopeptidases. They all harness at least one divalent metal ion in their active site, typically Zn2+, which is coordinated by three amino acid side chains and a water molecule, where the latter plays a key role in substrate hydrolysis19,20 (Figure 1a and b). The most common metal ligand is histidine, followed by glutamate, aspartate, and lysine. Additionally, an unpaired cysteine is found in the propeptides of matrix metalloproteinases (MMPs) and the adamalysin families, which replaces the active site water in the zymogen form, rendering the protease inactive.21 Notably, in some families, two

2. DIFFERENT CLASSES OF PROTEASES AND PROTEOLYTIC MECHANISMS In the absence of a protease, peptide bonds are highly stable at neutral pH and 25 °C with half-lives of >100 years as shown in ancient human remains,13 whereas they are hydrolyzed within milliseconds in protease-assisted biochemical reactions.14 Proteases recognize and bind their substrates mainly through 1138

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

Figure 2. Active-site view of the zinc metallo-deubiquitinase AMSH-LP (associated molecule with the SH3 domain of STAM-like protein) in complex with Lys63-linked diubiquitin (pdb code: 2znv; PMID: 18758443). (a) AMSH-LP is shown in semitransparent surface representation (middle), while the distal (left) and proximal (right) ubiquitin are depicted in cartoon rendering. The scissile isopeptide bond between the carboxyl group of the distal ubiquitin’s C-terminus (Gly76) and the ε-amino group of Lys63 of the proximal ubiquitin is positioned in the Zn2+ coordinating active site of AMSH-LP and is depicted as two spheres. Following the nomenclature of Schechter and Berger (PMID: 6035483), Arg72 to Gly76 of the distal ubiquitin and Lys63 and Glu64 of the proximal ubiquitin moiety represent the P5 to P1 and P1′ to P2′ positions of the deubiquitinating enzyme (DUB) cleavage site, respectively. Due to the special character of the isopeptide bond, Gln62 and Ile61 form an additional substrate− protease interface that we designate with a double prime symbol (″), and these are consequently designated as P2″ and P3″. (b) Schematic representation of the isopeptide bond of Lys63-linked diubiquitin, highlighting the T-shaped substrate-recognition motif of deubiquitinating enzymes. The tripeptide sequence Gln62-Lys63-Glu64 forms together with the Lys63-linked C-terminal glycine of the distal ubiquitin a unique interaction surface only found in Lys63-linked polyubiquitin. Similar but different structural arrangements are found in other polyubiquitin chains, rendering DUBs highly specific for individual ubiquitin linkages (PMID: 27012468). All structural figures were generated using PyMOL.

tidylinositol) anchors or a C-terminal transmembrane domain.29−31 The well-known and classic MMP substrate is collagen; MMPs 1, 8, 13, and especially MMP14, also known as membrane-type (MT) 1-MMP, cleave type-I collagen across all three α-chains at a single point, namely, Gly775↓Ile776 for the α1(I)-chain and Gly775↓Leu776 for the α2(I)-chain threequarters of the length from the N-terminus of the ∼1000 amino acid residue-long collagen. However, not all MMPs are capable of this cleavage, despite highly similar substrate specificity on the peptide level.32 Binding to the hemopexin domain is key to triple helicase activity and scissile-bond access for cleavage.33 Despite their undisputed role in extracellular matrix remodeling, MMPs are now well-established as signaling enzymes in many biological pathways and diseases such as arthritis, cancer, cardiovascular disease, and liver fibrosis.34,35 Of similar great interest are the adamalysins, consisting of the ADAMs (a disintegrin and metalloproteinase; Figure 1a) and ADAMTS (ADAM with a thrombospondin type 1 motif) families. Like MMPs, the adamalysins are expressed as multidomain extracellular proteins, belong to the metzincin superfamily, and employ a cysteine switch for zymogen inactivation. Whereas the ADAMTS family comprises secreted proteins only, most ADAMs possess a C-terminal transmembrane domain followed by a short cytoplasmic tail that is crucial for intracellular signaling. Next to their emerging role in coagulation, inflammation, and cell migration, ADAMTS proteases are mostly known for their role in connective tissue remodeling by proteolytic processing of, e.g., procollagens, aggrecan, and versican. On the contrary, most ADAM metalloproteinases are involved in ectodomain shedding of various plasma membrane-tethered growth factors, cytokines, receptors, and/or adhesion molecules and thus play fundamental roles in controlling development and homeostasis, thereby linking them to several pathological states such as cancer, cardiovascular disease, and Alzheimer’s disease.36−38 Other prominent members of the metzincin superfamily are, e.g., astacin, meprin A and B, and ulilysin,39 now known as LysargiNase;40 the latter is used because of its trypsin-mirroring specificity in proteomics.40

cocatalytic metal ions are found, typically in the case of cobalt (e.g., E. coli methionine aminopeptidase; Figure 1c) or manganese (e.g., E. coli aminopeptidase A) but also sometimes for zinc (e.g., mammalian leucyl aminopeptidases). Importantly, unlike serine and cysteine proteases, metallopeptidases do not form a covalent acyl−enzyme intermediate during substrate hydrolysis. Although irreversible inhibition of metalloprotease activity has been described using cobalt-containing Schiff-base complexes as small-molecule inhibitors,22 no peptide-based covalent inhibitors exist for this class. Upon substrate binding, the active-site water is displaced toward the general base glutamate whereas its oxygen remains associated with the metal ion. Additionally, the carbonyl of the scissile bond interacts now with the metal ion, leading to a pentacoordinated transition state. The general base glutamate then subtracts a proton from the water molecule, which simultaneously performs a nucleophilic attack on the carbonyl bond of the scissile bond, leading to cleavage and a gem-diol intermediate of the substrate. Subsequent protonation of the amide product by the now general acid glutamate leads to the release of the C-terminal substrate fragment, whereas an attack of a water molecule on the active site zinc is needed to displace the carboxylate-cleavage fragment.19,20,23 Two major families of the metallopeptidases are distinguished by their third zinc-binding ligand.24,25 Whereas metzincins (Figure 1a) employ a total of three histidine residues within a HExxHxxGxxH motif, the gluzincin superfamily (Figure 1b) harnesses a glutamate further downstream of the central HExxH (HExxHxnE), often on an α helix. Additionally, all metzincins possess a methionine-containing 1,4-turn (Met-turn, hence the metzincin family name), which forms a “hydrophobic basement” for the catalytic metal ion (Figure 1a). One subgroup of the metzincin family, the MMPs, comprises 23 members in human. MMPs are encoded as preproproteins with conserved active site features. The presence or absence of accessory substrate-binding domains, known as exosites,26−28 such as the hemopexin domain or the fibronectin type-II modules, distinguish individual family members, some of which also possess GPI (glycosylphospha1139

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

One group of metalloproteases distinguishes itself by their substrates; instead of endo- or exoproteolytical cleavage, the Jab1/Mov34/Mpr1 Pad1 N-terminal+ (MNP+) domain proteases (JAMM) recognize the isopeptide bond formed by ubiquitin conjugated to the free ε-amino group of lysine in the protein sequence (Figure 2a and b). These deubiquitinating enzymes (DUBs) comprise a family of close to 100 metalloand cysteine proteases in Mammalia41 that regulate the removal of ubiquitin from substrates, thereby forming an integral control mechanism in the ubiquitin system.12 Contrary to the mammalian collagenases (MMPs), bacterial collagenases such as collagenase G from Hathewaya histolytica (formerly known as Clostridium histolyticum42) belong to the gluzincin superfamily and digest native, triple-helical collagens into a mixture of small peptides under physiological conditions.43 These collagenases cleave in the repeating ProHyp-↓-Gly sequence of collagen and select for a small hydrophobic residue in P1′, whereas their mammalian counterparts favor a large hydrophobic residue such as Leu or Ile.32,44 In ColT from Clostridium tetani, the fourth zinc ligand, Glu499, lies 34 amino acids downstream of the H465ExxH motif (Figure 1b), in the center of the peptidase domain resembling a thermolysin-like peptidase (TLP) fold.45,46 Like many proteases, thermolysin from Bacillus thermoproteolyticus is expressed as preproenzyme, where the 200 amino acid prodomain acts as a molecular chaperone in the periplasm, leading to autoactivation and subsequent secretion of the mature protease into the extracellular space.47 The TLP fold imparts two roughly spherical subdomains that are separated by a deep substrate-binding cleft where substrates bind from left to right when shown in the standard orientation of metallopeptidases.48 The upper subdomain is dominated by a five-pleated β-sheet with the active-site facing edge strand, providing crucial contacts for substrate binding in antiparallel fashion, whereas the lower C-terminal subdomain is built up predominantly by α-helices. The catalytic zinc ion is sandwiched in between by the HExxH-motif of the central helix and the glutamate from the gluzincin helix.49 Notably, a similar five-pleated β-sheet is also found in MMPs and many other metallopeptidases.50 Another highly prominent gluzincin family member is the dipeptidyl carboxypeptidase angiotensin I-converting enzyme (ACE), an unusual protease having two catalytic domains in the same protein encoded by the same gene. As a central component of the renin-angiotensin system, ACE tightly regulates blood pressure by (i) removing a dipeptide from the C-terminus of the decapeptide angiotensin I, generating the vasopressor angiotensin II, and (ii) inactivating the vasodilator bradykinin by two sequential cleavages at the C-terminus, removing a total of four amino acids.51

Figure 3. (a) Active-site view of the porcine aspartic peptidase pepsin (pdb code: 4pep). The side chains of Asp91 and Asp274 are shown as stick models (sequence numbering following Uniprot entry P00791). (b) Active-site view of the human aspartic peptidase pepsin in complex with an active-site binding phosphonate inhibitor IVA-VAL-VALLEU(P)-(O)PHE-ALA-ALA-OME (pdb code: 1qrp). The side chains of Asp94 and Asp277 are shown as stick models (sequence numbering following Uniprot entry P0DJD7). (c) Active-site view of the porcine pepsin precursor pepsinogen (pdb code: 2psg) with the inhibitory prodomain in place. The side chains of Asp91, Asp274, and Lys51 are shown as stick models (sequence numbering following Uniprot entry P00791). All figures were created within PyMOL with carbon depicted in cyan, oxygen in red, nitrogen in blue, and sulfur in yellow.

The acidic residue in the second domain serves as the general base to abstract a proton from the active-site water, enabling it to perform a nucleophilic attack on the carbonyl carbon of the bound substrate, whereas the first aspartate residue provides via its protonated state electrophilic assistance to the carbonyl oxygen. The tetrahedral oxyanion intermediate subsequently breaks down when the scissile amide bond acquires a proton from the solvent or through structural rearrangement of the intermediate, releasing the cleavage fragments and regenerating the protonated state of the active site aspartate.57−59 Pepsin represents the most-studied aspartic peptidase to date. It shows a bilobed, mostly β-sheet structure, with the active site located at their interface. Every lobe harbors one of the aspartate residues of the catalytic dyad within a conserved Asp-Xaa-Gly motif, in which Xaa is typically a serine or threonine. Despite low sequence similarity, the high overall homology suggests that one lobe most likely evolved from the other by gene duplication.53,57 Approximately 45 residues downstream of the first aspartate is located a highly conserved tyrosine residue, which separates and defines the S1 and S2′ substrate-binding pockets. The tyrosine is also part of an 11residue β-hairpin loop called flap, which caps the active site and encloses substrates and inhibitors in the active groove upon binding. Whereas substrates are readily turned over, active-site inhibitors bind in similar fashion, rendering the enzyme inactive by tight binding and displacing the active site water (Figure 3b). This structural feature is only present in the N-terminal domain and thus constitutes the main asymmetric aspect of the overall structure.60−62 The members of the retroviral retropepsin family constitute a special case, as retropepsins consist only of one lobe and thus need to form a homodimer to build a functional catalytic site in between. Notably, these viral enzymes (e.g., HIV-1 and HIV-2 protease) are not thought to be ancestral and are most likely derived from the N-terminal lobe of a pepsin homologue, as they harbor the aforementioned active-site capping flap

2.2. Aspartic Peptidases

Aspartic peptidases are the main proponents of the carboxyl peptidase family, which is characterized by an acidic active site residue (aspartate or glutamate), a low pH optimum, and an activated water molecule that attacks the scissile peptide bond in a concerted general acid−base mechanism.52,53 In the majority of known aspartic peptidases, a pair of aspartic acid residues act together to bind and activate the catalytic water molecule (Figure 3a), but in some, other amino acids replace the second aspartate, e.g., Glu in the E. coli peptidase HybD54,55 or His in the histo-aspartic protease from Plasmodium falciparum.56 1140

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

structure. Because of the presence of two flaps, the active-site grooves of retropepsins are shortened and accommodate only substrate residues from P3 to P3′ instead of P5 to P3′, with a typical preference for hydrophobic residues around the scissile bond. 63,64 Because of the inherent symmetry of the homodimer, the two aspartates cannot play predefined roles during substrate hydrolysis. Instead, they most likely engage in their individual roles only upon substrate binding, which breaks the initial symmetry.65,66 Importantly, many eukaryotic aspartic proteases are secreted as zymogens.62,67,68 Pepsinogen, for example, is synthesized in the gastric mucosa and secreted into the stomach, where a 44residue N-terminal propeptide (Ile16-Leu62 in human pepsin A) displaces the first ∼10 residues of the mature pepsin sequence (Val63-Ala388) and blocks the active-site cleft with its two α-helices, whereas Lys51 displaces the active-site water (Figure 3c). Additionally, two tyrosine residues, namely, Tyr53 and Tyr137, block the S1 and S1′ binding sites. A special role is incumbent to the highly conserved Arg29 residue, which binds to Asp73, stabilizing the zymogen conformation and, together with other ionic interactions, being responsible for the acidtriggered conversion to pepsin in the stomach.69,70 Other prominent members of the aspartic protease family include cathepsins D and E, the β-secretases BACE1 and -2 (βsite amyloid precursor protein (APP)-cleaving enzyme-1 and -2), and γ-secretase. Cathepsin D is a soluble lysosomal aspartic peptidase with a 44-amino-acid long propeptide. After activation, further proteolytic processing yields the mature and active lysosomal protease in human, composed of disulfidelinked heavy and light chains. Notably, in mice the single-chain form dominates, whereas the bovine form shows equal amounts of both variants.68,71 Additionally, cathepsin D has two asparagine-linked oligosaccharides of high-mannose type, which are not required for protein folding or enzyme activity but play an important role in lysosomal targeting of the protein via mannose-6-phosphate (M6P) receptors.72 Cathepsin E is an intracellular, nonlysosomal glycoprotein closely homologous to cathepsin D but mainly found in the endoplasmic reticulum. It plays a vital role in protein degradation, as well as in the MHC class II antigen processing pathway.73 Cathepsin E is rather unusual among the pepsin-like protein family, as it forms a disulfide-bonded homodimer. Notably, monomeric and dimeric cathepsin E share similar biochemical characteristics, but dimerization evokes higher structural stability to alkaline pH and temperature changes, allowing the enzyme to thrive at the close-to-neutral pH of the endoplasmic reticulum.74,75 BACE-1, also known as Memapsin 2 (membrane aspartic protease of the pepsin family 2), is a membrane-bound aspartic protease, mostly known for its sheddase activity on the APP, releasing the N-terminal ectodomain and initiating the subsequent cleavage of the C-terminal part by the y-secretase complex, ultimately forming a 30−51-amino-acid long isoform of the Aβpeptide involved in the plaque formation in Alzheimer’s disease.68,76 The y-secretase complex itself consists of 5 individual proteins including the multipass membrane protein presenilin-1, an aspartyl protease with an intramembranous active site enabling APP cleavage in its transmembrane region.77

Scytalidium lignicolum was determined, revealing that its catalytic dyad consisted of a glutamic acid and a glutamine78 (Figure 4a). Expressed as preproproteins, they are mostly found

Figure 4. (a) Active-site view of Scytalidium lignicolum glutamate peptidase eqolisin (scytalidopepsin B, pdb code: 1s2b). The side chains of Glu190 and Gln107 are shown as stick models (sequence numbering following Uniprot entry P15369) (b) Active-site view of eqolisin with a bound N-terminal cleavage fragment Ala-Ile-His (pdb code: 1s2k). The side chains of Glu190 and Gln107 are shown as stick models (sequence numbering following Uniprot entry P15369). All figures were created within PyMOL with carbon depicted in cyan, oxygen in red, nitrogen in blue, and sulfur in yellow.

in pathogenic fungi; several glutamic proteases also have been identified in bacteria and archaea but not yet in eukaryotes.79 Because of their distinct fold, which is similar to concanavalin A but novel for a protease, eqolisin is insensitive to pepstatin and several other inhibitors of aspartic proteases.80 A rather peculiar case of glutamate peptidases are the trimer-forming tail spike and fiber proteins found in various bacteriophages. These proteins harbor a proteolytically active C-terminal domain (CTD) as well as a domain structure associated with their primary function as, e.g., endosialidases or K5 lyases. The CTD initially acts as an intramolecular chaperone that is indispensable for correct folding of the full-length protein and the formation of a functional homotrimer. After assembly, the autocatalytic release of the CTDs by the glutamate peptidase stabilizes the remaining trimer.81,82 The suggested hydrolytic mechanism for glutamic peptidases is similar to that for their aspartic counterparts. A general baseactivated water molecule (Figure 4a and b), initially bound by the glutamine and the general base glutamate of the catalytic dyad, acts as the nucleophile, whereas the side-chain amide of the Gln provides electrophilic assistance first by polarizing the carbonyl carbon of the scissile peptide bond and second by stabilizing the tetrahedral intermediate. A proton transfer from the active-site Glu to the leaving-group nitrogen then triggers the breakdown of the transition state and finalizes the proteolytic cleavage, followed by the regeneration of the active-site glutamate.78,83,84 2.4. Asparagine Peptide Lyases

Another family of proteolytic enzymes, which was initially thought to belong to the aspartic peptidases, is now known to be not even hydrolases but peptide lyases, as their cleavage mechanism does not involve a water molecule. Instead, an asparagine residue acts as a nucleophile and attacks its own main-chain carbonyl to create a five-membered succinimide intermediate by intramolecular cyclization. In the right physicochemical environment, this leads to the subsequent cleavage of the peptide bond C-terminally of asparagine in an elimination reaction without the addition of a water molecule.

2.3. Glutamate Peptidases

Historically misclassified as aspartic peptidases, glutamate peptidases were only described in 2004, when the crystal structure of scytalidoglutamic peptidase (eqolisin) from 1141

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

side chain of the first residue of the C-terminal extein (transesterification). The intein’s C-terminal asparagine then attacks its own backbone carbonyl and releases the intein from the polypeptide, similar to that described above for peptide lyases in general. Finally, an acyl rearrangement of the thioester bond linking the extein domains occurs to form a normal peptide bond between them. Importantly and name-giving, none of the peptide bonds in this process is broken by hydrolysis, and thus these enzymes have to be referred to as peptide lyases and not proteases.18,85,86

Furthermore, as every active site functions only once without any subsequent substrate turnover, the self-cleaving activity of peptide lyases does not represent a typical enzymatic action after all.18,85 Probably the most prominent members of the asparagine peptide lyase family are the self-splicing inteins (Figure 5). An

2.5. Serine Peptidases

The serine peptidases hydrolyze peptide bonds very differently to the mechanisms described earlier for aspartate, glutamate, and metallopeptidases, as the initial nucleophilic attack is carried out by the side chain of a serine residue and not an activated water molecule, which leads to a covalent acyl− enzyme intermediate during peptide-bond cleavage. Thus, the catalytic mechanism is more similar to the asparagine peptide lyases, but with the important difference that the second tetrahedral intermediate is actually attacked by a water molecule, thus fully qualifying them as true hydrolases and peptidases.87,88 The most common (“classical”) catalytic triad of serine peptidases consists of histidine, aspartate, and serine (Figure 6a), not necessarily in that sequence order, but with highly similar three-dimensional arrangements. Whereas the serine acts as a nucleophile, the aspartate stabilizes the histidine side chain, which in turn acts as a proton donor and general base. It is noteworthy that glutamate and histidine replace the aspartate in bacterial LD-carboxypeptidases and herpes simplex virus endopeptidase, respectively, and lysine takes the role as general base in the Ser/Lys-dyad of Lon-A peptidase, which lacks the acidic residue. However, the probably most-studied serine peptidases, such as trypsin, chymotrypsin, or subtilisin, all have the classical catalytic triad.89 In chymotrypsin, the catalytic triad is built up by Ser214, His71, and Asp119 (Figure 6a and b). The latter orients His57 and reduces its pKa to enable the histidine to function as a general base in the reaction. The histidine then abstracts a proton from Ser195, which simultaneously performs a nucleophilic attack on the carbonyl of the substrate scissile bond. A covalent link is formed between the serine side-chain oxygen and the substrate, resulting in a tetrahedral intermediate (enzyme−acyl complex) and a negative charge on the peptide carbonyl oxygen, which is stabilized by the protease backbone amides of Gly193 and Ser195, which form a positively charged pocket commonly referred to as an oxyanion hole. Subsequent protonation of the substrate amide nitrogen by His57, which now acts as a general acid, collapses the complex and results in the release of the C-terminal part of the substrate (amidase reaction), while the N-terminal part remains bound to the protease. Next, an activated water molecule, which displaced the C-terminal fragment in the active site, attacks the ester bond of the acyl−enzyme intermediate (Figure 6c), resulting in a second oxyanion hole-stabilized tetrahedral intermediate. Collapse of the latter releases the N-terminal substrate fragment from the active site (esterase reaction) and regenerates the serine side chain. Similar catalytic mechanisms are suggested for the other nonclassical active sites of serine peptidases.87−89 Another prototype of serine peptidase is the endopeptidase trypsin. It cleaves preferentially after arginine or lysine, whereas the structurally nearly identical chymotrypsin selects for large hydrophobic residues. Trypsin is synthesized as preproprotein

Figure 5. Schematic representation of the mechanism of inteinmediated protein splicing. Upon dimerization of N-intein and C-intein (1) the first residue of the newly formed catalytic core (a serine or cysteine) facilitates a nucleophilic N−O/S acyl shift (2) followed by transthioesterification mediated by the last residue in the C-extein (3). The C-terminal asparagine residue in the intein subsequently selfcyclizes into a succinimidyl group, cleaving the peptide bond between intein and extein and allowing an S/O−N shift to link both exteins via an amide bond (4).

intein is a protein domain of ∼135 residues inserted into a polypeptide that is responsible for catalyzing its own excision while simultaneously fusing the two flanking extein domains, similar to the splicing out of intron RNA from mRNA. To do so, the last residue of the intein domain must be asparagine, whereas the first residues of both the intein and C-terminal extein domain have to be either cysteine, serine, or threonine. First, the side chain of the N-terminal amino acid of the intein attacks the preceding peptide bond, leading to an amide to ester (Ser, Thr) or thioester (Cys) rearrangement, i.e., the Nterminal extein domain is loaded onto the inteins N-terminal side chain. Enabled by the close proximity of the inteins N- and C-terminus, the N-extein is subsequently transferred onto the 1142

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

side chains. As before, additional enzyme engineering is needed to fully convert, e.g., trypsin to elastase.93 So, despite nearly identical three-dimensional structures and substrate binding in highly similar antiparallel β-sheet conformation, these three enzymes show very different substrate preferences, highlighting the structural fine-tuning during protein evolution and the complexity of any structure−function analysis. 2.6. Threonine Peptidases

Threonine peptidases exploit an N-terminal threonine as central catalytic residue and nucleophile for peptide-bond hydrolysis, whereas no other amino acids are strictly conserved in the active site.94 Instead, the N-terminal amino group, or a water molecule bound between the α-amine and the threonine side chain, acts as a general base in the reaction by providing the proton necessary for hydrolysis. Together with peptidases that harness Ser or Cys in the very same N-terminal position, threonine peptidases are colloquially grouped as N-terminal nucleophile (Ntn) hydrolases. Importantly, all Ntn hydrolases have to be activated first to liberate the N-terminal nucleophile, either by simple N-terminal methionine excision as seen, e.g., in bacterial proteasomes (reviewed in ref 95) or by an intramolecular autoactivation mechanism observed in the eukaryotic counterpart, probably involving a catalytic Thr/ Asp/Lys triad.94,96−98 Undeniably, the proteasome (Figure 7a), which is a multisubunit complex comprising four stacked rings each

Figure 6. (a) Catalytic triad of human chymotrypsin (pdb code: 4h4f). The side chains of Ser216, His74, and Asp121 are shown as stick models (sequence numbering following Uniprot entry Q99895). (b) Catalytic triad of porcine chymotrypsin (pdb code: 1esa). The side chains of Ser214, His71, and Asp119 are shown as stick models (sequence numbering following Uniprot entry P00772). (c) Acylenzyme complex of porcine chymotrypsin (pdb code: 1esa). The side chains of Ser214, His71, and Asp119 are shown as stick models (sequence numbering following Uniprot entry P00772). All figures were created within PyMOL with carbon depicted in cyan, oxygen in red, nitrogen in blue, and sulfur in yellow.

including a 15- and 8-amino-acid long signal peptide and propeptide, respectively. Notably, whereas trypsin is rendered inactive yet highly stable at low pH, at its pH optimum of ∼8, calcium binding ensures maximum activity and decreased trypsin autodegradation. The propeptide harbors a conserved recognition motif for the serine peptidase enteropeptidase (also called enterokinase), which cleaves after (Asp)4-Lys and thus unveils the N-terminus of mature trypsin, in which the catalytic triad bridges two β-barrel domains. Whereas the guanidinium group of arginine-containing substrates is directly recognized by the carboxyl group of Asp189 at the bottom of the S1 binding pocket, the shorter lysine side chains are encompassed via a bridging water molecule, partly explaining the ∼6-fold preference for arginine over lysine.90 In chymotrypsin-like peptidases, the equivalent position is taken by serine, but neither the mutation of Asp to Ser in trypsin nor that from Ser to Asp in chymotrypsin fully converts the substrate specificity despite the fact that nearly all other residues of the S1 subsite are identical. Additional surface loops that are not even part of the extended substrate binding site have to be reciprocally engineered to render the respective enzyme specificity.91,92 In elastase and elastase-like serine peptidases, the S1 substrate pocket represents only a shallow depression, as glycine residues that build the pocket in chymotrypsin and trypsin are replaced by larger, hydrophobic residues that fill the pocket and instead provide interaction with small hydrophobic

Figure 7. (a) Active-site view of the threonine peptidase β7 subunit of the human proteasome (pdb code: 5le5). The side chains of Thr44 and Lys76 are shown as stick models (sequence numbering following Uniprot entry Q99436). (b) Active-site view of the β7-subunit of the human proteasome complexed with inhibitor bortezomib (pdb code: 5lf3). The side chains of Thr44, Gly90, and Lys76 are shown as stick models (sequence numbering following Uniprot entry Q99436). All figures were created within PyMOL with carbon depicted in cyan, oxygen in red, nitrogen in blue, and sulfur in yellow.

containing six (in bacteria) or seven (in archaea and eukaryotes) subunits as core particle, represents the most prominent threonine peptidase. Although the subunits of the outer rings (α-subunits) share the protein fold with the two inner rings (β-subunits), only the latter can act as proteases within the proteasome’s αββα-architecture. Additionally, in eukaryotic proteasomes only 6 of the 14 homologues are proteolytically active, with always two sites showing chymotryptic, tryptic, and glutamyl endopeptidase activity, thus cleaving preferentially after large hydrophobic, basic, and acidic residues, respectively. On the contrary, archaeal proteasomes are composed of 14 identical active β-subunits. In each active subunit, a conserved lysine residue ∼30 amino acids downstream of the N-terminal threonine is suggested to assist 1143

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

substrate hydrolysis by decreasing the pKa of the amino terminus through electrostatic interactions (Figure 7a and b). Additionally, a conserved glutamate roughly halfway between the N-terminal threonine and the aforementioned lysine seems to be involved in the catalytic mechanism. After the nucleophilic attack, the main-chain amide of a conserved glycine residue stabilizes the tetrahedral transition state. After breakdown of the intermediate, the C-terminal part of the substrate is released, whereas the N-terminal part remains covalently bound to the threonine side chain as an acyl− enzyme complex. A subsequent nucleophilic attack by a water molecule closes the reaction circle by releasing the N-terminal cleavage product and regenerating the free protease.99−101 Notably, whereas the proteasome is highly conserved among eukaryotes, it seems sufficiently diverse to allow specific targeting of, e.g., disease-causing protozoan parasites, such as Plasmodium, Trypanosoma, or Leishmania, possibly opening a new road for, e.g., antimalarial agents.102,103

Figure 8. (a) Catalytic triad of the Carica papaya cysteine peptidase papain (pdb code: 1ppn). The side chains of Cys158, His292, and Asn308 are shown as stick models (sequence numbering following Uniprot entry P00784). (b) Catalytic triad of the Carica papaya cysteine peptidase papain complexed with inhibitor E64 (pdb code: 1pe6). The side chains of Cys158, His292, and Asn308 are shown as stick models (sequence numbering following Uniprot entry P00784). All figures were created within PyMOL with carbon depicted in cyan, oxygen in red, nitrogen in blue, and sulfur in yellow.

2.7. Cysteine Peptidases

Cysteine peptidases are found through all kingdoms of life. Although evolutionary highly diverse, cysteine peptidases all share a common catalytic mechanism that involves a nucleophilic attack by a cysteine side chain, typically from a catalytic triad or dyad, for substrate hydrolysis. As the active site thiol group is highly prone to oxidation, cysteine peptidases are most active in a slightly reducing environment and/or at slightly acidic pH (4.0−6.5). In mammals, cytosolic calpains and lysosomal cathepsins are probably the most well-known family members. Nevertheless, the most-studied cysteine peptidase is papain from the tropical melon tree (Carica papaya), which is also considered as the archetype of cysteine proteinases. Papain is synthesized as a nonglycosylated preproprotein of 345 amino acids with an 18- and 115amino-acid long signal peptide and propeptide, respectively. The mature protein folds into a globular domain stabilized by three disulfide bridges, with two subdomains delimiting a surface-exposed active-site groove on the top, when shown in standard orientation. Although the left subdomain is comprised nearly entirely by the N-terminal part of the enzyme, the Nterminus is located on the distal right side, whereas the right subdomain ends in a strand that extends proximal into the left domain. The catalytic residues Cys158 (left domain; corresponds to Cys25 in pepsin numbering starting after prepropeptide removal) and His292 (right domain; His159) of the catalytic dyad are located on opposite sides of the interface (Figure 8a and b). Several additional residues in papain are known to be important for substrate turnover but are considered as noncatalytic, as they are not essential for hydrolysis; e.g., Asn308 (Asn175) resembles the structural equivalent of aspartate in the catalytic triad of serine peptidases (Ser/His/ Asp) but with much lower contribution to the catalytic efficiency. Mutation to glutamine only leads to a 3.4-fold drop in kcat/KM,104 whereas mutating the aspartate in the catalytic triad of trypsin results in a 104-fold decrease of enzymatic activity.105 In the case of papain-like proteinases, the catalytic center is complemented by the aforementioned Asn308, but other residues such as aspartate or glutamate can be found in the equivalent position, especially in bacteria and viruses. The main role of the third residue during substrate hydrolysis is to orient the imidazole ring of the catalytic dyad, ultimately allowing the histidine to assist in cysteine polar-

ization and deprotonation. It is still under debate whether the histidine also acts as proton acceptor or if the imidazole is per se protonated, as well as if the cysteine thiol becomes ionized only upon substrate binding like in serine peptidases or if cysteine proteases are a priori active. Hydrolysis is initiated by the nucleophilic attack of the activated thiolate anion on the carbonyl carbon of the substrate scissile bond, resulting in the first tetrahedral transition state where the substrate is covalently attached to the cysteine, and the reaction intermediate is stabilized by the oxyanion hole of the enzyme. Rotation of the imidazole ring enables a subsequent proton transfer to the amide nitrogen of the attacked peptide bond, releasing the C-terminal fragment of the cleaved polypeptide, whereas the N-terminal part remains covalently linked via its C-terminus. Next, a water molecule, polarized by the active-site histidine, attacks the carbonyl carbon of the thioester-bonded acyl−enzyme complex, resulting in the second tetrahedral intermediate, which ultimately decomposes and releases the N-terminal portion of the substrate, thereby regenerating the peptidase.106−109 Lysosomal cysteine cathepsins represent a highly studied group of peptidases within the papain superfamily of cysteine proteases. In humans, there are 11 representatives known, namely, cathepsin B, C, F, H, K, L, O, S, V, X, and W. Lysosomes contain a number of lipases, amylases, nucleases, and more than 50 different proteases, mainly aspartic, serine, and cysteine peptidases. In addition to, e.g., the aspartic protease cathepsin D, the serine protease cathepsin A, and the cysteine protease legumain, all kinds of cysteine cathepsins play a key role in lysosomes.110−113 One example is cathepsin B, which displays both endopeptidase and carboxydipeptidase activities; after initial cleavage within the substrate, the cleaved polypeptide chain is degraded in a processive manner by removing dipeptides from the substrate C-terminus. Notably, compared to other papain-like enzymes, cathepsin B shows a rather broad specificity in the P2 position and accepts, e.g., even arginine next to the more commonly found large hydrophobic residues, alleviating its degradative power.111,114 In contrast, cathepsin H functions as either an endopeptidase or aminopeptidase, depending on the presence of an 8-aminoacid long minichain originating from the propeptide. Cathepsin 1144

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

a process named N-terminal methionine excision (NME, reviewed in refs 125 and 126). Translation and subsequent proper folding of proteins is an imperfect process. Although the role of the N-terminal methionine and the functional relevance of its proteolytic excision is not fully understood, current dogma dictates that these processes control protein stability and half-life in the context of the N-end rule that determines protein fate.8,127,128 For cytosolic proteins, stability and quality control is determined by the N-terminal residue after excision of the initiator methionine. Most residues (primary, secondary, and tertiary destabilizing residues) trigger ubiquitination and proteasomal degradation if the protein is incorrectly folded with the N-terminal residue exposed. Acetylated N-terminal residues can also destabilize proteins where the N-terminal residue is either M, V, A, S, T, C, P, or G, termed secondary destabilizing residues. Lange et al.8 investigated protein stability to determine if the neo-N-terminal residue exposed after protease cleavage followed the N-end rule. While following the N-end rule, post translational neo-N-terminal acetylation unexpectedly stabilized cleaved cytosolic proteins with primary destabilizing residues (R, K, L, H, F, Y, W, and I). Thus, this differed from the destabilizing role for acetylated N-termini having secondary destabilizing residues.8 For proteins moving through the ER, a quality-control system termed ER (endoplasmic reticulum)-associated protein degradation (ERAD) is in check. Misfolded proteins are recognized by the ERAD machinery, likely due to interactions of specific chaperones with exposed hydrophobic patches, incomplete disulfide bridges, and improper glycosylation, leading to lysine-48 polyubiquitination of the misfolded protein by ubiquitin E3 ligases such as carboxy terminus of Hsp70interacting protein (CHIP), expulsion of the protein from the ER lumen, and subsequent cytosolic proteolytic degradation of the misfolded protein by the proteasome.129−131

H lacking the minichain shows distinct endopeptidase activity. However, when the minichain is bound, it occludes the nonprimed subsides from S2 backward while providing a negative charge through its C-terminus to bind the substrate Nterminus, abolishing endopeptidase activity and emanating in an elevated aminopeptidase activity.115,116 Interestingly, in cathepsin X, a unique 3-amino-acid long miniloop-insertion (Ile-Pro-Gln) between the oxyanion hole glutamine and the active-site cysteine confers carboxypeptidase activity in a similar manner. The insertion is preceded by a histidine residue (glycine in papain), which occupies the S2′ subsite and can anchor the substrate C-terminus in two different side-chain conformations, depending on carboxymonopeptidase or carboxydipeptidase activity.117−119 Overall, lysosomal cysteine cathepsins developed several modus operandi that reduce the number of substrate-binding sites from either side and provide an electrostatic anchor for the respective substrate terminus, facilitating permanent or temporary exopeptidase activity. Adjuvant to their specialization as exo- or endopeptidases, localization and pH dependency are other unique features of individual cysteine cathepsins; e.g., cathepsins B, C, H, L, and O are ubiquitously expressed in nearly all tissues and cells, whereas cathepsins S, K, V, F, X, and W show restricted organ distribution, suggesting housekeeping or specific functions, respectively. Additionally, cathepsin H is irreversibly inactivated above pH 7.0; its inactive precursor remains stable, and cathepsin S even shows activity and actually has its pH optimum somewhere between pH 6.0 and 7.5. Importantly, like many other proteases, cysteine cathepsins are expressed with N-terminal signal-peptide and propeptide sequences responsible for protein sorting into specific cellular compartments or extracellular space and enzyme inhibition, respectively. In the inactive zymogen form, the prodomain forms a compact minidomain, where a stretched and flexible segment covers and tightly interacts with the active site cleft in backward orientation, occupying several substrate-binding pockets while also preventing hydrolysis. Activation occurs in acidic milieu, at pH 5.0 or below, where the prodomain switches from the closed conformation favored at neutral or slightly acidic pH to an open conformation due to ionic repulsion. The physical displacement yields a low-active cathepsin, which is sufficient to activate another zymogen in trans, thus triggering an autocatalytic activation cascade.109,120 Other prominent cysteine peptidases that are beyond the scope of this Review are calpain-1 and -2,121 the various caspases,122 the lysosomal protease legumain,123 and the paracaspase mucosa-associated lymphoid tissue lymphoma translocation gene 1 (MALT1),124 just to mention a few.

3.2. Signal Peptide Removal

Localization and transport of proteins in the secretory pathway is regulated by intrinsic sequences within the N-terminal region of these proteins termed signal peptides; short sequences are recognized during translation of the nascent protein by the signal-recognition particle in the ER that mediates translocation into the ER lumen,132 allowing subsequent secretion of the proteins via vesicular transport mechanisms. Alternatively, this process can occur by posttranslational processes.133 During translocation, signal peptide sequences are cleaved from the preprotein by a family of serine proteases named signal peptidases (SPase), e.g., type-1 SPase in bacteria and the ER signal peptidase complex (SPC) in eukaryotes for proper localization (reviewed in ref 134). This process is further controlled by proteolytic cleavage of the signal peptide remnant by a family of intramembrane cleaving aspartyl proteases called signal peptide peptidases (SPPs135) by regulated intramembrane proteolysis (RIP, reviewed in ref 136). Proteins targeted for the mitochondrion contain a variant signal peptide named mitochondrial matrix peptide (unfortunately also designated as MMP), an 8−58-amino-acid long sequence rich in positively charged amino acids that targets the preprotein to the mitochondrion and is subsequently cleaved by the zinc metalloprotease mitochondrial processing peptidase (MPP).137

3. BIOLOGICAL PROCESSES REGULATED BY PROTEOLYTIC CLEAVAGE 3.1. Maturation of Translated Proteins

Various proteolytic cleavage events function as a control mechanism during translation and localization of proteins in the cell. In eukaryotic protein synthesis, mRNA generally starts with an AUG codon, and therefore the translated protein sequence will almost exclusively start with a methionine residue. Depending on the protein and the organism, this initiator methionine either can remain on the new protein or, in a large fraction of newly translated proteins, be rapidly removed though proteolytic cleavage by methionine aminopeptidases in

3.3. Proprotein Convertases

Many proteins such as proteolytic enzymes, growth factors, and peptide hormones are expressed and translated as inactive 1145

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

processing of pro-opiomelanocortin (POMC) into adrenocorticotropic hormone (ACTH) or alpha-melanocyte-stimulating hormone (αMSH).162 Furthermore, PC2 processes a wide variety of neuroendocrine precursor proteins such as proenkephalin, prosomatostatin, neurotensin, neuromedin N, prodynorphin, and nociceptin.163 PC1 knockout results in severe growth defects that are likely caused by disrupted processing of insulin growth factor-1 and growth hormonereleasing hormone (GHRH).138 PC4 is selectively expressed in germ cells in testes, ovaries, and placenta and plays an important role in maintaining fertility and in reproduction, likely through cleavage of pituitary adenylate cyclase-activating polypeptide (PACAP) in the testes and the activation of various ADAM proteases.138,164,165 SKI-1 or S1P is a widely expressed protease with notably different cleavage-motif preference than the other members of the PC family because it hydrolyzes substrates after any amino acid other than cysteine and proline.166,167 SKI-1 is an activator of membrane-bound transcription factors in cholesterol metabolism and homeostasis (sterol regulatory element-binding protein (SREBP)-1 and -2)168 and activating transcription factor 6 (ATF6) in the endoplasmic reticulum (ER) stress response.169 Besides this role in transcription-factor processing, SKI-1 cleaves and activates viral proteins such as Lassa virus glycoprotein, making the enzyme a potential target for antiviral therapeutics.138

precursor forms that require targeted, specific proteolytic cleavage events during their transition through the secretory pathway to become biologically active. In humans, this process is regulated by a family of seven proprotein convertase (PC) enzymes (furin, PC1, -2, -4, -5, and -7, and paired basic amino acid cleaving enzyme 4 (PACE4)) that hydrolyze peptide bonds at paired basic amino acid residues in substrate precursor proteins, the subtilisin kexin isozyme 1 (SKI-1) that cleaves after nonbasic residues, and proprotein convertase subtilisin/ kexin 9 (PCSK9) that is only known to cleave itself in an autocatalytic activation step.138,139 Depending on the substrate and its activating PC, cleavage and activation can occur at any location in the secretory pathway from the trans-Golgi network, endosomes, and cell surface and, in the case of activation of polypeptide hormones, in secretory granules. PCs cleaves precursor proteins by limited proteolysis within a sequence motif [R/K]-(X)0,2,4,6-[R/K]↓P′1, with a strong preference for arginine in the P1 position.140 Because the cleavage-site specificity of most PCs is so similar, and there is considerable redundancy in substrate cleavage in vitro, localization is likely the main determinant of substrate specificity.141,142 In 1990, furin or PCSK3 was the first identified mammalian PC as a homologue of the yeast convertase kexin.143 Furin is a membrane-bound protease that is subject to endosomal trafficking between the cell surface and the trans-Golgi network, where it processes numerous protein precursors by cleavage at dibasic motifs such as BXBB or BBXB (where B is a basic amino acid residue).141 Furin has been reported to be secreted into the extracellular space.144 Interestingly, some potent human pathogens have evolved to contain furin-cleavage sites in their virulence factors; for example, Anthrax lethal factor and avian influenza virus hemeagglutinin rely on host furin for activation.141,145,146 Proteolytic cleavage by furin is essential to lifefurin knockout mice are not viable due to severe cardiovascular defects during embryogenesis that are likely a result of insufficient activation of transforming growth factor beta (TGFβ).147,148 Processing of TGFβ1 by furin is also required in adaptive immunity; T cells lacking furin produce less mature TGFβ1 and are dysfunctional.149 Furin and PACE4 are capable of activation of multiple membrane-associated metalloproteases in the metzincin superfamily such as MMPs, MT-MMPs and ADAMs, by selective proteolysis of motifs within the prodomain, leading to exposure of the catalytic cleft and accessibility for substrates.150−152 Cellular adhesion is regulated by furin and furin-like (PC5) PCs through the activation of pro-integrins that are required for interaction of the cell with the extracellular matrix, and furin activity is high during situations of tissue remodeling, such as in atherosclerotic lesions.153 There is considerable apparent redundancy between furin and related PCs such as PC5, PC7, and PACE4, although some substrates appear to be preferentially cleaved by one over the other enzymes.138 PC1 and -2 are, as indicated by their original names pituitary or prohormone convertase 1 and 2, respectively, important activators of endocrine and neural polypeptide hormones,154−157 which are localized in granules in hormoneproducing cells and activate their substrates in the acidic regulated secretory pathway.158,159 There is considerable overlap in substrate specificity considering knockout mice show defects but are viable, and in unison these two proteases regulate vital processes, such as blood sugar homeostasis through cleavage of pro-insulin and glucagon,160,161 and modulate many hormonal signaling events through site-specific

3.4. Processing by Carboxypeptidases

Proteases are divided into their ability to cleave proteins from within a sequence (endoproteases) or the ends of a polypeptide sequence (exopeptidases). Exopeptidases can be further classified if they cleave protein chains at their N terminus (aminopeptidases) or if they cleave from the C terminus (carboxypeptidases).1 In addition, peptidases and proteases can cleave single or two (or more) amino acids from the end of a protein or peptide such as in the case of peptidyl dipeptidases and oligoendopeptidases, but this section will focus on single amino acid hydrolysis in the case strictly for carboxypeptidases. Peptidases and proteases can be classified by their mechanisms of peptide hydrolysis, namely, if the active-site residues essential for hydrolysis are serine, threonine, cysteine, aspartate, and glutamate or if there is the necessity of a metal ion bound to the enzyme in the case of metalloproteases.170 The two largest groups for carboxypeptidases are serine carboxypeptidases and metallocarboxypeptidases.171,172 Of these two groups, the metallocarboxypeptidase family has more members and is better characterized biochemically and structurally than serine carboxypeptidases. Carboxypeptidases can recognize specific C-terminal residues that can be recognized by substrate-binding pockets and subsequently cleaved. Serine carboxypeptidases often have pH optima in the acidic range as opposed to most serine proteases and metallocarboxypeptidases, which are active in neutral to mildly basic ranges.172 While carboxypeptidases were originally discovered in the context of protein digestion as in the case of bovine carboxypeptidase A, the role of carboxypeptidases has now been shown to be more exquisite. There are 34 carboxypeptidases in humans, which highlights the importance of their biological role.173 Metallocarboxypeptidases perform peptide hydrolysis through acid−base mechanisms mediated by a metal ion with a solvent molecule to attack the target peptide bond.49 In contrast, serine carboxypeptidases were originally characterized 1146

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

range of 5.0−5.5, which renders it inactive in the Golgi body where it is synthesized, preventing it from acting on other cellular proteins.1,177 It is proposed that this pH range avails carboxypeptidase E activity in secretory vesicles, which are packed with peptide hormones that require processing.178 Many peptide hormones require C-terminal trimming after endoprotease processing as well as in many cases peptide amidation to render the hormone biologically active.179 A widely known substrate of carboxypeptidase E is insulin,180 a key regulator of glucose intake in humans, which when dysregulated causes diabetes. Insulin is synthesized as a prepro form and upon endoprotease processing (prohormone convertases 1 and 2) generates two pairs of C-terminal basic residues: Lys-64/Arg-65 and Arg-31/Arg-32.181 These basic residues are substrates of carboxypeptidase E and when removed result in a mature molecule of insulin. A mutation on Ser-202 to proline on carboxypeptidase E abrogates the catalytic activity of the enzyme (also resulting in carboxypeptidase E being degraded)182 and results in severe impairment of maturation of insulin (10−50% of correct processing compared to wild-type), which in turn manifests as obesity and diabetic pathologies in mice models.1,183 It is clear that carboxypeptidases have profound biological roles across a variety of different organisms, with functions that appear concerned with digestion of food and precise maturation of peptide hormones.

to be serine proteases due to being inhibited by diisopropylfluorophosphate (DPF)172 and later confirmed by structural studies to attack the scissile bond with a catalytic triad (Ser-HisAsp).1 Metallocarboxypeptidases are structurally split into two families that have been extensively reviewed by Gomis-Rüth:171 cowrins and funnelins. The cowrin subset of metallocarboxypeptidases is characterized by having a long and narrow groove across the length of the protease with the active site halfway through the groove. Such a groove allows for the insertion of unfolded oligopeptide sequences that can be cleaved at the last residue catalyzed by the metal ion (typically zinc). Cowrins have a consensus structural motif involved in catalysis, HExxH+ExxS/G+H+Y/R+Y, and their catalytic domains are 500−700 residues in size. Funnelins differ from cowrins structurally in that funnelins have a shallow active cleft at the bottom of a funnel-like cavity and have the potential to hydrolyze folded proteins. Funnelins also possess a consensus motif important for their catalysis: HxxE+R+NR+H+Y+E with a catalytic domain ∼300 residues large. Examples of cowrins and funnelins are neurolysin and carboxypeptidase A, respectively. As mentioned above, the role of carboxypeptidases was initially described in the context of bovine pancreatic carboxypeptidase A, one of the most studied funnelin-type carboxypeptidases and the first metalloprotease described (utilizing a zinc ion for catalysis).174 Carboxypeptidase A has selectivity toward C-terminal residues that are aliphatic or aromatic (hence why it was termed carboxypeptidase Afor aromatic and aliphatic). Carboxypeptidase B was also described in its role for digestive biology for its specificity toward basic residues at the C-termini of polypeptide chains. It is important to note that carboxypeptidases A and B preferably act upon peptides as opposed to large protein structures. Thus, they necessitate the action of other endoproteases that can cleave proteins to generate smaller fragments with C-terminal residues that are recognizable by the carboxypeptidase and release the C-terminal amino acid, which then can be absorbed by the organism in the case of digestive enzymes.1 Trypsin is the most notorious and well-studied endoprotease that acts concertedly with carboxypeptidase B.1 Trypsin can activate procarboxypeptidase B into its active form by releasing its propeptide,175 allowing secreted carboxypeptidase B to act upon extracellular proteins (ingested in the digestive system) as opposed to intracellular proteins, where they can cause cellular damage if not controlled. Carboxypeptidase B can later cleave C-terminal lysines on tryptic peptides. Lysine is an essential amino acid in humans (namely, it cannot be synthesized by metabolic processes), and so the activity of carboxypeptidase B is essential for life. This role is mirrored by the activity of carboxypeptidase A on C-terminal aliphatic amino acids that are important for nutrition in concerted action with pepsin and chymotrypsin endoprotease activity.1 In addition to carboxypeptidases A and B, which have been described in great detail in the literature, carboxypeptidases have also evolved to have hormone-activating functions which consists of trimming small peptide sequences into biologically active forms. Of distinction is carboxypeptidase E, a metallocarboxypeptidase that has specificity toward C-terminal basic residues, much like carboxypeptidase B.176 Unlike carboxypeptidase B, however, the distribution of carboxypeptidase E is in tissues where many neurotransmitters and peptide hormones are produced such as the brain and pancreas.176 Furthermore, the activity of carboxypeptidase E is in the pH

3.5. Ectodomain Shedding

Proteins involved in intercellular signaling and interaction between cells are expressed and translated as membraneanchored proteoforms that are sensitive to proteolytic cleavage by an array of mainly membrane-bound proteolytic enzymes. Such ectodomain shedding modulates the function of the substrate protein, by activating precursor forms of, for example, cytokines and growth factors, directly and indirectly inhibiting signaling by release of membrane-bound receptors into the extracellular space (negating signaling directly and also functioning as soluble dominant negative proteins indirectly), or by affecting cell motility and binding by cleavage of proteins involved in cell−cell and cell−matrix interactions. Following cleavage, the cell-associated usually C-terminal protein remnant can be internalized by endocytosis or further processing by intramembrane proteases such as presenilin-1 in the γ-secretase complex,184 leading to either degradation or intracellular signaling events185,186 (Figure 9). Although, technically, most extracellular and membraneassociated proteolytic enzymes could mediate ectodomain shedding and especially soluble187 and membrane-anchored MMPs188,189 have been described to do so, the specialized proteases that this function is most attributed to are ADAMs. As discussed above, the ADAMs190,191 are multidomain, transmembrane glycoproteins (Figure 9) that are closely related to the reprolysins. These snake venom integrin binding proteins were discovered as fertilization factors when it was reported that binding and fusion of guinea pig ovum and sperm were partially mediated by PH-20 or -30 (fertilin α and β),192−195 proteins that were renamed ADAM1 and -2 after the discovery and cloning of more related proteins. Besides mediating cell−cell and cell−matrix interaction through selective integrin binding via their disintegrin domains (reviewed in ref 196), about half of mammalian ADAMs (12/22 in human) contain a consensus HExxHxxGxxH zincbinding motif in their metalloprotease domain, implicating them as potential active endopeptidases (reviewed in ref 197). 1147

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

Ectodomain shedding of growth factors involved in epidermal growth factor receptor (EGFR) signaling is largely attributed to ADAMs. ADAM12, -10, and -17 process TGFα and heparin-binding EGF (HB-EGF) factor to their soluble forms, leading to activation of EGF receptor signaling.185,213 Whereas physiological shedding of growth factors is required in processes such as wound healing, tissue repair, and angiogenesis, dysregulation of the shedding and subsequent aberrant EGFR activation can lead to unwanted tissue remodeling, such as in fibrosis,214 and uncontrolled cell proliferation and has therefore been extensively implicated in the pathogenesis of cancer.215,216 Several ADAMs can mediate cleavage and shedding of amyloid precursor protein (APP) and may play a role in the pathogenesis of Alzheimer’s disease. APP can be processed by three families of proteases with different biological outcomes: α-secretases such as ADAM9, -10, and -17 and meprin β;217 βsecretases BACE1 and -2; and presenilins in the γ-secretase complex.218,219 β-Secretase-mediated release of soluble APP and subsequent regulated intramembrane processing of the remnant by the γ-secretase complex yields an amyloidogenic Aβ fragment that is implicated in the formation of amyloid plaques causing Alzheimer’s disease, whereas shedding of APP by ADAMs is mediated by a proteolytic cleavage within the Aβ region itself and yields less toxic fragments.219 This protective role of a clan of metalloproteases illustrates an very real dilemma in pharmaceutical intervention targeting proteases; many proteases have either protective or detrimental roles in different diseases, making rational use of inhibitors challenging and open to side effects.220,221 Besides cleaving APP, ADAM10 is known to be the responsible sheddase for multiple other substrates in the central nervous system, implicating this protease as an important regulator of neural development. ADAM10 was found to be the mammalian homologue to the Drosophila protein kuzbanian (KUZ), which cleaves and activates the Notch receptor, mediates release of the Notch receptor ligand delta, and has a crucial function in neurological development in the fruit fly.222 In mice, knockout of Adam10 leads to lethality in an early embryonic stage due to major defects in neurological development that bear a striking resemblance to Notch knockout models.223,224 Interestingly, Notch signaling itself may act as a positive feedback loop via transcriptional activation of furin, which in turn leads to enhanced activation of ADAM10 (and others) by PC cleavage of the inhibitory prodomain.225 Because academic efforts in this area have frequently focused on elucidating the role and substrates of ADAM10 and -17, some other catalytically active ADAMs remain understudied to date, and further research is required to understand the intricate role of ADAM-mediated shedding in health and disease.

Figure 9. Schematic representation of ectodomain shedding mediated by ADAM proteases. Proteolytic cleavage mediated though the active ADAM metalloprotease domain in the membrane-bound substrate proximal to the membrane releases the extracellular fragment, leading to activation of membrane-anchored proproteins, abrogated interaction of the ADAM-expressing cell with neighboring cells or the extracellular matrix (ECM), or regulated intramembrane proteolysis of the C-terminal remnant. C, C-terminal domain; TM, transmembrane domain; EGF, epidermal growth factor-like domain; CYS, cysteine-rich domain; DIS, disintegrin domain; MP, metalloprotease domain.

As with other metzincins such as MMPs and ADAMTS’s, ADAMs can target and degrade extracellular matrix proteins such as collagens, laminin, and fibronectin,198,199 but this function is secondary to their ability to cleave membrane-bound proteins. The “sheddase” activity of ADAM proteases was first demonstrated when a disintegrin metalloprotease was described as the main responsible protease for the release of active mature tumor necrosis factor alpha (TNFα) from its membraneanchored precursor form.200,201 This process had already been attributed to metalloprotease activity earlier, although thought to be due to MMPs.202,203 The discovery of ADAM17 or TNFα converting enzyme (TACE) was the gateway to a multitude of studies investigating ADAM proteases and their substrates in health and disease, with ADAM17 receiving the most academic interest, perhaps due to its potential therapeutic potential in inflammatory diseases. The last two decades of research have yielded dozens of putative substrates such as cytokines and growth factors, their receptors, and adhesion molecules,204,205 although for many their physiological relevance remains unclear as many have been discovered via in vitro experiments and significant redundancy appears to exist for substrate shedding between individual ADAMs. ADAM proteases play a pivotal role in regulation of select cytokine signaling, as exemplified by the well-documented release of TNFα by ADAM17. Although multiple ADAMs (e.g., -9, -10, and -19) are capable of processing proTNFα in vitro,206 ADAM17 is the dominant factor in determining TNF response upon stimulation, as shown by a lack of soluble TNFα production by stimulation of ADAM17-deficient monocytes,200 whereas other ADAMs, like the phylogenetically most closely related ADAM10, may regulate constitutive production of soluble TNFα.207 ADAM sheddase activity is involved in interleukin-6 (IL6) signaling through release of membranebound IL6 receptor (IL6R) that is mediated by ADAM10 and -17.208−210 Proteolytic release of the soluble IL6R, as opposed to shedding of TNF superfamily receptors, is not inhibitory but rather an important agonistic mechanism in pro-inflammatory IL6 trans-signaling by binding to the membrane protein gp130 on neighboring cells.211,212

3.6. Modulation of Biochemical Pathways and Signaling through Proteolytic Cleavage

Targeted precision proteolysis is key in the initiation and regulation of many biochemical processes and pathways, e.g., the well-documented limited-cleavage events in the coagulation cascade226 (Figure 10) and complement activation cascade227 (Figure 11), but also has important roles in many other biological processes. Proteolysis by the caspase family of cysteine endopeptidases is a major regulator of cell death, through either activation of the immunologically silent apoptosis or necroptosis (reviewed in refs 228 and 229). Apoptosis can be initiated via two distinct 1148

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

Figure 10. Schematic representation of the coagulation cascade where a series of sequential proteolytic cleavages of serine protease zymogens activates them, allowing the cascade to progress to the next tier. The intrinsic pathway is activated upon exposure of blood to negatively charged exposed surfaces and depends on proteolytic activation of factors XII, XI, and IX. The extrinsic pathway is activated when vascular tissue damage leads to activation of factor VII. The pathways converge at proteolytic activation of the endopeptidase factor X (also known as Stuart-Prower factor), which is capable of converting prothrombin into its active enzyme form that subsequently activates factor XIII and forms a positive feedback loop into the intrinsic pathway. Thrombin also cleaves and activates fibrinogen, allowing activated factor XIII to form a cross-linked polyfibrin clot.

Figure 11. Schematic overview of the complement activation cascade, which is dependent on a series of sequential activations of, and by, serine endopeptidases. Activation can occur via three distinct pathways: in the classical pathway via formation of a C1 complex upon antibody−antigen ligation, in the mannose-binding lectin (MBL) pathway upon formation of an MBL-mannose-binding lectin associated serine protease (MASP) complex, or in the alternative pathway by direct activation of factor C3 via an activating surface. All pathways ultimately lead to the formation of a membrane attach complex (MAC) that leads to lysis of the compromised cell and removal of the pathogen.

pathways: the intrinsic pathway is activated upon cellular stress such as excessive reactive oxygen species formation or DNA damage, and the extrinsic pathway relies on activation of death receptor complex formation upon ligation of the appropriate receptor with, e.g., Fas or TNFα. In the intrinsic pathway, proapoptotic effectors in the Bcl2 family permeabilize the mitochondrial outer membrane, leading to release of proapoptotic factors such as cytochrome c and second mitochondrial activator of caspases (SMAC, or direct IAPbinding protein with low pI (DIABLO)), which subsequently enable formation of a large apoptosome complex that leads to dimerization and proteolytic activation of caspase-9. Caspase-9 activates the effector proteases caspase-3, -6, and -7 that cleave a large repertoire of substrates, leading to cell death. Under normal conditions, death receptors such as TNF-R are present in large signalosomes with antiapoptotic factors such as the inhibitors of apoptosis (IAP1 and -2) and linear ubiquitin chains polymerized by linear ubiquitin assembly complex (LUBAC) on RIP1 that keep apoptosis in check.230 In the absence of antiapoptotic mediators in the signalosome, ligation with a death-inducing ligand leads to homodimerization and proteolytic activation of caspase-8 that in turn activates effector caspases. Caspase-8 also cleaves Bcl-2 interacting domain (BID), leading to a cross-talk into the intrinsic pathway by causing release of cytochrome c from mitochondria. Signaling through receptor pathways that can cause apoptosis is a complex biological system, since depending on cofactors and stimuli many different outcomes are possible. Ironically, partial activation of caspase-8 by dimerization with the caspase-like

domain of a truncated form of c-FLIP (cellular FLICE-like inhibitory protein) upon ligation of the T cell receptor with an antigen does not cause apoptosis yet is required for T cell survival and proliferation.231 Caspases have a strong to even exclusive preference for an aspartate residue in the P1 position yet seem to rely on other additional features for substrate specificity as shown by limited proteolysis of substrates in vitro.232 Several large-scale profiling studies of substrates of caspase-mediated proteolysis during induced apoptosis have revealed a plethora of substrates,233−235 but the question remains which of these cleavage events are critical in initiation and sustenance of the pathway and which are mere victims of bystander proteolysis. Interestingly, a very in-depth analysis of the apoptosis interactome and its relationship with proteolytic disassembly revealed that caspase cleavage of substrates in early apoptosis occurs after interactome disassembly and is not the cause of breakup of essential protein/protein interactions.233 This was unexpected as the dogma prevailing in the field has been that caspase activity disassembles protein complexes, leading to termination of cellular functions. Thus, it appears that protein/protein interactions may mask caspase-cleavage sites or sterically preclude caspase access and that caspase cleavage occurs upon isolated proteins following dynamic protein exchange or complex remodeling. 1149

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

3.7. Precision Proteolysis in Immunity

deficient B cells from a human patient, we identified a specific cleavage by MALT1 in ranBP-type and C3HC4-type zinc finger-containing protein 1 (RBCK1), better known as hemeoxidized IRP2 ubiquitin ligase 1 (HOIL1).257 This cleavage was later confirmed by others.258,259 HOIL1 associates with HOIL1-interacting protein (HOIP) and Shank-associated RH domain-interacting protein (SHARPIN) to form the linear ubiquitin chain assembly complex that mediates formation of a special head-to-tail, linear polyubiquitin linkage that has important roles in the immune system.260−264 Upon cleavage by MALT1, the C-terminal domain of HOIL-1 dissociates from LUBAC, which abrogates the formation of linear polyubiquitin, thus terminating NF-κB activation, and desensitizes cells from reactivation for at least 2 h.257 This showed for the first time that MALT1 proteolytic activity acts as a late negative regulator of NF-κB at later time points, illustrating the complex and pleiotropic nature of this unusual protease in immune cell activation and again highlighting the difficulties in drug development from incompletely validated drug targets. MMPs are a class of protease that has revealed unpredicted and unexpected roles that span beyond their well-recognized matrix remodeling functions. As their names suggest, the first efforts to identify substrates of MMPs were strictly focused on extracellular matrix proteins such as the collagens.265 Recently, MMPs have been demonstrated to be immune regulators by modifying, through the activation or inactivation of chemokines, cytokines, cytokine-binding proteins, signaling pathways, protease inhibitors, and cell-surface receptors.266−268 MMP2 was one of the first examples of an MMP modulating the function of a chemokine through the proteolytic removal of the first four N-terminal amino acids (1QPVG4) of monocyte chemoattractant protein (MCP) 3 or CCL7 (chemokine (C−C motif) ligand 7); this inactivation of MCP3 resulted in an antagonistic effect for CC-receptors, leading to a loss of calcium signaling and chemotactic activity in vivo.269 Similarly, the first four amino acids (1KPVS4) of another chemokine, stromal cellderived factor 1 (SDF-1α and β/CXCL12), were cleaved by several MMPs (MMP1, -2, -3, -9, -13, and -14), resulting in a loss of signaling by its cognate receptor CXCR (C-X-C chemokine receptor type)-4.270 In fact, this cleavage switched receptor binding to the related CXCR3, again highlighting the importance of precision proteolysis as a potent posttranslational modification that profoundly regulates protein function.271 It is now known that MMPs cleave most if not all human and murine CXC and CC chemokines, thereby affecting virtually all aspects of leukocyte migration, chemotaxis, activation or inactivation, and signaling.272−274 Cytokines such as the interferons are critical during antiviral immunity.275 During coxsackievirus type B3 or respiratory syncytial virus infections, macrophage metalloelastase (MMP12) processes interferon-alpha (IFN-α) at the Cterminus at two distinct sites (157leu↓Gln158 and 161Leu↓ Arg162) that are critical for binding IFN-α receptor-2 and mediating downstream signaling.276 MMP12 was also demonstrated to have moonlighting roles by binding to the NFKBIA (NF-kappa-B inhibitor alpha) promoter (encoding for inhibitor of NF-kappa-B alpha or IκBα) to activate IκBα in order to mediate the secretion of IFN-α from viral-infected cells.276 Importantly, the intracellular and extracellular regulation of IFN-α is specific as MMP12 is unable to cleave other type I interferon cytokines such as IFN-β or bind its promoter.276 In addition to being implicated in antiviral immunity, MMP12 was demonstrated to play a protective role in peritonitis and

Proteolytic activity is key in the regulation and execution of several pathways in innate and adaptive immunity. Serine proteases contained in secretory vesicles in immune cells, such as granzymes in cytotoxic lymphocytes or neutrophil elastase in neutrophils, are involved in the immune response to viral (reviewed in ref 236) and bacterial pathogens (reviewed in ref 237). Although the exact details remain under debate,238 the consensus emerging is that, upon specific recognition of a viral antigen presented through the major histocompatibility complex on the surface of an infected cell by T cell receptors of cytotoxic CD8+ T cells or activation of NK cells through signals from infected cells, secretory vesicles containing granzyme proteases and the protein perforin are mobilized and secreted into the extracellular space. This leads to generation of perforin pores in the membrane of the infected cell large enough to enable translocation of secreted granzyme into the cytosol of the target cell.239,240 Internalized granzyme B (GrzB) subsequently activates classical apoptosis by a specific activating proteolytic cleavage of procaspase 8, which initiates the apoptotic cascade.241 GrzB further performs a specific cleavage in the protein BID, thereby directly activating mitochondrial apoptosis.242,243 This seemingly elegant system does not tell the full story though, and knockout experiments have demonstrated that other members of the granzyme family such as GrzA and GrzM are involved in the process.244,245 The exact importance of each protease and their substrates may be dependent on thus far unknown factors, is likely variable between species, and will require more study to elucidate the full mechanism. Activation of B and T lymphocytes after ligation of their surface-expressed antigen receptor with the corresponding antigen leads to a rapid mounting of a potent adaptive immune response. After activation, a series of specific phosphorylation and ubiquitination events leads to activation of the transcription factor nuclear factor kappa-B (NF-κB) and subsequent transcription and translation of a multitude of proteins involved in inflammation, lymphocyte differentiation, and proliferation.246−248 One of the signaling nodes transducing the antigen signal down to NF-κB activation consists of a trimeric complex of caspase recruitment domain family-11 (CARD11), B-cell lymphoma/leukemia-10 (BCL10), and MALT1, termed the CBM.249 MALT1, the only paracaspase in humans, is a cysteine protease homologue of the metacaspases found in yeast and plants124 that, in addition to the scaffolding function within the CBM complex, cleaves a very specific repertoire of substrates in lymphocytes that regulates the NF-κB response after antigen receptor ligation (reviewed in ref 250). Although MALT1 contains a consensus active site, it took until 2008 for the first substrates to be discovered, when MALT1 was shown to cleave BCL10 after Arg228 in the C-terminal region, a cleavage that is important during T cell activation although not required for NF-κB activation upon T cell receptor ligation.251 At the same time, a publication showed the negative regulator of NF-κB called A20 to be cleaved by MALT1 upon T cell activation, indicating MALT1 proteolytic activity enhances or fine-tunes NF-κB activation in lymphocytes by removing negative checkpoints.252 Since these discoveries, MALT1 has been shown to cleave a handful of other negative regulators of canonical NF-κB activation in activated lymphocytes such as the deubiquitinases A20252 and CYLD,253 as well as RelB,254 Regnase-1,255 and Roquin-1 and -2.256 Recently, using TAILS, an unbiased N-terminomics proteomics analysis of MALT11150

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

4.1. Monitoring Substrate Cleavage

rheumatoid arthritis by the inactivation of several complement components: C3, C3b, iC3b, C3a, C5a, C4, and C4b.277 The activation of C3 to C3b leads to the covalent binding and opsonization of target cell surfaces, resulting in cell lysis.278 The cleavage by MMP12 reduces the proinflammatory activation effect of complement, and C3b-tagged cells no longer induce cell lysis.277 Also, MMP12 generates more potent phagocytosis enhancers through the processing of C3b and iC3b. Other C3 or C5 cleavage products include C3a or C5a anaphylatoxins serving to reduce their chemoattractant activities277,279 and are implicated in the augmentation of vascular permeability. Notably, MMP12 cleavage of C3a (740Arg↓Ala741) and C5a (746Asp↓Met747) results in the reduction of calcium flux, leading to a reduction of neutrophil recruitment and macrophage infiltration.277 The protease-activated receptors (PARs) are an important class of G protein-coupled receptors and have been implicated in cancer and disorders of the cardiovascular, musculoskeletal, gastrointestinal, respiratory, and central nervous system.280 PAR1 is required to enhance growth and invasion of breast carcinoma cells in a murine xenograft model, and MMP1, secreted from proximal fibroblasts or platelets, is an agonist of PAR1 by inducing Ca2+ signaling, Rho-GTP/MAPK (rhoguanosine triphosphate/mitogen-activated protein kinase 1) pathways, and cell migration through the cleavage of a unique tethered ligand sequence (41PRSFLLRN47).281,282 Therefore, MMPs not only regulate the remodeling of the matrix by cleaving extracellular matrix (ECM) proteins but also influence (activation or inactivation) diverse immune responses through the targeted processing of multiple chemokines, cytokines, and cell-surface receptors. These specific examples highlight clearly the importance of precision proteolysis in regulating signaling and other bioactive proteins as a critical posttranslational modification. We point out again that proteolysis is one of the few posttranslational modifications regulated by cells after proteins are secreted from cells, and thus it is critical to have knowledge of the N- and Ctermini of proteins in order to correctly formulate hypotheses regarding the function of a protein and the system.

Perhaps the most straightforward approach to analysis of proteolytic activity is in direct in vitro monitoring of hydrolysis of (putative) substrates by the protease of interest. Cleavage can be visualized by gel electrophoresis or mass spectrometry, and synthetic quenched fluorogenic peptide probes based on fluorescence resonance energy transfer (FRET) or peptides labeled with gold nanoparticles that emit a specific signal only after hydrolysis of the peptide are now used for basic characterization of in vitro cleavage site specificity of the protease283−286 and are reviewed in ref 287. Several approaches using extended synthetic peptide libraries combined with sequencing or mass spectrometric analysis and bioinformatics analysis of the cleavage sites have been described.288−291 Proteomic identification of protease cleavage sites (PICS)292,293 utilizes peptide libraries generated though controlled proteolysis of endogenous bacterial or human proteomes with enzymes like trypsin, GluC, chymotrypsin, or LysargiNase. After chemical blocking of reactive thiol and amine groups, the peptide pool is incubated with the protease of interest, where hydrolysis of the peptides reveals a free α-amine group on the C-terminal peptide remnant that is amendable to labeling and enrichment. After LC-MS/MS (liquid chromatography-tandem mass spectrometry) identification of the enriched peptide remnants, the nonprime peptide sequence can be identified bioinformatically, leading to in-depth profiling of the cleavage site specificity of the studied proteases.32,292,294 Zymography is the umbrella term for methods visualizing protein substrate conversion by proteases (reviewed in ref 295). Zymography yields qualitative and, if performed within the limited linear range of the assay, also quantitative information about protease and proteolytic cleavage, and visualization can be performed in SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis) gels on tissue sections and in intact organisms. Gel zymography was originally performed by submitting samples containing the putative protease of interest to electrophoretic separation based on molecular weight using SDS-PAGE, followed by (partial) reactivating of the protease by removal of SDS by nonionic detergents and overlay of the gel with an indicator gel containing a cleavable substrate (e.g., fibrin-agar for visualization of active plasmin).296 Improved methods directly incorporated a protein substrate such as gelatin in the SDSPAGE gel matrix, where gelatinase (MMP2 and -9) activity leads to clear bands on a bright gel background after Coomassie staining, yielding exquisite sensitivity and information about different catalytically active proteoforms in the sample, provided proteolysis leads to sufficient cleavage of the substrate to small enough peptides to allow diffusion from the gel.297 Thus, one or two cleavages only will not be detected. Although several improvements and variations on the basic premise have been published, such as using in-gel zymography for testing inhibitory antibodies incorporated in the gel, 2-dimensional electrophoresis for enhanced resolution, and transfer zymography for monitoring multiple substrates from the same gel (reviewed extensively in ref 295), the main limitation of this technique lies in the fact that little quantitative and only limited qualitative information about the proteolytic potential in vivo is provided; noncovalent complexes of proteases and their endogenous inhibitors (e.g., gelatinases and tissue inhibitors of metalloproteases (TIMPs)) are dissociated during electrophoresis so the zymogram will inevitably overestimate the proteolytic activity present. Finally, novices are often confused

4. ANALYTICAL APPROACHES FOR PROTEASES AND PROTEOLYTIC PROCESSING Uncontrolled proteolytic cleavage has strong detrimental physiological effects, and therefore protease activity is subject to multilayered posttranslational regulation. Proteases are often expressed and translated as inactive zymogen proforms that require an activation step before being able to process substrates, either by allosteric or proteolytic removal of an inhibitory prodomain or by other posttranslational modifications such as phosphorylation or ubiquitination. Furthermore, colocalization of a protease and its substrate can be dependent on specific stimuli, and an arsenal of endogenous protease inhibitors keeps proteolysis in check. This strong regulation of proteolytic cleavage poses a challenge for analysis of proteases and their targets because mere quantification of expression levels or protein abundance does not necessarily reflect the biological impact of the protease. To overcome this shortcoming of traditional transcriptomic and proteomic techniques in studying proteolysis, the development of targeted and more functional analysis methods termed degradomics has been instrumental in enhancing our understanding of proteolytic cleavages in health and disease.2 1151

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

by the molecular weights of the protease forms in zymograms. As the analyses are performed and analyzed under nonreductive conditions, the presence of any disulfide bonds in the protease will cause the protease to electrophorese with a lower apparent molecular weight than that if reduced. Finally, protease prodomain electrophoretic bands will show activity on gels if SDS activates the enzyme or if the allosteric pressure exerted by the refolding procedure leads to protease autoactivation. However, this occurs in situ in the gel and so, despite a loss of the propeptide in gel, the proform is also revealed as an active protease, although it must be considered as an inactive zymogen form.

Figure 12. Schematic overview of the activity-based probe (ABP) approach for selective analysis of protease activity and active proteases. A proteinaceous sample of interest (lysate, living cells, and tissue) is incubated with an ABP that consists of a peptide-like linker region specific for the protease (class) of interest, an electrophilic reactive group (warhead) for covalent linkage of the probe to the active protease, and a reporter moiety for visualization (e.g., a fluorophore) or affinity enrichment (e.g., biotin). In the case of serine and cysteine peptidases, cleavage of the probe by the active protease leads to covalent derivatization of the catalytic residue with the probe, allowing activity-specific analysis after gel electrophoresis, or pull-down of the labeled protein and analysis by mass spectrometry.

4.2. Substrate- and Activity-Based Probes

Although in vitro cleavage monitoring of natural and synthetic substrates provides valuable kinetic information, visualization of protease activity under physiological conditions is required to infer and validate biology. To selectively analyze active proteases, several research groups have dedicated their efforts to synthesize and utilize selective, substrate-based probes (SBPs) and activity-based probes (ABPs) for different protease classes (reviewed in refs 15, 298, and 299). SBPs are used in in situ and in vivo zymography to selectively visualize proteolytic cleavage in tissue sections or living intact small animals using either directly fluorescent or the more popular and sensitive internally quenched fluorescent probes that give a positive or negative fluorescent signal upon cleavage of the probe by the protease of interest.300,301 Although SBPs are powerful tools to study proteolytic cleavage in tissue and even in vivo, redundancy between enzymes, which is observed in many protease classes, makes unambiguous interpretation of the acquired data difficult. To overcome this issue, ABPs have been developed that irreversibly modify the active protease and allow postlabeling identification of the actual enzyme involved, e.g., by mass spectrometry. Commonly, ABPs contain a substrate-like peptidomimetic backbone sequence optimized for recognition by the protease (class) of interest, an electrophile “warhead” that mediates a nucleophilic attack upon cleavage of the peptide backbone, which leads to covalent labeling of the active site, and a reporter group that can be a fluorophore for visualization of a group for affinity enrichment of the labeled protease such as biotin or via click chemistry302 (Figure 12). Visualization of labeled protease using SDS-PAGE, fluorescence microscopy, mass spectrometry, and in vivo imaging provides qualitative and quantitative information about the activity status that was present in vivo under the experimental conditions, making these probes extremely powerful tools to study proteolysis. The ABP approach has been used extensively and successfully to analyze activity of cysteine-, serine-, and threonine proteases. Because catalysis is mediated by an amino acid residue in the catalytic site that functions as a nucleophile, probes containing a directly reactive electrophile label the catalytic residue in a true activity-dependent manner. Many chemical structures have been used as electrophile in both irreversible inhibitors of these protease classes and ABPs, e.g., activated ketones, phosphonylating and sulfonylating groups, epoxides, and Michael acceptors (reviewed in ref 298). Besides obviously valuable application in fundamental characterization of activity of a specific protease, a promising application of ABPs lies in in vivo labeling strategies for in situ visualization of proteolytic activity in, e.g., inflammatory

diseases and cancer, where aberrant proteolytic activity contributes to the pathology.303−305 Designing ABPs for protease classes that are not dependent on active-site nucleophile amino acids for substrate hydrolysis (metalloproteases and aspartyl proteases) has proven more challenging. Covalent labeling is achieved by incorporating a photoactivatable cross-linker group into the probe that forms a reactive radical species and covalently attaches to catalytic-site residues upon irradiation of the sample with UV light. Several groups have employed this approach to profiling of metzincin proteases such as MMPs and ADAMs, but low labeling yields and nonspecific labeling seem to have made this approach largely redundant.135,306−308 The use of nanomolar noncovalent protease inhibitors affords one venue to visualize such proteases that lack a covalent acyl intermediate in catalysis, but these can only be used under native conditions, e.g., in vivo imaging, and not denaturing SDS-PAGE gels. The lower amounts of protein that can therefore be labeled necessitates highly sensitive detection systems, and this has driven the development of noncovalent probes for positron emission tomography of metalloproteases using the potent, low-nanomolar inhibitor Marimastat as the label-accepting probe.309,310 A variant method for enrichment of active proteases from complex samples also utilizes reversible inhibitors containing peptidomimetic recognition sequences for selectivity that are immobilized on a solid support or resin and can be applied as an affinity matrix to selectively enrich proteases in their active, noninhibited form. This method has been used to purify active forms of collagenases using a weak metalloprotease inhibitor with micromolar potency (Pro-Leu-Gly-hydroxamate),311 and optimization of the inhibitor backbone structure has yielded affinity materials that have been used to enrich and analyze MMPs and ADAMs with limited success in biological samples.312−317 4.3. Proteomics-Based Approaches for Substrate Discovery

As the function of a protease is determined by the substrates it cleaves, obtaining a comprehensive picture of these substrates under specific conditions is required to interpret the biological significance of proteolysis. Proteomics provides a toolkit for unbiased qualitative and quantitative profiling of proteins and 1152

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

these are minor and specific chemical labeling of the N-terminal α-amine without side reaction on ε-amino groups on lysine residues in the protein is far from trivial, although several groups have attempted this but with low fidelity (Figure 13). One approach for positive selection after labeling of Nterminal amines utilizes a subtiligase enzyme to selectively label

proteoforms present in a biological sample, but typically proteolytic cleavages are elusive to regular “shotgun” proteomics experiments. As with proteases themselves, mere quantification of the level of a substrate in a sample provides limited information; stable cleavage fragments will still be present, and the probability of identifying cleavage sites is statistically unlikely. To overcome this problem, methods selectively analyzing cleaved substrates have emerged and broadened our knowledge on substrates of proteolysis (reviewed in refs 318−320). Initial efforts used the additional information on molecular weight of proteoforms in the samples from the once-popular method of 2-dimensional gel electrophoresis, which separates proteins based on isoelectric point and size,321 followed by excision of spots of interest, in-gel digestion with trypsin, and mass spectrometric identification using either MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) or LC-electrospray ionization (ESI)-MS/MS.322 In a modified protocol termed 2-dimensional difference gel electrophoresis (2D-DIGE), protein samples corresponding to different experimental conditions (e.g., protease-treated and control) are differentially labeled with fluorescent dyes, combined, and resolved on a 2D polyacrylamide gel.323 Cleaved substrates are visualized as decreasing intensity of colored spots corresponding to the full-length proteins in the protease-treated sample and concurrent appearance of lower-molecular-weight fragments. Both spots can be excised and identified by mass spectrometry. The emergence of more powerful and faster mass spectrometers enabled in-depth coverage of proteins in excised slices of 1D gel electrophoresis of unfractionated samples, which made it possible to avoid the finicky 2D electrophoresis and laborious picking of (many) individual spots. Originally pioneered in 2002 by Guo et al.,324 this technique was greatly enhanced by developing easily accessible software and other technical advances leading to the development of PROTOMAP (protein topography and migration analysis platform). As before, this is a method that is based on standard SDS-PAGE resolution of protein samples harvested from cells where proteolytic activity occurred or was induced (e.g., apoptotic Jurkat cells) followed by LC-MS/MS identification and quantification by spectral counting of proteins extracted from multiple gel slices.325,326 By bioinformatically combining mass spectrometric identity information with molecular-size information obtained from the electrophoresis data, putative substrates can be identified. Although this method gives an indication of the approximate position in the protein sequence where the cleavage occurred, the limited resolution of mass in gel electrophoresis combined with the low probability of identifying the tryptic peptide of the cleavage site makes unambiguous identification of the exact point of cleavage unlikely. 4.3.1. N-terminal Proteomics. Selective analysis of cleaved protein substrates and cleavage sites is the most optimal scenario to elucidate the comprehensive degradome of a certain protease, but this approach faces similar challenges as methods for analysis of other posttranslational modifications (PTMs) such as phosphorylation and ubiquitination, which require specific enrichment of modified peptides or proteins for successful profiling. At first glance, labeling of cleaved proteins at their N-terminal α-amino group followed by positive selection of the labeled fraction seems like an attractive option, but, although differences in acid-dissociation constant exist,

Figure 13. Schematic overview of degradomics approaches using positive selection of N-terminal peptides. A proteome sample is acquired after activation of the protease of interest (e.g., upon induction of apoptosis). The sample consists of full-length, uncleaved proteins and fragments resulting from proteolysis containing neo-Nterminal protein ends. Positive selection of N-terminal peptides (neoand natural) can be achieved by enzymatic selective labeling of the Nterminal α-amine, attaching a tobacco etch virus (TEV)-linked biotin group (method I). After digestion of the proteins with trypsin and biotin-(strept-)avidin affinity purification, the N-terminal peptides are collected by TEV protease cleavage and are amendable to mass spectrometric identification, allowing exact identification of the proteolytic cleavage sites on a large scale. A chemical approach for positive selection utilizes the difference in pKa between N-terminal αamines and lysine side-chain ε-amines. By performing a guanidination at high pH, selective labeling of ε-amines is achieved (method II). This allows for chemical derivatization with biotin of the N-terminal αamines with standard N-hydroxysuccinimide chemistry, followed by enrichment of the N-terminal peptides using (strept-)avidin and mass spectrometric identification of proteolytic cleavage sites. Method III uses N-tris(2,4,6-trimethoxyphenyl)phosphonium acetyl (TMPP) as a reagent for selective labeling of the N-terminal α-amines, followed by tryptic digestion and sample fractionation by reversed-phase chromatography to enrich for TMPP-labeled peptides. A further enrichment of N-terminal peptides is achieved by affinity enrichment using an α-TMPP antibody, allowing subsequent mass spectrometric identification of cleavage sites. 1153

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

α-amino groups in a proteome sample with a biotinylated short peptide sequence containing a tobacco etch virus (TEV) protease cleavage site. After tryptic digestion of the sample, labeled peptides are enriched using avidin affinity extraction, and peptides are eluted from the avidin matrix by cleavage with TEV protease, followed by LC-MS/MS identification of the labeled peptides from which putative substrates can be inferred.234,327 Another approach uses the difference in pKa between α- and ε-amino groups to selectively block lysine sidechain amines by guanidination in the presence of omethylthiourea at high pH, followed by labeling of free Nterminal amines with a disulfide-containing biotinylation reagent, extraction of the biotinylated peptides with immobilized avidin, and elution of the peptides under reductive conditions.328 In a comparable approach using selective labeling of N-terminal α-amines at high pH with N-tris(2,4,6trimethoxyphenyl)phosphonium acetyl (TMPP), N-terminal peptides can be enriched using an antibody that recognizes TMPP-labeled peptides that can be extracted with magnetic beads prior to mass spectrometric analysis.329,330 Because of the technical difficulties of specifically labeling αamines and the inherent lower high-accuracy coverage obtained, the majority of methods for targeted analysis of protein N-termini now rely on the opposite principal: depletion of tryptic, internal peptides to effectively lead to negative enrichment of N-terminal peptides. Although several methods have been published in the past decade, the basic premise is always to chemically block all primary amines in the sample of interest at the protein level, either by N-hydroxysuccinimide (NHS)-activated reagents or reductive amination (also known as reductive alkylation) with an aldehyde using cyanoborohydride as a catalyst.331 By labeling at the whole-protein level by definition then, any N-terminal labeled peptide had to have been present and exposed in the sample prior to labeling. If the sample was handled with appropriate care using protease inhibitors and taking precautions to avoid proteolysis on sample preparation, then such labeled N-termini on peptides represent either mature protein N-termini or proteasegenerated neo-N-termini. After incubation with trypsin to digest the proteins into LC-MS amendable peptides, the newly generated internal (i.e., nonterminal), tryptic peptides that are not the N-terminome are removed from the sample, leading to negative enrichment of the blocked N-termini. One major advantage of this approach over the positiveselection methods lies not so much in the discovery of cleaved substrates but in that it allows for comprehensive analysis of the N-terminome in the sample, including naturally occurring Nterminal modifications such as methylation, acetylation, and fatty acid derivatization (Figure 14) that also have important biological implications on protein function.179 The chemical blocking step provides an ideal opportunity for isotopic differential labeling and multiplexing of samples (e.g., protease-treated versus control, or different patients, or different stimulation conditions) by using stable isotope-labeled reagents such as 13C- or deuterium-labeled acetate or formaldehyde, or isobaric proteomics tags such as iTRAQ (isobaric tags for relative and absolute quantitation) or TMTs (tandem mass tags). Another advantage is that the natural Nterminal peptides can be used to statistically determine the variation within and between experiments and hence used to statistically discriminate the neo-N-termini derived from the protease of interest from background proteolysis present in

Figure 14. Schematic overview of the most common biologically occurring chemical modification of the N-terminus (a) and C-terminus (b) of proteoforms.

both the protease-containing and control samples as a highratio peptide. Approaches that have been used to deplete the sample of internal peptides include immobilization of free amine containing peptides with carboxylic acid functionalized polystyrene beads332 or aldehyde-functionalized POROS beads using reductive amination,333 biotinylation of free αamines with NHS biotin and depletion with avidin beads,334,335 and phospho-tagging (PTAG) internal peptides via reductive amination with glyceraldehyde-3-phosphate, followed by chromatographic separation of phosphorylated internal peptides from the N-termini of interest using a titanium dioxide affinity column.336 However, an inherent problem with beadbased approaches is nonspecific binding of peptides to these media, leading to false positives or saturation of MS analyses. Although many methods have been published and have been shown to function with varying degrees of reliability and validation, two distinct workflows have become more widely used than othersterminal amine isotopic labeling of substrates (TAILS337,338), combined fractional diagonal chromatography (COFRADIC339,340), and the variant chargebased fractional diagonal chromatography (ChaFRADIC341−343) (Figure 15). COFRADIC relies on chemical modification of the hydrophobicity of internal peptides and removal by reversed-phase chromatography, leading to negative enrichment of N-terminal peptides. All amines are initially blocked by acetylation, followed by digestion of the sample with trypsin. The sample is subsequently prefractionated by strong cation-exchange (SCX) chromatography, which negatively enriches for the acetylated N-termini (both natural as chemically modified) due to their low affinity for the SCX material caused by the difference in dissociation constant compared to unblocked, internal peptides. The enriched peptides are fractionated further by C18 chromatography, and 1154

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

Figure 15. Schematic overview of the most commonly used degradomics approaches using negative selection of N-terminal peptides. Proteome samples under different conditions (e.g., control cells versus protease-deficient cells, or protease-overexpressing cells) are acquired that contain uncleaved, intact proteins and cleavage fragments resulting from proteolytic activity. All primary amines in the proteins are blocked by dimethylation or acetylation, using isotopically labeled reagents to differentially label proteins from different conditions, or in TAILS by isobaric mass spectrometry reagents such as iTRAQ or TMT. After combining the samples and digestion with trypsin, TAILS (left column) uses a soluble hyperbranched 1155

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

Figure 15. continued polyglycerol polymer conjugated with aldehyde groups (HPG-ALD) for depletion of internal tryptic peptides that bind via their unblocked α-amino groups by reductive amination. The polymer and bound internal peptides are subsequently removed via molecular weight cutoff filter centrifugation, allowing for negative enrichment of blocked N-terminal peptides and mass spectrometric analysis. The isotopic or isobaric tagging of the individual samples allows for relative quantification of abundance of the N-terminal peptide, enabling the analysis of more subtle differences in proteolytic cleavage compared to only “on/off” approaches. COFRADIC (center column) relies on a series of chromatographic fractionations and chromatographic depletion of internal (i.e., unlabeled) tryptic peptides after derivatization with a hydrophobic reagent (TNBS). The extensive fractionation yields samples with a high enrichment for N-terminal peptides but with very large fraction numbers to be analyzed by mass spectrometry. ChaFRADIC (right column) is a COFRADIC variant that relies on the charge shift of nonterminal, tryptic peptides upon chemical acetylation that allows removal by SCX chromatography and negative enrichment of N-terminal peptides.

relies on the digestion of entire lysates with a site-specific enzyme, commonly trypsin, and then fractionating the resulting peptides prior to injection into the mass spectrometer and subsequent identification. While there are significant advantages to fractionation at the peptide level and the identification of those peptides, there is loss of protein context.350 Isoforms and proteolytically generated protein fragments are usually lost in the vast numbers of internal tryptic peptides due to the fact that the peptides that can differentiate isoforms or protein fragments are fewer in abundance and often have less amenable ionization/fragmentation properties. Trypsin is commonly used due to its strict substrate specificity, which generates Cterminal arginines and lysines, both basic residues. The presence of these amino acids confers tryptic peptides excellent fragmentation patterns, as it will preferentially generate stable yions as opposed to lesser stable b-ions when using collisioninduced dissociation (CID).351 Peptides derived from Cterminal regions of proteins are not always fully tryptic. Over 80% of proteins annotated in the human proteome (SwissProt) do not have a C-terminal basic residue, making them semitryptic if digested with trypsin. Consequently, identification of C-terminal peptides is a particular challenge as Cterminal peptides, while perhaps being the most specific sequences to a protein,352 (a) are in significantly less abundance than internally generated fully tryptic peptides and (b) have impaired fragmentation properties relative to fully tryptic peptides. To overcome the second issue listed here, new enzymes have been introduced to proteomic workflows in order to identify C-terminal peptides. Namely, the proteases Lys-N353 and LysargiNase40 are now available to digest proteomes and enhance C-terminal peptide identification. LysargiNase was demonstrated to improve the identification of C-terminal peptides in a human breast cancer model. As opposed to trypsin, Lys-N and LysargiNase both generate a positively charged N-terminus, which generates b- and a-ion ladders, which greatly assist in peptide identification by database search algorithms.351 Despite the advances provided by Lys-N and lysargiNase as proteome-digestion alternatives, the caveat of abundant internal peptides still remains, and similarly to other post-translationally modified peptides, Cterminal peptides require enrichment for in-depth profiling. As opposed to other post-translational modifications where a specific chemical group is added at functional sites on peptides and proteins, proteolytic processing does not have a single chemical group that allows for affinity purification. While there are methods to enrich N-terminal peptides, the chemistry for isolating C-terminal peptides is not as straightforward. COFRADIC, discussed earlier, can be adapted for the isolation of C-terminal peptides using a combination of metabolic labeling, acetylation, oxidation, and strong cation exchange.340 For cell cultures and biological systems that are not amenable

each fraction is labeled with 2,4,6-trinitrobenzenesulfonic acid (TNBS), which increases the hydrophobicity of the remaining internal peptides, allowing their efficient removal through reversed-phase fractionation. In general, COFRADIC leads to high numbers of samples due to the extensive sequential fractionation steps. A recent variant of COFRADIC termed ChaFRADIC simplifies the approach and massively reduces the amount of sample required for analysis to the low-microgram range. ChaFRADIC utilizes an optimized SCX fractionation to separate peptide fractions based on their charge state, followed by acetylation of internal, tryptic peptides, which reduces the charge state. During a second SCX fractionation on each of the original samples, nonterminal peptides will display a shifted retention profile, and by only selecting the fraction that did not exhibit charge shift, enrichment of blocked N-terminal peptides is achieved.342 On the other hand, TAILS is much simpler and faster and circumvents the high peptide losses inherent to chromatography-based approaches by the development of a unique ∼100 kDa aldehyde-functionalized soluble hyperbranched polyglycerol polymer (HPG-ALD) that is commercially available (http:// flintbox.com/public/project/1948). TAILS relies on blocking all primary amines at the protein level using either reductive amination with isotopically labeled formaldehyde or labeling with isobaric NHS tags such as iTRAQ or TMT. After mixing of differentially labeled samples and digestion of the proteins with trypsin, GluC, or LysargiNase, the internal tryptic peptides are depleted by binding to HPG-ALD via reductive amination. The polymer with bound internal peptides can be conveniently removed by filtration through a molecular-weight cutoff filter, leading to enriched N-termini (generally >95%) in an unfractionated sample that can be analyzed by LC-MS/MS. A bioinformatics suite containing CLIPPER344 and database tools TopFIND345 and TopFINDer346 is available for further analysis of the obtained data. TAILS has been successfully employed to study the proteolytic landscape in several cell-based models and to date remains the only N-terminomics approach reported to be capable of in vivo tissue N-terminomics, including inflamed skin9 and arthritic ankle joints,277 antiviral responses,276 a pancreatic cancer model,347 and lymphocytes from an immunedeficient patient.257 One of the major mysteries in regards to proteolytic cleavage is the extent and genesis of neo-N-termini in any given tissue.348 N-terminal proteomics studies regularly find that the majority of protein termini are the result of nonannotated proteolytic cleavage, and so questions arise including which proteases are responsible for this and what are the biological implications of these cleavages.179,349 4.3.2. C-terminal Proteomics. Detection of the natural C terminus of a protein or the proteolytically processed end of a protein is not a trivial matter when employing massspectrometric approaches. Conventional shotgun proteomics 1156

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

changes resulted in a total 481 C-terminal peptides from 369 proteins in E. coli. Contrasting to the inclusion of singly charged species in the mass spectrometer, Somasundaram and colleagues performed “charge-reversal derivatization” to convert C-terminal peptides into species that behave more like fully tryptic peptides.357 Utilizing similar principles to reduce water availability during EDC coupling, Somasundaram and colleagues used the basic amines N,N-dimethylethylenediamine and (4-aminobutyl)guanidine instead of ethanolamine. In doing so, it increased the charge of peptides and provided a basic group at the Cterminus, which increased the yield of y-ions, allowing for more confident identification using database search engines. However, the increase in charge of trypsin-generated peptides requires the utilization of alternate fragmentation methods, such as electron-transfer dissociation (ETD), which perform best at higher charge states.358 Thus, they utilized a combination of chymotrypsin, trypsin, and ETD on a LTQ Orbitrap Velos. This report found 424 C-termini from using two digestion enzymes and two fragmentation methods. The number of identified peptides across all these studies shows that the number of identified C-terminal peptides is significantly lower compared not only to N-terminomic methodologies available but to the predicted number of proteins in E. coli. Thus, C-TAILS and other carboxy-oriented methods are still in their infancy and hold much promise for the future of sequencing the ends of proteins.

to metabolic labeling, an alternate method can be utilizedCTAILS. Alternative C-terminomic strategies can also be found described in the review by Tanco and colleagues.173 Schilling and colleagues presented a negative selection strategy to study C-terminal peptides as a modified method of N-terminal amine isotopic labeling of substrates (NTAILS).354,355 The original TAILS method reported utilizes a water-soluble polymer to deplete internal peptides using reductive alkylation chemistry following amine blocking at the protein level. C-TAILS follows a similar principle. A polymer is utilized to deplete internal tryptic peptides after the protein is blocked at carboxylic acids. Contrary to the original TAILS method where primary amines react only with aldehydes to form covalent bonds, C-TAILS chemistry reacts carboxylic acids with primary amines. As such, C-TAILS requires blocking at multiple levels. First, thiol groups are alkylated at the protein level and then primary amines (N-termini and lysines) are reductively dimethylated, rendering those groups unavailable for later reaction. This prevents cyclization and aggregation of proximal primary amines to carboxylic acids in the amidation reaction. Following buffer exchange to 4-morpholinoethanesulfonic acid (MES) buffer pH 5, proteins are then reacted with a primary amine, originally published with ethanolamine, and a combination of 1-ethyl-3-(3-(dimethylamino)propyl)carbodiimide (EDC) and sulfo-N-hydroxysuccinimide (sNHS), both in excess quantities. Carboxylic acids and primary amines do not react in isolation; it requires the activation of the carboxylic acid. EDC activates carboxylic acids into an unstable O-acylisourea intermediate. This intermediate can (1) be stabilized by s-NHS to form the amine-reactive compound, which then forms an amide bond with ethanolamine, or (2) hydrolyze back to the carboxylic acid. Because of the backconversion back to carboxylic acid, addition of EDC and s-NHS is repeated three times in C-TAILS over 18 h to maximize carboxylic acid blocking, which is critical for the success of the method. Proteins are then precipitated to eliminate excess reagent and digested with trypsin to generate peptides that contain newly formed primary amines and carboxylic acids. For quantification purposes and to prevent aggregation and cyclization, peptides are labeled at their newly formed N-termini with isotopic or isobaric labels. Following buffer exchange, the freely available C-termini from internal tryptic peptides are reacted in a similar EDC/s-NHS scheme with a water-soluble linear polymer polyallylamine (PAA). This polymer captures internal tryptic peptides and can be separated from the unbound C-terminally blocked peptides utilizing a spin-filter membrane with a highmolecular-weight cutoff. C-TAILS was originally utilized to study the E. coli C-terminome, and it achieved >70% of peptides being C-terminally blocked, identifying 460 peptides from 196 proteins.354 Since the originally published protocol in 2010, there have been novel developments to address some caveats in C-TAILS. While the original method utilized dimethylation of primary amines for blocking, a report in 2015 by Zhang and colleagues showed there are potential benefits for utilization of acetylation to improve Mascot scores.356 In addition, they prepurified PAA polymer to remove monomers and allowed for the inclusion of singly charged species to be fragmented by the mass spectrometer. Most notable was the inclusion of acetonitrile in the EDC reactions in order to minimize water concentrations while allowing proteins to remain soluble. Overall these

5. FUTURE PERSPECTIVES Regulated proteolytic processing is dysfunctional in diseases that afflict us all yet is still ill-definedonly 340 of 565 human proteases have known substrates. Deciphering cleavage events at decision points of cell and tissue function will reveal an unexplored layer of complexity in the hierarchy of cell regulation. The past decade of increasing research effort and attention in the field of degradomics has transformed our view of proteolysis from a mere destructive mechanism, such as in MMP-mediated degradation of extracellular matrix, to an intricate and selective posttranslational modification with important regulatory functions in many biological pathways. With increasing technological capabilities (e.g., more powerful mass spectrometers and improved bioinformatics), our ability to probe substrate proteolysis deeper and more specifically in various processes in health and disease has provided more insight into how proteolysis changes protein function, yet many questions remain to be answered. An unexpected general finding of such unbiased terminomics approaches is the very high percentage of protein populations that are present with undocumented N- or C-termini present concurrent with their mature parent protein as translated and matured in the classical biochemical pathways.348 Thus, in normal skin ∼50% of proteins have a different N-terminus to that annotated in Uniprot;9 in platelets and erythrocytes this percentage rises to 77%7 and 68%,8 respectively, ∼4% of which are internal alternate start sites.8 Although these two cell types are unusual, these numbers seem representative, and indeed we found in human dental pulp that 78% of proteins occurred as different cleaved proteoforms.10 As has been discussed throughout this Review, the position and chemistry of the Nterminal amino acid residues and their specific modifications can profoundly alter the function of the protein. Thus, these proteoforms not only increase protein diversity in vivo but they alter our conceptual models of protein, organelle, cell, and 1157

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

phosphorylation, ubiquitination, and proteolytic cleavage will integrate into a true systems analysis platform to study the role of specific proteolytic cleavages in biology and pathology. This nuanced new level of biological control heralds a bright new era and realm of biology and protein and cell regulation.

tissue functionin dynamic ongoing physiological processes that maintain or restore homeostasis and go astray, leading to pathology or occurring to switch function as a protective host response to pathology. The genesis of these neo-termini needs to be deciphered, and for that vast majority generated other than by alternate start sites and splicing, the cognate proteases leading to these proteoforms need to be identified. In pathology, these proteases or proteoforms may form valuable new drug targets or biomarkers of disease. In any case, both need to be recognized and the biological implications need to be understood for more accurate understanding of the function and response networks of a proteome, at rest and when stressed. Current proteomics techniques aimed at unbiased study of proteolysis of substrates provide temporal snapshots of the state of cleavage in a tissue and give invaluable insights into what substrates are cleaved, e.g., in response to a specific stimulus or disease state, but interpreting these pictures can prove challenging. Most useful proteomics techniques are based on relative quantification between samples, so the stoichiometry between cleaved and intact substrate is often overlooked, and it is unknown how this affects biology. Many of the background cleavages may occur at low stoichiometry, and for these, are they relevant? For each protein, what is the tipping point for physiological relevance in altering protein function, particularly when the cleavage products are antagonists or dominant negatives of the parent protein. How stable are the parent and neoproteins? A related question is how many of these cleavages are also silent? Thus, perhaps the generation of neotermini in homeostatic conditions is just the price the system pays for maintaining a responsive pathway and network that can rapidly change the levels of a protein or its activity on demand. Similarly, deciphering protease redundancy for these cleavages is important to understand. Redundancy is often too easy an excuse for lack of deeper understanding. That is, a system may not have redundant capacity but rather the organism does, and in different cells, tissue, or organs, under different stresses or nutritional and hormonal statuses, different proteases cut in and out. This is important to really understand in order to formulate good hypotheses and insightful experiments. Localization of a given protease and its substrate greatly affects the outcome and effect of cleavage, and more sensitive analytical techniques will enable more detailed analysis of specific tissues and subcellular fractions. Another challenge in analyzing proteolytic cleavage in complex, biologically relevant samples is the interplay that exists between different proteases and their inhibitors. The direct source of an observed proteolytic cleavage is often intractable and can be due to indirect effects, e.g., deactivation of a protease inhibitor by another protease, often of a different class,11 or activation of another protease. Ongoing advances in bioinformatics are the key to give insights into the intricacies of the protease web and interpreting the genesis of proteolytic proteoforms in cells and tissues. Finally, proteolytic cleavage can be regulated or affected by other posttranslational protein modifications. Activity of proteases may rely on specific phosphorylation or ubiquitination events, and glycosylation or phosphorylation or other posttranslational modifications of substrates may enable, or prevent, proteolysis. So, do cleavage sites function combinatorially or together with other posttranslational modifications? Integrating systems biology approaches that profile posttranslational modifications will further our insight into this, and current targeted analysis platforms for

AUTHOR INFORMATION Corresponding Author

*E-mail: [email protected]. Present Addresses ⊥

(T.K.) Department of Clinical Chemistry; Erasmus Medical Center, Rotterdam, The Netherlands. § (U.E.) Division of Structural Biology, Department of Molecular Biology, University of Salzburg, Salzburg, Austria. ¶ (A.D.) Department of Physiology and Pharmacology; University of Calgary, Calgary, Canada. Notes

The authors declare no competing financial interest. Biographies Theo Klein obtained his Ph.D. in Analytic Biochemistry in 2008 from the University of Groningen, The Netherlands, where he worked on development of novel chemical biology and proteomics methods for qualitative and quantitative analysis of protease activity. After a brief segue into biotech at Brains Online in Groningen, he joined Chris Overall’s lab in Vancouver in 2011 to study the role of proteolytic cleavage in adaptive immunity, focusing on the role of the paracaspase MALT1 in antigen-receptor signaling. Ulrich Eckhard obtained his Ph.D. degree in 2011 from the University of Salzburg, Austria, where he focused on the biochemical and structural elucidation of bacterial collagenases. He then joined Prof. Chris Overall at the University of British Columbia, Vancouver, Canada, where he secured a postdoctoral fellowship from the Michael Smith Foundation for Health Research and expanded his structural biochemical methodological repertoire to state-of-the-art proteomics. Since September 2016, he has worked as a researcher at Uppsala within a multidisciplinary project trying to unravel how protein structure and function are intertwined in the evolution of new protein variants. Antoine Dufour obtained his Ph.D. degree in 2010 from Stony Brook University, NY, U.S.A. He developed a patent on the composition and methods for the inhibition of MMP9-mediated cell migration. He then began a postdoctoral fellowship with Prof. Chris Overall at the University of British Columbia, focused on the investigation of proteases in inflammatory diseases such as cancer, arthritis, and systemic lupus erythematosus using proteomics. In 2012 he was awarded a fellowship from the Canadian Institutes of Health Research to continue his work on the roles of matrix metalloproteinases in breast cancer. Nestor Solis obtained his Ph.D. in microbial proteomics in 2014 from The University of Sydney, Australia, where he explored the identification of cell-surface proteins from Staphylococcus species using cell-shaving proteomics. He joined the laboratory of Professor Christopher Overall in Vancouver, Canada, in 2014 to further expand on proteomic methods to study N- and C-terminomes in macrophages. He was awarded two fellowships in 2016 to study macrophage terminomes: a Michael Smith Foundation for Health Research postdoctoral fellowship from British Columbia, Canada, and a CJ Martin Early Career Fellowship from the National Health and Medical Research Council from Australia. 1158

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

(10) Eckhard, U.; Marino, G.; Abbey, S. R.; Tharmarajah, G.; Matthew, I.; Overall, C. M. The Human Dental Pulp Proteome and NTerminome: Levering the Unexplored Potential of Semitryptic Peptides Enriched by TAILS to Identify Missing Proteins in the Human Proteome Project in Underexplored Tissues. J. Proteome Res. 2015, 14, 3568−3582. (11) Fortelny, N.; Cox, J. H.; Kappelhoff, R.; Starr, A. E.; Lange, P. F.; Pavlidis, P.; Overall, C. M. Network Analyses Reveal Pervasive Functional Regulation between Proteases in the Human Protease Web. PLoS Biol. 2014, 12, e1001869. (12) Hershko, A.; Ciechanover, A. The Ubiquitin System. Annu. Rev. Biochem. 1998, 67, 425−479. (13) Janko, M.; Zink, A.; Gigler, A. M.; Heckl, W. M.; Stark, R. W. Nanostructure and Mechanics of Mummified Type I Collagen from the 5300-Year-Old Tyrolean Iceman. Proc. R. Soc. London, Ser. B 2010, 277, 2301−2309. (14) Radzicka, A.; Wolfenden, R. Rates of Uncatalyzed Peptide Bond Hydrolysis in Neutral Solution and the Transition State Affinities of Proteases. J. Am. Chem. Soc. 1996, 118, 6105−6109. (15) Deu, E.; Verdoes, M.; Bogyo, M. New Tools for Dissecting Protease Function: Implications for Inhibitor Design, Drug Discovery and Probe Development. Nat. Struct. Mol. Biol. 2012, 19, 9−16. (16) Hooper, N. M. Proteases: A Primer. Essays Biochem. 2002, 38, 1−8. (17) Rawlings, N. D.; Barrett, A. J.; Finn, R. Twenty Years of the MEROPS Database of Proteolytic Enzymes, Their Substrates and Inhibitors. Nucleic Acids Res. 2016, 44 (D1), D343−D350. (18) Rawlings, N. D.; Barrett, A. J.; Bateman, A. Asparagine Peptide Lyases: A Seventh Catalytic Type of Proteolytic Enzymes. J. Biol. Chem. 2011, 286, 38321−38328. (19) Vallee, B. L.; Auld, D. S. Active-Site Zinc Ligands and Activated H2O of Zinc Enzymes. Proc. Natl. Acad. Sci. U. S. A. 1990, 87, 220− 224. (20) Auld, D. S. Zinc Coordination Sphere in Biochemical Zinc Sites. BioMetals 2001, 14, 271−313. (21) Springman, E. B.; Angleton, E. L.; Birkedal-Hansen, H.; Van Wart, H. E. Multiple Modes of Activation of Latent Human Fibroblast Collagenase: Evidence for the Role of a Cys73 Active-Site Zinc Complex in Latency and A “cysteine Switch” mechanism for Activation. Proc. Natl. Acad. Sci. U. S. A. 1990, 87, 364−368. (22) Harney, A. S.; Sole, L. B.; Meade, T. J. Kinetics and Thermodynamics of Irreversible Inhibition of Matrix Metalloproteinase 2 by a Co(III) Schiff Base Complex. JBIC, J. Biol. Inorg. Chem. 2012, 17, 853−860. (23) Pelmenschikov, V.; Siegbahn, P. E. M. Catalytic Mechanism of Matrix Metalloproteinases: Two-Layered ONIOM Study. Inorg. Chem. 2002, 41, 5659−5666. (24) Hooper, N. M. Families of Zinc Metalloproteases. FEBS Lett. 1994, 354, 1−6. (25) Rawlings, N. D.; Barrett, A. J. Evolutionary Families of Metallopeptidases. Methods Enzymol. 1995, 248, 183−228. (26) Overall, C. M. Matrix Metalloproteinase Substrate Binding Domains, Modules and Exosites. Overview and Experimental Strategies. Methods Mol. Biol. 2001, 151, 79−120. (27) Overall, C. M. Molecular Determinants of Metalloproteinase Substrate Specificity: Matrix Metalloproteinase Substrate Binding Domains, Modules, and Exosites. Mol. Biotechnol. 2002, 22, 51−86. (28) Overall, C. M.; McQuibban, G. A.; Clark-Lewis, I. Discovery of Chemokine Substrates for Matrix Metalloproteinases by Exosite Scanning: A New Tool for Degradomics. Biol. Chem. 2002, 383, 1059−1066. (29) Bode, W.; Maskos, K. Structural Studies on MMPs and TIMPs. Methods Mol. Biol. 2001, 151, 45−77. (30) Nagase, H.; Visse, R.; Murphy, G. Structure and Function of Matrix Metalloproteinases and TIMPs. Cardiovasc. Res. 2006, 69, 562− 573. (31) Cerdà-Costa, N.; Xavier Gomis-Rüth, F. Architecture and Function of Metallopeptidase Catalytic Domains. Protein Sci. 2014, 23, 123−144.

Chris Overall is a Professor and Tier 1 Canada Research Chair in Protease Proteomics and Systems Biology at the University of British Columbia (UBC), Life Sciences Institute. He is a pioneer of degradomics, a term he coined. He has had several sabbaticals in the Biotechnology and Pharmaceutical industries and is an Honorary Professor at the Freiburg Institute for Advanced Studies, AlbertLudwigs Universität Freiburg, Germany. Dr. Overall was awarded the 2002 CIHR Scientist of the Year, the UBC Killam Senior Researcher Award 2005, the Tony Pawson Canadian National Proteomics Network Award for Outstanding Contribution and Leadership to the Canadian Proteomics Community 2014, and the HUPO Proteomics Discovery Award in 2018. He is an Associate Editor of the Journal of Proteome Research, Editor of mSystems, and on the Editorial Boards of the Journal of Molecular Cellular Proteomics, Matrix Biology, and Biological Chemistry and the Advisory Committee of the International Union of Basic and Clinical Pharmacology.

ACKNOWLEDGMENTS C.M.O. is funded by a Foundation Grant from the Canadian Institute for Health Research and a Canada Research Chair in Protease Proteomics and Systems Biology. A.D. was supported by a Michael Smith Foundation for Health Research postdoctoral fellowship. U.E. acknowledges a postdoctoral fellowship from the Michael Smith Foundation for Health Research. N.S. is funded through a Canadian MSFHR Trainee Fellowship (Grant ID 16642) and an Australian NHMRC Early Career Fellowship (GNT1126842). Additional funding inclues C.I.H.R. Foundation Grant (FDN: 148408) (C.M.O.); Canada Research Chair (Tier I) in Metalloproteinase Proteomics and Systems Biology (950-01-126; 950-20-3877) (C.M.O.); C.I.H.R. Operating Grant (MOP-133632) (C.M.O.). REFERENCES (1) Barrett, A. J.; Woessner, J. F.; Rawlings, N. D. Handbook of Proteolytic Enzymes; Elsevier: 2012. (2) López-Otín, C.; Overall, C. M. Protease Degradomics: A New Challenge for Proteomics. Nat. Rev. Mol. Cell Biol. 2002, 3, 509−519. (3) Pérez-Silva, J. G.; Español, Y.; Velasco, G.; Quesada, V. The Degradome Database: Expanding Roles of Mammalian Proteases in Life and Disease. Nucleic Acids Res. 2016, 44 (D1), D351−D355. (4) Roskoski, R. A Historical Overview of Protein Kinases and Their Targeted Small Molecule Inhibitors. Pharmacol. Res. 2015, 100, 1−23. (5) Ezkurdia, I.; Juan, D.; Rodriguez, J. M.; Frankish, A.; Diekhans, M.; Harrow, J.; Vazquez, J.; Valencia, A.; Tress, M. L. Multiple Evidence Strands Suggest That There May Be as Few as 19 000 Human Protein-Coding Genes. Hum. Mol. Genet. 2014, 23, 5866− 5878. (6) Omenn, G. S.; Lane, L.; Lundberg, E. K.; Beavis, R. C.; Overall, C. M.; Deutsch, E. W. Metrics for the Human Proteome Project 2016: Progress on Identifying and Characterizing the Human Proteome, Including Post-Translational Modifications. J. Proteome Res. 2016, 15, 3951−3960. (7) Prudova, A.; Serrano, K.; Eckhard, U.; Fortelny, N.; Devine, D. V.; Overall, C. M. TAILS N-Terminomics of Human Platelets Reveals Pervasive Metalloproteinase-Dependent Proteolytic Processing in Storage. Blood 2014, 124, e49−60. (8) Lange, P. F.; Huesgen, P. F.; Nguyen, K.; Overall, C. M. Annotating N Termini for the Human Proteome Project: N Termini and Nα-Acetylation Status Differentiate Stable Cleaved Protein Species from Degradation Remnants in the Human Erythrocyte Proteome. J. Proteome Res. 2014, 13, 2028−2044. (9) auf dem Keller, U.; Prudova, A.; Eckhard, U.; Fingleton, B.; Overall, C. M. Systems-Level Analysis of Proteolytic Events in Increased Vascular Permeability and Complement Activation in Skin Inflammation. Sci. Signaling 2013, 6, rs2. 1159

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

(32) Eckhard, U.; Huesgen, P. F.; Schilling, O.; Bellac, C. L.; Butler, G. S.; Cox, J. H.; Dufour, A.; Goebeler, V.; Kappelhoff, R.; Keller, U. a. d.; et al. Active Site Specificity Profiling of the Matrix Metalloproteinase Family: Proteomic Identification of 4300 Cleavage Sites by Nine MMPs Explored with Structural and Synthetic Peptide Cleavage Analyses. Matrix Biol. 2016, 49, 37−60. (33) Tam, E. M.; Moore, T. R.; Butler, G. S.; Overall, C. M. Characterization of the Distinct Collagen Binding, Helicase and Cleavage Mechanisms of Matrix Metalloproteinase 2 and 14 (Gelatinase A and MT1-MMP): The Differential Roles of the MMP Hemopexin c Domains and the MMP-2 Fibronectin Type II Modules in Collagen Triple Helicase Activities. J. Biol. Chem. 2004, 279, 43336−43344. (34) Klein, T.; Bischoff, R. Physiology and Pathophysiology of Matrix Metalloproteases. Amino Acids 2011, 41, 271−290. (35) Nagase, H.; Woessner, J. F. Matrix Metalloproteinases. J. Biol. Chem. 1999, 274, 21491−21494. (36) Klein, T.; Bischoff, R. Active Metalloproteases of the A Disintegrin and Metalloprotease (ADAM) Family: Biological Function and Structure. J. Proteome Res. 2011, 10, 17−33. (37) Edwards, D. R.; Handsley, M. M.; Pennington, C. J. The ADAM Metalloproteinases. Mol. Aspects Med. 2008, 29, 258−289. (38) Kelwick, R.; Desanlis, I.; Wheeler, G. N.; Edwards, D. R. The ADAMTS (A Disintegrin and Metalloproteinase with Thrombospondin Motifs) Family. Genome Biol. 2015, 16, 113. (39) Gomis-Rüth, F. X.; Trillo-Muyo, S.; Stöcker, W. Functional and Structural Insights into Astacin Metallopeptidases. Biol. Chem. 2012, 393, 1027−1041. (40) Huesgen, P. F.; Lange, P. F.; Rogers, L. D.; Solis, N.; Eckhard, U.; Kleifeld, O.; Goulas, T.; Gomis-Rüth, F. X.; Overall, C. M. LysargiNase Mirrors Trypsin for Protein C-Terminal and MethylationSite Identification. Nat. Methods 2015, 12, 55−58. (41) Sowa, M. E.; Bennett, E. J.; Gygi, S. P.; Harper, J. W. Defining the Human Deubiquitinating Enzyme Interaction Landscape. Cell 2009, 138, 389−403. (42) Lawson, P. A.; Rainey, F. A. Proposal to Restrict the Genus Clostridium (Prazmowski) to Clostridium Butyricum and Related Species. Int. J. Syst. Evol. Microbiol. 2016, 66, 1009−1016. (43) Bond, M. D.; Van Wart, H. E. Characterization of the Individual Collagenases from Clostridium Histolyticum. Biochemistry 1984, 23, 3085−3091. (44) Eckhard, U.; Huesgen, P. F.; Brandstetter, H.; Overall, C. M. Proteomic Protease Specificity Profiling of Clostridial Collagenases Reveals Their Intrinsic Nature as Dedicated Degraders of Collagen. J. Proteomics 2014, 100, 102−114. (45) Eckhard, U.; Schönauer, E.; Nüss, D.; Brandstetter, H. Structure of Collagenase G Reveals a Chew-and-Digest Mechanism of Bacterial Collagenolysis. Nat. Struct. Mol. Biol. 2011, 18, 1109−1114. (46) Eckhard, U.; Schönauer, E.; Brandstetter, H. Structural Basis for Activity Regulation and Substrate Preference of Clostridial Collagenases G, H, and T. J. Biol. Chem. 2013, 288, 20184−20194. (47) Inouye, K.; Kusano, M.; Hashida, Y.; Minoda, M.; Yasukawa, K. Engineering, Expression, Purification, and Production of Recombinant Thermolysin. Biotechnol. Annu. Rev. 2007, 13, 43−64. (48) Gomis-Rüth, F. X.; Botelho, T. O.; Bode, W. A Standard Orientation for Metallopeptidases. Biochim. Biophys. Acta, Proteins Proteomics 2012, 1824, 157−163. (49) Holmes, M. A.; Matthews, B. W. Structure of Thermolysin Refined at 1.6 A Resolution. J. Mol. Biol. 1982, 160, 623−639. (50) Tallant, C.; Marrero, A.; Gomis-Rüth, F. X. Matrix Metalloproteinases: Fold and Function of Their Catalytic Domains. Biochim. Biophys. Acta, Mol. Cell Res. 2010, 1803, 20−28. (51) Acharya, K. R.; Sturrock, E. D.; Riordan, J. F.; Ehlers, M. R. W. Ace Revisited: A New Target for Structure-Based Drug Design. Nat. Rev. Drug Discovery 2003, 2, 891−902. (52) Dunn, B. M. Overview of Pepsin-like Aspartic Peptidases. Curr. Protoc. Protein Sci. 2001, 25, 21.3.1−21.3.6. (53) Dunn, B. M. Structure and Mechanism of the Pepsin-like Family of Aspartic Peptidases. Chem. Rev. 2002, 102, 4431−4458.

(54) Fritsche, E.; Paschos, A.; Beisel, H. G.; Böck, A.; Huber, R. Crystal Structure of the Hydrogenase Maturating Endopeptidase HYBD from Escherichia Coli. J. Mol. Biol. 1999, 288, 989−998. (55) Carroll, T. M.; Setlow, P. Site-Directed Mutagenesis and Structural Studies Suggest That the Germination Protease, GPR, in Spores of Bacillus Species Is an Atypical Aspartic Acid Protease. J. Bacteriol. 2005, 187, 7119−7125. (56) Bhaumik, P.; Xiao, H.; Hidaka, K.; Gustchina, A.; Kiso, Y.; Yada, R. Y.; Wlodawer, A. Structural Insights into the Activation and Inhibition of Histo-Aspartic Protease from Plasmodium Falciparum. Biochemistry 2011, 50, 8862−8879. (57) Szecsi, P. B. The Aspartic Proteases. Scand. J. Clin. Lab. Invest. 1992, 52, 5−22. (58) Blundell, T. L.; Guruprasad, K.; Albert, A.; Williams, M.; Sibanda, B. L.; Dhanaraj, V. The Aspartic Proteinases. An Historical Overview. Adv. Exp. Med. Biol. 1998, 436, 1−13. (59) Suguna, K.; Padlan, E. A.; Smith, C. W.; Carlson, W. D.; Davies, D. R. Binding of a Reduced Peptide Inhibitor to the Aspartic Proteinase from Rhizopus Chinensis: Implications for a Mechanism of Action. Proc. Natl. Acad. Sci. U. S. A. 1987, 84, 7009−7013. (60) Andreeva, N. S.; Rumsh, L. D. Analysis of Crystal Structures of Aspartic Proteinases: On the Role of Amino Acid Residues Adjacent to the Catalytic Site of Pepsin-like Enzymes. Protein Sci. Publ. Protein Soc. 2001, 10, 2439−2450. (61) Davies, D. R. The Structure and Function of the Aspartic Proteinases. Annu. Rev. Biophys. Biophys. Chem. 1990, 19, 189−215. (62) Sielecki, A. R.; Fujinaga, M.; Read, R. J.; James, M. N. Refined Structure of Porcine Pepsinogen at 1.8 A Resolution. J. Mol. Biol. 1991, 219, 671−692. (63) Wlodawer, A.; Erickson, J. W. Structure-Based Inhibitors of HIV-1 Protease. Annu. Rev. Biochem. 1993, 62, 543−585. (64) Dunn, B. M.; Scarborough, P. E.; Lowther, W. T.; Rao-Naik, C. Comparison of the Active Site Specificity of the Aspartic Proteinases Based on a Systematic Series of Peptide Substrates. Adv. Exp. Med. Biol. 1995, 362, 1−9. (65) Brik, A.; Wong, C.-H. HIV-1 Protease: Mechanism and Drug Discovery. Org. Biomol. Chem. 2003, 1, 5−14. (66) Wlodawer, A.; Miller, M.; Jaskólski, M.; Sathyanarayana, B. K.; Baldwin, E.; Weber, I. T.; Selk, L. M.; Clawson, L.; Schneider, J.; Kent, S. B. Conserved Folding in Retroviral Proteases: Crystal Structure of a Synthetic HIV-1 Protease. Science 1989, 245, 616−621. (67) Hartsuck, J. A.; Koelsch, G.; Remington, S. J. The HighResolution Crystal Structure of Porcine Pepsinogen. Proteins: Struct., Funct., Genet. 1992, 13, 1−25. (68) Eder, J.; Hommel, U.; Cumin, F.; Martoglio, B.; Gerhartz, B. Aspartic Proteases in Drug Discovery. Curr. Pharm. Des. 2007, 13, 271−285. (69) McPhie, P. Pepsinogen: Activation by a Unimolecular Mechanism. Biochem. Biophys. Res. Commun. 1974, 56, 789−792. (70) Dykes, C. W.; Kay, J. Conversion of Pepsinogen into Pepsin Is Not a One-Step Process. Biochem. J. 1976, 153, 141−144. (71) Benes, P.; Vetvicka, V.; Fusek, M. Cathepsin D - Many Functions of One Aspartic Protease. Crit. Rev. Oncol. Hematol. 2008, 68, 12−28. (72) Cuozzo, J. W.; Tao, K.; Cygler, M.; Mort, J. S.; Sahagian, G. G. Lysine-Based Structure Responsible for Selective Mannose Phosphorylation of Cathepsin D and Cathepsin L Defines a Common Structural Motif for Lysosomal Enzyme Targeting. J. Biol. Chem. 1998, 273, 21067−21076. (73) Zaidi, N.; Kalbacher, H. Cathepsin E: A Mini Review. Biochem. Biophys. Res. Commun. 2008, 367, 517−522. (74) Fowler, S. D.; Kay, J.; Dunn, B. M.; Tatnell, P. J. Monomeric Human Cathepsin E. FEBS Lett. 1995, 366, 72−74. (75) Tsukuba, T.; Sakai, H.; Yamada, M.; Maeda, H.; Hori, H.; Azuma, T.; Akamine, A.; Yamamoto, K. Biochemical Properties of the Monomeric Mutant of Human Cathepsin E Expressed in Chinese Hamster Ovary Cells: Comparison with Dimeric Forms of the Natural and Recombinant Cathepsin E. J. Biochem. 1996, 119, 126−134. 1160

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

(76) Olsson, F.; Schmidt, S.; Althoff, V.; Munter, L. M.; Jin, S.; Rosqvist, S.; Lendahl, U.; Multhaup, G.; Lundkvist, J. Characterization of Intermediate Steps in Amyloid Beta (Aβ) Production under nearNative Conditions. J. Biol. Chem. 2014, 289, 1540−1550. (77) Zhou, S.; Zhou, H.; Walian, P. J.; Jap, B. K. CD147 Is a Regulatory Subunit of the Gamma-Secretase Complex in Alzheimer’s Disease Amyloid Beta-Peptide Production. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 7499−7504. (78) Fujinaga, M.; Cherney, M. M.; Oyama, H.; Oda, K.; James, M. N. G. The Molecular Structure and Catalytic Mechanism of a Novel Carboxyl Peptidase from Scytalidium Lignicolum. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 3364−3369. (79) Jensen, K.; Østergaard, P. R.; Wilting, R.; Lassen, S. F. Identification and Characterization of a Bacterial Glutamic Peptidase. BMC Biochem. 2010, 11, 47. (80) Oda, K. New Families of Carboxyl Peptidases: Serine-Carboxyl Peptidases and Glutamic Peptidases. J. Biochem. 2012, 151, 13−25. (81) Schwarzer, D.; Stummeyer, K.; Gerardy-Schahn, R.; Mühlenhoff, M. Characterization of a Novel Intramolecular Chaperone Domain Conserved in Endosialidases and Other Bacteriophage Tail Spike and Fiber Proteins. J. Biol. Chem. 2007, 282, 2821−2831. (82) Schwarzer, D.; Stummeyer, K.; Haselhorst, T.; Freiberger, F.; Rode, B.; Grove, M.; Scheper, T.; von Itzstein, M.; Mühlenhoff, M.; Gerardy-Schahn, R. Proteolytic Release of the Intramolecular Chaperone Domain Confers Processivity to Endosialidase F. J. Biol. Chem. 2009, 284, 9465−9474. (83) Kataoka, Y.; Takada, K.; Oyama, H.; Tsunemi, M.; James, M. N. G.; Oda, K. Catalytic Residues and Substrate Specificity of Scytalidoglutamic Peptidase, the First Member of the Eqolisin in Family (G1) of Peptidases. FEBS Lett. 2005, 579, 2991−2994. (84) Sasaki, H.; Nakagawa, A.; Muramatsu, T.; Suganuma, M.; Sawano, Y.; Kojima, M.; Kubota, K.; Takahashi, K.; Tanokura, M. The Three-Dimensional Structure of Aspergilloglutamic Peptidase from Aspergillus Niger. Proc. Jpn. Acad., Ser. B 2004, 80, 435−438. (85) Tajima, N.; Kawai, F.; Park, S.-Y.; Tame, J. R. H. A Novel InteinLike Autoproteolytic Mechanism in Autotransporter Proteins. J. Mol. Biol. 2010, 402, 645−656. (86) Paulus, H. Protein Splicing and Related Forms of Protein Autoprocessing. Annu. Rev. Biochem. 2000, 69, 447−496. (87) Polgár, L. The Catalytic Triad of Serine Peptidases. Cell. Mol. Life Sci. 2005, 62, 2161−2172. (88) Hedstrom, L. Serine Protease Mechanism and Specificity. Chem. Rev. 2002, 102, 4501−4524. (89) Page, M. J.; Di Cera, E. Serine Peptidases: Classification, Structure and Function. Cell. Mol. Life Sci. 2008, 65, 1220−1236. (90) Briand, L.; Chobert, J.-M.; Gantier, R.; Declerck, N.; Tran, V.; Léonil, J.; Mollé, D.; Haertlé, T. Impact of the Lysine-188 and Aspartic Acid-189 Inversion on Activity of Trypsin. FEBS Lett. 1999, 442, 43− 47. (91) Venekei, I.; Szilágyi, L.; Gráf, L.; Rutter, W. J. Attempts to Convert Chymotrypsin to Trypsin. FEBS Lett. 1996, 379, 143−147. (92) Hedstrom, L.; Szilagyi, L.; Rutter, W. J. Converting Trypsin to Chymotrypsin: The Role of Surface Loops. Science 1992, 255, 1249− 1253. (93) Hung, S. H.; Hedstrom, L. Converting Trypsin to Elastase: Substitution of the S1 Site and Adjacent Loops Reconstitutes Esterase Specificity but Not Amidase Activity. Protein Eng., Des. Sel. 1998, 11, 669−673. (94) Brannigan, J. A.; Dodson, G.; Duggleby, H. J.; Moody, P. C. E.; Smith, J. L.; Tomchick, D. R.; Murzin, A. G. A Protein Catalytic Framework with an N-Terminal Nucleophile Is Capable of SelfActivation. Nature 1995, 378, 416−419. (95) Becker, S. H.; Darwin, K. H. Bacterial Proteasomes: Mechanistic and Functional Insights. Microbiol. Mol. Biol. Rev. 2017, 81.e00036-16 (96) Ekici, Ö . D.; Paetzel, M.; Dalbey, R. E. Unconventional Serine Proteases: Variations on the Catalytic Ser/His/Asp Triad Configuration. Protein Sci. 2008, 17, 2023−2037. (97) Dodson, G.; Wlodawer, A. Catalytic Triads and Their Relatives. Trends Biochem. Sci. 1998, 23, 347−352.

(98) Huber, E. M.; Heinemeyer, W.; Li, X.; Arendt, C. S.; Hochstrasser, M.; Groll, M. A Unified Mechanism for Proteolysis and Autocatalytic Activation in the 20S Proteasome. Nat. Commun. 2016, 7, 10900. (99) Löwe, J.; Stock, D.; Jap, B.; Zwickl, P.; Baumeister, W.; Huber, R. Crystal Structure of the 20S Proteasome from the Archaeon T. Acidophilum at 3.4 A Resolution. Science 1995, 268, 533−539. (100) Baumeister, W.; Walz, J.; Zühl, F.; Seemüller, E. The Proteasome: Paradigm of a Self-Compartmentalizing Protease. Cell 1998, 92, 367−380. (101) Seemuller, E.; Lupas, A.; Stock, D.; Lowe, J.; Huber, R.; Baumeister, W. Proteasome from Thermoplasma Acidophilum: A Threonine Protease. Science 1995, 268, 579−582. (102) Li, H.; O’Donoghue, A. J.; van der Linden, W. A.; Xie, S. C.; Yoo, E.; Foe, I. T.; Tilley, L.; Craik, C. S.; da Fonseca, P. C. A.; Bogyo, M. Structure- and Function-Based Design of Plasmodium-Selective Proteasome Inhibitors. Nature 2016, 530, 233−236. (103) Li, H.; Bogyo, M.; da Fonseca, P. C. A. The Cryo-EM Structure of the Plasmodium Falciparum 20S Proteasome and Its Use in the Fight against Malaria. FEBS J. 2016, 283, 4238−4243. (104) Vernet, T.; Tessier, D. C.; Chatellier, J.; Plouffe, C.; Lee, T. S.; Thomas, D. Y.; Storer, A. C.; Ménard, R. Structural and Functional Roles of Asparagine 175 in the Cysteine Protease Papain. J. Biol. Chem. 1995, 270, 16645−16652. (105) Craik, C. S.; Roczniak, S.; Largman, C.; Rutter, W. J. The Catalytic Role of the Active Site Aspartic Acid in Serine Proteases. Science 1987, 237, 909−913. (106) Ménard, R.; Carrière, J.; Laflamme, P.; Plouffe, C.; Khouri, H. E.; Vernet, T.; Tessier, D. C.; Thomas, D. Y.; Storer, A. C. Contribution of the Glutamine 19 Side Chain to Transition-State Stabilization in the Oxyanion Hole of Papain. Biochemistry 1991, 30, 8924−8928. (107) Otto, H.-H.; Schirmeister, T. Cysteine Proteases and Their Inhibitors. Chem. Rev. 1997, 97, 133−172. (108) Rzychon, M.; Chmiel, D.; Stec-Niemczyk, J. Modes of Inhibition of Cysteine Proteases. Acta Biochim. Pol. 2004, 51, 861− 873. (109) Wiederanders, B. Structure-Function Relationships in Class CA1 Cysteine Peptidase Propeptides. Acta Biochim. Pol. 2003, 50, 691−713. (110) Stoka, V.; Turk, B.; Turk, V. Lysosomal Cysteine Proteases: Structural Features and Their Role in Apoptosis. IUBMB Life 2005, 57, 347−353. (111) McGrath, M. E. The Lysosomal Cysteine Proteases. Annu. Rev. Biophys. Biomol. Struct. 1999, 28, 181−204. (112) Turk, V.; Stoka, V.; Vasiljeva, O.; Renko, M.; Sun, T.; Turk, B.; Turk, D. Cysteine Cathepsins: From Structure, Function and Regulation to New Frontiers. Biochim. Biophys. Acta, Proteins Proteomics 2012, 1824, 68−88. (113) Rossi, A.; Deveraux, Q.; Turk, B.; Sali, A. Comprehensive Search for Cysteine Cathepsins in the Human Genome. Biol. Chem. 2004, 385, 363−372. (114) Hasnain, S.; Hirama, T.; Huber, C. P.; Mason, P.; Mort, J. S. Characterization of Cathepsin B Specificity by Site-Directed Mutagenesis. Importance of Glu245 in the S2-P2 Specificity for Arginine and Its Role in Transition State Stabilization. J. Biol. Chem. 1993, 268, 235−240. (115) Guncar, G.; Podobnik, M.; Pungercar, J.; Strukelj, B.; Turk, V.; Turk, D. Crystal Structure of Porcine Cathepsin H Determined at 2.1 A Resolution: Location of the Mini-Chain C-Terminal Carboxyl Group Defines Cathepsin H Aminopeptidase Function. Structure 1998, 6, 51−61. (116) Vasiljeva, O.; Dolinar, M.; Turk, V.; Turk, B. Recombinant Human Cathepsin H Lacking the Mini Chain Is an Endopeptidase. Biochemistry 2003, 42, 13522−13528. (117) Nägler, D. K.; Zhang, R.; Tam, W.; Sulea, T.; Purisima, E. O.; Ménard, R. Human Cathepsin X: A Cysteine Protease with Unique Carboxypeptidase Activity. Biochemistry 1999, 38, 12648−12654. 1161

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

(118) Gunčar, G.; Klemenčič, I.; Turk, B.; Turk, V.; KaraoglanovicCarmona, A.; Juliano, L.; Turk, D. Crystal Structure of Cathepsin X: A Flip−flop of the Ring of His23 Allows Carboxy-Monopeptidase and Carboxy-Dipeptidase Activity of the Protease. Structure 2000, 8, 305− 313. (119) Sivaraman, J.; Nägler, D. K.; Zhang, R.; Ménard, R.; Cygler, M. Crystal Structure of Human Procathepsin X: A Cysteine Protease with the Proregion Covalently Linked to the Active Site Cysteine. J. Mol. Biol. 2000, 295, 939−951. (120) Pungercar, J. R.; Caglic, D.; Sajid, M.; Dolinar, M.; Vasiljeva, O.; Pozgan, U.; Turk, D.; Bogyo, M.; Turk, V.; Turk, B. Autocatalytic Processing of Procathepsin B Is Triggered by Proenzyme Activity. FEBS J. 2009, 276, 660−668. (121) Ono, Y.; Sorimachi, H. Calpains  An Elaborate Proteolytic System. Biochim. Biophys. Acta, Proteins Proteomics 2012, 1824, 224− 236. (122) Riedl, S. J.; Shi, Y. Molecular Mechanisms of Caspase Regulation during Apoptosis. Nat. Rev. Mol. Cell Biol. 2004, 5, 897− 907. (123) Dall, E.; Brandstetter, H. Structure and Function of Legumain in Health and Disease. Biochimie 2016, 122, 126−150. (124) Uren, A. G.; O’Rourke, K.; Aravind, L. A.; Pisabarro, M. T.; Seshagiri, S.; Koonin, E. V.; Dixit, V. M. Identification of Paracaspases and Metacaspases: Two Ancient Families of Caspase-like Proteins, One of Which Plays a Key Role in MALT Lymphoma. Mol. Cell 2000, 6, 961−967. (125) Giglione, C.; Boularot, A.; Meinnel, T. Protein N-Terminal Methionine Excision. Cell. Mol. Life Sci. 2004, 61, 1455−1474. (126) Giglione, C.; Fieulaine, S.; Meinnel, T. N-Terminal Protein Modifications: Bringing Back into Play the Ribosome. Biochimie 2015, 114, 134−146. (127) Giglione, C.; Vallon, O.; Meinnel, T. Control of Protein LifeSpan by N-Terminal Methionine Excision. EMBO J. 2003, 22, 13−23. (128) Varshavsky, A. The N-End Rule Pathway and Regulation by Proteolysis. Protein Sci. Publ. Protein Soc. 2011, 20, 1298−1345. (129) Meacham, G. C.; Patterson, C.; Zhang, W.; Younger, J. M.; Cyr, D. M. The Hsc70 Co-Chaperone CHIP Targets Immature CFTR for Proteasomal Degradation. Nat. Cell Biol. 2001, 3, 100−105. (130) Connell, P.; Ballinger, C. A.; Jiang, J.; Wu, Y.; Thompson, L. J.; Höhfeld, J.; Patterson, C. The Co-Chaperone CHIP Regulates Protein Triage Decisions Mediated by Heat-Shock Proteins. Nat. Cell Biol. 2001, 3, 93−96. (131) Hiller, M. M.; Finger, A.; Schweiger, M.; Wolf, D. H. ER Degradation of a Misfolded Luminal Protein by the Cytosolic Ubiquitin-Proteasome Pathway. Science 1996, 273, 1725−1728. (132) Nyathi, Y.; Wilkinson, B. M.; Pool, M. R. Co-Translational Targeting and Translocation of Proteins to the Endoplasmic Reticulum. Biochim. Biophys. Acta, Mol. Cell Res. 2013, 1833, 2392− 2402. (133) Johnson, N.; Powis, K.; High, S. Post-Translational Translocation into the Endoplasmic Reticulum. Biochim. Biophys. Acta, Mol. Cell Res. 2013, 1833, 2403−2409. (134) Paetzel, M.; Karla, A.; Strynadka, N. C. J.; Dalbey, R. E. Signal Peptidases. Chem. Rev. 2002, 102, 4549−4580. (135) Weihofen, A.; Binns, K.; Lemberg, M. K.; Ashman, K.; Martoglio, B. Identification of Signal Peptide Peptidase, a PresenilinType Aspartic Protease. Science 2002, 296, 2215−2218. (136) Voss, M.; Schröder, B.; Fluhrer, R. Mechanism, Specificity, and Physiology of Signal Peptide Peptidase (SPP) and SPP-like Proteases. Biochim. Biophys. Acta, Biomembr. 2013, 1828, 2828−2839. (137) Maccecchini, M. L.; Rudin, Y.; Blobel, G.; Schatz, G. Import of Proteins into Mitochondria: Precursor Forms of the Extramitochondrially Made F1-ATPase Subunits in Yeast. Proc. Natl. Acad. Sci. U. S. A. 1979, 76, 343−347. (138) Seidah, N. G.; Prat, A. The Biology and Therapeutic Targeting of the Proprotein Convertases. Nat. Rev. Drug Discovery 2012, 11, 367−383. (139) Steiner, D. F. The Proprotein Convertases. Curr. Opin. Chem. Biol. 1998, 2, 31−39.

(140) Rockwell, N. C.; Krysan, D. J.; Komiyama, T.; Fuller, R. S. Precursor Processing by kex2/furin Proteases. Chem. Rev. 2002, 102, 4525−4548. (141) Thomas, G. Furin at the Cutting Edge: From Protein Traffic to Embryogenesis and Disease. Nat. Rev. Mol. Cell Biol. 2002, 3, 753− 766. (142) Seidah, N. G.; Mayer, G.; Zaid, A.; Rousselet, E.; Nassoury, N.; Poirier, S.; Essalmani, R.; Prat, A. The Activation and Physiological Functions of the Proprotein Convertases. Int. J. Biochem. Cell Biol. 2008, 40, 1111−1125. (143) van de Ven, W. J.; Voorberg, J.; Fontijn, R.; Pannekoek, H.; van den Ouweland, A. M.; van Duijnhoven, H. L.; Roebroek, A. J.; Siezen, R. J. Furin Is a Subtilisin-like Proprotein Processing Enzyme in Higher Eukaryotes. Mol. Biol. Rep. 1990, 14, 265−275. (144) Plaimauer, B.; Mohr, G.; Wernhart, W.; Himmelspach, M.; Dorner, F.; Schlokat, U. Shed” Furin: Mapping of the Cleavage Determinants and Identification of Its C-Terminus. Biochem. J. 2001, 354, 689−695. (145) Molloy, S. S.; Bresnahan, P. A.; Leppla, S. H.; Klimpel, K. R.; Thomas, G. Human Furin Is a Calcium-Dependent Serine Endoprotease That Recognizes the Sequence Arg-X-X-Arg and Efficiently Cleaves Anthrax Toxin Protective Antigen. J. Biol. Chem. 1992, 267, 16396−16402. (146) Walker, J. A.; Molloy, S. S.; Thomas, G.; Sakaguchi, T.; Yoshida, T.; Chambers, T. M.; Kawaoka, Y. Sequence Specificity of Furin, a Proprotein-Processing Endoprotease, for the Hemagglutinin of a Virulent Avian Influenza Virus. J. Virol. 1994, 68, 1213−1218. (147) Roebroek, A. J.; Umans, L.; Pauli, I. G.; Robertson, E. J.; van Leuven, F.; Van de Ven, W. J.; Constam, D. B. Failure of Ventral Closure and Axial Rotation in Embryos Lacking the Proprotein Convertase Furin. Development 1998, 125, 4863−4876. (148) Couture, F.; Kwiatkowska, A.; Dory, Y. L.; Day, R. Therapeutic Uses of Furin and Its Inhibitors: A Patent Review. Expert Opin. Ther. Pat. 2015, 25, 379−396. (149) Pesu, M.; Watford, W. T.; Wei, L.; Xu, L.; Fuss, I.; Strober, W.; Andersson, J.; Shevach, E. M.; Quezado, M.; Bouladoux, N.; et al. TCell-Expressed Proprotein Convertase Furin Is Essential for Maintenance of Peripheral Immune Tolerance. Nature 2008, 455, 246−250. (150) Remacle, A. G.; Shiryaev, S. A.; Oh, E.-S.; Cieplak, P.; Srinivasan, A.; Wei, G.; Liddington, R. C.; Ratnikov, B. I.; Parent, A.; Desjardins, R.; et al. Substrate Cleavage Analysis of Furin and Related Proprotein Convertases. A Comparative Study. J. Biol. Chem. 2008, 283, 20897−20906. (151) Wong, E.; Maretzky, T.; Peleg, Y.; Blobel, C. P.; Sagi, I. The Functional Maturation of A Disintegrin and Metalloproteinase (ADAM) 9, 10, and 17 Requires Processing at a Newly Identified Proprotein Convertase (PC) Cleavage Site. J. Biol. Chem. 2015, 290, 12135−12146. (152) Stawowy, P.; Meyborg, H.; Stibenz, D.; Borges Pereira Stawowy, N.; Roser, M.; Thanabalasingam, U.; Veinot, J. P.; Chrétien, M.; Seidah, N. G.; Fleck, E.; et al. Furin-like Proprotein Convertases Are Central Regulators of the Membrane Type Matrix Metalloproteinase-pro-Matrix Metalloproteinase-2 Proteolytic Cascade in Atherosclerosis. Circulation 2005, 111, 2820−2827. (153) Kappert, K.; Furundzija, V.; Fritzsche, J.; Margeta, C.; Krüger, J.; Meyborg, H.; Fleck, E.; Stawowy, P. Integrin Cleavage Regulates Bidirectional Signalling in Vascular Smooth Muscle Cells. Thromb. Haemostasis 2010, 103, 556−563. (154) Seidah, N. G.; Gaspar, L.; Mion, P.; Marcinkiewicz, M.; Mbikay, M.; Chrétien, M. cDNA Sequence of Two Distinct Pituitary Proteins Homologous to Kex2 and Furin Gene Products: TissueSpecific mRNAs Encoding Candidates for pro-Hormone Processing Proteinases. DNA Cell Biol. 1990, 9, 415−424. (155) Smeekens, S. P.; Avruch, A. S.; LaMendola, J.; Chan, S. J.; Steiner, D. F. Identification of a cDNA Encoding a Second Putative Prohormone Convertase Related to PC2 in AtT20 Cells and Islets of Langerhans. Proc. Natl. Acad. Sci. U. S. A. 1991, 88, 340−344. 1162

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

Procarboxypeptidase B and Determination of the Sequence of Its Activation Segment. Biochemistry 1991, 30, 4082−4089. (176) Fricker, L. D.; Snyder, S. H. Purification and Characterization of Enkephalin Convertase, an Enkephalin-Synthesizing Carboxypeptidase. J. Biol. Chem. 1983, 258, 10950−10955. (177) Greene, D.; Das, B.; Fricker, L. D. Regulation of Carboxypeptidase E. Effect of pH, Temperature and Co2+ on Kinetic Parameters of Substrate Hydrolysis. Biochem. J. 1992, 285, 613−618. (178) Cool, D. R.; Normant, E.; Shen, F.; Chen, H. C.; Pannell, L.; Zhang, Y.; Loh, Y. P. Carboxypeptidase E Is a Regulated Secretory Pathway Sorting Receptor: Genetic Obliteration Leads to Endocrine Disorders in Cpe(fat) Mice. Cell 1997, 88, 73−83. (179) Marino, G.; Eckhard, U.; Overall, C. M. Protein Termini and Their Modifications Revealed by Positional Proteomics. ACS Chem. Biol. 2015, 10, 1754−1764. (180) Steiner, D. F.; Oyer, P. E. The Biosynthesis of Insulin and a Probable Precursor of Insulin by a Human Islet Cell Adenoma. Proc. Natl. Acad. Sci. U. S. A. 1967, 57, 473−480. (181) International Textbook of Diabetes Mellitus, 4th ed.; DeFronzo, R. A., Ferrannini, E., Zimmet, P., Alberti, G.; Wiley-Blackwell: 2015; http://www.wiley.com/WileyCDA/WileyTitle/productCd0470658614.html (accessed Sep 28, 2017). (182) Varlamov, O.; Leiter, E. H.; Fricker, L. Induced and Spontaneous Mutations at Ser202 of Carboxypeptidase E. Effect on Enzyme Expression, Activity, and Intracellular Routing. J. Biol. Chem. 1996, 271, 13981−13986. (183) Naggert, J. K.; Fricker, L. D.; Varlamov, O.; Nishina, P. M.; Rouille, Y.; Steiner, D. F.; Carroll, R. J.; Paigen, B. J.; Leiter, E. H. Hyperproinsulinaemia in Obese Fat/fat Mice Associated with a Carboxypeptidase E Mutation Which Reduces Enzyme Activity. Nat. Genet. 1995, 10, 135−142. (184) Wunderlich, P.; Glebov, K.; Kemmerling, N.; Tien, N. T.; Neumann, H.; Walter, J. Sequential Proteolytic Processing of the Triggering Receptor Expressed on Myeloid Cells-2 (TREM2) Protein by Ectodomain Shedding and γ-Secretase-Dependent Intramembranous Cleavage. J. Biol. Chem. 2013, 288, 33027−33036. (185) Higashiyama, S.; Nanba, D.; Nakayama, H.; Inoue, H.; Fukuda, S. Ectodomain Shedding and Remnant Peptide Signalling of EGFRs and Their Ligands. J. Biochem. 2011, 150, 15−22. (186) Clark, P. Protease-Mediated Ectodomain Shedding. Thorax 2014, 69, 682−684. (187) Dean, R. A.; Overall, C. M. Proteomics Discovery of Metalloproteinase Substrates in the Cellular Context by iTRAQ Labeling Reveals a Diverse MMP-2 Substrate Degradome. Mol. Cell. Proteomics 2007, 6, 611−623. (188) Tam, E. M.; Morrison, C. J.; Wu, Y. I.; Stack, M. S.; Overall, C. M. Membrane Protease Proteomics: Isotope-Coded Affinity Tag MS Identification of Undescribed MT1-Matrix Metalloproteinase Substrates. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 6917−6922. (189) Butler, G. S.; Dean, R. A.; Tam, E. M.; Overall, C. M. Pharmacoproteomics of a Metalloproteinase Hydroxamate Inhibitor in Breast Cancer Cells: Dynamics of Membrane Type 1 Matrix Metalloproteinase-Mediated Membrane Protein Shedding. Mol. Cell. Biol. 2008, 28, 4896−4914. (190) Seals, D. F.; Courtneidge, S. A. The ADAMs Family of Metalloproteases: Multidomain Proteins with Multiple Functions. Genes Dev. 2003, 17, 7−30. (191) Wolfsberg, T. G.; Straight, P. D.; Gerena, R. L.; Huovila, A. P.; Primakoff, P.; Myles, D. G.; White, J. M. ADAM, a Widely Distributed and Developmentally Regulated Gene Family Encoding Membrane Proteins with a Disintegrin and Metalloprotease Domain. Dev. Biol. 1995, 169, 378−383. (192) Primakoff, P.; Hyatt, H.; Myles, D. G. A Role for the Migrating Sperm Surface Antigen PH-20 in Guinea Pig Sperm Binding to the Egg Zona Pellucida. J. Cell Biol. 1985, 101, 2239−2244. (193) Primakoff, P.; Hyatt, H.; Tredick-Kline, J. Identification and Purification of a Sperm Surface Protein with a Potential Role in Sperm-Egg Membrane Fusion. J. Cell Biol. 1987, 104, 141−149.

(156) Smeekens, S. P.; Steiner, D. F. Identification of a Human Insulinoma cDNA Encoding a Novel Mammalian Protein Structurally Related to the Yeast Dibasic Processing Protease Kex2. J. Biol. Chem. 1990, 265, 2997−3000. (157) Docherty, K.; Steiner, D. F. Post-Translational Proteolysis in Polypeptide Hormone Biosynthesis. Annu. Rev. Physiol. 1982, 44, 625−638. (158) Day, R.; Schafer, M. K.; Watson, S. J.; Chrétien, M.; Seidah, N. G. Distribution and Regulation of the Prohormone Convertases PC1 and PC2 in the Rat Pituitary. Mol. Endocrinol. 1992, 6, 485−497. (159) Malide, D.; Seidah, N. G.; Chrétien, M.; Bendayan, M. Electron Microscopic Immunocytochemical Evidence for the Involvement of the Convertases PC1 and PC2 in the Processing of Proinsulin in Pancreatic Beta-Cells. J. Histochem. Cytochem. 1995, 43, 11−19. (160) Davidson, H. W.; Rhodes, C. J.; Hutton, J. C. Intraorganellar Calcium and pH Control Proinsulin Cleavage in the Pancreatic Beta Cell via Two Distinct Site-Specific Endopeptidases. Nature 1988, 333, 93−96. (161) Smeekens, S. P.; Montag, A. G.; Thomas, G.; Albiges-Rizo, C.; Carroll, R.; Benig, M.; Phillips, L. A.; Martin, S.; Ohagi, S.; Gardner, P. Proinsulin Processing by the Subtilisin-Related Proprotein Convertases Furin, PC2, and PC3. Proc. Natl. Acad. Sci. U. S. A. 1992, 89, 8822−8826. (162) Benjannet, S.; Rondeau, N.; Day, R.; Chrétien, M.; Seidah, N. G. PC1 and PC2 Are Proprotein Convertases Capable of Cleaving Proopiomelanocortin at Distinct Pairs of Basic Residues. Proc. Natl. Acad. Sci. U. S. A. 1991, 88, 3564−3568. (163) Seidah, N. G. The Proprotein Convertases, 20 Years Later. Methods Mol. Biol. 2011, 768, 23−57. (164) Seidah, N. G.; Day, R.; Hamelin, J.; Gaspar, A.; Collard, M. W.; Chrétien, M. Testicular Expression of PC4 in the Rat: Molecular Diversity of a Novel Germ Cell-Specific Kex2/subtilisin-like Proprotein Convertase. Mol. Endocrinol. 1992, 6, 1559−1570. (165) Li, M.; Nakayama, K.; Shuto, Y.; Somogyvari-Vigh, A.; Arimura, A. Testis-Specific Prohormone Convertase PC4 Processes the Precursor of Pituitary Adenylate Cyclase-Activating Polypeptide (PACAP). Peptides 1998, 19, 259−268. (166) Seidah, N. G.; Mowla, S. J.; Hamelin, J.; Mamarbachi, A. M.; Benjannet, S.; Touré, B. B.; Basak, A.; Munzer, J. S.; Marcinkiewicz, J.; Zhong, M.; et al. Mammalian Subtilisin/kexin Isozyme SKI-1: A Widely Expressed Proprotein Convertase with a Unique Cleavage Specificity and Cellular Localization. Proc. Natl. Acad. Sci. U. S. A. 1999, 96, 1321−1326. (167) Marschner, K.; Kollmann, K.; Schweizer, M.; Braulke, T.; Pohl, S. A Key Enzyme in the Biogenesis of Lysosomes Is a Protease That Regulates Cholesterol Metabolism. Science 2011, 333, 87−90. (168) Cheng, D.; Espenshade, P. J.; Slaughter, C. A.; Jaen, J. C.; Brown, M. S.; Goldstein, J. L. Secreted Site-1 Protease Cleaves Peptides Corresponding to Luminal Loop of Sterol Regulatory Element-Binding Proteins. J. Biol. Chem. 1999, 274, 22805−22812. (169) Haze, K.; Yoshida, H.; Yanagi, H.; Yura, T.; Mori, K. Mammalian Transcription Factor ATF6 Is Synthesized as a Transmembrane Protein and Activated by Proteolysis in Response to Endoplasmic Reticulum Stress. Mol. Biol. Cell 1999, 10, 3787−3799. (170) López-Otín, C.; Bond, J. S. Proteases: Multifunctional Enzymes in Life and Disease. J. Biol. Chem. 2008, 283, 30433−30437. (171) Gomis-Rüth, F. X. Structure and Mechanism of Metallocarboxypeptidases. Crit. Rev. Biochem. Mol. Biol. 2008, 43, 319−345. (172) Breddam, K. Carboxypeptidase S-1 from Penicillium Janthinellum: Enzymatic Properties in Hydrolysis and Aminolysis Reactions. Carlsberg Res. Commun. 1988, 53, 309−320. (173) Tanco, S.; Gevaert, K.; Van Damme, P. C-Terminomics: Targeted Analysis of Natural and Posttranslationally Modified Protein and Peptide C-Termini. Proteomics 2015, 15, 903−914. (174) Vallee, B. L.; Neurath, H. Carboxypeptidase, a Zinc Metalloenzyme. J. Biol. Chem. 1955, 217, 253−261. (175) Burgos, F. J.; Salvà, M.; Villegas, V.; Soriano, F.; Mendez, E.; Avilés, F. X. Analysis of the Activation Process of Porcine 1163

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

(194) Blobel, C. P.; Wolfsberg, T. G.; Turck, C. W.; Myles, D. G.; Primakoff, P.; White, J. M. A Potential Fusion Peptide and an Integrin Ligand Domain in a Protein Active in Sperm-Egg Fusion. Nature 1992, 356, 248−252. (195) Wolfsberg, T. G.; Bazan, J. F.; Blobel, C. P.; Myles, D. G.; Primakoff, P.; White, J. M. The Precursor Region of a Protein Active in Sperm-Egg Fusion Contains a Metalloprotease and a Disintegrin Domain: Structural, Functional, and Evolutionary Implications. Proc. Natl. Acad. Sci. U. S. A. 1993, 90, 10783−10787. (196) Lu, X.; Lu, D.; Scully, M. F.; Kakkar, V. V. Structure-Activity Relationship Studies on ADAM Protein-Integrin Interactions. Cardiovasc. Hematol. Agents Med. Chem. 2007, 5, 29−42. (197) Klein, T.; Bischoff, R. Active Metalloproteases of the A Disintegrin and Metalloprotease (ADAM) Family: Biological Function and Structure. J. Proteome Res. 2011, 10, 17−33. (198) Roy, R.; Wewer, U. M.; Zurakowski, D.; Pories, S. E.; Moses, M. A. ADAM 12 Cleaves Extracellular Matrix Proteins and Correlates with Cancer Status and Stage. J. Biol. Chem. 2004, 279, 51323−51330. (199) Roychaudhuri, R.; Hergrueter, A. H.; Polverino, F.; LauchoContreras, M. E.; Gupta, K.; Borregaard, N.; Owen, C. A. ADAM9 Is a Novel Product of Polymorphonuclear Neutrophils: Regulation of Expression and Contributions to Extracellular Matrix Protein Degradation during Acute Lung Injury. J. Immunol. 2014, 193, 2469−2482. (200) Black, R. A.; Rauch, C. T.; Kozlosky, C. J.; Peschon, J. J.; Slack, J. L.; Wolfson, M. F.; Castner, B. J.; Stocking, K. L.; Reddy, P.; Srinivasan, S.; et al. A Metalloproteinase Disintegrin That Releases Tumour-Necrosis Factor-Alpha from Cells. Nature 1997, 385, 729− 733. (201) Moss, M. L.; Jin, S. L.; Milla, M. E.; Bickett, D. M.; Burkhart, W.; Carter, H. L.; Chen, W. J.; Clay, W. C.; Didsbury, J. R.; Hassler, D.; et al. Cloning of a Disintegrin Metalloproteinase That Processes Precursor Tumour-Necrosis Factor-Alpha. Nature 1997, 385, 733− 736. (202) McGeehan, G. M.; Becherer, J. D.; Bast, R. C.; Boyer, C. M.; Champion, B.; Connolly, K. M.; Conway, J. G.; Furdon, P.; Karp, S.; Kidao, S.; et al. Regulation of Tumour Necrosis Factor-Alpha Processing by a Metalloproteinase Inhibitor. Nature 1994, 370, 558−561. (203) Gearing, A. J.; Beckett, P.; Christodoulou, M.; Churchill, M.; Clements, J.; Davidson, A. H.; Drummond, A. H.; Galloway, W. A.; Gilbert, R.; Gordon, J. L.; et al. Processing of Tumour Necrosis FactorAlpha Precursor by Metalloproteinases. Nature 1994, 370, 555−557. (204) Scheller, J.; Chalaris, A.; Garbers, C.; Rose-John, S. ADAM17: A Molecular Switch to Control Inflammation and Tissue Regeneration. Trends Immunol. 2011, 32, 380−387. (205) Lisi, S.; D’Amore, M.; Sisto, M. ADAM17 at the Interface between Inflammation and Autoimmunity. Immunol. Lett. 2014, 162, 159−169. (206) Zheng, Y.; Saftig, P.; Hartmann, D.; Blobel, C. Evaluation of the Contribution of Different ADAMs to Tumor Necrosis Factor Alpha (TNFalpha) Shedding and of the Function of the TNFalpha Ectodomain in Ensuring Selective Stimulated Shedding by the TNFalpha Convertase (TACE/ADAM17). J. Biol. Chem. 2004, 279, 42898−42906. (207) Ludwig, A.; Hundhausen, C.; Lambert, M. H.; Broadway, N.; Andrews, R. C.; Bickett, D. M.; Leesnitzer, M. A.; Becherer, J. D. Metalloproteinase Inhibitors for the Disintegrin-like Metalloproteinases ADAM10 and ADAM17 That Differentially Block Constitutive and Phorbol Ester-Inducible Shedding of Cell Surface Molecules. Comb. Chem. High Throughput Screening 2005, 8, 161−171. (208) Mülberg, J.; Schooltink, H.; Stoyan, T.; Günther, M.; Graeve, L.; Buse, G.; Mackiewicz, A.; Heinrich, P. C.; Rose-John, S. The Soluble Interleukin-6 Receptor Is Generated by Shedding. Eur. J. Immunol. 1993, 23, 473−480. (209) Matthews, V.; Schuster, B.; Schütze, S.; Bussmeyer, I.; Ludwig, A.; Hundhausen, C.; Sadowski, T.; Saftig, P.; Hartmann, D.; Kallen, K.J.; et al. Cellular Cholesterol Depletion Triggers Shedding of the

Human Interleukin-6 Receptor by ADAM10 and ADAM17 (TACE). J. Biol. Chem. 2003, 278, 38829−38839. (210) Baran, P.; Nitz, R.; Grötzinger, J.; Scheller, J.; Garbers, C. Minimal Interleukin 6 (IL-6) Receptor Stalk Composition for IL-6 Receptor Shedding and IL-6 Classic Signaling. J. Biol. Chem. 2013, 288, 14756−14768. (211) Chalaris, A.; Gewiese, J.; Paliga, K.; Fleig, L.; Schneede, A.; Krieger, K.; Rose-John, S.; Scheller, J. ADAM17-Mediated Shedding of the IL6R Induces Cleavage of the Membrane Stub by γ-Secretase. Biochim. Biophys. Acta, Mol. Cell Res. 2010, 1803, 234−245. (212) Atreya, R.; Mudter, J.; Finotto, S.; Müllberg, J.; Jostock, T.; Wirtz, S.; Schütz, M.; Bartsch, B.; Holtmann, M.; Becker, C.; et al. Blockade of Interleukin 6 Trans Signaling Suppresses T-Cell Resistance against Apoptosis in Chronic Intestinal Inflammation: Evidence in Crohn Disease and Experimental Colitis in Vivo. Nat. Med. 2000, 6, 583−588. (213) Dang, M.; Armbruster, N.; Miller, M. A.; Cermeno, E.; Hartmann, M.; Bell, G. W.; Root, D. E.; Lauffenburger, D. A.; Lodish, H. F.; Herrlich, A. Regulated ADAM17-Dependent EGF Family Ligand Release by Substrate-Selecting Signaling Pathways. Proc. Natl. Acad. Sci. U. S. A. 2013, 110, 9776−9781. (214) Melenhorst, W. B.; Visser, L.; Timmer, A.; van den Heuvel, M. C.; Stegeman, C. A.; van Goor, H. ADAM17 Upregulation in Human Renal Disease: A Role in Modulating TGF-Alpha Availability? Am. J. Physiol. Renal Physiol. 2009, 297, F781−790. (215) Kenny, P. A.; Bissell, M. J. Targeting TACE-Dependent EGFR Ligand Shedding in Breast Cancer. J. Clin. Invest. 2007, 117, 337−345. (216) Taylor, S. R.; Markesbery, M. G.; Harding, P. A. HeparinBinding Epidermal Growth Factor-like Growth Factor (HB-EGF) and Proteolytic Processing by a Disintegrin and Metalloproteinases (ADAM): A Regulator of Several Pathways. Semin. Cell Dev. Biol. 2014, 28, 22−30. (217) Jefferson, T.; Č aušević, M.; auf dem Keller, U.; Schilling, O.; Isbert, S.; Geyer, R.; Maier, W.; Tschickardt, S.; Jumpertz, T.; Weggen, S.; et al. Metalloprotease Meprin Beta Generates Nontoxic NTerminal Amyloid Precursor Protein Fragments in Vivo. J. Biol. Chem. 2011, 286, 27741−27750. (218) Prox, J.; Rittger, A.; Saftig, P. Physiological Functions of the Amyloid Precursor Protein Secretases ADAM10, BACE1, and Presenilin. Exp. Brain Res. 2012, 217, 331−341. (219) De Strooper, B.; Vassar, R.; Golde, T. The Secretases: Enzymes with Therapeutic Potential in Alzheimer Disease. Nat. Rev. Neurol. 2010, 6, 99−107. (220) Dufour, A.; Overall, C. M. Missing the Target: Matrix Metalloproteinase Antitargets in Inflammation and Cancer. Trends Pharmacol. Sci. 2013, 34, 233−242. (221) Overall, C. M.; Kleifeld, O. Tumour Microenvironment Opinion: Validating Matrix Metalloproteinases as Drug Targets and Anti-Targets for Cancer Therapy. Nat. Rev. Cancer 2006, 6, 227−239. (222) Rooke, J.; Pan, D.; Xu, T.; Rubin, G. M. KUZ, a Conserved Metalloprotease-Disintegrin Protein with Two Roles in Drosophila Neurogenesis. Science 1996, 273, 1227−1231. (223) Hartmann, D.; de Strooper, B.; Serneels, L.; Craessaerts, K.; Herreman, A.; Annaert, W.; Umans, L.; Lübke, T.; Lena Illert, A.; von Figura, K.; et al. The Disintegrin/metalloprotease ADAM 10 Is Essential for Notch Signalling but Not for Alpha-Secretase Activity in Fibroblasts. Hum. Mol. Genet. 2002, 11, 2615−2624. (224) Weber, S.; Niessen, M. T.; Prox, J.; Lüllmann-Rauch, R.; Schmitz, A.; Schwanbeck, R.; Blobel, C. P.; Jorissen, E.; de Strooper, B.; Niessen, C. M.; et al. The Disintegrin/metalloproteinase Adam10 Is Essential for Epidermal Integrity and Notch-Mediated Signaling. Development 2011, 138, 495−505. (225) Qiu, H.; Tang, X.; Ma, J.; Shaverdashvili, K.; Zhang, K.; Bedogni, B. Notch1 Autoactivation via Transcriptional Regulation of Furin, Which Sustains Notch1 Signaling by Processing Notch1Activating Proteases ADAM10 and Membrane Type 1 Matrix Metalloproteinase. Mol. Cell. Biol. 2015, 35, 3622−3632. (226) Chapin, J. C.; Hajjar, K. A. Fibrinolysis and the Control of Blood Coagulation. Blood Rev. 2015, 29, 17−24. 1164

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

(227) Merle, N. S.; Church, S. E.; Fremeaux-Bacchi, V.; Roumenina, L. T. Complement System Part I - Molecular Mechanisms of Activation and Regulation. Front. Immunol. 2015, 6, 262. (228) Shalini, S.; Dorstyn, L.; Dawar, S.; Kumar, S. Old, New and Emerging Functions of Caspases. Cell Death Differ. 2015, 22, 526− 539. (229) Pop, C.; Salvesen, G. S. Human Caspases: Activation, Specificity, and Regulation. J. Biol. Chem. 2009, 284, 21777−21781. (230) Shimizu, Y.; Taraborrelli, L.; Walczak, H. Linear Ubiquitination in Immunity. Immunol. Rev. 2015, 266, 190−207. (231) Mocarski, E. S.; Upton, J. W.; Kaiser, W. J. Viral Infection and the Evolution of Caspase 8-Regulated Apoptotic and Necrotic Death Pathways. Nat. Rev. Immunol. 2011, 12, 79−88. (232) Timmer, J. C.; Salvesen, G. S. Caspase Substrates. Cell Death Differ. 2007, 14, 66−72. (233) Scott, N. E.; Rogers, L. D.; Prudova, A.; Brown, N. F.; Fortelny, N.; Overall, C. M.; Foster, L. J. Interactome Disassembly during Apoptosis Occurs Independent of Caspase Cleavage. Mol. Syst. Biol. 2017, 13, 906. (234) Mahrus, S.; Trinidad, J. C.; Barkan, D. T.; Sali, A.; Burlingame, A. L.; Wells, J. A. Global Sequencing of Proteolytic Cleavage Sites in Apoptosis by Specific Labeling of Protein N Termini. Cell 2008, 134, 866−876. (235) Van Damme, P.; Martens, L.; Van Damme, J.; Hugelier, K.; Staes, A.; Vandekerckhove, J.; Gevaert, K. Caspase-Specific and Nonspecific in Vivo Protein Processing during Fas-Induced Apoptosis. Nat. Methods 2005, 2, 771−777. (236) Heutinck, K. M.; ten Berge, I. J. M.; Hack, C. E.; Hamann, J.; Rowshani, A. T. Serine Proteases of the Human Immune System in Health and Disease. Mol. Immunol. 2010, 47, 1943−1955. (237) Marshall, N. C.; Finlay, B. B.; Overall, C. M. Sharpening Host Defenses during Infection: Proteases Cut to the Chase. Mol. Cell. Proteomics 2017, 16, S161−S171. (238) Voskoboinik, I.; Whisstock, J. C.; Trapani, J. A. Perforin and Granzymes: Function, Dysfunction and Human Pathology. Nat. Rev. Immunol. 2015, 15, 388−400. (239) Thiery, J.; Keefe, D.; Boulant, S.; Boucrot, E.; Walch, M.; Martinvalet, D.; Goping, I. S.; Bleackley, R. C.; Kirchhausen, T.; Lieberman, J. Perforin Pores in the Endosomal Membrane Trigger the Release of Endocytosed Granzyme B into the Cytosol of Target Cells. Nat. Immunol. 2011, 12, 770−777. (240) Lopez, J. A.; Susanto, O.; Jenkins, M. R.; Lukoyanova, N.; Sutton, V. R.; Law, R. H. P.; Johnston, A.; Bird, C. H.; Bird, P. I.; Whisstock, J. C.; et al. Perforin Forms Transient Pores on the Target Cell Plasma Membrane to Facilitate Rapid Access of Granzymes during Killer Cell Attack. Blood 2013, 121, 2659−2668. (241) Laforge, M.; Bidère, N.; Carmona, S.; Devocelle, A.; Charpentier, B.; Senik, A. Apoptotic Death Concurrent with CD3 Stimulation in Primary Human CD8+ T Lymphocytes: A Role for Endogenous Granzyme B. J. Immunol. 2006, 176, 3966−3977. (242) Sutton, V. R.; Davis, J. E.; Cancilla, M.; Johnstone, R. W.; Ruefli, A. A.; Sedelies, K.; Browne, K. A.; Trapani, J. A. Initiation of Apoptosis by Granzyme B Requires Direct Cleavage of Bid, but Not Direct Granzyme B-Mediated Caspase Activation. J. Exp. Med. 2000, 192, 1403−1414. (243) Heibein, J. A.; Goping, I. S.; Barry, M.; Pinkoski, M. J.; Shore, G. C.; Green, D. R.; Bleackley, R. C. Granzyme B-Mediated Cytochrome c Release Is Regulated by the Bcl-2 Family Members Bid and Bax. J. Exp. Med. 2000, 192, 1391−1402. (244) de Poot, S. a. H.; Bovenschen, N. Granzyme M: Behind Enemy Lines. Cell Death Differ. 2014, 21, 359−368. (245) Susanto, O.; Stewart, S. E.; Voskoboinik, I.; Brasacchio, D.; Hagn, M.; Ellis, S.; Asquith, S.; Sedelies, K. A.; Bird, P. I.; Waterhouse, N. J.; et al. Mouse Granzyme A Induces a Novel Death with Writhing Morphology That Is Mechanistically Distinct from Granzyme BInduced Apoptosis. Cell Death Differ. 2013, 20, 1183−1193. (246) Dal Porto, J. M.; Gauld, S. B.; Merrell, K. T.; Mills, D.; PughBernard, A. E.; Cambier, J. B Cell Antigen Receptor Signaling 101. Mol. Immunol. 2004, 41, 599−613.

(247) Li, M. O.; Rudensky, A. Y. T Cell Receptor Signalling in the Control of Regulatory T Cell Differentiation and Function. Nat. Rev. Immunol. 2016, 16, 220−233. (248) Cantrell, D. Signaling in Lymphocyte Activation. Cold Spring Harbor Perspect. Biol. 2015, 7, a018788. (249) Turvey, S. E.; Durandy, A.; Fischer, A.; Fung, S. Y.; Geha, R. S.; Gewies, A.; Giese, T.; Greil, J.; Keller, B.; McKinnon, M. L.; et al. The CARD11-BCL10-MALT1 (CBM) Signalosome Complex: Stepping into the Limelight of Human Primary Immunodeficiency. J. Allergy Clin. Immunol. 2014, 134, 276−284. (250) Hachmann, J.; Salvesen, G. S. The Paracaspase MALT1. Biochimie 2016, 122, 324−338. (251) Rebeaud, F.; Hailfinger, S.; Posevitz-Fejfar, A.; Tapernoux, M.; Moser, R.; Rueda, D.; Gaide, O.; Guzzardi, M.; Iancu, E. M.; Rufer, N.; et al. The Proteolytic Activity of the Paracaspase MALT1 Is Key in T Cell Activation. Nat. Immunol. 2008, 9, 272−281. (252) Coornaert, B.; Baens, M.; Heyninck, K.; Bekaert, T.; Haegman, M.; Staal, J.; Sun, L.; Chen, Z. J.; Marynen, P.; Beyaert, R. T Cell Antigen Receptor Stimulation Induces MALT1 Paracaspase−mediated Cleavage of the NF-κB Inhibitor A20. Nat. Immunol. 2008, 9, 263− 271. (253) Staal, J.; Driege, Y.; Bekaert, T.; Demeyer, A.; Muyllaert, D.; Van Damme, P.; Gevaert, K.; Beyaert, R. T-Cell Receptor-Induced JNK Activation Requires Proteolytic Inactivation of CYLD by MALT1. EMBO J. 2011, 30, 1742−1752. (254) Hailfinger, S.; Nogai, H.; Pelzer, C.; Jaworski, M.; Cabalzar, K.; Charton, J. E.; Guzzardi, M.; Decaillet, C.; Grau, M.; Dorken, B.; et al. Malt1-Dependent RelB Cleavage Promotes Canonical NF-kappaB Activation in Lymphocytes and Lymphoma Cell Lines. Proc. Natl. Acad. Sci. U. S. A. 2011, 108, 14596−14601. (255) Uehata, T.; Iwasaki, H.; Vandenbon, A.; Matsushita, K.; Hernandez-Cuellar, E.; Kuniyoshi, K.; Satoh, T.; Mino, T.; Suzuki, Y.; Standley, D. M.; et al. Malt1-Induced Cleavage of Regnase-1 in CD4(+) Helper T Cells Regulates Immune Activation. Cell 2013, 153, 1036−1049. (256) Jeltsch, K. M.; Hu, D.; Brenner, S.; Zöller, J.; Heinz, G. A.; Nagel, D.; Vogel, K. U.; Rehage, N.; Warth, S. C.; Edelmann, S. L.; et al. Cleavage of Roquin and Regnase-1 by the Paracaspase MALT1 Releases Their Cooperatively Repressed Targets to Promote TH17 Differentiation. Nat. Immunol. 2014, 15, 1079−1089. (257) Klein, T.; Fung, S.-Y.; Renner, F.; Blank, M. A.; Dufour, A.; Kang, S.; Bolger-Munro, M.; Scurll, J. M.; Priatel, J. J.; Schweigler, P.; et al. The Paracaspase MALT1 Cleaves HOIL1 Reducing Linear Ubiquitination by LUBAC to Dampen Lymphocyte NF-κB Signalling. Nat. Commun. 2015, 6, 8777. (258) Elton, L.; Carpentier, I.; Staal, J.; Driege, Y.; Haegman, M.; Beyaert, R. MALT1 Cleaves the E3 Ubiquitin Ligase HOIL-1 in Activated T Cells, Generating a Dominant Negative Inhibitor of LUBAC-Induced NF-κB Signaling. FEBS J. 2016, 283, 403−412. (259) Douanne, T.; Gavard, J.; Bidère, N. The Paracaspase MALT1 Cleaves the LUBAC Subunit HOIL1 during Antigen Receptor Signaling. J. Cell Sci. 2016, 129, 1775−1780. (260) Kirisako, T.; Kamei, K.; Murata, S.; Kato, M.; Fukumoto, H.; Kanie, M.; Sano, S.; Tokunaga, F.; Tanaka, K.; Iwai, K. A Ubiquitin Ligase Complex Assembles Linear Polyubiquitin Chains. EMBO J. 2006, 25, 4877−4887. (261) Ikeda, F.; Deribe, Y. L.; Skanland, S. S.; Stieglitz, B.; Grabbe, C.; Franz-Wachtel, M.; van Wijk, S. J.; Goswami, P.; Nagy, V.; Terzic, J.; et al. SHARPIN Forms a Linear Ubiquitin Ligase Complex Regulating NF-kappaB Activity and Apoptosis. Nature 2011, 471, 637−641. (262) Tokunaga, F.; Nakagawa, T.; Nakahara, M.; Saeki, Y.; Taniguchi, M.; Sakata, S.; Tanaka, K.; Nakano, H.; Iwai, K. SHARPIN Is a Component of the NF-kappaB-Activating Linear Ubiquitin Chain Assembly Complex. Nature 2011, 471, 633−636. (263) Emmerich, C. H.; Schmukle, A. C.; Walczak, H. The Emerging Role of Linear Ubiquitination in Cell Signaling. Sci. Signaling 2011, 4, re5. 1165

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

(264) Sasaki, K.; Iwai, K. Roles of Linear Ubiquitinylation, a Crucial Regulator of NF-κB and Cell Death, in the Immune System. Immunol. Rev. 2015, 266, 175−189. (265) Bonnans, C.; Chou, J.; Werb, Z. Remodelling the Extracellular Matrix in Development and Disease. Nat. Rev. Mol. Cell Biol. 2014, 15, 786−801. (266) Vandenbroucke, R. E.; Libert, C. Is There New Hope for Therapeutic Matrix Metalloproteinase Inhibition? Nat. Rev. Drug Discovery 2014, 13, 904−927. (267) Khokha, R.; Murthy, A.; Weiss, A. Metalloproteinases and Their Natural Inhibitors in Inflammation and Immunity. Nat. Rev. Immunol. 2013, 13, 649−665. (268) Dufour, A.; Overall, C. M. Missing the Target: Matrix Metalloproteinase Antitargets in Inflammation and Cancer. Trends Pharmacol. Sci. 2013, 34, 233−242. (269) McQuibban, G. A.; Gong, J.-H.; Tam, E. M.; McCulloch, C. A. G.; Clark-Lewis, I.; Overall, C. M. Inflammation Dampened by Gelatinase A Cleavage of Monocyte Chemoattractant Protein-3. Science 2000, 289, 1202−1206. (270) McQuibban, G. A.; Butler, G. S.; Gong, J. H.; Bendall, L.; Power, C.; Clark-Lewis, I.; Overall, C. M. Matrix Metalloproteinase Activity Inactivates the CXC Chemokine Stromal Cell-Derived Factor1. J. Biol. Chem. 2001, 276, 43503−43508. (271) Zhang, K.; McQuibban, G. A.; Silva, C.; Butler, G. S.; Johnston, J. B.; Holden, J.; Clark-Lewis, I.; Overall, C. M.; Power, C. HIVInduced Metalloproteinase Processing of the Chemokine Stromal Cell Derived Factor-1 Causes Neurodegeneration. Nat. Neurosci. 2003, 6, 1064−1071. (272) Van Den Steen, P. E.; Wuyts, A.; Husson, S. J.; Proost, P.; Van Damme, J.; Opdenakker, G. Gelatinase B/MMP-9 and Neutrophil collagenase/MMP-8 Process the Chemokines Human GCP-2/ CXCL6, ENA-78/CXCL5 and Mouse GCP-2/LIX and Modulate Their Physiological Activities. Eur. J. Biochem. 2003, 270, 3739−3749. (273) Deleon-Pennell, K. Y.; Altara, R.; Yabluchanskiy, A.; Modesti, A.; Lindsey, M. L. The Circular Relationship between Matrix Metalloproteinase-9 and Inflammation Following Myocardial Infarction. IUBMB Life 2015, 67, 611−618. (274) Starr, A. E.; Dufour, A.; Maier, J.; Overall, C. M. Biochemical Analysis of Matrix Metalloproteinase Activation of Chemokines CCL15 and CCL23 and Increased Glycosaminoglycan Binding of CCL16. J. Biol. Chem. 2012, 287, 5848−5860. (275) McNab, F.; Mayer-Barber, K.; Sher, A.; Wack, A.; O’Garra, A. Type I Interferons in Infectious Disease. Nat. Rev. Immunol. 2015, 15, 87−103. (276) Marchant, D. J.; Bellac, C. L.; Moraes, T. J.; Wadsworth, S. J.; Dufour, A.; Butler, G. S.; Bilawchuk, L. M.; Hendry, R. G.; Robertson, A. G.; Cheung, C. T.; et al. A New Transcriptional Role for Matrix Metalloproteinase-12 in Antiviral Immunity. Nat. Med. 2014, 20, 493− 502. (277) Bellac, C. L.; Dufour, A.; Krisinger, M. J.; Loonchanta, A.; Starr, A. E.; auf dem Keller, U.; Lange, P. F.; Goebeler, V.; Kappelhoff, R.; Butler, G. S.; et al. Macrophage Matrix Metalloproteinase-12 Dampens Inflammation and Neutrophil Influx in Arthritis. Cell Rep. 2014, 9, 618−632. (278) Gros, P.; Milder, F. J.; Janssen, B. J. C. Complement Driven by Conformational Changes. Nat. Rev. Immunol. 2008, 8, 48−58. (279) Ricklin, D.; Reis, E. S.; Lambris, J. D. Complement in Disease: A Defence System Turning Offensive. Nat. Rev. Nephrol. 2016, 12, 383−401. (280) Ramachandran, R.; Noorbakhsh, F.; Defea, K.; Hollenberg, M. D. Targeting Proteinase-Activated Receptors: Therapeutic Potential and Challenges. Nat. Rev. Drug Discovery 2012, 11, 69−86. (281) Boire, A.; Covic, L.; Agarwal, A.; Jacques, S.; Sherifi, S.; Kuliopulos, A. PAR1 Is a Matrix Metalloprotease-1 Receptor That Promotes Invasion and Tumorigenesis of Breast Cancer Cells. Cell 2005, 120, 303−313. (282) Trivedi, V.; Boire, A.; Tchernychev, B.; Kaneider, N. C.; Leger, A. J.; O’Callaghan, K.; Covic, L.; Kuliopulos, A. Platelet Matrix

Metalloprotease-1 Mediates Thrombogenesis by Activating PAR1 at a Cryptic Ligand Site. Cell 2009, 137, 332−343. (283) Harris, J. L.; Backes, B. J.; Leonetti, F.; Mahrus, S.; Ellman, J. A.; Craik, C. S. Rapid and General Profiling of Protease Specificity by Using Combinatorial Fluorogenic Substrate Libraries. Proc. Natl. Acad. Sci. U. S. A. 2000, 97, 7754−7759. (284) Backes, B. J.; Harris, J. L.; Leonetti, F.; Craik, C. S.; Ellman, J. A. Synthesis of Positional-Scanning Libraries of Fluorogenic Peptide Substrates to Define the Extended Substrate Specificity of Plasmin and Thrombin. Nat. Biotechnol. 2000, 18, 187−193. (285) Boulware, K. T.; Daugherty, P. S. Protease Specificity Determination by Using Cellular Libraries of Peptide Substrates (CLiPS). Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 7583−7588. (286) Rano, T. A.; Timkey, T.; Peterson, E. P.; Rotonda, J.; Nicholson, D. W.; Becker, J. W.; Chapman, K. T.; Thornberry, N. A. A Combinatorial Approach for Determining Protease Specificities: Application to Interleukin-1beta Converting Enzyme (ICE). Chem. Biol. 1997, 4, 149−155. (287) Poreba, M.; Drag, M. Current Strategies for Probing Substrate Specificity of Proteases. Curr. Med. Chem. 2010, 17 (33), 3968−3995. (288) Turk, B. E.; Huang, L. L.; Piro, E. T.; Cantley, L. C. Determination of Protease Cleavage Site Motifs Using Mixture-Based Oriented Peptide Libraries. Nat. Biotechnol. 2001, 19, 661−667. (289) Tanco, S.; Lorenzo, J.; Garcia-Pardo, J.; Degroeve, S.; Martens, L.; Aviles, F. X.; Gevaert, K.; Van Damme, P. Proteome-Derived Peptide Libraries to Study the Substrate Specificity Profiles of Carboxypeptidases. Mol. Cell. Proteomics 2013, 12, 2096−2110. (290) Biniossek, M. L.; Niemer, M.; Maksimchuk, K.; Mayer, B.; Fuchs, J.; Huesgen, P. F.; McCafferty, D. G.; Turk, B.; Fritz, G.; Mayer, J.; et al. Identification of Protease Specificity by Combining ProteomeDerived Peptide Libraries and Quantitative Proteomics. Mol. Cell. Proteomics 2016, 15, 2515−2524. (291) O’Donoghue, A. J.; Eroy-Reveles, A. A.; Knudsen, G. M.; Ingram, J.; Zhou, M.; Statnekov, J. B.; Greninger, A. L.; Hostetter, D. R.; Qu, G.; Maltby, D. A.; et al. Global Identification of Peptidase Specificity by Multiplex Substrate Profiling. Nat. Methods 2012, 9, 1095−1100. (292) Schilling, O.; Overall, C. M. Proteome-Derived, DatabaseSearchable Peptide Libraries for Identifying Protease Cleavage Sites. Nat. Biotechnol. 2008, 26, 685−694. (293) Schilling, O.; Huesgen, P. F.; Barré, O.; auf dem Keller, U.; Overall, C. M. Characterization of the Prime and Non-Prime Active Site Specificities of Proteases by Proteome-Derived Peptide Libraries and Tandem Mass Spectrometry. Nat. Protoc. 2011, 6, 111−120. (294) Schilling, O.; auf dem Keller, U.; Overall, C. M. Factor Xa Subsite Mapping by Proteome-Derived Peptide Libraries Improved Using WebPICS, a Resource for Proteomic Identification of Cleavage Sites. Biol. Chem. 2011, 392, 1031−1037. (295) Vandooren, J.; Geurts, N.; Martens, E.; Van den Steen, P. E.; Opdenakker, G. Zymography Methods for Visualizing Hydrolytic Enzymes. Nat. Methods 2013, 10, 211−220. (296) Granelli-Piperno, A.; Reich, E. A Study of Proteases and Protease-Inhibitor Complexes in Biological Fluids. J. Exp. Med. 1978, 148, 223−234. (297) Hibbs, M. S.; Hasty, K. A.; Seyer, J. M.; Kang, A. H.; Mainardi, C. L. Biochemical and Immunological Characterization of the Secreted Forms of Human Neutrophil Gelatinase. J. Biol. Chem. 1985, 260, 2493−2500. (298) Sanman, L. E.; Bogyo, M. Activity-Based Profiling of Proteases. Annu. Rev. Biochem. 2014, 83, 249−273. (299) Nomura, D. K.; Dix, M. M.; Cravatt, B. F. Activity-Based Protein Profiling for Biochemical Pathway Discovery in Cancer. Nat. Rev. Cancer 2010, 10, 630−638. (300) Yan, S. J.; Blomme, E. a. G. In Situ Zymography: A Molecular Pathology Technique to Localize Endogenous Protease Activity in Tissue Sections. Vet. Pathol. 2003, 40, 227−236. (301) Keow, J. Y.; Herrmann, K. M.; Crawford, B. D. Differential in Vivo Zymography: A Method for Observing Matrix Metalloproteinase Activity in the Zebrafish Embryo. Matrix Biol. 2011, 30, 169−177. 1166

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

(302) Kolb, H. C.; Finn, M. G.; Sharpless, K. B. Click Chemistry: Diverse Chemical Function from a Few Good Reactions. Angew. Chem., Int. Ed. 2001, 40, 2004−2021. (303) Abd-Elrahman, I.; Kosuge, H.; Wises Sadan, T.; Ben-Nun, Y.; Meir, K.; Rubinstein, C.; Bogyo, M.; McConnell, M. V.; Blum, G. Cathepsin Activity-Based Probes and Inhibitor for Preclinical Atherosclerosis Imaging and Macrophage Depletion. PLoS One 2016, 11, e0160522. (304) Blum, G.; von Degenfeld, G.; Merchant, M. J.; Blau, H. M.; Bogyo, M. Noninvasive Optical Imaging of Cysteine Protease Activity Using Fluorescently Quenched Activity-Based Probes. Nat. Chem. Biol. 2007, 3, 668−677. (305) Edgington, L. E.; Verdoes, M.; Ortega, A.; Withana, N. P.; Lee, J.; Syed, S.; Bachmann, M. H.; Blum, G.; Bogyo, M. Functional Imaging of Legumain in Cancer Using a New Quenched ActivityBased Probe. J. Am. Chem. Soc. 2013, 135, 174−182. (306) Saghatelian, A.; Jessani, N.; Joseph, A.; Humphrey, M.; Cravatt, B. F. Activity-Based Probes for the Proteomic Profiling of Metalloproteases. Proc. Natl. Acad. Sci. U. S. A. 2004, 101, 10000−10005. (307) Klein, T.; Geurink, P. P.; Overkleeft, H. S.; Kauffman, H. K.; Bischoff, R. Functional Proteomics on Zinc-Dependent Metalloproteinases Using Inhibitor Probes. ChemMedChem 2009, 4, 164− 170. (308) Chan, E. W. S.; Chattopadhaya, S.; Panicker, R. C.; Huang, X.; Yao, S. Q. Developing Photoactive Affinity Probes for Proteomic Profiling: Hydroxamate-Based Probes for Metalloproteases. J. Am. Chem. Soc. 2004, 126, 14435−14446. (309) Ting, R.; Harwig, C.; auf dem Keller, U.; McCormick, S.; Austin, P.; Overall, C. M.; Adam, M. J.; Ruth, T. J.; Perrin, D. M. Toward [18F]-Labeled Aryltrifluoroborate Radiotracers: In Vivo Positron Emission Tomography Imaging of Stable Aryltrifluoroborate Clearance in Mice. J. Am. Chem. Soc. 2008, 130, 12045−12055. (310) auf dem Keller, U.; Bellac, C. L.; Li, Y.; Lou, Y.; Lange, P. F.; Ting, R.; Harwig, C.; Kappelhoff, R.; Dedhar, S.; Adam, M. J.; et al. Novel Matrix Metalloproteinase Inhibitor [18F]marimastat-Aryltrifluoroborate as a Probe for in Vivo Positron Emission Tomography Imaging in Cancer. Cancer Res. 2010, 70, 7562−7569. (311) Moore, W. M.; Spilburg, C. A. Purification of Human Collagenases with a Hydroxamic Acid Affinity Column. Biochemistry 1986, 25, 5189−5195. (312) Leeuwenburgh, M. A.; Geurink, P. P.; Klein, T.; Kauffman, H. F.; van der Marel, G. A.; Bischoff, R.; Overkleeft, H. S. Solid-Phase Synthesis of Succinylhydroxamate Peptides: Functionalized Matrix Metalloproteinase Inhibitors. Org. Lett. 2006, 8, 1705−1708. (313) Freije, J. R.; Klein, T.; Ooms, J. A.; Franke, J. P.; Bischoff, R. Activity-Based Matrix Metallo-Protease Enrichment Using Automated, Inhibitor Affinity Extractions. J. Proteome Res. 2006, 5, 1186−1194. (314) Geurink, P.; Klein, T.; Leeuwenburgh, M.; van der Marel, G.; Kauffman, H.; Bischoff, R.; Overkleeft, H. A Peptide Hydroxamate Library for Enrichment of Metalloproteinases: Towards an AffinityBased Metalloproteinase Profiling Protocol. Org. Biomol. Chem. 2008, 6, 1244−1250. (315) Freije, R.; Klein, T.; Ooms, B.; Kauffman, H. F.; Bischoff, R. An Integrated High-Performance Liquid Chromatography-Mass Spectrometry System for the Activity-Dependent Analysis of Matrix Metalloproteases. J. Chromatogr. A 2008, 1189, 417−425. (316) Hesek, D.; Toth, M.; Meroueh, S. O.; Brown, S.; Zhao, H.; Sakr, W.; Fridman, R.; Mobashery, S. Design and Characterization of a Metalloproteinase Inhibitor-Tethered Resin for the Detection of Active MMPs in Biological Samples. Chem. Biol. 2006, 13, 379−386. (317) Hesek, D.; Toth, M.; Krchnak, V.; Fridman, R.; Mobashery, S. Synthesis of an Inhibitor-Tethered Resin for Detection of Active Matrix Metalloproteinases Involved in Disease. J. Org. Chem. 2006, 71, 5848−5854. (318) Rogers, L. D.; Overall, C. M. Proteolytic Post-Translational Modification of Proteins: Proteomic Tools and Methodology. Mol. Cell. Proteomics 2013, 12, 3532−3542.

(319) Eckhard, U.; Marino, G.; Butler, G. S.; Overall, C. M. Positional Proteomics in the Era of the Human Proteome Project on the Doorstep of Precision Medicine. Biochimie 2016, 122, 110−118. (320) Klein, T.; Viner, R. I.; Overall, C. M. Quantitative Proteomics and Terminomics to Elucidate the Role of Ubiquitination and Proteolysis in Adaptive Immunity. Philos. Trans. R. Soc., A 2016, 374, 20150372. (321) O’Farrell, P. H. High Resolution Two-Dimensional Electrophoresis of Proteins. J. Biol. Chem. 1975, 250, 4007−4021. (322) Hamdan, M.; Righetti, P. G. Assessment of Protein Expression by Means of 2-D Gel Electrophoresis with and without Mass Spectrometry. Mass Spectrom. Rev. 2003, 22 (4), 272−284. (323) Viswanathan, S.; Ü nlü, M.; Minden, J. S. Two-Dimensional Difference Gel Electrophoresis. Nat. Protoc. 2006, 1, 1351−1358. (324) Guo, L.; Eisenman, J. R.; Mahimkar, R. M.; Peschon, J. J.; Paxton, R. J.; Black, R. A.; Johnson, R. S. A Proteomic Approach for the Identification of Cell-Surface Proteins Shed by Metalloproteases. Mol. Cell. Proteomics 2002, 1, 30−36. (325) Dix, M. M.; Simon, G. M.; Cravatt, B. F. Global Mapping of the Topography and Magnitude of Proteolytic Events in Apoptosis. Cell 2008, 134, 679−691. (326) Dix, M. M.; Simon, G. M.; Cravatt, B. F. Global Identification of Caspase Substrates Using PROTOMAP (Protein Topography and Migration Analysis Platform). Methods Mol. Biol. 2014, 1133, 61−70. (327) Crawford, E. D.; Seaman, J. E.; Agard, N.; Hsu, G. W.; Julien, O.; Mahrus, S.; Nguyen, H.; Shimbo, K.; Yoshihara, H. A. I.; Zhuang, M.; et al. The DegraBase: A Database of Proteolysis in Healthy and Apoptotic Human Cells. Mol. Cell. Proteomics 2013, 12, 813−824. (328) Timmer, J. C.; Enoksson, M.; Wildfang, E.; Zhu, W.; Igarashi, Y.; Denault, J.-B.; Ma, Y.; Dummitt, B.; Chang, Y.-H.; Mast, A. E.; et al. Profiling Constitutive Proteolytic Events in Vivo. Biochem. J. 2007, 407, 41−48. (329) Bland, C.; Bellanger, L.; Armengaud, J. Magnetic Immunoaffinity Enrichment for Selective Capture and MS/MS Analysis of NTerminal-TMPP-Labeled Peptides. J. Proteome Res. 2014, 13, 668− 680. (330) Bland, C.; Hartmann, E. M.; Christie-Oleza, J. A.; Fernandez, B.; Armengaud, J. N-Terminal-Oriented Proteogenomics of the Marine Bacterium Roseobacter Denitrificans Och114 Using NSuccinimidyloxycarbonylmethyl)tris(2,4,6-Trimethoxyphenyl)phosphonium Bromide (TMPP) Labeling and Diagonal Chromatography. Mol. Cell. Proteomics 2014, 13, 1369−1381. (331) Boersema, P. J.; Raijmakers, R.; Lemeer, S.; Mohammed, S.; Heck, A. J. R. Multiplex Peptide Stable Isotope Dimethyl Labeling for Quantitative Proteomics. Nat. Protoc. 2009, 4, 484−494. (332) Kuhn, K.; Thompson, A.; Prinz, T.; Müller, J.; Baumann, C.; Schmidt, G.; Neumann, T.; Hamon, C. Isolation of N-Terminal Protein Sequence Tags from Cyanogen Bromide Cleaved Proteins as a Novel Approach to Investigate Hydrophobic Proteins. J. Proteome Res. 2003, 2, 598−609. (333) Shen, P.-T.; Hsu, J.-L.; Chen, S.-H. Dimethyl Isotope-Coded Affinity Selection for the Analysis of Free and Blocked N-Termini of Proteins Using LC-MS/MS. Anal. Chem. 2007, 79, 9520−9530. (334) McDonald, L.; Robertson, D. H. L.; Hurst, J. L.; Beynon, R. J. Positional Proteomics: Selective Recovery and Analysis of N-Terminal Proteolytic Peptides. Nat. Methods 2005, 2, 955−957. (335) McDonald, L.; Beynon, R. J. Positional Proteomics: Preparation of Amino-Terminal Peptides as a Strategy for Proteome Simplification and Characterization. Nat. Protoc. 2006, 1, 1790−1798. (336) Mommen, G. P. M.; van de Waterbeemd, B.; Meiring, H. D.; Kersten, G.; Heck, A. J. R.; de Jong, A. P. J. M. Unbiased Selective Isolation of Protein N-Terminal Peptides from Complex Proteome Samples Using Phospho Tagging (PTAG) and TiO(2)-Based Depletion. Mol. Cell. Proteomics 2012, 11, 832−842. (337) Kleifeld, O.; Doucet, A.; auf dem Keller, U.; Prudova, A.; Schilling, O.; Kainthan, R. K.; Starr, A. E.; Foster, L. J.; Kizhakkedathu, J. N.; Overall, C. M. Isotopic Labeling of Terminal Amines in Complex Samples Identifies Protein N-Termini and Protease Cleavage Products. Nat. Biotechnol. 2010, 28, 281−288. 1167

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168

Chemical Reviews

Review

(338) Kleifeld, O.; Doucet, A.; Prudova, A.; auf dem Keller, U.; Gioia, M.; Kizhakkedathu, J. N.; Overall, C. M. Identifying and Quantifying Proteolytic Events and the Natural N Terminome by Terminal Amine Isotopic Labeling of Substrates. Nat. Protoc. 2011, 6, 1578−1611. (339) Gevaert, K.; Goethals, M.; Martens, L.; Van Damme, J.; Staes, A.; Thomas, G. R.; Vandekerckhove, J. Exploring Proteomes and Analyzing Protein Processing by Mass Spectrometric Identification of Sorted N-Terminal Peptides. Nat. Biotechnol. 2003, 21, 566−569. (340) Van Damme, P.; Staes, A.; Bronsoms, S.; Helsens, K.; Colaert, N.; Timmerman, E.; Aviles, F. X.; Vandekerckhove, J.; Gevaert, K. Complementary Positional Proteomics for Screening Substrates of Endo- and Exoproteases. Nat. Methods 2010, 7, 512−515. (341) Venne, A. S.; Solari, F. A.; Faden, F.; Paretti, T.; Dissmeyer, N.; Zahedi, R. P. An Improved Workflow for Quantitative N-Terminal Charge-Based Fractional Diagonal Chromatography (ChaFRADIC) to Study Proteolytic Events in Arabidopsis Thaliana. Proteomics 2015, 15, 2458−2469. (342) Venne, A. S.; Vögtle, F.-N.; Meisinger, C.; Sickmann, A.; Zahedi, R. P. Novel Highly Sensitive, Specific, and Straightforward Strategy for Comprehensive N-Terminal Proteomics Reveals Unknown Substrates of the Mitochondrial Peptidase Icp55. J. Proteome Res. 2013, 12, 3823−3830. (343) Tinnefeld, V.; Venne, A. S.; Sickmann, A.; Zahedi, R. P. Enrichment of Cross-Linked Peptides Using Charge-Based Fractional Diagonal Chromatography (ChaFRADIC). J. Proteome Res. 2017, 16, 459−469. (344) auf dem Keller, U.; Overall, C. M. CLIPPER: An Add-on to the Trans-Proteomic Pipeline for the Automated Analysis of TAILS NTerminomics Data. Biol. Chem. 2012, 393, 1477−1483. (345) Lange, P. F.; Overall, C. M. TopFIND, a Knowledgebase Linking Protein Termini with Function. Nat. Methods 2011, 8, 703− 704. (346) Fortelny, N.; Yang, S.; Pavlidis, P.; Lange, P. F.; Overall, C. M. Proteome TopFIND 3.0 with TopFINDer and PathFINDer: Database and Analysis Tools for the Association of Protein Termini to Pre- and Post-Translational Events. Nucleic Acids Res. 2015, 43, D290−297. (347) Prudova, A.; Gocheva, V.; Auf dem Keller, U.; Eckhard, U.; Olson, O. C.; Akkari, L.; Butler, G. S.; Fortelny, N.; Lange, P. F.; Mark, J. C.; et al. TAILS N-Terminomics and Proteomics Show Protein Degradation Dominates over Proteolytic Processing by Cathepsins in Pancreatic Tumors. Cell Rep. 2016, 16, 1762−1773. (348) Fortelny, N.; Pavlidis, P.; Overall, C. M. The Path of No Return–Truncated Protein N-Termini and Current Ignorance of Their Genesis. Proteomics 2015, 15, 2547−2552. (349) Lange, P. F.; Overall, C. M. Protein TAILS: When Termini Tell Tales of Proteolysis and Function. Curr. Opin. Chem. Biol. 2013, 17, 73−82. (350) Washburn, M. P.; Wolters, D.; Yates, J. R. Large-Scale Analysis of the Yeast Proteome by Multidimensional Protein Identification Technology. Nat. Biotechnol. 2001, 19, 242−247. (351) Steen, H.; Mann, M. The ABC’s (and XYZ’s) of Peptide Sequencing. Nat. Rev. Mol. Cell Biol. 2004, 5, 699−711. (352) Wilkins, M. R.; Gasteiger, E.; Tonella, L.; Ou, K.; Tyler, M.; Sanchez, J. C.; Gooley, A. A.; Walsh, B. J.; Bairoch, A.; Appel, R. D.; et al. Protein Identification with N and C-Terminal Sequence Tags in Proteome Projects. J. Mol. Biol. 1998, 278, 599−608. (353) Taouatas, N.; Heck, A. J. R.; Mohammed, S. Evaluation of Metalloendopeptidase Lys-N Protease Performance under Different Sample Handling Conditions. J. Proteome Res. 2010, 9, 4282−4288. (354) Schilling, O.; Barré, O.; Huesgen, P. F.; Overall, C. M. Proteome-Wide Analysis of Protein Carboxy Termini: C Terminomics. Nat. Methods 2010, 7, 508−511. (355) Schilling, O.; Huesgen, P. F.; Barré, O.; Overall, C. M. Identification and Relative Quantification of Native and Proteolytically Generated Protein C-Termini from Complex Proteomes: CTerminome Analysis. Methods Mol. Biol. 2011, 781, 59−69. (356) Zhang, Y.; He, Q.; Ye, J.; Li, Y.; Huang, L.; Li, Q.; Huang, J.; Lu, J.; Zhang, X. Systematic Optimization of C-Terminal Amine-Based

Isotope Labeling of Substrates Approach for Deep Screening of CTerminome. Anal. Chem. 2015, 87, 10354−10361. (357) Somasundaram, P.; Koudelka, T.; Linke, D.; Tholey, A. CTerminal Charge-Reversal Derivatization and Parallel Use of Multiple Proteases Facilitates Identification of Protein C-Termini by CTerminomics. J. Proteome Res. 2016, 15, 1369−1378. (358) Good, D. M.; Wirtala, M.; McAlister, G. C.; Coon, J. J. Performance Characteristics of Electron Transfer Dissociation Mass Spectrometry. Mol. Cell. Proteomics 2007, 6, 1942−1951.

1168

DOI: 10.1021/acs.chemrev.7b00120 Chem. Rev. 2018, 118, 1137−1168