TAILS Profiling of Proteases and Their Substrates in


Article

N-terminomics/TAILS profiling of proteases and their substrates in ulcerative colitis

Marilyn H. Gordon, Anthonia Anowai, Daniel Young, Nabangshu Das, Rhiannon I. Campden, Henna Sekhon, Zoe Myers, Barbara Mainoli, Sameeksha Chopra, Peter S. Thuy-Boun, Jayachandran Kizhakkedathu, Gurmeet Bindra, Humberto B. Jijon, Steven Heitman, Robin Yates, Dennis W. Wolan, Laura E. Edgington-Mitchell, Wallace K. MacNaughton, and Antoine Dufour

ACS Chem. Biol., Just Accepted Manuscript • DOI: 10.1021/acschembio.9b00608 • Publication Date (Web): 08 Aug 2019

Downloaded from pubs.acs.org on August 12, 2019


Figure 1.
(A) N-terminomics/TAILS workflow: healthy (N = 3; light dimethylation, 0 Da) and ulcerative colitis (N = 3; heavy dimethylation, +6 Da) samples; trypsin; N-termini enrichment by TAILS; shotgun/preTAILS LC/MS/MS and TAILS LC/MS/MS; MaxQuant.
(B) Healthy vs ulcerative colitis patients, all unique peptides (n = 6,803): TAILS, 1,642; shotgun/preTAILS, 4,753; in both, 408.
(C) Shotgun/preTAILS proteomics data. Healthy (n = 427): 3.0% acetylated, 97.0% dimethylated. Ulcerative colitis (n = 218): 4.1% acetylated, 95.9% dimethylated.
(D) Healthy (n = 299): 10.7% acetylated, 89.3% dimethylated; processing 72.7%, signal peptide 19.2%, intact methionine 6.6%, Met removal 1.4%. Ulcerative colitis (n = 262): 6.1% acetylated, 93.9% dimethylated; processing 78.1%, signal peptide 15.3%, intact methionine 4.4%, Met removal 2.1%.

Figure 2. IceLogo analysis of the TAILS data for positions P6 to P6’ (y-axis: % difference; P value = 0.05; residues shown as upregulated or downregulated). (A) Healthy, n = 265. (B) Ulcerative colitis, n = 224.

Figure 3. (A) Reactome pathway, healthy: metabolism, adherens junctions, TCA cycle. (B) Reactome pathway, ulcerative colitis: immune system, neutrophil degranulation, Toll-like receptor cascades.

Figure 4. Adherens junction proteins (E-cadherin, L-I cadherin, CTNNA1 and CTNND1): processing by proteases in healthy biopsies and degradation by proteases in ulcerative colitis biopsies.

Elevated peptides in healthy biopsies identified by N-terminomics/TAILS and shotgun/preTAILS

E-cadherin (CDH1)

MGPWSRSLSALLLLLQVSSWLCQEPEPCHPGFDAESYTFTVPRRHLERGRVLGRVNFEDC TGRQRTAYFSLDTRFKVGTDGVITVKRPLRFHNPQIHFLVYAWDSTYRKFSTKVTLNTVG HHHRPPPHQASVSGIQAELLTFPNSSPGLRRQKRDWVIPPISCPENEKGPFPKNLVQIKS NKDKEGKVFYSITGQGADTPPVGVFIIERETGWLKVTEPLDRERIATYTLFSHAVSSNGN AVEDPMEILITVTDQNDNKPEFTQEVFKGSVMEGALPGTSVMEVTATDADDDVNTYNAAI AYTILSQDPELPDKNMFTINRNTGVISVVTTGLDRESFPTYTLVVQAADLQGEGLSTTAT AVITVTDTNDNPPIFNPTTYKGQVPENEANVVITTLKVTDADAPNTPAWEAVYTILNDDG GQFVVTTNPVNNDGILKTAKGLDFEAKQQYILHVAVTNVVPFEVSLTTSTATVTVDVLDV NEAPIFVPPEKRVEVSEDFGVGQEITSYTAQEPDTFMEQKITYRIWRDTANWLEINPDTG AISTRAELDREDFEHVKNSTYTALIIATDNGSPVATGTGTLLLILSDVNDNAPIPEPRTI FFCERNPKPQVINIIDADLPPNTSPFTAELTHGASANWTIQYNDPTQESIILKPKMALEV GDYKINLKLMDNQNKDQVTTLEVSVCDCEGAAGVCRKAQPVEAGLQIPAILGILGGILAL LILILLLLLFLRRRAVVKEPLLPPEDDTRDNVYYYDEEGGGEEDQDFDLSQLHRGLDARP EVTRNDVAPTLMSVPRYLPRPANPDEIGNFIDENLKAADTDPTAPPYDSLLVFDYEGSGS EAASLSSLNSSESDKDQDYDYLNEWGNRFKKLADMYGGGEDD

Liver-intestinal cadherin (CDH17)

MILQAHLHSLCLLMLYLATGYGQEGKFSGPLKPMTFSIYEGQEPSQIIFQFKANPPAVTF ELTGETDNIFVIEREGLLYYNRALDRETRSTHNLQVAALDANGIIVEGPVPITIKVKDIN DNRPTFLQSKYEGSVRQNSRPGKPFLYVNATDLDDPATPNGQLYYQIVIQLPMINNVMYF QINNKTGAISLTREGSQELNPAKNPSYNLVISVKDMGGQSENSFSDTTSVDIIVTENIWK APKPVEMVENSTDPHPIKITQVRWNDPGAQYSLVDKEKLPRFPFSIDQEGDIYVTQPLDR EEKDAYVFYAVAKDEYGKPLSYPLEIHVKVKDINDNPPTCPSPVTVFEVQENERLGNSIG TLTAHDRDEENTANSFLNYRIVEQTPKLPMDGLFLIQTYAGMLQLAKQSLKKQDTPQYNL TIEVSDKDFKTLCFVQINVIDINDQIPIFEKSDYGNLTLAEDTNIGSTILTIQATDADEP FTGSSKILYHIIKGDSEGRLGVDTDPHTNTGYVIIKKPLDFETAAVSNIVFKAENPEPLV FGVKYNASSFAKFTLIVTDVNEAPQFSQHVFQAKVSEDVAIGTKVGNVTAKDPEGLDISY SLRGDTRGWLKIDHVTGEIFSVAPLDREAGSPYRVQVVATEVGGSSLSSVSEFHLILMDV NDNPPRLAKDYTGLFFCHPLSAPGSLIFEATDDDQHLFRGPHFTFSLGSGSLQNDWEVSK INGTHARLSTRHTEFEEREYVVLIRINDGGRPPLEGIVSLPVTFCSCVEGSCFRPAGHQT GIPTVGMAVGILLTTLLVIGIILAVVFIRIKKDKGKDNVESAQASEVKPLRS

Catenin alpha-1 (CTNNA1)

MTAVHAGNINFKWDPKSLEIRTLAVERLLEPLVTQVTTLVNTNSKGPSNKKRGRSKKAHV LAASVEQATENFLEKGDKIAKESQFLKEELVAAVEDVRKQGDLMKAAAGEFADDPCSSVK RGNMVRAARALLSAVTRLLILADMADVYKLLVQLKVVEDGILKLRNAGNEQDLGIQYKAL KPEVDKLNIMAAKRQQELKDVGHRDQMAAARGILQKNVPILYTASQACLQHPDVAAYKAN RDLIYKQLQQAVTGISNAAQATASDDASQHQGGGGGELAYALNNFDKQIIVDPLSFSEER FRPSLEERLESIISGAALMADSSCTRDDRRERIVAECNAVRQALQDLLSEYMGNAGRKER SDALNSAIDKMTKKTRDLRRQLRKAVMDHVSDSFLETNVPLLVLIEAAKNGNEKEVKEYA QVFREHANKLIEVANLACSISNNEEGVKLVRMSASQLEALCPQVINAALALAAKPQSKLA QENMDLFKEQWEKQVRVLTDAVDDITSIDDFLAVSENHILEDVNKCVIALQEKDVDGLDR TAGAIRGRAARVIHVVTSEMDNYEPGVYTEKVLEATKLLSNTVMPRFTEQVEAAVEALSS DPAQPMDENEFIDASRLVYDGIRDIRKAVLMIRTPEELDDSDFETEDFDVRSRTSVQTED DQLIAGQSARAIMAQLPQEQKAKIAEQVASFQEEKSKLDAEVSKWDDSGNDIIVLAKQMC MIMMEMTDFTRGKGPLKNTSDVISAAKKIAEAGSRMDKLGRTIADHCPDSACKQDLLAYL QRIALYCHQLNICSKVKAEVQNLGGELVVSGVDSAMSLIQAAKNLMNAVVQTVKASYVAS TKYQKSQGMASLNLPAVSWKMKAPEKKPLVKREKQDETQTKIKRASQKKHVNPVQALSEF KAMDSI

Catenin delta-1 (CTNND1)

MDDSEVESTASILASVKEQEAQFEKLTRALEEERRHVSAQLERVRVSPQDANPLMANGTL TRRHQNGRFVGDADLERQKFSDLKLNGPQDHSHLLYSTIPRMQEPGQIVETYTEEDPEGA MSVVSVETSDDGTTRRTETTVKKVVKTVTTRTVQPVAMGPDGLPVDASSVSNNYIQTLGR DFRKNGNGGPGPYVGQAGTATLPRNFHYPPDGYSRHYEDGYPGGSDNYGSLSRVTRIEER YRPSMEGYRAPSRQDVYGPQPQVRVGGSSVDLHRFHPEPYGLEDDQRSMGYDDLDYGMMS DYGTARRTGTPSDPRRRLRSYEDMIGEEVPSDQYYWAPLAQHERGSLASLDSLRKGGPPP PNWRQPELPEVIAMLGFRLDAVKSNAAAYLQHLCYRNDKVKTDVRKLKGIPVLVGLLDHP KKEVHLGACGALKNISFGRDQDNKIAIKNCDGVPALVRLLRKARDMDLTEVITGTLWNLS SHDSIKMEIVDHALHALTDEVIIPHSGWEREPNEDCKPRHIEWESVLTNTAGCLRNVSSE RSEARRKLRECDGLVDALIFIVQAEIGQKDSDSKLVENCVCLLRNLSYQVHREIPQAERY QEAAPNVANNTGPHAASCFGAKKGKDEWFSRGKKPIEDPANDTVDFPKRTSPARGYELLF QPEVVRIYISLLKESKTPAILEASAGAIQNLCAGRWTYGRYIRSALRQEKALSAIADLLT NEHERVVKAASGALRNLAVDARNKELIGKHAIPNLVKNLPGGQQNSSWNFSEDTVISILN TINEVIAENLEAAKKLRETQGIEKLVLINKSGNRSEKEVRAAALVLQTIWGYKELRKPLE KEGWKKSDFQVNLNNASRSQSSHSYDDSTLPLIDRNQKSDKKPDREEIQMSNMGSNTKSL DNNYSTPNERGDHNRTLDRSGDLGDMEPLKGTTPLMQDEGQESLEEELDVLVLDDEGGQV SYPSMQKI

Figure 5. Protease webs of neutrophil elastase (ELANE) generated with PathFINDer. (A) Healthy patients. (B) Ulcerative colitis patients. Legend: query protease; protease; inhibitor; substrate; list member; inhibition (Inh.); cleavage (position indicated).

Figure 6. Western blots of healthy and ulcerative colitis biopsies (two samples per group; molecular weights in kDa): calpain-1, calpain-2, tryptase, STAT1 and GAPDH, with quantification plotted as fold difference (healthy over UC; n.s. or * as indicated); and PARP1, cleaved PARP1, caspase-3, cleaved caspase-3 and GAPDH.

Figure 7. (A) Protease source: human 63%, bacterial 27%, fungal 7%, viral 3%. (B) Protease class: serine 51%, metallo 27%, cysteine 17%, aspartic 4%, threonine 1%. (C) Heat maps (low to high) of predicted human, bacterial, fungal and viral proteases in healthy and UC biopsies, grouped by catalytic class (aspartic, cysteine, metallo, serine, threonine) and, for microbial proteases, annotated with the source organism.

N-terminomics/TAILS profiling of proteases and their substrates in ulcerative colitis

Marilyn H. Gordon¹, Anthonia Anowai²,³, Daniel Young¹,³, Nabangshu Das¹,³, Rhiannon I. Campden²,⁴, Henna Sekhon¹,³, Zoe Myers¹,³, Barbara Mainoli¹,³, Sameeksha Chopra¹,³, Peter S. Thuy-Boun⁵, Jayachandran Kizhakkedathu⁶, Gurmeet Bindra⁷, Humberto B. Jijon⁷, Steven Heitman⁷, Robin Yates⁴, Dennis W. Wolan⁵, Laura E. Edgington-Mitchell⁸,⁹,¹⁰, Wallace K. MacNaughton¹,¹¹* and Antoine Dufour¹,²,³,¹¹*

¹Department of Physiology and Pharmacology, University of Calgary, Calgary, Alberta, Canada T2N 4N1.
²Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, Alberta, Canada T2N 4N1.
³McCaig Institute for Bone and Joint Health, University of Calgary, Calgary, Alberta, Canada T2N 4N1.
⁴Department of Comparative Biology and Experimental Medicine, University of Calgary, Calgary, Alberta, Canada T2N 4N1.
⁵Departments of Molecular Medicine and Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037, United States.
⁶Department of Pathology and Laboratory Medicine and Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada.
⁷Department of Medicine, Division of Gastroenterology, University of Calgary, Calgary, Alberta, Canada T2N 4N1.
⁸Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, Victoria, Australia.
⁹Drug Discovery Biology, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria, Australia.
¹⁰Department of Oral and Maxillofacial Surgery, New York University College of Dentistry, Bluestone Center for Clinical Research, New York, New York, USA.
¹¹Co-senior authors.

*Correspondence and material requests should be addressed to Wallace K. MacNaughton ([email protected]) or to Antoine Dufour ([email protected]).


ABSTRACT

Dysregulated protease activity is often implicated in the initiation of inflammation and immune cell recruitment in gastrointestinal inflammatory diseases. Using N-terminomics/TAILS (terminal amine isotopic labeling of substrates), we compared proteases, along with their substrates and inhibitors, between colonic mucosal biopsies of healthy patients and those with ulcerative colitis (UC). Among the 1,642 N-termini enriched using TAILS, increased endogenous processing of proteins was identified in UC compared to healthy patients. Changes in the Reactome pathways for proteins associated with metabolism, adherens junction proteins (E-cadherin, liver-intestinal cadherin, catenin alpha-1 and catenin delta-1) and neutrophil degranulation were identified between the two groups. Increased neutrophil infiltration and distinct proteases observed in ulcerative colitis may result in extensive breakdown, altered processing or increased remodeling of adherens junctions and other cellular functions. Analysis of the preferred proteolytic cleavage sites indicated that the majority of proteolytic activity and processing comes from host proteases, but that key microbial proteases may also play a role in maintaining homeostasis. Thus, the identification of distinct proteases and processing of their substrates improves the understanding of dysregulated proteolysis in normal intestinal physiology and ulcerative colitis.

INTRODUCTION

Proteases are implicated in key functions during gastrointestinal (GI) homeostasis, and dysregulated proteolysis is one of the factors contributing to gastrointestinal inflammatory diseases¹⁻³. Upon recruitment to the site of inflammation, activated immune cells produce various intra- and extracellular proteases, including neutrophil elastase (NE), matriptase, cathepsins, caspases and matrix metalloproteinases, which all play roles in modulating the host response⁴⁻⁷. The shift in the balance between regulated and dysregulated proteolysis in GI inflammation is not fully understood, and the full repertoire of proteases, their substrates and their inhibitors has not been examined in detail. Transcript analyses have been performed on samples from patients with inflammatory bowel diseases⁸, but mRNA levels often do not correlate with protein levels⁹. Furthermore, transcriptomics does not assess whether a substrate is cleaved by a protease. Thus, examination of post-translational modifications is required to complement previous transcript and proteome analyses.

The gut microbiome, consisting of a complex ecosystem of bacteria, fungi, and viruses, harbors a greater diversity of microorganisms than all other organs¹⁰. Microbiome proteases play significant roles in immunity and host homeostasis and are starting to be characterized¹¹,¹², yet the extent of their substrates and downstream effects in the intestinal environment remains elusive. A current challenge is to distinguish proteolytic processing driven by host proteases from that driven by microbiome proteases.

In ulcerative colitis (UC), the intestinal epithelial monolayer is compromised, with increased permeability and loss of cell-cell junctions, yet it remains difficult to determine whether host proteases, bacterial proteases or both cause the disruption of the adherens junctions¹³. However, antibacterial antibodies have been detected in patient serum¹⁴. Neutrophils are among the first immune cell responders and an important hallmark of ulcerative colitis, but the precise roles of their proteases and substrates remain poorly characterized³,¹⁵. Interestingly, a predominance of serine protease activity (e.g. NE, cathepsin G, thrombin, tryptase) has been detected in inflammatory bowel diseases³. Genome-wide association studies of ulcerative colitis patients identified 19 specific loci (including ECM1, HNF4, CDH1 and LAMB1) implicated in dysfunction of the epithelial barrier¹⁶. We previously reported⁶ that the disruption of adherens junctions can occur through proteolytic processing of CDH1 (E-cadherin) by NE. Yet the specific roles and weighted contributions of proteases and their substrates have not been fully characterized in human ulcerative colitis patients. Additionally, the host-microbe interactions in healthy individuals and ulcerative colitis patients are not well characterized. Here we report an N-terminomic and proteomic analysis of the host and the microbiome of ulcerative colitis patient biopsies.
RESULTS

N-terminomics/TAILS and shotgun/preTAILS analyses of healthy versus ulcerative colitis colon biopsies. Proteases play key biological functions in intestinal tissues, and enhanced proteolytic activity is typically detected in inflammatory bowel disease (IBD)¹,³. To profile the proteases and their substrates that are increased in IBD, we collected colon biopsies from three healthy individuals undergoing routine colonoscopy screening and three patients with ulcerative colitis (Table 1). These biopsies (n = 3 per group) were subjected to an N-terminomics/TAILS and pre-enrichment TAILS proteomics analysis (Supplementary Tables 1-4). Protein lysates were denatured and alkylated, and the peptide α- and ε-amines were isotopically labeled with heavy or light formaldehyde. Following trypsin digestion, negative selection against the unlabelled α-amines of trypsinized peptides was achieved by incubating the samples with a 3-fold excess of the dendritic polyglycerol aldehyde TAILS polymer. Unbound peptides were separated from polymer-bound peptides by filtration. The global proteomes were compared by a shotgun (pre-enrichment TAILS) proteomics analysis, and the N-termini were enriched using an N-terminomics/TAILS protocol (see Methods and Figure 1A)¹⁷,¹⁸. Proteins from healthy tissues were labeled with light formaldehyde (+28 Da dimethylation), while ulcerative colitis samples were labeled with heavy formaldehyde (+34 Da dimethylation). After liquid chromatography and tandem mass spectrometry (LC-MS/MS) analysis, data were analyzed using MaxQuant¹⁹ at a 1% false discovery rate (FDR), TopFIND²⁰,²¹ and STRING²² (Figure 1A). The data were not combined but were analyzed as separate, individual datasets. Shotgun/preTAILS analysis of three healthy versus three ulcerative colitis biopsies yielded 4,753 unique peptides (Andromeda false discovery rate ≤ 1%) and the TAILS analysis yielded 1,642 unique peptides, of which 408 were also identified in the preTAILS analysis (Figure 1B, Supplementary Tables 1-4 and Supplementary Figure 1). In heat maps of differentially expressed peptides, both the N-terminome and the proteome of the healthy controls clustered closely together as compared to those of ulcerative colitis patients (Supplementary Figure 2A-B). The ratios of natural N-termini (acetylated) and neo-N-termini (dimethylated) were comparable between healthy (3.0% acetylated, 97.0% dimethylated) and ulcerative colitis (4.1% acetylated, 95.9% dimethylated) in the shotgun/preTAILS data (Figure 1C).
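The nominal +28 Da and +34 Da shifts quoted above can be checked from monoisotopic atomic masses. The sketch below assumes a commonly used reagent pair for this kind of labeling (CH2O for the light channel and 13CD2O for the heavy channel, each reduced with an unlabeled hydride donor); the text here does not name the reagents, so that choice, like the function and constant names, is an assumption made for illustration.

```python
# Monoisotopic masses (Da) of the atoms involved in reductive dimethylation.
MASS = {"C": 12.000000, "13C": 13.00335484, "H": 1.00782503, "D": 2.01410178}

def dimethyl_shift(carbon: str, formaldehyde_hydrogen: str) -> float:
    """Mass added when a primary amine (-NH2) becomes a dimethylamine.

    Each methyl group brings one carbon and two hydrogens from formaldehyde
    plus one hydrogen from the reducing agent, and replaces one amine hydrogen.
    """
    per_methyl = (
        MASS[carbon]
        + 2 * MASS[formaldehyde_hydrogen]
        + MASS["H"]   # hydride from the reducing agent (assumed unlabeled)
        - MASS["H"]   # amine hydrogen that is displaced
    )
    return 2 * per_methyl

light = dimethyl_shift("C", "H")      # assumed light reagent: CH2O
heavy = dimethyl_shift("13C", "D")    # assumed heavy reagent: 13CD2O

print(f"light: +{light:.4f} Da")
print(f"heavy: +{heavy:.4f} Da")
print(f"heavy - light: {heavy - light:.4f} Da")
```

Under these assumptions the difference per labeled amine is about 6 Da, in line with the 6 Da offset between the light and heavy channels in the Figure 1A workflow.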
In the TAILS-enriched samples, dimethylated peptides were slightly increased in ulcerative colitis (93.9%) as compared to healthy samples (89.3%). In vivo N-terminal processing was slightly increased in ulcerative colitis (78.1%) over healthy (72.7%) samples (Figure 1D). Other N-terminal processing events were characterized, including intact methionine, N-terminal methionine excision and signal peptide removal (Figure 1D). Overall, using unbiased N-terminomics and proteomics analyses, distinct proteolytic signatures were identified in ulcerative colitis as compared to healthy colon biopsies.

Distinct proteases and proteolytic signatures between healthy and ulcerative colitis. Intestinal inflammation results in the infiltration of immune cells and the production of distinct proteases in the GI tract as compared to non-inflamed tissues¹,²,⁶. Our N-terminomics/TAILS survey of healthy and ulcerative colitis colon biopsies (Figure 1A-B) revealed 11 proteases and 5 protease inhibitors that were differentially expressed (Table 2). We identified 265 unique N-termini that were elevated in the healthy biopsies (Figure 2A, summarized as heat maps). The cleavage sites displayed an enrichment in alanine or valine residues at P6, glycine at P3, valine at P2 and aspartic acid or arginine at P1. The 224 N-termini found to be elevated in ulcerative colitis displayed a strong preference for aspartic acid at P5, P3 and P1 and proline at P2 (Figure 2B). At the P1 position, both healthy and inflamed tissues exhibited an enrichment in aspartic acid residues; however, there was also an increase in basic arginine residues at P1 in healthy tissue, while ulcerative colitis samples exhibited an enrichment in hydrophobic valine and isoleucine residues (Figure 2A-B). Overall, we detected distinct proteases with distinct proteolytic preferences and cleavage signatures in healthy versus ulcerative colitis biopsies.

Determination of reactome changes. To better characterize the changes in the global protein network during ulcerative colitis, we investigated the functional interactions of elevated proteins for each condition using STRING v11 (https://string-db.org). In the healthy colon biopsies, we detected 3 dominant clusters: 40 enriched proteins were involved in metabolism, 9 were involved in the citric acid (TCA) cycle and respiratory electron transport and 4 were involved in adherens junctions (Figure 3A, Supplementary Figure 3A and Supplementary Table 5). In ulcerative colitis, we found an enrichment for 22 proteins involved in immune function, 12 involved in neutrophil degranulation and 5 in Toll-like receptor cascades (Figure 3B, Supplementary Figure 3B and Supplementary Table 5).
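The positional preferences reported for Figure 2 (for example aspartic acid at P1) rest on comparing residue frequencies at each position around the cleavage site. As a rough illustration of the counting step only, and not of the IceLogo statistics, the sketch below tallies residues at P6 to P6' for a few made-up 12-residue windows. The sequences and helper names are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# Position labels for a 12-residue window centred on the scissile bond
# (cleavage occurs between P1 and P1').
POSITIONS = ["P6", "P5", "P4", "P3", "P2", "P1",
             "P1'", "P2'", "P3'", "P4'", "P5'", "P6'"]

def positional_frequencies(windows):
    """Return {position: {residue: fraction of windows}} for aligned windows."""
    counts = {pos: Counter() for pos in POSITIONS}
    for window in windows:
        if len(window) != len(POSITIONS):
            raise ValueError(f"expected {len(POSITIONS)} residues, got {window!r}")
        for pos, residue in zip(POSITIONS, window):
            counts[pos][residue] += 1
    total = len(windows)
    return {pos: {aa: n / total for aa, n in c.items()}
            for pos, c in counts.items()}

# Hypothetical windows, invented for illustration (not from the dataset).
example_windows = [
    "AGSGVDGLKTES",
    "VLDGPDSAVKQE",
    "ATEGVRALPSNK",
    "VKDDPDGTLEAR",
]

freqs = positional_frequencies(example_windows)
print(freqs["P1"])
```

A real analysis would compare these observed fractions against a background distribution (for instance the proteome-wide amino acid frequencies) before calling a residue enriched or depleted.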
In agreement with our previous work⁶, we found elevated degradation of E-cadherin (CDH1), a substrate that can be degraded by neutrophil elastase (NE), in ulcerative colitis and an increase in adherens junction proteins in healthy samples. As we previously demonstrated⁶, NE is able to cleave E-cadherin in at least 48 positions, resulting in peptides that are able to increase wound closure. Here, we identified additional proteins implicated in adherens junctions that are known to interact with E-cadherin²⁵ and are either processed or degraded: liver-intestine cadherin (CDH17), catenin alpha-1 (CTNNA1) and catenin delta-1 (CTNND1) (Figure 4). Therefore, increased neutrophil infiltration and unique protease regulation during ulcerative colitis may result in extensive breakdown or dysregulated remodeling of adherens junction proteins.


Characterization of the ulcerative colitis protease web. Post-translational proteolysis can drastically modify the function of a protein, change its cellular location or result in its total destruction²,⁵,²⁶,²⁷. Knowledge of the site where a protease cleaves its substrates is valuable, but information on the mechanistic consequences of substrate processing remains sparse in the literature. Using TopFIND²⁰ (http://clipserve.clip.ubc.ca/topfind), a protease and substrate analysis platform, we characterized all identified protein N-termini (Supplementary Table 1). Using PathFINDer²¹, we integrated the N-termini of all proteins in a protease web in which NE (ELANE) was the query protease, and compared healthy and inflamed UC tissue (Figure 5A-B). All known cleavage sites (number beside each arrow) were integrated between NE (ELANE) (blue), the protease inhibitors (red), identified hits from our N-terminomics/TAILS data (grey) and known proteolytic processing from the literature (white) (Figure 5A-B). Using TopFINDer and PathFINDer, we identified granzyme B, although not detected in our proteomic analyses, as the protease cleaving catenin delta-1 (Figure 4) between Asp161↓Gly162. In contrast, kallikrein-4 (KLK4) is known to cleave E-cadherin (CDH1) between Arg154↓Asp155, but this N-terminus was not identified in our data, suggesting that another protease, for example NE, could be directly cleaving this substrate. Other connections in the ulcerative colitis data placed NE upstream of four proteases (MMP2, MMP3, MMP9 and PLAU), suggesting that additional proteases contribute to the increased processing observed in ulcerative colitis (Figure 1D).

Validation of inflammatory proteins and proteases. Although only predictive associations can be made using PathFINDer, biological networks from published data can generate testable experimental hypotheses that can be validated using western blotting (Figure 6A-B) or mass spectrometry experiments (Figure 1A-D and Supplementary Tables 1-4). To support and validate part of our proteomics data, we verified the protein expression of selected proteins (calpain-1, calpain-2, tryptase and STAT1) using western blotting (Figure 6A). The upregulation of STAT1 is indicative of ulcerative colitis inflammatory signaling, as demonstrated previously²³. Next, we measured apoptosis levels by monitoring caspase-3 activation and poly [ADP-ribose] polymerase 1 (PARP1) cleavage and identified increased apoptosis in UC patients (Figure 6B). Therefore, PathFINDer allows for a broader, unbiased interpretation and helps in the validation of the in vivo human protease web interactions.
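A protease web of this kind can be pictured as a directed graph in which an edge points from a protease to a protein it cleaves. The sketch below is only an illustration of that idea and is not PathFINDer's actual algorithm: it runs a breadth-first search over a small edge list. The edges with cleavage positions and the NE to CDH1 edge follow statements in the text; the edges from ELANE to MMP2, MMP3, MMP9 and PLAU are drawn as direct purely as an assumption, since the text says only that NE lies upstream of them.

```python
from collections import deque

# protease -> list of (target, cleavage position or None if not given here)
EDGES = {
    "GZMB": [("CTNND1", 161)],   # between Asp161 and Gly162, from the text
    "KLK4": [("CDH1", 154)],     # between Arg154 and Asp155, from the text
    "ELANE": [
        ("CDH1", None),          # NE processing of E-cadherin, from the text
        # Assumed to be direct edges, for illustration only:
        ("MMP2", None), ("MMP3", None), ("MMP9", None), ("PLAU", None),
    ],
}

def downstream(query, edges):
    """Breadth-first search: every node reachable from `query`, with its depth."""
    depth = {query: 0}
    queue = deque([query])
    while queue:
        node = queue.popleft()
        for target, _position in edges.get(node, []):
            if target not in depth:
                depth[target] = depth[node] + 1
                queue.append(target)
    del depth[query]  # report only downstream nodes, not the query itself
    return depth

print(downstream("ELANE", EDGES))
```

Adding further edges (for example the known substrates of MMP9) would extend the search to depth 2 and beyond, which is how indirect connections between a query protease and a substrate can surface.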

Characterization and identification of potential proteolytic contributions. Using our TopFIND analysis, we compared positions P4 to P4’ of our identified N-termini peptides with the known enzymatic preferences of human, bacterial, fungal, and viral proteases in the MEROPS database (https://www.ebi.ac.uk/merops/index.shtml). This comparative analysis allowed us to identify which proteases could potentially be contributing to the generation of our identified peptide sequences (Supplementary Table 1) and to determine a relative predictive frequency. Based solely on the cleavage sites, the predicted proteolytic activity was attributed mostly to host proteases (63%), followed by bacterial, fungal, and viral sources (27%, 7%, and 3%, respectively) (Figure 7A). Of the identified potential contributing proteases, serine proteases were the most abundant (51%), followed by metalloproteases (27%), cysteine proteases (17%), aspartic proteases (4%) and threonine proteases (1%) (Figure 7B). As expected, an increase in probable contributions by host cysteine immune proteases (such as caspases) was observed in both healthy and UC patients, as well as an increase in several viral, fungal, and bacterial proteases, in particular metalloproteases from these groups (Figure 7C). The majority of the differentially processed proteins were cell structural proteins, such as actins linked to cytoskeleton remodelling (Supplementary Table 6), suggesting that the peptides generated by this altered proteolytic program likely play a role in maintaining cell-cell adhesions and barrier integrity.
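The relative predictive frequencies described above can be illustrated with a minimal sketch. It assumes that each cleavage site is represented as an eight-residue P4 to P4’ window and that each protease preference is a per-position set of allowed residues. The protease names, rules and windows below are hypothetical placeholders, not MEROPS entries and not data from this study.

```python
from collections import Counter

# Hypothetical specificity rules (NOT taken from MEROPS): one entry per
# position P4, P3, P2, P1, P1', P2', P3', P4'; None means "any residue".
RULES = {
    "protease_A": {"source": "host",
                   "pattern": [None, None, None, set("D"), None, None, None, None]},
    "protease_B": {"source": "bacterial",
                   "pattern": [None, None, None, set("VIA"), None, None, None, None]},
}


def matches(window, pattern):
    """Return True if every residue of the 8-mer window satisfies the pattern."""
    return len(window) == len(pattern) and all(
        allowed is None or residue in allowed
        for residue, allowed in zip(window, pattern)
    )


def source_frequencies(windows, rules):
    """Percentage of all (window, protease) matches attributed to each source."""
    counts = Counter(
        rule["source"]
        for window in windows
        for rule in rules.values()
        if matches(window, rule["pattern"])
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {source: 100.0 * n / total for source, n in counts.items()}


# Hypothetical P4-P4' windows; the P1 residue is the fourth character.
windows = ["AAADGAAA", "LLLVSLLL", "KKKDTKKK", "GGGRGGGG"]
freqs = source_frequencies(windows, RULES)
```

In this toy input, two windows match the host rule, one matches the bacterial rule and one matches neither, so the percentages are computed over three matches. Counting per match rather than per peptide is a design choice of the sketch; the study does not state which denominator was used.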

DISCUSSION
Proteases play key functions in gastrointestinal physiology and host metabolism, yet when dysregulated they can contribute to increased inflammation, resulting in pathologies such as ulcerative colitis. Previous transcript and protein screens cannot report on the activity status of proteases, their proteolytic substrates, or the potential actions of the resulting neo-peptides in inflamed human tissues. Our system-wide N-termini profiling of proteases, their substrates and inhibitors, together with the identification of probable proteolytic contributions, revealed distinct protease signatures in healthy versus ulcerative colitis colon biopsies. Interestingly, we identified an enrichment for D in the P1 position in both healthy and UC biopsies. In addition to D in P1, we identified an enrichment for R, N, Y and F in healthy biopsies. In UC, we found an enrichment in P1 for V, I, A and C. These amino acid preferences correspond to 243 cleavage sites representing 136 unique proteins being processed in ulcerative colitis, including proteins implicated in the recruitment/chemotaxis and degranulation of immune cells: eosinophils (eosinophil peroxidase [28]),
neutrophils (Ras-related C3 botulinum toxin substrate 1, RAC1 [29-31]; myeloperoxidase, MPO [32]; lymphocyte-specific protein 1, LSP1 [16,33,34]; grancalcin, GCA [35]) and macrophages (S100A9 [36]). In healthy patients, we identified 288 N-termini representing 162 proteins implicated in intestinal mucus homeostasis (calcium-activated chloride channel regulator 1, CLCA1 [37]; mucin-2, MUC2 [38]), adherens junction homeostasis (E-cadherin; liver-intestine cadherin, CDH17; catenin alpha-1, CTNNA1; and catenin delta-1, CTNND1) (Figure 4) and intestinal barrier function (cell surface A33 antigen, GPA33 [39,40]). Using N-terminomics, we identified differences in processing versus degradation of E-cadherin resulting in altered levels of bioactive peptides during inflammation, as we have shown previously [6], and identified three additional differentially processed substrates implicated in barrier function: liver-intestine cadherin, catenin alpha-1 and catenin delta-1. The shift in reactome priorities from metabolic and tissue barrier function in healthy individuals to inflammatory/immune responses and bacterial sensing in ulcerative colitis patients is unsurprising; however, the lack of overlap in reactomes gives potential insight into the disease pathogenesis. Of note, the enriched reactomes identified in the healthy samples were related to cellular and tissue metabolism. In recent years, increasing evidence has emerged for UC either being a ‘metabolic disease’ or having significant metabolic extraintestinal effects [41-43]; however, this has been primarily attributed to the altered nutritional state and/or altered microbial composition of UC patients. Our data suggest that this metabolic dysregulation may be in part initiated by a shift away from metabolic homeostasis regulated by host proteases to a reactome of proteases spending more cellular energy on initiating and maintaining an inflammatory immune response.
Further investigation into the unexplored area of protease influences on metabolic homeostasis and metabolic links to chronic sustained inflammation would likely give novel insight into the pathogenesis of ulcerative colitis. We identified several potential host and microbial proteases that may be contributing to the healthy or diseased states through their relative proteolytic activity and likely play a role in maintaining host homeostasis. While many studies have looked at host proteolytic activity in UC patients [3,44-48], these have been targeted studies, looking at specific subsets of proteases, and as such have been unable to examine the roles and enzymatic activities of various host and microbial proteases concurrently. We have shown that not only is there a significant change in the overall proteolytic
signatures of host and microbial proteases in ulcerative colitis, but that there also appears to be a targeted (although incompletely characterized) change in the differential post-translational proteolytic processing of key homeostatic cellular barrier proteins (E-cadherin, liver-intestine cadherin, catenin alpha-1, catenin delta-1, actins, tubulins, vimentin) and transcriptional machinery (histones, elongation factors, transcription factors) (Figure 5 and Supplementary Table 6). This suggests a potential previously unknown mechanism for cellular regulation, with proteases maintaining the gut homeostatic state through direct enzymatic processing of structural and translational proteins, with a shift to neo-N-termini peptides with potential loss- or gain-of-function bioactivities in UC. Previous studies have focused on the degradative properties of inflammatory proteases due to their significantly elevated levels in ulcerative colitis patients; however, our data suggest that inflammatory proteases may be directly regulating cellular processes in a non-destructive, targeted manner via the creation of unique peptide signatures, creating new functional reactomes and protease web interactions. Indeed, while the role of bioactive peptides from extracellular matrix proteins and their contributions to tissue repair and homeostasis has been well established in skin and cardiac studies [49-52], few studies have examined the larger contributions of small peptide molecules and proteolytic regulation to tissue repair and homeostasis in other organ systems. At least one study [44] has identified significant basement matrix remodelling in ulcerative colitis as a result of altered proteolytic activity, which is consistent with our results, and degradation of therapeutic biologic antibodies by host proteases has been shown to be a potential cause of non-response in UC patients [45], although the implications of epithelial remodelling by proteases remain to be better characterized.
Our data suggest that a better understanding of the proteolytic post-translational regulatory systems in intestinal epithelial biology is necessary to identify probable peptide biomarker signatures of UC and new therapeutic targets, as well as to identify probable non-responders to certain biologic therapeutics.

FIGURE LEGENDS
Figure 1 Proteome and N-terminome/TAILS Analyses of Healthy and Ulcerative Colitis Colon Biopsies. (A) Experimental design. Colon biopsies from healthy patients and those with ulcerative colitis (n = 3) were analyzed by TAILS and shotgun (preTAILS) proteomics analyses. Colon biopsies were labeled with light or heavy formaldehyde and analyzed using MaxQuant 1.6.0.1 [19,53]. (B) The numbers of unique and shared peptides identified with high confidence (FDR ≤ 1%) are shown. For peptides identified in each sample, see Supplementary Tables 1-3 for TAILS and Supplementary Table 4 for preTAILS. (C) Distribution of dimethylated and α-amine acetylated N-termini peptides before TAILS enrichment. (D) Left, Distribution of dimethylated and α-amine acetylated N-termini peptides after TAILS enrichment. Right, Distribution of post-translational modifications of peptides including intact methionine, methionine removed (no methionine), signal peptide removal and protease-generated neo-N-terminal processing.

Figure 2 Differentiated Proteases and Proteolytic Signatures of Healthy and Ulcerative Colitis Colon Biopsies. Left, Peptide sequence profiles of significantly changed neo-N-termini peptides in (A) healthy (n = 265) and (B) ulcerative colitis (n = 224) TAILS data using IceLogo [54]. Significantly (P value < 0.05) over-represented amino acids are shown above the x axis and under-represented residues are shown below the x axis. Right, Cleavage sites identified in (A) healthy and (B) ulcerative colitis biopsies are depicted as heat maps from P6 to P6’ residues. Green, upregulated. Red, downregulated.

Figure 3 Analysis of Protein-Protein Interaction Network by STRING v11. Elevated proteins from (A) healthy or (B) ulcerative colitis colon biopsies identified from the shotgun/preTAILS analysis were mapped using STRING v11 at a 1% false discovery rate. Colored lines between the proteins indicate different types of interaction evidence: known interactions (teal), experimentally determined (pink), predicted interactions gene neighborhood (green), gene fusions (red), gene co-occurrence (blue), text-mining (yellow), co-expression (black), protein homology (purple).

Figure 4 Analysis of Protease Substrates Involved in Adherens Junctions. Upper, diagram demonstrating the proteolytic processing of adherens junction proteins in healthy tissues versus the proteolytic degradation in ulcerative colitis tissues. Lower, peptides identified in the N-terminomics/TAILS analysis (Supplementary Tables 1-3) are shown in red and peptides identified in the shotgun/preTAILS analysis (Supplementary Table 4) are shown in green for four proteins implicated in adherens junctions: E-cadherin (blue), liver-intestine cadherin (red), catenin alpha-1 (green) and catenin delta-1 (purple).

Figure 5 Protease Web Analysis Using PathFINDer. All N-termini identified in healthy (A) or ulcerative colitis (B) colon biopsies in Figure 1A and Supplementary Tables 2-3 were analyzed using the PathFINDer analysis tool. Neutrophil elastase, ELANE (blue), was the query protease. The numbers indicate the known cleavage sites from the TopFIND database; inhibitors are shown in red, substrates in black, and known cleavage sites in grey.

Figure 6 Western Blot Validation of Proteomics Data. (A) Left, Western blot analysis for Calpain-1, Calpain-2, tryptase and STAT1 in four additional colon biopsies (healthy n = 2 and ulcerative colitis n = 2). Patient information is shown in Table 1. GAPDH loading controls and molecular weight marker positions in all blots are as shown. Right, quantification of the western blots from the left panel in comparison to GAPDH. Error bars represent SEM. Student’s t-test: *p < 0.05. n.s. = not significant. (B) Western blot analysis for PARP1 and Caspase-3 in four colon biopsies (healthy n = 2 and ulcerative colitis n = 2). Patient information is shown in Table 1. GAPDH loading controls and molecular weight marker positions in all blots are as shown.

Figure 7 Characterization and Identification of Potential Proteolytic Contributions. Analysis of the source (A) and class (B) of potential contributing proteases in human colonic biopsies using cleavage site comparison of identified N-termini to known proteolytic consensus sequences from the MEROPS database. (C) Heatmaps representing identified potential proteases and analysis of their relative probable contributions between ulcerative colitis and healthy patients, broken down by source and protease class.

METHODS
N-terminal TAILS and shotgun proteomics whole protein dimethylation. Human colon biopsies from healthy (n = 3) or ulcerative colitis patients (n = 3) (Table 1) were subjected to N-terminomics/TAILS and shotgun proteomics analysis (Supplementary Tables 1-4). All the sample biomass from the biopsies was required for the data collection. Protein lysates were diluted to final concentrations of 4 M guanidine HCl (pH 8.0), 100 mM HEPES (pH 8.0) and 10 mM DTT. Once cooled to room temperature, the samples were alkylated by incubation with a final concentration of 15 mM iodoacetamide for 20 min in the dark at room temperature. The pH was next adjusted to 6.0 with HCl. Next, to label peptide α- and ε-amines, samples were incubated for 18 h at 37 °C with isotopically heavy [40 mM 13CD2O + 20 mM NaBH3CN (sodium cyanoborohydride)] or light labels [40 mM light formaldehyde (CH2O) + 20 mM NaBH3CN], all
final concentrations. Next, proteins (combined sample mixtures: control and ulcerative colitis) were precipitated in acetone/methanol, washed 4 times in methanol and then trypsinized.

N-terminal Enrichment. Negative selection against unlabelled α-amines of trypsinized peptides was attained by incubating the samples with a 3-fold excess (w/w) of dendritic polyglycerol aldehyde polymer (http://flintbox.com/public/project/1948/) and 1 M NaBH3CN for 18 h at 37 °C, following pH adjustment to 6.0 with 1 M HCl. Ten µL of 1 M NaBH3CN was added per 1 mg of protein to ensure catalysis of the binding reaction of the polymer to free α-amines of peptides. Unbound peptides were separated from polymer-bound peptides by filtration using centrifugal filter units with 10-kDa cut-off membranes (Amicon, Millipore). The TAILS polymer was washed with 100 µL of 100 mM ammonium bicarbonate. The polymer-bound peptides on the filter were then discarded. The filtrate was collected, acidified with 100% acetic acid to pH < 3.0 and stored on C18 Stage tips until analysis.

High Performance Liquid Chromatography (HPLC) and Mass Spectrometry (MS). All liquid chromatography and mass spectrometry experiments were carried out by the Southern Alberta Mass Spectrometry (SAMS) core facility at the University of Calgary, Canada. Analysis was performed on an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Scientific) operated with Xcalibur (version 4.0.21.10) and coupled to a Thermo Scientific Easy-nLC (nanoflow Liquid Chromatography) 1200 system. Tryptic and TAILS peptides (2 μg) were loaded onto a C18 trap (75 µm × 2 cm; Acclaim PepMap 100, P/N 164946; Thermo Scientific) at a flow rate of 2 μL/min of solvent A (0.1% formic acid and 3% acetonitrile in LC-MS grade water).
Peptides were eluted using a 120 min gradient from 5 to 40% solvent B (0.1% formic acid in 80% LC-MS grade acetonitrile; 5% to 28% B in 105 min followed by an increase to 40% B in 15 min) at a flow rate of 0.3 μL/min and separated on a C18 analytical column (75 µm × 50 cm; PepMap RSLC C18; P/N ES803; Thermo Scientific). Peptides were then electrosprayed using a voltage of 2.3 kV into the ion transfer tube (300 °C) of the Orbitrap Lumos operating in positive mode. The Orbitrap first performed a full MS scan at a resolution of 120,000 FWHM to detect precursor ions with an m/z between 375 and 1575 and a +2 to +7 charge. The Orbitrap AGC (Auto Gain
Control) and the maximum injection time were set at 4 × 10^5 and 50 ms, respectively. The Orbitrap was operated using the top speed mode with a 3 s cycle time for precursor selection. The most intense precursor ions presenting a peptidic isotopic profile and having an intensity threshold of at least 5000 were isolated using the quadrupole and fragmented with HCD (30% collision energy) in the ion routing multipole. The fragment ions (MS2) were analyzed in the ion trap at a rapid scan rate. The AGC and the maximum injection time were set at 1 × 10^4 and 35 ms, respectively, for the ion trap. Dynamic exclusion was enabled for 45 s to avoid repeated acquisition of the same precursor ion with a similar m/z (± 10 ppm).

Proteomic data and bioinformatics analysis. Spectral data were matched to peptide sequences in the human UniProt protein database using the Andromeda algorithm [53] as implemented in the MaxQuant [19] software package v.1.6.0.1, at a peptide-spectrum match FDR of < 0.01. Search parameters included a mass tolerance of 20 ppm for the parent ion, 0.5 Da for the fragment ion, carbamidomethylation of cysteine residues (+57.021464 Da), variable N-terminal modification by acetylation (+42.010565 Da), and variable methionine oxidation (+15.994915 Da). N-terminal and lysine heavy (+34.063116 Da) and light (+28.031300 Da) dimethylation were defined as labels for relative quantification. The cleavage site specificity was set to semi-ArgC (search for free N-terminus) for the TAILS data and to semi-ArgC (search for only lysines) for the preTAILS data, with up to two missed cleavages allowed. Significant outlier cutoff values were determined after log2 transformation by box-and-whisker analysis using the BoxPlotR tool [55] (Supplementary Figure 1). The data were deposited in PRIDE and are fully accessible: Project Name: N-terminomics/TAILS profiling of proteases and their substrates in ulcerative colitis; Project accession: PXD014479.
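The heavy and light dimethylation label masses quoted in the search parameters can be checked from first principles. The sketch below is illustrative arithmetic rather than part of the published workflow; it assumes standard monoisotopic masses rounded to eight decimals and assumes that the hydride supplied by NaBH3CN is light (1H) in both channels, which is consistent with the reagents listed in the labeling step.

```python
# Monoisotopic masses in Da (standard values, rounded).
MASS = {"H": 1.00782503, "D": 2.01410178, "C12": 12.00000000, "C13": 13.00335484}


def dimethyl_shift(carbon, formaldehyde_hydrogen):
    """Net mass added to one amine by two rounds of reductive methylation.

    Each methylation replaces one amine hydrogen with a methyl group whose
    carbon and two hydrogens come from formaldehyde and whose third hydrogen
    is the hydride delivered by NaBH3CN (assumed to be light hydrogen).
    """
    per_methyl = (
        MASS[carbon]
        + 2 * MASS[formaldehyde_hydrogen]
        + MASS["H"]  # hydride from NaBH3CN
        - MASS["H"]  # amine hydrogen that is replaced
    )
    return 2 * per_methyl


light = dimethyl_shift("C12", "H")  # CH2O channel
heavy = dimethyl_shift("C13", "D")  # 13CD2O channel
spacing = heavy - light             # mass difference per labeled amine

print(f"light: +{light:.4f} Da, heavy: +{heavy:.4f} Da, spacing: {spacing:.4f} Da")
```

Under these assumptions the computed shifts agree with the +28.031300 Da and +34.063116 Da labels to within rounding of the atomic masses, and each labeled amine (N-terminus or lysine side chain) separates a heavy/light peptide pair by roughly 6.03 Da.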

Antibodies. Polyclonal rabbit anti-Calpain-1 domain-I large subunit and polyclonal rabbit anti-Calpain-2 domain-IV large subunit antibodies (Triple Point Biologics) were used at a dilution of 1/1,000. Polyclonal rabbit anti-tryptase FL 275 lot #C2212 (Santa Cruz Biotechnology) was used at a dilution of 1/1,000. Polyclonal rabbit anti-STAT1 9172S lot 25 (Cell Signaling) was used at a dilution of 1/1,000. Monoclonal rabbit anti-GAPDH 2118S lot 14C10 (Cell Signaling) was used at a dilution of 1/1,000. Polyclonal rabbit anti-cleaved PARP1 9544s lot D214 (Cell Signaling) was
used at a dilution of 1/1,000. Polyclonal rabbit anti-caspase-3 9622s (Cell Signaling) was used at a dilution of 1/1,000.

Gelatin Zymography. Gelatin zymography was carried out using 10% SDS-polyacrylamide gels containing 0.1% gelatin. After electrophoresis, SDS was removed by incubation of the gel with 2.5% Triton X-100, and gelatinase activity was recovered by incubation in a Tris-based buffer containing 10 mM Ca2+ for 24 h. Gels were stained with Coomassie Brilliant Blue, and enzymatic activity was detected as clear bands in the sample lanes where gelatin had been degraded.

IceLogo Generation. All significant processed peptides elevated in healthy or ulcerative colitis biopsies were visualized using the IceLogo website (https://iomics.ugent.be/icelogoserver/create) as previously published [54]. All peptides were aligned against the human proteome, which was used as the reference set.

Reactome Pathway Analysis. To identify protein-protein interactions, the STRING (Search Tool for the Retrieval of Interacting Genes) database was used to assess interconnectivity among proteins. Protein interaction relationships are encoded as networks in the STRING v11 database (https://string-db.org). Our data were analyzed using Homo sapiens as the model organism at a false discovery rate of 5%.

TopFIND and PathFINDer analysis. All TopFIND, TopFINDer and PathFINDer analyses were performed using the website http://clipserve.clip.ubc.ca/topfind/. Bioinformatics searches were performed as described previously [21].

Characterization and Identification of Potential Proteolytic Contributions. In order to identify which human, bacterial, fungal, or viral proteases could be contributing to the generation of the N-termini identified using TAILS and MaxQuant analysis, we performed TopFIND analyses on all identified peptides to determine the P4-P4’ sequences surrounding cleavage sites. Using the MEROPS database (https://www.ebi.ac.uk/merops/cgi-bin/peptidase_specificity) for substrate specificity, we matched the P4 to P4’ sequence of every peptide detected in our TAILS experiment to all proteases capable of contributing to our generated N-termini library. Relative frequencies were calculated for each protease
and aligned with the frequency of peptide occurrence in healthy and ulcerative colitis patient groups. Heatmaps and pie charts were generated to visualize these data using Microsoft Excel.

Human ethics approvals. The University of Calgary approved the research protocols for studies on human samples (approval #REB15-0957) and the research was conducted in accordance with the Declaration of Helsinki.

Data availability statement. All data are available from the authors upon reasonable request. Please contact corresponding author Antoine Dufour, email [email protected].

SUPPORTING INFORMATION
The supporting information is available free of charge via the internet at http://pubs.acs.org. Details of the proteomics data: all peptides and proteins identified in the LC-MS/MS proteomics and N-terminomics study (Supplementary Tables 1-6).
Supplementary Table 1 | N-terminomics TAILS peptides from colon biopsies identified without constraint by enzyme specificity rules during spectrum-to-sequence matching.
Supplementary Table 2 | N-terminomics TAILS peptides significantly elevated from the healthy colon biopsies.
Supplementary Table 3 | N-terminomics TAILS peptides significantly elevated from the ulcerative colitis colon biopsies.
Supplementary Table 4 | Shotgun proteomics peptides from colon biopsies identified without constraint by enzyme specificity rules during spectrum-to-sequence matching.
Supplementary Table 5 | Reactome Pathway of Shotgun proteomics data.
Supplementary Table 6 | Identification of differential proteolytic processing of proteins from N-terminomics.
Details on experimental and supplementary Figures 1-8 are shown (PDF).

COMPETING INTERESTS
The authors declare no competing financial interests.

AUTHOR INFORMATION
*Correspondence and material requests should be addressed to Wallace K. MacNaughton ([email protected]) or to Antoine Dufour ([email protected]).
ORCID
Antoine Dufour: 0000-0002-3429-4188
Author Contributions
M.G., L.E.M., W.K.M. and A.D. conceived the project and wrote the manuscript. M.G. and A.D. performed the proteomics, N-terminomics and bioinformatics analysis. R.C., N.D., A.A. and D.Y. performed the western blot analysis. M.G., P.T.B. and D.W. helped in the bacterial, viral and fungal data generation and ran the microbiome analysis. J.K. synthesized and provided the TAILS polymer. G.B., H.B.J. and S.H. provided the patient biopsies. R.Y. helped in the data interpretation.

ACKNOWLEDGMENTS
The authors thank L. Brechenmacher and the Southern Alberta Mass Spectrometry (SAMS) core facility for assistance with the LC-MS/MS analysis, and the Intestinal Inflammation Tissue Bank (IITB) at the University of Calgary for collection of human tissue and patient characteristics. We thank the Natural Sciences and Engineering Research Council (NSERC) and Crohn's and Colitis Canada for funding. LEM was supported by an Early Career Fellowship from the National Health and Medical Research Council of Australia (NHMRC, GNT1091636), a Grimwade Fellowship from the Russell and Mab Grimwade Miegunyah Fund at The University of Melbourne, and a DECRA Fellowship from the Australian Research Council (ARC, DE180100418).

ABBREVIATIONS
FDR, false discovery rate; GI, gastrointestinal; IBD, inflammatory bowel diseases; LC-MS/MS, liquid chromatography-tandem mass spectrometry; TAILS, terminal amine isotopic labeling of substrates; TCA, tricarboxylic acid (citric acid) cycle; UC, ulcerative colitis.

REFERENCES
(1) Vergnolle, N. (2016) Protease inhibition as new therapeutic strategy for GI diseases. Gut 65, 1215–1224.

(2) Edgington-Mitchell, L. E. (2016) Pathophysiological roles of proteases in gastrointestinal disease. Am. J. Physiol. Gastrointest. Liver Physiol. 310, G234–9.
(3) Denadai-Souza, A., Bonnart, C., Tapias, N. S., Marcellin, M., Gilmore, B., Alric, L., Bonnet, D., Burlet-Schiltz, O., Hollenberg, M. D., Vergnolle, N., and Deraison, C. (2018) Functional Proteomic Profiling of Secreted Serine Proteases in Health and Inflammatory Bowel Disease. Sci Rep 8, 7834.
(4) Dufour, A., and Overall, C. M. (2013) Missing the target: matrix metalloproteinase antitargets in inflammation and cancer. Trends Pharmacol. Sci. 34, 233–242.
(5) Klein, T., Eckhard, U., Dufour, A., Solis, N., and Overall, C. M. (2018) Proteolytic Cleavage-Mechanisms, Function, and “Omic” Approaches for a Near-Ubiquitous Posttranslational Modification. Chem. Rev. 118, 1137–1168.
(6) Gordon, M. H., Chauvin, A., Boisvert, F.-M., and MacNaughton, W. K. (2019) Proteolytic Processing of the Epithelial Adherens Junction Molecule E-Cadherin by Neutrophil Elastase Generates Short Peptides With Novel Wound-Healing Bioactivity. Cell Mol Gastroenterol Hepatol 7, 483–486.e8.
(7) Salvesen, G. S., and Dixit, V. M. (1997) Caspases: intracellular signaling by proteolysis. Cell 91, 443–446.
(8) Cleynen, I., Jüni, P., Bekkering, G. E., Nüesch, E., Mendes, C. T., Schmied, S., Wyder, S., Kellen, E., Villiger, P. M., Rutgeerts, P., Vermeire, S., and Lottaz, D. (2011) Genetic evidence supporting the association of protease and protease inhibitor genes with inflammatory bowel disease: a systematic review. PLoS ONE (Dubé, M.-P., Ed.) 6, e24106.
(9) Fortelny, N., Overall, C. M., Pavlidis, P., and Freue, G. V. C. (2017) Can we predict protein from mRNA levels? Nature 547, E19–E20.
(10) Gill, S. R., Pop, M., Deboy, R. T., Eckburg, P. B., Turnbaugh, P. J., Samuel, B. S., Gordon, J. I., Relman, D. A., Fraser-Liggett, C. M., and Nelson, K. E. (2006) Metagenomic analysis of the human distal gut microbiome. Science 312, 1355–1359.
(11) Xu, J. H., Jiang, Z., Solania, A., Chatterjee, S., Suzuki, B., Lietz, C. B., Hook, V. Y. H., O'Donoghue, A. J., and Wolan, D. W. (2018) A Commensal Dipeptidyl Aminopeptidase with Specificity for N-Terminal Glycine Degrades Human-Produced Antimicrobial Peptides in Vitro. ACS Chem. Biol. 13, 2513–2521.
(12) Mayers, M. D., Moon, C., Stupp, G. S., Su, A. I., and Wolan, D. W. (2017) Quantitative Metaproteomics and Activity-Based Probe Enrichment Reveals Significant Alterations in Protein Expression from a Mouse Model of Inflammatory Bowel Disease. J. Proteome Res. 16, 1014–1026.
(13) Prantera, C. (2008) What role do antibiotics have in the treatment of IBD? Nat Clin Pract Gastroenterol Hepatol 5, 670–671.
(14) Danese, S., and Fiocchi, C. (2011) Ulcerative colitis. N. Engl. J. Med. 365, 1713–1725.
(15) Brazil, J. C., Louis, N. A., and Parkos, C. A. (2013) The role of polymorphonuclear leukocyte trafficking in the perpetuation of inflammation during inflammatory bowel disease. Inflamm. Bowel Dis. 19, 1556–1565.
(16) Anderson, C. A., Boucher, G., Lees, C. W., Franke, A., D'Amato, M., Taylor, K. D., Lee, J. C., Goyette, P., Imielinski, M., Latiano, A., Lagacé, C., Scott, R., Amininejad, L., Bumpstead, S., Baidoo, L., Baldassano, R. N., Barclay, M., Bayless, T. M., Brand, S., Büning, C., Colombel, J.-F., Denson, L. A., De Vos, M., Dubinsky, M., Edwards, C., Ellinghaus, D., Fehrmann, R. S. N., Floyd, J. A. B., Florin, T., Franchimont, D., Franke, L., Georges, M., Glas, J., Glazer, N. L., Guthery, S. L., Haritunians, T., Hayward, N. K., Hugot, J.-P., Jobin, G., Laukens, D., Lawrance, I., Lémann, M., Levine, A., Libioulle, C., Louis, E., McGovern, D. P., Milla, M., Montgomery, G. W., Morley, K. I., Mowat, C., Ng, A., Newman, W., Ophoff, R. A., Papi, L., Palmieri, O., Peyrin-Biroulet, L., Panés, J., Phillips, A., Prescott, N. J., Proctor, D. D., Roberts, R., Russell, R., Rutgeerts, P., Sanderson, J., Sans, M., Schumm, P., Seibold, F., Sharma, Y., Simms, L. A., Seielstad, M., Steinhart, A. H., Targan, S. R., van den Berg, L. H., Vatn, M., Verspaget, H., Walters, T., Wijmenga, C., Wilson, D. C., Westra, H.-J., Xavier, R. J., Zhao, Z. Z., Ponsioen, C. Y., Andersen, V., Torkvist, L., Gazouli, M., Anagnou, N. P., Karlsen, T. H., Kupcinskas, L., Sventoraityte, J., Mansfield, J. C., Kugathasan, S., Silverberg, M. S., Halfvarson, J., Rotter, J. I., Mathew, C. G., Griffiths, A. M., Gearry, R., Ahmad, T., Brant, S. R., Chamaillard, M., Satsangi, J., Cho, J. H., Schreiber, S., Daly, M. J., Barrett, J. C., Parkes, M., Annese, V., Hakonarson, H., Radford-Smith, G., Duerr, R. H., Vermeire, S., Weersma, R. K., and Rioux, J. D. (2011) Meta-analysis identifies 29 additional ulcerative colitis risk loci, increasing the number of confirmed associations to 47. Nat. Genet. 43, 246–252.
(17) Kleifeld, O., Doucet, A., Auf dem Keller, U., Prudova, A., Schilling, O., Kainthan, R. K., Starr, A. E., Foster, L. J., Kizhakkedathu, J. N., and Overall, C. M. (2010) Isotopic labeling of terminal amines in complex samples identifies protein N-termini and protease cleavage products. Nat. Biotechnol. 28, 281–288.
(18) Mallia-Milanes, B., Dufour, A., Philp, C., Solis, N., Klein, T., Fischer, M., Bolton, C. E., Shapiro, S. D., Overall, C. M., and Johnson, S. R. (2018) TAILS proteomics reveals dynamic changes in airway proteolysis controlling protease activity and innate immunity during COPD exacerbations. Am. J. Physiol. Lung Cell Mol. Physiol. 571, 1089.
(19) Cox, J., and Mann, M. (2008) MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372.
(20) Lange, P. F., and Overall, C. M. (2011) TopFIND, a knowledgebase linking protein termini with function. Nat. Methods 8, 703–704.
(21) Fortelny, N., Yang, S., Pavlidis, P., Lange, P. F., and Overall, C. M. (2015) Proteome TopFIND 3.0 with TopFINDer and PathFINDer: database and analysis tools for the association of protein termini to pre- and post-translational events. Nucleic Acids Res. 43, D290–7.
(22) Szklarczyk, D., Gable, A. L., Lyon, D., Junge, A., Wyder, S., Huerta-Cepas, J., Simonovic, M., Doncheva, N. T., Morris, J. H., Bork, P., Jensen, L. J., and Mering, C. V. (2019) STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613.
(23) Schreiber, S., Rosenstiel, P., Hampe, J., Nikolaus, S., Groessner, B., Schottelius, A., Kühbacher, T., Hämling, J., Fölsch, U. R., and Seegert, D. (2002) Activation of signal transducer and activator of transcription (STAT) 1 in human chronic inflammatory bowel disease. Gut 51, 379–385.
(24) O'Sullivan, S., Gilmer, J. F., and Medina, C. (2015) Matrix metalloproteinases in inflammatory bowel disease: an update. Mediators Inflamm. 2015, 964131–19.
(25) Krakstad, B. F., Ardawatia, V. V., and Aragay, A. M. (2004) A role for Galpha12/Galpha13 in p120ctn regulation. Proc. Natl. Acad. Sci. U.S.A. 101, 10314–10319.
(26) Dufour, A., and Overall, C. M. 4.7 Rock, paper, and molecular scissors: regulating the game of extracellular matrix homeostasis, remodeling, and inflammation, in Extracellular Matrix: Pathobiology and Signaling. DE GRUYTER, Berlin, Boston.
(27) Dufour, A. (2015) Degradomics of matrix metalloproteinases in inflammatory diseases. Frontiers in Bioscience 7, 150–167.


(28) Forbes, E., Murase, T., Yang, M., Matthaei, K. I., Lee, J. J., Lee, N. A., Foster, P. S., and Hogan, S. P. (2004) Immunopathogenesis of experimental ulcerative colitis is mediated by eosinophil peroxidase. J. Immunol. 172, 5664–5675.

(29) Kato, T., Suzuki, K., Okada, S., Kamiyama, H., Maeda, T., Saito, M., Koizumi, K., Miyaki, Y., and Konishi, F. (2012) Aberrant methylation of PSD disturbs Rac1-mediated immune responses governing neutrophil chemotaxis and apoptosis in ulcerative colitis-associated carcinogenesis. Int. J. Oncol. 40, 942–950.

(30) Muise, A. M., Walters, T., Xu, W., Shen-Tu, G., Guo, C.-H., Fattouh, R., Lam, G. Y., Wolters, V. M., Bennitz, J., van Limbergen, J., Renbaum, P., Kasirer, Y., Ngan, B.-Y., Turner, D., Denson, L. A., Sherman, P. M., Duerr, R. H., Cho, J., Lees, C. W., Satsangi, J., Wilson, D. C., Paterson, A. D., Griffiths, A. M., Glogauer, M., Silverberg, M. S., and Brumell, J. H. (2011) Single nucleotide polymorphisms that increase expression of the guanosine triphosphatase RAC1 are associated with ulcerative colitis. Gastroenterology 141, 633–641.

(31) Wertheimer, E., and Kazanietz, M. G. (2011) Rac1 takes center stage in pancreatic cancer and ulcerative colitis: quantity matters. Gastroenterology 141, 427–430.

(32) de Souza, H. S. P., and Fiocchi, C. (2016) Immunopathogenesis of IBD: current state of the art. Nat Rev Gastroenterol Hepatol 13, 13–27.

(33) Liu, L., Cara, D. C., Kaur, J., Raharjo, E., Mullaly, S. C., Jongstra-Bilen, J., Jongstra, J., and Kubes, P. (2005) LSP1 is an endothelial gatekeeper of leukocyte transendothelial migration. J. Exp. Med. 201, 409–418.

(34) Petri, B., Kaur, J., Long, E. M., Li, H., Parsons, S. A., Butz, S., Phillipson, M., Vestweber, D., Patel, K. D., Robbins, S. M., and Kubes, P. (2011) Endothelial LSP1 is involved in endothelial dome formation, minimizing vascular permeability changes during neutrophil transmigration in vivo. Blood 117, 942–952.

(35) Xu, P., Roes, J., Segal, A. W., and Radulovic, M. (2006) The role of grancalcin in adhesion of neutrophils. Cell. Immunol. 240, 116–121.

(36) Zhang, X., Wei, L., Wang, J., Qin, Z., Wang, J., Lu, Y., Zheng, X., Peng, Q., Ye, Q., Ai, F., Liu, P., Wang, S., Li, G., Shen, S., and Ma, J. (2017) Suppression Colitis and Colitis-Associated Colon Cancer by Anti-S100a9 Antibody in Mice. Front Immunol 8, 1774.

(37) Nyström, E. E. L., Birchenough, G. M. H., van der Post, S., Arike, L., Gruber, A. D., Hansson, G. C., and Johansson, M. E. V. (2018) Calcium-activated Chloride Channel Regulator 1 (CLCA1) Controls Mucus Expansion in Colon by Proteolytic Activity. EBioMedicine 33, 134–143.

(38) Wenzel, U. A., Magnusson, M. K., Rydström, A., Jonstrand, C., Hengst, J., Johansson, M. E. V., Velcich, A., Öhman, L., Strid, H., Sjövall, H., Hansson, G. C., and Wick, M. J. (2014) Spontaneous colitis in Muc2-deficient mice reflects clinical and cellular features of active ulcerative colitis. PLoS ONE (Heimesaat, M. M., Ed.) 9, e100217.

(39) Pereira-Fantini, P. M., Judd, L. M., Kalantzis, A., Peterson, A., Ernst, M., Heath, J. K., and Giraud, A. S. (2010) A33 antigen-deficient mice have defective colonic mucosal repair. Inflamm. Bowel Dis. 16, 604–612.

(40) Williams, B. B., Tebbutt, N. C., Buchert, M., Putoczki, T. L., Doggett, K., Bao, S., Johnstone, C. N., Masson, F., Hollande, F., Burgess, A. W., Scott, A. M., Ernst, M., and Heath, J. K. (2015) Glycoprotein A33 deficiency: a new mouse model of impaired intestinal epithelial barrier function and inflammatory disease. Dis Model Mech 8, 805–815.

(41) Tigas, S., and Tsatsoulis, A. (2012) Endocrine and metabolic manifestations in inflammatory bowel disease. Ann Gastroenterol 25, 37–44.


(42) Yorulmaz, E., Adali, G., Yorulmaz, H., Ulasoglu, C., Tasan, G., and Tuncer, I. (2011) Metabolic syndrome frequency in inflammatory bowel diseases. Saudi J Gastroenterol 17, 376–382.

(43) Yan, Z.-X., Gao, X.-J., Li, T., Wei, B., Wang, P.-P., Yang, Y., and Yan, R. (2018) Fecal Microbiota Transplantation in Experimental Ulcerative Colitis Reveals Associated Gut Microbial and Host Metabolic Reprogramming. Appl. Environ. Microbiol. (McBain, A. J., Ed.) 84, G310.

(44) Mortensen, J. H., Manon-Jensen, T., Jensen, M. D., Hägglund, P., Klinge, L. G., Kjeldsen, J., Krag, A., Karsdal, M. A., and Bay-Jensen, A.-C. (2017) Ulcerative colitis, Crohn's disease, and irritable bowel syndrome have different profiles of extracellular matrix turnover, which also reflects disease activity in Crohn's disease. PLoS ONE (Kufer, T. A., Ed.) 12, e0185855.

(45) Biancheri, P., Brezski, R. J., Di Sabatino, A., Greenplate, A. R., Soring, K. L., Corazza, G. R., Kok, K. B., Rovedatti, L., Vossenkämper, A., Ahmad, N., Snoek, S. A., Vermeire, S., Rutgeerts, P., Jordan, R. E., and MacDonald, T. T. (2015) Proteolytic cleavage and loss of function of biologic agents that neutralize tumor necrosis factor in the mucosa of patients with inflammatory bowel disease. Gastroenterology 149, 1564–1574.e3.

(46) Trusevych, E. H., and MacNaughton, W. K. (2015) Proteases and their receptors as mediators of inflammation-associated colon cancer. Curr. Pharm. Des. 21, 2983–2992.

(47) Ranson, N., Kunde, D., and Eri, R. (2017) Regulation and Sensing of Inflammasomes and Their Impact on Intestinal Health. Int J Mol Sci 18, 2379.

(48) Muthas, D., Reznichenko, A., Balendran, C. A., Böttcher, G., Clausen, I. G., Kärrman Mårdh, C., Ottosson, T., Uddin, M., MacDonald, T. T., Danese, S., and Berner Hansen, M. (2017) Neutrophils in ulcerative colitis: a review of selected biomarkers and their potential therapeutic implications. Scand. J. Gastroenterol. 52, 125–135.

(49) Kisling, A., Lust, R. M., and Katwa, L. C. (2019) What is the role of peptide fragments of collagen I and IV in health and disease? Life Sci. 228, 30–34.

(50) Okada, M., Imoto, K., Sugiyama, A., Yasuda, J., and Yamawaki, H. (2017) New Insights into the Role of Basement Membrane-Derived Matricryptins in the Heart. Biol. Pharm. Bull. 40, 2050–2060.

(51) Ricard-Blum, S., and Vallet, S. D. (2016) Proteases decode the extracellular matrix cryptome. Biochimie 122, 300–313.

(52) Wells, A., Nuschke, A., and Yates, C. C. (2016) Skin tissue repair: Matrix microenvironmental influences. Matrix Biol. 49, 25–36.

(53) Cox, J., Neuhauser, N., Michalski, A., Scheltema, R. A., Olsen, J. V., and Mann, M. (2011) Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805.

(54) Colaert, N., Helsens, K., Martens, L., Vandekerckhove, J., and Gevaert, K. (2009) Improved visualization of protein consensus sequences by iceLogo. Nat. Methods 6, 786–787.

(55) Spitzer, M., Wildenhain, J., Rappsilber, J., and Tyers, M. (2014) BoxPlotR: a web tool for generation of box plots. Nat. Methods 11, 121–122.


Table 1. Patient information.


Sample ID | Diagnosis | Disease location | Score | Age | Sex | Medications | Smoker?

Patient Characteristics: N-terminomics/TAILS and Proteomics
HC#1 | Healthy control | Descending colon | N/A | 61 | M | Aspirin | Never
HC#2 | Healthy control | Descending colon | N/A | 56 | M | None | Former
HC#3 | Healthy control | Descending colon | N/A | 53 | M | None | Never
UC#1 | Ulcerative colitis | Sigmoid colon | Mayo 2 | 20 | M | 5-ASA, Adalimumab | Never
UC#2 | Ulcerative colitis | Right colon | Active | 37 | M | Unknown | Current
UC#3 | Ulcerative colitis | Sigmoid colon | Mayo 2 | 68 | M | 5-ASA, Infliximab, MTX, Corticosteroids | Unknown

Patient Characteristics: Western Blots
HC#4 | Healthy control | Sigmoid colon | N/A | 52 | F | None | Never
HC#5 | Healthy control | Sigmoid colon | N/A | 42 | M | None | Unknown
UC#2 | Ulcerative colitis | Right colon | Active | 37 | M | Unknown | Current
UC#4 | Ulcerative colitis | Right colon | Mayo 1 | 57 | F | 5-ASA, Infliximab | Never
UC#5 | Ulcerative colitis | Transverse colon | Mayo 3 | 53 | M | None | Never

Uniprot ID | Gene name | Protein name | Fold difference: UC vs healthy
O75078 | ADAM11 | Disintegrin and metalloproteinase domain-containing protein 11 | 0.1
P04080 | CSTB | Cystatin-B | 0.1
P15088 | CPA3 | Mast cell carboxypeptidase A | 0.2
O14773 | TPP1 | Tripeptidyl-peptidase 1 | 0.2
P09936 | UCHL1 | Ubiquitin carboxyl-terminal hydrolase | 0.2
P00995 | SPINK1 | Pancreatic secretory trypsin inhibitor | 0.2
Q9UGM5 | FETUB | Fetuin-B | 0.3
P10619 | CTSA | Lysosomal protective protein | 0.3
Q9UBR2 | CTSZ | Cathepsin Z | 0.4
Q15661 | TPSAB1 | Tryptase alpha/beta-1 | 0.4
P07339 | CTSD | Cathepsin D | 0.5
P20810 | CAST | Calpastatin | 0.5
Q9ULA0 | DNPEP | Aspartyl aminopeptidase | 0.5
P07384 | CAPN1 | Calpain-1 | 0.8
P25774 | CTSS | Cathepsin S | 0.8
Q96KP4 | CNDP2 | Cytosolic non-specific dipeptidase | 0.8
P07858 | CTSB | Cathepsin B | 0.8
Q99497 | PARK7 | Protein deglycase DJ-1 | 1.0
P25786 | PSMA1 | Proteasome subunit alpha type-1 | 1.0
P28066 | PSMA5 | Proteasome subunit alpha type-5 | 1.0
Q8TCE1 | SERPINC1 | Antithrombin-III | 1.1
P04632 | CAPNS1 | Calpain small subunit 1 | 1.1
P28070 | PSMB4 | Proteasome subunit beta type-4 | 1.2
P30740 | SERPINB1 | Leukocyte elastase inhibitor | 1.4
Q9H4A4 | RNPEP | Aminopeptidase B | Not quantified
P00734 | F2 | Thrombin | Not quantified
P01034 | CST3 | Cystatin-C | Not quantified
P04275 | VWF | Von Willebrand factor | Not quantified
P07711 | CTSL | Cathepsin L | Not quantified
P08246 | ELANE | Neutrophil elastase | Not quantified
P08311 | CTSG | Cathepsin G | Not quantified
P29466 | CASP1 | Caspase-1 | Not quantified
Q9NY33 | DPP3 | Dipeptidyl peptidase 3 | Not quantified
Q5SVL2 | CASP7 | Caspase-7 | Not quantified
P30740 | SERPINB1 | Leukocyte elastase inhibitor | Not quantified
P01023 | A2M | Alpha-2-macroglobulin | 2.2
P28838 | LAP3 | Cytosol aminopeptidase | 2.3
P01024 | C3 | Complement C3 | 2.8

Table 2. Quantitative ratios of the proteases and protease inhibitors identified in colon biopsies.


[Graphic: Proteolytic Regulation in Ulcerative Colitis. Labels: Proteases; Protease Inhibitors.]