Proteomics Analysis Reveals That Structural Proteins of the Virion


Article

Proteomics analysis reveals that structural proteins of the virion core and involved in gene expression are the main source for HLA class II ligands in vaccinia virus-infected cells Elena Lorente, Antonio J. Martín-Galiano, Eilon Barnea, Alejandro Barriga, Concepción Palomo, Juan García-Arriaza, Carmen Mir, Pilar Lauzurica, Mariano Esteban, Arie Admon, and Daniel Lopez J. Proteome Res., Just Accepted Manuscript • DOI: 10.1021/acs.jproteome.8b00595 • Publication Date (Web): 10 Jan 2019 Downloaded from http://pubs.acs.org on January 10, 2019


Journal of Proteome Research is published by the American Chemical Society, 1155 Sixteenth Street N.W., Washington, DC 20036. Copyright © American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.


Journal of Proteome Research

Proteomics analysis reveals that structural proteins of the virion core and involved in gene expression are the main source for HLA class II ligands in vaccinia virus-infected cells

Elena Lorente 1, Antonio J. Martín-Galiano 2, Eilon Barnea 3, Alejandro Barriga 1, Concepción Palomo 1, Juan García-Arriaza 4, Carmen Mir 1, Pilar Lauzurica 1, Mariano Esteban 4, Arie Admon 3, and Daniel López 1, *

From 1 Unidad de Presentación y Regulación Inmunes, and 2 Unidad de Bioinformática, Instituto de Salud Carlos III, 28220 Majadahonda (Madrid), Spain, 3 Department of Biology, Technion-Israel Institute of Technology, 32000 Haifa, Israel, 4 Department of Molecular and Cellular Biology, Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas (CSIC), 28049 Madrid, Spain.

* Correspondence to: Dr. Daniel López. Unidad de Presentación y Regulación Inmunes. Centro Nacional de Microbiología. Instituto de Salud Carlos III. 28220 Majadahonda (Madrid), Spain. Tel: +34 91 822 37 08, FAX: +34 91 509 79 19, E-mail: [email protected]


ABSTRACT

Protective cellular and humoral immune responses require previous recognition of viral antigenic peptides complexed with HLA class II molecules on the surface of antigen presenting cells. The HLA class II-restricted immune response is important for the control and clearance of poxvirus infections, including infection by vaccinia virus (VACV), the vaccine used in the worldwide eradication of smallpox. In this study, mass spectrometry analysis was used to identify VACV ligands bound to HLA-DR and -DP class II molecules present on the surface of VACV-infected cells. Twenty-six naturally processed viral ligands were identified among the tens of thousands of cell peptides bound to HLA class II proteins. These viral ligands arose from nineteen parental VACV proteins: A4, A5, A18, A35, A38, B5, B13, D1, D5, D7, D12, D13, E3, E8, H5, I2, I3, J2, and K2. The majority of these VACV proteins yielded one HLA ligand, and the ligands were generated mainly, but not exclusively, by the classical HLA class II antigen processing pathway. Medium-sized and abundant proteins from the virion core and/or involved in viral gene expression were the major source of VACV ligands bound to HLA-DR and -DP class II molecules. These findings will help to understand the effectiveness of current poxvirus-based vaccines and will be important in the design of new ones.

KEYWORDS Antigen processing, bioinformatics tools, cellular immune response, HLA, immunoproteomics, mass spectrometry, peptides, T cells, vaccine, virus.


INTRODUCTION

Vaccination is the most effective tool to prevent infectious diseases. Currently, most licensed vaccines derive from live, attenuated, or inactivated pathogens, and were developed by empirical approaches 1. Molecular knowledge of how current vaccines generate protective immune responses is critical to improve the safety, immunogenicity and effectiveness of existing vaccines, as well as for the rational design of new ones against emergent and old pandemic infectious diseases (e.g., AIDS, malaria or tuberculosis). The immune system utilizes the cellular protein degradation machinery to kill pathogens. For this, the human leukocyte antigen (HLA) class I and II antigen processing pathways are a key element in the identification of non-self determinants by CD8+ and CD4+ T lymphocytes, respectively. Proteolytic degradation by proteasomes and other cytosolic proteases of newly synthesized cellular and viral proteins whose sequence or folding is defective generates short peptides (8-10 residues long) that, after translocation to the endoplasmic reticulum lumen by transporters associated with antigen processing, bind and stabilize the newly synthesized HLA class I molecules. These HLA class I/peptide complexes are then exported to the cell surface for cytotoxic CD8+ T cell recognition 2. In contrast, newly synthesized HLA class II molecules from antigen presenting cells are inserted in the endoplasmic reticulum and transported to endosomal compartments. Next, these compartments fuse with late endosomes, which contain viral particles and/or exogenous host proteins that were previously internalized by phagocytosis, pinocytosis, or endocytosis, and processed by several resident cathepsins from lysosomes.
The binding of cellular and viral peptides of different lengths, but much longer (up to 30 residues) than those assembled with HLA class I molecules, stabilizes these nascent HLA class II/peptide complexes and allows for their subsequent transport to the cell membrane, where they are exposed for T helper cell recognition 3. The interaction of the receptor of the CD4+ T helper cells with these HLA class II/peptide complexes present on the surface of the antigen presenting cells is the key event that triggers the activity of T helper cells, initiating, suppressing or regulating the different components of the adaptive immune response. In the absence of a proper HLA class II-restricted helper response, neither the cytotoxic nor the humoral immune component can be efficiently activated: CD8+ T lymphocytes cannot eliminate infected cells, antibodies are not generated and, thus, the pathogen evades the immune response with fatal results for the host.

The identification of viral HLA ligands by mass spectrometry analysis contributes to a better understanding of the cellular antiviral immune response. This experimental approach has two main advantages. First, this direct characterization of peptides bound to HLA molecules in infected cells provides unbiased information about natural ligands presented to T cells. Second, for large-genome viruses, such as vaccinia virus (VACV), which encode complex proteomes of more than 200 proteins, classical overlapping synthetic peptide analyses are both very expensive and technically unfeasible in practice. Moreover, using mass spectrometry analyses, more than one hundred VACV peptides were previously identified bound to several HLA class I molecules in different studies 4-8. Nevertheless, only three natural ligands bound to HLA-DR class II molecules were previously isolated from VACV-infected cells 9. Therefore, the identification of natural viral ligands that are presented by several frequent HLA-DR and -DP class II molecules in VACV-infected cells would be of great interest to analyze how the immune system selects HLA class II ligands, knowledge that is applicable to vaccine design.

In this work, we used immunopeptidomics analysis of HLA class II ligands isolated from large amounts of VACV-infected cells, without any methodological bias (e.g., selection of an individual protein, or usage of HLA consensus scoring algorithms). We found twenty-two HLA-DR and four HLA-DP-bound natural peptides, derived from nineteen VACV proteins, which were processed and presented in VACV-infected cells.


EXPERIMENTAL PROCEDURES

Cell lines

Two different Epstein–Barr virus (EBV)-immortalized B cell lymphoblastoid lines were used. The homozygous HOM-2 cell line expresses HLA-DR B1*0101 and HLA-DP B1*0401 chains. The heterozygous JY cell line expresses four HLA-DR chains (B1*0404, B4*0101, B1*1301, and B3*0101) and two HLA-DP chains (B1*0201 and B1*0401). Both human cell lines were cultured in RPMI 1640 supplemented with 7% fetal bovine serum.

Infection of the cell lines by VACV

The VACV Western Reserve strain expressing the chikungunya virus structural genes, previously described 10, was utilized to infect 1 × 10^9 HOM-2 or JY cells at a multiplicity of infection of 3 plaque-forming units/cell in 100 ml for 2 h at 37ºC; the cells were then washed with PBS, as previously described 7. These conditions were previously determined to be optimal for infecting all cells without affecting cell viability. Next, the cells were cultured for 4 h at 37ºC and stained with the Omnitope antiserum-FITC, which recognizes purified VACV virions. Samples were analyzed by fluorescence-activated cell sorting (FACS) to confirm VACV infection (1020 ± 87 mean fluorescence intensity [MFI] in VACV-infected cells versus 422 ± 48 in non-infected cells stained with the anti-VACV antiserum). The cells were then frozen in the presence of phenylmethanesulfonylfluoride (PMSF) until used.

HLA-bound peptide isolation

HLA-bound peptides were isolated from a total of 4 or 2 × 10^10 uninfected or VACV-infected (6 hours) HOM-2 or JY cells, respectively. The cells were lysed in 1% CHAPS (Sigma), 20 mM Tris/HCl buffer, and 150 mM NaCl, pH 7.5, in the presence of the cOmplete™, Mini Protease Inhibitor Cocktail (Merck KGaA, Darmstadt,
Germany). After centrifugation, the supernatant was first passed through a control precolumn containing CNBr-activated Sepharose 4B (GE Healthcare, Buckinghamshire, UK) to remove non-specific peptides and proteins. Next, the HLA-DR/peptide or -DP/peptide complexes were isolated sequentially via affinity chromatography from the soluble cell extract fraction with the L243 (HB55) 11 or B7.21 12 monoclonal antibodies (mAbs), which are specific for monomorphic pan-HLA-DR or -DP class II determinants, respectively. The HLA-bound peptides were eluted at 4ºC with 0.1% aqueous trifluoroacetic acid (TFA), separated from the HLA molecules, and concentrated by ultrafiltration with a Centricon 3 filter (Amicon, Beverly, MA), as previously described 13.

Electrospray-ion trap mass spectrometry analysis

Peptide mixtures recovered after the ultrafiltration step were concentrated using Micro-Tip reversed-phase columns (C18, 200 µl, Harvard Apparatus, Holliston, MA) 13. Each C18 tip was equilibrated with 80% acetonitrile in 0.1% TFA, washed with 0.1% TFA, and then loaded with the peptide mixture. The tip was then washed with an additional volume of 0.1% TFA, and the peptides were subsequently eluted with 80% acetonitrile in 0.1% TFA. Lastly, the peptide samples were concentrated to approximately 20 µl using vacuum centrifugation 7, 13.

The HLA class II peptides recovered by immunoprecipitation with the HLA-DR- or -DP-specific mAbs were analyzed by nanoLC-MS/MS using a Q-Exactive-Plus mass spectrometer fitted with an Ultimate 3000 RSLC nanocapillary UHPLC (Thermo Fisher Scientific, Waltham, MA), using the same parameters previously described 14. The peptides were resolved on homemade Reprosil C18-Aqua capillary columns (75 micron ID, Dr. Maisch GmbH, Ammerbuch-Entringen, Germany) 15 with a 5-28% acetonitrile linear gradient for 2 h in the presence of 0.1% formic acid at a flow rate of 0.15 μL/min. The dynamic exclusion was set to 20 sec and the automatic gain
control value for the full MS was set to 3 × 10^6. The selected masses were fragmented from the survey scan of mass-to-charge ratio (m/z) 300-1,800 at a resolution of 70,000. The 10 most intense masses were selected for fragmentation by higher-energy collisional dissociation (HCD) from each full mass spectrum. No fragmentation was performed for peptides with unassigned precursor ion charge states or charge states of five and above. MS/MS spectra were acquired with a resolution of 17,500 at m/z 200. The target value of the MS/MS was set to 1 × 10^5 and the isolation window to 1.8 m/z. The maximum injection time was set to 100 ms and the normalized collision energy to 25 eV. The peptide match option was set to Preferred. Fragmented masses were dynamically excluded from further selection for fragmentation for 20 sec.

Database searches

Raw mass spectrometry data were processed using Peaks 8.5 (Bioinformatics Solutions Inc., Waterloo, Canada) for peak-list generation from the nanoLC-MS/MS data. The peptides were identified with Peaks 8.5 using the human and VACV parts of the UniProt/Swiss-Prot database (November 2017), which included 20267 and 217 proteins, respectively. The search was not limited by enzymatic specificity, and the peptide mass and fragment ion tolerances were set at 5 and 20 ppm, respectively. This search was not limited by any methodological bias (e.g., individual protein selection or HLA consensus scoring algorithm use). The following variable modifications were also analyzed: acetylation of the N-terminal residue, cysteinylation (C), oxidation (M, P, H, F, Y, and W), methylation (C, and N-terminus), and phosphorylation (Y, S, and T). Identified peptides were selected when their -10lgP score from Peaks 8.5 was > 21. In addition, the maximum false discovery rate (FDR) was set to 1%. No viral peptides were found in a search of the reversed database.
When the MS/MS spectra fitted more than one peptide, only the highest scoring peptide was selected. The mass spectrometry data have been deposited to the MassIVE repository (https://massive.ucsd.edu) with the data set identifier MSV000082670.

In silico binding prediction of HLA-DR and -DP ligands

The predicted binding of each peptide to HLA-DR or -DP class II molecules was calculated using the artificial neural network-based alignment method NetMHCIIpan (version 3.2) (available at http://www.cbs.dtu.dk/services/NetMHCIIpan/) 16
and the MHC-II binding prediction from IEDB (available at http://tools.iedb.org/mhcii/) 17.

Functional bioinformatics procedures

Information about all 260 proteins from the VACV Western Reserve strain was downloaded from UniProt 18 and clustered on a 90% identity basis using CD-HIT 19 for a final number of 231 proteins. Gene ontology terms 20 for the "Cellular component" and "Biological process" sections were downloaded from UniProt. The isoelectric point and GRAVY index were calculated using the ProtParam tool of ExPASy 21. The GRAVY index was calculated using the Kyte and Doolittle hydrophobicity scale 22.

Experimental design and statistical rationale

Several previous analyses have shown that 2 × 10^10 virus-infected human cells are sufficient to identify the HLA-bound viral peptide pools 7, 13, 23, 24. A precolumn was utilized to remove non-specific binding proteins and peptides. Similar amounts of uninfected human cells were used as a negative control to discriminate viral and cellular peptides (included in proteome databases as well as unknown peptides) and to exclude erroneous assignments of viral peptides. Also, two synthetic peptides were purchased from Peptide 2.0 (Chantilly, VA, USA) and their MS/MS spectra were used to confirm the assigned sequences.


A schematic representation of the experimental design is shown in Figure 1. Additionally, the FDR was estimated by searching against a database containing the reversed viral and human sequences. To analyze the statistical significance of the assays, chi-square and unpaired Student’s t-tests were used. P values < 0.05 were considered to be statistically significant.
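The reversed-database FDR estimate can be illustrated with a minimal sketch. It assumes a plain target-decoy ratio (decoy hits divided by target hits above a score cutoff), which may differ from the calculation that Peaks performs internally. The function name `decoy_fdr`, the example scores, and the decoy flags are hypothetical and are not taken from the deposited data set.

```python
from typing import List, Tuple

# Each peptide-spectrum match (PSM) is (score, is_decoy).
# is_decoy is True when the match came from the reversed database.
PSM = Tuple[float, bool]


def decoy_fdr(psms: List[PSM], threshold: float) -> float:
    """Estimate the FDR as decoy hits / target hits among PSMs above threshold."""
    targets = sum(1 for score, is_decoy in psms if score > threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score > threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Hypothetical scores, for illustration only.
example_psms: List[PSM] = [
    (45.0, False),
    (38.2, False),
    (30.1, False),
    (25.4, False),
    (22.3, True),
    (19.8, False),
    (18.0, True),
]

# Score cutoff of 21, as used for peptide selection in the text.
fdr_at_21 = decoy_fdr(example_psms, 21.0)
```

In this toy input four target matches and one decoy match pass the cutoff, so the estimate is far above the 1% limit used in the study; in practice the cutoff would be raised until the estimate falls below the chosen maximum.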


RESULTS

Physiological processing generated multiple different viral HLA-DR ligands in human VACV-infected cells.

The HLA-DR-bound peptide pools were isolated from large numbers of either uninfected or VACV-infected cells from two different human cell lines expressing HLA-DR molecules (HOM-2 and JY cells). These peptide mixtures were subsequently separated by capillary reversed-phase HPLC and analyzed on-line using tandem mass spectrometry. By means of bioinformatics tools, different fragmentation spectra detected in the VACV-infected HLA-bound peptide pools (but absent from both control uninfected pools) were resolved with high confidence parameters as VACV protein-derived peptides (Table 1 and Supplementary Fig. 1). Additionally, a human proteome database search failed to identify any of these spectra as human protein fragments, supporting the viral origin of these ligands. Eleven different VACV proteins (A5, A18, A35, B13, D5, D7, E3, E8, H5, J2, and K2) displayed individual ligands bound to HLA-DR1 molecules from the HOM-2 cell line (Table 1). In addition, three other VACV proteins yielded several HLA-DR1 ligands (Table 1). First, two viral sequences, an 18-mer and a 14-mer, which span amino acid residues 1-18 and 29-42 of the VACV A4 protein, respectively, were also identified as natural peptides bound to this HLA class II allele. Second, four ligands from the VACV D13 protein were also identified in the HLA-DR1-bound peptide pool: one individual peptide, which spans residues 1-18, and a nested set of three natural peptides spanning residues 413-434, 418-434 and 426-434. Finally, the VACV I3 protein also generated four HLA-DR1 ligands: two individual peptides spanning residues 190-199 and 246-269 and a nested set of two natural peptides spanning residues 172-183 and 172-185. Nested sets are commonly identified among both cellular and viral ligands bound to HLA class II molecules 24, 25, 26. In addition, one ligand, which spans residues 21-35 from the VACV I2 protein, was identified in the HLA-DR-bound peptide pool from the heterozygous (HLA-DR4, -DR13) JY cell line.


Therefore, these results indicate that a total of twenty-two HLA-DR ligands from fifteen different VACV proteins, five of which were included in two different nested sets of peptides from the D13 and I3 proteins, were processed and presented in VACV-infected human cells.

Four viral HLA-DP ligands were processed in the human VACV-infected cells.

As with HLA-DR, the HLA-DP-bound peptide pools were isolated from either uninfected or VACV-infected HOM-2 and JY cells. Four fragmentation spectra present in the HLA-DP-bound peptide pools, but absent in the uninfected pools, were also resolved as peptides from the VACV B5, D1, D12, and A38 proteins (Table 1). In the HOM-2 cell line, the VACV B5, D1 and D12 proteins yielded HLA-DP4 ligands spanning residues 78-92, 620-629 and 174-183, respectively. Furthermore, in the JY cell line a viral 11-mer sequence spanning residues 221-231 of the VACV A38 protein, bound to HLA-DP2 or HLA-DP4 molecules, was identified. A search of the human proteome database failed to identify these spectra as human protein fragments, supporting the viral origin of these HLA-DP-bound peptides.

Collectively, a total of twenty-two HLA-DR and four HLA-DP-bound natural peptides from nineteen viral proteins were processed and presented in VACV-infected cells. These ligands ranged from 9 to 24 residues in length, with an average length of 14 residues, within the standard HLA class II size range.
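The ligand lengths and nested sets described above follow directly from the residue spans. The sketch below covers only the spans quoted in the running text (ligands whose spans appear only in Table 1 are omitted). It assumes that a nested set can be approximated as a group of overlapping spans within one protein; the names `SPANS`, `length`, and `nested_sets` are illustrative and are not part of the published analysis.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive (first, last) residue positions

# Residue spans quoted in the Results text.
SPANS: Dict[str, List[Span]] = {
    "A4": [(1, 18), (29, 42)],
    "D13": [(1, 18), (413, 434), (418, 434), (426, 434)],
    "I3": [(172, 183), (172, 185), (190, 199), (246, 269)],
    "I2": [(21, 35)],
    "B5": [(78, 92)],
    "D1": [(620, 629)],
    "D12": [(174, 183)],
    "A38": [(221, 231)],
}


def length(span: Span) -> int:
    """Peptide length for an inclusive residue span."""
    first, last = span
    return last - first + 1


def nested_sets(spans: List[Span]) -> List[List[Span]]:
    """Group overlapping spans; only groups with two or more members are returned."""
    groups: List[List[Span]] = []
    current: List[Span] = []
    current_end = -1
    for span in sorted(spans):
        if current and span[0] <= current_end:
            # Overlaps the running group: extend it.
            current.append(span)
            current_end = max(current_end, span[1])
        else:
            # Start a new group, keeping the previous one if it was nested.
            if len(current) > 1:
                groups.append(current)
            current = [span]
            current_end = span[1]
    if len(current) > 1:
        groups.append(current)
    return groups


all_lengths = [length(s) for spans in SPANS.values() for s in spans]
shortest, longest = min(all_lengths), max(all_lengths)
```

For the spans listed here, `nested_sets` recovers the three-member D13 set and the two-member I3 set, and the shortest and longest peptides match the 9 to 24 residue range reported in the text.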

Theoretical binding affinity of the viral peptides and allele assignment.

Prediction of HLA peptide binding by bioinformatics software is routinely utilized to select potential candidates for viral ligands. Therefore, to analyze the accuracy of the algorithms, the binding of the VACV ligands to each HLA-DR or -DP molecule was predicted using two different computational approaches, the NetMHCIIpan neural network-based alignment method and the IEDB prediction software.

The homozygous HOM-2 cell line expresses HLA-DR B1*0101 and HLA-DP B1*0401 chains. First, the NetMHCIIpan method predicted that six ligands, A5(44-60), A35(7-21), D5(269-284), H5(9-25), J2(118-133), and K2(74-89), could be bound with different affinities to the HLA-DR1 molecules expressed by the HOM-2 cell line. However, this software failed to predict the binding of the other fifteen ligands presented by HLA-DR1 in VACV-infected cells. The consensus approach using the IEDB prediction software showed similar results, with the A5(44-60), A35(7-21), H5(9-25), J2(118-133), and K2(74-89) ligands predicted as HLA-DR1-bound peptides. In contrast, the A18(445-453) natural peptide, but not the D5(269-284) ligand, was also positive by this computational approach. The other fifteen natural ligands generated in VACV-infected cells showed low binding scores with the IEDB HLA class II prediction tool. In addition, for HLA-DP molecules NetMHCIIpan predicted high-affinity binding to the B1*0401 chain for B5(78-92), but not for D1(620-629) and D12(174-183). Using the IEDB prediction software, the B5(78-92) and D12(174-183) peptides, but not the D1(620-629) peptide, showed high percentile rank values.

Unlike the HOM-2 cells, the heterozygous JY cell line expresses four HLA-DR chains: B1*0404, B4*0101, B1*1301, and B3*0101. Thus, prediction analysis of HLA binding of the I2(21-35) ligand was carried out. The NetMHCIIpan method predicted that this viral ligand could be bound with different affinities to the four HLA-DR molecules expressed by the JY cell line. The IEDB tool also predicted high affinity values for the B1*0404, B1*1301, and B3*0101 chains, but not for the B4*0101 molecule. Furthermore, both computational approaches predicted the binding of the A38(221-231) ligand to both HLA-DP chains expressed by the JY cell line: B1*0201 and B1*0401. Moreover, both bioinformatics tools predicted that the I2(21-35) and A38(221-231) ligands, identified by mass spectrometry in this study, could bind up to four different HLA-DR and the two HLA-DP molecules expressed by the JY cell line, respectively. This promiscuity in HLA class II binding, far from being an exception, could be relatively common 24, 27.
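The agreement between the two predictors for the HLA-DR1 ligands can be tallied with simple set operations. This is a minimal sketch that uses only the ligands named in the text, written as protein(first-last residue); the total of 21 HLA-DR1 ligands is taken from the six predicted plus fifteen missed ligands stated above, and the variable names are illustrative.

```python
# HLA-DR1 ligands predicted as binders by each tool, as named in the text.
netmhciipan = {
    "A5(44-60)", "A35(7-21)", "D5(269-284)",
    "H5(9-25)", "J2(118-133)", "K2(74-89)",
}
iedb = {
    "A5(44-60)", "A35(7-21)", "H5(9-25)",
    "J2(118-133)", "K2(74-89)", "A18(445-453)",
}

# Six predicted plus fifteen missed ligands, per the text.
TOTAL_DR1_LIGANDS = 21

predicted_by_both = netmhciipan & iedb      # ligands both tools agree on
predicted_by_either = netmhciipan | iedb    # ligands found by at least one tool
missed_by_both = TOTAL_DR1_LIGANDS - len(predicted_by_either)

# Fraction of the natural HLA-DR1 ligands each tool recovered.
recall_netmhciipan = len(netmhciipan) / TOTAL_DR1_LIGANDS
recall_iedb = len(iedb) / TOTAL_DR1_LIGANDS
```

Under these counts each tool recovers fewer than a third of the natural HLA-DR1 ligands, and combining both tools adds only one ligand to either tool used alone.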

The HLA class II ligandome from non-infected and VACV-infected cells.

As for the identification of VACV ligands, an MS analysis against a human proteome database resolved 12110 fragmentation spectra as peptide sequences from 1808 human cellular proteins bound to HLA-DR or -DP molecules from both cell lines utilized in this study (manuscript in preparation). None of these human sequences was assigned to the fragmentation spectra associated with the viral ligands identified in the VACV proteome (Table 1). The peptide size of cellular HLA class II ligands followed a normal distribution, with an average length of 14.8 ± 3.3 (mean ± SD) residues, within the standard HLA class II size range 24, 28. This value was very similar to the length of the VACV ligands identified in the current study (14.2 ± 3.9 residues) (Table 1). Alignment with Gibbs Clustering (http://www.cbs.dtu.dk/services/GibbsCluster/) showed that more than 80% of the 8369 HLA-DR and 3741 HLA-DP class II ligands from the host identified by mass spectrometry clustered according to the binding motifs previously described for the respective alleles 29, 30. The NetMHCIIpan method predicted that half of these cellular ligands could be bound with significant affinities to the respective HLA class II molecules from the homozygous HOM-2 cell line. Moreover, a prediction of the organelle location of these 1808 proteins based on the Gene Ontology database (http://www.geneontology.org) showed that membrane, vesicles, and extracellular space, in this order, were the most relevant compartments identified. In addition, other organelles also associated with the endocytic pathway, such as the endoplasmic reticulum and Golgi, were other relevant sources of cellular HLA class II ligands. In total, proteins located in these five compartments of the endo-lysosomal pathway were predominant (74% of total), as expected and in agreement with previous studies 24, 28.

Conservation of ligands among several Orthopoxviruses.

The sequences of the twenty-six VACV ligands bound to HLA-DR or -DP class II molecules identified by mass spectrometry were aligned with homologs from the two human pathogenic orthopoxviruses responsible for smallpox (variola major and variola minor) and from another related poxvirus (cowpox). Half of the HLA class II VACV ligands were fully conserved among these orthopoxviruses, whereas the other thirteen showed at least one residue change (Table 2).

The parental and non-parental VACV proteins for HLA class II ligands were different in length, but not in their hydrophobicity or isoelectric point.

Using a mass spectrometry approach similar to ours, three HLA-DR1 ligands derived from the I6, D6, and A10 VACV proteins were previously identified 9. These viral peptides, along with the twenty-two HLA-DR and four HLA-DP ligands identified in the current report, bring the total number of VACV HLA class II ligands to twenty-nine. These ligands, identified in human VACV-infected cells, belong to twenty-two viral proteins encoded by the VACV genome. Thus, we next analyzed the major features of the twenty-two VACV proteins that yield HLA class II viral ligands (termed parental VACV proteins), and compared them with the remaining 209 VACV proteins that do not yield any HLA class II viral ligand (termed non-parental). First, no relevant differences were observed in hydrophobicity (Fig. 2A) or isoelectric point (Fig. 2B) between parental and non-parental VACV proteins for HLA class II ligands. However, we found a difference in the average length of the two sets of VACV proteins: 333 ± 202 residues for parental proteins, whereas the remaining 209 VACV proteins were significantly smaller, with a mean of 241 ± 207 residues (P < 0.05, Student's t-test) (Fig. 2C).
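As a rough consistency check, the reported summary statistics can be fed into an unpaired t statistic. The sketch below assumes that the ± values are standard deviations, that the group sizes are the 22 parental and 209 non-parental proteins given in the text, and that an equal-variance (pooled) Student's t-test was used; the function name `pooled_t` is illustrative and the published test was run on the per-protein lengths, not on these summaries.

```python
import math
from typing import Tuple


def pooled_t(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> Tuple[float, int]:
    """Equal-variance two-sample t statistic and degrees of freedom from summary data."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Parental: 333 +/- 202 residues (n = 22); non-parental: 241 +/- 207 residues (n = 209).
t_stat, dof = pooled_t(333, 202, 22, 241, 207, 209)
```

Under these assumptions the statistic comes out just under 2 with 229 degrees of freedom. The two-sided 5% critical value at that many degrees of freedom is roughly 1.97, so the result is compatible with the reported P < 0.05, although only marginally.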

Medium-sized VACV proteins were a relevant source for HLA class II ligands.

A more detailed analysis of VACV protein length distribution trends was carried out by binning the VACV proteome into three roughly equivalent groups termed small (<150), medium (150-300), and large (>300) proteins. Proteins larger than 150 residues showed a statistically significant enrichment as a source of HLA class II ligands versus the rest of the viral proteome (P < 0.001, Chi-square analysis) (Fig. 3A), consistent with the likelihood that larger proteins are more frequently a source of HLA ligands (example in 31). Nevertheless, the relationship was not linear, since VACV proteins ranging from 150 to 300 residues represented a slightly more abundant source of HLA class II ligands than those greater than 300 amino acids (Fig. 3B). This was more striking when the total length of the VACV proteome was analyzed (Fig. 3C). The viral genome encodes about 58,000 amino acid residues, 59% of which are included within proteins larger than 300 residues. The fact that the medium-sized proteins (150-300 residues long) contain only 27% of the VACV residues (Fig. 3C) indicates that this viral proteome subset was a preferential source of the HLA class II ligandome when this factor was normalized.
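The normalization argument can be made concrete with a short sketch. It takes the size classes and residue shares from the text (about 58,000 residues in total, 59% in proteins larger than 300 residues, 27% in proteins of 150-300 residues, the remainder in smaller proteins). The treatment of the exact boundaries at 150 and 300 residues is an assumption, and the ligand counts passed to `ligands_per_1000_residues` are hypothetical, since the per-class counts are given only in Fig. 3B.

```python
TOTAL_RESIDUES = 58_000  # approximate size of the VACV proteome, from the text

# Share of proteome residues per size class; "small" is the remainder.
RESIDUE_SHARE = {"medium": 0.27, "large": 0.59}
RESIDUE_SHARE["small"] = 1.0 - RESIDUE_SHARE["medium"] - RESIDUE_SHARE["large"]


def size_class(length: int) -> str:
    """Assign a protein to a size class (boundary handling is an assumption)."""
    if length < 150:
        return "small"
    if length <= 300:
        return "medium"
    return "large"


def ligands_per_1000_residues(n_ligands: int, cls: str) -> float:
    """Ligand yield of a size class normalized by the residues it contains."""
    return 1000 * n_ligands / (RESIDUE_SHARE[cls] * TOTAL_RESIDUES)


# Hypothetical case: equal ligand counts in the medium and large classes.
medium_density = ligands_per_1000_residues(10, "medium")
large_density = ligands_per_1000_residues(10, "large")
enrichment = medium_density / large_density
```

With equal ligand counts, the medium class would yield about 2.2 times more ligands per residue than the large class, simply because it holds less than half as many residues. This illustrates why a slightly higher raw count for medium-sized proteins becomes a marked preference after normalization.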

Protein abundance was a relevant factor for HLA class II viral presentation.

Generally, high cytoplasmic mRNA concentrations are associated with abundant proteins. The VACV time-course mRNA levels previously reported 32 were taken as a general indicator of viral protein levels, since VACV expression levels are promoter-driven and should be comparable between different host cells. On this basis, the transcripts of parental proteins for VACV HLA class II ligands were more abundant than the average of transcripts from the rest of the VACV proteome at all times post-infection analyzed (30 min, 1 h, 2 h, and 4 h), the difference being significant at 2 h post-infection (Fig. 4A). Data corresponding to the individual parental proteins of the HLA class II ligands identified by mass spectrometry are shown in Supplementary Figure 3. No correlation between the transcript level of individual VACV genes and protein length was found (Fig. 4B).

Virion core proteins, but not virion membrane or viral envelope proteins, were the major source of VACV HLA class II ligands.

A comparative analysis of the spatial location of the parental and non-parental VACV proteins for HLA class II ligands was carried out by gene ontology ("cellular component") enrichment. As expected, VACV HLA class II ligands were mainly derived from proteins of enveloped virion particles, a location overrepresented in parental proteins in comparison with the non-parental dataset (P < 0.05; Chi-square analysis) (Fig. 5A). However, of the three main structures constituting VACV virion particles 33, the virion core (but not virion membrane or viral envelope) proteins were the major source of HLA class II ligands. Only two of the parental VACV proteins are among the top-20 most abundant proteins in the virion, which account for the vast majority of virion content 34, 35, suggesting that the protein amounts in the original infective particles are not the major factor underlying the association between abundant transcripts and parental proteins. Furthermore, a minor contribution was also found from viral proteins of the host membrane or the extracellular space, which are therefore accessible to the classical HLA class II antigen presentation pathway. In addition, three VACV parental proteins (B13, H5, and I3) were included in the cellular component host cytoplasm. As B13 is also an extracellular protein and H5 was also included in the viral envelope, only the I3 protein remains as an exclusively cytoplasmic source of viral HLA class II ligands.

VACV HLA class II ligands were mainly derived from proteins involved in viral gene expression.

Next, the potential functional trends of parental and non-parental VACV proteins for HLA class II ligands were also analyzed. Gene ontology "Biological process" terms were grouped into gene expression functions (for example DNA replication, transcription, and repair), suppression of host functions (e.g. viral proteins that help to evade either innate or adaptive immune responses), or functions associated with viral processes (such as morphology, viral entry or shedding). The analysis showed that VACV proteins associated with gene expression functions constituted the main source of VACV HLA class II viral ligands and were significantly enriched with respect to non-parental VACV proteins (P < 0.05; Chi-square analysis) (Fig. 5B).

Virion core and/or functions related to viral gene expression were the main features of VACV parental proteins for HLA class II ligands.

As described above, most VACV HLA class II ligands derived from proteins of the virion core (Fig. 5A) or proteins involved in viral gene expression (Fig. 5B). Thus, of the twenty-two VACV parental proteins, fourteen are included in the VACV virion core, eleven have functions related to viral gene expression, and eight of them (A5, A18, D1, D6, D7, D12, H5, and I6) share both characteristics (Table 3). In contrast, only five proteins (A35, A38, B13, D13, and E3; 23% of VACV parental proteins) were not included in either of these two categories. Therefore, the HLA class II-restricted immune response against VACV is mainly focused on virion core structural proteins and/or proteins involved in VACV gene expression.
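The counts in this paragraph can be cross-checked with a simple inclusion-exclusion calculation. The snippet below is only an arithmetic sketch using the numbers quoted in the text (22 parental proteins, 14 in the virion core, 11 with gene expression functions, 8 in both); the variable names are illustrative.

```python
# Counts quoted in the text above.
total_parental = 22
in_virion_core = 14
in_gene_expression = 11
in_both = 8

# Inclusion-exclusion: proteins in at least one of the two categories.
in_either = in_virion_core + in_gene_expression - in_both
in_neither = total_parental - in_either
percent_neither = 100 * in_neither / total_parental

# The five proteins the text lists as belonging to neither category.
listed_neither = ["A35", "A38", "B13", "D13", "E3"]

print(in_either, in_neither, f"{percent_neither:.0f}%")
```

This gives 17 proteins in at least one category and 5 in neither (about 23%), matching the five proteins listed in the text.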


DISCUSSION

Identification of virus-derived peptides presented by HLA molecules is a challenging task that has classically been addressed using partially overlapping synthetic peptides covering the partial or full sequence of known antigenic proteins, other viral proteins, or even the whole proteome when the virus is small enough 36. In contrast, prediction of HLA peptide binding by bioinformatics software is a fast and useful tool to select potential candidates for high-affinity ligands from poorly studied viruses. However, the information obtained with these computational tools depends heavily on the accuracy of the algorithm. In fact, in the present study, a relative disparity was found between the experimental data and the two computational tools utilized. Ligands from the VACV A38 and I2 proteins showed high binding scores with both the NetMHCIIpan and IEDB tools, but these two bioinformatics tools failed to predict binding of most natural HLA-DR B1*0101 and HLA-DP B1*0401 ligands identified by mass spectrometry. This discrepancy indicates that, for some HLA class II alleles, these predictors must be refined by training with new sequences obtained experimentally by either mass spectrometry or functional assays in order to improve their performance. In addition, HLA peptide binding is only one of the multiple elements involved in the antigen processing and presentation pathway for individual HLA class II ligands. Other elements, such as the class II invariant chain, the accessory molecule HLA-DM, endocytic activity, lysosomal pH, and the activity of different cathepsins and other uncharacterized proteases, are not computed by the algorithms of IEDB or NetMHCIIpan. Thus, bioinformatics approaches will always provide only partial information. As a feasible alternative for viruses like VACV, with complex proteomes of more than 200 proteins, mass spectrometry-based approaches can identify viral HLA ligands naturally processed and efficiently presented on the surface of infected cells. In the current report, we have demonstrated that HLA peptidomics analyses of virus-infected cells can yield the identification of a broad array of natural HLA class II ligands. To the best of our knowledge, this study provides the highest number of natural HLA class II ligands reported for VACV. More importantly, general antigen processing rules for HLA class II antigen presentation can be defined from these mass spectrometry analyses and applied in a new generation of predictors.

Several studies have shown that the cellular proteins associated with highly abundant mRNA transcripts were the most likely sources of endogenous HLA class I ligands, although poorly transcribed mRNAs also generated a significant fraction of these ligands 31, 37-39, albeit with poor correlation, if any 40. However, this association was less pronounced for HLA class II molecules. Thus, while the 5% most abundant mRNAs accounted for 69% of the parental proteins for HLA class I ligands, the corresponding percentage for proteins containing HLA class II ligands was only 17% 37. Another study revealed that the most abundant proteins observed in the HLA-DR ligandome were distributed over the full dynamic range of the source proteome 28. Our results suggest that the viral proteins that are a source of HLA class II ligands are relatively more abundant than VACV proteins not involved in HLA class II antigen presentation.

Most of the parental proteins that render VACV HLA class II ligands are included in the viral particles; in addition, B13 is an extracellular protein. Thus, all of them can access the classical HLA class II antigen processing pathway, and endocytosis and subsequent degradation of these proteins in endosomes by acid-dependent proteases likely generated the VACV peptides that interact with peptide-receptive HLA class II molecules. The exception was the VACV I3 protein, which localizes in cytoplasmic virus factories, where it is associated with VACV DNA. This protein cannot be presented via the classical HLA class II antigen presentation pathway, and therefore additional non-classical HLA class II antigen-processing pathways must be functional in VACV-infected cells to generate the four HLA-DR ligands derived from the I3 protein and identified by mass spectrometry. Several examples of endogenous MHC class II presentation involving different mechanisms (e.g. macroautophagy and chaperone-mediated autophagy) have been previously described (summarized in 41-43). These mechanisms are not exclusive to VACV-infected cells considering that, for example, epitopes derived from HRSV proteins such as the nonstructural NS1 and NS2 proteins were restricted by HLA-DR 24 or the murine MHC class II molecule I-Ed 44, respectively.

Gene ontology analyses showed a preferential enrichment in virion core proteins, but not virion membrane or viral envelope proteins, and in proteins involved in viral gene expression as sources of VACV HLA class II ligands. In summary, the global picture emerging from the current report suggests that the VACV HLA-DR- and -DP-bound ligands are generated mainly, but not exclusively, by the classical HLA class II antigen processing pathway. In addition, the sources of the viral ligands were mainly medium-sized and abundant proteins from the virion core and/or proteins with functions related to viral gene expression.

The presenting molecule HLA-DP shows two noticeable features: first, its limited polymorphism in comparison with the HLA-DR locus and, second, the existence of a DP supertype including five HLA-DP class II molecules for which 74% of the peptide repertoires are common 45. In the current report, the first four natural ligands from VACV bound to an HLA-DP class II molecule have been identified. In contrast, twenty-two VACV HLA-DR ligands were identified in similar sequential mass spectrometry experiments. This relative abundance correlates with the higher cell surface expression of HLA-DR molecules compared with HLA-DP proteins 46, 47.

Only the HLA class II ligands conserved between the pathogenic variola virus and the vaccine viruses, either cowpox or VACV, could be responsible for the cross-reactive protection in subjects exposed to variola virus. Sequence analysis of the twenty-six VACV HLA class II ligands across pathogenic and nonpathogenic poxviruses showed that half of these twenty-six ligands were identical in the variola proteome, whereas most of the rest showed only minor variations. In contrast, the sequences of the four HLA-DR ligands previously identified by mass spectrometry are conserved among orthopoxviruses 9. Thus, globally, 50% of the naturally processed HLA class II ligands were conserved between pathogenic and nonpathogenic poxviruses, a percentage very close to the 60% obtained in similar comparison analyses of HLA class I epitopes from VACV 5, 6.

Finally, the findings reported here have clear implications for the rational design of antiviral vaccines. In addition to the role of VACV in the successful smallpox eradication campaign, attenuated forms of this virus are considered as a vaccine platform against biothreat agents or emerging diseases such as chikungunya, yellow fever, influenza H5N1 or Ebola (summarized in 48). For that reason, a recombinant VACV expressing the chikungunya virus structural genes 10 was used here to assess the VACV HLA class II immunoprevalence. For the generation of recombinant VACV expressing candidate antigens, we propose the inclusion within the viral genome of recombinant pathogen genes that have strong promoter sequences allowing high-level expression and that encode immunogenic medium-sized proteins, which can improve the critical HLA class II antiviral immune response against the pathogen of interest. These proteins should be packaged in the VACV virion core, for example by including in the recombinant gene specific sequences that interact with the viral DNA or with any VACV protein from the virion core; the optimized recombinant VACV would thus have the capacity to further increase its T helper immunogenicity.


FIGURES

Figure 1. Schematic representation of the experimental design. Diagram of the sequential affinity chromatography strategy used to differentially isolate HLA-DR and -DP peptide pools and to exclude non-specifically binding proteins and peptides (panel A). Diagram of the peptide identification strategy used to identify VACV ligands bound to HLA class II molecules and to exclude erroneous assignment of cellular peptides (panel B).

Figure 2. Structural features of VACV parental and non-parental proteins for HLA class II ligands. Features of the 22 VACV proteins rendering HLA class II ligands (parental) are compared with those of the remaining 209 VACV proteins encoded by the VACV genome (non-parental). Box and whisker plots show the hydrophobicity (GRAVY index, panel A), isoelectric point (pI) (panel B), and length (panel C) of VACV parental proteins for HLA class II viral ligands versus other VACV proteins. Significant P values: *, P < 0.05; n.s., not significant.

Figure 3. Length of VACV parental and non-parental proteins for HLA class II ligands. Panels A and B: Distribution by length of parental (open bars) or non-parental (closed bars) VACV proteins for HLA class II viral ligands. Panel C: Horizontal slices of the percentage of parental VACV proteins for HLA class II viral ligands (upper slice), and the percentage of encoded residues in the VACV proteome (bottom slice) from proteins binned by length: < 150 residues (white bars), 150-300 residues (grey bars), and > 300 residues (black bars). Significant P values: **, P