Anal. Chem. 1997, 69, 1986-1991
Complete Determination of Disulfide Forms of Purified Recombinant Human Serum Albumin, Secreted by the Yeast Pichia pastoris
Kazuo Ikegaya,† Masaaki Hirose,† Takao Ohmura,† and Kiyoshi Nokihara*,‡
Research Division, The Green Cross Corporation, 2-25-1 Shodai Ohtani, Hirakata-shi, Osaka 573, Japan, and Bio-Medical System Division, Shimadzu Scientific Research Inc., Kanda-Nishikicho 1-3, Chiyodaku, Tokyo 101, Japan
Where the supply of material from natural resources is limited and/or risks of infection are to be avoided, recombinant proteins are an important substitute. Consequently, the physicochemical characterization of the primary and tertiary structures of such materials that are to be used clinically is indispensable. In this context, disulfide linkages play a significant structural role and their determination is of paramount importance. As the demand for human serum albumin (HSA), which contains 35 cysteine residues, is continually increasing, its industrial-scale production from the genetically engineered yeast Pichia pastoris is of interest. The present paper describes a methodology that allows the characterization of the multi-disulfide linkages, including their exact positions, in purified recombinant HSA by use of gas-phase protein sequencing. Mild Edman degradation followed by isocratic analysis of the phenylthiohydantoin amino acids, in combination with multienzymatic digestions in acidic conditions, allowed the exact positions of the 17 disulfide bridges and 1 sulfhydryl group to be rigorously determined. The sulfhydryl content of the present recombinant HSA was the same as that of plasma HSA.
Recently, the production of recombinant proteins for clinical purposes has rapidly expanded. As human serum albumin (HSA)1 plays an important role in clinical therapy, its industrial preparation is of particular importance. In addition, the supply from human plasma is limited and there is always the potential risk of contamination of the product by blood-derived pathogens; hence, alternative large-scale production is in great demand. Recently, HSA was successfully produced and purified from the yeast Pichia pastoris by recombinant DNA technology.2-4 HSA is a globular monomeric protein with a relative molecular weight of 66 500 and contains 35 cysteinyl residues, of which 17
† The Green Cross Corp.
‡ Shimadzu Scientific Research Inc.
(1) Abbreviations used: HSA, human serum albumin; rHSA, recombinant HSA; pHSA, plasma-derived HSA; mHSA, mercapto form HSA; nHSA, nonmercapto form HSA; RP-HPLC, reversed-phase high-performance liquid chromatography; PTH, phenylthiohydantoin; ∆S, a dithiothreitol adduct of dehydroalanine; PEC, S-pyridylethylated cysteine; ATZ, anilinothiazolinone; DTNB, 5,5′-dithiobis(2-nitrobenzoic acid); MES, 2-(N-morpholino)ethanesulfonic acid; TFA, trifluoroacetic acid; DTT, dithiothreitol; Gdn-HCl, guanidine hydrochloride; TBP, tributylphosphine; 4VP, 4-vinylpyridine.
(2) Yokoyama, K.; Ohmura, T. Jpn. J. Apheresis 1995, 14, 19-20.
(3) Ohi, H.; Miura, M.; Hiramatsu, R.; Ohmura, T. Mol. Gen. Genet. 1994, 243, 489-499.
(4) Sumi, A.; Ohtani, W.; Kobayashi, K.; Ohmura, T.; Yokoyama, K.; Nishida, M.; Suyama, T. Biotechnol. Blood Proteins 1993, 227, 293-298.
1986 Analytical Chemistry, Vol. 69, No. 11, June 1, 1997
Figure 1. (A) Chromatogram of standard PTH-amino acids (10 pmol). DMPTU, dimethylphenylthiourea; DPTU, diphenylthiourea; DPU, diphenylurea. (B) Chromatograms at the 34th cycle, after subtraction of the previous cycle, of intact (left) and in situ S-pyridylethylated rHSA (right).
disulfide bridges and 1 sulfhydryl group at position 34 are known to exist.5 The polypeptide chain is assembled as preproalbumin on hepatic membrane-bound ribosomes. Peters and Davidson reported in vivo studies of the formation of disulfide bonds in HSA,6 and the complete nucleotide sequence of human serum albumin mRNA was deduced from recombinant cDNA clones.7 For the evaluation of recombinant HSA (rHSA), the analysis of the exact protein sequence accompanied by the determination of disulfide positions is indispensable. The formation of disulfide bonds in a secretory protein is believed to play an important role in the process of protein folding and is of particular importance for tertiary structure stabilization.
(5) Brown, J. R. In Albumin Structure, Function and Uses; Rosenoer, V. M., Murray, O., Marcus, A. R., Eds.; Pergamon Press: New York, 1977; pp 27-51.
(6) Peters, T., Jr.; Davidson, L. K. J. Biol. Chem. 1982, 257, 8847-8853.
(7) Dugaiczyk, A.; Law, S. W.; Dennison, O. E. Proc. Natl. Acad. Sci. U.S.A. 1982, 79, 71-75.
S0003-2700(96)01316-9 CCC: $14.00
© 1997 American Chemical Society
Figure 2. Pepsin mapping analyses of both rHSA and pHSA (3.7 nmol each). The resulting digests were analyzed by RP-HPLC on a SynProPep RPC18 column (4.6 mm × 150 mm) using a linear gradient elution of 0-100% solvent B over 80 min with a flow rate of 1.0 mL/min, where solvent A was 0.1% TFA and solvent B was 50% aqueous acetonitrile containing 0.1% TFA.
Peptide mapping analysis by reversed-phase high-performance liquid chromatography (RP-HPLC) has played a major role in the characterization and quality control of protein pharmaceuticals manufactured by recombinant DNA technology.8 In the present paper, pepsin mapping followed by semipreparative RP-HPLC separation has been employed, although this alone was insufficient to determine all disulfide linkages. Thus, a second cleavage with a different enzyme was employed for several fractions from the first digest. Sequence analyses were performed using an Edman-type gas-phase sequencer, and both cystinyl and cysteinyl residues were determined by in situ pyridylethylation according to Nokihara et al.9 An automated gas-phase protein sequencer system with mild Edman reaction conditions and isocratic separation of all the phenylthiohydantoin (PTH)-amino acids (Figure 1A) was employed. The system allows determination of the intact cysteine and the second half-cystine from simultaneous identification of cysteine and ∆S, a dithiothreitol adduct of dehydroalanine. Disulfide linkage could be confirmed together with intact and S-pyridylethylated cysteine (PEC) residues. Recently Clerc et al. attempted to define the locations of all disulfides of rHSA secreted by the yeast Kluyveromyces by mass spectrometry but were unable to completely determine all the disulfide forms.10,11 We report here the complete determination of the disulfide forms in rHSA secreted by the yeast Pichia pastoris.
EXPERIMENTAL SECTION
Materials. Plasma-derived HSA (pHSA) was purchased from Miles Inc. Diagnostics Division (Kankakee, IL) and rHSA was
(8) Garnick, R. L.; Solli, N. J.; Papa, P. A. Anal. Chem. 1988, 60, 2546-2588.
(9) Nokihara, K.; Morita, N.; Yamaguchi, M.; Watanabe, T. Anal. Lett. 1992, 25, 513-533.
(10) Clerc, F. F.; Monégier, B.; Faucher, D.; Cuiné, F.; Pourcet, C.; Holt, J. C.; Tang, S.-Y.; Dorsselaer, A. V.; Becquart, J.; Vuilhorgne, M. J. Chromatogr., B 1994, 662, 245-259.
(11) Fleer, R.; Yeh, P.; Amellal, N.; Maury, I.; Fournier, A.; Bacchetta, F.; Baduel, P.; Jung, G.; L'Hôte, H.; Becquart, J.; Fukuhara, H.; Mayaux, J. F. Bio/Technology 1991, 9, 968-975.
Figure 3. Sequence analysis of fraction A-1 (Figure 4-1), fraction B-1 (Figure 4-2), fraction C (Figure 2), fraction D (Figure 2), fraction E (Figure 2), fraction F-1 (Figure 4-3), fraction F-2 (Figure 4-3), fraction G (Figure 2), and fraction H (Figure 2).
prepared, starting from a culture solution of the P. pastoris UHG42-3 strain carrying the HSA gene, as described previously.2-4 Endopeptidase pepsin (EC 3.4.23.1) from porcine stomach mucosa was purchased from Sigma (St. Louis, MO). Endoproteinase Lys-C (EC 3.4.21.50) from Achromobacter lyticus M497-1 (Wako
Table 1. Amino Acid Sequences Found for the Fractions in Figures 2 and 4
Pure Chemicals, Osaka, Japan), Glu-C (EC 3.4.21.19) from Staphylococcus aureus V8 (Boehringer Mannheim, Germany), and thermolysin (EC 3.4.24.4) from Bacillus thermoproteolyticus (Nacalai Tesque, Kyoto, Japan) were used as received. 5,5′-Dithiobis(2-nitrobenzoic acid) (DTNB), trifluoroacetic acid (TFA), acetonitrile, and 1-propanol from Wako Pure Chemicals, 2-(N-morpholino)ethanesulfonic acid (MES) monohydrate from Dojin (Kumamoto, Japan), dithiothreitol (DTT) and guanidine hydrochloride (Gdn-HCl) from Nacalai Tesque, and tri-n-butylphosphine (TBP) and 4-vinylpyridine (4VP) from Aldrich Chemicals (Milwaukee, WI) were used without further purification.
Sulfhydryl Determination. Sulfhydryl groups of rHSA and pHSA were determined according to Ellman.12 Reduction of proteins was carried out with DTT (10 mM) in 4 M Gdn-HCl.
(12) Ellman, G. L. Arch. Biochem. Biophys. 1959, 82, 70-77.
Just before analysis, the residual DTT was removed on a PD-10 column (Sephadex G-25 M) (Pharmacia, Uppsala, Sweden).
Enzymatic Digestion of HSA. Both rHSA and pHSA were dialyzed against 0.1 M acetic acid (pH 2.9) at a concentration of 150 nmol/mL and digested with pepsin at a molar ratio of 1:100 (enzyme/protein) for 21 h at 37 °C. The resulting digests were analyzed by RP-HPLC on a SynProPep RPC18 column (4.6 mm × 150 mm, Shimadzu, Kyoto, Japan) using a linear gradient elution of 0-100% solvent B over 80 min with a flow rate of 1.0 mL/min, where solvent A was 0.1% TFA and solvent B was 50% aqueous acetonitrile containing 0.1% TFA. The second digestion with endoproteinase Glu-C was performed in 0.1 M sodium phosphate buffer (pH 6.5) at a molar ratio of 1:20 (enzyme/substrate) for 4 h at 37 °C. The second digestions with thermolysin and endoproteinase Lys-C were carried out for 4 h at 37 °C in 0.1 M MES/
Table 2. Sequence Yields (pmol) for PTH-Amino Acids of Intact and S-Pyridylethylated F-1 (Figure 4-3)a
cycle   intact                               S-pyridylethylated
1       V (191.6)                            V (160.5), PEC (66.2)
2       H (57.3), A (187.6)                  H (15.8), A (129.3)
3       T (35.5), S (19.8), ∆S (42.8)        T (27.0), S (15.5), ∆S (34.0)
4       E (101.7), L (33.6)                  E (77.2), L (21.4)
5       not found                            PEC (9.5)
6       ∆S (4.2), C (1.5)                    PEC (5.1)
7       H (25.7)                             H (15.1)
8       G (56.3)                             G (37.5)
9       D (29.4)                             D (18.4)
10      L (47.3)                             L (28.8)
11      L (31.1)                             L (24.1)
12      E (34.9)                             E (23.0)
13      ∆S (7.3), C (1.4)                    PEC (4.1)
14      A (38.5)                             A (22.0)
15      D (17.4)                             D (10.2)
16      D (13.4)                             D (10.8)
17      R (11.0)                             R (5.9)
18      A (23.0)                             A (9.8)
19      D (4.8)                              D (1.8)
20      L (9.1)                              L (2.6)
a With the exception of cycle 1, yields were corrected by subtraction of the previous cycle.
S-Pyridylethylation in the Reaction Chamber of the Sequencer and Sequencing. The sample was dried on the polybrene-coated glass fiber disk in the reaction chamber of the sequencer. A mixture of 0.4% 4VP and 0.2% TBP in 80% propan-1-ol (30 µL) was applied to the disk, and the reaction chamber was closed immediately. Sequencing was performed successively according to the pyridylethylation protocol.9
Figure 4. Elution profiles of the second digestion of fraction A in Figure 2 with endoproteinase Glu-C (1), fraction B in Figure 2 with thermolysin (2), and fraction F in Figure 2 with endoproteinase Lys-C (3). The digests were separated on the same column (Figure 2) using a linear gradient elution of 0-35% solvent B over 35 min at a flow rate of 1.0 mL/min, where solvent A was 0.1% TFA and solvent B was acetonitrile containing 0.1% TFA.
NaOH buffer (pH 6.5) at a molar ratio of 1:50 and 1:100 (enzyme/substrate), respectively. The digests were separated on the above column and eluent. All fractions were monitored at 215 nm and collected manually.
Instruments. RP-HPLC was carried out using a System Gold HPLC (Beckman, San Ramon, CA). Sequence analyses were performed on a gas-phase sequencer, Model PPSQ-10 System (Shimadzu), with reagents for this system (Wako Pure Chemicals). All the PTH-amino acids, including PTH-S-pyridylethylated cysteine (PTH-PEC), were separated on a Wako-Pac WS-PTH column (4.6 mm × 250 mm, Wako) under isocratic conditions using 40% acetonitrile and 20 mM sodium acetate buffer (pH 4.7) at a flow rate of 1.0 mL/min and monitored at 269 nm.
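The digestion and gradient conditions above reduce to simple arithmetic, which the following minimal Python sketch makes explicit. It is an illustration, not part of the published method: the function and variable names are hypothetical, the second-mapping gradient is taken from the Figure 4 caption (0-35% solvent B over 35 min, solvent B being acetonitrile containing 0.1% TFA), and the HSA mass concentration assumes the relative molecular weight of 66 500 quoted in the introduction.

```python
def acetonitrile_percent(t_min, b_start, b_end, duration_min, acn_fraction_in_b):
    """Percent (v/v) acetonitrile delivered at time t_min for a linear gradient in solvent B.

    acn_fraction_in_b is the acetonitrile fraction of solvent B itself
    (0.5 for "50% aqueous acetonitrile", 1.0 for neat acetonitrile).
    Solvent A (0.1% TFA) is assumed to contain no acetonitrile.
    """
    t = min(max(t_min, 0.0), duration_min)  # clamp to the gradient window
    percent_b = b_start + (b_end - b_start) * t / duration_min
    return percent_b * acn_fraction_in_b


def enzyme_nmol_per_ml(substrate_nmol_per_ml, enzyme_parts, substrate_parts):
    """Enzyme concentration implied by a molar enzyme/substrate ratio."""
    return substrate_nmol_per_ml * enzyme_parts / substrate_parts


# First (pepsin) mapping: 0-100% B over 80 min, B = 50% aqueous acetonitrile.
FIRST_MAP = dict(b_start=0.0, b_end=100.0, duration_min=80.0, acn_fraction_in_b=0.5)
# Second mapping, as given in the Figure 4 caption: 0-35% B over 35 min, B = acetonitrile.
SECOND_MAP = dict(b_start=0.0, b_end=35.0, duration_min=35.0, acn_fraction_in_b=1.0)

first_slope = acetonitrile_percent(80.0, **FIRST_MAP) / 80.0     # % acetonitrile per min
second_slope = acetonitrile_percent(35.0, **SECOND_MAP) / 35.0   # % acetonitrile per min

# Pepsin digestion: 150 nmol/mL HSA at 1:100 (enzyme/protein).
pepsin_nmol_per_ml = enzyme_nmol_per_ml(150.0, 1, 100)
# nmol/mL x g/mol gives ng/mL; 1e-6 converts ng to mg.
hsa_mg_per_ml = 150.0 * 66500.0 * 1e-6

print(first_slope, second_slope, pepsin_nmol_per_ml, round(hsa_mg_per_ml, 2))
```

Under these assumptions the first mapping ramps acetonitrile at 0.625%/min and the second at 1.0%/min, and the pepsin digest contains about 10 mg/mL HSA with 1.5 nmol/mL pepsin.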
RESULTS AND DISCUSSION
pHSA contains 35 Cys residues, of which 34 form 17 disulfides and 1 is a free cysteine at position 34.5 pHSA is believed to be a mixture of the mercapto form (mHSA), which has a free sulfhydryl group (SH), and the nonmercapto form (nHSA), which has a modified cysteinyl residue. The sulfhydryl content was reported as 0.6-0.7 SH/mol.13 The nHSA has a mixed disulfide with cysteine and glutathione.13 Additionally, the SH group of an nHSA component has been partially oxidized to the sulfinic or sulfonic derivative.13 Thus, albumin is believed to be composed of three or four components with only slight differences.13 The sulfhydryl content of both rHSA and pHSA was determined according to Ellman.12 Under reducing conditions, 35 mol of sulfhydryl groups per mole was detected for both rHSA and pHSA. In the absence of DTT, the sulfhydryl group content was 0.6-0.7 mol/mol for both rHSA and pHSA. The sulfhydryl content of rHSA remained unchanged before and after pepsin digestion. As the biological and clinical significance of mHSA has been reported,14 identification of the residue at position 34 is indispensable. Identification of the residue at position 34 of the present rHSA has been reported in a preliminary publication,15 and a
(13) Anderson, L.-O. In Plasma Protein; Blombock, B., Hansen, L. P., Eds.; Wiley-Interscience: New York, 1979; pp 43-54.
(14) Nishimura, K.; Harada, K.; Nakayama, M.; Sugii, A.; Uji, Y.; Okabe, H. J. Anal. Biosci. 1992, 15, 200-205.
(15) Nokihara, K.; Morita, N.; Yokomizo, Y.; Ikegaya, K. In Perspectives on Protein Engineering and Complimentary Technologies; Geisow, M. J., Epton, R., Eds.; Mayflower Worldwide: Kingswinford, U.K., 1995; pp 317-323.
Figure 5. Amino acid sequence and disulfide-bonding pattern of human serum albumin.
cysteinyl (SH-free) residue, both the intact and pyridylethylated forms (Figure 1B), was confirmed.
Digestion with pepsin was carefully carried out at pH 2.9 as disulfide interchanges are prevented under these conditions. The RP-HPLC profiles of pepsin digests of both rHSA and pHSA were very similar, as indicated in Figure 2. All peaks were sequenced to identify cystine and/or cysteine residues. The isocratic separation employed in the present system provided highly stable elution positions of all the resulting PTH-amino acids including a PEC derivative (Figure 1A). The present system provides specific peaks for PTH-cysteine, which is derived from intact cysteine as well as the second half-cystine.9 Thus, the disulfide linkage can be determined by Edman cycles on both pyridylethylated and
intact peptides. At the first half-cystine, no amino acid derivatives are transferred to the conversion flask, so that no PTH-amino acid is observed, and an anilinothiazolinone (ATZ)-half-cystine, still covalently linked to the second half-cystine through the disulfide linkage, is slowly decomposed during further Edman degradation. It is supposed that at the second half-cystine, ATZ-cystine-related material might be cleaved from the peptide backbone and transferred to the conversion flask, where it is converted to the corresponding PTH derivative with aqueous TFA containing DTT.16 Thus PTH-cysteine and ∆S are eluted independently to give two peaks.
(16) Hewick, R. M.; Hunkapiller, M. W.; Hood, L. E.; Dreyer, W. J. J. Biol. Chem. 1981, 256, 7990-7997.
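The comparison of intact and S-pyridylethylated runs described above can be written as a small decision rule. The Python sketch below is only an illustration under stated assumptions: the function name and labels are hypothetical, "dS" stands for ∆S, and the rule uses PEC in the pyridylethylated run as the marker for a Cys position because ∆S alone also appears alongside serine (cycle 3 of Table 2). The example data are the qualitative peak assignments for fraction F-1 taken from Table 2.

```python
def classify_cycle(intact, pyridylethylated):
    """Classify one Edman cycle from the PTH derivatives seen in the two runs.

    intact, pyridylethylated: sets of PTH labels observed at this cycle.
    A Cys position is recognised by PEC in the pyridylethylated run; in the
    intact run PTH-cysteine ("C") is expected for a free Cys or a second
    half-cystine, and nothing is expected for a first half-cystine.
    """
    if "PEC" not in pyridylethylated:
        return "not Cys"
    if "C" in intact:
        return "free Cys or second half-cystine"
    return "first half-cystine"


# Peaks reported in Table 2 for fraction F-1 (selected cycles): (intact, pyridylethylated).
TABLE2_CYCLES = {
    1: ({"V"}, {"V", "PEC"}),
    3: ({"T", "S", "dS"}, {"T", "S", "dS"}),
    5: (set(), {"PEC"}),
    6: ({"dS", "C"}, {"PEC"}),
    13: ({"dS", "C"}, {"PEC"}),
}

calls = {cycle: classify_cycle(i, p) for cycle, (i, p) in TABLE2_CYCLES.items()}
for cycle in sorted(calls):
    print(cycle, calls[cycle])
```

For F-1 this rule flags cycles 1 and 5 as first half-cystines and cycles 6 and 13 as PTH-cysteine-yielding positions, which is consistent with the [(200)S-S(246)] and [(245)S-S(253)] bridges reported in the Conclusion. The rule by itself does not say which first half-cystine pairs with which second half-cystine; that follows from the peptide composition of each fraction.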
The dipeptide sequence Cys-Cys occurs eight times in HSA. The peptide bonds in the natural protein, in this case a consecutive L-Cys-L-Cys sequence, are believed to be in the trans form, which does not allow a disulfide linkage between adjacent Cys residues.17,18 Fraction A in Figure 2 was sequenced to give (510)HADICTLSEKERQ(522) and (555)VEKCCKADDKETCFAEEGK(573), although this was insufficient to determine the exact disulfide forms, since duplicate disulfides were expected. Therefore, further digestion with endoproteinase Glu-C at pH 6.5 was carried out. All resulting fragments were separated by RP-HPLC (Figure 4-1) and sequenced. Fraction A-1 in Figure 4-1 gave the sequence shown for fraction A-1 in Figure 3 and Table 1. Fraction B in Figure 2 was sequenced to give (310)VESKDVCKNYAEAKDVF(326) and (358)EKCCAAADPHECYAKVF(374) and was further digested with thermolysin at pH 6.5. The resulting fragments were separated (Figure 4-2) and sequenced. Fraction B-1 in Figure 4-2 gave the sequence shown for fraction B-1 in Figure 3 and Table 1. Fractions C-E in Figure 2 gave the structures shown for fractions C, D, and E, respectively, in Figure 3 and Table 1. Fraction F (Figure 2) gave the following sequences: (185)LRDEGKASSAKQRLKCASL(203), (239)TKVHTECCHGDLLECADDRADL(260), and (261)AKYICENQDSISSKLKECCEKPLLEKSHCIAEVENDEMPAD(301). To determine the exact disulfide forms, fraction F was further digested with endoproteinase Lys-C at pH 6.5, separated (Figure 4-3), and sequenced. Fraction F-1 (Figure 4-3) revealed the structure shown for fraction F-1 in Figure 3 and Table 1. Similarly, sequencing of fraction F-2 (Figure 4-3) gave the sequence shown for fraction F-2 in Figure 3 and Table 1. Fractions G and H in Figure 2 gave the structures shown for fractions G and H, respectively, in Figure 3 and Table 1. To check for possible side reactions and/or nonspecific cleavage, all peaks in the second mapping were sequenced. The results indicated that no degradation of the disulfides occurred. As an example, yields of PTH-amino acids found in the actual sequencing of the intact and the S-pyridylethylated F-1 are indicated in Table 2.
CONCLUSION
During the whole procedure, disulfide interchange as well as reduction of cystine was prevented by the use of acidic conditions. As four fragments from the pepsin digestion did not give exact disulfide forms, the second digestion was performed with three different enzymes. Since the trans-peptide bond does not allow disulfide bridges between adjacent Cys residues, the complete disulfide form of rHSA was determined as follows: [(53)S-S(62)], [(75)S-S(91)], [(90)S-S(101)], [(124)S-S(169)], [(168)S-S(177)], [(200)S-S(246)], [(245)S-S(253)], [(265)S-S(279)], [(278)S-S(289)], [(316)S-S(361)], [(360)S-S(369)], [(392)S-S(438)], [(437)S-S(448)], [(461)S-S(477)], [(476)S-S(487)], [(514)S-S(559)], and [(558)S-S(567)]. The amino acid residue at position 34 of rHSA was identified as a cysteinyl (SH-free) residue. These results were consistent with the structure of the natural protein, pHSA. Together with previous results,15 the present rHSA, the primary structure of which is summarized in Figure 5, has a high proportion of the molecule that is identical with pHSA.
ACKNOWLEDGMENT
We thank Dr. V. Wray, Gesellschaft fuer Biotechnologische Forschung, Braunschweig, Germany, for reading the manuscript.
Received for review December 31, 1996. Accepted March 21, 1997.X
AC961316L
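The disulfide assignment listed in the Conclusion can be checked for internal consistency against the counts stated in the text (35 Cys residues, 17 bridges, 1 free Cys at position 34, eight Cys-Cys dipeptides, no bridge between adjacent residues). The Python sketch below is a bookkeeping check only; the names are hypothetical, and the residue numbers are copied from the list above.

```python
# Disulfide pairs as listed in the Conclusion (HSA residue numbering).
DISULFIDES = [
    (53, 62), (75, 91), (90, 101), (124, 169), (168, 177), (200, 246),
    (245, 253), (265, 279), (278, 289), (316, 361), (360, 369), (392, 438),
    (437, 448), (461, 477), (476, 487), (514, 559), (558, 567),
]
FREE_CYS = 34  # the single sulfhydryl group


def check_assignment(pairs, free_cys):
    """Summarise a disulfide assignment and flag bridges between adjacent residues."""
    bonded = [residue for pair in pairs for residue in pair]
    if len(set(bonded)) != len(bonded):
        raise ValueError("a Cys residue appears in more than one bridge")
    all_cys = sorted(bonded + [free_cys])

    # Cys-Cys dipeptides: consecutive residue numbers in the sorted Cys list.
    cys_cys = [(a, b) for a, b in zip(all_cys, all_cys[1:]) if b - a == 1]

    partner = {}
    for a, b in pairs:
        partner[a], partner[b] = b, a
    adjacent_bridges = [(a, b) for a, b in cys_cys if partner.get(a) == b]

    return {
        "n_cys": len(all_cys),
        "n_disulfides": len(pairs),
        "cys_cys": cys_cys,
        "adjacent_bridges": adjacent_bridges,
    }


summary = check_assignment(DISULFIDES, FREE_CYS)
print(summary["n_cys"], summary["n_disulfides"], len(summary["cys_cys"]))
print(summary["adjacent_bridges"])
```

With the pairs as listed, the check gives 35 Cys residues, 17 bridges, and eight Cys-Cys dipeptides, none of which is bridged internally; each Cys-Cys dipeptide instead contributes its two residues to two different bridges.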
(17) Brown, J. R. Fed. Proc., Fed. Am. Soc. Exp. Biol. 1974, 33, 1389. (18) Brown, J. R. Fed. Proc., Fed. Am. Soc. Exp. Biol. 1976, 35, 2141-2144.
X Abstract published in Advance ACS Abstracts, April 15, 1997.