Anal. Chem. 2009, 81, 9282–9290

Identification of Codon-Specific Serine to Asparagine Mistranslation in Recombinant Monoclonal Antibodies by High-Resolution Mass Spectrometry

X. Christopher Yu,* Oleg V. Borisov, Melissa Alvarez, David A. Michels, Yajun Jennifer Wang, and Victor Ling

Protein Analytical Chemistry, Genentech, South San Francisco, California 94080-4990

Translation errors in protein biosynthesis may result in low level amino acid misincorporation and contribute to product heterogeneity of recombinant protein therapeutics. We report the use of peptide map analysis by reversed-phase high-performance liquid chromatography and high-resolution mass spectrometry to detect and identify mistranslation events in recombinant monoclonal antibodies expressed in mammalian cell lines including Chinese hamster ovary (CHO) cells. Misincorporation of an asparagine residue at multiple serine positions was detected as earlier-eluting peptides with masses 27.01 Da higher than expected. The exact positions at which misincorporation occurred were identified by tandem mass spectrometry of the asparagine-containing variant peptides. The identified asparagine misincorporation sites correlated with the use of codon AGC but with none of the other five serine codons. The relative levels of misincorporation ranged from 0.01% to 0.2% among multiple serine positions detected across three different antibodies by targeted analysis of expected and variant peptides. The low levels of misincorporation are consistent with published predictions for in vivo translation error rates. Our results demonstrate that state-of-the-art mass spectrometry with a combination of high sensitivity, accuracy, and dynamic range provides a new ability to discover and characterize low level protein variants that arise from mistranslation events.

Recombinant protein therapeutics are becoming common in the battle against challenging diseases and illnesses. With the desire to improve product safety, efficacy, and consistency, manufacturers of protein therapeutics are increasingly using state-of-the-art analytical technologies for product characterization and development. High fidelity translation of the desired recombinant DNA sequence to its correct amino acid polypeptide product is critical to pharmaceutical production of human proteins in Escherichia coli, yeast, or mammalian cells, including Chinese hamster ovary (CHO) cells. Verification of sequence fidelity is routinely performed on the protein product by peptide mapping.

* To whom correspondence should be addressed. Phone: (650) 225-1138. Fax: (650) 225-3554. E-mail: [email protected].


Analytical Chemistry, Vol. 81, No. 22, November 15, 2009

A primary goal of these analyses is to identify possible errors in the gene sequence that produce incorrect protein sequences. In addition to sequence variants that are attributed to variation or mutation in the DNA sequence,1 it is also possible that protein variants may arise due to translation errors. Using radioisotope labeling, Loftfield has reported that an error frequency of 10⁻³ to 10⁻⁴ (per amino acid residue) could exist in vivo for misincorporation of uncoded but structurally similar amino acids.2 Misincorporation of norleucine for methionine was found to occur at ∼5% or higher in recombinant proteins expressed in E. coli.3-5 Similarly, misincorporation of norvaline for leucine was observed at 0.5-3% in human hemoglobin expressed in E. coli.6 Although multiple proofreading mechanisms exist that involve amino acid transfer ribonucleic acid (tRNA) synthetases, errors in protein biosynthesis are still possible due to mis-aminoacylation of tRNA. Kinetic measurements of yeast seryl-tRNA synthetase (SerRS) showed in vitro mis-aminoacylation of seryl tRNA by threonine at the ∼0.4% level.7 Through computational analysis of the crystal structure of SerRS, McClendon et al. predicted in 2006 that asparagine can also compete with serine for formation of the activated intermediate.8

(1) Harris, R. J.; Murnane, A. A.; Utter, S. L.; Wagner, K. L.; Cox, E. T.; Polastri, G. D.; Helder, J. C.; Sliwkowski, M. B. Nat. Biotechnol. 1993, 11, 1293–1297.
(2) Loftfield, R. B.; Vanderjagt, D. Biochem. J. 1972, 128, 1353–1356.
(3) Bogosian, G.; Violand, B. N.; Dorward-King, E. J.; Workman, W. E.; Jung, P. E.; Kane, J. F. J. Biol. Chem. 1989, 264, 531–539.
(4) Randhawa, Z. I.; Witkowska, H. E.; Cone, J.; Wilkins, J. A.; Hughes, P.; Yamanishi, K.; Yasuda, S.; Masui, Y.; Arthur, P.; Kletke, C.; Bitsch, F.; Shackleton, C. H. L. Biochemistry 1994, 33, 4352–4362.
(5) Budisa, N.; Steipe, B.; Demange, P.; Eckerskorn, C.; Kellermann, J.; Huber, R. Eur. J. Biochem. 1995, 230, 788–796.
(6) Apostol, I.; Levine, J.; Lippincott, J.; Leach, J.; Hess, E.; Glascock, C. B.; Weickert, M. J.; Blackmore, R. J. Biol. Chem. 1997, 272, 28980–28988.
(7) Gruic-Sovulj, I.; Landeka, I.; Söll, D.; Weygand-Durasevic, I. Eur. J. Biochem. 2002, 269, 5271–5279.
(8) McClendon, C. L.; Vaidehi, N.; Kam, V. W. T.; Zhang, D.; Goddard, W. A., III Protein Eng., Des. Sel. 2006, 19, 195–203.

10.1021/ac901541h CCC: $40.75 © 2009 American Chemical Society. Published on Web 10/23/2009

Detection of mistranslation products is often facilitated by changes in electrophoretic or chromatographic profiles of a protein. Amino acid analysis, peptide mapping with reversed-phase high-performance liquid chromatography (HPLC), and mass spectrometry are common analytical techniques employed for identification, characterization, and quantitation of protein variants. Highly sensitive mass spectrometric instruments with high mass

accuracy in conjunction with reversed-phase HPLC provide two dimensions of outstanding resolving power in the detection of a large number of potential and previously unknown variants at very low levels. As an example of such an application of high-resolution mass spectrometry in protein sequence variant analysis, we report here a novel finding that asparagine residues were misincorporated at multiple serine positions in recombinant monoclonal antibody molecules expressed in CHO cell lines.

MATERIALS AND METHODS

Materials. Recombinant IgG1 and IgG4 monoclonal antibody (mAb) samples (all with κ light chains) were expressed in CHO, NS0, and E. coli cells and purified at Genentech (South San Francisco, CA). Peptides IYPTSGSTNYADSVK and IYPTSGSTNYADNVK were synthesized and purified at Genentech.

Chemical Reagents. Tris base, Tris HCl, trifluoroacetic acid, calcium chloride, dithiothreitol (DTT), and iodoacetic acid were purchased from Sigma (St. Louis, MO). Guanidine hydrochloride was from Thermo Fisher (Rockford, IL). Ethylenediaminetetraacetic acid (EDTA) (disodium, dihydrate) and acetonitrile (HPLC grade) were purchased from J.T. Baker (Phillipsburg, NJ). Sequencing grade modified trypsin was from Promega (Madison, WI), and chymotrypsin was purchased from Roche Applied Science (Penzberg, Germany). PD-10 desalting columns were purchased from GE Healthcare (Uppsala, Sweden).

Sample Preparation. A total of 1 mg of mAb sample was reduced by incubation with 20 mM DTT for 60 min at 37 °C, in 6 M guanidine-HCl, 360 mM Tris, and 2 mM EDTA at pH 8.6. Iodoacetic acid was added to a final concentration of 50 mM after cooling to room temperature, and the reaction was allowed to occur in the dark for 15 min. The reaction mixture was buffer exchanged into digestion buffer (25 mM Tris, 2 mM CaCl2, pH 8.2) using PD-10 desalting columns. Protein concentration was determined by absorbance at 280 nm and then adjusted to ∼0.4 mg/mL with digestion buffer. Samples were digested with trypsin or chymotrypsin for 5 h at 37 °C, using an enzyme to substrate ratio of 1:40 (w/w). The digestion was stopped by adjusting the pH to 2 with addition of 10% trifluoroacetic acid (TFA) solution. The digested samples were stored at 2-8 °C until injection onto the column.

Reversed-Phase HPLC and Online Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS) Analysis. Peptide map analysis was performed with Agilent (Santa Clara, CA) 1100/1200 HPLC using a Phenomenex (Torrance, CA) Jupiter C18 column (2.0 mm × 250 mm, 5 µm, 300 Å). Mobile phase A was water with 0.1% TFA, and mobile phase B was 90% acetonitrile in water with 0.09% TFA. A linear gradient from 0 to 40% B in 160 min was used, with 40 µg of digested protein injected for each analysis. The flow rate was 0.25 mL/min, with a column temperature of 55 °C. Liquid chromatography-mass spectrometry (LC-MS) and LC-MS/MS experiments were performed with Thermo Fisher Scientific LTQ XL and LTQ-Orbitrap XL (San Jose, CA) mass spectrometers equipped with an electrospray ionization source. Mass spectrometers were operated in positive ionization mode with the following source conditions: 4.5 kV spray voltage, 300 °C capillary temperature, and 35 and 9 L/min sheath and auxiliary gas flows, respectively. Collision induced dissociation (CID) experiments were performed with a 4 amu isolation width and a normalized collision

energy of 30%. Instruments were tuned and calibrated according to the manufacturer’s recommendations prior to data acquisition.

Initial sequence variant analysis was performed with the LTQ XL using acquisition conditions where one full-MS survey scan was followed by a zoom scan and MS/MS scan for the five most abundant ions, with dynamic exclusion enabled. Ions with MS/MS data that did not match the expected amino acid sequence of the protein were searched against a database containing all possible amino acid substitutions that may result from a single base substitution using a Mascot error tolerant search algorithm (Matrix Science, London, U.K.; http://www.matrixscience.com/help/error_tolerant_help.html). Details on the development and optimization of the LC-MS/MS data acquisition and analysis conditions will be published separately. Spiking experiments with the synthetic variant peptide were performed using the LTQ XL instrument, with MS1 data collected for m/z 800-805 and 814-819 in zoom scan mode.

High-resolution mass determination was performed with the LTQ-Orbitrap XL instrument operated in a full-scan MS mode with resolution set at 30 000 at m/z 400. These MS-only data were used to identify potential Ser to Asn misincorporation sites. For estimation of the relative levels of substitution, extracted ion chromatograms for expected and variant peptides were generated using their monoisotopic m/z ± 0.1 amu. Ions identified during this process were selected for further LC-MS/MS studies using the LTQ-Orbitrap XL instrument. A parent mass list was populated using the most abundant charge state m/z for each pair of expected and variant species. LC-MS/MS data were acquired using a full-MS survey scan with resolution set at 30 000 at m/z 400, followed by ion trap MS2 scans only for the ions included in the mass list with a parent mass selection width of 25 ppm.

RESULTS AND DISCUSSION

Identification of Serine to Asparagine Variant in One Tryptic Peptide. As part of cell line development for monoclonal antibody (mAb) products, sequence variant analysis was performed using purified protein. Tryptic digests of one IgG1 (MAb1) expressed in four different clones were analyzed by reversed-phase LC-MS/MS. Ions that did not match the expected amino acid sequence of the protein were searched against all possible amino acid substitutions that could result from a single base substitution (MASCOT error tolerant search). Such data analysis identified a Ser to Asn variant with a mass increase of 27 Da in the tryptic peptide IYPTSGSTNYADSVK (underline indicates substitution site). The retention time of the variant peptide was approximately 2 min earlier than that of the expected peptide (see extracted ion chromatograms in Figure 1). The CID spectra of the expected and variant peptides were compared and are shown in parts A and B of Figure 2, respectively. The y series of fragment ions clearly identifies each peptide as well as the third Ser residue from the N-terminus as the location of Asn substitution. This assignment was also supported by the observation of the b12 ion at the expected m/z and the b14 ion at the expected m/z + 27 Da. Since the variant peptide was detected at a very low level (∼0.1%), we considered the possibility that the +27 Da variant signal was perhaps caused by unknown side

Figure 1. Extracted ion chromatograms at (A) m/z 801.5-802.5 for expected peptide IYPTSGSTNYADSVK and (B) m/z of 815.0-816.0 for + 27 Da variant peptide. Peaks are labeled with retention time (RT) and peak area (MA).

Figure 2. Tandem mass spectra of (M + 2H)2+ precursor ions at (A) m/z of 801.9 for peptide IYPTSGSTNYADSVK; (B) m/z of 815.4 for + 27 Da variant species; and (C) m/z of 815.4 for synthetic peptide of IYPTSGSTNYADNVK. Fragment ions that support the identification of Asn substitution site are shown in red font.
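The precursor m/z values quoted in the captions of Figures 1 and 2 can be reproduced from the peptide sequences. The following is a minimal sketch, not the authors' software: it assumes standard monoisotopic residue masses, and it assumes that alkylation with iodoacetic acid leaves each Cys carboxymethylated (+58.00548 Da). The function name `peptide_mz` and the constants are illustrative.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565          # H2O added for the free N- and C-termini
PROTON = 1.007276          # mass of one charge-carrying proton
CARBOXYMETHYL = 58.00548   # assumed Cys shift after iodoacetic acid alkylation


def peptide_mz(sequence, charge, cys_shift=CARBOXYMETHYL):
    """Monoisotopic m/z of the (M + zH)z+ ion of a linear peptide."""
    neutral = sum(MONO[aa] for aa in sequence) + WATER
    neutral += cys_shift * sequence.count("C")
    return (neutral + charge * PROTON) / charge


normal = peptide_mz("IYPTSGSTNYADSVK", 2)
variant = peptide_mz("IYPTSGSTNYADNVK", 2)
print(f"normal  (M+2H)2+: {normal:.4f}")
print(f"variant (M+2H)2+: {variant:.4f}")
print(f"mass shift: {(variant - normal) * 2:.3f} Da")
```

Under these assumptions the doubly charged ions fall near m/z 801.9 and 815.4, and the neutral mass shift is about 27.011 Da.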

reactions from the guanidine hydrochloride denaturation and iodoacetic acid alkylation steps during sample preparation. Peptide mapping and LC-MS analysis of MAb1 samples that were reduced with tris(2-carboxyethyl)phosphine (TCEP) and digested with trypsin, without exposure to guanidine hydrochloride and iodoacetic acid, showed that the same +27 Da variant was present at similar levels (data not shown). This observation indicated that it was unlikely that the +27 Da variant was an artifact from sample preparation conditions used in peptide mapping. Accurate mass determination using both tryptic and chymotryptic digests confirmed the mass difference between the normal

Figure 3. Extracted ion chromatograms of m/z 815.4 (variant) and m/z 801.9 (normal) of sample tryptic digest and that spiked with synthetic variant peptide. Panel A, sample tryptic digest; panel B, sample + 0.05% spike; panel C, sample + 0.10% spike; panel D, sample + 0.25% spike. Peaks are labeled with retention time (RT) and peak area (MA).

and variant peptides was 27.01 Da. While theoretically both Ser to Asn and Thr to Gln substitutions can result in a mass increase of 27.011 Da (net addition of CHN), the possibility of a Thr to Gln variant was eliminated with the MS/MS data (Figure 2A,B). The observation of chymotryptic peptide ADSVKGRF that is devoid of Thr residues and a corresponding +27.01 Da variant peptide (ADNVKGRF) further supported the conclusion of a Ser to Asn substitution.

Confirmation of Ser to Asn Substitution with the Use of Synthetic Peptide. Identification of the variant peptide as a result of Ser to Asn substitution was firmly established with the use of a synthetic peptide. When analyzed with the same experimental conditions, the variant peptide from the sample tryptic digest produced the same MS/MS fragmentation pattern and ions as those seen in the synthetic peptide IYPTSGSTNYADNVK (parts B and C of Figure 2, respectively). We further confirmed the coelution and level of the variant peptide in the tryptic digest by spiking in the synthetic peptide at three different levels. As shown in Figure 3, the variant ion at m/z 815.4 was found at higher levels with increasing spike amounts. The relative level of variant was found at 0.11% for sample

without spike, using the peak area of 815.4 normalized to the sum of peak areas of 815.4 and 801.9 in their respective extracted ion chromatograms. In samples that were spiked with 0.05%, 0.10%, and 0.25% variant peptide (moles of variant peptide spiked per mole of normal tryptic peptide in the sample), our analysis detected the variant at 0.15%, 0.20%, and 0.35%, respectively. These data demonstrate that the synthetic Ser to Asn variant peptide coelutes with the variant peptide detected in the sample. The response was additive and linearly correlated with the added amounts (R² = 0.999).

The DNA codon used for Ser at which Asn substitution occurs is AGC. It is possible that a single base variation of G to A would result in codon AAC and lead to incorporation of Asn instead. We did not attempt to verify the presence of this low level DNA variant, as detection of polymorphisms by DNA sequencing is typically limited to a sensitivity level of ∼10% and in the best case ∼1%.9 While DNA sequence variants are typically clone-specific, four different clones of MAb1 subjected to peptide map analysis were observed with the same Ser to Asn variant, at similar low levels.
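The spike-recovery result can be checked with an ordinary least-squares fit of the detected variant level against the spiked amount. This is only an illustrative sketch using the four percentages reported in the text; the helper names `relative_level` and `linear_fit` are hypothetical and the fit is not taken from the authors' data processing.

```python
# Spike levels and detected variant levels (both in %), as reported in the text.
spiked = [0.00, 0.05, 0.10, 0.25]
detected = [0.11, 0.15, 0.20, 0.35]


def relative_level(variant_area, normal_area):
    """Variant peak area as a percentage of the summed variant + normal areas."""
    return 100.0 * variant_area / (variant_area + normal_area)


def linear_fit(x, y):
    """Ordinary least-squares slope, intercept, and R^2 for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(spiked, detected)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}%, R^2 = {r_squared:.3f}")
```

A slope close to 1 indicates an additive response, and the intercept (about 0.1%) approximates the endogenous variant level in the unspiked sample.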

Table 1. Tryptic Peptides in IgG1 MAb1 that Show +27.01 Da Variant Signals

                        normal peptide m/z, monoisotopic        variant m/z,
peptide                 theoretical  observed   error (ppm)a    monoisotopic   Δmassb (Da)
IYPTSGSTNYADSVK          801.8859     801.8855     -0.5          815.3912       27.011
SLSLSPG                  660.3563     660.3559     -0.6          687.3659       27.010
TTPPVLDSDGSFFLYSK        937.4646     937.4643     -0.3          950.9711       27.014
NTAYLQMNSLR              655.8297     655.8300      0.5          669.3376       27.015
LSCAASGFTFTSTGISWVR     1024.9909    1024.9911      0.2         1038.4983       27.014
DSTYSLSSTLTLSK           751.8829     751.8824     -0.7          765.3890c      27.013
                                                                 765.3870c      27.009
                                                                 765.3885c      27.012
VYACEVTHQGLSSPVTK        938.9591     938.9593      0.2          952.4656       27.013
VDNALQSGNSQESVTEQDSK    1068.4880    1068.4868     -1.1         1081.9924       27.011

a Error is the difference between observed and theoretical monoisotopic m/z values, divided by the theoretical m/z, and multiplied by 1 000 000. b Δmass is the difference between observed m/z for normal and variant peptides multiplied by the respective charge state (all ions have a charge state of 2, except those for SLSLSPG, which have a charge state of 1). c The variant ion at m/z 765.39 was observed at three separate retention times, representing possible heterogeneity in Asn substitution sites.
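The two derived columns of Table 1 follow directly from the footnote definitions. The sketch below recomputes them for three rows of the table and also checks that Ser to Asn and Thr to Gln both correspond to a net gain of CHN. The function names and the residue compositions written as (C, H, N, O) tuples are illustrative assumptions; the atomic masses are standard monoisotopic values.

```python
# Monoisotopic atomic masses (Da) in the order C, H, N, O.
ATOM_MASS = (12.0, 1.00782503, 14.00307401, 15.99491462)

# Residue elemental compositions as (C, H, N, O).
RESIDUE = {
    "Ser": (3, 5, 1, 2), "Asn": (4, 6, 2, 2),
    "Thr": (4, 7, 1, 2), "Gln": (5, 8, 2, 2),
}


def ppm_error(observed_mz, theoretical_mz):
    """Footnote a: (observed - theoretical) / theoretical x 1 000 000."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def delta_mass(variant_mz, normal_mz, charge):
    """Footnote b: observed m/z difference multiplied by the charge state."""
    return (variant_mz - normal_mz) * charge


def substitution_shift(old, new):
    """Monoisotopic mass change when residue `old` is replaced by `new`."""
    diff = [n - o for o, n in zip(RESIDUE[old], RESIDUE[new])]
    return sum(count * mass for count, mass in zip(diff, ATOM_MASS))


# (peptide, theoretical, observed, variant m/z, charge) taken from Table 1.
rows = [
    ("IYPTSGSTNYADSVK", 801.8859, 801.8855, 815.3912, 2),
    ("SLSLSPG", 660.3563, 660.3559, 687.3659, 1),
    ("NTAYLQMNSLR", 655.8297, 655.8300, 669.3376, 2),
]
for peptide, theo, obs, var, z in rows:
    print(f"{peptide:18s} {ppm_error(obs, theo):5.1f} ppm  "
          f"{delta_mass(var, obs, z):.3f} Da")

print(f"Ser->Asn: {substitution_shift('Ser', 'Asn'):.4f} Da")
print(f"Thr->Gln: {substitution_shift('Thr', 'Gln'):.4f} Da")
```

Because the two substitutions are isobaric, accurate mass alone cannot separate them; the text relies on MS/MS and on a Thr-free chymotryptic peptide for that distinction.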

Observation of +27.01 Da Signal at Multiple Ser Positions by High-Resolution Mass Spectrometry. We hypothesized that the Ser to Asn variant occurred not at the DNA level but rather at the translational level during protein synthesis. Unlike a DNA sequence variation, such a defect in protein synthesis would not be expected to be limited to a specific clone or to a specific codon location. Drawing analogy to other mistranslation events,2-5 we also hypothesized that a Ser to Asn substitution would occur at multiple Ser positions. Since some of the peptides with potential substitution sites were likely less abundant than the top five ions in any given survey scan, we would not have collected MS/MS data on those ions and would not be able to identify them by a MASCOT error tolerant search. We decided to take advantage of the significantly higher specificity in survey scan mode due to high mass resolution and acquired LC-MS data of the tryptic digest using an LTQ-Orbitrap mass spectrometer.

Samples from the same cell line (clone) but with two different cell culture conditions were analyzed, with sample A containing low levels of variant peptide IYPTSGSTNYADNVK and sample B showing a ∼5-fold higher level of the variant peptide. Multiple differences exist in cell culture conditions between samples A and B. However, we are unable to identify specifically which factor(s) contributed to the ∼5-fold difference in misincorporation. All tryptic peptides containing Ser were interrogated for the presence of Asn substitution with a 27.01 Da mass increase. Out of a total of 34 possible peptides that contain at least one Ser, we detected variant peptide masses (expected + 27.01 Da) in 8 of them (Table 1). The detection of multiple peptides with possible Ser to Asn misincorporation was consistent with our hypothesis of low-level mistranslation events.

(9) Druley, T. E.; Vallania, F. L. M.; Wegner, D. J.; Varley, K. E.; Knowles, O. L.; Bonds, J. A.; Robinson, S. W.; Doniger, S. W.; Hamvas, A.; Cole, F. S.; Fay, J. C.; Mitra, R. D. Nat. Methods 2009, 6, 263–265.

Identification of the Exact Positions of Asn Misincorporation by MS/MS. All ions listed in Table 1 were subjected to MS/MS analysis for confirmation of the peptide identity as well as the determination of the exact location of substitution. The analysis resulted in the identification of one Ser position in each peptide at which Asn misincorporation occurs (highlighted in red in Table 2). The variant signal for peptide DSTYSLSSTLTLSK (m/z 765.39) showed evidence of heterogeneity with several distinct retention times. Close examination of the fragmentation

Table 2. Misincorporation Positions in IgG1 MAb1 and Apparent Correlation with Codon AGC

a Ser to Asn misincorporation positions are shown in red font. Ser that is coded by AGC codon is shown with underlines. b Misincorporation % is the peak area of variant ion divided by the sum of peak areas of normal and variant peptides multiplied by 100.

data in comparison with that of the normal peptide identified at least three different Ser residues at which Asn misincorporation occurs (see Figure 4 for extracted ion chromatograms and Figure 5 for corresponding MS/MS spectra). Overall, 10 sites of Ser to Asn substitution were identified in MAb1 samples, providing strong support for our hypothesis that a mechanism other than DNA base substitution is responsible for the observed Ser to Asn misincorporation.

Identification of Ser to Asn Variant in mAb Molecules Expressed from Multiple Cell Lines. If our hypothesis was correct that mistranslation was responsible for the observed protein sequence variation, we expected that the observed Ser to Asn misincorporation would not be specific to a particular mAb. This was supported by LC-MS analysis of a second IgG1 mAb molecule (MAb2) expressed in CHO cells. Accurate mass determination confirmed the presence of multiple variant peptides containing Asn misincorporation sites. It is interesting to note that the same protein expressed and purified from E. coli showed

Figure 4. Extracted ion chromatograms of m/z 751.9 (peptide DSTYSLSSTLTLSK, normal) and m/z 765.4 (variant) at three separate retention times with schematics of fragmentation for each peak of interest. Peaks of interest are shaded and labeled with retention time (RT) and peak area (MA). (A) m/z 751.9 at RT 80.0 min for normal peptide; (B) m/z 765.4 at RT 78.7 min; (C) m/z 765.4 at RT 79.4 min; (D) m/z 765.4 at RT 79.8 min. Schematics show the major daughter ions observed for each peak. The red font highlights the Asn substitution sites and the fragment ions supporting the assignment. Refer to Figure 5 for MS/MS spectra of these four ions.
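The reason MS/MS is needed to localize the substitution in DSTYSLSSTLTLSK can be illustrated with a small calculation: every single Ser to Asn isomer of the peptide has the same precursor mass, but the y-ion series starts to shift by +27.01 Da at a different fragment for each site. The sketch below assumes standard monoisotopic residue masses and singly charged y ions; the helper names are hypothetical, and the code only enumerates candidate sites without asserting which ones were observed.

```python
# Monoisotopic residue masses (Da) for the residues in this peptide, plus Asn.
MONO = {"D": 115.02694, "S": 87.03203, "T": 101.04768, "Y": 163.06333,
        "L": 113.08406, "K": 128.09496, "N": 114.04293}
WATER = 18.010565
PROTON = 1.007276
SER_TO_ASN = MONO["N"] - MONO["S"]   # about +27.0109 Da


def y_ions(sequence):
    """Singly charged y-ion m/z values, listed from y1 upward."""
    ions, running = [], WATER + PROTON
    for aa in reversed(sequence):
        running += MONO[aa]
        ions.append(running)
    return ions


def ser_to_asn_variants(sequence):
    """Map each Ser position (1-based) to the single-substitution variant."""
    return {i + 1: sequence[:i] + "N" + sequence[i + 1:]
            for i, aa in enumerate(sequence) if aa == "S"}


def first_shifted_y(normal, variant, tol=0.001):
    """Smallest k for which y_k of the variant is shifted by +27.01 Da."""
    pairs = zip(y_ions(normal), y_ions(variant))
    for k, (mz_normal, mz_variant) in enumerate(pairs, start=1):
        if abs((mz_variant - mz_normal) - SER_TO_ASN) < tol:
            return k
    return None


normal = "DSTYSLSSTLTLSK"
variants = ser_to_asn_variants(normal)
for position, variant in variants.items():
    k = first_shifted_y(normal, variant)
    print(f"Asn at position {position:2d}: {variant}  y ions shifted from y{k}")
```

For a substitution at residue p of an n-residue peptide, the shift first appears at y(n - p + 1), which is why neighbouring candidate sites can be distinguished when the flanking y ions are observed.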

similar levels of Asn misincorporation across multiple sites of Ser residues (Table 3). An IgG4 mAb (MAb3) expressed in both CHO and NS0 cell lines also showed Ser to Asn misincorporation at multiple sites (Table 4). We note that the slightly higher misincorporation values seen in the NS0-expressed IgG4 mAb in comparison with the CHO-expressed material do not necessarily indicate that NS0 cells produce higher misincorporation (see the discussion below). Among all mAb samples that we have analyzed to date, a typical range of 10⁻³ to 10⁻⁴ substitution level was observed. Since the underlying cause of the observed substitution is not DNA mutation, we do not expect any cell-age-related stability or variability in the relative level of the variants. Samples of MAb1 and MAb2 molecules expressed in CHO at different cell ages (up to ∼100 days of age) were tested, and the analyses showed similar misincorporation levels.

Codon Correlation and Misincorporation Level. Serine can be encoded by six different codons (TCA, TCC, TCG, TCT, AGC, and AGT). Without exception, in this study all identified (by MS/MS) sites of substitution are serine positions that are encoded by the AGC codon (shown in underlines in Tables 2, 3, and 4). Codon AGC is the most frequently used of the six Ser codons in these mAb molecules, representing approximately 35% to over 50% of

Table 3. Misincorporation Positions and Levels in MAb2 Expressed in CHO and E. coli Cell Lines

a Ser to Asn misincorporation positions are shown in red font. Ser that is coded by AGC codon is shown with underlines. b Misincorporation % is the peak area of variant ion divided by the sum of peak areas of normal and variant peptides multiplied by 100.
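The relationship between the six Ser codons and the two Asn codons of the standard genetic code (AAC and AAT) can be enumerated directly. The following sketch lists, for each Ser codon, the single-base changes that would produce an Asn codon. It is a combinatorial illustration only, with hypothetical helper names, and it says nothing about which changes actually occur in cells.

```python
SER_CODONS = ["TCA", "TCC", "TCG", "TCT", "AGC", "AGT"]
ASN_CODONS = {"AAC", "AAT"}   # standard genetic code, DNA alphabet


def single_base_neighbors(codon):
    """Yield (position, new_codon) for every one-base change of a codon."""
    for index in range(3):
        for base in "ACGT":
            if base != codon[index]:
                yield index + 1, codon[:index] + base + codon[index + 1:]


def asn_neighbors(codon):
    """Single-base changes of `codon` that give an Asn codon."""
    return [(pos, new) for pos, new in single_base_neighbors(codon)
            if new in ASN_CODONS]


for codon in SER_CODONS:
    hits = asn_neighbors(codon)
    label = ", ".join(f"{new} (position {pos})" for pos, new in hits) or "none"
    print(f"{codon}: {label}")
```

Under the standard code, only AGC and AGT are one middle-position G to A change away from an Asn codon, while the four TCN codons differ from both Asn codons at two positions. The enumeration does not explain why the misincorporation sites identified in this study are all AGC-encoded.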


Figure 5. Tandem mass spectra of m/z 751.9 (peptide DSTYSLSSTLTLSK, normal) and m/z 765.4 (variant) at three separate retention times. (A) m/z 751.9 for normal peptide; (B) m/z 765.4 at RT 78.7 min; (C) m/z 765.4 at RT 79.4 min; (D) m/z 765.4 at RT 79.8 min. The red font highlights the fragment ions that are used to support the assignment of the exact position of Asn misincorporation. Refer to schematics in Figure 4 for amino acid sequences.

all Ser positions. While not all AGC codons resulted in detectable levels of Asn misincorporation, the fact that the observed misincorporation sites are only associated with the AGC codon is intriguing. The observed Asn misincorporation level is uneven, with a difference of 10-fold or more observed among the identified Ser positions, and with more than half of the AGC-coded Ser showing no detectable misincorporation at all. Using mass spectrometric characterization, Forman et al.10 reported context-dependent misincorporation of lysine at four arginine positions coded by a rare codon in E. coli.

(10) Forman, M. D.; Stack, R. F.; Masters, P. S.; Hauer, C. R.; Baxter, S. M. Protein Sci. 1998, 7, 500–503.

Defects at two different steps of protein biosynthesis contribute to the overall error rate: those at the transcription step during messenger ribonucleic acid (mRNA) synthesis, and those at the translation step during assembly of polypeptide chains. Translation errors can result from either misacylation (mischarging) of the cognate tRNA by the specific tRNA synthetase or codon-anticodon mismatch on the ribosome (misreading). Misacylation is expected to result in uniform distribution of the erroneous amino acid or amino acid analogue at the affected positions, and this appears to be true for norleucine misincorporation at methionine positions.3,4 Codon-specific misreadings of glutamine (CAG or CAA) for histidine (CAC or CAU) and lysine (AAA or AAG) for asparagine (AAU or AAC) have been reported in both E. coli and mammalian

cells.11-13 Misreading errors appear to occur more frequently at the first or third position of the codon, although middle position errors have been observed as well (reviewed in ref 14). It is possible that misreading of the AGC codon for Ser as the AAC codon for Asn gives rise to the observed misincorporation, although our results do not clearly show which of these mechanisms is involved. With the observation of mischarging of Thr by seryl-tRNA synthetase in vitro7 and computational analysis that suggested Asn can compete with Ser for formation of the activated intermediate with seryl-tRNA synthetase,8 it is possible that Ser to Asn misincorporation is the result of mis-aminoacylation involving tRNA that is specific for the AGC codon.

(11) Parker, J.; Johnston, T. C.; Borgia, P. T.; Holtz, G.; Remaut, E.; Fiers, W. J. Biol. Chem. 1983, 258, 10007–10012.
(12) Schneider, E. L.; King, D. S.; Marletta, M. A. Biochemistry 2005, 44, 987–995.
(13) Harley, C. B.; Pollard, J. W.; Stanners, C. P.; Goldstein, S. J. Biol. Chem. 1981, 256, 10786–10794.
(14) Parker, J. Microbiol. Rev. 1989, 53, 273–298.

The possibility of codon-specific misincorporation of norvaline was raised and investigated in recombinant expression of human hemoglobin in E. coli.6 Misincorporation of norvaline was not equally distributed at all Leu positions, and the only occurrences of a rare Leu codon were correlated with the two positions showing higher levels of misincorporation. However, the overall level of misincorporation

Table 4. Misincorporation Positions and Levels in MAb3 Expressed in CHO and NS0 Cell Lines

a Ser to Asn misincorporation positions are shown in red font. Ser that is coded by AGC codon is shown with underlines. b Misincorporation % is the peak area of variant ion divided by the sum of peak areas of normal and variant peptides multiplied by 100.

was not significantly impacted when the same rare codon at these two sites was changed to the one most commonly used. Although computational analysis predicted a higher ability of Thr than Asn to compete with Ser for binding with SerRS,8 there was no evidence of Ser to Thr substitution in our analysis. The exact position and level of misincorporation may be influenced by overall stress conditions for cell growth and protein production, amino acid concentration ratio, and other factors (reviewed in ref 14). It is also possible that misincorporation at certain positions can result in incorrect conformation, and those populations would fail to accumulate and would not be detected in the purified protein.6 It was reported recently that the ribosome would recognize errors due to mismatched codon-anticodon helix formation after peptide bond formation and terminate protein synthesis prematurely through release factors.15

(15) Zaher, H. S.; Green, R. Nature 2009, 457, 161–166.
(16) Jakubowski, H.; Goldman, E. Microbiol. Rev. 1992, 56, 412–429.
(17) Rosenberger, R. F.; Foskett, G. Mol. Gen. Genet. 1981, 183, 263–268.
(18) Hopfield, J. J.; Yamane, T.; Yue, V.; Coutts, S. M. Proc. Natl. Acad. Sci. U.S.A. 1976, 73, 1164–1168.
(19) Lin, S. X.; Baltzinger, M.; Remy, P. Biochemistry 1983, 22, 681–689.
(20) Yamane, T.; Hopfield, J. J. Proc. Natl. Acad. Sci. U.S.A. 1977, 74, 2246–2250.

Accuracy in protein biosynthesis is dependent on the ability of cellular machinery to correctly replicate, transcribe, and translate genetic information. There is a balance between the need to preserve the gene and its intended function and the benefit of sufficient flexibility to allow adaptation to changes in the environment.16 The transcription error rate is estimated to be ∼10⁻⁴ (ref 17). In vitro measurements of error rate in the selection of correct amino acids by aminoacyl-tRNA synthetases showed a range of 10⁻⁴ to 10⁻⁵ (refs 18-20). A slightly higher error frequency of 10⁻³ to 10⁻⁴

has been reported for in vivo protein synthesis.11,21 The observed range of 10⁻³ to 10⁻⁴ in Ser to Asn misincorporation in our study possibly represents basal and slightly elevated levels that can be expected.

Analytical Considerations. A number of factors influence our ability to detect and identify unknown amino acid substitutions that may occur randomly during protein expression, including the degree of chromatographic separation and ionization efficiency (or possible suppression of ionization). In our experience, the ability to identify variant peptides present at low abundance (