Data Shop: Breaking up is hard to . . . predict - Analytical Chemistry

Data Shop: Breaking up is hard to . . . predict. Katie Cottingham. Anal. Chem., 2004, 76 (15), pp 291 A–291 A. DOI: 10.1021/ac041481y. Publication ...
data shop

Breaking up is hard to . . . predict

To identify proteins in complex mixtures, proteomics researchers often use an enzyme to digest the proteins and analyze the peptides by LC/MS/MS. The experimentally obtained spectra are then compared with theoretical spectra of peptides contained in a database by an algorithm, such as Sequest or Mascot. If the score calculated by the algorithm is above a certain threshold value, researchers generally assume a correct identification has been made. But sometimes, the algorithms provide incorrect identifications.

In the July 15 issue of Analytical Chemistry (pp 3908–3922), Zhongqi Zhang at Amgen describes a new model for predicting how peptides will fragment in MS/MS experiments. The approach yields more realistic theoretical spectra than conventional models. “If you’re [able] to predict [fragmentation] more accurately, you can increase the confidence of peptide assignments,” says Zhang.

When Zhang first approached the issue of peptide fragmentation, he wasn’t trying to get on the proteomics bandwagon. He says that scientists at his company use LC/MS/MS to characterize individual, highly purified, recombinant proteins. “I wrote a little program to assign the fragment ions to help manual peptide assignments, but for some spectra, it didn’t work as well for non-expert users,” he says.

The problem nagged at Zhang. He developed simple models, but none of them worked to his satisfaction. Although Zhang worked with peptides all the time, he admits that he was not exactly an expert in the area of fragmentation mechanisms. “I had to start from the beginning and read those papers,” he says. “There were so many groups and so many studies, and I had to put it all together.”

© 2004 AMERICAN CHEMICAL SOCIETY

Other strategies for predicting fragmentation are based on statistical approaches, but Zhang says they are too simplistic. He explains that many parameters, such as basicity and activation energy, should be included for accurate predictions. And whereas some models generate spectra with the major fragment ion peaks normalized to the same intensities, Zhang’s version predicts the expected intensities of peptide fragments in a spectrum.
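As a rough illustration of the equal-intensity style of theoretical spectrum mentioned above, and not of Zhang's model, the following Python sketch lists fragment ions for a peptide and gives every peak the same intensity. The function name `uniform_spectrum`, the restriction to singly charged b and y ions, and the use of unmodified residues are all assumptions made for this example; the article does not say which ion types or charge states any particular search algorithm considers.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def uniform_spectrum(peptide, intensity=1.0):
    """Hypothetical helper: singly charged b and y ions, all equally intense.

    Returns a list of (m/z, intensity) pairs sorted by m/z. Every peak gets
    the same intensity, which is the simplification the article contrasts
    with a model that predicts peak intensities.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    peaks = []
    # Break the backbone after residue i (1 <= i < length of the peptide).
    for i in range(1, len(masses)):
        b_mz = sum(masses[:i]) + PROTON          # N-terminal fragment
        y_mz = sum(masses[i:]) + WATER + PROTON  # C-terminal fragment
        peaks.append((b_mz, intensity))
        peaks.append((y_mz, intensity))
    return sorted(peaks)


if __name__ == "__main__":
    for mz, inten in uniform_spectrum("PEPTIDE"):
        print(f"{mz:10.4f}  {inten:.1f}")
```

A sketch like this captures only where peaks may appear. The factors Zhang names, such as basicity and activation energy, are what an intensity-predicting model would have to account for, and nothing here attempts that.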

Once Zhang developed the new model, he trained it with 5605 MS/MS spectra. It simulated a spectrum for each peptide, which was compared with the experimentally obtained spectrum. If the two spectra differed, Zhang would tweak the model, and a program would optimize the parameters. The optimization process took two or three weeks every time he made a change.

In the final version, the simulated spectra matched experimental spectra with an average similarity value of 0.71, according to a metric he developed. Using a conventional approach, the similarity score was only 0.37. Zhang also tested the model with 147 test spectra obtained from hemoglobin peptides that were not included in the initial training set. The similarity score was 0.73 with his strategy and 0.40 with the conventional method.

Although Zhang says the model works well, he admits that it has some limitations. Zhang has tested it on only one manufacturer’s ion trap instrument. “Definitely, if you [use] it on a quadrupole TOF, for instance, it’s not going to work,” he warns. The model is instrument-specific and must be retrained before it can be applied to data from other instruments. Another limitation is that the current version predicts spectra only for singly and doubly charged precursor ions.

Some proteomics researchers are excited about Zhang’s work. “I think this is an important first step,” says John Yates at the Scripps Research Institute. Although Yates is impressed by the similarities between the theoretical and experimental spectra shown in the paper, he suggests that additional peptides be tested to further optimize the model. Nonetheless, he thinks that it will be very useful for proteomics database searching and de novo sequencing.

Matthias Mann, at the University of Southern Denmark, regards Zhang’s work as a landmark paper in proteomics. “It’s the first [model], I think, that simulates how peptide fragmentation actually happens,” he says. “What’s surprising is that this can be done at all, and that it simulates the spectra quite well.”

Neither Zhang nor Amgen is interested in selling a program containing this model, but Zhang encourages interested researchers to contact him. He is planning to offer a version of the executable program to the scientific community as a free resource. And he’s leaving the further development of the model to anyone who is willing to spend the time to do it. “I’m hoping instrument manufacturers or some software developers will read the paper and write their own program,” Zhang says.

Katie Cottingham

AUGUST 1, 2004 / ANALYTICAL CHEMISTRY 291 A