Data Trends in Protein Analysis for Safety Assessments - ACS

Mar 28, 2019 - Rong Wang , Ryan C. Hill , Norma L. Houston. 1 Bayer Crop Science, Regulatory Science, 700 Chesterfield Parkway,Chesterfield, Missouri ...

Chapter 5

Data Trends in Protein Analysis for Safety Assessments

Rong Wang,*,1 Ryan C. Hill,2 and Norma L. Houston3

1Bayer Crop Science, Regulatory Science, 700 Chesterfield Parkway, Chesterfield, Missouri 63017, United States
2Corteva Agriscience, Agriculture Division of DowDuPont, 9330 Zionsville Road, Indianapolis, Indiana 46268, United States
3Corteva Agriscience, Agriculture Division of DowDuPont, 7300 NW 62nd Avenue, Johnston, Iowa 50131, United States
*E-mail: [email protected].

Genetically modified (GM) crops go through a rigorous safety assessment process prior to commercialization. A number of regulatory requirements and technological innovations have contributed to the weight-of-evidence and tiered approach used to evaluate the safety of GM crops for food and feed consumption. Over the past 30 years, global regulatory agencies have gained significant experience and knowledge regarding safety evaluations in assessing the risks of GM crops, and to date, all products reviewed have been found to pose no risks to agriculture, human health, or the environment. One component of the safety assessment includes the characterization and evaluation of newly expressed proteins. In this chapter, we highlight the current data trends of this component including the process, technologies used, and unique challenges for plant-expressed proteins in GM crops.

© 2019 American Chemical Society Schoenau et al.; Current Challenges and Advancements in Residue Analytical Methods ACS Symposium Series; American Chemical Society: Washington, DC, 2019.

Introduction

Since the introduction of genetically modified (GM) crops in the 1990s, regulatory processes and risk-assessment frameworks have been developed and used to assess safety for humans, animals, and the environment (1–3). GM crops that express transgenic proteins are produced with recombinant DNA techniques that target insertions into a plant’s genome to confer a new desirable trait. To date, all GM crops have undergone extensive assessments for food, feed, and environmental safety before commercialization (2, 4). The safety assessment of GM crops follows the Organisation for Economic Co-operation and Development comparative principles, according to which conventional crops with a history of safe use serve as the baseline for evaluating the safety of GM crops (3). The regulatory assessment of a GM crop determines whether the GM crop is “as safe as” a conventional crop (5). From 1992 to 2017, nearly 100 petitions for deregulation in the United States were approved by the U.S. Department of Agriculture (USDA) (6). Over the past 30 years, global regulatory agencies have reviewed safety assessments of GM crops, and to date, all products reviewed have been found to pose no risks to agriculture, human health, or the environment (7). Further, the European Commission (8) stated, “The main conclusion to be drawn from the efforts of more than 130 research projects, covering a period of more than 25 years of research, and involving more than 500 independent research groups, is that biotechnology, and in particular GMOs [genetically modified organisms], are not per se more risky than e.g. conventional plant breeding technologies”. The assessment includes gauging the potential food and feed safety of the GM crop to prevent the introduction of a potential allergen or toxin into the food supply.

On a molecular level, proteins are one of nature’s most complex and versatile polymers.
Other biological polymers, such as nucleic acids and carbohydrates, use less diverse monomers and have limited chemical properties and structures. Proteins are composed of an assortment of 20 amino acids with a range of three-dimensional structures that enable proteins to serve as enzymes, structural materials for tissues, mechanisms for chemical transport, and storage aids that underlie biological functions (9). The widely diverse chemistries and biological functions of proteins can make them challenging to analytically detect, quantify, and characterize to meet the demands of increasing regulatory scrutiny. A collection of analytical techniques is used to identify, quantify, and characterize the newly expressed protein(s) present in GM crops, and the results are analyzed to assess their safety. Use of advanced technology helps to generate accurate data that answer key questions to assess risks. As a result, it is important to understand the advantages and limitations of each technology to provide clear, purposeful, transparent, and thorough data to meet the requirements of global regulatory agencies. As long as regulatory requirements are met, the adoption of faster, more reliable, and affordable technologies can promote innovation and support sustainable agriculture.

In this chapter, our discussion of analytical techniques used to assess the safety of newly expressed protein is organized into two major sections, “Protein Safety Assessment” and “Protein Characterization and Equivalence”. In the former section, we review and discuss the current state of protein analysis and the methodologies used in the safety assessment of GM crops, including relevant applications that assess protein abundance, structure, function, and stability. In the latter section, we examine analytical techniques used to characterize purified proteins prior to use in safety studies. In-depth details of the methods are not provided in this chapter; rather, the focus is on the basic principles of each technology, intended applications, potential problems for plant-expressed proteins, and the contributions of analytical methods to protein safety evaluations.

Protein Safety Assessment

The thorough safety assessment of newly expressed proteins in GM crops is outlined by the joint Food and Agriculture Organization of the United Nations (FAO) and World Health Organization (WHO) Food Standards Programme and summarized in the Codex Alimentarius titled Foods Derived from Modern Biotechnology (4). The following sections provide a high-level overview of the safety assessment components recommended by the Codex Alimentarius, including history of safe use, sequence homology to known toxins and allergens, expression levels, heat stability, in vitro digestibility, and toxicity studies (4). The multicomponent assessment is consistent with the tiered testing strategy advocated by the International Life Sciences Institute (ILSI) International Food Biotechnology Committee Task Force on Protein Safety, which consists of tier I (protein hazard identification) and tier II (hazard characterization), the latter conducted on a case-by-case basis (10). Technologies presented in this chapter can be used to draw conclusions regarding potential toxicological or allergenic properties of the protein(s) expressed in GM crops. Examples of each selected component are provided below.

History of Safe Use

The vast majority of proteins, as macronutrients, have a long history of safe use (HOSU) and are safe for consumption (11). However, a limited number of well-known dietary proteins are toxic, act as antinutrients, or are allergenic to humans, such as botulinum neurotoxin (12), some lectins (13), and the peanut allergen Ara h 2 (14), respectively. HOSU is often evaluated prior to the selection of candidate proteins (15, 16).
For example, strains of Bacillus thuringiensis (Bt) bacteria that contain insecticidal crystal (Cry) proteins have more than 50 years of demonstrated safe use as biological pesticides in spray applications, and several of these Cry proteins, such as Cry1F and Cry1Ac, were introduced into GM crops and have been in use for more than 20 years (17). In addition, 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS) from Agrobacterium strain CP4, a key enzyme in the shikimate pathway that is involved in aromatic amino acid biosynthesis and confers glyphosate tolerance, also has more than 20 years of demonstrated safe use (18). Proteins that do not have a HOSU can be considered novel from a consumption point of view, but homologues of those proteins found in food can be used to assess the history of use of the protein. Those with limited homology to commonly consumed proteins should be subjected to a multicomponent risk assessment for potential allergenicity or toxicity. To date, all proteins expressed by transgenes in GM crops have been subjected to this evaluation regardless of demonstrated HOSU; however, hazard identification and characterization should only be conducted on a stepwise and case-by-case basis (10).

Sequence Homology

Bioinformatic analysis takes into account the structure and function of the protein and focuses on the similarity or identity of the amino acid sequence of the protein of interest to a collection of known protein allergens, toxins, and antinutrients. A bioinformatic investigation is typically the first step toward identifying the potential homology of a novel protein to known allergens or toxins (19, 20). Allergenic proteins are processed by antigen-presenting cells into small peptides that are part of the immune response. Evaluating regions of intact proteins for similarity to regions of known allergens therefore makes sense biologically. For the assessment of GM crops, the Codex Alimentarius provides guidelines to support this strategy (2, 4). Under these guidelines, the amino acid sequence is considered to have potential allergenic cross-reactivity if greater than 35% sequence identity over a window of 80 or more amino acids is identified compared with known allergens; however, other criteria have been found to be both more selective and more sensitive for detecting true cross-reactivity (21–23). One such criterion is the E score, which is a metric that describes the number of hits (exact or similar matches) expected by chance. The lower the E score, the higher the similarity between sequences. E scores provide a more robust method, with a lower false-positive rate, than the simple identity criterion (24) and appropriate accuracy and sensitivity for sequence similarity of a query protein to known allergens (21–23).
An E score threshold of 1 × 10⁻⁵ to 1 × 10⁻⁶ has recently been reported to be useful for identifying cross-reactive immunoglobulin E- (IgE) binding epitopes (25). A second query is often performed with an eight-amino-acid sliding window that seeks to identify exact eight-amino-acid matches between a query protein and allergen proteins (Figure 1), but this criterion has been found to add little value to the allergenicity risk assessment (21, 26). Ensuring that introduced proteins are not homologous to non-IgE-mediated allergens, mainly referring to celiac disease (CD) peptides and specifically human leukocyte antigens HLA-DQ2 or -DQ8, is of concern to some regulators (27). CD is caused by the consumption of wheat, barley, rye, and sometimes oats by susceptible individuals (28). Avoidance of food products made with these grains is an effective, albeit inconvenient, means of preventing CD symptoms. A set of newly implemented regulations extends beyond protein families and requires searching against a gluten-derived nine-amino-acid peptide motif and its degenerate sequences consisting of Q/E-X1-P-X2 (where X1 is amino acid L, Q, F, S, or E and X2 is amino acid Y, F, A, V, or Q), resulting in 50 unique four-amino-acid peptide combinations (Figure 2) (27). Although the scientific basis for this new requirement is limited, new bioinformatic tools have been developed to align the known and putative four-amino-acid stretches with current regulatory guidance (27). An unintended consequence of this new requirement is that there is a high probability of finding random irrelevant matches with the 50 four-amino-acid combinations of known CD peptides and putative peptides.

Figure 1. Illustration of a static alignment using CLC Bio (QIAGEN): Eight-amino-acid stretch (solid box) against a pool of proteins in an allergen database, sliding one amino acid to the right for each search, giving a total of 147 consecutive eight-amino-acid stretches for Pru av 1 in the illustrated sequence length. 35% identity over 80 amino acids (dashed box).
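The eight-amino-acid sliding-window search illustrated in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used by any particular tool or regulatory submission: the function names `kmer_windows` and `exact_kmer_hits` are hypothetical, and the sequences in the usage note are made-up toy strings rather than real allergen entries. A sequence of length L yields L − 8 + 1 windows, so the 147 stretches mentioned in the caption would correspond to an illustrated length of 154 residues.

```python
def kmer_windows(seq, k=8):
    """Return every consecutive k-residue window, sliding one residue at a time."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def exact_kmer_hits(query, allergens, k=8):
    """Find exact k-residue matches between a query protein and each allergen.

    `allergens` maps an allergen name to its sequence (hypothetical input format).
    Returns {allergen_name: [(query_pos, allergen_pos, peptide), ...]} with
    0-based positions; allergens without any match are omitted.
    """
    # Index the query windows once so that each allergen window is a dict lookup.
    query_index = {}
    for pos, peptide in enumerate(kmer_windows(query, k)):
        query_index.setdefault(peptide, []).append(pos)

    hits = {}
    for name, allergen_seq in allergens.items():
        for allergen_pos, peptide in enumerate(kmer_windows(allergen_seq, k)):
            for query_pos in query_index.get(peptide, []):
                hits.setdefault(name, []).append((query_pos, allergen_pos, peptide))
    return hits


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not real proteins.
    toy_query = "MKTLLVAGGSPQRSTWYAAGHK"
    toy_db = {"toy_allergen": "PPGSPQRSTWYLL"}
    for name, matches in exact_kmer_hits(toy_query, toy_db).items():
        for query_pos, allergen_pos, peptide in matches:
            print(name, query_pos, allergen_pos, peptide)
```

Because a shared stretch of nine identical residues produces two overlapping eight-residue hits, consecutive hits are usually merged before manual inspection; that step is left out of the sketch.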

Figure 2. The 50 four-amino-acid peptide combinations resulting from the Q/E-X1-P-X2 motif.
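The count of 50 combinations follows directly from the motif: two choices for the first residue (Q or E), five for X1, a fixed P, and five for X2, giving 2 × 5 × 5 = 50. The sketch below enumerates them and scans a sequence for occurrences. It is an illustrative assumption of how such a search could be written, not the bioinformatic tool cited in the text; the names `cd_motif_peptides` and `find_cd_motifs` are hypothetical, and the example sequence is a toy string.

```python
import re
from itertools import product

FIRST = "QE"     # first position: Q or E
X1 = "LQFSE"     # allowed residues at X1
X2 = "YFAVQ"     # allowed residues at X2


def cd_motif_peptides():
    """Enumerate every four-residue peptide matching Q/E-X1-P-X2."""
    return sorted(a + x1 + "P" + x2 for a, x1, x2 in product(FIRST, X1, X2))


# A lookahead is used so that matches sharing a residue (for example the
# final Q of one match serving as the first Q of the next) are all reported.
CD_PATTERN = re.compile(r"(?=([QE][LQFSE]P[YFAVQ]))")


def find_cd_motifs(seq):
    """Return (0-based position, peptide) for each motif occurrence in seq."""
    return [(m.start(), m.group(1)) for m in CD_PATTERN.finditer(seq)]


if __name__ == "__main__":
    peptides = cd_motif_peptides()
    print(len(peptides), "motif peptides")
    # Toy sequence for illustration only.
    print(find_cd_motifs("MAQLPQLPYK"))
```

A motif this short and this degenerate is the reason the text warns about random, irrelevant matches: any hit reported by a scan like this one still needs to be judged in its sequence context.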

With the advances that have been made in genome sequencing, an enormous number of predicted protein sequences have been collected; however, the translated gene sequences do not provide a complete view of proteins or their interactions in plant biological systems. As a result, data management has become very important for allergen research. Proteins entered annually into the National Center for Biotechnology Information database may not be characterized or have clinical data to support or negate claims of allergenicity or toxicity. However, the COMPARE database (http://comparedatabase.org, accessed November 16, 2018), established by the Health and Environmental Sciences Institute’s Protein Allergen, Toxins, and Bioinformatics Committee, is an actively curated and continuously updated allergen database that includes new entries that have clinical or immune response evidence to enable meaningful allergy assessment. Other databases, including the International Union of Immunological Societies’ Structural Database of Allergenic Proteins, Allergen Database for Food Safety, AllergenOnline, AllFam – The Database of Allergen Families, AllergenPro, and AllerBase, are curated with different scopes and frequencies by their responsible organizations (29).

Unlike allergenicity, about which conclusions are drawn from the weight of evidence, protein toxicity can be directly tested with established animal models for acute oral toxicity. Recent investigations of in vitro methods with human intestinal epithelial cell monolayers have shown experimentally the potential for alternative tests to predict protein toxicity (30–37). Protein toxins display target-organism specificity and dosing constraints. Toxic proteins elicit their adverse effects through specific structural features of the intact (or nearly intact) protein. When toxic proteins are denatured by heat or digested, they lose their toxicity (unlike allergenic proteins, which elicit allergic responses as major histocompatibility complex-displayed fragments). For instance, snake venoms are toxic if injected into the bloodstream or soft tissues, but they are nontoxic if ingested orally (10). Consequently, evaluation of toxicity using bioinformatics is better suited to determining whether a protein is from a family that has toxic members. The bioinformatic assessments of the allergenic and toxic potentials of an introduced protein are different (15, 38). However, only the latter requires relative similarity or identity comparison and manual inspection to determine the potential risk.

Expression

Protein expression is used to characterize the exposure component of risk, where risk is a function of both hazard and exposure (39). Dietary exposure to an introduced protein can be estimated from the protein expression level in edible parts of a GM crop. The enzyme-linked immunosorbent assay is the most common technique used to determine protein expression levels; however, liquid chromatography coupled with tandem mass spectrometry is also capable of protein quantification (40–42). These protein quantification methods are discussed further in Chapter 6.
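As a rough illustration of how an expression level feeds into an exposure estimate, the sketch below multiplies a seed expression level by a daily consumption figure. It rests on several simplifying assumptions that are not taken from the chapter: the consumption value is hypothetical, all of the consumed commodity is assumed to come from the GM event, no loss of protein during processing or cooking is considered, and the expression level and consumption are assumed to be on the same weight basis. The function name `daily_protein_intake` is likewise hypothetical. The only figure drawn from the source is the 36.89 ppm reported for mEPSPS in Table 1.

```python
def daily_protein_intake(expression_ppm, consumption_g_per_kg_bw_day):
    """Estimate intake of an introduced protein in ug per kg body weight per day.

    expression_ppm: protein level in the edible tissue; 1 ppm equals 1 ug of
        protein per g of tissue on a mass basis.
    consumption_g_per_kg_bw_day: g of that tissue eaten per kg body weight
        per day (a hypothetical input in this sketch).
    """
    if expression_ppm < 0 or consumption_g_per_kg_bw_day < 0:
        raise ValueError("inputs must be non-negative")
    return expression_ppm * consumption_g_per_kg_bw_day


if __name__ == "__main__":
    mepsps_ppm = 36.89            # mean seed level for mEPSPS from Table 1
    hypothetical_consumption = 1.5  # g/kg bw/day, assumed for illustration
    intake_ug = daily_protein_intake(mepsps_ppm, hypothetical_consumption)
    print(f"Estimated intake: {intake_ug:.1f} ug/kg bw/day "
          f"({intake_ug / 1000:.4f} mg/kg bw/day)")
```

An estimate of this kind only covers the exposure side; it becomes informative for risk when it is set against a hazard measure, which is outside the scope of this sketch.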

Table 1. Expression Levels of Introduced Proteins from a Subset of GM Crops

Event name^a | Crop | Phenotype | Company | Year deregulated by USDA
MZHG0JG | Maize | Glufosinate- and glyphosate-tolerant | Syngenta | 2015
MON-87419-8 | Maize | Dicamba- and glufosinate-tolerant | Monsanto | 2016
MON-87411-9 | Maize | Rootworm-resistant, glyphosate-tolerant | Monsanto | 2015
DP-ØØ4114-3 | Maize | Insect-resistant and glufosinate-tolerant | Pioneer | 2013
MON-87751 | Soybean | Lepidopteran-resistant | Monsanto | 2014
DAS-81419-2 | Soybean | Insect-resistant | Dow AgroSciences | 2014
SYHT0H2 | Soybean | 4-Hydroxyphenylpyruvate dioxygenase- and glufosinate-tolerant | Syngenta | 2014

Event name | Protein name | Expression level in seed (ppm)
MZHG0JG | mEPSPS | 36.89 ± 10.06
MZHG0JG | PAT | ^b
MON-87419-8 | DMO | 0.19 ± 0.048
MON-87419-8 | PAT | 0.93 ± 0.27
MON-87411-9 | Cry3Bb1 | 4 ± 0.56
MON-87411-9 | CP4 EPSPS | 1.9 ± 0.31
DP-ØØ4114-3 | Cry1F | 3.3
DP-ØØ4114-3 | Cry34Ab1 | 24
DP-ØØ4114-3 | Cry35Ab1 | 1.1
DP-ØØ4114-3 | PAT |