Perspective pubs.acs.org/JAFC

The U.S. Dairy Forage Research Center (USDFRC) Condensed Tannin NMR Database

Wayne E. Zeller* and Paul F. Schatz

U.S. Dairy Forage Research Center, Agricultural Research Service, U.S. Department of Agriculture, 1925 Linden Drive, Madison, Wisconsin 53706, United States

ABSTRACT: This Perspective describes a solution-state NMR database for flavan-3-ol monomers and condensed tannin dimers through tetramers, compiled from the literature through 2015. The data are searchable by structure, molecular formula, degree of polymerization, and 1H and 13C chemical shifts of the condensed tannins. Citations for all literature references are provided and should serve as a valuable resource for scientists working in the field of condensed tannin research. The database will be updated periodically as additional information becomes available, typically on a yearly basis, and is available for use, free of charge, from the U.S. Dairy Forage Research Center (USDFRC) Website.

KEYWORDS: condensed tannins, proanthocyanidins, PACs, database, nuclear magnetic resonance spectroscopy, NMR



these oligomers, and the 13C NMR spectra are available via computer-generated shift predictions based on the HOSE code method. This database depends in part on outside investigators contributing entries to expand it (its open-source component). To summarize, none of the available online databases provided the NMR information we required. Thus, we set out to develop a database focused exclusively on isolated and characterized condensed tannins, with the specific objective of consolidating all of the experimental NMR spectroscopic data from the available literature. The impetus for this project was simply that we needed a way to compare data from the literature in support of our current investigations into the structure of purified condensed tannin samples. Compiling and organizing these data has produced a searchable database. Herein we describe access to, and the utility of, a database listing all available 1H and 13C NMR data for oligomeric condensed tannin structures through tetramers, along with their flavan-3-ol monomers. The database also lists compounds of this class for which NMR data were not made available. The database was constructed to be searchable on a variety of criteria, including structure type, extender and terminal flavanol unit sequence, interflavan linkage type, and 1H and 13C NMR chemical shifts.
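Searching by chemical shift usually means matching a query value within some tolerance. The following is a minimal Python sketch of that idea, not the database's FileMaker Pro implementation. The `Entry` class, the `search_by_shift` function, the default tolerance, and the shift values (placeholders, not measured data) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    """Hypothetical database record: a backbone code plus its 13C shifts (ppm)."""
    backbone_code: str
    c13_shifts: List[float] = field(default_factory=list)


def search_by_shift(entries: List[Entry], target_ppm: float,
                    tolerance_ppm: float = 0.5) -> List[Entry]:
    """Return entries with at least one 13C shift within the tolerance of the target."""
    return [
        entry for entry in entries
        if any(abs(shift - target_ppm) <= tolerance_ppm for shift in entry.c13_shifts)
    ]


# Placeholder shift values only; these are NOT real measurements.
entries = [
    Entry("EC48CA", [10.0, 20.0, 30.0]),
    Entry("EC46CA", [10.4, 55.0]),
    Entry("EC482O7CA", [70.0]),
]

hits = search_by_shift(entries, target_ppm=10.2, tolerance_ppm=0.3)
print([entry.backbone_code for entry in hits])
```

A tolerance window rather than an exact match is used here because reported shifts for the same compound vary with solvent and referencing; the appropriate window is a judgment call for the user.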

INTRODUCTION

Information helps drive new technological advances. Traditionally, the assemblage of existing knowledge into publications and monographs, disseminated as review articles and focused chapter entries, has assisted scientists in locating requisite information. Computer-assisted compilation of these literature data into searchable databases has greatly increased the retrieval capacity for this information. Instruments such as mass spectrometers and infrared spectrophotometers are now typically equipped with standard databases for spectral searches upon installation. Online libraries exist for mass spectrometric,1−3 infrared,1,4 and NMR4−7 data and provide important tools that enable researchers to quickly identify plant components and metabolic compounds. The assemblage of spectroscopic data of naturally occurring components and their metabolites into searchable databases remains an ongoing goal, both for identifying previously isolated materials and for the ever-expanding discovery of new molecular entities.

We were faced with the task of retrieving and comparing data for the known and NMR-characterized condensed tannins (also known as proanthocyanidins or PACs). The search of the available databases to retrieve this type of information proved less than fruitful. The Spectral Database for Organic Compounds (SDBS)4 lists 1H, 13C, and 1H−13C HSQC spectra for only epicatechin. The Human Metabolome Database (HMDB)5 returned 229 hits on a flavan-3-ol substructure search but included NMR data only for catechin and epicatechin (1H, 13C, and 1H−13C HSQC spectra), with most of the spectral information available in the form of predicted LC−MS/MS data. The MetIDB database6 provides 184 flavan structures with predicted 1H NMR spectra available in a variety of deuterated NMR solvents but is limited to monomeric flavan-3-ols and their corresponding derivatives.
The NMRShiftDB database7 did return 41 hits on a flavan-3-ol substructure search, from which seven condensed tannin oligomers were listed. The 1H NMR chemical shifts are available on most of

This article not subject to U.S. Copyright. Published 2017 by the American Chemical Society



Received: May 17, 2017
Revised: June 6, 2017
Accepted: June 7, 2017
Published: June 7, 2017
DOI: 10.1021/acs.jafc.7b02314 J. Agric. Food Chem. 2017, 65, 5104−5106

DESCRIPTION OF THE TANNIN NMR DATABASE

Requirements, Platform, Approach, and Overview of the Database. The FileMaker Pro platform was selected for the construction of this database because it is straightforward to build on and easy for the novice researcher to use. The database was envisioned to serve as a

Journal of Agricultural and Food Chemistry


Figure 1. Numbering and alphabetical ring designations of structures depicting B-Type (1 and 2, 4-8 and 4-6 bonding, respectively), A-Type (3, 482O7 bonding), and less common B-ring to D-ring (6p8) linkages. The conventional alphabetical labels of the flavan-3-ol rings are also given.

chemical structure of each condensed tannin oligomer to a single-line alphanumeric designation and allow rapid searching and sorting of database entries. For example, in structure 1 (Figure 1), the top flavan-3-ol (extender) subunit is an epicatechin (EC) subunit, the bottom flavan-3-ol (terminal) subunit is a catechin (CA) subunit, and the covalent bond between the two flavan-3-ols runs from C-4 of EC to C-8 of CA, so the structural descriptor (backbone code) for this molecule reduces to simply EC48CA. For structure 2 (Figure 1), where the covalent bond between the two flavan-3-ols runs from C-4 of EC to C-6 of CA, the one-line structural description (backbone code) is EC46CA. Note that the hydroxyl group at position 3 of the heterocyclic ring of the top unit is trans to the link at position 4. This is almost exclusively the case. Only on very rare occasions are the 3,4-substituents positioned in the cis configuration in naturally occurring condensed tannins;10,11 however, the 3,4-cis configuration has been obtained during synthetic studies of condensed tannins.12,13

The second major type of linkage is one in which two single bonds are formed between adjacent flavan-3-ol subunits. This is demonstrated in structure 3 and is referred to as an A-type linkage. The carbon atom at the 4 position of the extender EC subunit is singly bonded to the carbon atom at the 8 position of the terminal CA subunit, and the carbon atom at the 2 position of the EC subunit is singly bonded to the oxygen atom attached to the carbon at the 7 position of the CA unit. In this example, the bonding type is coded as 482O7. The backbone code for this molecule is then EC482O7CA. Interflavan A-type linkages can also arise from the 46 interflavan linkage (i.e., 462O7).
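The B-type and A-type backbone codes described above follow a regular pattern that is easy to split mechanically. The sketch below is one possible way to do so in Python and is not part of the database itself. The function names `parse_backbone` and `linkage_type` are hypothetical, the parser accepts only the four linkage codes named in the text (48, 46, 482O7, 462O7), and it assumes that longer oligomers simply repeat the linkage-plus-subunit pattern, which the text does not spell out.

```python
import re
from typing import List, Tuple

# Two capital letters per subunit (e.g., EC, CA), as in the abbreviations above.
SUBUNIT = r"[A-Z]{2}"
# B-type: 48 or 46; A-type: the same with the extra 2-O-7 bond appended (482O7, 462O7).
LINKAGE = r"4[68](?:2O7)?"

_CODE_RE = re.compile(rf"{SUBUNIT}(?:{LINKAGE}{SUBUNIT})*")
_STEP_RE = re.compile(rf"({LINKAGE})({SUBUNIT})")


def parse_backbone(code: str) -> Tuple[List[str], List[str]]:
    """Split a backbone code into (subunits, linkages), top unit first."""
    if not _CODE_RE.fullmatch(code):
        raise ValueError(f"unrecognized backbone code: {code!r}")
    units = [code[:2]]
    linkages = []
    # Each remaining step is one linkage followed by one subunit.
    for linkage, unit in _STEP_RE.findall(code[2:]):
        linkages.append(linkage)
        units.append(unit)
    return units, linkages


def linkage_type(linkage: str) -> str:
    """Classify a linkage code as A-type (doubly linked) or B-type (singly linked)."""
    return "A-type" if linkage.endswith("2O7") else "B-type"


for code in ("EC48CA", "EC46CA", "EC482O7CA"):
    units, linkages = parse_backbone(code)
    print(code, units, [(link, linkage_type(link)) for link in linkages])
```

Linkages through B-ring carbons are deliberately left out of this sketch, so a code containing one raises `ValueError` rather than being misread.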
The connectivity of flavan-3-ol subunits in a few of the condensed tannin structures contained in the database does not fall under the umbrella of B-type or A-type linkages. The interflavan linkages of these few entries involve bond connections with B-ring carbons. To designate the linkage uniquely, the number of the B-ring carbon participating in covalent bond formation is followed by the letter p. As an example, in structure 4 (Figure 1) the extender EC subunit is connected to the C-8 carbon of the terminal CA subunit via carbon 6′. The backbone code for this molecule is then EC6p8CA.

Alphabetical Labeling of Condensed Tannin Rings. Identification of ring systems in the database parallels those

resource tool for individual researchers engaged in isolation, structural determination, and synthesis of condensed tannins. The Reaxys database was used as the primary search engine for identifying and guiding the selection of entries for the database. Condensed tannins reported in the literature up through tetramers are included, and any available NMR chemical shift data are listed. The NMR data were extracted from 309 literature references and entered manually into the database. When more than one set of NMR data was reported for a compound, the most complete set was selected.

Abbreviations of Condensed Tannin Subunits. The condensed tannin subunits (monomeric flavan-3-ols) present in the database are represented by the following two-letter abbreviations: CA, Catechin; EC, Epicatechin; GC, Gallocatechin; EG, Epigallocatechin; AF, Afzelechin; EF, Epiafzelechin; GB, Guibourtinidol; EB, Epiguibourtinidol; FS, Fisetinidol; EF, Epifisetinidol; RB, Robinetinidol; and ER, Epirobinetinidol. For coding purposes, no differentiation has been made between enantiomers of flavan-3-ol subunits, i.e., both (−)-epicatechin and (+)-epicatechin are coded as EC. However, the enantiomeric designations of flavan-3-ol subunits reported in the literature are captured under the Name listing.

Subunit Numbering. Flavan-3-ol subunit numbering used in the database NMR tables follows the conventional and well-accepted practices8,9 depicted in Figure 1. The numbering commences at the oxygen and proceeds around the benzopyran ring system (2, 3, 4, 4a, 5, 6, 7, 8, 8a). The phenyl substituent attached to C-2 of the benzopyran ring system is numbered 1′ through 6′, following IUPAC nomenclature rules.

Interflavan Linkage Nomenclature and Structure Searches.
Whereas structural searches are available on some databases through drawing programs such as ChemDraw and MarvinSketch, we chose instead a simplified search method based on the subunit abbreviations and interflavan bond designations commonly used in the condensed tannin literature. The two-letter flavan-3-ol subunit identification takes into account the hydroxylation pattern of the flavan-3-ol base structure and the relative stereochemistry of the C-2 and C-3 substituents of the flavan-3-ol C ring. The bonding between adjacent flavan-3-ol subunits is listed simply as the numbered atoms participating in the interflavan linkage. These simplifications reduce the



already commonly used in the literature. In this convention, sequential capital letters identify first the phloroglucinol ring, then the phenyl ring projecting from C-2, and finally the dihydropyran ring of the monomeric unit (Figure 1). The letters ABC are reserved for the extender unit furthest from the terminal unit, followed in sequence by DEF for the phloroglucinol, phenyl, and dihydropyran rings, respectively, of the next extender unit. This capital-letter labeling continues until the terminal flavan-3-ol unit is reached. In a few cases, there is more than one possible terminal unit (e.g., entry 139, Reaxys number 20000364), especially where extensive branching (i.e., 4-6 interflavan linkage) occurs. In these cases, guidance on the alphabetical labeling of rings is provided in the comment section of the database entry.

Comment Section. Any revisions made to the NMR assignments, references to Reaxys number entries of compounds of similar structure, observations and ratios of rotamer populations, and clarifications of structure connectivity due to branching are contained in the comment section of each database entry.

Access to the Database. The database is available for use from the U.S. Dairy Forage Research Center Website at https://www.ars.usda.gov/mwa/madison/dfrc/tannin. Registration for database access is free of charge. To register, click on the “Click here to request access to the Tannin Database” link and write a short request note in the e-mail prompt. You will then receive a user name and password in a return e-mail message. Information obtained from the registration will not be made public in any manner and will be used only to monitor database usage.

Corrections and Revisions. Although they were infrequent, we did encounter incorrect assignments, typographical errors, and, in some cases, errors generated in transferring structures from the literature into the Reaxys search engine.
Thus, a small number of chemical shifts were edited during the construction of database entries, and these edits are noted in the comments section of the entry. Users are welcome to suggest NMR data from the literature that were inadvertently missed during our searches, to notify us of new data as they become available, and to propose corrections and revisions. Please send details of your suggestions in an e-mail addressed to the corresponding author of this paper. It is expected that the database will be revised on a yearly basis, with the latest update listed on the USDFRC Website. The database should serve as a useful resource for scientists researching the identity, isolation, or synthesis of condensed tannins; it fills a void in the SDBS, HMDB, and MetIDB databases and complements the existing NMRShiftDB database entries.
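The alphabetical ring-labeling convention described earlier (ABC for the extender unit furthest from the terminal unit, DEF for the next unit, and so on) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the database: `label_rings` and `RING_ORDER` are hypothetical names, the units are assumed to be listed from the furthest extender to the terminal unit, and branched structures with more than one possible terminal unit are not handled.

```python
from string import ascii_uppercase
from typing import Dict, List, Tuple

# Ring order within each unit, per the convention in the text:
# phloroglucinol ring first, then the phenyl ring on C-2, then the dihydropyran ring.
RING_ORDER = ("phloroglucinol", "phenyl", "dihydropyran")


def label_rings(units: List[str]) -> List[Tuple[str, Dict[str, str]]]:
    """Assign sequential capital letters to the three rings of each unit in order."""
    if len(units) * len(RING_ORDER) > len(ascii_uppercase):
        raise ValueError("too many units to label with single capital letters")
    labelled = []
    for index, unit in enumerate(units):
        start = index * len(RING_ORDER)
        letters = ascii_uppercase[start:start + len(RING_ORDER)]
        labelled.append((unit, dict(zip(RING_ORDER, letters))))
    return labelled


# Linear dimer with EC as the extender and CA as the terminal unit.
for unit, rings in label_rings(["EC", "CA"]):
    print(unit, rings)
```

For branched entries, the comment section of the database entry remains the authority on ring labels.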



ORCID

Wayne E. Zeller: 0000-0002-1883-4519

Notes

The authors declare no competing financial interest. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture.



(1) NIST Standard Reference Database 1A, NIST/EPA/NIH Mass spectral library with Search Program. http://www.nist.gov/srd/nist1a.cfm. Accessed June 1, 2017.
(2) Horai, H.; Arita, M.; Kanaya, S.; Nihei, Y.; Ikeda, T.; Suwa, K.; Ojima, Y.; Tanaka, K.; Tanaka, S.; Aoshima, K.; Oda, Y.; Kakazu, Y.; Kusano, M.; Tohge, T.; Matsuda, F.; Sawada, Y.; Hirai, M. Y.; Nakanishi, H.; Ikeda, K.; Akimoto, N.; Maoka, T.; Takahashi, H.; Ara, T.; Sakurai, N.; Suzuki, H.; Shibata, D.; Neumann, S.; Iida, T.; Tanaka, K.; Funatsu, K.; Matsuura, F.; Soga, T.; Taguchi, R.; Saito, K.; Nishioka, T. MassBank: A public repository for sharing mass spectral data for life sciences. J. Mass Spectrom. 2010, 45, 703−714.
(3) mzCloud. https://www.mzcloud.org/. Accessed June 1, 2017.
(4) Spectral database for organic compounds, SDBS. http://sdbs.db.aist.go.jp/sdbs/cgi-bin/direct_frame_top.cgi. Accessed June 1, 2017.
(5) Wishart, D. S.; Tzur, D.; Knox, C.; Eisner, R.; Guo, A. C.; Young, N.; Cheng, D.; Jewell, K.; Arndt, D.; Sawhney, S.; Fung, C.; Nikolai, L.; Lewis, M.; Coutouly, M.-A.; Forsythe, I.; Tang, P.; Shivastava, S.; Jeroncic, K.; Stothard, P.; Amegbey, G.; Block, D.; Hau, D. D.; Wagner, J. Y.; Duggan, G. E.; Miniaci, J.; Clements, M.; Gebremedhin, M.; Guo, N.; Zhang, Y.; Duggan, G. E.; MacInnis, G. D.; Weljie, A. M.; Dowlatabadi, R.; Bamforth, F.; Clive, D.; Greiner, R.; Li, L.; Marrie, T.; Sykes, B. D.; Vogel, H. J.; Querengesser, L. HMDB: the human metabolome database. Nucleic Acids Res. 2007, 35, D521−D526.
(6) Mihaleva, V. V.; te Beek, T. A. H.; van Zimmeren, F.; Moco, S.; Laatikainen, R.; Niemitz, M.; Korhonen, S.-P.; van Driel, M. A.; Vervoort, J. MetIDB: A public accessible database of predicted and experimental 1H NMR spectra of flavonoids. Anal. Chem. 2013, 85, 8700−8707.
(7) Steinbeck, C.; Krause, S.; Kuhn, S. NMRShiftDB-constructing a free chemical information system with open-source components. J. Chem. Inf. Comput. Sci. 2003, 43, 1733−1739.
(8) Hagerman, A. E. Tannin Handbook; Miami University: Oxford, OH, 2011; available online at http://www.users.muohio.edu/hagermae/. Accessed June 1, 2017.
(9) Beecher, G. R. Overview of dietary flavonoids: nomenclature, occurrence and intake. J. Nutr. 2003, 133, 3248S−3254S.
(10) Schleep, S.; Friedrich, H.; Kolodziej, H. The first natural procyanidin with a 3,4-cis configuration. J. Chem. Soc., Chem. Commun. 1986, 392−393.
(11) Santos-Buelga, C.; Kolodziej, H.; Treutter, D. Procyanidin trimers possessing a doubly linked structure from Aesculus hippocastanum. Phytochemistry 1995, 38, 499−504.
(12) Kozikowski, A. P.; Tückmantel, W.; Hu, Y. Studies in polyphenol chemistry and bioactivity. 3. Stereocontrolled synthesis of epicatechin-4α,8-epicatechin, an unnatural isomer of the B-type procyanidins. J. Org. Chem. 2001, 66, 1287−1296.
(13) Mohri, Y.; Sagehashi, M.; Yamada, T.; Hattori, Y.; Morimura, K.; Kamo, T.; Hirota, M.; Makabe, H. An efficient synthesis of procyanidins. Rare earth metal Lewis acid catalyzed equimolar condensation of catechin and epicatechin. Tetrahedron Lett. 2007, 48, 5891−5894.

ASSOCIATED CONTENT

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jafc.7b02314. Detailed user’s guide for successful use of the database (PDF)



REFERENCES

AUTHOR INFORMATION

Corresponding Author

*Phone: 608-890-0071. Fax: 608-890-0076. E-mail: Wayne. [email protected].
