It takes a lot of drawers to hold the Max Weaver Dye Library. Each drawer holds between 600 and 1,000 vials.
INFORMATICS
Digitizing a massive dye library
Researchers at NC State analyze the first structures from the Max Weaver Dye Library’s 98,000 vials
CELIA HENRY ARNAUD, C&EN WASHINGTON
C&EN | CEN.ACS.ORG | MAY 1, 2017
CREDIT: NORTH CAROLINA STATE UNIVERSITY
CREDIT: NORTH CAROLINA STATE UNIVERSITY (VIALS); CHEM. SCI. (DENDROGRAMS)

A college of textiles is an unexpected place to find a chemist trained in high-resolution mass spectrometry. But the college of textiles at North Carolina State University had something that attracted Nelson R. Vinueza: It is home to the Max A. Weaver Dye Library. Weaver, a longtime researcher at Eastman Chemical, and his team collected dyes from the company over a period of more than 30 years. His efforts resulted in a treasure trove of about 98,000 vials of dye molecules dating from the 1960s to the 1980s, which Eastman donated to NC State in 2013, along with accompanying fabric swatches.

NC State hired Vinueza that same year, and he’s now codirector, with Harold S. Freeman, of the dye library. The library team is working to make the collection publicly available, in part to encourage researchers to find new uses for the compounds other than as textile dyes.

As a first step toward that goal, Vinueza, computational chemist Denis Fourches, and coworkers have digitized and analyzed the structures of 2,700 of the dyes. The team recently reported the results of this work, giving chemists a first glimpse at the structural diversity within the collection (Chem. Sci. 2017, DOI: 10.1039/C7SC00567A).

Swapping the positions of amino and hydroxyl groups is enough to change the colors of these dyes from yellow (left) to orange (right), even though they have the same absorption maximum (387 nm).

The team wanted this first set to be representative of the collection, so their students opened drawers and randomly picked dyes to be included. They avoided the earliest dyes because they worried the older ones might have degraded. “This sample is quite representative of the entire library in terms of the expected color distribution that Eastman provided,” Fourches says. “If we had obtained different distributions, we could have questioned whether this sample was really random.”

Each vial in the collection is labeled with the chemical structure of the dye. The team wanted to make sure the structures they analyzed were all unique ones. So after digitizing the structures, the researchers applied a standardization protocol that looked for salts, mixtures, and duplicates. Those compounds were removed from further analysis; the salts and mixtures remain in the database, however. “We want to make sure that in the final set, there is a very specific, well-defined structure for each dye that is curated,” Fourches says.

The standardization protocol resulted in a set of 2,196 unique dyes. They searched these structures for well-known chromophores, such as anthraquinone and stilbene substructures. The azo group is the most common substructure in the data set.

The team also acquired high-resolution mass spectra of a randomly selected 74-compound subset to make sure the mass, isotopic distribution, and elemental composition matched the expected values from the structures on the labels. “We just had the vials and structure,” Vinueza says. “High-resolution mass spectrometry gives us an idea if what we are searching is correct. That gives us more confidence about the structures.”

They characterized the dyes’ properties using cheminformatics software. “We can screen the library for dyes that are similar in terms of structure, shape, volume, or charge distribution to other dyes that have known biological or physicochemical properties of interest,” Fourches says. “It is very likely that in this library we have potential antibiotics, anticancer agents, new ways of coating materials.” This modeling provides a fast and inexpensive way to prioritize dyes for testing for these other properties.

The team has entered 150 of the dyes into the ChemSpider online structure database. When their collaborator Antony J. Williams of the National Center for Computational Toxicology compared the structures to the 58 million in ChemSpider, only seven were already in the database. “It’s almost the same probability as winning the lottery,” Fourches says. “It shows how unique the chemistry is in the dye library.”
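The high-resolution mass check described earlier can be illustrated with a short Python sketch. It is not the NC State team's workflow: it covers only the mass comparison (not isotopic distribution), treats the molecule as neutral rather than as the ion an instrument would actually detect, and the 5 ppm tolerance and the "measured" values are assumptions chosen for illustration. Anthraquinone (C14H8O2), one of the chromophores named above, serves as the example.

```python
import re

# Monoisotopic masses (in unified atomic mass units) of the most abundant
# isotope of a few elements common in dyes. Extend the table as needed.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720712,
    "Cl": 34.9688527,
}


def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C14H8O2'.

    Assumes no parentheses or charges; unknown elements raise KeyError.
    """
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return total


def ppm_error(measured: float, expected: float) -> float:
    """Relative mass error in parts per million."""
    return (measured - expected) / expected * 1e6


def mass_matches(formula: str, measured: float, tol_ppm: float = 5.0) -> bool:
    """True if the measured mass agrees with the label's formula.

    tol_ppm is a hypothetical tolerance, not a value from the article.
    """
    return abs(ppm_error(measured, monoisotopic_mass(formula))) <= tol_ppm


if __name__ == "__main__":
    expected = monoisotopic_mass("C14H8O2")  # about 208.0524
    print(f"expected mass: {expected:.4f}")
    # Both 'measured' values below are invented for illustration.
    print(mass_matches("C14H8O2", 208.0529))
    print(mass_matches("C14H8O2", 208.0600))
```

A real comparison would add or subtract the mass of the charge carrier (for example a proton) before computing the error.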
The NC State researchers also used modeling methods to cluster the dyes according to their structural similarity and displayed the results as dendrograms. “If two dyes are close together on the dendrogram, it means they are structurally similar,” Fourches says. “You expect to see clusters where the dyes have similar structures and similar colors.” Through this analysis, the scientists found that blue dyes, which are the largest color family both in the overall library and in the recently published subset, had multiple structural scaffolds.

NC State scientists clustered the structures of 2,196 dyes based on similarity to create these dendrograms, which are colored according to the dyes’ color (left) and molecular weight (right).

But sometimes very similar structures lead to different colors. When the team searched the database for constitutional isomers, they found one pair of compounds in which the only difference was in the location of a chlorine atom. But that difference was enough to make one dye red and the other one orange. In another pair, swapping the positions of a primary amine and a hydroxyl group meant the difference between a yellow and an orange dye, even though both dyes have the same experimentally measured absorption maximum.

A close-up of some of the vials in the dye library.

The dye library is of interest to researchers in the wider chemistry community. “Any large, well-documented collection of colorants like this is of interest to those of us who work in conservation science and technical art history,” says Gregory D. Smith, senior conservation scientist at the Indianapolis Museum of Art. “While the cheminformatics approach is interesting in distilling meaning out of this large group of materials, these collections are most useful to our field when the samples have been fully characterized and the resulting spectral libraries can be used to assist in dye identification from artworks and artifacts.”

But it will be a while before the entire collection is digitized, let alone fully characterized. Students have been drawing the structures so they can be converted into a digital format compatible with modeling software.

“From my perspective on the modeling side, the more unique structures I have to feed my computer, the better,” Fourches says. “But I know from the experimental point of view, the number of hours for undergrads and graduate students to write down the structures is enormous. We’re not sure yet what is the best strategy, but we need to find a path to get this library entirely digitized and modeled as soon as possible. And that requires funding.” The NC State team plans to seek grant funding and establish collaborations to continue characterizing the library. ◾
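The constitutional-isomer search mentioned above rests on a simple idea: isomers share a molecular formula but differ in how their atoms are connected. The Python sketch below shows one way such a search might work. It is an assumption-laden illustration, not the team's method. The record layout, dye IDs, formulas, and structure labels are all invented placeholders, and the sketch presumes that formulas are written in one consistent order and that each structure string is already in a canonical form.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Dye(NamedTuple):
    dye_id: str
    formula: str    # molecular formula, written in one consistent order
    structure: str  # canonical structure string (placeholder labels here)
    color: str


# Invented placeholder records; these are NOT entries from the dye library.
DYES = [
    Dye("dye-A", "C16H12ClN3O", "structure-1", "red"),
    Dye("dye-B", "C16H12ClN3O", "structure-2", "orange"),
    Dye("dye-C", "C14H8O2", "structure-3", "yellow"),
]


def isomer_pairs(dyes):
    """Return ID pairs that share a formula but differ in structure."""
    by_formula = defaultdict(list)
    for dye in dyes:
        by_formula[dye.formula].append(dye)

    pairs = []
    for group in by_formula.values():
        for first, second in combinations(group, 2):
            # Identical structure strings are duplicates, not isomers.
            if first.structure != second.structure:
                pairs.append((first.dye_id, second.dye_id))
    return pairs


if __name__ == "__main__":
    colors = {dye.dye_id: dye.color for dye in DYES}
    for first_id, second_id in isomer_pairs(DYES):
        print(first_id, colors[first_id], "vs", second_id, colors[second_id])
```

Listing the colors alongside each pair is what would surface cases like the red and orange dyes that differ only in the position of a chlorine atom.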