The chemical formula. Part II: Determination - Journal of Chemical

Kinds of chemical formulas, determination of a chemical formula, and problem formulas. Keywords (Domain): Analytical Chemistry. Keywords (Feature): ...
The Chemical Formula
Part II: Determination
by Doris Kolb
Illinois Central College, East Peoria, Illinois 61635

"The Chemical Formula, Part II: Determination" is part of a series of substantive reviews of chemical principles taught first in high school chemistry courses. Dr. Kolb received a BS degree from the University of Louisville and both MS and PhD degrees from The Ohio State University. She has been employed as a chemist at the Standard Oil Company and as a television lecturer in a series, "Spotlight on Research." She has served on the staffs of Corning Community College and Bradley University. Since 1967, she has been Professor of Chemistry at Illinois Central College.

"How do we know that cholesterol molecules really look like that?" That neatly printed structural formula of cholesterol, or sucrose, or penicillin gives a student no hint as to how much effort, ingenuity, and frustration went into its determination.

Every chemical formula is based on experimental observation. The simplest formulas are those derived from elemental composition alone. Structural formulas are usually much more difficult to determine. The more complex the formula, the more inventive must be the analytical methods for figuring it out. Sometimes many bits of information must be fitted together like pieces in a jigsaw puzzle before the structural formula of a molecule is finally determined.

In the case of cholesterol, for example, its molecular formula was calculated to be C26H44O (1859) and then corrected to C27H46O (1888), but no one was willing to tackle its structure really seriously until Adolf Windaus began to study it in 1903. Using oxidative degradation and other techniques he was able to identify and locate various structural units (an isooctyl side chain, a double bond, a secondary alcohol, four interconnected rings, etc.), which eventually led him to propose this structural formula:

[Structural formula proposed by Windaus; drawing not reproduced]

In 1928 Windaus received the Nobel Prize in chemistry for this work, even though it turned out later that the structure was incorrect. It was not until an accurate structure for the four-ring steroid nucleus was finally established in 1932, through X-ray diffraction studies, that the correct structural formula was assigned to cholesterol (Fig. 1a). However, the cholesterol molecule has eight different asymmetric carbon atoms (chiral centers), each one capable of existing in a right- or left-handed configuration, so there are 2^8, or 256, different stereoisomers possible for that structure, and only one of them is natural cholesterol. The stereochemistry of cholesterol was not completely worked out until 1955, providing configurational formulas such as those in Fig. 1 (b and c).

Figure 1. Structural formulas for cholesterol.

For simple substances we are accustomed to writing empirical formulas strictly on the basis of valence. Thus chloride ion (with a valence of -1) combines with ions of sodium, magnesium, and aluminum (with valences of +1, +2, and +3, respectively) to form compounds with the formulas

NaCl    MgCl2    AlCl3
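The valence bookkeeping behind these three formulas can be sketched in a few lines of Python. This is only an illustration, assuming simple ions with fixed charges; the function name `formula_from_valences` is hypothetical and does not come from the article.

```python
from math import gcd


def formula_from_valences(cation, cation_charge, anion, anion_charge):
    """Return the simplest electrically neutral formula for two ions.

    Assumes one kind of cation (positive charge) and one kind of anion
    (negative charge); subscripts of 1 are omitted, as in NaCl.
    """
    if cation_charge <= 0 or anion_charge >= 0:
        raise ValueError("expected a positive cation and a negative anion")
    negative = -anion_charge
    # Least common multiple of the two charge magnitudes: the total
    # positive (and negative) charge in one formula unit.
    total = cation_charge * negative // gcd(cation_charge, negative)
    n_cation = total // cation_charge
    n_anion = total // negative

    def part(symbol, count):
        return symbol if count == 1 else f"{symbol}{count}"

    return part(cation, n_cation) + part(anion, n_anion)


# The three chlorides mentioned in the text.
for metal, charge in [("Na", 1), ("Mg", 2), ("Al", 3)]:
    print(formula_from_valences(metal, charge, "Cl", -1))
```

The sketch only balances charges; it says nothing about where the valences themselves come from, which is the point taken up next.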

It may appear that these formulas did not require any experimental observation, since all we needed in order to write them were the appropriate valences. But where did we get the valences? "From the periodic table," someone suggests. Actually the valence concept came before the periodic table, and was in fact partly responsible for its development. Valences came originally from empirical formulas, which were obtained in turn from experimental data. No matter how simple or how complicated they are, all chemical formulas are based on experimental facts.

Kinds of Chemical Formulas

The simplest kind of formula is an empirical formula, which reflects the composition of a substance. The symbols of the elements in a compound are given along with subscripts to indicate their least atomic ratio. An empirical formula tells us what elements are present in a compound, and in what atomic proportion, but that is all.

A molecular formula looks very much like an empirical formula, but a new dimension has been added. A molecular formula tells us about the size of the individual molecule as well as its composition. The subscripts in this case represent actual numbers of atoms of each element per molecule. A molecular formula contains information about molecular weight.

A structural formula tells us not only how many atoms of what kinds are in a molecule but also how they are joined together. Structural formulas vary in the amount of information they contain. A condensed structural formula shows only which atoms are connected to what other atoms, whereas a stereometric formula describes the spatial configuration of a molecule in three dimensions. The formulas shown for glucose (Fig. 2) and serine (Fig. 3) illustrate the variety that structural formulas for the same substance can exhibit.

Figure 2. Some formulas for α-D-glucopyranose. Empirical formula: CH2O. Molecular formula: C6H12O6. [Structural formulas not reproduced]

Volume 55, Number 2, February 1978 / 109

There are also electronic formulas, which contain information about the valence electrons in a molecule. In 1916 G. N. Lewis conceived the idea of a covalent bond as a pair of shared valence electrons, and his electron dot formulas are structural formulas showing valence electrons as dots and covalent bonds as shared pairs of dots. For many atoms a state of optimum stability exists when the outermost shell contains eight electrons, or four electron pairs. (This is the basis for the "octet rule" or the "rule of eight".) Hydrogen atoms, of course, can accommodate only two electrons. In Lewis electron dot formulas valence electrons are usually arranged so as to satisfy the rule of eight for each atom (or the rule of two in the case of hydrogen). The Lewis formulas given below are for ammonia, formyl chloride, and hydrocyanic acid.

[Electron dot formulas not reproduced]

Lewis formulas are sometimes modified by having lines used in place of the pairs of dots, so that the formulas above might also be written:

[Line formulas for ammonia, formyl chloride, and hydrocyanic acid; not reproduced]

Many chemists prefer to combine the two notations, using lines for shared electron pairs and dots for unshared electrons.

[Combined line-and-dot formulas for the same three compounds, e.g. H-C≡N: for hydrocyanic acid]

Abbreviated formulas are convenient, especially for representing ring compounds. Simple lines are used to depict carbon-carbon bonds, as illustrated here for limonene.

[Abbreviated formula for limonene; not reproduced]

Aromatic formulas for compounds containing benzenoid rings can take various forms, as shown below for the compound benzoic acid.

[Aromatic formulas for benzoic acid; not reproduced]

Complex formulas for coordination compounds are usually written in the style recommended by Alfred Werner, using brackets to identify the metal ion and ligands making up the complex.

tetraamminecopper(II) sulfate        [Cu(NH3)4]SO4
sodium hexanitrocobaltate(III)       Na3[Co(NO2)6]

For inorganic compounds of complicated composition, such as many of the silicates, resolved formulas are useful. The empirical formula for the mineral tremolite, for example, is Ca2Mg5Si8O22(OH)2. When the formula is resolved into simpler components, 2CaO·5MgO·8SiO2·H2O, it looks much less formidable.

Ionic formulas are used to identify electrically charged atoms or groups, with superscripts to indicate the quantity and sign of the charge. Some typical ionic formulas are:

hydrobromic acid                     H3O+, Br- (or H+, Br-)
lithium phosphate                    3 Li+, PO4^3-
sodium acetate                       CH3COO-, Na+
tetraethylammonium chloride          (CH3CH2)4N+, Cl-
potassium hexacyanoferrate(II)       4 K+, [Fe(CN)6]^4-

Figure 3. Some structural formulas for L-serine.

110 / Journal of Chemical Education

Determination of a Chemical Formula

A chemical formula must always be consistent with experimental data. Let us consider the formula determination for a very simple substance, acetic acid. Chemical analysis of the pure compound shows that it is 40.0% carbon, 6.73% hydrogen, and 53.3% oxygen. (Such analysis usually involves combustion of a carefully weighed sample, followed by measurement of the CO2 and H2O produced. This is the Lavoisier-Liebig method, which was modified for micro-analysis in 1911 by Pregl.) The analysis percentages (40.0:6.73:53.3) represent a weight ratio of carbon:hydrogen:oxygen in the compound. To convert this to a ratio by number of atoms, which is what we need in a formula, we must know the relative weights of carbon, hydrogen, and oxygen atoms. A glance at the periodic table tells us that their atomic weights are 12.0, 1.01, and 16.0, respectively (rounded off to three figures). For a 100-g sample we can calculate as follows:

40.0 g carbon / (12.0 g/mole of carbon) = 3.33 moles carbon
6.73 g hydrogen / (1.01 g/mole of hydrogen) = 6.66 moles hydrogen
53.3 g oxygen / (16.0 g/mole of oxygen) = 3.33 moles oxygen

The ratio of moles (or gram-atoms) of these elements is the same as their ratio of numbers of atoms, so we might write the formula for this compound as C3.33H6.66O3.33. However, the formula should preferably have smallest whole number subscripts, obtained by dividing the numbers calculated above by their greatest common divisor (3.33 in this case). Dividing through by 3.33, we obtain the integers 1, 2, and 1, so the correct empirical formula for acetic acid is C1H2O1, or CH2O.

However, CH2O is also the empirical formula for glucose, ribose, lactic acid, formaldehyde, and a very long list of other substances, all of which yield exactly the same elemental analysis as does acetic acid. For acetic acid we need more than just an empirical formula. To determine the molecular formula we need molecular weight data. Molecular weights can be obtained from vapor density measurements (if the compounds are volatile) or from data on freezing point depression, boiling point elevation, or osmotic pressure. Since this compound is an acid, its molecular weight (or more precisely its equivalent weight) can be determined by titration of a weighed sample against a standard solution of base. But it is unlikely that any of the methods just mentioned would be used in a modern laboratory equipped with a mass spectrometer. Molecular weights obtained by mass spectrometry are faster to measure and more accurate than those obtained by any other method. We find that the molecular weight of acetic acid is 60 amu (atomic mass units). Since the weight of a simple CH2O molecule would be only 30 amu (12 units of carbon plus 2 of hydrogen and 16 of oxygen), the acetic acid molecule must be twice as large. The molecular formula of acetic acid must therefore be (CH2O)2 or C2H4O2.

But this formula is still not specific for acetic acid. It turns out that methyl formate and glycolaldehyde both have exactly the same molecular formula, although their properties are very different. They are isomers of acetic acid. We need a structural formula to represent acetic acid. In order to determine the structure of the acetic acid molecule we must know something about its chemistry. In the laboratory we can readily ascertain that the compound has acidic hydrogen (litmus test) and a methyl group attached to an oxidized carbon atom (iodoform test). This information suggests that the structural formula for the C2H4O2 acetic acid molecule must be

CH3-COOH (a methyl group attached to a carboxyl group)
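The arithmetic for acetic acid (percent composition to moles, moles to an empirical formula, and molecular weight to a molecular formula) can be sketched in Python. This is a minimal illustration rather than a general analytical tool: the function names, the 0.05 rounding tolerance, and the limit on the trial multiplier are assumptions made for the example, and the atomic weights are the three-figure values quoted in the text.

```python
# Three-figure atomic weights quoted in the text.
ATOMIC_WEIGHTS = {"C": 12.0, "H": 1.01, "O": 16.0}


def empirical_formula(percent_by_weight, tolerance=0.05, max_multiplier=6):
    """Convert weight percentages into smallest whole-number subscripts."""
    # Moles of each element in a 100-g sample.
    moles = {el: pct / ATOMIC_WEIGHTS[el] for el, pct in percent_by_weight.items()}
    smallest = min(moles.values())
    ratios = {el: n / smallest for el, n in moles.items()}
    # Try small multipliers until every ratio is close to an integer
    # (this handles ratios such as 1 : 1.5 as well as 1 : 2).
    for k in range(1, max_multiplier + 1):
        scaled = {el: r * k for el, r in ratios.items()}
        if all(abs(v - round(v)) <= tolerance for v in scaled.values()):
            return {el: round(v) for el, v in scaled.items()}
    raise ValueError("no small whole-number ratio found")


def formula_weight(counts):
    """Sum of atomic weights for the given subscripts."""
    return sum(ATOMIC_WEIGHTS[el] * n for el, n in counts.items())


def molecular_formula(empirical, molecular_weight):
    """Scale an empirical formula to match a measured molecular weight."""
    multiple = round(molecular_weight / formula_weight(empirical))
    return {el: n * multiple for el, n in empirical.items()}


def to_string(counts):
    """Write subscripts in the plain style used here, e.g. C2H4O2."""
    return "".join(el if n == 1 else f"{el}{n}" for el, n in counts.items())


acetic_acid_analysis = {"C": 40.0, "H": 6.73, "O": 53.3}
emp = empirical_formula(acetic_acid_analysis)
mol = molecular_formula(emp, 60)  # 60 amu from mass spectrometry
print(to_string(emp), to_string(mol))
```

Neither step distinguishes acetic acid from its isomers; as the text notes, that requires chemical or spectroscopic evidence about structure.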

Of course, acetic acid is a much simpler compound than the unknown substances a chemist is actually likely to encounter. A more complex compound might first be degraded to simpler, more recognizable products. Derivatives might be prepared for easier identification. There would be thorough examination for any possible functional groups. In the case of a previously unknown compound, the proposed structure might be confirmed by synthesis, using some unambiguous synthetic pathway. Certainly the chemist would take advantage of instrumental assistance available to him. Some of the modern analytical tools especially useful in structure determination are:

Infrared Spectroscopy. Infrared (IR) analysis provides much information about the bonding and functional groups in a molecule with little expenditure of time and effort. The frequency range of IR radiation is just below that of visible light and corresponds to the vibrational frequencies of covalent bonds. As IR rays of increasing wavelength are passed through a sample, the compound absorbs only those photons with frequencies matching the various stretching and bending frequencies of its own chemical bonds. The spectrum produced is a series of absorption peaks and valleys, the locations of which confirm the presence or absence of specific kinds of bonds (C=C, C=O, O-H, C-N, etc.). IR spectra are distinctly individual and can be used as "fingerprints" for compound identification.

Nuclear Magnetic Resonance (nmr). The nucleus of a hydrogen atom is a rapidly spinning proton, which acts as a tiny magnet. When placed in a magnetic field, the hydrogen nuclei in a substance align themselves with the field. An nmr spectrometer measures the energy required to flip them out of this alignment. (Radio-frequency photons have energies of the right magnitude to do this.) When a molecule is placed in a magnetic field, hydrogen atoms located in different chemical environments within the molecule actually lie in areas of different field strength because of the electronic shielding effect of neighboring atoms. This results in different absorption peaks in the nmr spectrum for different kinds of hydrogen atoms (methyl, methylene, phenyl, amino, hydroxyl, etc.). The distances of particular absorption signals from that of a reference standard (usually tetramethylsilane) are called their "chemical shifts" and can be used to identify what "kinds" of hydrogen are present. The areas under the absorption peaks measure how many atoms there are of each kind. Most nmr spectrometers are designed to excite protons only, but nmr analyses based on fluorine, phosphorus, and carbon-13 have also become routine in many laboratories.

Mass Spectrometry. Not only is mass spectrometry the best method for molecular weight determination, but it is also a valuable tool for structure analysis. In the mass spectrometer molecules are first converted to ions and charged fragments by electron bombardment, and then the charged particles are sorted according to mass by means of a powerful magnetic field. The "mass spectrum" of a compound is a series of ion collection peaks occurring at certain mass numbers. The parent peak (that of the intact molecular ion), normally the intense peak of highest mass number, provides an accurate measure of molecular weight, while the fragment peaks confirm the presence of various structural sub-units in the molecule. The "cracking pattern" of a compound is reproducible and useful for identification.

Microwave Spectroscopy. The energies of photons in the microwave region of the spectrum correspond to the rotational energies of molecules, and so they can be used to measure rotational moments of inertia. From these can be determined bond angles and bond distances. Microwave spectra are especially useful in conformational analysis. The method is limited to fairly simple molecules, however, since samples must be in the gaseous state.

X-ray Diffraction. When X-rays are passed through a crystal, they are scattered, or diffracted, by the various atoms or ions with which they collide. The many scattered beams go out in all directions, alternately reinforcing and cancelling each other, to produce a photographic pattern of light and dark areas. By using computers to analyze these X-ray diffraction patterns, it is possible to construct contour maps of electron density, from which it is sometimes possible to determine the arrangements of atoms within molecules, even quite complex molecules. Much of our knowledge about the detailed structure of chemical compounds has been obtained through X-ray studies.

During the past century there have been some outstanding feats of structural formula determination. These might be mentioned in particular: the carbohydrates with their formidable problems of stereoisomerism; the alkaloids (quinine, strychnine, morphine, etc.) with their complex polycyclic ring structures; the steroids (cholesterol, the bile acids, the sex hormones, cortisone, etc.) with their troublesome nucleus of four fused rings and their complicated stereochemistry; the penicillins with their rearrangement-prone system of fused heterocyclic rings; the metal-centered porphyrins (chlorophyll, heme, vitamin B12, etc.) with their chelating framework of interconnected pyrroles; the proteins (including various enzymes and hormones) with their specific sequences of amino acids and their intricate secondary and tertiary structural features; and the nucleic acids with their sugar-phosphate backbones and non-repeating patterns of purine and pyrimidine bases, especially DNA (deoxyribonucleic acid, the chemical of the genes) with its self-replicating double helix structure and fascinating genetic coding sequence. The magnitude of the research effort and creative energy that went into these projects of structure elucidation is attested to by the fact that most of the primary investigators in these studies were subsequently awarded the Nobel Prize.

Problem Formulas

Writing chemical formulas is easier to do for some substances than for others. Perhaps we should at least mention a few of the problem areas.

There are some compounds that defy the Law of Definite Composition and cannot be assigned simple empirical formulas. They are non-stoichiometric compounds, which can vary somewhat in their chemical composition. Usually they are solid state materials containing lattice defects, with a slight deficiency or surplus of metal atoms. Frequently they are metal sulfides or oxides of variable composition, yielding such empirical formulas as Fe0.95O, Zn1.~3O, or Cu1.92S.

Ordinary glass is another material of diverse composition, containing widely varying amounts of metal oxides, such as Na2O, CaO, Al2O3, etc., bonded into a giant network of silicon dioxide. There are thousands of different amorphous silicates known collectively as "glass." It is unlikely that you have ever seen a formula for glass. In fact, the idea that these amorphous silicates are truly compounds is probably debatable.

Polymers can cause difficulties in formula writing because they have such enormous molecules. Polymers with repeating monomer units are usually treated simply, the formula for polyvinyl chloride being written -(-CH2-CHCl-)-n, in which n is a very large number. This kind of formula is very practical, but it does neglect the composition of the end groups (the chain initiators and terminators) and usually ignores the size of n, which is normally a very wide range of large numbers. The empirical formula of such polymers actually varies from one molecule to another, although the variations are negligible if the molecules are quite large. Polymers with non-repeating monomers are more complicated and must be represented in some detail. The complete formula for a protein molecule, for example, should include its entire sequence of amino acids. In order to save time and space, it is common practice to use abbreviated names of the amino acids (e.g. gly for glycine) in place of their structures in writing formulas for proteins.

There are other compounds that exhibit tautomerism, existing as equilibrium mixtures of easily interconverted isomers. In which isomeric form should the formula be written? Sometimes it is best written both ways, as illustrated here for 2,4-pentanedione (which is about three-fourths in the enolic form):

CH3-CO-CH2-CO-CH3  ⇌  CH3-C(OH)=CH-CO-CH3
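The earlier remark about polyvinyl chloride, that end groups matter less and less as n grows, can be checked numerically. The sketch below is only an illustration: it assumes a hypothetical chain capped by one hydrogen atom at each end (real chains carry initiator and terminator fragments), uses rounded atomic weights, and its function names are made up for this example.

```python
# Rounded atomic weights (assumed values for illustration).
ATOMIC_WEIGHTS = {"C": 12.0, "H": 1.01, "Cl": 35.5}

# One repeating unit of polyvinyl chloride, -CH2-CHCl-.
REPEAT_UNIT = {"C": 2, "H": 3, "Cl": 1}

# Assumption: one hydrogen atom caps each end of the chain.
END_GROUPS = {"H": 2}


def chain_composition(n):
    """Atom counts for a chain of n repeating units plus its end groups."""
    counts = {el: k * n for el, k in REPEAT_UNIT.items()}
    for el, k in END_GROUPS.items():
        counts[el] = counts.get(el, 0) + k
    return counts


def molecular_weight(counts):
    """Sum of atomic weights for the given atom counts."""
    return sum(ATOMIC_WEIGHTS[el] * k for el, k in counts.items())


def h_to_c_ratio(n):
    """Hydrogen-to-carbon atomic ratio for a chain of n units."""
    counts = chain_composition(n)
    return counts["H"] / counts["C"]


# The ratio approaches 3/2, the value implied by the repeating unit alone.
for n in (10, 1000, 100000):
    print(n, round(molecular_weight(chain_composition(n))), round(h_to_c_ratio(n), 5))
```

For short chains the end groups visibly shift the atomic ratio; for the very large values of n typical of real polymers the shift is too small to detect by elemental analysis, which is why the repeating-unit formula is practical.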

Still another problem involves those important secondary bonding forces known as hydrogen bonds. Because alcohol molecules are associated by hydrogen bonding, alcohols have higher melting and boiling points than we would predict. Should their molecular formulas reflect this extended molecular size?

    ....O-H....O-H....O-H
        |      |      |
        CH3    CH3    CH3

Hydrogen fluoride molecules also form hydrogen-bonded chains (or rings), with polymers of six HF units predominating at room temperature. Should the formula for hydrogen fluoride be written H6F6?

Surely the most widely recognized chemical formula is that of water, H2O. But water molecules also exhibit hydrogen bonding, a typical cluster at room temperature consisting of about four H2O units. This makes the effective molecular weight of water much greater than its simple H2O formula would indicate. It also explains why water is a liquid and not a gas. Is it possible that we should consider changing that familiar old formula?

Since the hydrogen bond, some chemists are fond
Of stating in accents satirical:
"H2O is no more! Water's H8O4.
Its high boiling point is no miracle."
But do not be misled; H2O is not dead.
There's no need to wax panegyrical.
Polymeric or no, water's still H2O.
The formula's simply empirical.