the commercial samples are those of the initial average concentrations of the impurity added. For a given section, the actual concentration may vary from near zero to many times the average amount indicated by the vendor. Good correlation is not expected between the spark source and vendor's values. Lead is very difficult to detect even at long exposures (1000 nC). It is not known at this time why this problem occurs, but the same problem with lead analysis in silver chloride has been found by another SSMS research group (18).

The detection limits for this technique are the same as those reported for other SSMS analyses. However, these limits may not be determined by the instrument but rather by the impurity levels present in the reagents used.

(18) G. Haase, director, Institut für Wissenschaftliche Photographie, Technischen Universität München, private communication, 1972.

ACKNOWLEDGMENT The authors would like to thank Charles Childs, manager of the crystal growth facility, for supplying all of the silver chloride crystals and for explaining crystal doping techniques.

Received for review October 11, 1972. Accepted December 26, 1972. Paper presented at the 24th Southeastern Regional ACS meeting in Birmingham, Ala., November 1972. Work supported by the Materials Research Center of the University of North Carolina under contract DAHC-15-67-C0223 with the Advanced Research Projects Agency.

Instrumental and Numerical Considerations for On-Line Interpretation of High Resolution Mass Spectral Data

Richard M. Hilmer and James W. Taylor¹

Department of Chemistry, University of Wisconsin, Madison, Wis. 53706

Several important instrumental characteristics of an AEI MS-9 double focusing mass spectrometer in the Nier-Johnson configuration are discussed which influence the choice of techniques for on-line data acquisition and numerical methodology for high resolution mass spectral data processing. The approach described yields an average error of less than 10 ppm in mass measurement, and peaks as small as 0.05% of the tallest peak are observed. Instrument vibration raises the average mass error. Values below 3 ppm are obtained from data where vibration isolation has been attempted. The time of execution of the data reduction is about 3-10 min on the Raytheon 706 computer, and about 15-60 sec per spectrum on the Univac 1108. The computing times depend on the number of multiplet peaks to be deconvoluted, the number of peaks in the spectrum, and the number of atoms allowed in the elemental formulas.

The advantages of computerized data reduction of high resolution mass spectra have been widely discussed (1-9), and several systems for accomplishing this task have been described (3-16). Few of these systems (9-14), however, are directly coupled or on-line with the mass spectrometer, which means that the experimenter must obtain the data from the mass spectrometer in digital form and transport the data to a large computing facility which is not primarily dedicated to the processing of that mass spectral data. He may then have to wait as much as several hours before he can see the mass spectral results. With on-line capability, the data are processed immediately. The experimenter can run a high resolution mass spectrum, determine his results, and change instrumental parameters as necessary. The on-line capability requires a small to medium-sized computer, dedicated to accumulating the intensity data and performing the calculations which relate scan time to m/e. The data acquisition from a high resolution instrument is complicated by the volume and rate of the incoming data, and the data reduction is a difficult mathematical problem on a small computer because of the computational precision required.

The HIRES programs, written by Tunnicliff and Wadsworth (7) for a Mattauch-Herzog configuration instrument, were kindly supplied in early stages of this work. For these programs, the data were read from a photographic plate by an automatic digital recording optical densitometer, the digital data were stored on magnetic tape, and the magnetic tape was read by a large computer. It was hoped that these programs could be modified for use with the Nier-Johnson configuration, not involving photographic plate readout. The two spectrometer configurations, however, have sufficiently different mathematical relationships between time and m/e that only some sections of the HIRES programs could be modified for use. Other sections, chiefly the mass identification and calculation, required rewriting. In developing the new procedures, Fortran V was initially used for the Univac 1108. Afterward, a parallel on-line version was generated in SYM II, the Raytheon 706 assembler language.

Figure 1. Effect of scan rate. Magnet range = 6, set start = 1000 at (1) scan rate = 5, (2) scan rate = 6, and (3) scan rate = 7. (Plot not reproduced; the time axis is in seconds.)

Figure 2. Effect of magnet range. Scan rate = 5, set start = 1000 at (1) magnet range = 6, (2) magnet range = 5, and (3) magnet range = 4.

¹ Address correspondence to this author.

(1) K. Habfast, Advan. Mass Spectrom., 4, 3 (1968).
(2) S. P. Perone, Anal. Chem., 43, 1288 (1971).
(3) R. Venkataraghavan, F. W. McLafferty, and J. W. Amy, Anal. Chem., 39, 178 (1967).
(4) K. Biemann and P. V. Fennessey, Chimia, 21, 226 (1967).
(5) A. L. Burlingame, D. H. Smith, and R. W. Olsen, Anal. Chem., 40, 13 (1968).
(6) A. L. Burlingame, Advan. Mass Spectrom., 4, 15 (1968).
(7) D. D. Tunnicliff and P. A. Wadsworth, Anal. Chem., 40, 1826 (1968).
(8) T. Aczel, D. E. Allen, J. H. Harding, and E. A. Knipp, Anal. Chem., 42, 341 (1970).
(9) H. C. Bowen, T. Chenevix-Trench, S. D. Drackley, R. C. Faust, and R. A. Saunders, J. Sci. Instrum., 44, 343 (1967).
(10) H. C. Bowen, E. Clayton, D. J. Shields, and H. M. Stanier, Advan. Mass Spectrom., 4, 257 (1968).
(11) R. J. Klimowski, R. Venkataraghavan, F. W. McLafferty, and E. B. Delany, Org. Mass Spectrom., 4, 17 (1970).
(12) A. L. Burlingame, Proc. Int. Conf. Mass Spectrosc., 104 (1970).
(13) D. H. Smith, R. W. Olsen, F. C. Walls, and A. L. Burlingame, Anal. Chem., 43, 1796 (1971).
(14) R. S. Gohlke, G. P. Happ, D. P. Maier, and D. W. Stewart, Anal. Chem., 44, 1484 (1972).
(15) W. J. McMurray, S. R. Lipsky, and B. N. Green, Advan. Mass Spectrom., 4, 77 (1968).
(16) D. M. Desiderio, Jr., and T. E. Mead, Anal. Chem., 40, 2090 (1968).

ANALYTICAL CHEMISTRY, VOL. 45, NO. 7, JUNE 1973

THE ANALYTICAL PROBLEM The analyst dealing with high resolution mass spectral data requires two functions of an on-line system. The first is data acquisition, the conversion of the data from an electrical signal in analog form into a series of digital data points representative of that same signal, followed by storage of those digital data points on some mass storage device. The data acquisition may also involve a rapid assessment of the quality of the experimental data. The second function is to process the data and present it to the analyst in a format which is descriptive of the chemistry of the sample.

The first function, the data acquisition, can occur in three steps. The first step is to obtain various operational parameters relating to the mass spectrometer, such as scan rate, magnet range, band width, electron energy, resolution, etc. The second step is the actual A-D conversion and data storage, and the third step is a very rapid preliminary processing of the data for information regarding the intensity of the spectrum and the resolution actually obtained under scanning conditions. This latter information provides the diagnostic information for the operator, who can decide either to accept and process the data or run another scan.

The data reduction can occur in four basic stages. The first is to process the digital data points to locate the time of occurrence of the centers of the peaks in the spectrum and to calculate the intensity of the peaks, separating overlapped peaks if necessary. The second stage is to calculate the masses corresponding to each of the peaks, as accurately as possible. This is accomplished by establishing the relationship between mass and time for the particular spectrum being processed. The third stage involves the optional choices of editing, background subtraction, and averaging or compositing of several spectra to improve the mass and intensity measurements. The fourth stage is then to calculate all the elemental formulas which could possibly correspond to each of the masses, while eliminating those formulas which do not make chemical sense.

If a computing system is to perform these functions for the analyst on-line, it must be designed with knowledge of the characteristics of the mass spectrometer and must present the information to the analyst in such form that he can immediately determine whether the data are the type he wants and what to do to improve them if necessary. Finally, the computational approach must be considered which leads to the required information in the fastest time with the required accuracy.

Figure 3. Effect of set start. Scan rate = 5, magnet range = 6 at (1) set start = 1000, (2) set start = 800, (3) set start = 600, and (4) set start = 400.

CHARACTERISTICS OF THE MASS SPECTROMETER The instrument used for these studies was a commercial AEI-902-C double focusing mass spectrometer modified by the incorporation of a 20-stage electron multiplier and a third sample inlet system, micrometer needle valve controlled, for introduction of a standard compound, usually perfluorokerosene (PFK), directly into the ion source. Because the described system and numerical methodology were tailored to this particular instrument, the nature of the decisions is outlined in some detail to permit adaptation to other instruments of differing geometries and characteristics. The spectrum is obtained by scanning down from high mass to minimize magnet hysteresis effects. The relationship between mass and time, as shown by the dashed lines in Figure 1, is roughly

$$\ln M = a + bt \quad \text{or} \quad M = M_0 \exp(bt) \tag{1}$$

where $a = \ln M_0$ and b is a negative slope whose magnitude depends only on scan rate, the faster scan rates giving the most negative slope. The constant a, equal to the natural logarithm of the starting mass, is seen to be independent of scan rate and is a function only of the magnet range control and the set start control. The magnet range control does not affect the scan rate (cf. Figure 2) but determines roughly the mass range to be scanned, while the set start control acts as a fine control for the magnet range (cf. Figure 3) and allows the starting of the scan at intermediate points in that mass range. (In these figures, no masses higher than approximately 650 have been plotted, since none are observed in the PFK spectrum.) The scan is stopped either manually or by a meter relay which monitors the magnet current. Setting the shut-off point is essentially the same as setting a stop mass on the low mass end of the spectrum. The time of the scan is then determined by the scan rate and the distance between the starting and stopping masses.

Describing the scan in this manner frees the operator of two responsibilities. He does not need to specify the location in time of any peaks in the spectrum, since this can be calculated from the parameters of the scan. Nor does he need to specify a digitization rate, since this can also be calculated from the average half-width of the peaks in the spectrum.

Figure 4. Approximation of half-width

An important characteristic of the mass spectrum is that all the peaks have nearly the same half-width in terms of time, regardless of their position in the spectrum. An approximate value for this half-width can be calculated from the scan rate and resolution of the mass spectrometer. If, as in Figure 4, one defines two peaks at masses $M_1$ and $M_2$ as being separated by a valley equivalent to 10% of the height of the peaks, one may define the resolution as being

$$R = \frac{M_1}{M_2 - M_1} \tag{2}$$

Then

$$\frac{M_2}{M_1} = \frac{1}{R} + 1 \tag{3}$$

But $M_1 = M_0 \exp(bt_1)$ and $M_2 = M_0 \exp(bt_2)$, hence

$$\frac{M_0 \exp(bt_2)}{M_0 \exp(bt_1)} = \frac{1}{R} + 1 \tag{4}$$

The starting mass, $M_0$, cancels out; hence the expression is independent of position in the spectrum. Further

$$\exp[b(t_2 - t_1)] = \frac{1}{R} + 1 \tag{5}$$

and

$$t_2 - t_1 = \frac{1}{b} \ln\left(\frac{1}{R} + 1\right) \tag{6}$$

If R is greater than 1000, one may approximate

$$t_2 - t_1 = \frac{1}{bR} \tag{7}$$

The distance from the center of one of the peaks to the center of the valley between them, assuming the peaks are of the same height, is

$$d = \frac{t_2 - t_1}{2} \tag{8}$$

Assuming the peak is symmetrical, the width of the peak at 5% of its height is then 2d. As a rough approximation, we may say that the half-width of the peak, expressed in units of time, is half the width at the base, that is

$$W_{1/2} = \frac{1}{2bR} \tag{9}$$

Thus, the average half-width (expressed in units of time) is seen to depend primarily on the scan rate and the resolution of the instrument, and is defined as in Equation 9.

In Equation 9, the resolution is required for calculating the half-width, and there are two different types of resolution which one may use. The first, static resolution, is the resolution under nonscanning conditions, and the second, dynamic resolution, is that obtained under actual scanning conditions. Static resolution on a double focusing instrument can be estimated using a peak matching accessory whereby one matches a peak against itself on the oscilloscope and adjusts the decade resistors until the two images of the peak overlap at 5% of the height of the peak. The static resolution is then calculated from the readings on the decade resistors. There is, however, no simple way to measure the dynamic resolution of the mass spectrometer, and it is the dynamic resolution under scan conditions which must be considered by the analyst. Some factors which can cause the two resolutions to differ are: vibration, magnet instability, scan rate nonlinearity, beam instability, and response time in A/D and detector electronics.

The effects of vibration of the mass spectrometer can be very serious. If the collector slit is vibrating as the beam is passing through it, the effect is the same as having a beam instability. A peak under these conditions may be expected to acquire a distorted shape of lower apparent resolution. Its position in time also becomes more uncertain.
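As an illustration of Equations 1 and 6 to 9, the following Python sketch evaluates the exponential scan law and the expected peak half-width. It is a minimal sketch, not the authors' Fortran V or SYM code: the function names are hypothetical, the slope b is derived from a seconds-per-decade figure (the paper quotes 16.8 sec/decade for scan rate 8), and the magnitude of b is used in the width formulas so that the returned times are positive.

```python
import math


def slope_from_decade_time(seconds_per_decade):
    """Slope b (1/sec) of ln M versus t for a downward scan covering
    one decade of mass in the given number of seconds."""
    return -math.log(10.0) / seconds_per_decade


def mass_at_time(start_mass, b, t):
    """Equation 1: M = M0 * exp(b * t)."""
    return start_mass * math.exp(b * t)


def scan_duration(start_mass, stop_mass, b):
    """Time needed to scan from start_mass down to stop_mass (Equation 1 solved for t)."""
    return math.log(stop_mass / start_mass) / b


def peak_separation_exact(b, resolution):
    """Magnitude of t2 - t1 from Equation 6."""
    return math.log(1.0 / resolution + 1.0) / abs(b)


def peak_separation_approx(b, resolution):
    """Magnitude of t2 - t1 from Equation 7 (reasonable for R > 1000)."""
    return 1.0 / (abs(b) * resolution)


def half_width(b, resolution):
    """Equation 9: approximate peak half-width in seconds (magnitude)."""
    return 1.0 / (2.0 * abs(b) * resolution)


if __name__ == "__main__":
    b = slope_from_decade_time(16.8)
    print("b =", b, "1/sec")
    print("half-width at R = 10,000:", half_width(b, 10000), "sec")
    print("scan time, m/e 500 to 50:", scan_duration(500.0, 50.0, b), "sec")
```

With 16.8 sec/decade the slope comes out near -0.137 sec⁻¹ and the half-width at a resolution of 10,000 near 3.65 × 10⁻⁴ sec. Because the starting mass cancels in Equation 4, none of the width functions needs it as an argument.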

DATA REQUIREMENTS The peaks in a mass spectrum provide analytically useful information in terms of intensities and position along an m/e scale. Because each peak is to be described in terms of a series of data points, questions arise as to how many are necessary, what accuracy is necessary in the analog-to-digital conversion, and what data rates can be conveniently handled by on-line computing equipment. Halliday (17) has said that a useful number of digital data points per peak should be at least 20, and that analysis of peak profiles for deconvolution of moderately overlapping peaks would conservatively require something closer to 40 data points per peak. Using this basis as the minimum, we have decided to use a factor of three more data points per peak to improve the accuracy of intensity and mass measurements.

If the assumption is made that a dynamic resolution of 10,000 is required and that scan rate 8 (16.8 sec/decade) is the fastest scan rate which will be used, then b is -0.1370 sec$^{-1}$ and, according to Equation 9, the average half-width of the peaks of the spectrum is $3.65 \times 10^{-4}$ sec. If one then assumes that a half-width of $3.65 \times 10^{-4}$ sec requires a minimum of 30 data points (about 60 data points per peak), one may calculate the digitization rate required: 30 data points/$3.65 \times 10^{-4}$ sec = 82,200 data points/sec. This rate, while not extreme, is quite high, and the interface between the A-D converter and the computer must be designed to handle data at this high rate; alternatively, the scan rate or number of data points per peak must be reduced to accommodate a slower digitization rate. At scan rate 8 with 82,000 data points/sec, assuming one decade of scanning (i.e., m/e 500 to m/e 50), the mass spectrum would generate $1.3 \times 10^{6}$ data points for storage (16.8 sec × 82,000 data points/sec). This mass of information is far more than most disk units can handle and this rate is faster than tape units can accumulate.

By using a thresholding technique to eliminate the data describing the 95-97% of the spectrum which is base line, it is possible to store the remaining data on a disk storage unit. If the base line is eliminated, however, the position of the peak in the spectrum can no longer be calculated by counting the number of data points stored from the start of the scan, and some sort of position marker must be attached to the data points describing the location of some point in the peak. Because the masses of the peaks must be measured to approximately 1 ppm, the positions of the peaks in the spectrum must also be obtained to within 1 ppm. One way to achieve the required accuracy is to mark the position of the first point to exceed threshold in terms of the number of A-D converter clock pulses which have occurred between the start of the scan and the first data point. The clock pulses may be counted with no loss of accuracy, but one must utilize a crystal clock of stability on the order of at least 0.1 ppm over long periods of time to generate the pulses. The stability of this clock would then provide the ultimate limitation in accuracy of peak position.

For measuring the peak intensity, Halliday (17) has stated that if 11 bits of precision are available for each word output by the A-D converter, no noticeable degradation of the data will occur. This furnishes a theoretical dynamic range of about 1 part in 2000, but if thresholding is employed, this may be lowered to about 1 part in 200 or 0.5%. If 14 bits are used, the theoretical dynamic range is 1 part in 16,000, and thresholding could reduce this to about 1 part per 1000 or 0.1%. This greater range is preferable in terms of recording very small peaks while maintaining very large peaks on scale. The intensity data from the mass spectrum would not be expected to be more accurate than 1%, but the large number of bits for a single converter is required to detect the smaller peaks. Other configurations involving dual channel conversion might be employed to relax this converter requirement.

(17) J. S. Halliday, Advan. Mass Spectrom., 4, 239 (1968).
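The data budget worked out above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the comparison against disk capacity assumes one 16-bit word per stored data point and ignores the position markers, neither of which is stated in the text.

```python
def digitization_rate(half_width_s, points_per_half_width):
    """Data points per second needed to place the requested number
    of points across one peak half-width."""
    return points_per_half_width / half_width_s


def points_in_scan(rate_hz, scan_seconds):
    """Total data points produced if every sample of the scan is kept."""
    return rate_hz * scan_seconds


def points_after_thresholding(total_points, baseline_fraction):
    """Points left once the base-line fraction of the spectrum is discarded."""
    return total_points * (1.0 - baseline_fraction)


def adc_levels(bits):
    """Number of distinct output codes of a converter with the given word length."""
    return 2 ** bits


if __name__ == "__main__":
    rate = digitization_rate(3.65e-4, 30)      # about 82,200 points/sec
    total = points_in_scan(82000, 16.8)        # one decade at scan rate 8
    kept_max = points_after_thresholding(total, 0.95)
    kept_min = points_after_thresholding(total, 0.97)
    print("digitization rate:", round(rate, -2), "points/sec")
    print("points per scan without thresholding:", total)
    print("points kept with 95-97% base line removed:", kept_min, "to", kept_max)
    print("11-bit codes:", adc_levels(11), " 14-bit codes:", adc_levels(14))
```

The product 16.8 sec × 82,000 points/sec is about 1.38 × 10⁶, the same order as the 1.3 × 10⁶ quoted above, and removing 95-97% of it leaves a few tens of thousands of points. The 2048 and 16,384 codes correspond to the "1 part in 2000" and "1 part in 16,000" dynamic ranges cited for 11 and 14 bits.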

DESCRIPTION OF HARDWARE The computer used to obtain and process the mass spectral data from this study was a Raytheon 706 computer. It uses a 16-bit word, has a 900-nsec cycle time, has 12 levels of interrupt, and hardware multiply/divide (6.3-9.0 μsec multiply, 9.0 μsec divide). The memory is 16,384 sixteen-bit words. For obtaining the data, 8192 words were adequate, but this had to be doubled in order to handle the described processing programs. Two types of input or output from the memory are available: direct memory access (DMA), which requires no action from the central processing unit (CPU) for the actual transfer of data; and direct input/output (DIO), which requires some action on the part of the CPU for each word transferred to or from memory.

The peripheral devices and A-D converter used were as follows.

(1) ASR-35 Teletype unit with paper tape reader/punch; 10 characters/sec; data transfer via DIO lines.
(2) Disk unit with 64 tracks; 128 sectors/track; 47 words/sector; total of 385,024 words. Maximum access time is 33.3 msec; average is 16.7 msec; data transfer is via DMA lines at a rate of 187,000 words/sec.
(3) Magnetic tape unit; 9-track; 800 bpi; odd parity; 25 ips forward speed; data transfer via DIO lines.
(4) Card reader; Mohawk Data Sciences No. 6002; 225 cards/min; data transfer via DIO lines. (This is not necessary for on-line application but was used for program development.)
(5) Line printer; A. B. Dick Videojet 960; 250 characters per sec; 130 characters/line; data transfer via DIO lines.
(6) A-D converter, high speed (multiverter), ±10 V input, sample and hold amplifier with 50-nsec aperture, 14 bits plus double sign bit output, 100 kHz max conversion rate, data transfer via DMA lines, programmable sampling rate, crystal clock stability better than 1 part in $10^{8}$ over a 5-min period, 28-bit elapsed time counter, programmable thresholding and data chaining (double buffering) available.

A number of features were specifically built into the A-D converter, such as the clock with high stability and the 28-bit counting register for accuracy in measurement of peak times and the high digitization speed for fast scans, thresholding, and double buffering.

Figure 5. Description of thresholding
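The timing margins implied by these specifications can be checked with simple arithmetic. The sketch below is illustrative only; the function names are hypothetical, and it assumes that the 28-bit elapsed time counter advances once per conversion clock pulse, which the hardware list does not state explicitly.

```python
def counter_overflow_seconds(counter_bits, clock_hz):
    """Longest interval an elapsed-time counter can span before it wraps,
    assuming one count per clock pulse (an assumption, see lead-in)."""
    return 2 ** counter_bits / clock_hz


def stability_in_ppm(parts):
    """Convert a '1 part in N' stability figure to parts per million."""
    return 1.0e6 / parts


def disk_keeps_up(disk_words_per_sec, adc_points_per_sec):
    """True if the disk transfer rate exceeds the converter output rate,
    taking one word per data point (also an assumption)."""
    return disk_words_per_sec > adc_points_per_sec


if __name__ == "__main__":
    span = counter_overflow_seconds(28, 100_000)
    print("28-bit counter at 100 kHz spans", span, "sec")
    print("clock stability:", stability_in_ppm(1.0e8), "ppm")
    print("disk faster than A-D converter:", disk_keeps_up(187_000, 100_000))
```

At the maximum 100 kHz conversion rate the counter would span roughly 45 minutes, far longer than the scans shown in Figures 1 to 3, and a stability of 1 part in 10⁸ is 0.01 ppm, inside the 0.1 ppm figure given under DATA REQUIREMENTS.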

DATA MANIPULATION Two important pieces of information must be recorded for each segment of the mass spectral peak selected for storage: the intensity and position in time. Because of the noise superimposed on a peak, it is probable that the threshold will be crossed several times at the beginning of the peak. If a single thresholding level is used, these noise spikes will appear as separate peaks. In this study, this effect was avoided through the use of two comparison registers which examine digital data relative to two preset levels. Below both levels, no data are transferred to memory, but the clock pulses which establish a peak position in time are still being counted. When some data point becomes larger than threshold 1 (labeled "start" in Figure 5), the pulse count and the first data point are then placed in memory. Each successive data point is then placed in memory at constant time intervals until a data point occurs which is less than threshold 2. At that point (labeled "stop" in Figure 5), no further data are stored in memory until threshold 1 is again exceeded. Threshold 2 is commonly set just above the noise of the base line, so that the base line will not be recorded. Threshold 1 is set above threshold 2 by the width of the noise band, so that if, at the beginning of a peak, a negative going noise spike occurs, the data will still be recorded until the end of the peak is observed. Using this technique, the base line of the spectrum is eliminated.

In order to accept the high data rates during a scan, double buffering is employed whereby data are transferred from one memory buffer to disk while a second buffer is being filled by the A-D converter. All data are transferred on DMA lines, which permits the CPU to perform some preliminary data processing while the data are being transferred.
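The two-level thresholding described above was done in the A-D converter hardware; the Python sketch below only mimics its logic in software on a list of samples. The function name is hypothetical, the sample index stands in for the clock pulse count, and the choice not to store the first point that falls below threshold 2 is an assumption, since the text does not say whether that point is kept.

```python
def capture_peaks(samples, threshold_1, threshold_2):
    """Two-level thresholding: start storing when a sample exceeds
    threshold_1, stop when a sample falls below threshold_2.

    Returns a list of (start_index, values) pairs, where start_index
    plays the role of the clock pulse count marking the first stored point.
    """
    segments = []
    current = None  # segment being recorded, or None while on the base line
    for index, value in enumerate(samples):
        if current is None:
            if value > threshold_1:
                current = (index, [value])   # "start": store count and first point
        elif value < threshold_2:
            segments.append(current)         # "stop": close the segment
            current = None
        else:
            current[1].append(value)         # still inside the peak
    if current is not None:
        segments.append(current)             # scan ended while inside a peak
    return segments


if __name__ == "__main__":
    trace = [0, 1, 0, 5, 3, 6, 4, 1, 0, 0, 7, 0]
    print(capture_peaks(trace, threshold_1=4, threshold_2=2))
```

In this trace the dip to 3 just after the first crossing lies below threshold 1 but above threshold 2, so the first peak is stored as a single segment; a single threshold at 4 would have split it in two.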
Finally, before the spectrum is processed, an 18-character label is assigned to the spectrum which serves as the spectrum identification, and three lines of comments describing the spectrum are also input via the teletype. The label and comments appear at the head of any output data resulting from processing of the digital data.

Figure 6. Data acquisition flow chart

Figure 7. Data acquisition flow chart

(Flow charts not reproduced. Legible box labels include: input of magnet range, set start, static resolution, and electron volts; INITIALIZE ADC; START ADC; WRITE PARAMETER DATA ON DISK; LARGEST DATA PT.; CALC W FROM DATA ON DISK (PEAKS WITH FEWER POINTS THAN MINNP NOT COUNTED); PRINT OUT DYNAMIC RESOLUTION; PRINT NO. OF PEAKS USED TO CALC W; LIST RAW DATA.)

COMPUTATIONS USED IN DATA ACQUISITION The data acquisition program, DIGMS9, requires a number of scan parameters from the operator. The flow diagram for DIGMS9 is shown in Figures 6 and 7. The parameters required are scan rate, magnet range, set start, static resolution, electron volts, and band width. All of these, except static resolution, are read directly from the knobs or meters of the MS-9. The static resolution obtained from the peak matching accessory is used initially to approximate the dynamic resolution and to calculate a reasonable digitization rate. After the dynamic data have been obtained, the average half-width of the peaks in the spectrum gives the true dynamic resolution and the true data rate.

The approximate slope of the ln M vs. time curve, b, is obtained from a table relating b to the scan rate. Then, using this value and that of the static resolution, the expected average half-width of the peaks in the spectrum is calculated using Equation 9. The digitization rate (the number of microseconds per time slice or data point) is then calculated from this average half-width assuming that 50 data points are desired over that half-width (about 100 data points per peak).

With the mass spectrometer not scanning, one buffer (4700 data points) of base-line noise data is collected and the data are analyzed for the noise amplitude of the base line. First, an average of the data points is taken to determine the voltage offset. The maximum and minimum data points are determined, and the minimum is subtracted from the maximum to obtain the noise amplitude. Threshold 2 is then set to the value of the maximum data point minus the voltage offset and threshold 1 is set to the value of threshold 2 plus the noise amplitude. The A-D converter is then reinitialized, the thresholds, digitization rate, buffer addresses, and word counts are passed to the interface, and the buffers are blanked. The system is at this time ready to obtain digital data from the MS-9; hence the program loops until an external sense switch is activated. This same sense switch is activated by a push button which also starts the scan of the mass spectrometer. When the scan has started and the external sense switch is activated, the program starts the A-D converter and drops into a second loop which tests the sense switch until it is deactivated at the end of the scan.

The third section of DIGMS9 provides the necessary information to the operator via the teletype. It prints out the number of peaks found during the run, and the largest intensity data point is multiplied by 100 and divided by the largest number possible from the A-D converter (full scale output) to yield the message "LARGEST PEAK = XX% OF FULL SCALE." The data on the disk are then read back, and each peak is checked for the number of data points occurring from half-height to half-height over the maximum. If this number of data points is greater than 40% times the number of data points desired for the expected average half-width, it is used in computing the average number of data points per half-width for the data recorded. The number 40% was empirically determined to be reasonable for deleting peaks due to noise and very small peaks near the threshold. This average number of data points per half-width is multiplied by the digitization rate to give the average half-width of the spectrum in microseconds.
Then, using Equation 9, the dynamic resolution of the run is calculated from the average half-width and printed along with the number of peaks used in calculating the dynamic resolution. This preliminary analysis of the data for sensitivity and dynamic resolution takes less than 10 sec. The spectrum can be rerun using the value of the dynamic resolution obtained in the previous run to set the digitization rate, rather than the static resolution.

A number of analyst options are available in DIGMS9. The ln M vs. time slope table is resident in DIGMS9, so an option has been provided to update it if desired, but this has seldom been found necessary. The spectral data can be listed on the line printer; it can be written onto magnetic tape for remote processing or later processing; or it can be processed immediately using the on-line versions of the data processing programs, CIFAS1, CIFAS2, and CIFAS3.

Table I. Results of Gaussian Fit for a Spectrum of 2,4,5-Trichloroaniline
(Table body not recoverable from the source text. Legible column headings: record no., no. of iterations, standard error, width (msec), peak area, position of peak, total points, points used, unusual peak. Some records carry the messages "MEANINGLESS RESULTS" or "TOO FEW DATA POINTS" in place of fitted values.)
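The threshold setting and the dynamic resolution estimate described in this section can be sketched in Python as follows. This is not the DIGMS9 code: the function names and argument layout are hypothetical, peak half-widths are assumed to be already measured as counts of data points, and the magnitude of b is used when Equation 9 is inverted.

```python
def thresholds_from_baseline(baseline):
    """Set the two thresholds from one buffer of base-line noise.

    Follows the text: offset = mean of the buffer, noise amplitude =
    maximum - minimum, threshold 2 = maximum - offset,
    threshold 1 = threshold 2 + noise amplitude.
    Returns (threshold_1, threshold_2).
    """
    offset = sum(baseline) / len(baseline)
    noise_amplitude = max(baseline) - min(baseline)
    threshold_2 = max(baseline) - offset
    threshold_1 = threshold_2 + noise_amplitude
    return threshold_1, threshold_2


def dynamic_resolution(half_width_points, expected_points, sample_interval_s,
                       b, min_fraction=0.4):
    """Estimate dynamic resolution from measured peak half-widths.

    half_width_points: data points from half-height to half-height, one per peak.
    Peaks with no more than min_fraction * expected_points points are skipped
    (the 40% rule). Returns (resolution, number_of_peaks_used).
    """
    used = [n for n in half_width_points if n > min_fraction * expected_points]
    if not used:
        raise ValueError("no peak wide enough to estimate the half-width")
    mean_half_width_s = (sum(used) / len(used)) * sample_interval_s
    # Equation 9 solved for R, with |b| so that R is positive.
    resolution = 1.0 / (2.0 * abs(b) * mean_half_width_s)
    return resolution, len(used)


if __name__ == "__main__":
    print(thresholds_from_baseline([2, 3, 1, 2, 2]))
    print(dynamic_resolution([50, 48, 52, 5], 50, 7.3e-6, -0.1370))
```

In the second call the 5-point peak is rejected by the 40% rule, the remaining three peaks average 50 points (3.65 × 10⁻⁴ sec at the assumed 7.3 μsec sampling interval), and the estimate comes out close to a resolution of 10,000.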

DESCRIPTION OF THE COMPUTATION PROBLEMS, DIAGNOSTICS, AND PROGRAMS

The first data reduction program, CIFAS1, has as its function the calculation of the position in time of the centroid of each peak relative to the start of the scan, and of the intensity of each peak in the spectrum. Since each peak has a position marker which establishes the position of the first data point relative to the start of the scan, the problem becomes one of establishing the position of the centroid of the peak relative to the position marker.

One technique which might be used is to assume some simple geometrical shape for the peaks, such as a triangle or rectangle, and establish the position of the geometrical center of the peak by some curve-fitting technique. The shape which corresponds most closely to the observed peak shapes is the normal Gaussian distribution function (18). At high resolution with fast scans, however, the small peaks (0.1 to 0.2%) may be statistically too poorly defined (10 ions or less) for the peak shape to be determined (10). The advantage of assuming a geometrical shape is that it then becomes possible to deconvolute overlapping peaks; that is, to separate the intensity contribution and the position of each peak from the total of all the peaks. The nonlinearity of the Gaussian function (7) requires the use of an iterative nonlinear curve-fitting algorithm and involves a considerable increase in computing time. A compromise is therefore faced: some accuracy in the intensity calculation can be traded for less computation time by using one of the simpler peak descriptions. The main disadvantage of choosing any particular shape is that ion statistics, noise, and vibration of the mass spectrometer can cause the peak shape to deviate significantly from the shape assumed, resulting in considerable error in intensity calculations and some error in centroid calculations. Small peaks could even appear to have more than one maximum, which would result in unnecessary deconvolution calculations.

Another technique which can be used is the “center of gravity” calculation, in which the peak is treated as a flat object, the center of gravity of which is calculated by the formula

(10)

This technique has been used successfully (9, 11-13) and has the advantages that it uses very little computing time and that the peak shape is not important. Its main disadvantage is that deconvolution is not possible.

The approach used in CIFAS1 allows the operator to choose the method he wishes to use. If he does not specify otherwise, the “center of gravity” method will be used. He may also elect to have each peak fitted to a Gaussian, using an iterative deconvolution technique (19) wherever necessary. Any peaks which cannot be adequately processed by the Gaussian fit result in an error message, and the centroid and intensity are calculated using the “center of gravity” technique. The operator also has an option to use a Gaussian fit without deconvoluting. Whenever deconvolution is turned off and the peak tracing routine encounters more than one maximum, a small “M” is printed out on that line to denote that the peak was detected to be a multiplet.

Table I shows the printed output from a spectrum which has been processed using a Gaussian fit where possible and deconvolution where necessary. In Table I, records 59 and 146 correspond to successfully deconvoluted doublets. The error message “TOO FEW DATA POINTS” (Table I, records 2, 4, 10, etc.) results from peaks which are too small or have too few data points to be processed by a Gaussian fit. The error message “MEANINGLESS RESULTS” (Table I, records 1, 5, 7, etc.) is printed when an attempt is made to fit to a Gaussian curve and meaningless results are obtained in solving for the coefficients. This is usually caused by peaks having a shape not described well by a Gaussian function. If either error message is printed, the calculations of the centroid and intensity of that peak are performed using the “center of gravity” technique.

Table II shows the printed output for several peaks in the same spectrum using both the Gaussian and “center of gravity” techniques. The positions of peaks calculated by either method are very nearly identical.

According to Habfast (1), the intensity of a peak can be taken as the peak height or as the peak area. If the collector slit is open enough, the peaks appear flat-topped, because the intensity is limited by the source slit. In this case, peak height is most representative of intensity.

(18) R. Venkataraghavan, R. Board, R. Klimowski, J. Amy, and F. W. McLafferty, “Fifteenth Annual Conference on Mass Spectrometry” (ASTM E-14), Denver, Colo., 1967.
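The choice between the two methods, with the fall-back to the “center of gravity” calculation, can be sketched as follows. This is a minimal illustration and not the CIFAS1 code: the centroid is taken as the intensity-weighted mean time (the usual center-of-gravity expression), the area uses a trapezoid rule, and `gaussian_fit` and `min_points` are hypothetical names introduced only for the example. The two messages are those quoted in the text.

```python
def center_of_gravity(times, heights):
    """Return (centroid, area) for one digitized peak.

    The centroid is the intensity-weighted mean time; the area is a
    trapezoid-rule integral over the sampled points (an assumption).
    """
    total = sum(heights)
    if total <= 0:
        raise ValueError("peak has no positive intensity")
    centroid = sum(t * h for t, h in zip(times, heights)) / total
    area = sum(
        0.5 * (heights[i] + heights[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )
    return centroid, area


def process_peak(times, heights, gaussian_fit=None, min_points=5):
    """Return (centroid, area, message), falling back to center of gravity.

    gaussian_fit is a hypothetical caller-supplied function returning
    (centroid, area) or raising ValueError; min_points is an assumed cutoff.
    """
    message = ""
    if gaussian_fit is not None:
        if len(times) < min_points:
            message = "TOO FEW DATA POINTS"
        else:
            try:
                centroid, area = gaussian_fit(times, heights)
                return centroid, area, message
            except ValueError:
                message = "MEANINGLESS RESULTS"
    # Default method, and fall-back whenever a message was set.
    centroid, area = center_of_gravity(times, heights)
    return centroid, area, message
```

For a symmetric peak sampled at times 0 to 4 with heights 1, 4, 6, 4, 1, `center_of_gravity` returns a centroid of 2.0 and an area of 15.0.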
As the collector slit is closed down, the peak assumes a more Gaussian shape, and the collector slit limits the intensity. In this case, peak area is more representative of the intensity.

In this study, the intensity of a peak was taken as the area of the peak for several reasons. First, the peak shape is assumed to be Gaussian for deconvolution. Also, the collector slit must be nearly closed to obtain high resolution. The ion beam is widened slightly by space charge effects with more intense ions, however, and this results in shorter, wider peaks for the more intense ions. Area calculations in these cases would be expected to be better than a height calculation. Noise has little effect on peak area but introduces considerable error into the peak height. Finally, if the “center of gravity” method is used, the height of the peak at the centroid is not necessarily the height of the largest point in the peak, and some additional criterion for the height would be required.

The area of the peak, calculated by the “center of gravity” method, must be corrected for the small areas on either side of the peak which lie below the cut-off level. These are the shaded areas in Figure 8. This may be only a small fraction of the total area for a large peak, but it can be a considerable fraction for a smaller peak. The correction term chosen corresponds to the area of a triangle of height equal to the cut-off level of the peak and half-width equal to the average half-width of the peaks in the spectrum. A comparison of the calculated intensities between the Gaussian fit and the “center of gravity” method, shown in Table II, indicates that the intensities are nearly the same.

An option which has been extremely useful in CIFAS1 is to process any single record out of a spectrum, printing

(19) D. W. Marquardt, J. Soc. Ind. Appl. Math., 11, 431 (1963).

Table II. Comparison of Gaussian Fit and Center of Gravity Methods

                  POSITION                          AREA
RECORD NUMBER     GAUSSIAN     CENTER OF GRAVITY    GAUSSIAN   CENTER OF GRAVITY
  3               31.566316    31.566181            0.029      0.031
 11               39.887486    39.887438            0.099      0.052
 18               48.212352    48.212326            0.1??      0.127
 26               55.668971    55.668973            0.278      0.283
 36               60.327031    60.327047            0.814      0.827
 40               60.958503    60.958485            2.453      2.511
140               93.087530    93.087427            6.695      6.920

?? = digits illegible in the source.

out important information and all the steps of the iteration. Also, after a spectrum has been processed, one may cause the important information to be listed for every peak which was classified as being “unusual.” A peak is unusual if it is too wide, too narrow, has too much error on the fit, takes too many iterations, etc. All unusual peaks are denoted by an asterisk in Table I. If the “center of gravity” method is used, a peak is unusual only if it is detected to be a multiplet.

The deconvolution algorithm is written so that the peak height, half-width, and centroid location are checked every 20 iterations and when the iteration stops. If the peak intensity is less than zero, or if the half-width is very much larger or smaller than the average, or if the centroid is outside the bounds of the peak, the results are considered to be meaningless, that peak is deleted, and the deconvolution continues without it. If only one peak is left, the record is processed as a singlet. An example of this is seen in Table I, record No. 104, which was initially detected to be a doublet. After 20 iterations, the second peak was deleted, and the record was processed as a singlet. Upon deletion of all the peaks, the message “MEANINGLESS RESULTS” is printed, and the “center of gravity” method is used to calculate position and intensity. This occurred in Table I at record No. 144, which also started as a doublet but converged to meaningless values. When a large peak cannot be fitted properly, it generally indicates a peak shape which requires the attention of the analyst to determine the origin of the difficulty.

For the data in Table I, the value of W0, the average half-width for the spectrum, was computed to be 3.17 msec. Considering the degree of approximation inherent in Equation 9, this value compares well with the half-widths observed in the spectrum.
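The three validity checks applied during deconvolution can be sketched as below. This is an illustrative sketch only: the tuple layout, the function names, and `width_factor` (standing in for “very much larger or smaller than the average”) are assumptions, since the text gives no numerical bound.

```python
def prune_components(components, t_start, t_end, avg_half_width, width_factor=3.0):
    """Keep only fitted components that pass the three sanity checks.

    components: list of (height, half_width, centroid) tuples from the fit.
    width_factor is an assumed bound, not a value taken from the paper.
    """
    kept = []
    for height, half_width, centroid in components:
        if height < 0:
            continue  # negative intensity
        if half_width > avg_half_width * width_factor:
            continue  # far too wide
        if half_width < avg_half_width / width_factor:
            continue  # far too narrow
        if not t_start <= centroid <= t_end:
            continue  # centroid outside the bounds of the peak
        kept.append((height, half_width, centroid))
    return kept


def classify(kept):
    """Describe how the record is processed after pruning."""
    if not kept:
        # All components deleted: center of gravity is used instead.
        return "MEANINGLESS RESULTS"
    return "singlet" if len(kept) == 1 else "multiplet"
```

A doublet whose second component converges to a negative height is reduced to a singlet, as described for record No. 104; a record losing every component is reported as “MEANINGLESS RESULTS,” as for record No. 144.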
The dynamic resolution calculated from W0 can also be said to be representative of the spectrum as long as this condition is met. The standard error in Table I represents the RMS deviation between the observed heights of the data points and those calculated from the fitted Gaussian function.

Figure 8. Intensity error arising from center of gravity calculation (panels labeled GAUSSIAN and CENTER OF GRAVITY)

The second data reduction program, CIFAS2, has for its main purpose the calculation of the mass-to-charge ratio corresponding to each peak in the spectrum. Because there are slight deviations in the positions of the peaks from one run to the next, a standard must be available within each spectrum to provide reference peaks for determination of the unknown sample masses. This is accomplished by introducing a standard sample, usually perfluorokerosene (PFK), into the mass spectrometer source along with the unknown sample. The problem then becomes one of identifying which peaks in the spectrum are due to the standard, interpolating between them to find the masses of the unknown peaks, then separating out the peaks due to the standard.

The identification of the standard peaks is itself a two-part problem. The first part is to correctly locate and identify several standard peaks in the spectrum. This may be called “lock-on.” The second part involves some sort of curve fit through the known points, followed by extrapolation along the curve to identify other standard peaks. This curve fit and extrapolation procedure may be called “tracking” the function.

The relationship between mass and time is shown graphically in Figures 1, 2, and 3. The relationship is very nearly linear (as expressed in Equation 1) on the low mass end of the spectrum but deviates from the linear relationship at the high mass end. The linear relationship may serve as an approximation but is not nearly accurate enough for the mass calculations. Since the mass spectrometer is capable of measuring masses (by peak-matching) to 1-3 ppm accuracy, the calculations and curve fitting (which essentially are also peak matching) must be based on a mathematical function relating mass to time which is accurate to less than 1 ppm.

The lock-on problem was solved using the linear approximation, starting at the low mass end of the spectrum. The low mass end was chosen because the relationship is more nearly linear and the peaks are more widely spaced. The approximate location of any given mass can be predicted using Equation 11, which is a variation of Equation 1, as long as the two constants, a and b, are known.

$$t = \frac{1}{b}\,(\ln M - a) \qquad (11)$$
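Equation 11 and its inverse can be written directly as a pair of functions. The sketch below is illustrative only; the function names are hypothetical, and any numerical values of a and b used with it would have to come from the instrument.

```python
import math


def predicted_time(mass, a, b):
    """Approximate scan time of a peak of the given mass (Equation 11)."""
    return (math.log(mass) - a) / b


def mass_at_time(t, a, b):
    """Inverse of Equation 11, from the linear form ln M = a + b t."""
    return math.exp(a + b * t)
```

With the intercept set to a = ln M0, the predicted time of the starting mass M0 is zero, which agrees with the definition of a as the logarithm of the starting mass.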

The slope, b, which is a function of the scan rate, is passed as a parameter from the digitization program, DIGMS9. It remains only to calculate the intercept, a, which is a function only of the magnet range and set start controls. The magnet current at the start of the scan is calculated from the set start,

$$C = p + qS \qquad (12)$$

where C is the magnet current read from the meter on the MS-9, S is the set start parameter, and p and q are constants determined empirically. Next, the logarithm of the starting mass is calculated from the magnet current and the magnet range.

$$a = \ln M_0 = r + t\,(\ln C) \qquad (13)$$

where r is a constant looked up in a table as a function of magnet range, and t is a constant, determined empirically, which theoretically should have a value of 2.

Experimentally, the slope, b, has been observed to change insignificantly over a 2-year period. This might be expected, since it is determined solely by the values of an unchanging RC network in the scan circuit. The value of a, however, is related to the parameters of the magnet scan amplifier and the magnet and electrostatic sector regulator and stabilizer amplifiers. When these amplifiers were changed, the observed effect was an alteration of the offset voltages of these circuits and thereby of the starting mass. For this reason, an option was included to “calibrate” the program by updating the table from which the constant r in Equation 13 is obtained. The calibration is performed by entering the location in time of the peak due to water at m/e 18. In order to assist the operator in locating this peak, the expected locations of the water peak at m/e 18 and the CF3+ peak of PFK at m/e 69 are printed at the top of the output from CIFAS2. If the 18 peak is not where expected, it is usually off by about the same amount and always in the same direction as the 69 peak, which is easily located, being the largest peak in the PFK spectrum. From the location of the 18 peak, the program recalculates the value of r in Equation 13 and updates the table. This calibration is generally necessary only if some changes have been made in the mass spectrometer. The preset value of r in the table will normally be adequate.

The masses corresponding to the standard compound are contained in a standard mass table. Using Equation 11, the approximate locations of several of these standard masses on the low mass end are calculated, and the program enters a phase which attempts to identify the peaks in the spectrum corresponding to these masses. All possible combinations of matches are tried, including the possibility that the peak corresponding to a standard mass may not be physically present in the spectrum. The combinational analysis continues until the lock-on criterion is met, that is, until six out of ten successive standard masses have been identified within some preset error bound. Once this criterion is met, the lock-on has been accomplished, and the tracking phase begins.

Tracking is accomplished by means of a cyclic sliding curve fit of a cubic function through six points at a time. The algorithm is as follows.

Step 1. The coefficients of the parabolic function

$$\ln M = a + bt + ct^2 \qquad (14)$$

are calculated by a least-squares curve fit through the six known points.

Step 2. The coefficients, a, b, and c, are used to reidentify the standard peaks. This reidentification serves to establish more accurately which of several closely spaced peaks is the one due to the standard compound.

Step 3. The coefficients of the cubic function

$$\ln M = a + bt + ct^2 + dt^3 \qquad (15)$$

are calculated by least squares using the six reidentified peaks.

Step 4. The masses of all peaks between the second and third known peaks are calculated and output, using the coefficients of the cubic function.

Step 5. An extrapolation along the cubic function is performed, and if a peak falls within an extrapolation “window” of a standard mass, the first known peak is dropped and the other six are moved down one place. The cycle then reverts to step 1.

Several attempts were made initially to find a single function which would describe the mass-time relationship over the entire spectrum. None of the simple functions tried came near to the accuracy required for mass calculation. More complex functions were found to improve the accuracy only slightly, while drastically increasing computing time. For this reason, this approach was abandoned, and the cyclic sliding curve fit was used. This allows the use of a simple function which is fitted repetitively through sections of the spectrum small enough that the required accuracy can be attained.

At first, the parabolic function (Equation 14) was used for mass calculation and extrapolation as well as for reidentification. The average error was between 10 and 15 ppm for a spectrum. When the cubic function was adopted, the same data yielded average errors between 5 and 10 ppm. Bowen et al. (10) and Levenberg (20) also obtained improved results with a cubic function rather than a parabolic function. Peak time and intensity data supplied by Levenberg from an AEI MS-9 mounted on a vibration-damped concrete pad were used to test the computation, and the error dropped to 2.21 ppm (compared with 2.29 ppm as obtained by Levenberg using his programs on the same data). It is probable, at this point, that the cubic function is furnishing all the accuracy required, and that the higher observed error in our own data is due to instrumental sources, especially vibration.
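One cycle of the fit in Steps 1 to 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the CIFAS2 code: the least-squares problem is solved through the normal equations with plain Gaussian elimination, the time axis is centered and scaled before fitting to keep the matrix well conditioned, and all function names are hypothetical. Passing `degree=2` gives the parabola of Equation 14 and `degree=3` the cubic of Equation 15.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ln_mass(times, masses, degree):
    """Least-squares polynomial for ln M; returns a function mass_at(t)."""
    t0 = sum(times) / len(times)
    scale = max(abs(t - t0) for t in times)
    u = [(t - t0) / scale for t in times]      # centered, scaled time
    y = [math.log(m) for m in masses]
    n = degree + 1
    normal = [[sum(ui ** (i + j) for ui in u) for j in range(n)] for i in range(n)]
    rhs = [sum(yi * ui ** i for ui, yi in zip(u, y)) for i in range(n)]
    coeffs = solve_linear(normal, rhs)

    def mass_at(t):
        v = (t - t0) / scale
        return math.exp(sum(c * v ** k for k, c in enumerate(coeffs)))

    return mass_at
```

In use, `fit_ln_mass(times, masses, 2)` would serve the reidentification of Step 2, and `fit_ln_mass(times, masses, 3)` would supply the masses of unknown peaks lying between the second and third standard peaks in Step 4.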
The cubic function does not work as well as the parabolic function in the reidentification step, however. The additional degree of freedom in the cubic function apparently provides enough flexibility that a peak incorrectly identified initially is often not reidentified properly.

To circumvent the ill-conditioning problem in the least-squares matrix arising from the fitting of the nearly linear ln M vs. time relationship to a cubic function, two operations were performed. The first consisted of a scaling operation on the six points before fitting the function. The second consisted of using a special least-squares curve fitting algorithm described by Noble (21). Of the two, the scaling operation is probably the more important.

The choice of masses used in the standard mass table was found to be of critical importance. The masses must be spaced as evenly as possible and should correspond to peaks which will very likely be present in the standard spectrum. When closely spaced masses are used in the table, large errors are often encountered, as shown in Figure 9, when two pairs of closely spaced points are located on each side of a fifth point. The curve fitting routine results in a function which very nicely describes the two pairs of points but leaves considerable error for the point in the middle. This computational error is generally far larger than the error normally expected, represented by the diameter of the points in Figure 9.

Figure 9. Mass error caused by uneven spacing of standard masses. Solid line = curve fitted with uneven spacing of points; dotted line = actual ln M vs. time relationship

The calculation of exact masses by interpolation has commonly been done using a Lagrange interpolating polynomial (8, 11), but this requires considerably more computing time than simply using the coefficients of the sliding function, which are already available from the curve fit, and has been found not as accurate (10). For this reason, the masses are calculated between the second and third known points with each cycle of the sliding fit. The standard peak identification and the exact mass calculation are performed concurrently, with only one pass through the data, a very efficient process.

Table III shows the printed output from CIFAS2 during the mass calculation. The data are the same spectrum given in Table I. Those peaks which correspond to masses found in the standard mass table have an entry showing the true mass and the error in amu and ppm, in addition to the calculated mass. The average error of all the known peaks is given at the end of the data.

Table III. Results of Mass Calculation for 2,4,5-Trichloroaniline
[Printed output with columns headed NO., TRUE MASS, CALC. MASS, INTENSITY, and ERROR; the column entries are not recoverable from the extracted text.]

Table IV. Comparison of Single Spectrum and a Composite of Five Spectra
[Theoretical mass and abundance compared with the mass, mass error (ppm), abundance, and % error for the single spectrum and for the composite spectrum; the column entries are not recoverable from the extracted text. Legible summary line: AVERAGE MASS ERROR 13.01 PPM, AVERAGE INTENSITY ERROR 0.33%.]

(20) M. Levenberg, Abbott Laboratories, Chicago, personal communication, 1972.
(21) B. Noble, “Applied Linear Algebra,” Prentice-Hall, Englewood Cliffs, N.J., 1969, p 260.

IMPROVEMENTS IN MASS SPECTRAL DATA THROUGH EDITING, COMPOSITING, AND BACKGROUND SUBTRACTION

Mass spectrometers operated on any sort of routine basis have a residual background which contributes mass spectral peaks to the sample of interest. Likewise, the standard which must be used for mass calibration also contributes peaks of no analytical interest. The last stage of CIFAS2 provides several options for handling these problems as well as other useful features for improving the precision of the mass presentation.

The first option is that of editing the spectrum. If an edit is performed, each mass in the spectrum is compared to an edit mass table, which contains all the exact masses for peaks observed in the spectrum of the standard compound, plus some background masses due to other species such as air and water. If a mass matches one in the edit mass table within some preset ppm tolerance, it is deleted from the spectrum.

The next option is a background subtraction. In this case, the masses in the spectrum are compared to the masses in a previously stored background spectrum, which usually includes the standard compound. In order to subtract the background spectrum, an intensity ratio between the two spectra must be established which can serve to cancel out changes in intensity from one scan to the next. Such changes in intensity could be due to pressure changes in the source, etc. The problem immediately arises, however, that some peaks may have intensity ratios significantly different from others, because of contribution from both the sample and the background. These peaks should not be counted into the average intensity ratio to be computed. On the first comparison between the sample spectrum (with background included) and the stored background spectrum, if two masses match within the preset ppm tolerance (the same tolerance as used for editing), the ratio of the intensities is taken and stored in an array. After all masses have been compared, those ratios which occur most often are averaged to obtain an intensity ratio between the spectra.
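The edit step and the estimate of the intensity ratio can be sketched as below. This is a hedged illustration, not the CIFAS2 code: spectra are assumed to be lists of (mass, intensity) pairs, and grouping the ratios into bins of width `bin_width` is only one assumed way of finding “those ratios which occur most often.” All names are hypothetical.

```python
def matches(m1, m2, tol_ppm):
    """True if two masses agree within the ppm tolerance."""
    return abs(m1 - m2) / m2 * 1e6 <= tol_ppm


def edit_spectrum(spectrum, edit_masses, tol_ppm):
    """Delete every peak whose mass matches an entry of the edit mass table."""
    return [
        (m, i) for m, i in spectrum
        if not any(matches(m, e, tol_ppm) for e in edit_masses)
    ]


def intensity_ratio(sample, background, tol_ppm, bin_width=0.1):
    """Average the most frequently occurring sample/background ratios."""
    ratios = [
        si / bi
        for sm, si in sample
        for bm, bi in background
        if bi > 0 and matches(sm, bm, tol_ppm)
    ]
    if not ratios:
        return None
    bins = {}
    for r in ratios:
        bins.setdefault(round(r / bin_width), []).append(r)
    most_common = max(bins.values(), key=len)
    return sum(most_common) / len(most_common)
```

A peak carrying intensity from both the sample and the background produces a ratio that falls outside the most populated bin, so it does not distort the average.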
The comparison of masses is again performed, and when two masses match, the intensity of the background times the intensity ratio is subtracted from the intensity of the sample. If the result is less than or equal to zero, the peak is deleted. If the result is greater than zero, the mass and the resulting intensity are printed out, with an asterisk to indicate that some of the intensity has been subtracted. Those masses which do not match masses in the background are printed out with the intensity unchanged and without the asterisk.

Table V. Final Results of Elemental Formulas for 2,4,5-Trichloroaniline
[Printed output listing candidate elemental formulas for each mass, with entries such as H4C6NCL(37)2 and the message DOUBLY CHARGED SPECIES--FORMULAS CAN BE FOUND AT MASS 194.93831; the column layout is not recoverable from the extracted text.]

The compositing or averaging option was designed to aid in cancelling random errors in mass and intensity to yield data of higher accuracy than might be obtained from a single scan. Burlingame (12) has observed that the error can be expected to decrease by a factor approximately equal to the square root of the number of scans averaged together. For this reason, the third option in CIFAS2 is to composite spectra, that is, to average the masses and intensities of a number of spectra. The first spectrum is simply stored as is. The masses of the second spectrum are then compared with the masses of the first, and the ratios of the intensities of those that match are averaged. The masses are again compared, and when they match, the intensity of the second, times the average intensity ratio, is averaged onto the intensity of the first. The mass of the second is also averaged onto that of the first. If further compositing is desired, this composite is stored where the first spectrum was stored, and a third spectrum is averaged in the same manner onto the composite. When no further compositing is desired, the composite spectrum is assigned a new label and new comments, and the results are printed out. Along with the mass and intensity of the composite peak is printed the number of times that peak occurred in the composited spectra.

In Table IV, the exact masses and relative isotopic abundances for the parent ion region of the trichloroaniline spectrum obtained on a theoretical basis are compared with those obtained for a single spectrum and for a composite of five spectra. It is obvious from these data that the composite is more accurate in terms of both mass and intensity than the single spectrum, presumably because of cancellation of the random error introduced by any of a number of sources, the most likely being vibration in the mass spectrometer. The improvement is much more marked for the intensities than for the masses, indicating that some determinate error still exists in the mass calculation at this point in the spectrum.

The last option in CIFAS2 is to store the spectrum as a background. Once stored, this spectrum is the one which will be subtracted when background subtraction is desired. Typically, a background is not edited. It may or may not be a composite spectrum.
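The subtraction and compositing steps can be sketched as follows. This is an illustrative sketch under assumptions, not the CIFAS2 code: the data layouts and function names are hypothetical, the averaging is done as a running mean weighted by the number of occurrences, and peaks of the new spectrum that match nothing in the composite are simply ignored, a point the text does not specify.

```python
def ppm_match(m1, m2, tol_ppm):
    """True if two masses agree within the ppm tolerance."""
    return abs(m1 - m2) / m2 * 1e6 <= tol_ppm


def subtract_background(sample, background, ratio, tol_ppm):
    """Return (mass, intensity, flag) rows; flag '*' marks a subtracted peak."""
    rows = []
    for sm, si in sample:
        matched = [bi for bm, bi in background if ppm_match(sm, bm, tol_ppm)]
        if not matched:
            rows.append((sm, si, ""))          # unchanged, no asterisk
            continue
        remaining = si - matched[0] * ratio
        if remaining > 0:
            rows.append((sm, remaining, "*"))  # partly subtracted
        # a result <= 0 means the peak is deleted
    return rows


def add_to_composite(composite, spectrum, tol_ppm):
    """Average a spectrum onto a composite of [mass, intensity, count] rows."""
    pairs = []
    for entry in composite:
        for m, i in spectrum:
            if i > 0 and ppm_match(m, entry[0], tol_ppm):
                pairs.append((entry, m, i))
                break
    if not pairs:
        return composite
    # Average intensity ratio (composite / new spectrum) over matching peaks.
    ratio = sum(entry[1] / i for entry, _, i in pairs) / len(pairs)
    for entry, m, i in pairs:
        n = entry[2]
        entry[0] = (entry[0] * n + m) / (n + 1)
        entry[1] = (entry[1] * n + i * ratio) / (n + 1)
        entry[2] = n + 1
    return composite
```

The count kept in each composite row corresponds to the number of times the peak occurred in the composited spectra, which the text says is printed with the result.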

CALCULATION OF CHEMICAL FORMULAS

The purpose of CIFAS3 is to calculate possible formulas for the exact masses calculated. The atoms to be used in these formulas are described in an atomic table which may contain up to 20 different isotopic species. For each atom, the table contains an atomic symbol, the number of characters in the atomic symbol, the atomic weight, the relative isotopic abundance, the number of bonding electrons minus two, the maximum number of that type of atom to be permitted in any formula, and a logical variable indicating whether or not an atom is an isotope of the atom preceding it in the table. The preset table, built into the program, contains the following atoms: H, 12C, 13C, 11B, 10B, 28Si, 29Si, 30Si, O, N, 32S, 33S, 34S, P, F, 35Cl, 37Cl, 79Br, 81Br, and I. The table can be changed at any time by reading in a new table from a card deck.

If the analyst is fairly certain what atoms are likely present in the sample, he should specify which atoms should be used in generating the formulas from the exact masses. If he is not certain, he may use the “search” option, specifying which atoms might possibly be present and ruling out atoms which are not present. If he makes no specification at all, the search will be conducted for all the atoms in the table. If the search option is specified, the program searches the lower part of the spectrum for the occurrence of atoms specified by the operator. When the search is completed, only those atoms which have been found to occur are allowed, and the spectrum is processed normally. If the operator does not wish to use the search option, he must specify which atoms to use.

The program also contains provision for processing a single exact mass, either with or without the search option. This is most useful in an interactive mode, when the operator is not sure what the elemental composition of the compound might be. He can pick out distinctive ions in the spectrum, see what the likely formulas are, and himself specify the atoms to be allowed in processing the entire spectrum.

Each mass in the spectrum is processed to obtain possible formulas for the ion. First, the mass is doubled, and a check is made for the existence of a corresponding mass in the spectrum, that is, for the singly charged species corresponding to the doubly charged species observed at lower mass. If the singly charged species is found, a message is later printed stating “DOUBLY CHARGED SPECIES--FORMULAS CAN BE FOUND AT MASS XXXXXXXX” where XXXXXXXX is the mass of the singly charged species. If the singly charged species is not found, the message is not printed.

The formulas corresponding to each mass in the spectrum are obtained by trying all possible combinations of the atoms specified. It is obvious, therefore, that when more atoms are allowed, the computation time is greater and more formulas will be found for the given tolerance, normally set at 50 ppm. Each possible formula is checked to see if it meets three more criteria before it will be printed out. First, the number of rings and double bonds is computed, and the formula is accepted only if this number is less than some preset maximum. Next, the formula is checked for the presence of too many terminal (nonchaining) atoms. Both of these criteria can be obtained from a single calculated value, ISUME,

$$\mathrm{ISUME} = 2.0 + \sum_{I=1} \mathrm{NO}(I) \times \mathrm{NEL}(I) \qquad (16)$$

where NO(I) is the number of atoms of type I, and NEL(I) is the number of bonds the atom will form minus two: that is, for carbon and silicon, NEL = 2; for nitrogen or phosphorus, NEL = 1; for oxygen or sulfur, NEL = 0; while for hydrogen or halogens, NEL = -1. If ISUME is divided by two, the result is equivalent to McLafferty’s formula (22) for the number of rings and double bonds which would be found in the un-ionized fragment. Half of a ring or double bond indicates that the fragment, if neutral, has an odd (unpaired) electron. If ISUME is less than -1, too many terminal atoms (hydrogen or halides) are present, and the formula is meaningless. Thus for NH4+, ISUME = -1 and the formula is acceptable, but NH5+ has too many terminal atoms.

The last criterion to be checked is the relative isotopic abundance, with regard to the rest of the peaks in the spectrum. Any formula containing isotopic atoms should only be accepted if all isotopic homologs of that formula which are more abundant can also be found in the spectrum. For instance, C37Cl+ should not be accepted unless the more abundant C35Cl+ is also present. If no isotopic atoms are in the formula, it is acceptable and is printed out. If isotopic atoms are present, the relative isotopic abundance for the formula is calculated (23). If the relative abundance is less than 2 × 10⁻⁴, the formula is rejected. All the isotopic atoms in the formula are then permuted, and the relative abundance of the formula corresponding to each permutation is calculated. If the relative abundance for any one permutation formula is larger than that for the original formula, the mass of the permutation formula is computed, and a check is made to determine the existence of an ion corresponding to that mass in the spectrum. If the ion corresponding to the more abundant permutation formula is not found, the less abundant original formula is rejected. In other words, if any one of the permutations which are more abundant is not found, the formula is unsatisfactory.

The final printed output is shown in Table V for the same trichloroaniline spectrum as in Tables I and III. The columns give the observed mass of the ion, its intensity in per cent of tallest peak, the number of rings and double bonds for the formula, the ppm error between the observed mass and the mass calculated for the formula, and the formula. The ppm error averages 5-10 ppm, but worst-case error runs as high as 30-40 ppm. It is expected that this will improve when vibration is eliminated from the instrument, but a tolerance of 50 ppm between the formula and the observed mass is being used until then. It should be noted that even with a 50-ppm tolerance in the mass, most of the ions correspond to only one formula. Only one formula is given for the molecular ion at mass 194.93779. That formula is H4C6NCl3, which is correct for the sample, trichloroaniline. Peaks can be observed at 159.96703 corresponding to p-Cl; at 125.00142 corresponding to p-2Cl; and at 90.03779 corresponding to p-3Cl. The structure of the compound is well represented in the formulas derived as a final result of the data reduction.

(22) F. W. McLafferty, “Interpretation of Mass Spectra,” W. A. Benjamin, New York, N.Y., 1967, p 26.
(23) J. L. Margrave and R. B. Polansky, J. Chem. Educ., 39, 335 (1962).
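The ISUME calculation of Equation 16 and the two acceptance criteria derived from it can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CIFASS code: the function names, the dictionary representation of a formula, the `max_rings` default, and the use of the calculated mass as the denominator of the ppm error are all hypothetical. Only the NEL values and the ISUME < -1 cutoff are taken from the text.

```python
# Illustrative sketch only; names and defaults are assumptions, not CIFASS.

# NEL = (number of bonds the atom forms) - 2, with values as given in the text.
NEL = {
    "C": 2, "Si": 2,
    "N": 1, "P": 1,
    "O": 0, "S": 0,
    "H": -1, "F": -1, "Cl": -1, "Br": -1, "I": -1,
}


def isume(counts):
    """Equation 16: ISUME = 2.0 + sum over atom types of NO(I) * NEL(I)."""
    return 2.0 + sum(n * NEL[symbol] for symbol, n in counts.items())


def rings_and_double_bonds(counts):
    """ISUME / 2; a half-integer result signals an odd-electron neutral."""
    return isume(counts) / 2.0


def passes_valence_checks(counts, max_rings=20.0):
    """Apply the two ISUME-based criteria (max_rings is a hypothetical preset)."""
    value = isume(counts)
    if value < -1:  # too many terminal atoms (hydrogen or halogens)
        return False
    return value / 2.0 < max_rings  # fewer rings/double bonds than the preset


def ppm_error(observed_mass, calculated_mass):
    """Mass error in ppm; dividing by the calculated mass is an assumption."""
    return (observed_mass - calculated_mass) / calculated_mass * 1.0e6
```

For the trichloroaniline molecular ion H4C6NCl3, `rings_and_double_bonds({"C": 6, "H": 4, "N": 1, "Cl": 3})` gives 4.0, that is, one ring plus three double bonds. The NH4+ and NH5+ cases from the text give ISUME values of -1 and -2, so the first passes and the second is rejected.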

ACKNOWLEDGMENT We wish to thank Paul Bender and his staff for their efforts and cooperation in the development of this work and Milt Levenberg of Abbott Laboratories, Chicago, Ill., for his valuable advice and comments on various aspects of the approach to obtaining meaningful data. Received for review December 22, 1972. Accepted February 26, 1973. This research was supported by the Air Force Office of Scientific Research under AFOSR 69-1725, by the Wisconsin Alumni Research Foundation, and by the National Science Foundation under GP-36236X. The AEI MS-902C mass spectrometer and the Raytheon 706 computer system were purchased in part by funds supplied by the National Science Foundation.

Application of Chelating Ion Exchange Resins for Trace Element Analysis of Geological Samples Using X-Ray Fluorescence C. W. Blount Department of Geology, University of Georgia, Athens, Ga. 30602

D. E. Leyden, T. L. Thomas, and S. M. Guill Department of Chemistry, University of Georgia, Athens, Ga. 30602

The use of chelating ion exchange resins for the quantitative batch extraction of ions of trace elements and as a matrix for the determination of the elements using X-ray fluorescence is described. Chelex-100, an ion exchange resin containing iminodiacetic acid functional groups, is used for the determination of cobalt and nickel in U.S. Geological Survey samples. The selective extraction of bismuth is achieved by pH control, and bismuth was determined in several geochemical standard samples. NMRR, a chelating resin highly selective for gold and platinum metals, is used for the specific extraction of gold. In all cases, the determination is performed by pressing the resin into pellets and using these pellets as samples for X-ray fluorescence. Samples containing as little as 0.04 ppm gold, 0.2 ppm bismuth, and 15 ppm cobalt or nickel were analyzed.

Combined applications of ion exchange materials and X-ray spectroscopy for elemental analysis have been in use for some time. Campbell et al. (1) have recently reviewed applications of ion exchange resin loaded papers in this field. However, only a few types of resin loaded papers are commercially available. Batch equilibration of an ion exchange resin with a solution of the ion to be determined has also been reported (2-8). Use of the latter technique provides a homogeneous distribution of the sample on the resin. However, a high distribution coefficient of the ion of interest is required for quantitative removal of the ion from solution, especially if large concentrations of other ions are present.

In order to further develop this technique for elemental analysis, there are several factors to consider. With regard to extraction methods, the use of column techniques has some advantages over batch equilibrations for the more common cation ion exchange resins (e.g., Dowex 50). However, the former technique usually requires a larger quantity of resin and the sample must be

(1) W. J. Campbell, T. E. Green, and S. L. Law, Amer. Lab., June 1970, p 28.
(2) J. N. Van Niekerk, J. F. De Wet, and F. T. Wybenga, Anal. Chem., 33, 213 (1961).
(3) M. J. Miles, E. H. Doremus, and D. Valent, Pittsburgh Conference on Analytical Chemistry and Applied Spectroscopy, 1966, paper 236.
(4) C. W. Blount, W. R. Morgan, and D. E. Leyden, Anal. Chim. Acta, 53, 463 (1971).
(5) C. W. Blount, R. E. Channell, and D. E. Leyden, Anal. Chim. Acta, 56, 456 (1971).
(6) D. E. Leyden, R. E. Channell, and C. W. Blount, Anal. Chem., 44, 607 (1972).
(7) R. L. Collin, Anal. Chem., 33, 605 (1961).
(8) A. T. Kashuba and C. R. Hine, Anal. Chem., 43, 1758 (1971).

ANALYTICAL CHEMISTRY, VOL. 45, NO. 7, JUNE 1973
