A computer search system for identification of unknown X-ray diffraction patterns compares the d-spacings and relative intensities of the unknown with those of a large number of known patterns stored on tape. The method is quite general and can be used for matching known and unknown patterns in fields other than powder diffraction.
A Computerized Powder Diffraction Identification System
GERALD G. JOHNSON, Jr.
VLADIMIR VAND
Over the past 25 years, the ASTM Joint Committee on Powder Diffraction Standards (7) has compiled a file of over 12,000 powder diffraction patterns. With the advancement of high-speed electronic computers, repetitious jobs such as the identification of unknowns using these reference patterns can be accomplished easily if the proper control programs are written for these computers. It is the purpose of this paper to describe such a system for the identification of multiphase unknown powder patterns. This system has been developed under the auspices of the Joint Committee on Chemical Analysis by Powder Diffraction Methods of the American Society for Testing and Materials (here abbreviated ASTM). The ASTM Powder Diffraction File is now available for the first time in a computer-machinable form on IBM cards or on magnetic tape from ASTM. The system has been used at the Materials Research Laboratory of The Pennsylvania State University (8). It utilizes both the speed of the electronic computer and the reference patterns compiled by ASTM to identify multiphase unknowns in less than 4 min. The ASTM file has been available in the form of 3 × 5 in. printed cards for some time. To assist the hand search, various index books are available, arranged in terms of three strongest spacings, name, formula, etc. The ASTM index of powder diffraction patterns is a good example of a system for identification of unknown
VOL 59
NO. 8
AUGUST 1967
The retrieval program is as efficient as can be devised, with applications other than powder data being kept in mind
materials by means of “spectral” properties, i.e., of continuously variable positions and intensities of diffraction lines. An x-ray powder diffraction pattern of a polycrystalline solid consists of diffraction lines in some order of interplanar spacings measured in angstroms. The intensity of each line is generally recorded as either peak or integrated intensity measured in arbitrary units. Each crystalline substance has its own set of interplanar spacings which is different from those of other crystalline substances. The relative intensities of the various reflections are also characteristic of the substance. Therefore each substance gives its own characteristic powder pattern. Each ASTM pattern thus consists of a set of pairs of numbers, one representing the interplanar spacing, d, measured in angstroms, and the other representing a relative intensity, I/I₀, expressed on the scale of the strongest line as 100. From the fact that each crystalline polymorphic form of a substance has a unique crystallographic unit cell, the d-value data for each crystalline phase can be accounted for by no more than six independent cell constants. The only valid information deducible from the spacings data pertains to the actual geometry of the unit cells of each crystalline phase present. Along with each diffraction line, a corresponding intensity is determined by the arrangement of the atoms within the unit cell and the dimension and orientation of the crystallites. The intensities of the powder pattern of a particular crystalline phase yield implicit information about the three-dimensional electron density in the phase. However, since the crystal structure of an unknown phase is not directly determinable from x-ray data alone, no explicit determination of the number or kind of atoms within the unit cell can be made.
Even though no positive identification of elements or compounds can usually be deduced, the empirical method of quantitatively matching the d-values of the unknown with those of the standard, together with the qualitative matching of relative peak intensities, is used to identify compounds (12). Normally, to identify a single-phase unknown substance, the spacing of its strongest intensity lines is selected and reference is made to an index of tabulated substances which have the correct spacing to within the possible limits of errors (7). Most single substances can be so identified with comparative ease and certainty, especially when the other spacings from the pattern are used. This is not so when complicated patterns of mixtures are analyzed. The search using the hand
INDUSTRIAL AND ENGINEERING CHEMISTRY
method becomes more uncertain and time-consuming the more components are present in the mixture. A mixture of two or more substances gives a pattern consisting of the superimposed patterns of the individual components, provided these components exist as separate crystals in the powder specimen. Expressed mathematically, the set of interplanar spacings corresponds to the sum of the spacings for each crystalline phase present (6). The exception to this rule occurs only in solid solutions, which can exhibit powder patterns continuously variable over the solubility range. However, solid solutions can be handled successfully within a file by including their representative patterns for sufficiently closely spaced compositions.
BASIS OF THE SEARCH SYSTEM
The main assumptions and criteria on which a search system is based are therefore as follows:
(1) A pattern of an unknown mixture is the sum of the patterns belonging to the individual substances. [For some exceptions to this rule, see (5, 10).]
(2) The intensities of the component patterns are proportional to the amount of substances present. This implies that the effects of various experimental errors on the intensities of both the standard ASTM file patterns and the unknown pattern can be neglected. The most serious error is due to absorption and preferred orientation, and can be largely eliminated by correct experimental procedure. When absolute intensities become available for the standard pattern, the knowledge of intensities will allow estimation of true percentages of components of a mixture.
(3) The effects of solid solution can be neglected. As many of the patterns in the file are for pure stoichiometric compounds, errors may arise when mixtures of technical substances or minerals are analyzed.
(4) The patterns of the component substances of a mixture are present in the ASTM file. However, even when a substance is not in the file, valuable clues can be obtained by considering isomorphous substances, which are often retrieved in addition to or instead of the true substance searched.
(5) Maximum error allowable between the positions of the lines and intensities of the file patterns and the unknown pattern is given. This error, both in spacing and intensity, may include allowance for the solid solution effect, orientation, etc. The search has to be made allowing for such an error. However, the larger the error allowance, the more false substances will be retrieved; on the other hand, if the allowance is made too small, some correct patterns may be missed.
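Criteria 1 and 2 can be illustrated with a minimal Python sketch, assuming each pattern is held as a dictionary from a d-spacing to an intensity. The function name `mix_patterns` and all numerical values are hypothetical and chosen only for illustration; they are not taken from the ASTM file.

```python
from collections import defaultdict

def mix_patterns(components):
    """Sum component patterns weighted by the fraction of each substance.

    components: list of (fraction, pattern) pairs, where pattern maps a
    d-spacing (angstroms) to an intensity. Coinciding lines simply add
    (Criterion 1); each pattern is scaled by its fraction (Criterion 2).
    """
    mixture = defaultdict(float)
    for fraction, pattern in components:
        for d, intensity in pattern.items():
            mixture[d] += fraction * intensity
    return dict(mixture)

# Hypothetical two-line patterns that share the line at 2.00 A.
pattern_a = {3.00: 100.0, 2.00: 40.0}
pattern_b = {2.50: 100.0, 2.00: 60.0}

mixture = mix_patterns([(0.5, pattern_a), (0.5, pattern_b)])
```

In this toy case the shared line at 2.00 A. receives contributions from both components, which is the overlap situation that makes hand analysis of mixtures difficult.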
PREVIOUS COMPUTER PROGRAMS
Several computer search programs which are based on the above assumptions already exist for identification of mixtures from x-ray powder diffraction patterns. For example, a successful program, the ZRD Search-Match, has been described by L. K. Frevel (2) of Dow Chemical Co. In its name, Z stands for the search of the atomic number of the spectroscopically detectable elements, R for the set of codified polyatomic groups detectable, and D for interplanar spacing of the 10 most intense powder lines of the standard. The program has been written in ALGOL 60 for the Burroughs 5000 digital computer. At the time of publication, the file contained 1359 standard patterns selected from the ASTM file, each containing the first 10 d-spacings arranged in descending order of their intensities. The time to search and match the digitized powder data of an unknown with the standards is from 2 to 5 min. using magnetic tape storage. Frevel (3) stresses that the prerequisite for a chemical identification of a solid by powder diffraction is an elemental analysis by x-ray fluorescence and/or optical emission spectroscopy to determine the chief elements present, and his program uses this knowledge at the outset of the search. Another program has been described by M. C. Nichols (9). This program has been written for an IBM 7094 with 32K memory storage and it matches d-spacings from an unknown pattern with a standard file. Nichols, at first using the three strongest lines of each standard, uses a device of subtracting already found patterns from the given mixture of patterns as a means of unraveling mixtures. The file contains about 2700 patterns and the program takes about 1 min. per phase to be subtracted. His program was tested on examples given by Frevel and seemed to perform well even without chemical information about the elements present. These are two typical programs presently available, and other similar programs probably exist.
Similar problems of search and identification arise in infrared, visible, and ultraviolet spectroscopy, in mass spectroscopy, in chromatography, and elsewhere. Computer programs written for one application can easily be adapted for others.
AUTHORS: Gerald G. Johnson, Jr., and Vladimir Vand are associated with the Materials Research Laboratory at The Pennsylvania State University. The authors wish to acknowledge the support of the Joint Committee on Powder Diffraction Standards obtained through ASTM. They also thank Dr. Daryl Boudreaux of Cambridge University for the initial programming and the Computation Center of Penn State for the computer time used.
In addition, the problem is only a special case of a much broader problem of information processing, storage, search, and retrieval, whether the processing is applied to documents or physical objects or pattern recognition. Examples are the recognition of diseases from their medical symptoms, or identification of animal or plant species from the description of their characteristics or their descriptors. An advance in one field of application might thus throw light on the more general problems, which are beginning today to be of considerable practical importance.
FURTHER CRITERIA FOR AN EFFICIENT SEARCH PROGRAM
We set ourselves the task of writing as efficient a retrieval program as we could devise, with applications other than powder data being kept in mind. However, the main impetus to our work was the present availability of the whole ASTM Powder Diffraction File on a magnetic tape in a form suitable for computer processing, which we helped to prepare jointly with ASTM. The ASTM Powder File (1966) contains at present about 12,000 active powder diffraction patterns, some of which contain up to 106 powder lines. The average number of powder lines per pattern is, however, 35, so that the whole file contains about 5 × 10^5 spacings and intensities. If we compare this number with the file of Frevel of about 1.4 × 10^4 lines, we see that our file would be about 35 times longer, and a search for a mixture using Frevel’s system would, instead of minutes, take about an hour, assuming the same program structure, computer speed, and time proportional to the length of the file. One would probably obtain slightly shorter, but comparable, times with the program of Nichols, which may be accounted for by the somewhat higher speed of the IBM 7094 over the Burroughs 5000. Our first task was to investigate how the running time of the search computer program could be appreciably shortened. We established that up to a tenfold increase in speed could be achieved by the following:
(6) Using an inverted file instead of a direct file for the first stage of the search. The use of an inverted file in a powder diffraction search is not new. The Matthew’s Coordinate Index is based on the idea, but the execution is mechanical by means of multiple-punched Termatrex, or “Peek-a-Boo” cards (7). We are applying the same principle in a computerized form. Each pattern (or its ASTM file number) is found in a manner similar to the method of the Matthew’s Coordinate Index.
The ASTM numbers are sorted and a second pass is made comparing the unknown pattern with reference patterns arranged in the direct file. During this comparison, the estimate of the amount of each matched reference phase is made. Further speed-up techniques are:
(7) Using packed representation of data within the computer system.
(8) Using integer arithmetic, which is considerably faster than floating point arithmetic.
These points thus became our main guidelines in the program design. However, we found that the use of both the inverted file and the direct file at different stages of the search proved to be the most advantageous solution.
CONSTANT BAND-PASS TRANSFORMATION OF DATA
The ASTM file patterns consist of a set of pairs of numbers, the spacing and the relative intensity. The unknown pattern also consists of such a set of pairs of numbers, and our task is to match the sets together. Owing to the presence of experimental errors, the file of known patterns must be sieved by means of a comparison system having a certain “band-pass” width determined by the maximum permissible error, both in spacing and in intensity. When a small number of patterns is handled by hand, the efficiency of such a comparison does not matter, but when the file contains over 10,000 patterns and machine comparison methods are used, the efficiency of filtering of the desirable information rapidly decreases with the increasing band-pass width, especially when many lines are being compared for a simultaneous pass condition. If the error varies greatly as a function of the parameters to be compared, the use of a constant band-pass width can greatly impair the efficiency of the system, because the size of the band-pass must be adjusted to accommodate the largest error within the range of measurements. This is generally valid for any comparison or transmission system. There are two ways open to allow for such a variation. One obvious way is to supply to the program or other device the band-pass width as a function of the parameters, and to use this variable width during each comparison. However, a less obvious but much more efficient way is to find the band-pass function and then to transform all the parameters into a new system which has a constant band-pass error over the whole useful range. The system then operates at the maximum possible computer speed while maintaining at the same time optimum efficiency. This we decided to do with our system.
Analysis of data of the ASTM powder file against new data published by NBS for pairs of the same substances shows that in the present system, which expresses the positions of the lines in spacings d, measured in angstroms, the error varies very nearly in proportion to d^2. In this case, simple analysis by Vand (17) shows that the constant window condition is best approximated by transformation of all the spacings d into reciprocal spacings d* = 1/d. Similar analysis of intensities disclosed that the system of log I/I₀ fulfills the constant window condition. Thus we can express further guidelines for the construction of a search program:
(9) The data of the file should be transformed into a system having a constant error window.
(10) For the powder diffraction file, this means the use of reciprocal spacings and logarithms of intensities.
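The reasoning behind guidelines 9 and 10 can be written out in a few lines. This is a sketch only: the symbols K, c, δd, δd* and δI are introduced here for illustration, the first line uses the stated observation that the spacing error grows as d^2, and the second line rests on an added assumption (not stated explicitly in the text) that the intensity error is roughly proportional to the intensity itself.

```latex
% Spacings: an error proportional to d^2 becomes constant in reciprocal spacing.
\delta d \approx K d^{2}, \qquad d^{*} = \frac{1}{d}
\;\Longrightarrow\;
\delta d^{*} = \left|\frac{\mathrm{d}d^{*}}{\mathrm{d}d}\right| \delta d
             = \frac{1}{d^{2}} \, K d^{2} = K

% Intensities (assumed proportional error): the logarithm has a constant window.
\delta I \approx c \, I
\;\Longrightarrow\;
\delta(\ln I) = \frac{\delta I}{I} = c
```

In both cases the transformed quantity carries an error window that no longer depends on where in the range the measurement falls, which is the constant window condition.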
It is unfortunate that the ASTM powder file historically started using direct spacings rather than reciprocal spacings. Thousands of man-hours must have been wasted because of added labor and complications in hand search which a variable window brings with it. However, we do not suggest that the hand system should now be changed. When computers are available, it is a simple matter to feed in the direct spacings, let the computer convert them into reciprocals, complete the computer search in reciprocals, convert the results back into direct spacings, and print these as such for the customer. The ASTM file is therefore available to customers on a magnetic tape in its customary form, i.e., in direct spacings expressed as floating decimals and relative intensities. In our search system, this tape is translated (usually once a year, to allow for revisions) into another tape, which contains the packed integer representation of the transformed system. This direct file tape is then transformed into an inverted file tape, which associates an ASTM file number with every packed spacing and intensity. The inverted file thus consists of ordered packed spacings (PS) and all ASTM numbers which have this packed spacing in common. Only the transformed tapes are then used by the program during the search. The constant band-pass transformation of data is a general principle which should be used in all data storage, retrieval, and transmission systems dealing with continuously variable properties to increase their efficiency. In addition, if the unit of measurement in the transformed system is chosen to be somewhat less than the smallest expected error, the data can be expressed as integers, the digits of which carry maximum possible information. The decimal fractions can be omitted, as being below the significant level. Such a system is then most suitable for packing as described under our Criterion 7.
UNDESIRABILITY OF STARTING THE SEARCH WITH CHEMICAL INFORMATION
Further considerations were concerned with the general strategy of the search. First, we realized that although the knowledge of chemical elements may be helpful or in some cases even indispensable for unique identification of mixtures, it is a faulty strategy to use such knowledge at the outset during search for the following reasons: First, the ASTM powder file is necessarily incomplete. Second, many isomorphous substances and solid solutions exist which may have similar powder patterns. Their detection may serve as a valuable clue to identification of an unknown. Therefore, even in the absence of the proper standard in the file, a search unrestricted as to elements present may reveal helpful substances which would have been suppressed by restriction on elements present, because they may contain elements not present in the sample. Their rejection on the grounds of elements present should always be carried out by hand at the end of the search, after they have been found and printed out by the program. In the case of the absence of a standard, they may provide some very valuable clues as to the type of substance which may account for the lines of the unknown. We therefore added the following requirement to the search program:
(11) The search system should work successfully without introducing at the outset the chemical knowledge of elements present. Such knowledge should be applied at the output stage of the program.
DANGERS OF SUBTRACTING ALREADY FOUND COMPONENT PATTERNS
Next, we considered the seemingly elegant technique of subtracting already found patterns, as practiced by Nichols (9). This technique seems at first sight to be justified by assumed Criteria 1 and 2 of the additivity of patterns. The technique is, of course, valid in the ideal case of all the component patterns present in the file, all the powder lines resolved (no overlap), and all the patterns unique. We came to the conclusion that this is a dangerous thing to assume, because there is no safe criterion whatsoever that one pattern already found really corresponds to a component present in the unknown sample. In trial experiments we found that very often the most probable component, as determined by some criterion of goodness of fit, number of lines of match, etc., is due to a spurious combination of lines. Also, the intensities of important lines are often modified by overlap, so that an incorrect amount is subtracted from the pattern. Subtraction of overlap lines then spoils the chance of correctly identifying the subsequent components. It transpired that it is a logically wrong approach to regard a given unknown pattern as an arithmetic sum of component patterns to be subtracted one by one as the search progresses, until nothing remains. The correct approach is to take the whole unknown pattern, and compare this in turn with all the file patterns, by asking each time the question: “What is the maximum amount of each standard substance compatible with the lines present in the unknown pattern?” To illustrate this principle on an actual, rather oversimplified, but concrete example, let us assume that there are two patterns in the file, namely Mn3ZnC (card 7-46) and Pt3Zn (card 6-0584), which match the unknown pattern equally well. If we subtracted the first satisfactory pattern on our file, namely Mn3ZnC, nothing would remain and on the second pass we would miss the presence of Pt3Zn completely, if present. However, under our scheme, we would obtain the answer: “The unknown pattern is compatible with 100% of pattern 7-46 and also with 100% of pattern 6-0584,” and the answer will remain true whether our unknown were actually 100% Mn3ZnC, or 50% Mn3ZnC and 50% Pt3Zn, or 100% Pt3Zn. Thus, given a search system and a file, our answer is logically final and complete under all the circumstances. The criterion for search can thus be formulated as follows:
(12) The search should give maximum possible concentrations of file substances compatible with Criteria 1 and 2 of additivity of patterns present.
There is, however, one obstacle to practical fulfillment of this criterion, namely that the intensities of ASTM powder patterns are presently expressed in relative units (strongest line being set to 100). We overcame this difficulty by writing the program to handle the intensities as if they were absolute. If they were, the program would then give the maximum concentrations of components in absolute units. The program also formally works with relative intensities, but the concentrations so obtained are no longer in any familiar units. To prevent misunderstanding, we introduced a new unit of concentration measure, named after the late Prof. W. P. Davey of The Pennsylvania State University, pioneer in powder diffraction analysis. This measure will be discussed later. There is research in progress to bring ASTM powder patterns onto an absolute intensity scale. The pattern listing can still be given in relative units, but a single additional entry would tell the user what the absolute intensity of the strongest line is. When we know that, all the intensities can easily be converted.
One more note about the additivity criterion: If we really wished to decompose a given pattern of an unknown mixture into its additive components, then the proper way to allow for overlapping lines is to write the problem in the form of a system of simultaneous equations. The solution would then be independent of the order in which the component patterns were found.
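The contrast between subtracting found patterns and asking for the maximum compatible amount can be sketched in Python. This is a minimal illustration, assuming patterns are dictionaries keyed by an integer line position and that lines match exactly (no error window). The function name `max_compatible_fraction` and the line lists are hypothetical; in particular, the lists are not the real data of cards 7-46 and 6-0584.

```python
def max_compatible_fraction(standard, unknown):
    """Largest fraction c, 0 <= c <= 1, with c * I_standard <= I_unknown
    for every line of the standard. A standard line that is absent from
    the unknown forces the fraction to zero."""
    fraction = 1.0
    for line, i_std in standard.items():
        i_unk = unknown.get(line, 0.0)
        fraction = min(fraction, i_unk / i_std)
    return fraction

# Hypothetical, identical line lists standing in for two file patterns.
file_patterns = {
    "7-46": {251: 100.0, 290: 50.0, 410: 30.0},
    "6-0584": {251: 100.0, 290: 50.0, 410: 30.0},
}
unknown = {251: 100.0, 290: 50.0, 410: 30.0}

# Whole-pattern comparison: every file pattern is tested against the
# complete unknown, so both candidates are reported.
whole_pattern = {card: max_compatible_fraction(pattern, unknown)
                 for card, pattern in file_patterns.items()}

# Subtraction strategy, for contrast: removing the first match leaves
# nothing for the second candidate to match.
residual = {line: intensity - file_patterns["7-46"].get(line, 0.0)
            for line, intensity in unknown.items()}
after_subtraction = max_compatible_fraction(file_patterns["6-0584"], residual)
```

Under these assumptions the whole-pattern comparison reports both cards at a fraction of 1.0, while the subtraction route leaves the second card at 0.0, which is the failure described above.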
DETAILS OF THE ASTM SEARCH FILE
There are about 12,000 active patterns in the ASTM file (1966) and some contain up to 106 d-spacings. The first requirement is to put these patterns onto a magnetic tape. To take advantage of the speed of integer arithmetic in a computer (4 μsec. for integers compared with 19 μsec. for decimal numbers on an IBM 7074), the reciprocal of the spacing multiplied by 1000 is the number initially stored.
Figure 1. A portion of the direct file
The possible range of these numbers, called the packed spacing (PS), is from 5 (for 200 A.) to 1666 (for 0.6 A.). PS is defined by a procedure:
PS = (1/d) * 1000
where * merely indicates procedural multiplication. There are N = 12,000 powder patterns, each having on the average x̄ = 35 powder lines. The total number of powder lines is Nx̄ = 420,000 ≈ 0.5 × 10^6. If packed lines are represented by integers from 5 to 1666, there are about 1666 possible packed spacings. Some of these packed spacings will not be filled: suppose the number of used packed lines is P = 1300. When Nx̄ = 420,000, each reciprocal spacing will carry, on the average, 320 ASTM numbers. In the direct file each ASTM number carries, on the average, 35 packed spacings and intensities (PSI), and there are 12,000 ASTM reference patterns; therefore 420,000 PSI are entered. An inverted file has 1300 PS, each with about 320 ASTM numbers, and so also contains about 420,000 ASTM number entries. Therefore it would seem that neither file has an advantage over the other. However, in a direct search each unknown line is compared with 420,000 PS, while in an inverted search each characteristic number representing the spacing is compared with only 1300 PS. If all the comparisons could be handled internally (it takes 10 μsec. per comparison), the total time per unknown line using a direct file would be 4.2 sec., while for an inverted file it would take 0.01 sec. If the unknown pattern contains, say, 20 lines, the time is 84 sec. compared with 0.2 sec. The intensity will also be reduced to a single digit number by the following formula:
I1 = 5 log10 I3
where I3 is the 3-digit intensity and I1 is the 1-digit integer. Thus the conversion table becomes
I3:  0-1   2   3   4-6   7-10   11-15   16-25   26-39   40-63   64-100
I1:  0     1   2   3     4      5       6       7       8       9
The redefined intensity I1 is added to the value of ((1/d) * 1000) * 10 to form a single number which is characteristic of a decimal d-spacing and a relative intensity. The numbers formed in this manner are called packed spacings and intensities (PSI). Figure 1 shows an example of the direct file as it appears on magnetic tape. Each ASTM number is followed by all PSI for the particular pattern. Thus, in the programming language,
PSI = ((1/d) * 1000) * 10 + I1 = PS * 10 + I1
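The packing procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, the one-digit intensity is taken from the conversion table by lookup rather than computed from the logarithmic formula (the table is the more specific statement and differs from a plain logarithm at exact powers such as 10 and 100), and PS is truncated to an integer, which is assumed because it reproduces the quoted limits of 5 and 1666.

```python
from bisect import bisect_right

# Lower bound of each I3 class in the conversion table; the list index is I1.
I1_LOWER_BOUNDS = [0, 2, 3, 4, 7, 11, 16, 26, 40, 64]

def packed_spacing(d):
    """PS = (1/d) * 1000 as an integer (truncation is an assumption)."""
    return int(1000.0 / d)

def one_digit_intensity(i3):
    """Look up the one-digit intensity I1 (0-9) for a relative intensity I3 (0-100)."""
    return bisect_right(I1_LOWER_BOUNDS, i3) - 1

def pack_psi(d, i3):
    """PSI = PS * 10 + I1: one integer carrying both spacing and intensity."""
    return packed_spacing(d) * 10 + one_digit_intensity(i3)

def unpack_psi(psi):
    """Split a PSI back into (PS, I1) using integer arithmetic only."""
    return divmod(psi, 10)

# A hypothetical line at d = 2.0 A. with relative intensity 100.
example_psi = pack_psi(2.0, 100)
example_ps, example_i1 = unpack_psi(example_psi)
```

Because the last decimal digit holds I1 and the remaining digits hold PS, both values are recovered with a single integer division, which is the point of Criteria 7 and 8.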
BUILDING THE TAPES
First a program is used to construct the direct file using an unblocked version of the tape available from ASTM (which has a blocking factor of 50). This program not only calculates the values of PSI but also orders them so that I/I₀ is decreasing. This ordering on decreasing I/I₀, rather than on d as in the ASTM 3 × 5 in. cards, is used to full advantage in later programs. Rather than writing one ASTM pattern per physical record on tape, these direct file (DF) records are written with 20 ASTM patterns per block. (This is because each pattern takes approximately 1/2 in. of magnetic tape, and without the blocking factor of 20 there would be a wasteful 3/4 in. inter-record gap between each ASTM pattern: 25 in. of tape vs. 10 3/4 in. of tape.) The 12,000 ASTM patterns then take up 600 blocks on tape. After the direct file is built, the result is a single tape with all ASTM patterns on it, with the characteristic values for each pattern following the ASTM number. This tape is then sorted to make sure that the ASTM numbers are ascending for later search procedures. This finishes the building and processing of the direct file.
Now an unsorted inverted file is built. This procedure uses the direct file as its data input. The programmer chooses, by means of certain parameters, the minimum intensity of the lines to go on the inverted file. Let us suppose we use the single digit 7 as the parameter. This would mean that all lines from the direct file with a single-digit intensity (I1) of 7 or more would go into the inverted file (all lines with I/I₀ ≥ 26). This program constructs a tape with all PS which have I1 ≥ 7. The construction is made by associating with every PS an ASTM number. Thus the output consists of PS, ASTM number; PS, ASTM number, etc. These “doublets” are written 250 to a block, and when the entire direct file is subjected to this program, the result is an unordered tape of intense lines with doublets on it. After this pass is completed, the tape is ordered decreasingly on the value of PS. The result of this ordering is a tape on which the value of PS is decreasing while the ASTM number corresponding to each PS is “just tagged along.” Figure 2 shows a segment of the inverted file as it appears on magnetic tape.
Figure 2. A portion of the inverted file
The third major program builds a packed inverted file tape (IFT). This tape is the other important tape in the search system (the direct file being the first). This third program reads the sorted inverted file and records the ASTM numbers which share the same characteristic value PS. It then writes four of these strings (PS and all ASTM numbers) in a block. Again, by writing the tape (IFT) with a blocking factor (4 in this case), time is saved because of a shortening of magnetic tape in the data processing of the unknown. This concludes the building of tapes for the search system. This blocked and packed inverted file is constructed so that a speedy search can be carried out to find which ASTM patterns best fit the unknown lines.
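The three passes that turn the direct file into the packed inverted file can be sketched in Python. This is a minimal in-memory illustration: tape blocking is not modeled, the function name `build_inverted_file` is hypothetical, and the small direct file with integer ASTM numbers is invented for the example.

```python
from itertools import groupby

def build_inverted_file(direct_file, min_i1=7):
    """direct_file: dict mapping an ASTM number to its list of PSI values
    (PSI = PS * 10 + I1).

    Returns a list of (PS, [ASTM numbers]) strings ordered on decreasing
    PS, keeping only lines whose one-digit intensity I1 is at least min_i1.
    """
    # Pass 1: unordered (PS, ASTM number) doublets for the intense lines.
    doublets = []
    for astm_number, psi_values in direct_file.items():
        for psi in psi_values:
            ps, i1 = divmod(psi, 10)
            if i1 >= min_i1:
                doublets.append((ps, astm_number))

    # Pass 2: order decreasingly on PS; the ASTM number just tags along.
    doublets.sort(key=lambda doublet: doublet[0], reverse=True)

    # Pass 3: collect all ASTM numbers sharing one PS into a single string.
    return [(ps, [number for _, number in group])
            for ps, group in groupby(doublets, key=lambda doublet: doublet[0])]

# Hypothetical direct file: three patterns, PSI values in decreasing intensity.
direct_file = {
    101: [5719, 4007, 3335],
    102: [5718, 6254],
    103: [4008, 5713],
}
inverted_file = build_inverted_file(direct_file)
```

With the threshold of 7, the weak lines (those ending in 5, 4 and 3 here) never reach the inverted file, so only PS 571 and PS 400 appear, each followed by the patterns that share it.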
DETAILS OF THE ASTM SEARCH
The search system for unknowns is now simple. First an unknown pattern is read in (intensities are not used at this point) together with a window, or error factor. This error factor is a constant in the program which must be determined by each particular diffractionist. It is well known that the value of error for any d-spacing increases as the value of the d-spacing increases. If the absolute value of this error (NBS data - older data) is plotted as a function of the measured d-value, a curve in real space approximately obeying a formula Kd^2 is produced. In reciprocal space this error corresponds to a constant value independent of d (17). In other words, if a constant error value is used in reciprocal space, an error proportional to the square of the d-value is obtained. Suppose that in the range of 1.50 A., the experimental data is read to within ±0.01 A. This would mean that in our system, the error window would be ±4. It can be easily seen that this window of 4 would correspond to an accuracy of ±0.05 A. at 3.52 A. More numerical values follow, for a spacing of 1.50 A.:
±0.02 A. → ±8
±0.01 A. → ±4
±0.005 A. → ±2
±0.002 A. → ±1
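The relation between an error in angstroms and a window in PS units can be checked with a short Python sketch. It assumes the first-order relation between a small change in d and the change in 1000/d; the function names are hypothetical.

```python
def window_in_ps_units(d, delta_d):
    """Approximate half-width in PS units (PS = 1000/d) for an error of
    +/- delta_d angstroms at spacing d, using |delta(1/d)| = delta_d / d**2."""
    return 1000.0 * delta_d / d ** 2

def error_in_angstroms(d, window):
    """Inverse relation: the spacing error covered by a PS window at spacing d."""
    return window * d ** 2 / 1000.0

# Windows at 1.50 A. for the four accuracies listed above.
for delta_d in (0.02, 0.01, 0.005, 0.002):
    width = window_in_ps_units(1.50, delta_d)
    print(f"+/-{delta_d} A. at 1.50 A. -> about {width:.1f} PS units")

# The same window of 4 PS units expressed in angstroms at 3.52 A.
error_at_3_52 = error_in_angstroms(3.52, 4)
```

The computed widths come out near 8.9, 4.4, 2.2 and 0.9, so the tabulated values of 8, 4, 2 and 1 appear to be convenient whole-number windows close to these figures, and a window of 4 at 3.52 A. corresponds to just under 0.05 A.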
In choosing the experimental window, the user should decide the accuracy of his experiment and determine from these tables what the size of “his window” should be. Once a window is found to be correct, it should be used consistently for the search program. If the window is too large, the search will find too many identifications, and the time of the runs will increase. If the window is too small, too few identifications will be made and it will be necessary to rerun the entire search, as the correct answer may have been missed. This error factor is the range of PS you wish to accept. The program, SCH, searches through the IFT and finds all ASTM numbers which have the same d-spacings (1/d × 1000) within the limits of error.
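A minimal sketch of this search pass follows. It assumes the packed inverted file is a list of (PS, ASTM numbers) strings and that PS = 1000/d truncated to an integer; the function name, the tiny inverted file, and the ASTM numbers 101 and 202 are hypothetical. A real tape search would read the sorted file sequentially rather than scan a list for every line.

```python
def search_inverted_file(ift, unknown_spacings, window):
    """Return the unordered ASTM numbers whose PS lies within +/- window
    of the packed spacing of any unknown line."""
    hits = []
    for d in unknown_spacings:
        ps = int(1000.0 / d)
        lower, upper = ps - window, ps + window
        for entry_ps, astm_numbers in ift:
            if lower <= entry_ps <= upper:
                hits.extend(astm_numbers)
    return hits


# Hypothetical packed inverted file, ordered on decreasing PS.
ift = [(635, [202]), (449, [101]), (317, [101, 202])]
hits = search_inverted_file(ift, unknown_spacings=[3.146, 1.573], window=1)
print(hits)
```

A pattern appears in the output once for every unknown line it matches, so the number of occurrences can later serve as a match count.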
The program does this searching for all the unknown lines which are read in and records the ASTM numbers (unordered) on magnetic tape. This tape of ASTM numbers, which identifies the patterns that have one or more lines with the same characteristic value as the unknown, is then sorted in increasing order. The result of this sort is a magnetic tape which has the matching ASTM numbers in ascending order. The final phase uses this ordered magnetic tape of ASTM numbers and the direct file. If an ASTM pattern has a sufficient number of occurrences from SCH, the standard reference pattern has lines which coincide with the characteristic lines of the unknown. This pattern is then examined by looking at the reference pattern in the direct file. Since the direct file is ordered with the single digit intensity ($I_1$) decreasing, it is a simple job to see if the most intense lines are present in the unknown. (Note here that the reference patterns are compared into the unknown.) Thus the problem of coincident lines does not present a difficulty in this case. If the most intense lines, $I_1 = 9$, are found to be present, then the lines with $I_1 = 8$ are used, and so on. From this comparison, the so-called Davey Maximum Concentration (DMC) is obtained by calculating the largest residue. The Davey Maximum Concentration of a substance is a measure which can be computed solely from comparison of the relative intensities of the ASTM and unknown patterns. DMC is defined as the largest number $0 \le \mathrm{DMC} \le 1$ for which the relation $I_{\mathrm{ASTM}} \cdot \mathrm{DMC} \le I_{\mathrm{unknown}}$ is satisfied for all ASTM lines. In other words, DMC is such a number that there is at least one ASTM d-spacing within the pattern for which $I_{\mathrm{ASTM}} \cdot \mathrm{DMC} = I_{\mathrm{unknown}}$, and there is no ASTM d-spacing within the pattern for which $I_{\mathrm{ASTM}} \cdot \mathrm{DMC} > I_{\mathrm{unknown}}$. In these statements, the $I_{\mathrm{ASTM}}$ values are relative intensities in a given ASTM pattern, and the $I_{\mathrm{unknown}}$ values are relative intensities of an unknown mixture, both expressed on the scale $I_{\max} = 100$. DMC is measured in Davey units and is expressed, in the number formula used before, as

$$\mathrm{DMC} = -\left(I_{1,\mathrm{ASTM}} - I_{1,\mathrm{unknown}}\right)_{\max} = -5\left(\log_{10} I_{\mathrm{ASTM}} - \log_{10} I_{\mathrm{unknown}}\right)_{\max}$$
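The definition of DMC as the largest factor that keeps every scaled reference line at or below the unknown can be illustrated on the linear intensity scale. The sketch below is not the authors' program: it assumes the lines have already been matched by d-spacing, treats a reference line missing from the unknown as having zero intensity, and uses hypothetical names and data.

```python
def davey_maximum_concentration(reference, unknown):
    """Largest DMC in [0, 1] with I_ref * DMC <= I_unknown for every reference line.

    Both arguments map a matched d-spacing to a relative intensity on the
    scale where the strongest line is 100. The reference must not be empty.
    """
    ratios = [
        unknown.get(d, 0.0) / i_ref
        for d, i_ref in reference.items()
        if i_ref > 0
    ]
    return min(1.0, min(ratios))


# Hypothetical reference pattern and unknown mixture.
reference = {3.146: 100.0, 2.224: 60.0, 1.816: 20.0}
unknown = {3.146: 30.0, 2.224: 9.0, 1.816: 5.0, 2.810: 100.0}
dmc = davey_maximum_concentration(reference, unknown)
print(dmc)
```

Here the 2.224 line is the limiting one (9/60), so the reference phase cannot account for more than 15% of the scale of the unknown; the extra unknown line at 2.810 plays no role because reference lines are compared into the unknown, not the reverse.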
Figure 3. Printed results for SCH program for Case 1. (The printout, titled “THIS IS A TEST WITH KCL PATTERN PRESENT 100 PERCENT,” lists each input d-spacing with its packed search spacing and upper and lower bounds, for example 1.4070: 710, 711, 709, followed by the messages that the tape search phase is completed and the tape is ready for sorting.)
Figure 4. Printed results for INT program for Case 1. (The printout lists the run parameters, the unknown spacings and intensities, the search spacings with their upper and lower bounds, and three candidate patterns with their log intensity matches: KCl, 100 percent; βAgZn, 15 percent; SnTe, 10 percent.)
Since the intensities have been reduced to single digit integers, the differences between intensities (reference and unknown) will also be single digit integers. Hence in our system the DMC can only assume the following values: 0.01, 0.02, 0.03, 0.06, 0.10, 0.15, 0.25, 0.39, 0.63, or 1.00, as seen in the intensity conversion table. DMC is by definition a fictitious concentration derived under the assumption that both $I_{\mathrm{ASTM}}$ and $I_{\mathrm{unknown}}$ are correct and on an absolute scale. It could be converted into actual weight % or mole %, if the conversion factor is known, under the assumption of a linear mixing law. In testing an unknown pattern, all reference lines are compared into the unknown, and the largest difference between $I_{\mathrm{ASTM}}$ and $I_{\mathrm{unknown}}$ is recorded. This step is repeated until a line in the reference cannot be accounted for in the unknown. The larger of the two (difference or intensity of unmatched line) determines the DMC. Once the DMC is determined, an idea of the relative concentration is obtained and is given as a log intensity match.
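The ten permitted DMC values are consistent with a logarithmic intensity scale of five steps per decade. The sketch below assumes, as an inference from the listed values rather than from the intensity conversion table itself, that a largest single digit difference k corresponds to a factor of 10^(-k/5), truncated to two decimals. The function name is hypothetical.

```python
import math


def dmc_from_difference(k):
    """DMC for a largest single digit intensity difference k (0 to 9),
    assuming five logarithmic steps per decade and truncation to two decimals."""
    # The small offset guards against floating point error at exact values
    # such as k = 5, where the factor is exactly 0.10.
    percent = math.floor(100 * 10 ** (-k / 5) + 1e-9)
    return percent / 100


values = [dmc_from_difference(k) for k in range(10)]
print(values)
```

Under this assumption a difference of 0 gives 1.00 and a difference of 9 gives 0.01, with the intermediate values matching the list in the text.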
TO SUMMARIZE THE DETAILS OF THE SEARCH

D-I tapes are built (acquired from ASTM).
(1) A direct file with PSI ordered with I/I0 decreasing is built.
(2) Direct file is sorted on ASTM numbers.
(3) An unsorted inverted file (I/I0 greater than chosen limit) is built (intensity is not carried along).
(4) Inverted file is sorted on PS.
(5) Packed inverted file is built from sorted inverted file, ordered on PS with all ASTM numbers following.
(6) Search inverted file to give ASTM patterns retrieved.
(7) Sort these ASTM numbers.
(8) Intensity phase program with limits gives best fitting patterns as output.

Steps 1, 2, 3, 4, and 5 are done once a year, whenever a new set is produced by ASTM. Once the direct file and inverted file are built, steps 6, 7, and 8 are run for every unknown pattern. The time required for 6 is about 1.5 min.; for 7, about 0.5 min.; and for 8, about 2.0 min. The sum is therefore 4.0 min. per unknown pattern on a tape-oriented IBM 7074, regardless of the number of phases present. On a larger and faster disk-oriented computer, this time would be 1 min. Many options are written into the programs for the acceptance of “an identification.” Although the system described uses a magnetic tape oriented 10,000-word computer, the programs will run on both larger and smaller machines. However, the computer must have adequate file storage space, either on magnetic tapes or in disk storage, for the direct and inverted files.
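Step 7 and the hand-off to step 8 can be sketched as follows: the retrieved ASTM numbers are put in ascending order and only patterns retrieved often enough are passed on to the intensity phase. This is a hedged illustration; the function name and the threshold parameter are hypothetical, and the ASTM numbers in the example are used only as sample data.

```python
from collections import Counter


def candidates_for_intensity_phase(hits, min_matches):
    """Order the retrieved ASTM numbers and keep those that were retrieved
    at least min_matches times (one retrieval per coincident line)."""
    counts = Counter(hits)
    return [astm for astm in sorted(counts) if counts[astm] >= min_matches]


# Unordered output of a search pass (sample data only).
hits = [40566, 10604, 40566, 50534, 10604, 40566]
selected = candidates_for_intensity_phase(hits, min_matches=2)
print(selected)
```

Sorting first mirrors the tape procedure, in which equal ASTM numbers become adjacent and can be counted in a single sequential read.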
SOME RESULTS OF THE SYSTEM

The search procedure has been run on many examples at the Materials Research Laboratory, but only a limited number of selected results will be shown here. Each result shows the system with a different number of components in the multiphase mixture. Some are theoretical, but most are experimental patterns.

Case 1
The first example is a theoretical test pattern of KCl, with KCl present at 100%. Figure 3 shows the printout of the results of this pass, called SCH for search. From the upper and lower bounds it can be seen that the error window in reciprocal space is ±1, or 0.002 at 1.00 A. in real or direct space. The result of this program, the unordered ASTM numbers, is written on a magnetic tape in 93 sec. After the results are ordered (a procedure which produces no output but takes about 60 sec.), the ordered ASTM numbers are on magnetic tape. This list is then used in the program INT (intensity comparison phase). The results of this program are seen in Figure 4. All data values used in this search are printed out in a message, so that the user can see the limitations imposed. The parameters can thus be found for every run. It can be seen that only reference patterns which have two coincident lines with the unknown are used in the search, that those with a DMC of less than 5% are not accepted, and that three lines must match for an identification to be considered correct. It can be seen from the d-value range (8.00 to 1.39 A.) that the investigation was from 11.0° to 67.4° 2θ (Cu radiation). Again the error window is given, and the name, spacings, and intensities of the unknown are printed next. The search range used, with the new intensities and upper and lower bounds, concludes the information given to the user. With these data, all parameters used in the identification are shown. The results of the program INT are given next. It can be seen that three components (KCl, βAgZn, and SnTe) can be identified as possibly being in the unknown pattern using the parameters listed above. The substances are now identified by referring to the ASTM file. The chemical compositions are presently added at the end of the line by the user. It can be seen that the program INT takes 127 sec. to identify the pattern.
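The angular range quoted for the d-value limits follows from Bragg's law, λ = 2d sin θ. The short check below assumes a mean Cu Kα wavelength of 1.5418 A., a value that is not stated in the text; the function name is hypothetical.

```python
import math

CU_K_ALPHA = 1.5418  # angstroms; assumed mean Cu K-alpha wavelength


def two_theta_degrees(d_angstrom, wavelength=CU_K_ALPHA):
    """Diffraction angle 2-theta in degrees for spacing d, from Bragg's law."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d_angstrom)))


low_angle = two_theta_degrees(8.00)   # close to 11 degrees
high_angle = two_theta_degrees(1.39)  # close to 67.4 degrees
print(round(low_angle, 1), round(high_angle, 1))
```

The two limits agree with the stated range to within about 0.1° 2θ; the small residual depends on the exact wavelength assumed.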
The user must ascertain which of the three results is correct, even though for this example KCl is known to be correct. The general rule of
(1) Biggest %
(2) Biggest number of matches
(3) Product of (1) and (2)
usually gives the correct results immediately without chemical knowledge.
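The ranking rule can be sketched as a sort on the product of the percentage and the number of matched lines. The candidate names, percentages, and line counts below are hypothetical and are not taken from the printouts; the function name is likewise an assumption.

```python
def rank_candidates(results):
    """Order candidates by (log intensity match in percent) x (matched lines),
    largest product first. Each result is (name, percent, matched_lines)."""
    return sorted(results, key=lambda r: r[1] * r[2], reverse=True)


# Hypothetical output of the intensity phase for three candidate phases.
results = [
    ("phase B", 15, 4),
    ("phase A", 100, 3),
    ("phase C", 10, 5),
]
ranked = rank_candidates(results)
print([name for name, _, _ in ranked])
```

The product rewards a candidate that is both abundant and well supported by matched lines, which is why it tends to separate the true phase from accidental coincidences.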
Figure 5. Printed results for INT program for Case 2. (The printout, for the theoretical pattern of 85 BaSO4 and 15 ZnO, lists the run parameters, the unknown spacings and intensities, and two candidate patterns: BaSO4 with a log intensity match of 100 percent and ZnO with 15 percent. The execute time was 125 seconds.)
Figure 6. Printed results for INT program for Case 3. (The printout, for an experimental pattern, lists the run parameters, the unknown spacings and intensities, and the candidate patterns FeS2, BaSO4, and ZnO with their log intensity matches. The execute time was 136 seconds.)
Figure 7. Printed results for INT program for Case 4. (The printout, for Frevel's sample from Anal. Chem., Vol. 37, No. 4, page 479 (As + Se + As2O3), lists the run parameters, the unknown spacings and intensities, and six candidate patterns with their log intensity matches. The execute time was 124 seconds.)
Figure 8. Printed results of input of INT program for Case 5. (The printout lists the run parameters and the unknown spacings and intensities for Frevel's sample U-3.6, whose description names Y2O3, As2O3, Ag3AsO4, KH2AsO4, and PbSeO4.)
Figure 9. Printed results of output of INT program for Case 5. (The printout lists 24 candidate patterns, each with the number of lines compared and its log intensity match. The execute time was 193 seconds.)
Case 2
The next sample is the theoretical two-phase pattern of BaSO4-ZnO. Only the INT phase will be given, since the SCH phase and the sort phase give no information which is necessary for the results. The results show that only the two phases BaSO4 and ZnO fulfill the given conditions listed at the top of the printout. The Davey Maximum Concentration of 100% for BaSO4 and 15% for ZnO compares with 85% BaSO4 and 15% ZnO as given by the mixing of the reference patterns. It can be seen in Figure 5 that these DMC's, which are given as a log intensity match, agree very well with the given concentration.

Case 3
To compare an experimental pattern with the example given in Case 2, a pattern of a specimen of BaSO4-ZnO was made. No special experimental technique (no internal standards or high resolution scan) was used. The results in Figure 6 show that FeS2, BaSO4, and ZnO fulfill the parameters chosen by the diffractionist. It should also be noted that, compared with Case 2, the DMC of the individual phases is greatly changed. This is because the intensities measured were simple estimates of peak height and no attempt was made to ascertain the integrated intensities.

Case 4
Next an example of an accurately measured pattern was run. When the multiphase pattern of this sample was analyzed by the film matching technique of Hanawalt, Rinn, and Frevel (4), a practice devised by them and used in Dow's laboratory for almost three decades, no identification could be made. These diffractionists tried to use the standard methods which they devised for multiphase identifications. This pattern was run with an internal standard, and the intensities were measured on a photodensitometer. It is thus possible to use a smaller error window and also to get better results on the DMC of the output. Although the input sample consisted of As and Se, the results given by Frevel (2) show that As, Se, and As2O3 are successfully identified as correct. The results under the present system also show that (Sb,As)2O3, K2CO3·H2O, and BaBl&b209 can be identified as possibly present in the specimen. Thus the unknown pattern has been successfully reduced to 6 out of 12,000 patterns in approximately 2 min. The fact that 3 out of 6 are correct shows the power of the system [4 if you consider (Sb,As)2O3 as also correct] (see Figure 7).

Case 5
The last case, Figure 8, concerns another multiphase pattern. The measured d-spacings and intensities are given by Frevel (2). The results of Frevel show that five phases can be successfully identified in the sample. The results shown here demonstrate that even with precise data (both d-spacings and intensities, from a calibrated Guinier camera and a microdensitometer), there is still room for the knowledge of the user in obtaining the final correct results. This system shows 24 possible answers, Figure 9, of which Frevel, using x-ray fluorescence analysis to identify the elements Y, As, Ag, Pb, Se, and K, has successfully found five phases present. When the chemistry (Figure 10) of these results is examined, No. 3 is also found to be in the sample and was overlooked by the author. It should be noted that in approximately 3 min., six phases in the unknown pattern U-3.6 were identified. It would be nearly impossible to solve this problem using books or the ASTM file alone without the computer’s help.
CONCLUSIONS

It can thus be seen that the search system described in this paper works well without a chemical analysis of the unknown. The present ASTM file of powder diffraction standards can be used effectively with this system. This system, which is much more powerful than any previous hand method of search, may stimulate the large scale application of x-ray diffraction methods to chemical analysis. It can free the diffractionist from the fatiguing task of searching for matches and will allow him to concentrate on the textured information revealed in a pattern. The timings given here are for an IBM 7074 computer, which is a 10,000-word magnetic tape computer. If a large computer with disk storage were used, the time required would be less than 1 min. The future plans of the authors are to adapt the program to an IBM 360/67 with telecommunication mode for direct solution of the patterns from the laboratory.
Figure 10. Chemical identification of output for Case 5. (The figure numbers the 24 retrieved ASTM patterns for sample U-3.6, from 10604 to 160728, in order; X denotes components.)
REFERENCES
(1) ASTM Joint Committee on Powder Diffraction Standards, Philadelphia, Pa., 1916.
(2) Frevel, L. K., Anal. Chem. 37, 471 (1965).
(3) Frevel, L. K., Ind. Eng. Chem., Anal. Ed. 16, 209 (1944).
(4) Hanawalt, J. D., Rinn, H. W., Frevel, L. K., Ibid., 10, 457 (1938).
(5) Hargreaves, A., “X-ray Diffraction by Polycrystalline Materials,” H. S. Peiser, H. P. Rooksby, and A. J. C. Wilson, Eds., p. 298, The Institute of Physics, London, 1955.
(6) Hull, A. W., J. Am. Chem. Soc. 41, 1168 (1919).
(7) “Index (Inorganic) to the Powder Diffraction File (1965),” ASTM Special Tech. Publ. PD15-15i, Philadelphia, Pa.
(8) Johnson, G. G., Jr., Vand, V., Twenty-fourth Pittsburgh Diffraction Conference, Paper No. B-1, 1966.
(9) Nichols, M., Ibid., Paper No. B-3, 1966.
(10) Stokes, A. R., “X-ray Diffraction by Polycrystalline Materials,” H. S. Peiser, H. P. Rooksby, and A. J. C. Wilson, Eds., p. 409, The Institute of Physics, London, 1955.
(11) Vand, V., Fourteenth Pittsburgh Diffraction Conference, Paper No. 8, 1956.
(12) Warren, B. E., J. Am. Ceram. Soc. 17, 73 (1934).