Statistical Procedures in Chemical Investigations

W. L. GORE
E. I. du Pont de Nemours & Company, Inc., Arlington, N.J.

Industrial and Engineering Chemistry, Vol. 42, No. 2, February 1950

Statistical methods are finding widespread use in many fields of scientific inquiry, and particular progress is being achieved in chemical fields. Industrial chemists are finding that many hitherto difficult problems become solvable when these methods are brought to bear on the design of experiments and analysis of data. Two examples are presented here of typical problems encountered in industrial research and pilot plant development where the statistical approach has shed new light on the mechanisms involved.

The recent trend toward the use of statistical methods in the fields of chemical research and development, as well as their successful use in other lines of scientific endeavor, has led many chemists to question how these new techniques can be applied to their work. Extensive trials of statistical methods in a variety of fields have demonstrated their utility in scientific research and development, and no fields of scientific investigation have come to the author's attention where these methods cannot be used. The industrial chemist, in particular, can use the statistical approach to great advantage.

Before attempting to show some examples of the application of statistical methods, it appears worth while to consider briefly the basic nature of statistical procedures. The relationship between statistical philosophy and procedures and the philosophy and procedures of the scientific method is intimate, the statistical procedures being but the mathematical formulation of certain aspects of the scientific method. This relationship can be shown best by considering the functions of both methods.

The scientific method can be divided into three fairly well-defined functions: the creation of a hypothesis; the design of an experiment to test the hypothesis; and the test of the truth of the hypothesis by the results of the experiment. These functions are not always distinctly separated, and the original hypothesis is frequently evolved by repeated interplay of these functions into a theory quite different from that which was originally postulated. The interplay of these three functions of the scientific method appears to require two rather different types of thinking which may be designated as type I and type II.

Type I is the abstract, imaginative, unfettered, far-range thinking required to create a hypothesis which defines the general mechanism regulating the occurrence of individual phenomena. This type of thinking is developed by use, is dependent upon training and background knowledge, but is primarily an inherent talent present (or lacking) in all of us in varying degrees.

Type II thinking required to perform the functions of the scientific method is less exalted but just as necessary. Type II is the methodical, systematic, calculative thinking necessary to devise an experiment which will test the truth of the postulated hypothesis and which is required in evaluating experimental results to determine if they support the truth of a hypothesis. Type II thinking is primarily statistical in its characteristics. This may be shown by considering the functions performed by statistical procedures. These functions may also be classified under three headings: principles of the design of experiments and the collection of data; methods for reducing data to show relationships and to determine the single "best" estimates of parameters; and methods for calculating the reliability of experimental results and tests of statistical significance.

Obviously, these functions are coincident with the last two functions listed for the scientific method. Unfortunately many, if not most, chemists are lacking in fundamental knowledge of the principles involved in designing experiments so that the results will have the required precision, so that interaction effects will be determined, and so that the required answer is found with a minimum of effort. Many experts in this field believe that current experimental work in the chemical field is only 25 to 75% efficient because of limited use made of available techniques for the design of experiments. In addition to the help one can get from statistical procedures in planning experiments, these procedures also extract all available information from the data, and often make it possible to test hypotheses which are impractical to test by nonstatistical methods.

When the final function of the scientific method (the test of the truth of the hypothesis) is carried out as a test of statistical significance, a definite probability is associated with the decision as to whether the hypothesis is accepted as true or rejected as being false. This criterion eliminates the elements of optimism, pessimism, faith, and hope which frequently lead to drawing improper conclusions or trusting insufficient evidence.

An axiom is prevalent among chemists that controlled experimentation consists in "holding all variables constant except the one under study and determining its independent effect." Unfortunately, in chemical mechanisms the effect of variables often is not independent and a simple interpretation of this maxim leads to confusing and anomalous results from experimental work. A branch of statistical methods known as design of experiments is being developed which has great potential value for chemists engaged in experimental work. A clear understanding of the rigorous requirements of experimental arrangements to give specified information is a worth-while educational goal for any chemist. The following examples taken from the field of research in plastics are presented as a demonstration of the danger of poorly designed experiments as well as the power of properly designed ones. These examples cover only a narrow aspect in the field of designing experiments; more complete treatments are listed in the bibliography.

FLEXURAL STRENGTH OF AN EXPERIMENTAL POLYMER

A certain vinyl-type monomer was evaluated as a casting resin. The castings were prepared by polymerizing the liquid monomer. This polymerization was carried out in glass cells immersed in a heated oil bath. The monomer was catalyzed with benzoyl peroxide and then held at a given temperature for a measured period of time. One of the measurements made on the castings was flexural strength. As it was well known that the catalyst concentration, temperature, and length of time in the bath might affect the strength of the casting, it was necessary to vary these factors to determine which combination gave the best product. Accordingly, an experiment was designed to measure the effect of each factor. Proceeding in the classical manner, a control casting was prepared first:

                                          Catalyst   Temp. of   Time in   Flexural Strength,    Increase in Flexural
                                          Concn.,    Bath,      Bath,     Av. of 5 Test Bars,   Strength over Control,
                                          %          ° C.       Min.      Lb./Sq. Inch          Lb./Sq. Inch
Control                                   1          100        20         9,980                ...
Effect of increased catalyst              2          100        20        11,840                1860
Effect of increased temperature of bath   1          120        20        11,600                1620
Effect of increased time in bath          1          100        60        11,570                1590

From the results of the above series it was apparent that the optimum conditions for a high flexural strength casting are the increased time, temperature, and catalyst concentration over the control. Furthermore, the increases in flexural strength indicated a value of 9980 + 1860 + 1620 + 1590 = 15,050 pounds per square inch for a casting made under the preferred conditions. In order to check this, such a casting was prepared:

Catalyst concn., %                                                  2
Temp. of bath, ° C.                                                 120
Time in bath, min.                                                  60
Flexural strength of casting, av. of 5 test bars, lb./sq. inch      9830
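The additive reasoning behind the 15,050 figure can be retraced in a few lines. The following is only an illustrative sketch in Python; the variable names are assumptions introduced here and are not part of the original work. It adds the three single-factor gains to the control result and compares the total with the casting actually obtained.

```python
# Average flexural strengths (lb./sq. inch) from the one-factor-at-a-time series.
control = 9980
single_factor_results = {
    "catalyst raised to 2%": 11840,
    "bath raised to 120 C": 11600,
    "time raised to 60 min": 11570,
}

# Gain of each single-factor casting over the control.
gains = {name: value - control for name, value in single_factor_results.items()}

# Additive prediction: assumes the three gains simply add (no interaction).
predicted = control + sum(gains.values())

# Casting actually made with all three factors at their "preferred" levels.
observed = 9830
shortfall = predicted - observed

print(gains)
print(predicted, observed, shortfall)
```

The size of the shortfall, several times larger than any single gain, is what signals that the factors do not act independently.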

This anomalous result indicated that something had gone wrong with the evaluation, a not uncommon occurrence with this type of experimentation. The explanation in this case, as further work showed, was not that castings are so unreliable that predictions are unsafe, but that the experimental design did not include an evaluation of interaction between factors. This very common method of designing experiments is perhaps the poorest way in which it can be done and is wasteful in precision as well as limited to a very narrow interpretation of the results. A method more efficient in both these respects which yet does not require more work is available in the "Latin square" type of experimental design. Table I gives the results of such an experiment.

TABLE I. FLEXURAL STRENGTH VALUES (POUNDS PER SQUARE INCH) OF FOUR SHEET CASTINGS

Casting   Bath, ° C.   Catalyst, %   Time in Bath, Min.   Test Bars                                   Av.
I         100          1             20                    9,500  10,650   9,700   9,950  10,100       9,980
II        100          2             60                   11,900  11,850  11,850  12,000  12,100      11,940
III       120          2             20                   10,550  11,000  11,100  11,350  11,200      11,040
IV        120          1             60                   10,900  11,500  11,850  11,700  11,650      11,520

Effect of Factors

Time in bath
  20 min., av. of I and III    = 10,510 lb./sq. inch
  60 min., av. of II and IV    = 11,730
  Difference                   =  1,220
Temperature of bath
  100° C., av. of I and II     = 10,960 lb./sq. inch
  120° C., av. of III and IV   = 11,280
  Difference                   =    320
Catalyst concentration
  1% catalyst, av. of I and IV    = 10,750 lb./sq. inch
  2% catalyst, av. of II and III  = 11,490
  Difference                      =    740
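The Latin square arithmetic can be checked with a short Python sketch. The data layout and the helper name `effect` are assumptions made here for illustration only. Each factor's effect is taken as the average of the two castings at its changed level minus the average of the two at its control level, and the estimated optimum adds all three effects to the control casting.

```python
# Casting averages (lb./sq. inch) and settings of the four Latin square castings.
castings = {
    "I":   {"time": 20, "temperature": 100, "catalyst": 1, "avg": 9980},
    "II":  {"time": 60, "temperature": 100, "catalyst": 2, "avg": 11940},
    "III": {"time": 20, "temperature": 120, "catalyst": 2, "avg": 11040},
    "IV":  {"time": 60, "temperature": 120, "catalyst": 1, "avg": 11520},
}
# Control (low) level of each factor, as in casting I.
LOW = {"time": 20, "temperature": 100, "catalyst": 1}


def effect(factor):
    """Average at the changed level minus average at the control level."""
    high = [c["avg"] for c in castings.values() if c[factor] != LOW[factor]]
    low = [c["avg"] for c in castings.values() if c[factor] == LOW[factor]]
    return sum(high) / len(high) - sum(low) / len(low)


effects = {factor: effect(factor) for factor in LOW}

# Estimated strength with every factor at its apparently preferred level.
estimate = castings["I"]["avg"] + sum(effects.values())

print(effects)
print(estimate)
```

Because each effect averages two castings of five bars each, every comparison rests on ten test bars per level even though only four castings were made.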

It is apparent that this arrangement of the factors also has given somewhat misleading information, but the misdirection is considerably less than in the first experiment since here the estimated flexural strength of the casting made at the optimum conditions is 12,260 pounds per square inch as compared to 15,050 pounds per square inch in the first experiment and 9830 pounds per square inch by actual determination. (The optimum flexural strength is calculated by adding to the control result the improvement expected with the adjustment of each factor to its preferred level.) A Latin square experiment does not give an evaluation of interactions between variables, but it does tend to minimize their misleading effects.

The second advantage of the Latin square arrangement is the gain in precision of averages. The effect of each factor is based on an average of 10 test bars rather than 5 as in the first experiment. The reliable range (± two standard errors) of a flexural strength average from 5 test bars is about ±405 pounds per square inch, while that from 10 bars is ±295 pounds per square inch. (The probability of the true average falling within plus or minus two standard errors from an observed average is 0.95; therefore, the reliable ranges given here may be considered to be 95% reliable.) These ranges indicate that the errors encountered are due to the arrangements of the experiments rather than to poor precision in measuring flexural strength, since the observed differences are much larger than can be accounted for by measurement errors.

A Latin square experiment should be considered as only exploratory and, if important effects are found, the test should be expanded to a factorial design, where each variation of every factor is combined in one of the tests with each variation of every other factor. This was done by preparing and testing four additional castings to complement the four prepared for the Latin square experiment. The results of these tests together with those from the first Latin square are shown in Table II.

TABLE II. FLEXURAL STRENGTH VALUES (POUNDS PER SQUARE INCH) OF EIGHT SHEET CASTINGS OF POLYMER

Casting   Bath, ° C.   Catalyst, %   Time in Bath, Min.   Test Bars                                   Av.
I         100          1             20                    9,500  10,650   9,700   9,950  10,100       9,980
V         100          1             60                   11,500  11,650  11,250  11,560  11,900      11,570
VI        100          2             20                   11,800  11,750  11,800  11,950  11,900      11,840
II        100          2             60                   11,900  11,850  11,850  12,000  12,100      11,940
VII       120          1             20                   11,300  11,750  11,600  11,650  11,700      11,600
IV        120          1             60                   10,900  11,500  11,850  11,700  11,650      11,520
III       120          2             20                   10,550  11,000  11,100  11,350  11,200      11,040
VIII      120          2             60                    9,900  10,150   9,400   9,800   9,900       9,830

Effect of Factors

Time in bath
  20 min., av.    = 11,110 lb./sq. inch
  60 min., av.    = 11,215
  Difference      =    105
Temperature of bath
  100° C., av.    = 11,330
  120° C., av.    = 11,000
  Difference      =    330
Catalyst concentration
  1% catalyst     = 11,170
  2% catalyst     = 11,160
  Difference      =     10

Reliabilities (95% Level)
  Averages                       = ±245 lb./sq. inch
  Difference between averages    = ±346

Average Interactions (1, 9)
  Time X temp.                = 373 lb./sq. inch
  Time X catalyst             = 328
  Temp. X catalyst            = 560
  Time X temp. X catalyst     =  45
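The factorial effects in Table II can be recomputed from the eight casting averages. The sketch below is an illustration in Python under stated assumptions: the function names are introduced here, and the interaction figures are taken as the signed contrast divided by the number of castings (8), which is the convention that appears to reproduce the tabulated magnitudes. Main effects computed from the unrounded casting averages come out near 100, -335, and -5, within a few pounds of the printed 105, 330, and 10, which appear to have been derived from rounded level averages.

```python
# Casting averages (lb./sq. inch) keyed by
# (time in bath, min; bath temperature, deg C; catalyst, %).
averages = {
    (20, 100, 1): 9980,   # I
    (60, 100, 1): 11570,  # V
    (20, 100, 2): 11840,  # VI
    (60, 100, 2): 11940,  # II
    (20, 120, 1): 11600,  # VII
    (60, 120, 1): 11520,  # IV
    (20, 120, 2): 11040,  # III
    (60, 120, 2): 9830,   # VIII
}
LOW = (20, 100, 1)  # low level of each factor
TIME, TEMP, CAT = 0, 1, 2


def contrast(factors):
    """Signed sum of cell averages; the sign is the product of the
    -1 (low) / +1 (high) codes of the listed factors."""
    total = 0
    for cell, value in averages.items():
        sign = 1
        for i in factors:
            sign *= -1 if cell[i] == LOW[i] else 1
        total += sign * value
    return total


def main_effect(factor):
    """High-level average minus low-level average (four castings each)."""
    return contrast([factor]) / 4


def interaction(*factors):
    """Contrast divided by the number of castings (assumed convention)."""
    return contrast(factors) / 8


print(main_effect(TIME), main_effect(TEMP), main_effect(CAT))
print(interaction(TIME, TEMP), interaction(TIME, CAT),
      interaction(TEMP, CAT), interaction(TIME, TEMP, CAT))
```

The signs are informative as well: all three two-factor interactions are negative, meaning that raising two factors together gives less than the sum of the separate gains.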


A study of the results of the analysis of this factorial experiment indicated that most of the effects were interaction effects and that polymerizations at 100° C. were somewhat better than those at 120° C. If 2% of benzoyl peroxide catalyst was used, the time in a 100° C. bath was not important over a 20- to 60-minute range, but if only 1% of catalyst were used, the cell must be left in the 100° C. bath for at least 60 minutes to obtain the optimum flexural strength of the casting. Increased precision was obtained by the factorial experiment over the Latin square (±245 pounds per square inch as compared to ±295 pounds per square inch), but twice as much testing was required. The important gain by the factorial design was the evaluation of interactions between the factors, and in the example these interactions were the most important variables. Interaction variations appear to be the rule rather than the exception in chemical processes and, therefore, successful experimentations will be accomplished only when experiments are designed to evaluate interaction hypotheses as well as hypotheses concerning effects of independent variables.

A primary requirement in the statistical design of experiments is that the experimental errors be determined in order that the reliability of results and significance of effects can be determined. This consideration alone should result in a marked improvement in the efficiency of experimental work.

The industrial chemist is sometimes faced with a problem where only a limited manipulation of the variables is practical and all combinations of factors cannot be explored. This situation most commonly occurs when working with processes which have been developed to the pilot plant stage or which are in commercial operation. Here the best approach seems to be that of changing, over as wide a range as possible, each condition suspected of being a factor in the quality, yield, or operational characteristics of the process, and measuring all factors to be considered at each of these sets of operating conditions. The resulting data can be analyzed by the techniques of partial regression wherein each variable is evaluated under a statistical control and its independent effect measured. The following example shows how this may be done.

SOLUTION VISCOSITY OF A CONDENSATION POLYMER

A large number of factors may be expected to affect the chain length of a linear condensation polymer. In a series of experimental autoclave runs of a condensation polymerization, the data listed in Table III were obtained. In this reaction a bifunctional organic base was combined with a dicarboxylic acid to produce water and a polymer. The chain length of a polymer, as is well known, can be determined (at least empirically) by measuring the viscosity of solutions of the polymer. The chain length of a condensation polymer is believed to be dependent upon the extent to which water has been removed from the polymerization autoclave, and upon the care exercised in balancing the acidic and basic groups in the starting materials. The amount of a monofunctional acid or base added as well as the pH of the bifunctional acid-bifunctional base mixture before adding the monofunctional reactant are expected to be important factors since unbalance will result in chain termination. This chain termination reaction can be used as a method of controlling chain length.

In the particular experiment described here, a number of additional factors besides those listed in Table III were measured, but none of these were found to be significantly correlated with the solution viscosity of the polymer produced and, therefore, they have not been considered in the analysis outlined here. Unfortunately, as will be shown, these experiments were not planned to evaluate some of the most meaningful combinations of the variables.

TABLE III. DATA FROM CONDENSATION POLYMERIZATION

Batch   Relative    Acid Stabilizer,   pH of   Hold Cycle,
No.     Viscosity   Mole %             Salt    Min.
 1      33.0        0.0                7.67    60
 2      26.5        0.0                7.67    30
 3      30.6        0.0                7.59    60
 4      29.8        0.0                7.59    60
 5      28.4        0.0                7.59    30
 6      29.3        0.0                7.68    30
 7      31.4        0.0                7.68    60
 8      28.8        0.0                7.61    30
 9      26.6        0.0                7.61    30
10      27.3        0.0                7.52    30
11      29.6        0.0                7.52    30
12      28.8        0.0                7.61    30
13      28.2        0.0                7.59    30
14      27.8        0.0                7.67    30
15      27.9        0.0                7.55    30
16      20.4        0.8                7.55    30
17      22.2        0.8                7.55    30
18      23.6        0.4                7.55    30
19      28.0        0.0                7.55    30
20      24.8        0.2                7.55    30
21      25.8        0.2                7.55    30
22      25.8        0.2                7.55    30
23      26.2        0.2                7.55    30
24      26.3        0.2                7.60    30
25      24.5        0.2                7.60    30
26      27.4        0.2                7.68    30
27      27.1        0.2                7.68    30
28      27.3        0.2                7.65    30
29      25.9        0.2                7.61    30
30      26.9        0.2                7.67    30
31      26.0        0.2                7.61    30
32      26.5        0.2                7.56    30
33      26.3        0.2                7.66    30
34      24.8        0.2                7.52    30
35      24.7        0.2                7.52    30
36      26.1        0.2                7.52    30

Calculations. Let

$X_0$ = a datum on relative viscosity (0 = relative viscosity property)
$X_1$ = a datum on mole per cent acid stabilizer (1 = acid stabilizer factor)
$X_2$ = a datum on pH of salt (2 = pH factor)
$X_3$ = a datum on hold cycle (3 = hold cycle factor)

$$
\begin{array}{lll}
\Sigma X_0 = 968.6 & \Sigma (X_0 - \bar{X}_0)^2 = 213.94 & \Sigma X_0 X_1 = 131.60 \\
\Sigma X_1 = 5.4 & \Sigma (X_1 - \bar{X}_1)^2 = 1.31 & \Sigma X_0 X_2 = 7356.249 \\
\Sigma X_2 = 273.33 & \Sigma (X_2 - \bar{X}_2)^2 = 0.1043 & \Sigma X_0 X_3 = 32{,}802.0 \\
\Sigma X_3 = 1200 & \Sigma (X_3 - \bar{X}_3)^2 = 3200 & \Sigma X_1 X_2 = 40.896 \\
 & & \Sigma X_1 X_3 = 162.0 \\
 & & \Sigma X_2 X_3 = 9115.80
\end{array}
$$

All summations are taken over the set of 36 data. Cross-product sums are the totals of the products of the corresponding pairs of data.

AVERAGES (2).

$$\bar{X} = \frac{\Sigma X}{N} \qquad (N = \text{number of data} = 36)$$

$$\bar{X}_0 = 26.91, \quad \bar{X}_1 = 0.15, \quad \bar{X}_2 = 7.592, \quad \bar{X}_3 = 33.33$$

STANDARD DEVIATIONS (8).

$$S = \sqrt{\frac{\Sigma (X - \bar{X})^2}{N - 1}}$$

$$S_0 = 2.4724, \quad S_1 = 0.19347, \quad S_2 = 0.054589, \quad S_3 = 9.5619$$

CORRELATION COEFFICIENTS (7).

$$r_{01} = \frac{\Sigma X_0 X_1 - \dfrac{\Sigma X_0 \, \Sigma X_1}{N}}{(N - 1)\, S_0 S_1}$$

$$
\begin{array}{ll}
r_{01} = -0.81775 & r_{12} = -0.28135 \\
r_{02} = +0.45578 & r_{13} = -0.27801 \\
r_{03} = +0.62279 & r_{23} = +0.26274
\end{array}
$$

The least-squares linear equation representing these relationships is of the form:

$$X_0 = A + b_1 X_1 + b_2 X_2 + b_3 X_3 \tag{1}$$

and if

$$b_1 = B_1 \frac{S_0}{S_1}, \quad b_2 = B_2 \frac{S_0}{S_2}, \text{ etc.} \tag{2}$$

and the variables are written as divergences from their respective means:

$$\frac{X_0 - \bar{X}_0}{S_0} = B_1 \frac{X_1 - \bar{X}_1}{S_1} + B_2 \frac{X_2 - \bar{X}_2}{S_2} + B_3 \frac{X_3 - \bar{X}_3}{S_3} \tag{3}$$

The coefficients of Equation 3 can be evaluated by filling in the values of the r's and solving the simultaneous equation set:

$$
\begin{aligned}
r_{01} &= B_1 + B_2 r_{12} + B_3 r_{13} \\
r_{02} &= B_1 r_{12} + B_2 + B_3 r_{23} \\
r_{03} &= B_1 r_{13} + B_2 r_{23} + B_3
\end{aligned}
$$

$$B_1 = -0.66115, \quad B_2 = +0.16588, \quad B_3 = +0.39540$$

The fraction of the total variability in solution viscosity which can be explained by each of these factors can now be calculated (4):

$r_{01}B_1 = +0.540$ of variability due to acid stabilizer
$r_{02}B_2 = +0.075$ of variability due to pH of salt
$r_{03}B_3 = +0.247$ of variability due to hold cycle
$R^2 = +0.862$ of variability due to combined effect of the three factors ($R^2$ = multiple coefficient of determination)
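The simultaneous equations for the B coefficients can be solved numerically from the six correlation coefficients listed under "Calculations." The sketch below is an illustration in Python; the small elimination routine `solve` is a helper written here for self-containment and is not the computation procedure used in the original work.

```python
# Correlation coefficients as reported in the calculations.
r01, r02, r03 = -0.81775, 0.45578, 0.62279
r12, r13, r23 = -0.28135, -0.27801, 0.26274


def solve(a, b):
    """Solve a small linear system a.x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Correlations among the three factors, and of each factor with viscosity.
factor_correlations = [
    [1.0, r12, r13],
    [r12, 1.0, r23],
    [r13, r23, 1.0],
]
viscosity_correlations = [r01, r02, r03]

B = solve(factor_correlations, viscosity_correlations)

# Fraction of the variability in viscosity attributed to each factor,
# and their total, the multiple coefficient of determination R^2.
shares = [r * coef for r, coef in zip(viscosity_correlations, B)]
R2 = sum(shares)

print([round(x, 5) for x in B])
print([round(x, 3) for x in shares], round(R2, 3))
```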

The coefficients in Equation 1 can now be calculated from Equation 2. The value of the constant A can be evaluated by filling in the mean values for each variable and solving. This gives:

$$X_0 = -8.45 X_1 + 7.51 X_2 + 0.102 X_3 - 32.24$$

for the prediction equation for linear relationships between the variables. The reliability of estimated solution viscosity values from this prediction equation is calculated (within the range of the variables explored in this experiment):

$$S(\text{estimate}) = S_0 \sqrt{(1 - R^2)\,\frac{N - 1}{N - 4}} = 0.96$$

The 95% reliability is about twice the standard deviation of the estimate, which equals ±1.92% relative viscosity.

It was desired to adjust the amount of acid end group stabilizer to give an average relative viscosity of 26% when a 30-minute hold cycle and 7.60 pH salt were used. The prediction equation is reduced to the following when these values of pH and hold cycle are substituted (using the coefficients from the prediction equation) for their variable components:

$$\text{Solution viscosity} = -8.45\,(\text{mole \% acid}) + 27.89$$
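The chain of arithmetic from the B coefficients to the reduced equation can be retraced as follows. This is a sketch in Python using only the values reported in the text; the variable names are assumptions introduced here, and small differences in the last digit against the printed figures are to be expected from rounding.

```python
from math import sqrt

N = 36
B = [-0.66115, 0.16588, 0.39540]       # standardized coefficients
S0 = 2.4724                            # std. deviation of relative viscosity
S = [0.19347, 0.054589, 9.5619]        # std. deviations: acid, pH, hold cycle
mean0 = 26.91                          # mean relative viscosity
means = [0.15, 7.592, 33.33]           # means: acid, pH, hold cycle
R2 = 0.862                             # multiple coefficient of determination

# Equation 2: convert standardized coefficients to regression coefficients.
b = [Bi * S0 / Si for Bi, Si in zip(B, S)]

# Constant A from the requirement that the equation passes through the means.
A = mean0 - sum(bi * m for bi, m in zip(b, means))

# Standard deviation of an estimate and its approximate 95% range.
s_estimate = S0 * sqrt((1 - R2) * (N - 1) / (N - 4))
range_95 = 2 * s_estimate

# Reduced equation at pH 7.60 and a 30-minute hold cycle:
# viscosity = b[0] * acid + intercept
intercept = A + b[1] * 7.60 + b[2] * 30

# Acid level (mole %) expected to give a relative viscosity of 26.
acid_for_26 = (26 - intercept) / b[0]

print([round(x, 3) for x in b], round(A, 2))
print(round(s_estimate, 2), round(range_95, 2))
print(round(intercept, 2), round(acid_for_26, 2))
```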

Solving the reduced equation for a solution viscosity of 26%, the specified amount of acid to add is 0.22 mole %.

SUMMARY

The findings from the calculations may now be summarized:


1. Changing the amount of monofunctional acid added to the charge gave the greatest effect on solution viscosity (54% of variability in solution viscosity was due to this factor).
2. The pH of salt had little effect over the range from 7.52 to 7.68 (7% of the variability due to this factor).
3. Hold cycle in the autoclave gave an appreciable effect on viscosity (25% of variability in viscosity was due to hold cycle effects).
4. A viscosity of 26 ± 2% can be maintained by adjusting the pH of the salt to be polymerized to 7.60, using a 30-minute hold cycle, and adding 0.22 mole % of monofunctional acid to act as a chain growth terminator.

The process of extracting information from the data has been somewhat laborious (though the labor cost for the calculations was only about $50, less than 1% of the cost of obtaining the data), but this technique has revealed relationships in a quantitative fashion and enabled the experimenter to place reliability limits on his conclusions. In this particular problem approximately the same information on the independent effect of these factors plus information on interaction between factors could have been obtained by making only the following eight batches:

No acid, 7.50 pH, 30-minute hold
No acid, 7.70 pH, 30-minute hold
0.8 mole % acid, 7.50 pH, 30-minute hold
0.8 mole % acid, 7.70 pH, 30-minute hold
No acid, 7.50 pH, 60-minute hold
No acid, 7.70 pH, 60-minute hold
0.8 mole % acid, 7.50 pH, 60-minute hold
0.8 mole % acid, 7.70 pH, 60-minute hold
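The eight batches form a complete two-level factorial in the three factors, and such a list can be generated mechanically. The short Python sketch below is only an illustration; the level values come from the list above, while the variable names are assumptions introduced here.

```python
from itertools import product

# Two levels of each factor, taken from the eight batches listed above.
hold_minutes = [30, 60]
acid_mole_percent = [0.0, 0.8]
salt_pH = [7.50, 7.70]

# Every combination of levels: 2 x 2 x 2 = 8 batches.
# The last factor varies fastest, matching the order of the list above.
batches = list(product(hold_minutes, acid_mole_percent, salt_pH))

for hold, acid, pH in batches:
    acid_label = "No acid" if acid == 0 else f"{acid} mole % acid"
    print(f"{acid_label}, {pH:.2f} pH, {hold}-minute hold")
```

Adding a fourth factor or an intermediate level only requires extending the corresponding list, which makes the growth in the number of batches easy to see before any are run.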

When these eight batches have been tested, those factors found to be important should be expanded to test combinations at intermediate levels in order to improve the precision of evaluating the relationships of the factors with viscosity and to determine if the relationships are linear. Also, as was demonstrated in the example on flexural strength of a cast polymer, interactions may be very important, and the data given in Table III appear to have been unknowingly arranged to eliminate any information on interaction between variables. The adherence to the principle of conventional control will usually eliminate those combinations of factors which show interaction effects which are almost certain to appear when the process is operated commercially.

The elimination from the experiments outlined in Table III of combinations of factors which give information on interactions between variables throws some doubt on the reliability of the conclusions reached from the correlation analysis of the data, although the particular set of conditions chosen to give 26% relative viscosity is well represented in the experimental batches run. However, it may be predicted that variations of the factors in commercial operation will give combinations which interact to throw the viscosity out of control if such interaction effects exist; therefore, the safe procedure in this process is to run the eight batches listed above and determine if certain combinations of factors are likely to give wide variations in relative viscosity of the polymer.

It is thought that a careful consideration of the two problems presented here will convince anyone faced with similar problems that there is a need for more application of the principles of statistical experimental design and for the use of statistical methods in analyzing complex data.

ACKNOWLEDGMENT

The author wishes to thank Miss M. T. Dunleavy who performed the computation described under "Calculations."

BIBLIOGRAPHY

(1) Brownlee, K. A., "Industrial Experimentation," London, His Majesty's Stationery Office, 1947.
(2) Fisher, R. A., "The Design of Experiments," London, Oliver and Boyd, 1942.
(3) Fisher, R. A., "Statistical Methods for Research Workers," London, Oliver and Boyd, 1941.
(4) Gore, W. L., Industrial Quality Control, 4 (2), 5-8 (1947).
(5) Goulden, C. H., "Methods of Statistical Analysis," New York, John Wiley & Sons, Inc., 1939.
(6) Grant, E. L., "Statistical Quality Control," New York, McGraw-Hill Book Co., 1946.
(7) Peters and Van Voorhis, "Statistical Procedures and Their Mathematical Bases," New York, McGraw-Hill Book Co., 1940.
(8) Shewhart, W. A., "Economic Control of Quality of Manufactured Products," New York, D. Van Nostrand Co., Inc., 1931.
(9) Simon, L. E., "Engineers' Manual of Statistical Methods," New York, John Wiley & Sons, Inc., 1941.

RECEIVED January 15, 1949. Presented in part before the Division of Paint, Varnish, and Plastics Chemistry at the 113th Meeting of the AMERICAN CHEMICAL SOCIETY, Chicago, Ill.