Factorial Experiments in Pilot Plant Studies - American Chemical Society


INDUSTRIAL AND ENGINEERING CHEMISTRY

parameter, σ, is known. Optimum procedures when estimates of σ must be based on small amounts of data are under study. As in almost all statistical reasoning, y is assumed to be in a state of statistical control wherever measured. Mosteller has given a discussion of the mathematical model used in the present paper under the subtitle, “Regression in Two Dimensions.” The case handled here is the one in which the variability of the y distribution changes from one fixed value of x to another. Mosteller did not discuss this case. A common null hypothesis in estimating slopes would be: Is the true slope of the line, of which there is usually only an estimate from a sample, equal to some standard value, say B? A statistical test made as indicated near the end of Mosteller’s paper will give a t value as large as 1.96 only one time in twenty when the true slope is B. This is the 0.05 significance level or the frequency of making an error of Type I. The power curve given in his paper can be used to decide how many x measurements should be made (N = n1 + n2). It is only necessary to substitute (B - b)/σ(b) for his abscissa. When measurements are made at x1 and x2 in the ratio σ1/σ2,

Vol. 43, No. 6, June 1951

σ(b) can be found directly from Equation 9 of this paper. After deciding how large a real difference in slope from the standard, B - b, must be detected with some definite probability (ordinate of Mosteller’s Figure 1), the corresponding abscissa value can be read off and set equal to

which can be solved for N. Conversely, if N is already fixed, the same graph can be used to estimate the frequency of making an error of Type II of any size B - b.

LITERATURE CITED

(1) Daniel, C., and Heerema, N., J. Am. Statistical Assoc., 45, 546-56 (December 1950).

(2) Mosteller, F., IND. ENG. CHEM., 43, 1292 (1951).

(3) Scheffé, H., Ibid., 43, 1292 (1951).

RECEIVED February 8, 1951. Presented before the Division of Industrial and Engineering Chemistry, Symposium on Statistics and Quality Control in the Chemical Industry, at the 117th Meeting of the AMERICAN CHEMICAL SOCIETY, Detroit, Mich.

Factorial Experiments in Pilot Plant Studies

J. R. Bainbridge
Imperial Chemical Industries of Australia and New Zealand Ltd., 380 Collins Street, Melbourne, C1, Australia

The factorial experiment reported in this paper was used to investigate unexpected performance in a small gaseous synthesis plant. This approach proved applicable and gave convincing and comprehensive information of a kind which had not been obtained in a much longer period of “normal” plant operation. Of interest to chemical engineers is the detailed examination of the structure of the factorial experiment; the analysis of covariance; the information obtained and the types of error to which it is subject; and the superior effectiveness of the factorial over other experimental approaches.

THE growing application of statistical procedures to laboratory and pilot scale investigations (2, 3) is most commendable. Whenever a new kit of tools, such as statistical methods, becomes available the owner faces two problems: First, he must acquire technical proficiency in the use of these tools, and secondly, he must learn discrimination: when to use and when not to use each tool and how to modify his traditional procedures to make best use of his new assets. This article is directed toward the second problem, though necessarily it must deal extensively with the first.

It is the author’s opinion that in the majority of problems requiring experimental investigation, whether laboratory, pilot, or commercial scale, the factorial experiment is the most effective and economical means of experimental approach. In order to sustain this thesis a brief description is given of a factorial experiment performed on a small gaseous synthesis plant. This is followed by a detailed analysis of the results using analysis of variance and also analysis of covariance. Finally, the value of the information obtained is compared with that arising from Huhndorff’s reproducibility study (3) and from Gore’s approaches to factorial designs (2); the shortcomings of nonstatistical approaches are indicated.

TYPICAL EXPERIMENT

The factorial experiment described here was performed on a small plant carrying out a catalytic gaseous synthesis reaction and removing the product as a liquid solution. The plant consists of the following essential parts, which are indicated in Figure 1:

1. Gas preparation plant in which the concentration of the active constituent is controlled
2. Gas purification plant in which impurities causing undesirable side reactions are removed
3. Synthesis section consisting of a converter with catalyst and temperature control arrangements, means for removing the synthesized product in solution, means for recirculating exit gas through the converter and for purging inert gases

Three important variables in most chemical plants are converter reaction temperature, throughput rate through the converter, and the concentration of the active ingredient in the make-up gas. In general, and also in this experimental plant, these three factors are controllable. However, the degree of gas purification depended ultimately on cooling water temperature; this was not readily controllable and constituted a potential cause of nonreproducible results.

The experimental investigation was required to show the effect of running the plant at several different levels of each of the three controllable variables: temperature, throughput, and concentration. In addition, it was essential that there be reproducibility of results when running the plant on different occasions under allegedly the same conditions. Unless this reproducibility was determined, it would be impossible to assess the reliability of the measured effects of changes in the levels of the controlled factors. In this particular instance it was also desirable to find if gas purity affected results; if so, by how much; and, finally, to make allowance for changes in purity to remove errors and confusion arising from the uncontrollable changes in this factor.

The experimental program chosen made use of a 2³ factorial design. The “cube” means that three factors, temperature, throughput, and concentration, were varied in a controlled manner. The “two” means that tests were carried out at two levels of each factor; that is, at two temperatures, two throughputs, and two concentrations. The word factorial means, among other things, that tests were carried out at all combinations of the factors and levels; thus eight tests were required. The experiment included two replications, making 16 tests in all. The experiment was confounded and carried out in four blocks each of four tests; finally, the various tests were performed in a random order, this being another requirement implicit in “factorial.”

The set of sixteen test results gives information on all the points set forth earlier. This comprehensiveness is an outstanding feature of factorial experiments. To any one test there are a number of observed results, such as the rate of production, the composition of the purge gas, and the strength of the product in the product solution. For each of these qualities there is a set of sixteen results, one for each of the test runs. The method of analysis of results is the same for each quality, though naturally the relative importance of the several controlled factors is likely to be different for the different qualities which might be considered. The data recorded here relate solely to the strength of the product solution.

EXAMINATION OF TEST RESULTS

Notations and Definitions. The “analysis of variance” of the set of sixteen test results is carried out by a modification of a tabular method originally given by Yates (6). As a complement to systematic experiment and as an aid to quick and clear analysis, the following systematic notation was used:

a0 = a test, or the result of a test, at the lower reaction temperature used
a1 = a test, or the result of a test, at the higher reaction temperature used
b0 = a test, or the result of a test, at the lower throughput used
b1 = a test, or the result of a test, at the higher throughput used
c0 = a test, or the result of a test, at the lower concentration of active ingredient in the make-up gas
c1 = a test, or the result of a test, at the higher concentration of active ingredient in the make-up gas
d0 = a test, or the result of a test, made in the first replication
d1 = a test, or the result of a test, made in the second replication

Any test may be identified by a symbol, such as a1, b0, c0, d1, which indicates the second test at high temperature, low throughput, and low make-up gas concentration. This can be shortened to 1001.

TABLE I. ANALYSIS OF PRODUCT STRENGTH

Test                               Operations
Condition,   Test   Test     --------------------------                          Mean
abcd         No.    Result      A      B      C      D     Effect     Mean       Square
0000           4       99     117    220    500    808     I          50.5       40,804
1000           8       18     103    280    308   -456     A         -28.5       12,996
0100           5       51     150     79   -206   -106     B          -6.625        702.25
1100           3       52     130    229   -250     58     AB          3.625        210.25
0010           7      108      64    -80    -34    210     C          13.125      2,756.25
1010           2       42      15   -126    -72    -22     AC         -1.375         30.25
0110           1       95     126   -137     88     20     BC          1.250         25.00
1110           6       35     103   -113    -30     56     ABC         3.5          196
0001          14       46     -81    -14     60   -192     D         -12.0        2,304
1001          10       18       1    -20    150    -44     AD         -2.75         121
0101           9       62     -66    -49    -46    -38     BD         -2.375         90.25
1101          13      -47     -60    -23     24   -118     ABD        -7.375        870.25
0011          11      104     -28     82     -6     90     CD          5.625        506.25
1011          16       22    -109      6     26     70     ACD         4.375        306.25
0111          15       67     -82    -81    -76     32     BCD         2.0           64
1111          12       36     -31     51    132    208     ABCD       13.0        2,704

Subtotal 0            632     200     20    280    ...
Subtotal 1            176     152    284    288    ...
Column total          808     352    304    568    576
Check total           352     304    568    576    ...

Sum of mean squares = 64,686
Sum of squares of test results = 64,686
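The four-digit coding of test conditions and the standard order in which Table I lists them can be sketched in a few lines of Python. This is only an illustration: the function names `standard_order` and `decode` and the factor labels are assumptions introduced here, not part of the original paper.

```python
from itertools import product

# Order of the digits in a code such as "1001": a, b, c, d
FACTORS = ("temperature", "throughput", "concentration", "replication")


def standard_order(n_factors=4):
    """Return the codes 0000, 1000, 0100, 1100, ... (first digit varies fastest)."""
    codes = []
    for levels in product((0, 1), repeat=n_factors):
        # product() varies the LAST position fastest, so reverse each tuple
        codes.append("".join(str(v) for v in reversed(levels)))
    return codes


def decode(code):
    """Map a code such as '1001' to the level (0 = lower/first, 1 = higher/second) of each factor."""
    return {name: int(digit) for name, digit in zip(FACTORS, code)}


codes = standard_order()
print(codes)           # the sixteen test conditions in standard order
print(decode("1001"))  # high temperature, low throughput, low concentration, second replication
```

Listing the conditions programmatically makes it easy to confirm that all sixteen combinations of levels are present exactly once.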

[Figure 1. Flow Sheet of Experimental Plant: gas preparation, gas purification, and synthesis sections, with gas purge and product solution outlets]

Although the lower case letters a, b, c, and d with suitable suffixes relate to observations under particular conditions, the capital letters A, B, C, D may be used to denote the effect of changing the level at which the corresponding factor was controlled. It is convenient to define:

$$(A) = \tfrac{1}{2}(a_1 - a_0) \quad \text{for the same } b, c, d$$

$$A = \text{mean value of } (A) \text{ for all } b\text{'s, } c\text{'s, and } d\text{'s} = \tfrac{1}{8}\sum (A) = \tfrac{1}{16}\left(\sum a_1 - \sum a_0\right)$$

in this particular experiment with 16 test results total, 8 at each reaction temperature.
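As a check on this definition, the mean effect A can be computed directly as one sixteenth of the difference between the sum of the high-temperature results and the sum of the low-temperature results. The sketch below uses the manipulated test results of Table I; the dictionary layout and the function name `main_effect` are illustrative assumptions.

```python
# Manipulated test results from Table I, keyed by the condition code abcd
results = {
    "0000": 99, "1000": 18, "0100": 51, "1100": 52,
    "0010": 108, "1010": 42, "0110": 95, "1110": 35,
    "0001": 46, "1001": 18, "0101": 62, "1101": -47,
    "0011": 104, "1011": 22, "0111": 67, "1111": 36,
}


def main_effect(results, position):
    """(sum at the higher level - sum at the lower level) / 16 for one factor.

    position 0, 1, 2, 3 selects the factor a, b, c, d respectively.
    """
    high = sum(v for code, v in results.items() if code[position] == "1")
    low = sum(v for code, v in results.items() if code[position] == "0")
    return (high - low) / len(results)


A = main_effect(results, 0)
B = main_effect(results, 1)
C = main_effect(results, 2)
D = main_effect(results, 3)
print(A, B, C, D)
```

The values obtained this way should agree with the entries for A, B, C, and D in the Mean column of Table I.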

Similar definitions apply to the effect, B, of changing throughput and, C, of changing make-up gas concentration. In a similar way the effect, D, of repeating the experiment may be defined; such an effect is merely a measure of the errors to which the experimental work is subject. It is also convenient to define a new effect by the relation:

$$4(AB) = (a_1 b_1 - a_0 b_1) - (a_1 b_0 - a_0 b_0) \quad \text{for the same } c \text{ and } d$$

(AB) is then the effect of change of temperature at high throughput less the effect of change of temperature measured at low throughput. This is also expressed as the effect of throughput on the effect of temperature or as the interaction of throughput and temperature. We may also define

$$AB = \text{mean value of the interaction } (AB) \text{ for all } c\text{'s and } d\text{'s} = \tfrac{1}{16}\left(\sum a_1 b_1 - \sum a_0 b_1 - \sum a_1 b_0 + \sum a_0 b_0\right)$$
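The interaction AB can likewise be evaluated straight from its definition: results in which a and b are at the same level enter with a plus sign, the others with a minus sign, and the total is divided by 16. The following is a minimal sketch using the Table I results; `interaction_ab` is a name chosen here for illustration.

```python
# Manipulated test results from Table I, keyed by the condition code abcd
results = {
    "0000": 99, "1000": 18, "0100": 51, "1100": 52,
    "0010": 108, "1010": 42, "0110": 95, "1110": 35,
    "0001": 46, "1001": 18, "0101": 62, "1101": -47,
    "0011": 104, "1011": 22, "0111": 67, "1111": 36,
}


def interaction_ab(results):
    """(sum a1b1 - sum a0b1 - sum a1b0 + sum a0b0) / 16."""
    total = 0
    for code, value in results.items():
        a, b = int(code[0]), int(code[1])
        sign = 1 if a == b else -1  # + for a1b1 and a0b0, - for a0b1 and a1b0
        total += sign * value
    return total / len(results)


AB = interaction_ab(results)
print(AB)
```

The result should match the AB entry in the Mean column of Table I.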

Similar definitions apply to the two-factor interactions, AC and BC, and to a three-factor interaction, ABC. In addition there are quasi interactions, AD, BD, ABD, which give the differences between the measures of the effects, A, B, AB, made in the two replications, d0 and d1. These interactions are therefore measures of errors which have occurred in the experimental processes. The detailed calculation and meaning of these interactions can be studied more easily from the routine analysis of the experimental data.

Analysis of Experimental Data. The systematic sorting and computation is given in Table I. The first column lists the experimental conditions according to the code already described. The test results are arranged in a particular standard order, according to the conditions of test and not according to the chronological order of test which is given in the second column. The standard order in Table I is important from a computational point of view; the contrast between standard order and chronological order is important in assessing the significance of the results of the experiment as will be evident later in discussing errors. In order to reduce arithmetic a certain amount of manipulation of results has been introduced; for example, if strengths of the product are mostly in the range 80 to 95% and are measured to the nearest one tenth of 1%, then it is convenient to deduct 80 and multiply the residue by 10 so that a strength of 89.9% appears as 99. This device has been widely used.

The third column of Table I gives the manipulated result of each test, and the next four columns, which are all calculated using the same routine, carry out the sorting process. The fourth column is calculated from the previous column by taking entries in pairs, starting from the top, and adding them: these are the first four pairs of figures in column A. The pairs of results relate to tests which are made under identical process conditions except that different temperatures, a0 and a1, were used; consequently, these sums lead to averages for both temperatures and cannot show the effect of change of temperature. The second four pairs of figures in column A were obtained by again taking pairs of test results, starting from the top, and subtracting the upper result in each pair from the lower. From the nature of the pairs, each of the figures obtained is a measure of the effect of raising temperature and of nothing else. The values recorded for this effect are 2(A) in terms of the definition already given, and differ in that they are obtained under different sets of conditions, b, c, d; that is, at different throughputs, concentrations, and replications.

Column B is formed from column A by exactly the same process. Examination will show that, arising from the standard order used, the pairs in column A differ only in the throughput used and as a result the first two pairs of column B show neither the effect of temperature nor throughput; the next two pairs show four measures of the effect of temperature but not of throughput, the third two pairs show four measures of the effect of throughput, but not of temperature, and the last quarter shows four measures of the effect of throughput on the effect of temperature; that is, of the temperature-throughput interaction. Column C is formed from column B, and column D is formed from column C, using the same routine operation. The meaning of the several figures can be traced from the operations in the manner already described, but for simplicity the meaning of the figures in column D is indicated symbolically in the “Effect” column. The “mean” is found by dividing the figure in column D by 16, the divisor used in defining effects. The letter symbols in the Effect column follow suffixes 1 in the symbol denoting test conditions so that, in practice, they can be written in without tracing through the table. The figures in the last column, Mean Squares, are the products of corresponding figures in the Mean and D columns.

Any arithmetic procedure requires independent checks. To provide these, the columns are totaled in two sections. The upper subtotal is the sum of the upper numbers in each of the eight pairs, and the lower subtotal is the sum of the eight lower numbers. The column total is the sum of these two subtotals, and the check total is twice the lower subtotal. It is easily shown that the check total must equal the column total for the following column. A further check arises from the fact that the sum of the mean squares equals the sum of the squares of the original test results. This identity is a check not only of arithmetic but also of correct choice of the divisor (16) and also of a legitimate choice of definition of the effects, A, B, etc.

Interpretation. The means given for A, B, and C measure the apparent effects on product strength of raising the temperature, the throughput, and the make-up gas concentration, respectively. Since the test results have been manipulated by multiplying percentages by 10, the effects are also multiplied by 10. It follows from the definition of effect (A) that raising the reaction temperature by half the change used lowers the product strength by 2.85%. Similar interpretations apply to B, C and to the interactions, AB, AC.

All the test results are subject to error and so also are the effects. All the effects containing D are measures of nothing but error since they are the differences of repeat measurements of the same thing. As a result they provide a measure of what percentage of the effects is likely to be error and what percentage a real effect of change of temperature or throughput. At first sight only the effect of temperature looks appreciably larger than the errors which occur.

Confounding. The experimental program was carried out in a special way, as a result of which errors fall into two categories. The sixteen test runs were performed in four groups (or blocks), each of four tests, as shown by the order of test in Table I. The errors can be classified in two groups: In the first group are errors which occur between one test and the next; in the second group are errors which are common to tests in the one block but which change from block to block because of factors such as major changes in the weather or deterioration of catalyst. The errors of the second class applying to the four blocks may be indicated by g, h, j, and k, respectively. Table II shows the tests written in standard order, the block errors, and the analysis of block errors by the method used in Table I. It is evident that the block errors affect the general average result, I; the three-factor interaction, ABC; the difference between means of the two replications, D; and the difference in measures of ABC in the two replications. It is also evident that differences between blocks cancel out and do not upset the determination of any other effects. This arises from the choice of tests included in the several blocks and this design is said to confound the three-factor interaction, ABC, with block differences.

TABLE II. ANALYSIS OF ERRORS BETWEEN BLOCKS

Test                                   Operations
Condition,   Block    ------------------------------------------------------
abcd         Error    A         B           C           D                      Effect
0000         g        g + h     2g + 2h     4g + 4h     4g + 4h + 4j + 4k      I
1000         h        g + h     2g + 2h     4j + 4k     ...                    A
0100         h        g + h     2j + 2k     ...         ...                    B
1100         g        g + h     2j + 2k     ...         ...                    AB
0010         h        j + k     ...         ...         ...                    C
1010         g        j + k     ...         ...         ...                    AC
0110         g        j + k     ...         ...         ...                    BC
1110         h        j + k     ...         ...         -4g + 4h + 4j - 4k     ABC
0001         k        -g + h    ...         ...         -4g - 4h + 4j + 4k     D
1001         j        g - h     ...         ...         ...                    AD
0101         j        g - h     ...         ...         ...                    BD
1101         k        -g + h    ...         ...         ...                    ABD
0011         j        j - k     2g - 2h     ...         ...                    CD
1011         k        -j + k    -2g + 2h    ...         ...                    ACD
0111         k        -j + k    -2j + 2k    -4g + 4h    ...                    BCD
1111         j        j - k     2j - 2k     4j - 4k     4g - 4h + 4j - 4k      ABCD

Comparison of Tables I and II suggests that the changes from block to block in the product strengths are:

Second block, h, about 19 units = 1.9%, lower than the first block, g
Third block, j, about 17 units = 1.7%, lower than the first block, g
Last block, k, about 50 units = 5.0%, lower than the first block, g

These figures suggest at first sight a general fall in product strength from block to block, and this tendency might well occur also over the four tests in each block. Errors, whatever their cause, have occurred in some definite and uncontrolled pattern. By using the confounded pattern in four blocks, the portions, g, h, j, and k have been canceled out and do not affect the important comparisons.

Randomization. The errors of the types indicated by g, h, j, and k between blocks have been canceled out, but there are other portions of error, within the blocks, which also are likely to occur in a definite pattern but which cannot be canceled out. The four tests within each block have, however, been performed in a random order (Table I) so that it is purely a matter of chance how the (preordained) errors become allocated among the different test runs and different experimental conditions. The data obtained from test runs are converted by a systematic procedure to certain effects of changing test conditions, but since the allocation of “within block” error among the various test runs is purely a chance allocation, its allocation among the several effects is also purely a chance allocation.

Because of randomization in the design of the experimental program it is possible validly to apply the laws of chance to the occurrence of errors; without this randomization there is no valid way of estimating the limits within which the errors in this experiment are likely to lie.

Significance. In an earlier section it was pointed out that the effects, AD, BD, ABD, CD, ACD, and BCD, were in reality measures of experimental errors. It has now been pointed out that these effects are measures only of the within block components of error and that these are the only error components affecting the effects, A, B, AB, C, AC, and BC. It has also been pointed out that the within block errors are distributed in a purely chance manner among the twelve effects named. The problem now is to determine, with the aid of statistical theory, whether a result as large as that for A could be due to error alone. If so, then temperature probably does not affect product strength. If, however, A is so large that it is most unlikely to be due solely to error, then part at least of the apparent effect is almost certainly due to the change of reaction temperature. This is the process of significance testing; the practical computation follows:

The sum of the mean squares of the six effects, AD, BD, ABD, CD, ACD, and BCD, which show within block error, is 1958.

Divide this sum of squares by the number of independent estimates making up this sum. This quantity is called the degrees of freedom and equals 6.

The quotient is 326.33 and is called the mean square for within block error, or the residual mean square.

Hypothesize that reaction temperature has no effect on product strength and, therefore, that the value obtained for A is merely a chance accumulation of error. The mean square for A (12,996) is therefore as valid an estimate of within block error as is the residual mean square 326.33.

Calculate the variance ratio, F = 12,996/326.33 = 39.8.

Consult tables (1) of variance ratio, F; for the circumstances in which the mean square in the numerator is based on one degree of freedom (d/f) and that in the denominator is based on 6 d/f, the value of F will exceed 35.5 only once in 1000 times by chance. Since in this experiment the value found for F is 39.8 the result obtained is either one which would be obtained less frequently than once in 1000 such experiments or raising the reaction temperature does decrease the product strength. Most people prefer to reject the former alternative and, instead, have a fairly confident belief in the importance of temperature.

The standard error of mean effect A is √(326.33/16) = 4.51; the table of Student’s t distribution (1) for six degrees of freedom shows that errors in A exceeding 2.447 times the standard error may be expected only once in twenty times. This means that the probability is 19/20 that the true effect of temperature lies within the range -28.5 ± 2.447 × 4.51; that is, between -39.6 and -17.4.

Block Differences. The block differences, g, h, j, k, may be checked in a rather similar manner. First hypothesize that these are zero and also that the three-factor interaction, ABC, is zero. It then follows from this hypothesis that the sum of squares of ABC, D, and ABCD, which equals 5204, arises solely from within block error. Dividing by the three degrees of freedom gives a mean square of 1734.66 which must be compared with the residual mean square. Calculate F = 1734.66/326.33 = 5.31. Tables of F for three and six degrees of freedom show that the probability of obtaining such a value is about one in 30 by chance alone. Therefore, it is possible that this is purely a chance phenomenon, but the odds are against its being so. If this is not a chance phenomenon, then an unknown factor is causing a reduction of product strength, and this possibility should be given attention. This result is termed significant.

The effect of concentration C is in this same significant category.

TABLE III. ANALYSIS OF COVARIANCE

             Observations                 Effects, Column D                                              Corrected
Symbol,                  Im-                                                                             Mean
abcd         Strength    purity    Symbol      y       x      y²/16        xy/16         x²/16           Effect
0000            99         10      I          808     201    40,804.0     10,150.500    2,525.0625       ...
1000            18         16      A         -456       5    12,996.0       -142.500        1.5625      -26.86
0100            51         13      B         -106       9       702.25       -59.625        5.0625       -3.68
1100            52         10      AB          58     -11       210.25       -39.875        7.5625        0.02
0010           108         10      C          210      -1     2,756.25       -13.125        0.0625       12.80
1010            42         12      AC         -22     -13        30.25        17.875       10.5625       -5.64
0110            95         17      BC          20       7        25.0          8.750        3.0625        3.54
1110            35         14      ABC†        56       3       196.0         10.500        0.5625       ...
0001            46         11      D†        -192      -3     2,304.0         36.000        0.5625       ...
1001            18         13      AD         -44       1       121.0         -2.750        0.0625       ...
0101            62         12      BD         -38      -3        90.25         7.125        0.5625       ...
1101           -47         16      ABD       -118      17       870.25      -125.375       18.0625       ...
0011           104         13      CD          90      -9       506.25       -50.625        5.0625       ...
1011            22         11      ACD         70      -5       306.25       -21.875        1.5625       ...
0111            67         12      BCD         32     -17        64.0        -34.000       18.0625       ...
1111            36         11      ABCD†      208      -5     2,704.0        -65.000        1.5625       ...

Sum of mean squares or mean products                         64,686.00     9,676.000     2,599.000
Sum of squares or products of observations                   64,686        9,676         2,599


ANALYSIS OF COVARIANCE

The analysis of variance so far discussed has shown that reaction temperature affects product strength, though the magnitude of the effect is subject to considerable doubt. In addition, there are strong suggestions that make-up gas concentration affects product strength and that some unknown factor is upsetting results from block to block.

At this stage it became advisable to investigate the suggestion that make-up gas purity was a disturbing factor and was possibly responsible for both the block differences and for some of the within block error. These impurities in make-up gas were measured for each test and the following analysis of covariance was made to determine to what extent gas purity affects strength and what corrections should be made for the apparent effects of temperature to allow for changes in purity.

The first step is to record the measured values of impurities in the make-up gas and to analyze them by the method used in Table I. Table III lists these impurities, the results of the analysis, and corresponding figures from Table I. Table III shows that whereas the differences in product strength between blocks, shown by ABC, D, and ABCD, are in general large compared with the error degrees of freedom, the differences in impurity, between blocks, are not relatively large. This suggests that make-up gas impurity may not be the cause of block differences. For product strengths the main effects, A and C, are large compared with the error terms, but for impurity none of the main effects or interactions are large compared with the error terms. This result arises from the fact that the experimental work was randomized so that only chance could make an unrelated factor, such as make-up gas impurity, coincide closely with the pattern changes of test conditions.

Effect of Impurities. A more detailed investigation of the effect of impurities should start with the assumption that the product strength is a linear function of the impurity, other things being equal. No correction for change of purity is required in the difference of two test results in which the impurity is the same, but if the impurity is different in the two tests, then a correction proportional to this difference in impurity is required. In Table I, each effect, except the first (I), is the sum of eight differences between pairs of test results, and each such effect needs to have a correction proportional to the sum of the eight corresponding differences in impurity. The sums of these eight differences in impurity are given in Table III.

The amount of correction per unit difference in impurity can be found from a study of those effects which are independent of changes in temperature, throughput, concentration, and blocks; that is, from AD, BD, ABD, CD, ACD, and BCD. Figure 2 shows the graph of mean differences in product strength versus mean differences in impurity for the six groupings represented by these effects. There appears to be a tendency for the strength to decrease as impurity rises.

[Figure 2. Graph of mean differences in product strength versus mean differences in impurity for the six error effects]

In the equation

$$Y = mx$$

Y is the expected difference in product strength corresponding to a difference in impurity, x.

The constant, m, must be chosen so that the equation fits the data as nearly as possible; that is, so that $\sum (y - Y)^2$ is a minimum; y is the observed difference in product strengths, and the summation extends to all effects which are not influenced by temperature, throughput, concentration, or blocks.

$$\sum (y - Y)^2 = \sum (y - mx)^2 = \sum y^2 - 2m\sum xy + m^2 \sum x^2$$

This is a minimum if $m = \sum xy / \sum x^2$ and the minimum value is

$$\sum y^2 - m\sum xy$$

Numerical Computation. Table III, besides repeating the original observations for strength and impurity and recording as y and x, respectively, the results of analyzing these data, also includes the values of y², xy, and x², divided in each case by 16. The sum of the xy/16 column is identically equal to the sum of products of pairs of original observations just as Σy²/16 equals the sum of squares of strength observations and Σx²/16 equals the sum of squares of impurity observations. With an adequate calculating machine, these two sums of squares and the sum of products can be obtained in one calculation.

Calculation of m is now straightforward. For the six effects, AD to BCD inclusive, Σxy/16 = -227.500, Σx²/16 = 43.3750, and m = -227.500/43.375 = -5.244956. The sum of squares mΣxy/16 = 1193.22749, and the sum of squares of residual deviations from the regression line is Σ(y - Y)²/16 = Σy²/16 - mΣxy/16 = 1958.0 - 1193.227 = 764.773. There are five independent deviations of the six values of y from the line Y = mx, so the mean square deviation is 764.773/5 = 152.955.

The mean square deviation due to all factors not specifically accounted for is now 152.955. Previously, when purity changes were not specifically accounted for but were included in the residual variations, this mean square was 326.33. The reduction to 47% of the previous value looks useful. An accurate test is made by noting the reduction 1193.227 brought about in the sum of squares by fitting the single coefficient m; calculate the variance ratio, F = 1193.227/152.955 = 7.8. Tables of F for one and five degrees of freedom show that the chance of obtaining such a ratio experimentally in a properly randomized experiment in which there is no effect of gas purity on product strength, is less than one in 20.
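The fitting of the single coefficient m can be sketched as a least-squares line through the origin over the six error effects. The y and x values below are the column D totals for strength and impurity as read from Table III; the variable names are assumptions made for this illustration.

```python
# Column D totals for the six within block error effects (Table III)
y = {"AD": -44, "BD": -38, "ABD": -118, "CD": 90, "ACD": 70, "BCD": 32}  # strength
x = {"AD": 1, "BD": -3, "ABD": 17, "CD": -9, "ACD": -5, "BCD": -17}      # impurity

sum_xy = sum(x[k] * y[k] for k in y) / 16  # sum of mean products
sum_xx = sum(x[k] ** 2 for k in y) / 16    # sum of mean squares of impurity
sum_yy = sum(y[k] ** 2 for k in y) / 16    # sum of mean squares of strength

m = sum_xy / sum_xx                        # slope of Y = m x
regression_ss = m * sum_xy                 # reduction in sum of squares due to m
residual_ss = sum_yy - regression_ss       # deviations from the fitted line
residual_ms = residual_ss / (len(y) - 1)   # five independent deviations
f_ratio = regression_ss / residual_ms      # one and five degrees of freedom

print(m, residual_ss, residual_ms, f_ratio)
```

Because m is carried at full floating-point precision here, the last decimal places can differ slightly from the hand calculation, which rounds m before multiplying.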
Since there are also theoretical grounds for believing that gas purity would affect product strength in the direction found, there is every reason for adopting, as a working hypothesis, the idea that gas impurities lower product strength by about 5.245 units per unit of impurity. The last column of Table III shows the mean effects corrected for impurity and should be compared with the uncorrected means in Table I. The calculation for effect A is:

Uncorrected total                     -456
Correction = -mx = -(-5.245) × 5        26.22
Corrected total                       -429.78
Corrected mean                         -26.86

and other effects are calculated similarly.

Significance Test for Block Differences. The original suggestion regarding make-up gas purity included the possibility that the differences which were evident between the four sections (blocks) in which the experiment was performed were brought about by changes in make-up gas purity. A cursory examination of Table III suggested that this was a bad guess; the question was then put to a formal statistical test. The question was: If allowance is made for a general effect of make-up gas purity on product strength, will this account for the block differences, or will they still be too large to be accounted for by differences in make-up gas purity and residual errors?

Adopt, for the sake of discussion, the null hypothesis that there is no difference between blocks other than differences in make-up gas purity and residual error. This places the block difference effects, ABC, D, and ABCD, in the same category as the error effects, AD, BD, ABD, CD, ACD, and BCD; accordingly all nine should be used in determining the relation between product strength and make-up gas purity. For these nine effects Σy²/16 = 7162.0, Σxy/16 = -246.000, and Σx²/16 = 46.0625; hence m' = -246.0/46.0625 = -5.340570, and the sum of squares of residual deviations = 7162.0 - 5.340570 × 246.0 = 5848.220. This value, 5848.220, is the sum of the squares of eight independent discrepancies from the hypothesis that there are no inherent block differences. Previously the hypothesis that there were inherent block differences gave 764.773 as the sum of the squares of five independent discrepancies. The null hypothesis that there are no inherent block differences adds 5083.447 to the sum of squares and 3 to the degrees of freedom; that is, it adds a mean square of 1694.482. Residual error, on the other hand, provides a mean square of 152.955 for five degrees of freedom.

The variance ratio, F, equals 11.078 for three and five degrees of freedom, and such a ratio would occur by chance only about once in 80 tries. It is, therefore, unlikely that this occurrence is a chance effect. In other words, it is reasonably certain that there are inherent block differences which cannot be accounted for by variations in the make-up gas purity.
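The block test compares the residual sum of squares under two hypotheses: one regression fitted to all nine effects (no inherent block differences) against the earlier fit to the six error effects alone. The following is a minimal Python sketch, assuming only the figures quoted in the text. The variable names are illustrative and not from the paper.

```python
# Null hypothesis: block effects ABC, D, ABCD are pooled with the six error
# effects, so one slope is fitted to all nine. Sums are already divided by 16.
sum_yy_null = 7162.0
sum_xy_null = -246.0
sum_xx_null = 46.0625
n_effects_null = 9

m_null = sum_xy_null / sum_xx_null
ss_residual_null = sum_yy_null - m_null * sum_xy_null
df_null = n_effects_null - 1          # one degree of freedom spent on the slope

# Alternative hypothesis: inherent block differences exist; residual from the
# earlier regression on the six error effects (quoted in the text).
ss_residual_full = 764.773
df_full = 5

# Extra sum of squares attributable to forcing "no block differences".
extra_ss = ss_residual_null - ss_residual_full
extra_df = df_null - df_full

ms_blocks = extra_ss / extra_df
ms_error = ss_residual_full / df_full
f_blocks = ms_blocks / ms_error

print(f"m' = {m_null:.6f}")
print(f"extra SS = {extra_ss:.3f} on {extra_df} d.f.")
print(f"F = {f_blocks:.3f} on ({extra_df}, {df_full}) d.f.")
```

The same extra-sum-of-squares comparison applies to any group of effects that a hypothesis would fold into the error.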


Significance of Effect of Temperature. The first analysis gave the mean effect of raising catalyst temperature as -28.5 units for half the temperature change adopted. After refining the analysis by allowing for changes in make-up gas purity, the mean effect was reduced to -26.86. This new estimate, like the earlier one, is subject to error; its significance should be checked, and the limits between which the true value is likely to reside should be reassessed. The significance test may be carried out in the same manner as for block differences, but for single effects it is usually more convenient to use Student’s t test.

The corrected mean is made up of two parts, the uncorrected mean effect and the correction. The original observations are subject to errors from a population of errors whose variance, σ², is estimated as the residual mean square of 152.955. The mean effect has an error variance equal to σ²/16, which is estimated as 9.558. The factor, m, used in making corrections has an error variance of σ²/(Σx²/16), where Σx²/16 has the value 43.375 used in calculating m. The correction mx/16 made to the mean value of A has an error variance equal to (x/16)² × error variance of m, which is estimated as (5/16)² × 152.955/43.375, that is, 0.344. The error variance of the corrected mean is therefore estimated as 9.558 + 0.344 = 9.902, and its standard error is estimated as √9.902 = 3.19. This standard error is calculated from a variance estimated from only five degrees of freedom and is accordingly somewhat inaccurate.

Tables of Student’s t distribution show that in such circumstances an experimental result will once in 20 times depart from the truth by more than ±2.57 times the standard error, that is, by more than 8.20. Nineteen times out of 20 the deviation between the observed and true values will be less than 8.20, and there is, therefore, a 19 in 20 chance that the true value lies in the range -26.86 ± 8.20, or between -18.66 and -35.06. Similarly there is a 999/1000 chance that the true value lies within the range -4.98 to -48.74. There is slight possibility of the true value’s being zero, and the result is termed highly significant.

The original analysis, without allowing for variations in make-up gas purity, set the 19/20 limits as -17.4 to -39.6. The effect of the more refined analysis has been to narrow the limits. Since statistical methods are sound, the narrower limits lie entirely within the wider limits previously set. There is still a doubt of the order of 2:1 in the effect of changing catalyst temperature, although nonstatistical investigations would probably not realize it. If the whole experiment were repeated and conclusions were based on all 32 results, as could easily be done, it is anticipated that the range within which the true value of the effect of temperature is probably (19/20) to be found would reduce to something like -22.10 to -31.62. This range may, of course, be shifted up or down or modified somewhat in size, but it is unlikely to exceed the range -18.66 to -35.06, and the degree of uncertainty is likely to decrease from about 4/2 to about 3/2.

Gas Concentration. After allowing for the effect of make-up gas purity, the effect of make-up gas concentration becomes highly significant, and there is considerable evidence that increasing the concentration of the active constituent in the make-up gas does increase the product strength.

GOOD AND BAD EXPERIMENTING

Previous sections have dealt in almost minute detail with the methods of analysis of results obtained in a factorial experiment performed on process equipment in the chemical industry. It is now appropriate to take a wider view of what this factorial procedure achieved in this particular instance and of its ability in this general field of work. First, the experiment brought to light the fact that reaction temperature, gas concentration, and gas purity have an important influence on the product strength and gave a fair idea of the magnitude of these influences. In addition, it showed that the effect of throughput on product strength is certainly small and may be zero. Secondly, the experimental plan showed itself capable of giving all this information expeditiously. It proved sufficiently versatile to deal with afterthoughts such as the uncontrollable variations in gas purity. Thirdly, the experiment disclosed the magnitude of the experimental error and, by demonstrating a significant difference between blocks, showed that plant operation was not “in control.” These two results are of the same type as those reported by Huhndorff (3) in his reproducibility study. The absence of control, which is a quite usual phenomenon, is of importance for two reasons: its effect on the apparent accuracy of the results obtained, and the question it raises as to whether the results are of any value whatever.

Control. It has sometimes been argued that unless production is in control one can draw no valid conclusions by statistical methods. Lack of control, such as exists both in Huhndorff’s data and here, is characterized by operation for a short period with normal variations about a given mean, followed by a sudden change of mean and operation with normal variations about the new mean. When manufacturing is continued with this lack of control, the quality of product that will be manufactured cannot be predicted for any time in the future.
The situation in this factorial experiment is rather different. Here the changes in mean occur, or may occur, between one block and the next, and these changes cancel out as shown in Table II. This lack of control does not affect any of the effects which have been discussed in this article, though it does upset the mean result. This is not of much importance, since test results on one pilot plant can never give an absolute indication of the result to be expected in commercial scale operations.

So far it has been argued that lack of control from one block to the next does not in any way invalidate the conclusions drawn from the factorial experiment. However, there remains the possibility of trouble arising from lack of control within the one block. Shewhart (4) has pointed out the fundamental importance of order in quality control and has shown that if data which are out of control are randomized, the result must inevitably show control on any valid test. This principle is used in the factorial experiment; the four test conditions which are included in each block are carried out in random order, so that errors which may be out of control must inevitably be in control for any valid test such as those used. The conclusion is: If randomization within blocks is carried out effectively, the conclusions drawn are valid.

Confounding. Having decided that valid conclusions may be reached even in the absence of control in plant operations, one consequence of lack of control is worth discussing. Had the experiment not been divided into blocks, all sixteen tests would have been randomized and the between block errors, ABC, D, and ABCD, would have been added into the experimental errors. This gives in Table I a sum of squares of 7162 for nine degrees of freedom, or a mean square of 796. On dividing the experiment into blocks, the large between block errors are eliminated, leaving an error mean square of only 326. This halving of the error is as effective as doubling the size of the experiment, but it is obtained by forethought instead of by work. Production from a pilot plant does not bring in an income large enough to pay its running costs which, by contrast to those of laboratory scale work, are large. Because of this, pilot plants should be run as little as possible, and forethought, such as using confounded experimental designs divided into the smallest convenient blocks, should always be used. Another factor in the economy of reducing error is that the analysis of covariance reduced the error mean square from 326 to 153; this also is as valuable as doubling the size of the experiment.
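The three error mean squares mentioned in this paragraph can be set side by side. The sketch below uses only figures quoted in the text (7162 on nine degrees of freedom, 1958.0 on six, and the covariance-adjusted 152.955). The variable names are illustrative, and the "gain" ratios are simply ratios of mean squares, offered as one way to read the "doubling" remark.

```python
# Error sums of squares and degrees of freedom quoted in the text.
ss_no_blocks, df_no_blocks = 7162.0, 9   # block effects ABC, D, ABCD left in the error
ss_blocks, df_blocks = 1958.0, 6         # block effects removed by the design
ms_covariance = 152.955                  # after also adjusting for gas impurity

ms_no_blocks = ss_no_blocks / df_no_blocks
ms_blocks = ss_blocks / df_blocks

# Ratio of successive error mean squares. A ratio of 2 is what doubling the
# number of runs would buy at the larger error level.
gain_from_blocking = ms_no_blocks / ms_blocks
gain_from_covariance = ms_blocks / ms_covariance

print(f"no blocking:      {ms_no_blocks:7.1f}")
print(f"with blocking:    {ms_blocks:7.1f}  (gain {gain_from_blocking:.2f}x)")
print(f"plus covariance:  {ms_covariance:7.1f}  (gain {gain_from_covariance:.2f}x)")
```

Because the error variance of a mean effect is the error mean square divided by the number of runs, each ratio slightly above 2 corresponds to the precision that roughly twice as many runs would have given without that refinement.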
Good Experimenting. The marks of good experimenting, as illustrated by the factorial experiment described, are:

1. Realization that tests on a single pilot plant can never, of themselves, give information of known reliability as to the performance of a somewhat different full scale plant
2. Determination as expeditiously as possible of the effects of changes in a comprehensive range of variables, including the interactions of variables
3. Adequate randomization leading to the valid assessment of the likely errors in the effects determined
4. The use of small blocks and perhaps analysis of covariance in order to reduce residual error and increase accuracy and also to show whether there are other important factors which have not been investigated or controlled
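Marks 3 and 4, randomization and small blocks, can be illustrated with a short Python sketch that lays out sixteen runs in four blocks of four and randomizes the run order within each block. It assumes, from the effects the text names as block differences (ABC, D, and ABCD), that a block is identified by the signs of the ABC and D contrasts. The ±1 coding, the function name `build_blocked_design`, and the seed are hypothetical choices for illustration, not details taken from the paper.

```python
import itertools
import random


def build_blocked_design(seed=0):
    """Return a list of (block_number, (a, b, c, d)) in randomized run order.

    Factor levels are coded -1/+1. Blocks are defined by the signs of the
    ABC and D contrasts (an assumption inferred from the text), which also
    confounds their product ABCD with blocks.
    """
    rng = random.Random(seed)

    # Group the 16 treatment combinations into blocks.
    blocks = {}
    for a, b, c, d in itertools.product((-1, 1), repeat=4):
        block_key = (a * b * c, d)
        blocks.setdefault(block_key, []).append((a, b, c, d))

    # Randomize the order of the four runs inside each block.
    plan = []
    for block_no, key in enumerate(sorted(blocks), start=1):
        runs = blocks[key]
        rng.shuffle(runs)
        plan.extend((block_no, run) for run in runs)
    return plan


if __name__ == "__main__":
    for block_no, levels in build_blocked_design(seed=1951):
        print(block_no, levels)
```

In this layout each block contains two runs at each level of A, B, and C, so a shift in mean from one block to the next cancels out of those main effects, while randomizing within the block guards against drift inside it.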

These characteristics of good experimenting may be used as a basis for criticizing other experimental layouts. In the field of curing plastics Gore (2) described several such layouts. He points out the fallacy of varying one thing at a time, since this does not show up interactions which, in his field of study, are usually important. Gore also illustrates a design which involves a subtle but often important point in the treatment of error. A number of different compositions were made up, and from each batch eight test samples were prepared. Comparisons of the eight results from samples prepared from one batch gave a measure of error, but this error did not include the errors in making up a composition to specification. The distinction is akin to that between “within block” and “between block” errors. Any use of the error calculated as above for the purpose of assessing the significance of differences between different compositions is misleading and arises from a failure to appreciate the significance of randomization. Gore completes his survey by advocating the use of factorial designs, but does not give a detailed example.

Another approach, the uniformity trial, has been illustrated by Huhndorff (3). This approach leads to the measurement of error and the detection of lack of control (items 3 and 4). However, by concentration on only two aspects, mean and error, it encourages the fallacy that the practical reliability of this mean is shown by the error which is determined. The reliability of the mean is to be judged only by the irregularities that appear on changing the conditions, such as type and size of plant and operating procedure, which determine the mean. As these conditions are not changed in a reproducibility trial, the relevant errors are not observed.

By contrast with either of these approaches, nonstatistical approaches rely on either a complete or a partial disregard of the problem of error. Sometimes considerable attention is given to the reduction of analytical error, but far more important errors arising from nonuniformity in raw material or from imperfections in plant control are overlooked. There is never a valid assessment of the relevant errors actually occurring, and the reliability of the conclusions is judged solely by guess, which may be right but often is wrong.

In pilot plant work, the factorial approach gives everything that other methods can give plus a good deal more; this indicates that the first approach should almost always be through a factorial design and that other methods provide an economical way of experimenting only in special circumstances.

ACKNOWLEDGMENT

The author wishes to thank the management of Imperial Chemical Industries of Australia and New Zealand Ltd. for permission to publish this paper. He also expresses his indebtedness to W. E. Donnelley and R. H. Weldon, who were directly responsible for the plant operation and laboratory analyses in the work described.

LITERATURE CITED

(1) Fisher, R. A., and Yates, F., “Statistical Tables for Biological, Agricultural, and Medical Research,” 3rd ed., Edinburgh, Oliver and Boyd, 1949.
(2) Gore, W. L., Ind. Eng. Chem., 42, 320-3 (1950).
(3) Huhndorff, Roland F., Ibid., 41, 1300-3 (1949).
(4) Shewhart, Walter A., “Statistical Method from the Viewpoint of Quality Control,” Washington, D. C., Graduate School, U. S. Dept. of Agriculture, 1939.
(5) Yates, F., Imp. Bur. Soil Sci., Tech. Comm., 35 (1937).

RECEIVED November 20, 1950.