Problems in Statistical Design - Industrial & Engineering Chemistry

Ind. Eng. Chem. , 1959, 51 (12), pp 85A–86A. DOI: 10.1021/i650600a763. Publication Date: December 1959. Copyright © 1959 American Chemical Society...
I/EC Guidebook for Technical Management

STATISTICAL DESIGN

Problems in Statistical Design

New types of studies such as in weather control and screening large numbers of materials pose new problems in statistical design

by W. J. Youden, National Bureau of Standards

EDITOR'S NOTE. Dr. Youden, our valued author for six years, is turning over his bimonthly feature to a former associate at NBS, Dr. W. S. Connor, now with Research Triangle Institute in North Carolina. Our deepest thanks to Dr. Youden, and his thanks to I/EC readers for their interest and encouragement.

Chemical engineers may find some comfort in considering the difficulties that beset weather experimenters. At first glance, the weather problem seems simple by contrast with the complex problems that are encountered in chemical processes. Most chemical processes involve a number of variables. Furthermore, these variables in many cases do not act independently of each other. As if this were not enough, there is usually no one simple measure of the outcome. Percentage of theoretical yield will be important, but so may be the presence of unwanted side reaction products. Production time required per unit of acceptable product is also a factor that must be taken into consideration, at least in the economics of plant operation. By comparison, it would seem a simple matter to determine the effect of seeding clouds with a particular chemical.

The experimental unit is a day and not all days are suitable. The day must have storm or rain clouds present in the locality. The moisture must be present, or clearly the triggering action of the seeded chemical cannot operate. The only useful days are those when rain might have fallen without any encouragement. The fact that rain falls after seeding does not prove that seeding was responsible. The verdict must rest upon a comparison made between a series of control days and a series of seeded days, care being taken to be completely fair in the allocation of available days to the two series. So, first of all, on each day a decision has to be made regarding the suitability of the day for inclusion in the program. A second decision has to be made on whether to use the day as a control or, alternatively, submit the storm clouds to the chemical seeding.

The decision on whether or not a given day is suitable could easily be influenced, consciously or otherwise, by knowing whether the day would be a control day or a seeded day. One way to prevent influence of this kind would be to toss a coin after the day was picked and let the coin determine whether the day was to be a control or seeded day. In the long run this would insure a fair division of the days and provide the requisite basis for statistical evaluation. There are two important drawbacks to this simple solution. First, the capricious nature of random events might lead to a run of control days. Farmers might raise objections and complain that seeding would give their crops some much needed rain. Second, the frequency of suitable days is low, so that the entire program stretches over the seasons. This introduces seasonal variation in addition to the wide variation in inches of rainfall observed among rainfalls within a season.

The usual statistical design approach would pair off successive suitable days and let a coin decide whether the first day of the pair would be a control or a seeded day. But right away it is obvious that the second day of the pair will be assigned to the other series. The indispensable objectiveness in the selection of suitable days might be lost. One answer would be a very tight security imposed on personnel not to disclose what was actually done on the first day of each pair.

The evaluation of the rainfall itself requires a considerable number of widely scattered rain gages and critics may question whether these were adequately protected against tampering. This point is mentioned merely to point out that an experiment outside the protection of laboratory walls is vulnerable to extra hazards.

The rain-making experiment is especially good as indicating the role of a random or restricted random sequence of trials. This problem arises in pilot plant experiments. A strictly random order for a series of experiments with a cracking tower might require frequent changes of the catalyst packing between runs. The costs and inconveniences of strict adherence to statistical rules of the game have to be carefully weighed against possible penalties for their nonobservance.

Screening Experiments

There are two types of screening experiments that present very different problems. The experimenter may wish to explore a considerable number of different variables that may or may not influence the performance of a particular process. The purpose is to identify the important variables for more intensive study and to accomplish this with a preliminary screening program of rather limited scope. Fractional factorials have been widely used in this sort of enquiry. The October column gave an example of a weighing design study of seven factors with eight experiments.

A second type of screening problem has to do with continuing programs in which an indefinitely large number of materials are examined in turn with the hope that some really valuable material will turn up. The search for effective drugs calls for a regular program of testing new compounds as they become available. New insecticides and fungicides are discovered by the application of a persistent program of testing a great many materials. More potent strains of antibiotics are found by testing many strains that may be produced by radiation treatments of existing strains. For all these situations the outstanding characteristic is that really effective materials are rare and consequently a large number of materials must be examined. The testing cost in time and facilities is very considerable, even when a minimum test is applied to each material.

There is a nice question involved here. The test must be adequate to catch outstanding materials; otherwise these will escape detection. On the other hand, the more time spent on each test the fewer the tests that can be run with the available facilities and this will reduce the chance of finding a good material. The proportion of useful materials is involved in this question, but this proportion is not known. The usual recourse is a sequential procedure. All materials are given a simple and not particularly discriminating test. A small proportion of materials, those that show best on the first test, are given a second more thorough test to reduce the chance of overlooking a valuable material.

During the war, a screening problem was solved in connection with the medical examinations of service personnel. A Wassermann test was required of every individual entering the services, so the testing problem was of very large magnitude. Positive cases were relatively rare, of the order of 1 or 2% of cases.
The test method was sensitive, so the idea was advanced of pooling the blood of several individuals and performing one test on the composite group sample. If the result was negative, all contributing individuals were cleared. A positive test result meant that all the individuals in that group had to be re-examined.

The saving that resulted may be visualized by considering the number of tests that needed to be run if 100 individuals were tested in 50 groups of two. If there were two infected individuals among the 100, at most two of the pairs would need to be retested. Four individual tests would be required in addition to the 50 already run on the 50 pairs. The total number of tests would be 54: a saving of nearly one half over the 100 tests applied individually. If tests were run on groups of eight, a saving of 73% of the testing resulted when the prevalence rate was 2%.

The statistical design problem was to ascertain the optimum size of the groups for various assumed rates of the disease. Robert Dorfman reported this work in a short paper entitled "The Detection of Defective Members of Large Populations" [Ann. Math. Statistics 14, 436-40 (1943)]. The following optimum group sizes are from Table I in Dorfman's paper.

Prevalence Rate, %    Group Size    Reduction of Tests, %
        1                 11                80
        2                  8                73
        5                  5                57
       10                  4                41
       20                  3                18
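
The group sizes and reductions quoted above can be checked with a short calculation. The sketch below is not taken from Dorfman's paper or from this column. It assumes that infections occur independently at the stated prevalence rate, that a pooled test is positive whenever any member of the group is positive, and that every member of a positive group is then retested individually. Under those assumptions the expected number of tests per individual for a group of size n at prevalence p is 1/n + 1 - (1 - p)^n. All function names are illustrative.

```python
def expected_tests_per_person(prevalence: float, group_size: int) -> float:
    """Expected tests per individual under the assumed pooling scheme.

    One pooled test is shared by the whole group (1/n each); if the pool
    is positive, each member gets one individual retest.
    """
    if group_size == 1:
        # Testing singly needs no retest, so exactly one test per person.
        return 1.0
    prob_pool_positive = 1.0 - (1.0 - prevalence) ** group_size
    return 1.0 / group_size + prob_pool_positive


def optimum_group_size(prevalence: float, max_size: int = 50) -> int:
    """Group size (1..max_size) that minimizes expected tests per person."""
    return min(
        range(1, max_size + 1),
        key=lambda n: expected_tests_per_person(prevalence, n),
    )


def percent_reduction(prevalence: float, group_size: int) -> float:
    """Expected saving, in percent, relative to testing everyone singly."""
    return 100.0 * (1.0 - expected_tests_per_person(prevalence, group_size))


def worst_case_tests(population: int, group_size: int, infected: int) -> int:
    """Largest possible test count when each infected person spoils a
    different group. Assumes population is a multiple of group_size."""
    groups = population // group_size
    positive_groups = min(infected, groups)
    return groups + positive_groups * group_size


if __name__ == "__main__":
    print("prevalence %  group size  reduction %")
    for pct in (1, 2, 5, 10, 20):
        p = pct / 100.0
        n = optimum_group_size(p)
        print(f"{pct:>12}  {n:>10}  {percent_reduction(p, n):>11.0f}")
    # The column's example: 100 individuals in pairs, two infected.
    print("worst case, 100 in pairs, 2 infected:", worst_case_tests(100, 2, 2))
```

Under the stated assumptions the optimum sizes and rounded reductions agree with the tabulated values, and the worst-case count for 100 individuals tested in pairs with two infected comes to the 54 tests worked out in the text.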

The paper includes curves showing how the number of required tests depends on the group size. At low rates the efficiency changes relatively slowly in the neighborhood of the optimum size of group.

Apparently little use has been made of the principle employed in this screening technique. The test is just as appropriate for rare superior individuals as for rare defectives. The technique is not without possible complications in other applications. Consider testing compounds in pairs for use as insecticides. Usually a very low concentration of the active ingredient is requisite, so solubility interference is not likely. Of course, the two materials should not interact chemically under the conditions existing in the solvent. Of more importance is the joint action of the two materials.

The two materials may act independently, so that the observed kill is the sum of what would be obtained by individual tests on each material. Only very effective materials are of interest, so the combined action of ineffective materials would not rate attention. The materials might be antagonistic, so that one material destroys or impairs the effect of the other. This opens the door to a good material's escaping detection. There is also the possibility of a synergistic effect, that is, the combined action of the two materials is greater than the sum of the separate effects of the two materials. There is no chance of finding this if the materials are tested separately. So perhaps antagonism and synergism cancel each other out as possible losses and gains. The real gain will come in that the number of tests will be cut approximately in half.

Problems in Statistical Design

The two examples just discussed exemplify the diversity of the problems in statistical design. The traditional statistical approach has been to specify a fairly substantial program of experiments and defer interpretation until the program was completed. The major advantage of this practice is that it simplifies the interpretation of the results. The disadvantage lies in the inflexibility of such programs. Statistical design will surely move in the direction of tentative interpretations at various stages of completion of experimental programs. Sequential inspection schemes are now widely used, with considerable over-all savings in inspection costs. Future developments in statistical design will be more and more guided by the nature of problems encountered in experimentation. The parting message is this: Disclose your problems to the statisticians and give them a chance to come up with novel and useful designs.

Our authors like to hear from readers. If you have questions or comments, or both, send them via The Editor, I/EC, 1155 16th Street N.W., Washington 6, D.C. Letters will be forwarded and answered promptly.