An approach to computer-assisted drill in synthetic organic chemistry

This report details an approach to generating computer-assisted exercises in synthetic organic chemistry appropriate for introductory, undergraduate, ...
H. A. Clark, J. C. Marshall,¹ and T. L. Isenhour
University of North Carolina, Chapel Hill, 27514


The study of organic chemistry requires a comprehensive understanding of general principles and a considerable amount of drill. This report, which is a result of a general study of computer representations of organic molecules, details an approach to generating computer-assisted exercises that should be appropriate for the introductory undergraduate organic course. The techniques reported here and detailed in the appendix should allow any instructor with access to a time-sharing computer and a reasonable facility with any high-level language to produce programs that are flexible and appropriate to his courses.

There have been several reports of computer-generated drill in undergraduate organic chemistry. Rodewald, Culp, and Lagowski (1) presented students with numerous program modules, one of which allowed interaction with multi-step synthetic procedures. An alternate approach, using the PLATO system, has been described by Smith (2). This system begins by displaying a target compound and requests that the student supply a starting compound (with three or fewer carbons). The program then supervises the step-by-step conversion of the specified starting material to the target compound. Both of these approaches require elaborate hardware and software and are, unfortunately, beyond the resources and expertise of most smaller institutions. In contrast, Crain (3) has reported a simple system designed to generate interactive drill on aromatic synthesis that appears to be compatible with most hardware configurations.

Discussion

The technique we have developed is predicated on the premise that there is great pedagogical value in summarizing synthetic sequences in the manner shown in Figure 1 (6). We will use the information in Figure 1 to illustrate how any reaction summary of this kind can be made the basis of computer-assisted drill exercises. Any other reaction summary provided by the instructor could be used.
To simplify coding, the students are presented with a key for each compound. Information necessary for local computer operation would also be included on this key. A summary that could result from the reactions in Figure 1 is shown in the table. If the available computer has sufficient storage capacity, the literal representations of the

Figure 1. Reaction summary

Summary of Results from Reactions in Figure 1

Compounds Key

No.  Compound     Representation
 1   RH           RH
 2   ROH          ROH
 3   RNH2         RNH2
 4   R'CN         R*CN
 5   R'CH2OH      R*CH2OH
 6   RCH2OH       RCH2OH
 7   RMgX         RMGX
 8   RX           RX
 9   R'CONH2      R*CONH2
10   R'CO2H       R*CO2H
11   RCHO         RCHO
12   RCO2H        RCO2H
13   RCHOHCH2R    RCHOHCH2R
18   RC≡CR        RCTCR
19   RCH=CHR      RCDCR
20   RCH2CH2R     RCH2CH2R

compounds and reagents may be stored and used by the interactive program. If storage of this information is not possible, the numerical representations are handled directly. The reaction pathways in Figure 1 are input in the form of an adjacency matrix A (4). For 20 compounds, A would be a 20 × 20 array for which $a_{i,j}$ is one if a reaction converting compound i to compound j exists and is zero if no such reaction exists. The first two and the last two rows of the adjacency matrix necessary to represent the data in Figure 1 are

00000000000010000001
00000000000000000000

Note that $a_{2,8} = 1$, which indicates the possible conversion of compound 2 to compound 8 (ROH → RX). The mathematical details of the path-tracing techniques used are given in the appendix.

The first program, which maintains all necessary information core resident, requires somewhat less than 8,000 words of storage² and could be made smaller if necessary.
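The adjacency representation described above can be sketched in a few lines. The fragment below is an illustration in Python rather than the Fortran IV of the programs reported here; the function names `build_adjacency` and `is_one_step` and the reaction list `REACTIONS` are hypothetical, and only the pair 2 → 8 (ROH → RX) is taken from the text.

```python
def build_adjacency(n_compounds, reactions):
    """Return an n x n matrix with a 1 wherever a one-step conversion is listed.

    `reactions` holds (reactant, product) pairs numbered from 1, as in the
    compound key, so compound k is stored at index k - 1.
    """
    a = [[0] * n_compounds for _ in range(n_compounds)]
    for reactant, product in reactions:
        a[reactant - 1][product - 1] = 1
    return a


def is_one_step(a, reactant, product):
    """True when the matrix records a direct conversion reactant -> product."""
    return a[reactant - 1][product - 1] != 0


# Illustrative reaction list: 2 -> 8 (ROH -> RX) is the conversion cited in the
# text; the other pairs are assumed here only to populate the example.
REACTIONS = [(2, 8), (8, 7), (12, 6)]
A = build_adjacency(20, REACTIONS)

print(is_one_step(A, 2, 8))  # conversion listed
print(is_one_step(A, 8, 2))  # reverse direction not listed
```

Because the matrix is directed, an entry at (2, 8) says nothing about (8, 2); a reversible conversion would have to be listed in both directions.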


¹ Permanent address: Department of Chemistry, Saint Olaf College, Northfield, Minnesota 55057.
² All programs were written in Fortran IV and developed on a Raytheon 706 computer with 16,000 words of core memory. The system disc or magnetic tape was used when mass storage was required.

Volume 50, Number 9, September 1973 / 645

CAN WE START WITH RMGX TO MAKE RCOCH2R
YES
THAT'S RIGHT, WILLIAM
HOW MANY STEPS DO YOU THINK ARE NECESSARY?
01
NO. THE LEAST NUMBER OF STEPS IS 2
GIVE ME CODE FOR 1 INTERMEDIATES. ONE TO A LINE
03
THAT WASN'T PERFECT. DO YOU WANT TO TRY AGAIN, WILLIAM?
YES
GIVE ME CODE FOR 1 INTERMEDIATES. ONE TO A LINE
13
NOW GIVE ME CODE FOR 2 REAGENTS. ONE TO A LINE
08
01
THAT WASN'T PERFECT. DO YOU WANT TO TRY AGAIN, WILLIAM?
NO
HERE'S THE PATH, WITH REAGENTS
STEP NO    REAGENT    PRODUCT
START                 RMGX
           RCH2CHO    RCHOHCH2R
           (O)        RCOCH2R
DO YOU WANT TO TRY ANOTHER SYNTHESIS?
NO
GOODBYE. COME AGAIN SOON, WILLIAM

Figure 4. Typical student-computer interaction involving Program 2

Figure 2. General flow diagram.

This type of program is very easy to code and treats only one-step conversions. A general flow diagram of this program is shown in Figure 2. The representations of the compounds and reagents are input as arrays so that this information can be retrieved by subscript reference. For example, consideration of the conversion from compound 2 to compound 8 (ROH → RX) would cause retrieval of the representations of the two compounds from positions 2 and 8 of a vector and retrieval of the appropriate reagent from position (2, 8) of an array. Finally, the adjacency matrix representation of the reaction sequence is input.

Apart from the syntax of the interactive portion, the program need only be elaborate enough to draw two random numbers scaled to the subscript range of the adjacency array, ask the student whether the conversion suggested by the subscripts is possible, and subsequently issue appropriate replies. Alternatively, the input can be arranged so that instructor-designated reactants and products will be chosen. The student's answer can be checked immediately by simply determining whether the conversion considered, e.g., from 2 to 8, shows $a_{2,8} \neq 0$. Figure 3 gives a typical record of a student-computer exchange. The use of


the students' names in some of the replies adds a personal touch which apparently is appreciated by students (5).

The second program is a slightly more complex version of Program 1 and requires mass storage. This program allows multi-step paths and interacts with the student concerning the identity of both the intermediates and the reagents involved in each of the steps. The input demands are the same as for Program 1: representations of compounds, reagents, and the adjacency representation of the reaction list. This program will always find the shortest path(s). Again, the student-computer dialogue is initiated by having the computer randomly select a starting material and a target compound and then interact with the student on intermediate steps and reagents if a path exists. A typical student-computer interaction involving Program 2 is reproduced in Figure 4. This program, written with no real attempt to conserve storage, required mass storage for arrays and about 10,000 words of core storage.

The final program, the most complex of the three, accepts the same input data as the above programs and also requires mass storage. A flow chart of this program is given in Figure 5. This program asks the student to specify the conversion he wishes to carry out and allows the

HELLO, I AM ONE-STEP. WE WILL WORK ON ONE-STEP REACTIONS
WHAT'S YOUR NAME, PLEASE? (PAD TO 12 SPACES WITH BLANKS)
WILLIAM
IS R'CH2OH TO R'CO2H A ONE-STEP REACTION?
YES
VERY GOOD, WILLIAM
WHAT IS THE CODE FOR THE REAGENT?
06
VERY GOOD, WILLIAM
SHALL WE TRY ANOTHER REACTION?
YES
IS RCTCR TO RCOCH2R A ONE-STEP REACTION?
NO
NO. THERE IS A REACTION
WHAT IS THE CODE FOR THE REAGENT?
02
NO. I WOULD USE H2O
SHALL WE TRY ANOTHER REACTION?
YES
IS RNH2 TO R'CONH2 A ONE-STEP REACTION?
YES
I DO NOT SEE A ONE STEP REACTION
SHALL WE TRY ANOTHER REACTION?
NO
GOOD-BYE. COME AGAIN SOON

Figure 3. Typical record of a student-computer exchange for Program 1.
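The logic of the one-step drill in Program 1 (draw two subscripts, ask whether the conversion exists, then check the reagent) might be sketched as follows. This is a minimal Python illustration, not the Fortran IV program itself. The three-compound key, the reagent codes "02" and "05", and the function names are all assumptions made for the example, and subscripts here start at 0 rather than 1.

```python
import random

# Hypothetical three-compound key; only ROH -> RX is taken from the text.
NAMES = ["ROH", "RX", "RMGX"]
A = [
    [0, 1, 0],  # ROH -> RX
    [0, 0, 1],  # RX -> RMGX (assumed for illustration)
    [0, 0, 0],
]
# Placeholder reagent codes, stored at the same (i, j) position as the reaction.
REAGENTS = [
    [None, "02", None],
    [None, None, "05"],
    [None, None, None],
]


def pick_question(a, names, rng=random):
    """Draw two subscripts at random and phrase the one-step question."""
    i = rng.randrange(len(a))
    j = rng.randrange(len(a))
    return i, j, f"IS {names[i]} TO {names[j]} A ONE-STEP REACTION?"


def grade_yes_no(a, i, j, student_says_yes):
    """The answer is right when it agrees with whether a[i][j] is non-zero."""
    return student_says_yes == (a[i][j] != 0)


def grade_reagent(reagents, i, j, code):
    """Compare the student's reagent code with the one stored at (i, j)."""
    return reagents[i][j] is not None and reagents[i][j] == code


i, j, prompt = pick_question(A, NAMES, random.Random(1973))
print(prompt)
```

Passing a seeded `random.Random` instance makes a session repeatable, which may be convenient when an instructor wants every student to see the same sequence of questions.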

646 /Journal of Chemical Education

Figure 5. Flow chart for Program 3.

student to interact with it, giving up and being told the answer at any time if he wishes. A typical sequence of interactions with the program is shown in Figure 6. As all possible paths are allowed, the coding is more extensive than in previous programs, requiring about 12,000 words of core storage as we have written it. Again, no particular effort has been made to minimize core storage.

The programs presented are not intended as finished software packages but are meant to illustrate the flexibility of this approach. This is a general approach that should allow individual instructors to produce computer-assisted drills, tests, and summaries of reaction sequences in a convenient way. While we do not mean to imply that drill of the type provided by the programs we have illustrated is of pre-eminent importance in the study of organic chemistry, it is certainly an aspect of the study where computer use should be considered.

GIVE ME CODES FOR STARTING CPD & PROD. E.G. "14 06"
01 06
DO YOU THINK A RXN PATH EXISTS?
YES
THERE IS A PATH
CAN YOU THINK OF THE NEXT STEP FORWARD?
YES
GIVE ME THE CODE FOR THE NEXT CPD. E.G. "01"
11
THIS RXN IS NOT IN MY CATALOGUE
CAN YOU THINK OF THE NEXT STEP FORWARD?
NO
CAN YOU THINK OF THE NEXT STEP BACKWARD?
YES
GIVE ME THE CODE FOR THE NEXT CPD. E.G. "01"
12
RIGHT. NOW RGT CODE. E.G. "01". TO CONVERT RCO2H TO RCH2OH

Appendix

The complete adjacency matrix representation of the synthetic information in Figure 1 is as follows

[Matrix A: the 20 × 20 adjacency matrix for Figure 1]


Figure 6. Typical sequence of interactions for Program 3

Information can be derived not only from the adjacency matrix but also from matrices generated by raising A to powers. The existence of paths of one or more steps may be determined directly, and intermediates may be determined by a decoding process described later. In general, for $A^{(n)}$, $a_{i,j}^{(n)}$ is the number of n-step paths that exist between the vertices $v_i$ and $v_j$. It should be noted that for all powers of A, $a_{i,i}^{(n)}$ is set equal to zero. This amounts to a partial elimination of cyclic and/or redundant paths. The adjacency matrix representation is general for any level of connectivity between the reactants, bounded by the case where each reactant-product sequence is isolated from all others and the extreme case where every point is reachable from any arbitrary starting point.

Given an adjacency matrix representing any set of organic reactions, the shortest path between any two vertices, $v_i$ and $v_j$, is found by determining the lowest power of A for which $a_{i,j}^{(n)} \neq 0$. (The numerical value of the first non-zero value of $a_{i,j}^{(n)}$ is the number of n-step paths that exist.) The existence of paths longer than the minimum path can be determined by computing further orders of the adjacency matrix. Paths two or more steps longer than the minimum path must be checked for redundancy.

A computer trace of a path, once its existence and length have been verified in the manner noted above, is performed by working backward from the final product to the starting material. Hence, if $a_{i,j}^{(n)} \neq 0$, then there exists some $a_{i,k}^{(n-1)} \neq 0$ for which $a_{k,j} \neq 0$. This identifies the compound represented at the vertex $v_k$ as a precursor of $v_j$, the target compound. Next, there exists some $a_{i,l}^{(n-2)} \neq 0$ for which $a_{l,k} \neq 0$. This identifies the compound represented at vertex $v_l$ as a precursor of $v_k$ in the reaction sequence. This process is continued until the path is complete. The process noted above is completely general and could, in theory, be applied to any stepwise process.
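The procedure just described, finding the lowest power of A with a non-zero (i, j) element and then tracing backward from the product, can be illustrated with a short Python sketch. It is offered as one possible reading and not as the authors' Fortran IV code. The names `powers` and `shortest_path` and the four-compound example are hypothetical, subscripts start at 0, and each power is assumed to be formed as the previous (diagonal-zeroed) power multiplied by A.

```python
def mat_mult(x, y):
    """Plain integer matrix product of two square lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def zero_diagonal(x):
    """Set every diagonal element to zero, as the text prescribes for all powers."""
    for i in range(len(x)):
        x[i][i] = 0
    return x


def powers(a, max_n):
    """Return {n: A^(n)} for n = 1..max_n (assumed: A^(n) = A^(n-1) times A)."""
    result = {1: zero_diagonal([row[:] for row in a])}
    for n in range(2, max_n + 1):
        result[n] = zero_diagonal(mat_mult(result[n - 1], a))
    return result


def shortest_path(a, i, j):
    """Return one shortest path from i to j as a list of vertices, or None."""
    m = len(a)
    p = powers(a, m - 1)
    for n in range(1, m):
        if p[n][i][j] != 0:  # lowest power with a non-zero (i, j) element
            break
    else:
        return None
    path, target = [j], j
    for step in range(n, 1, -1):
        # Working backward: find k reachable from i in step - 1 moves
        # that is converted to the current target in one move.
        for k in range(m):
            if p[step - 1][i][k] != 0 and a[k][target] != 0:
                path.append(k)
                target = k
                break
    path.append(i)
    return path[::-1]


# Hypothetical four-compound example: 0 -> 1 -> 2 -> 3 plus a shortcut 0 -> 2.
A = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(shortest_path(A, 0, 3))
```

The sketch returns only the first precursor found at each stage; collecting every k that satisfies the test would enumerate all shortest paths instead.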
With large lists of reactions, storage requirements become unrealistically high. To save storage, powers of A could be generated as needed; however, this would be very inefficient and, in the case of large reaction lists, not fast enough for an interactive program. The use of a mass storage device is necessary on most small computers.

A convenient way to organize mass storage files is suggested by the general path-tracing procedure noted above. In tracing a path from $v_i$ to $v_j$, one is always interested in the ith row of the powers of the adjacency matrix. Hence, to use mass storage efficiently, files of rearranged arrays are created as follows. Assume a system with m entries, i.e., there are m compounds of interest and the adjacency representation is an m × m array. For this system, compute and store a set of matrices $P^i$, $i = 1, \ldots, m$, such that

$$p^{i}_{n,j} = a^{(n)}_{i,j}$$

³ It should be noted that the rearranged matrices can be computed directly from the adjacency matrix at considerable saving of core storage.

The utility of the rearranged matrices³ can be illustrated by examining a path from $v_i$ to $v_j$. $P^i$ is loaded from the mass storage device and the elements of the jth column are examined for the first non-zero element. The row index of the first such element, n, is the minimum number of steps in the conversion. The path is then traced in the same way as noted above; the important point is that A and $P^i$ contain all the information required to complete the path trace. Note that once the adjacency matrix, A, and the rearranged matrices, $P^i$, are computed and stored, all problems associated with the reactions coded in A can be efficiently solved.

Acknowledgment

The authors gratefully acknowledge Professor R. L. McKee for his helpful discussions. Research supported by the National Science Foundation.

Literature Cited

(1) Rodewald, L. B., Culp, G. H., and Lagowski, J. J., J. CHEM. EDUC., 47, 134

(4) Harary, F., "Graph Theory," Addison-Wesley, Reading, Mass., 1969.
(5) Jahn, B. L., "A Computer-Assisted Program in American History, 1870-1921," Proceedings of the 1972 Conference on Computers in Undergraduate Curricula, p. A19

(6) Cram, D. J., and Hammond, G. S., "Organic Chemistry," (1st Ed.), McGraw-Hill, New York, 1959.
