Desktop Modeling and Computational Chemistry: The Future of PC's as Mini-Mainframes

Eric R. Taylor
Department of Chemistry, University of Southwestern Louisiana, Lafayette, LA 70504

Robert Sonnier and Brian Dore
Computing Center, University of Southwestern Louisiana

Journal of Chemical Education, Volume 72, Number 3, March 1995

Since the development of the computer at the close of WWII, computers have assumed an ever-growing importance in government, commercial, scientific, and engineering endeavors. Traditionally, physically large computer systems generally referred to as mainframes have been the mainstay of scientific computing. Nothing currently approaches their capacity and speed for crunching the vast array of numbers fed and generated by scientific and engineering programs. Workstations have become popular at many institutions, replacing direct dependence upon mainframes. However, workstations, depending upon your viewpoint, are much like desktop computers connected to servers such as VAX, PYRAMID, or SUN machines with multiuser shared software access. Costs are less for workstations compared to mainframes but not necessarily less than personal or desktop computers (PC's), depending upon the extent of peripheral hardware and available multiuser licensed software. Workstations, though an attractive alternative to mainframes, may still be higher in initial start-up costs for individual department budgets for these reasons. Additionally, workstations, in sharing software and some system hardware, leave the user dependent in some way upon the decisions of others concerning hardware and software availability, upgrade, and change-outs. With the increase of PC's, the question naturally arises: can PC's eventually replace mainframes and serve as an effective computing machine for the large programs so common to scientific and engineering research?

Many chemists have introduced microcomputers into various courses. This reflects ease of use and the greater experience students have working with PC's than with mainframes. Molecular mechanics calculations performed over two lab sessions (1), quantum dynamical calculations (2), and molecular modeling (3) for organic chemistry students note the success of such approaches in chemistry education.

Of the numerous reports of microcomputer use in chemical education, some use BASIC language and some FORTRAN. Time required for such calculations and the programs used generally are tailored for completion within a lab period. I report here the results of an investigation into the use of a PC for the execution of large, previously mainframe-based programs used in chemical investigations.

Rationale

Present day technology provides PC's with capabilities that rival earlier mainframes. The advantages of a PC are

(1) small initial cost for hardware
(2) vast array of compatible software
(3) ease of upgrade or expansion
(4) ease of portability
(5) small physical size
(6) no requirement for specialized installation and housing as with mainframes
(7) independence from mainframe alterations or change-outs
(8) less costly upkeep

Additionally, in the present economic environment, mainframe upkeep, licensing, etc. may force economizing within facilities or future planned facilities. This is particularly the case at smaller or less financially endowed institutions.

As for scientific computing, various language compiler programs are on the market including FORTRAN compilers. Also, previous mainframe-based statistical packages (SAS, SPSS) are now available for PC's. This software availability for PC's extends to graphics packages such as DISSPLA by Computer Associates (8). Disadvantages are described below. The requirements and specifics follow.

Hardware

The hardware used in this laboratory was a CompuAdd 80386SX, 33 MHz IBM compatible PC with 1 MB VGA graphics and 4 MB RAM. For efficiency and utilization of the software required, the machine as a minimum must be an 80386SX, 33 MHz. The machine used in this laboratory had a 200 MB hard drive with one 360 KB/1.2 MB capable 5.25-in. drive and one 720 KB/1.44 MB capable 3.5-in. drive. Optimal efficiency requires a math coprocessor compatible with the 80386, 33 MHz machine.

Software

Software requirements include Microsoft Powerstation 32-bit FORTRAN (F32). It has several advantages over Microsoft's 16-bit FORTRAN version 5.1. First, it is in color and, hence, various critical regions of FORTRAN space appear color coded for ease of viewing and troubleshooting. Second, it permits expanded memory over the MS-DOS limitations through the use of Microsoft Windows 3.1. Third, error messages are accessible through window keying. Though F32 can run at the DOS level, it is much easier to use and affords greatly expanded capabilities through the Windows 3.1 interface. Thus, Windows 3.1 is also a necessary software component. Windows 3.1 also has a requirement for MS-DOS 3.0 or greater, while F32 requires MS-DOS 3.3 or greater. The system used in this study employed MS-DOS 5.00.

Many of the results of molecular modeling research ultimately translate into visual portrayal of structural features. Graphics packages are significant tools in molecular modeling studies. The graphics package previously used [...] under development. DISSPLA is a vector graphics software with many capabilities beyond discussion here. A sample plot of a DNA molecule fragment (see figure) provides evidence of the potential quality of the PC version package for researchers needing graphics capabilities on any PC application.

Figure. A stereographic projection of a 5'-GGCC-3' tetramer duplex segment of DNA viewed into the major groove. This plot was done using the PC beta test DISSPLA software. Coded as a POSTSCRIPT file, it was plotted on a Hewlett Packard LaserJet III printer using a GHOSTSCRIPT driver since the beta test version of DISSPLA at the present time lacks an HP LaserJet III driver.

Table 1 summarizes the software and size allocations.

Table 1. Software and Size (Bytes) Requirements

   Software                                 Size (Bytes)
1. MS-DOS 5.0                                  2,057,147
2. Windows 3.1                                 9,608,241
3. FORTRAN 32 Bit Powerstation                 7,241,913
4. GAGNAS                                      1,216,496
5. MNDO 353                                      454,689
6. DISSPLA                                     4,842,416
   Total Disk Size Used (Installation)        25,420,902

Procedure

The first research program used in this study was GAGNAS, developed by Kenneth J. Miller of RPI in 1979 (4) and modified in 1984 and 1989 (5, 6). The second program considered was MNDO 353 (7). MNDO 353 presents a commercially available program for ready comparisons of chemical structures. For ease of comparison, several small molecules with differing numbers of electrons served as test species for time of execution analysis of MNDO 353 on F32.

The GAGNAS and MNDO 353 programs originally were installed on a mainframe IBM 3090/200. They were downloaded to a 1.44 MB 3.5-in. disk in ZIPPED (compacted) form. They were installed on the 80386, unzipped, and utilized with the F32 system.

The PC F32 compiler appears less forgiving of FORTRAN protocols than the mainframe-based FORTRAN. As an example, the compiler will not allow out of order statements such as CHARACTER statements before COMMON statements. The IBM 3090 VS FORTRAN software did not key on or object to the order or random mixing of these and other statements such as DIMENSION, REAL or INTEGER and LOGICAL.

Another peculiarity of F32 FORTRAN is the nesting of DO loops. On the mainframe, DO-loop nesting is not ambiguous. Thus, the following is permitted:

      DO 10 M = 1,7
      DO 10 N = 1,7
      DO 10 J = 1,7
   10 CONTINUE

F32 objected to such a labeled nest of DOs. The nest must be modified to afford each loop its own unique termination statement, such as:

      DO 10 M = 1,7
      DO 20 N = 1,7
      DO 30 J = 1,7
   30 CONTINUE
   20 CONTINUE
   10 CONTINUE

There may be other such oddities with F32; however, this laboratory has not yet stumbled upon them with the existing programs used.

Table 2. Summary of F32 Program

    File Identities         Size (Bytes)   PC Time       IBM 3090 MVS Elapsed Time
 1. GAGNAS (.obj)              1,278,890   10m 45s
 2. GAGNAS.EXE load                        34s
 3. Test Search B-DNA                      2m 23s
 4. MNDO 353 (.obj)              341,428   2m 25s
 5. MNDO 353.EXE load                      3s
    Execute MNDO 353:                      CPU + I/O:    CPU + I/O:
 6. for H2O                                40s           3.44s
 7. for CH4                                2m 25s        2.25s
 8. for C2H4                               27m 50s       9.28s
 9. for C6H6                               40m 53s       48.58s
10. for C10H8                              5h 25m 5s     5m 31.47s


Table 3. Time Ratios of Execution for Increased Numbers of Electrons

n   Species   PC Time Ratios, $t_{n+1}/t_n$   IBM Time Ratios, $t_{n+1}/t_n$
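The two kinds of ratio used in this study, the PC-to-mainframe ratio for each species and the ratio of successive execution times on one machine, can be recomputed from the MNDO 353 timings in Table 2. The following is a minimal Python sketch, not part of the original work. The assignment of each timing to a species and a machine is an assumption reconstructed from the table, and the function names are hypothetical.

```python
# Recompute execution-time ratios from the MNDO 353 rows of Table 2.
# ASSUMPTION: the (species, PC time, IBM time) assignments below are
# reconstructed from the table as printed; they are not independent data.

def seconds(h=0, m=0, s=0.0):
    """Convert hours, minutes, and seconds to seconds."""
    return 3600 * h + 60 * m + s

# (species, PC CPU + I/O time in s, IBM 3090 CPU + I/O time in s)
timings = [
    ("H2O",   seconds(s=40),           3.44),
    ("CH4",   seconds(m=2, s=25),      2.25),
    ("C2H4",  seconds(m=27, s=50),     9.28),
    ("C6H6",  seconds(m=40, s=53),     48.58),
    ("C10H8", seconds(h=5, m=25, s=5), seconds(m=5, s=31.47)),
]

def pc_to_ibm_ratios(rows):
    """Return t_PC / t_IBM for each species."""
    return {name: pc / ibm for name, pc, ibm in rows}

def successive_ratios(rows):
    """Return (label, PC t_{n+1}/t_n, IBM t_{n+1}/t_n) for adjacent species."""
    result = []
    for (a, pc_a, ibm_a), (b, pc_b, ibm_b) in zip(rows, rows[1:]):
        result.append((f"{a}->{b}", pc_b / pc_a, ibm_b / ibm_a))
    return result

if __name__ == "__main__":
    for name, ratio in pc_to_ibm_ratios(timings).items():
        print(f"{name:6s} t_PC/t_IBM = {ratio:6.1f}")
    for label, pc, ibm in successive_ratios(timings):
        print(f"{label:12s} PC = {pc:5.2f}   IBM = {ibm:5.2f}")
```

With these inputs the first successive step (H2O to CH4) gives about 3.6 on the PC and about 0.65 on the IBM, in line with the 3.6 and 0.7 quoted in the text.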

Results

Table 2 summarizes the results for the programs under study. The compilation/link times as well as run times are functions of the specific program's physical length and of the complexity of operations performed by the program at execution. Execution of GAGNAS entailed a search of a limited region of DNA conformational space, requiring the program to iterate specific base-pair orientations and deoxyribofuranose puckers. Details of the program's operation appear elsewhere (5, 6). A specific run on the 80386 searching for B-DNA structure in a limited domain required 2m 23s. The same (record) run on the IBM 3090 mainframe required just under 2s. Thus, the first significant disadvantage of PC modeling arises as an increased execution time of about 70 fold.

Table 2 lists the chemical species run with MNDO 353 on both the PC and the IBM 3090/200 mainframe. As an average, the PC requires about 100 times the time for execution that the mainframe requires to run MNDO 353. The minimum time ratio, $t_{PC}/t_{IBM}$, was for H2O, the maximum, for C2H4. Table 3 illustrates the difficulties in predicting the time of execution on the PC system versus the mainframe as indicated by the ratio $t_{n+1}/t_n$, where n is a structure. For example, when n = 1 stands for H2O and n + 1 = 2 stands for CH4, then on the PC, $t_{n+1}/t_n$ = 3.6 and on the IBM, $t_{n+1}/t_n$ = 0.7. Though the execution (CPU) time steadily increased on the mainframe for greater numbers of electrons in the species, it is inconsistent on the PC. I currently cannot explain the differences in PC ratio trends compared to the mainframe ratios.

The 80486 machines now run at 66 MHz via clock doubling technology (80486DX2). New clock tripler CPUs are now available and will permit operation at approximately 99 MHz (80486DX4). Clock doubling and tripling makes the CPU two to three times faster. The I/O speeds remain the same. I/O bound program execution speed will not increase, since the PC I/O bus runs at 8 MHz.
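The point that a faster CPU clock does not speed up I/O-bound work can be made concrete with the standard Amdahl's-law estimate. This is a minimal Python sketch and not an analysis from this study. The I/O fractions in the loop are hypothetical values chosen for illustration, since no CPU versus I/O breakdown was measured here, and the function name is an assumption.

```python
# Amdahl's-law estimate: only the CPU-bound share of the run time benefits
# from clock doubling or tripling; the I/O-bound share is unchanged.
# ASSUMPTION: the I/O fractions used below are illustrative, not measured.

def overall_speedup(io_fraction, cpu_speedup):
    """Overall speedup when the CPU share runs cpu_speedup times faster.

    io_fraction is the share of the original run time spent waiting on I/O
    (between 0 and 1); cpu_speedup is 2 for clock doubling, 3 for tripling.
    """
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must lie between 0 and 1")
    if cpu_speedup <= 0:
        raise ValueError("cpu_speedup must be positive")
    cpu_fraction = 1.0 - io_fraction
    return 1.0 / (io_fraction + cpu_fraction / cpu_speedup)

if __name__ == "__main__":
    for io_fraction in (0.0, 0.25, 0.5, 1.0):
        doubled = overall_speedup(io_fraction, 2)
        tripled = overall_speedup(io_fraction, 3)
        print(f"I/O share {io_fraction:4.2f}: "
              f"doubling x{doubled:.2f}, tripling x{tripled:.2f}")
```

A fully CPU-bound program gains the whole factor of two or three, while a fully I/O-bound program gains nothing, which is the limiting case described above.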
Alternate I/O bus approaches are available such as Microchannel (IBM), EISA, VL, and PCI (Intel). Thus, the second disadvantage, the bottleneck in execution time, arises from constraints on the local I/O bus rather than from the speed of the chip (33 MHz or 66 MHz). The new Pentium CPU by Intel (9) runs at 100 MHz, and the Power PC by Motorola (10) runs about three times faster than the Pentium CPU. These processors rival the performance of high-end workstations.

A third disadvantage of MS-DOS environment programming is the limitation of single tasking. Once a program is submitted for execution, MS-DOS does not permit multi-tasking. This trait is a function of the operating system (such as MS-DOS). To some extent, multi-tasking can be performed with Windows. Other PC operating systems like OS/2 and the UNIX type (LINUX, AIX) currently handle multi-tasking very well. Memory as a problem in execution also is operating system based. MS-DOS operating systems need to be specially configured to function beyond the 640K RAM limitation. Non-MS-DOS operating systems have no such constraint.

Conclusions

It appears that the future for conducting extensive, large scale molecular calculations on PC's is promising though at present not realistically practical in all applications. There is no question that operation of these programs on a 486, Pentium, or Power PC machine would result in improved execution times. Clearly, for the complex and vast numerical processing involved in quantum mechanical calculations, the 33 MHz or 66 MHz PC is not yet competitive with the mainframe. For other less demanding applications, the PC may offer a viable alternative to the expensive mainframe CPU time, priority classes, and the diminishment of turnaround execution/output time for results at the host facility's printer. The output availability advantage for small run-times may be real compared to the turnaround time of the mainframe printer facility, which can vary from 15 min to a few hours, depending upon the policy of the mainframe facility.

If the execution time and usable memory on a PC are not a factor on future machines, the cost effectiveness of a PC system becomes a real advantage. This may be a real issue for smaller institutions that cannot afford the expensive mainframes, software systems, licensing, and support upkeep costs. To a lesser extent, the same may be true for workstations. Currently, one can pick up IBM compatible 80486 machines with 100 MB to 340 MB hard drives, 1 MB RAM, and required software to perform FORTRAN-based computations for under $2500. Some systems offer a 60 MHz Pentium CPU, 1 GB hard drive storage, 16 MB RAM memory, and a 17 in. Super VGA monitor on local bus with 2 MB video RAM for under $5000. Additionally, independence from a mainframe or from the shared software and hardware of workstations offers its own advantages. Finally, for attracting students into computational chemical research, the PC has a real familiar "feel" that certainly no mainframe offers. The pace of advance in PC technology is such that by the time you read this, the capabilities will probably surpass what is noted here.

Acknowledgment

The author (ERT) thanks Kenneth J. Miller for a copy of the earlier version of the GAGNAS modified and used in this laboratory, and Phyllis Totaro of the USL Computing Center for JCL configuration of MNDO 353 on the IBM 3090/200 MVS system.

Literature Cited

1. Simpson, J. M. J. Chem. Educ. 1989, 66, 406-407.
2. Tanner, J. J. J. Chem. Educ. 1990, 67, 917-921.
3. Box, V. G. S. J. Chem. Educ. 1991, 68, 662-664.
4. Miller, K. J. Biopolymers 1979, 18, 959-980.
5. Taylor, E. R.; Miller, K. J. Biopolymers 1984, 23, 285?-2878.
6. Bonnett, M.; Taylor, E. R. J. Biomolec. Struct. & Dyn. 1989, 7, 127-149.
7. Quantum Chemistry Program Exchange, Chemistry Department, Indiana University, Bloomington, IN 47405.
8. Computer Associates International, Inc., 10505 Sorrento Valley Rd., San Diego, CA.
9. Intel Corp., Literature Package X130, 1-800-955-5599.
10. Motorola Corp., Power PC Microprocessor Update, 1-800-845-MOTO.
