A Method of Visual Interactive Regression

Dec 12, 2006 - Web pages for links to a statistics site and interactive spread- sheets. 2. Coleman, W. F. Building Interactive Spreadsheets Page. http...
0 downloads 10 Views 127KB Size
Information



Textbooks



Media



Resources edited by

JCE WebWare: Web-Based Learning Aids

William F. Coleman Wellesley College Wellesley, MA 02481

Edward W. Fedosky University of Wisconsin–Madison Madison, WI 53715

A Method of Visual Interactive Regression by Michelle S. Kim, Stanford University, Palo Alto, CA 94309; Maureen Burkart and Myung-Hoon Kim*, Science Department, Georgia Perimeter College– Dunwoody Campus, Dunwoody, GA 30338; [email protected] Keywords: Analytical Chemistry; First-Year Undergraduate/ General; Computer-Based Learning; Chemometrics Requires Microsoft Excel, with Analysis ToolPak installed

It is often necessary in general chemistry courses for students to find the best-fitting line through a set of scattered data. The method of least squares is routinely used as the statistical tool to accomplish this. However, the process behind the least-squares method, minimizing the sum of the deviations squared, is not clear to beginning students. Blind use of the analytical formulas for a slope and a y-intercept is not pedagogically effective, and it does not satisfy analytically minded students. With this in mind, we have made the process of minimizing the sum visible by allowing the individual to adjust heights in a bar graph, thus making the process more interactive and dynamic. The interactive feature of Excel spreadsheet programs (1) is utilized; use of the spinner bar (2) is particularly helpful. The visualization process requires the following steps. Students prepare a table of X–Y data in an Excel worksheet and introduce another column to generate theoretical Y values (Ycalc ) with estimated arbitrary values of slope and intercept. The deviations squared and their sum are calculated in an additional column. The experimental Y and theoretical Y values are plotted against X in an Excel graph. Another graph in a separate chart is prepared to represent each of the deviations squared and their sum (SSQ, the last bar) in bar graph form. Finally, two spinner bars (2) are prepared in order to control the slope and intercept for Ycalc. At this point, students may click on the spinner bar for slope until the height for the sum of the squares (the last bar) no longer decreases. They do the same using the intercept spinner. Then, they alternate back and forth between the slope spinner and intercept spinner to repeat the steps until SSQ is minimized. This process was developed using data and results from Beer–Lambert law experiments. Sample calculations and graphs are summarized in Figure 1. The first two columns at the top of the figure are for the original X–Y data. The next three columns are for the calculated Y values (with arbitrary slope and intercept), deviations between the Y values, and deviations squared. The last three columns are for the calculated Y values, deviations, and deviations squared after minimization is complete. The last (and highest) bar in each bar graph represents the sum of deviations squared (SSQ); its value is also displayed under the bar graphs. The slope and intercept are 1884

Journal of Chemical Education



Figure 1. Visual Interactive Regression. By clicking on the two spinner bars at the bottom of the screen display, a student can minimize the sum of the squares of the deviations, which are shown in the bar graphs at the lower right.

incremented and decremented by 0.01 and 0.001, respectively. Since the spinner control for a cell value requires an integer increment, the base value has to be scaled with scaling factors of 0.01 and 0.001 for a fractional increment. In order to allow negative values in the intercept, 50 must be subtracted from the base value for the intercept. Ten trials were made with various starting values of slope and intercept, and the results are either equal to or very close to those calculated from the analytical formulas, 3.99 for slope and 0.014 for intercept. It should be noted that the results can be slightly different depending upon whether the slope or intercept is minimized first and upon the whims of clicking order on the spinner during the minimization. In seven out of ten trials, the method was able to find the global minimum. Only three trials yielded a local minimum that is close to the global minimum. Overall, the method was successful in yielding identical results to the analytical formulas for the slope and the intercept. Literature Cited 1. Coleman, W. F. Home Page. http://www.wellesley.edu/Chemistry/colemanw.html (accessed Oct 2006); see Course and Other Web pages for links to a statistics site and interactive spreadsheets. 2. Coleman, W. F. Building Interactive Spreadsheets Page. http:// www.wellesley.edu/Chemistry/Flick/chem231/excelforms1.htm (accessed Oct 2006) for the usage of a Spinner Bar.

Vol. 83 No. 12 December 2006



www.JCE.DivCHED.org