
Anal. Chem. 1982, 54, 414-416


Splines under Tension for Gridding Three-Dimensional Data

Hal R. Brand and Jack W. Frazer*

Lawrence Livermore National Laboratory, University of California, Livermore, California 94550

By use of the splines-under-tension concept, a simple algorithm has been developed for the three-dimensional representation of nonuniformly spaced data. The representations provide useful information to the experimentalist when he is attempting to understand the results obtained in a self-adaptive experiment. The shortcomings of the algorithm are discussed as well as the advantages.

A fully automated apparatus can provide the capability for self-adaptive experimentation, in which an approximate minimum amount of data is obtained to provide the required information. In such experimentation the data sets often consist of nonuniformly spaced data. More traditional methods for three-dimensional plots of such data are not good representations of the process under study. Therefore, we have developed a rather simple representation utilizing the splines-under-tension concept. The advantages and problems associated with this representation of three-dimensional data are presented.

Three-dimensional data are obtained in many experimentation-based activities across a wide diversity of engineering and scientific endeavors. Traditionally such data have been reduced to sets of two-dimensional plots as aids to obtaining the required information from the experiment. With the advent of inexpensive mini- and microcomputers with low-cost mass storage devices, experimentation can often be performed under computer control, thus providing an economical means of capturing three-dimensional data. In addition, this inexpensive intelligence will allow the scientist to use nontraditional experimentation techniques, most of which will result in the acquisition of multidimensional data.

For ease of data interpretation it is often desirable to display three-dimensional data sets as a two-dimensional projection or a contour plot. In addition, a two-dimensional projection of four-dimensional data where the fourth dimension is displayed on a time axis can aid data interpretation. Graphics programs suited to these tasks are becoming very common. However, nearly all such programs are based upon having data obtained from samples taken at uniform intervals of the independent variables, i.e., gridded data. In many applications, it is impossible or economically infeasible to acquire uniform data. What is needed is an interpolatory technique that performs well on nonuniformly spaced data.

Such a technique could have many uses. The primary use for us has been in the development of self-adaptive experimentation methods (1). All experimentation is limited by the measurement accuracies. Thus, there is a "natural" gridding of data based on the measurement accuracy and the range over which each variable is measured. If measurements are made only at the intersections of this natural grid, nonuniform data can be displayed using currently available graphics programs. A number of interpolatory techniques have been developed, some of which are proprietary and sold commercially (2, 3). Almost all are based on true three-dimensional interpolatory schemes. In this paper, we explore the use of a two-dimensional interpolant (splines under tension) applied directly to three-dimensional data.

EXPERIMENTAL SECTION

Splines have been used quite extensively for two-dimensional interpolation for many years. The common spline is the cubic spline, in which a different cubic polynomial describes the data over each interval. The coefficients of these cubic polynomials are determined by continuity conditions at each of the interior data points and by end point conditions. A common end point condition is to force the second derivative to be zero.

The simplest spline is the linear spline or straight line interpolant. In this case, a straight line describes the data over each interval. Although this is not a very sophisticated interpolation scheme, it is very simple to understand and apply, requires very little computation, and has small storage requirements. Also, the linear spline does not exhibit the deplorable habit of introducing "extraneous" inflection points into the interpolant. As will be shown later, both the cubic and linear splines can produce a poor representation of nonuniform data.

The spline under tension, developed by Schweikert (4), allows the user to make a trade-off between the very smooth cubic spline, with its wiggles, and the piecewise linear spline by adjusting a "tension" parameter. The spline under tension uses interpolatory functions based on the solution to the differential equation

$$(D^2 - p^2)\,D^2 y = 0 \qquad (1)$$

where $D = d/dx$ and $p$ is the tension parameter. The differential equation was derived from the physics of an elastic beam subject to lateral forces and pulled under tension. The general solution to this differential equation is

$$y = a_1 + a_2 x + a_3 \sinh(px) + a_4 \cosh(px) \qquad (2)$$

Thus, the spline under tension requires that the four coefficients $a_1$, $a_2$, $a_3$, and $a_4$ be determined over each interval. In addition, we require that the spline under tension pass through the data points and that it have continuous first and second derivatives.
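As an illustration of the conditions just listed, the following Python sketch fits a one-dimensional spline under tension. It is not the authors' program: it uses an equivalent per-interval form of eq 2 parameterized by the second derivatives at the data points, solves the resulting tridiagonal continuity equations with a dense NumPy solve, and takes the two end slopes as inputs. The function name `tension_spline`, its arguments, and the default end slopes (the slopes of the first and last chords) are assumptions made for this example.

```python
import numpy as np

def tension_spline(x, y, p, d0=None, dn=None):
    """Return a callable s(t) that interpolates (x, y) with tension p > 0.

    d0, dn are the end-point first derivatives; if omitted, the slopes of
    the first and last chords are used (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x) - 1                       # number of intervals
    h = np.diff(x)                       # interval widths
    slope = np.diff(y) / h               # chord slopes
    d0 = slope[0] if d0 is None else d0
    dn = slope[-1] if dn is None else dn

    # Coefficients of the first-derivative continuity equations.
    alpha = 1.0 / h - p / np.sinh(p * h)
    beta = p / np.tanh(p * h) - 1.0 / h
    gamma = p**2 * slope

    # Unknowns z[i] are the second derivatives s''(x[i]).
    A = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    A[0, 0], A[0, 1] = beta[0], alpha[0]
    rhs[0] = gamma[0] - p**2 * d0        # prescribed slope at x[0]
    for i in range(1, n):                # continuity of s' at interior points
        A[i, i - 1] = alpha[i - 1]
        A[i, i] = beta[i - 1] + beta[i]
        A[i, i + 1] = alpha[i]
        rhs[i] = gamma[i] - gamma[i - 1]
    A[n, n - 1], A[n, n] = alpha[n - 1], beta[n - 1]
    rhs[n] = p**2 * dn - gamma[n - 1]    # prescribed slope at x[n]
    z = np.linalg.solve(A, rhs)

    def s(t):
        t = np.asarray(t, dtype=float)
        # Index of the interval [x[i], x[i+1]] containing each t.
        i = np.clip(np.searchsorted(x, t, side="right") - 1, 0, n - 1)
        a = x[i + 1] - t
        b = t - x[i]
        curved = (z[i] * np.sinh(p * a) + z[i + 1] * np.sinh(p * b)) / (
            p**2 * np.sinh(p * h[i]))
        linear = ((y[i] - z[i] / p**2) * a
                  + (y[i + 1] - z[i + 1] / p**2) * b) / h[i]
        return curved + linear

    return s
```

On each interval the returned function is a combination of 1, x, sinh(px), and cosh(px), so it has the form of eq 2. The sketch evaluates sinh directly, so very large products of tension and interval width would overflow; a production routine would need a scaled formulation.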
With these constraints, we need merely specify the first derivative at the end points to obtain all the necessary coefficients. By adjustment of the tension parameter, the spline under tension can be made to approach a cubic spline or approach a linear spline. To see this, examine the differential equation given in eq 1. In the limit as p goes to zero, the solution goes to a cubic polynomial. In the limit as p goes to infinity, the solution approaches a polynomial of degree one. Such behavior is very gratifying since it agrees with our physical intuition, i.e., as we subject the elastic beam to more and more tension (increase p), the beam eventually becomes straight.

In order to apply splines under tension to gridding of nonuniformly spaced data, we imposed two constraints. First, to avoid extrapolation, it is required that measurements be made at the four corners of the experimental design. This constraint proved to be of little trouble in our application. The second constraint was that the nonuniform data points had to be taken at the grid points of the natural grid that results from instrument accuracy and variable range (1). This constraint places only minor restrictions on the experimentation.

© 1982 American Chemical Society

ANALYTICAL CHEMISTRY, VOL. 54, NO. 3, MARCH 1982

Figure 1. Experimentally measured substrate vs. pH response surface for the alkaline phosphatase catalyzed conversion of p-nitrophenyl phosphate to p-nitrophenol.

Given nonuniformly spaced data properly constrained, the algorithm proceeds in a very straightforward way. First, all the grid lines through the two-dimensional independent variable space are considered. The number of measured data points that exist on each grid line is counted, and the grid positions of the data points are marked as known. The grid line that contains the largest number of measured data points with both its end points known is then selected for interpolation. In case of ambiguity, the first grid line encountered is selected for interpolation. A spline under tension is then fit to all the known points on the grid line. The values of the unknown grid points are then interpolated and, for subsequent calculations, are marked as known. The selection and interpolation process is repeated until all grid points are known. Note that it is the number of data points on a grid line that determines when the grid line will be interpolated, not the number of known points. Note also that the spline under tension is fit to all known points, not just to the data points. This assures a continuous and smooth surface.
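The selection and interpolation loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `grid_by_lines`, the boolean `measured` mask, and the pluggable `interp1d` argument are hypothetical, and the default one-dimensional interpolant is NumPy's piecewise linear `np.interp`, which corresponds to the high-tension limit rather than to a finite tension. Rows are scanned before columns, so "the first grid line encountered" means the lowest-numbered row, an ordering the text does not specify.

```python
import numpy as np

def grid_by_lines(z, measured, interp1d=np.interp):
    """Fill a 2-D grid by repeated 1-D interpolation along grid lines.

    z        : 2-D array holding measured values where `measured` is True
    measured : boolean mask of measured data points (corners required)
    interp1d : callable (t, x_known, y_known) -> values at t; assumed
               interface, defaulting to linear interpolation
    """
    z = np.array(z, dtype=float)                 # work on a copy
    measured = np.asarray(measured, dtype=bool)
    nrow, ncol = z.shape
    if not measured[[0, 0, -1, -1], [0, -1, 0, -1]].all():
        raise ValueError("the four corner points must be measured")

    known = measured.copy()
    # Every grid line as (axis, index): all rows first, then all columns.
    lines = [(0, r) for r in range(nrow)] + [(1, c) for c in range(ncol)]

    def along(arr, line):
        axis, k = line
        return arr[k, :] if axis == 0 else arr[:, k]   # a view, not a copy

    while not known.all():
        best, best_count = None, -1
        for line in lines:
            k = along(known, line)
            # Skip finished lines and lines whose end points are unknown.
            if k.all() or not (k[0] and k[-1]):
                continue
            # Rank by the number of MEASURED points, not known points.
            count = int(along(measured, line).sum())
            if count > best_count:               # strict: first line wins ties
                best, best_count = line, count
        k = along(known, best)
        vals = along(z, best)
        pos = np.arange(len(k))
        # Fit to all known points on the line, fill in the unknown ones.
        vals[~k] = interp1d(pos[~k], pos[k], vals[k])
        k[:] = True                              # mark the whole line as known
    return z
```

Any callable with the same `(t, x_known, y_known)` signature, such as an evaluator for a spline under tension at a chosen tension, can be passed as `interp1d` in place of the linear default.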


RESULTS AND DISCUSSION

The spline-under-tension gridding technique was applied to the generation of enzyme response surfaces from sparse, nonuniformly spaced data (1). To test the technique, we used data obtained from complete 16 × 16 enzyme response surfaces, selecting a small subset of the 256 points. These selected points were then used to interpolate for the remaining points in the 16 × 16 grid.

Figure 1 shows a substrate vs. pH response surface for the alkaline phosphatase catalyzed conversion of p-nitrophenyl phosphate to p-nitrophenol. The surface contains 256 measured data points. These measured values contain measurement noise. Three surfaces obtained from the same nine data points but with varying values of the tension parameter are shown in Figure 2. The nine data points used to determine the surface are indicated by diamonds.

Figure 3 shows the surface obtained from 19 data points obtained in a simulated self-adaptive experiment (1). This interpolated surface is within 3% of the data surface. The difference between the smooth interpolated surface and the noise-perturbed data surface is shown in Figure 4. In Figure 5, the tension parameter has been greatly increased so that the representation approaches that obtained by a linear spline.

Figure 6a shows the surface obtained with the four corner points, the four end line midpoints, the center point, and 10 randomly chosen points. Although this surface also has 19 points, it is markedly different from the surface obtained from the 19 points chosen by the self-adaptive experimental algorithm (Figure 3). Only when the tension parameter is increased from 1 to 5 does the surface (Figure 6b) begin to resemble the underlying data surface (Figure 1). This instability is the principal disadvantage of this surface fitting technique. The self-adaptive experimentation algorithm (5) has been designed so that distorted representations such as the one shown in Figure 6a do not occur.
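A difference surface of the kind shown in Figure 4 can be computed by pointwise subtraction of the measured surface from the interpolated one. The Python sketch below is illustrative only; the function name `difference_surface` and the choice to express the largest deviation as a fraction of the largest measured value are assumptions, since the text does not say how the 3% figure was normalized.

```python
import numpy as np

def difference_surface(interpolated, measured):
    """Return (difference grid, largest |difference| / largest |measured|)."""
    interpolated = np.asarray(interpolated, dtype=float)
    measured = np.asarray(measured, dtype=float)
    diff = interpolated - measured          # interpolated minus measured
    # Normalization by the peak measured value is an assumption of this sketch.
    worst = np.abs(diff).max() / np.abs(measured).max()
    return diff, worst
```

With two 16 × 16 arrays, the second return value can be compared directly against a tolerance such as 0.03.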
Another shortcoming of the spline-under-tension gridding technique is that additional data points often have only very local effects on the gridded surface. The problem occurs when a data point is introduced near grid lines that contain many data points. Since the grid line interpolation order is based upon the number of data points on a grid line, the algorithm first interpolates along those grid lines with the most data points. This has the effect of fixing the neighboring points of the new data point to their original values prior to this last measurement. Thus, the change in the surface is only very local to the new data point, since it can only propagate its information past the fixed neighboring points via changes in the derivatives of the surface at the fixed points and not through changes in the "height" of the surface.

Figure 2. Interpolated response surfaces from nine data points with tension parameters: (a) 1.000, (b) 5.000, and (c) 10.000.

CONCLUSIONS

The spline under tension is a large improvement in two-dimensional interpolation over the cubic spline. With today's inexpensive computers, the slight extra computation required is of little consequence. In addition, the spline under tension requires only about one-fourth the storage requirements of cubic splines. With interactive graphics, the experimenter can easily adjust the tension parameter until the spline under tension is satisfactory, an option not available with cubic splines.

The use of the spline under tension for three-dimensional interpolation is quite useful but has some limitations. It is only useful when the nonuniformly spaced data can be collected on an underlying grid. In addition, the gridding algorithm presented here is often insensitive to some of the data points.

Figure 3. Response surface interpolated from 19 data points and 1.000 tension parameter, chosen by self-adaptive experimentation heuristics.

Figure 4. Difference surface generated by pointwise subtraction of the measured response surface from the 19 point and 1.000 tension parameter, interpolated response surface.

Figure 5. Response surface interpolated from 19 data points with 10.000 tension parameter.

Figure 6. Interpolated response surfaces generated from the nine original points and 10 randomly chosen points with tension parameter: (a) 1.000, (b) 5.000.

LITERATURE CITED

(1) Frazer, Jack W. Am. Lab. (Fairfield, Conn.) 1981, 60-78.
(2) Barnhill, R. E. In "Mathematical Software III"; Rice, J. R., Ed.; Academic Press: New York, 1977; pp 69-120.
(3) McLain, D. H. Comp. J. 1974, 17 (4), 318-324.
(4) deBoor, C. "A Practical Guide to Splines"; Springer-Verlag: New York, 1978; Vol. 27, pp 235-276.
(5) Frazer, J. W.; Balaban, D. J.; Brand, H., unpublished work, performed at the Lawrence Livermore National Laboratory and discussed incompletely in Am. Lab. (Fairfield, Conn.) 1981, 13 (4).

RECEIVED for review October 29, 1981. Accepted November 30, 1981. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract No. W-7405-ENG-48. This document was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor the University of California nor any of their employees makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial products, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or the University of California. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government thereof, and shall not be used for advertising or product endorsement purposes.