Two-Dimensional Convolute Integers for Analytical Instrumentation

Anal. Chem. 1982, 54, 1519-1524


Thomas R. Edwards
Space Sciences Laboratory, NASA/Marshall Space Flight Center, Alabama 35812

As new analytical instruments and techniques emerge with increased dimensionality, there exists a corresponding need for data processing logic which can appropriately address the data. Two-dimensional measurements show enhanced unknown mixture analysis capability as a result of the greater spectral information content over two one-dimensional methods taken separately. Two-dimensional convolute integers are merely an extension of the 1964 work by Savitzky and Golay (generically known as one-dimensional convolute integers) which is found ubiquitously in one-dimensional analytical techniques. These low-pass, high-pass, and band-pass digital filters are truly two-dimensional and can be applied in a manner identical with their one-dimensional counterpart, i.e., a weighted nearest-neighbor, moving average with zero phase shifting, convoluted integer (universal number) weighting coefficients. For noise filtering and spectral peak location in a two-dimensional data set, i.e., a GC/MS spectrum with intensity as a function of m/e and time, appropriate two-dimensional convolute integers applied to the data will both smooth the two-dimensional spectrum and unambiguously locate peaks as readily as their one-dimensional counterparts.

New analytical instrumentation and techniques with increased dimensionality are emerging. Knorr and Harris (1) report on time-resolved spectroscopic data in which intensity is a function of two-dimensional (2-D) properties: the number of wavelengths observed in each pulsed excitation of a mixture of fluorescence components and the number of time intervals at which the spectra are obtained. This increased dimensionality in their data yields improved resolution over convoluted one-dimensional spectra and subsequent enhanced identification of mixture components, without prior knowledge of the individual components. Lester, Lemkin, and Lipkin (2) report on a two-dimensional gel electrophoresis technique for increased resolution of complex mixtures of proteins. Two independent molecular parameters, net intrinsic charge and molecular size, provide the basis of their two-dimensional mixture analysis. The two-dimensional gel slab, once digitized, readily lends itself to all the techniques available in digital optical image data processing. Knorr, Thorsheim, and Harris (3) extended their two-dimensional analysis of highly overlapped spectral mixtures to GC/MS, again with improved mixture analysis capability as a result of the increased information content in their two-dimensional data. The dimensional aspects are intensity as a function of m/e values and time intervals. Two-dimensional data processing techniques (4-6), such as the one-dimensional methods developed by Savitzky and Golay (7) in 1964 and applied ubiquitously in all one-dimensional digital analytical work (8-11), need to be developed for these enhanced dimensionality instruments and techniques. Two-dimensional convolute integers (12) represent this extension of Savitzky and Golay’s work (generically described as one-dimensional convolute integers) and can be applied in a manner identical with their one-dimensional counterparts.

Furthermore, many two-dimensional techniques are two one-dimensional techniques applied orthogonally (13). In terms of surface fitting, this is equivalent to saying that a surface is capable of being orthogonally decoupled and the cross-product terms contain no information. This is theoretically incorrect. Two-dimensional convolute integers are truly two-dimensional and consider all the cross-product terms appropriate to a surface. Being a straightforward extension of the one-dimensional technique, all the well-established properties are still valid. Two-dimensional convolute integers are regression-generated, convoluted, integer coefficients for nonrecursive, zero phase shift digital filtering. The low-pass, high-pass, and band-pass characteristics are obtained in the fastest type of software logic, a weighted-nearest-neighbor moving average. Furthermore, the logic is readily implemented in video rate real-time hardwire logic (14, 15). Applying the logic to a noisy charge-coupled diode array detector output for spectral peak location (position and intensity) is completely equivalent to the one-dimensional case and allows use of the band-pass features inherent in a single set of multiply convoluted coefficients (16). In the case of 2-D gel techniques for protein analysis (2), high-resolution image data can be processed by an appropriate band-pass filter which is effectively noise smoothing and generating a first derivative of the image. The subsequent band-passed data can be searched for zero crossings and interpolated to an exact zero position indicating the location of a peak. Although this is a straightforward approach using a multiply convoluted set of coefficients, the technique is truly two-dimensional and can be applied with all the processing speed inherent in its one-dimensional counterpart. Furthermore, if spot size area is required, the gradient surface image can be generated. The gradient operator is a new operator and is readily obtained by proper combination of first derivative filter functions. The resulting image has rather well enhanced edges and contours. Spot size delineation after gradient surface generation, gray scale stretching, and thresholding becomes a much simpler task. Only the spot contour should now be seen in the image.
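As an illustration of how a gradient image might be built from first derivative filter functions, the following Python sketch applies two 3 × 3 first-derivative masks as weighted nearest-neighbor moving averages and combines the results. It rests on assumptions that the text does not state: the masks are the least-squares slopes of a plane fitted to a 3 × 3 window (weights -1, 0, 1 with normalizer 6), the "proper combination" is taken to be the Euclidean magnitude of the two derivatives, and border pixels are simply left at zero. The names `apply_mask` and `gradient_magnitude` are hypothetical.

```python
import math

# Assumed 3 x 3 first-derivative masks: least-squares slope of a plane
# fitted to the window. The normalizer 6 is the sum of squared offsets.
DX = [[-1, 0, 1],
      [-1, 0, 1],
      [-1, 0, 1]]
DY = [[-1, -1, -1],
      [0, 0, 0],
      [1, 1, 1]]
NORM = 6


def apply_mask(image, mask, norm):
    """Weighted nearest-neighbor moving average over interior pixels only."""
    half = len(mask) // 2
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            acc = 0
            for i in range(-half, half + 1):
                for j in range(-half, half + 1):
                    acc += mask[i + half][j + half] * image[r + i][c + j]
            out[r][c] = acc / norm
    return out


def gradient_magnitude(image):
    """Combine the two first-derivative images (assumed: Euclidean norm)."""
    gx = apply_mask(image, DX, NORM)
    gy = apply_mask(image, DY, NORM)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


# A tilted plane z = 2*col + 3*row has the same slope at every pixel.
plane = [[2 * c + 3 * r for c in range(5)] for r in range(5)]
grad = gradient_magnitude(plane)
```

On the tilted plane the interior gradient values equal the square root of 2² + 3², which gives a quick sanity check before applying the masks to real image data; thresholding the gradient image would follow as a separate step.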

THEORY

The only theoretical requirement placed upon the data is equal-interval spacing. The displacements between pixel elements in the X direction must all be equal and either equal to or a constant multiple of the pixel displacements in the Y direction, i.e., $\Delta X = h\,\Delta Y$, where h is a constant. (Polar coordinates are equally valid.) Four equivalent concepts need to be simultaneously considered while developing these coefficients (Table I). (1) Nonrecursive, nearest-neighbor, weighted moving average is equivalent to (2) the convolution or folding together of a local region in an image with a weighting function and then moving on to the adjacent region. These two concepts are equivalent to (3) fitting a local region with a surface of some order via regression calculation and replacing the raw pixel elements with the fitted pixel elements and repeating the operation on the adjacent region (an exceedingly time-consuming technique). All these techniques are equivalent to and satisfy the

This article not subject to U.S. Copyright. Published 1982 by the American Chemical Society


ANALYTICAL CHEMISTRY, VOL. 54, NO. 9, AUGUST 1982

Table I. Four Equivalent Concepts

(1) Nonrecursive, nearest-neighbor, weighted, moving smoothing average

$$z^{*}_{r,c}(x,y) = \sum_{i=-m}^{m} \sum_{j=-n}^{n} C_{ij}\, z_{ij}(x,y) / \mathrm{NORM} \qquad (1)$$

(2) Convolution: folding together a region

$$z^{*}_{r,c}(x,y) = \iint \Omega(\alpha,\beta)\, z(x-\alpha,\, y-\beta)\, d\alpha\, d\beta \qquad (2)$$

(3) Surface fitting: regression calculations

$$z_{k} = \sum_{i=0}^{m} \sum_{j=0}^{n} A_{ij}\, x_{k}^{i}\, y_{k}^{j} \qquad (3)$$

(4) Filtering

Low-pass filter:

$$\sum_{k=1}^{np} C_{k} / \mathrm{NORM} = 1 \qquad (4a)$$

for all $C_{k}$'s and NORM associated with the regression coefficient $A_{00}$.

High-pass filter:

$$\sum_{k=1}^{np} C_{k} / \mathrm{NORM} = 0 \qquad (4b)$$

for all $C_{k}$'s and NORM associated with the regression coefficients $A_{ij}$ where $i,j \neq 0$.

Table II. The Nearest-Neighbor Weighting Coefficients $C_{ij}$'s Are Convolution Coefficients

$$z^{*}_{r,c}(x,y) = \iint \Omega(\alpha,\beta)\, z(x-\alpha,\, y-\beta)\, d\alpha\, d\beta$$

A filter can be considered an operator which forms the filtered data $z^{*}_{r,c}$ by integrating the raw data over a weighting function $\Omega(\alpha,\beta)$. This integral is defined as the convolution of $z(x-\alpha,\, y-\beta)$ with $\Omega(\alpha,\beta)$. In a digital filter, the weighting function is expressed in terms of the Dirac delta function $\delta$, which represents the discrete sampling of the data over the region $\pm m$, $\pm n$. Substituting this discrete function and integrating over the delta function yields a nearest-neighbor weighted average.

Table III. The Regression Coefficients

The intensity $z^{*}_{r,c}(x,y)$, at row r, column c on the generalized surface of order m+n described by the regression coefficients $A_{ij}$, is expressed by

$$z^{*}_{r,c}(x,y) = \sum_{i=0}^{m} \sum_{j=0}^{n} A_{ij}\, x^{i}\, y^{j}$$

Let $z_{k}$ represent the actual intensity at the kth position on an image, and fit the generalized surface to the data set by means of the residual sum of squares,

$$S = \sum_{k=1}^{np} \Big( z_{k} - \sum_{i=0}^{m} \sum_{j=0}^{n} A_{ij}\, x_{k}^{i}\, y_{k}^{j} \Big)^{2},$$

where np is the number of points in the data set. Select the best-fit regression coefficients by setting $\partial S / \partial A_{uv} = 0$ for all u, v. Represent the normal equations in matrix form:

$$\mathbf{E}^{T}\mathbf{E}\,\mathbf{A} = \mathbf{E}^{T}\mathbf{Z}$$

Solving for the regression coefficients leads to a vector expression:

$$\mathbf{A} = [\mathbf{E}^{T}\mathbf{E}]^{-1}\,\mathbf{E}^{T}\mathbf{Z}$$

Table IV. The Nearest-Neighbor Convoluted Weighting Coefficients Are Regression Coefficients

Matrix representation of the nearest-neighbor convoluted weighting coefficients (a scalar expression):

$$z^{*}_{r,c} = \mathbf{C}\cdot\mathbf{Z} / \mathrm{NORM}$$

Matrix representation of the regression coefficients (a vector expression):

$$\mathbf{A} = [\mathbf{E}^{T}\mathbf{E}]^{-1}\,\mathbf{E}^{T}\mathbf{Z}$$

Note the similarity in form between the scalar and vector expressions.

Main theoretical consideration: at the center of the data mask, location 0,0, each regression coefficient $A_{ij}$ represents a filtered (fitted) intensity value,

$$A_{ij}\big|_{0,0} = z^{*}_{r,c}(0,0)$$

Making the appropriate substitutions leads to

$$A_{ij}\big|_{0,0} = \mathbf{C}\cdot\mathbf{Z} / \mathrm{NORM}$$

Each regression coefficient, evaluated at the center of the data mask, can be represented by a unique set of coefficients, C and NORM, independent of the data set, dependent only on the surface order and mask size (number of data points). NORM can be equated to the determinant of the cross-product matrix,

$$\mathrm{NORM} = \det[\mathbf{E}^{T}\mathbf{E}],$$

and $C_{ij}$ is

$$C_{ij} = \big([\mathbf{E}^{T}\mathbf{E}]^{-1}\,\mathbf{E}^{T}\big)_{ij} \cdot \mathrm{NORM}$$

two basic properties of (4) digital filtering and are readily implemented in hardwire logic. In sampled data theory, convolution coefficients are equivalent to the weighting coefficients used to obtain a nearest-neighbor average (Table II). Merely describing the convoluting function in digital sampled data form leads to the statement that the weighting coefficients in a nearest-neighbor average are convolution coefficients. That these convoluting weighting coefficients in a nearest-neighbor average are also regression coefficients may be a bit more difficult to see but, nonetheless, is a very straightforward result. Regression calculations are equally applicable to a curvilinear polynomial fitting a one-dimensional data stream or an arbitrary surface fitted to a matrix or two-dimensional set of data points (pixel elements). Performing the regression calculations for a generalized two-dimensional surface leads to a vector expression for the

regression coefficients $A_{ij}$ (Table III). The convolution coefficients $C_{ij}$ can now be equated to the regression coefficients by describing the nearest-neighbor weighting coefficients in matrix fashion (Table IV). Note the similarity in form between the scalar expression and the vector expression for the regression coefficients in Table IV. Recognize that at the center of the data mask, location 0,0, the regression coefficients represent an intensity value $A_{0,0}$ and all the partial derivatives $A_{ij}$ (where $ij \neq 0$) of the surface. This is the main theoretical consideration and is expressed in Table IV. The newly defined weighting coefficients and normalizer are seen to be universal sets of numbers, independent of the


Table V. Two-Dimensional Convolute Integers

5 × 5 filter mask, 2nd and 3rd order surface, $A_{00}$ smoothing filter

                 -2    -1     0     1     2   Row total
    -2          -13     2     7     2   -13      -15
    -1            2    17    22    17     2       60
     0            7    22    27    22     7       85
     1            2    17    22    17     2       60
     2          -13     2     7     2   -13      -15
Column total    -15    60    85    60   -15      175

Normalizer: 175. Smoothing filter property: $\sum C_{ij} / \mathrm{NORM} = 1$ (low-pass filter).

Table VI. Equivalences

Column headings: regression coefficient; surface order; i,j partial derivative.
Complementary transpose, T: $C_{ij} = C_{ji}^{T}$, for identical surface order.

Table VII. Symmetry Properties

Column headings: $i = j$, symmetric filter; $i \neq j$, nonsymmetric filter. Row headings: mask symmetry; coefficient symmetry.

data, and dependent only on the surface order and the data mask size. These new weighting coefficients and their associated normalizer are convolution coefficients derived from two-dimensional regression calculations and can be appropriately described as regression-generated two-dimensional convolute integers. The integer aspect of their description arises from the fact that only integer values are used in their calculation, i.e., data mask positions. Applying these coefficients to the data is equivalent to performing a two-dimensional surface fit, generating either a smoothed replacement point or a partial derivative at the center of the data mask, without performing the time-consuming calculations associated with setting up normal equations and taking matrix inverses. Of course, this is two-dimensional digital filtering.
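The regression route to the smoothing coefficients can be sketched in a few lines of Python. This is an illustrative reconstruction, not the author's program: it assumes the fitted surface contains every term $x^{i} y^{j}$ with $i + j \le 2$, uses integer mask positions from -2 to 2, and solves the normal equations with exact rational arithmetic. The function names are hypothetical, and the factor 175 is the normalizer listed in Table V.

```python
from fractions import Fraction


def design_matrix(half, degree):
    """Rows of [x**i * y**j] with i + j <= degree for every mask position."""
    powers = [(i, j) for i in range(degree + 1)
              for j in range(degree + 1 - i)]
    rows = []
    for y in range(-half, half + 1):
        for x in range(-half, half + 1):
            rows.append([Fraction(x**i * y**j) for i, j in powers])
    return rows


def solve(matrix, rhs):
    """Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(matrix)
    aug = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def smoothing_weights(half=2, degree=2):
    """Weights that return the fitted intensity A_00 at the mask center."""
    e = design_matrix(half, degree)
    terms = len(e[0])
    # Cross-product matrix E^T E of the normal equations.
    ete = [[sum(row[a] * row[b] for row in e) for b in range(terms)]
           for a in range(terms)]
    # First row of (E^T E)^-1, found by solving (E^T E) v = e_0;
    # this works because E^T E is symmetric.
    unit = [Fraction(1)] + [Fraction(0)] * (terms - 1)
    v = solve(ete, unit)
    size = 2 * half + 1
    flat = [sum(v[a] * row[a] for a in range(terms)) for row in e]
    return [flat[k * size:(k + 1) * size] for k in range(size)]


weights = smoothing_weights()
# Scale by the Table V normalizer to recover the integer coefficients.
integers = [[int(w * 175) for w in row] for row in weights]
```

Under these assumptions the scaled weights reproduce the integers of Table V, and the unscaled weights sum to exactly 1, which is the low-pass criterion of eq 4a. Once the integers are tabulated, none of this regression work needs to be repeated when filtering data.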

RESULTS

A Typical Filter. A typical noise smoothing filter for a second- and third-order surface 5 × 5 data mask is shown in Table V. Note the great deal of symmetry associated with the coefficients in the mask. Only one quadrant of coefficients is needed to uniquely specify a complete set of coefficients. This lends speed to the weighted, moving, smoothing algorithm needed to address the data. Data mask locations having two-dimensional convolute integers of equal value need only be added or subtracted prior to multiplication. Since addition of two integers is significantly faster than multiplication, considerable processing time is saved by utilizing all the symmetry properties available. Note that the center point in this filter mask contributes a greater influence than any other point in the resulting convolution.

Filtering. The concept of filtering expressed in Table I (eq 4a and 4b) is rather simply stated for this regression-generated, convoluted, weighted nearest-neighbor type averaging. A low-pass filter should pass a constant intensity value. A high-pass or band-pass filter should not pass the constant value. A low-pass filter is a noise suppressor or smoothing filter, whereas a high-pass filter is a roughing filter. By application of Cramer’s rule for the calculation of the individual regression coefficients in Table III, these filtering properties are readily satisfied, as can be seen by adding all terms in Table V and dividing by norm (175/175 = 1). The smoothing or low-pass filters result from fitting the data set to an arbitrary

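Table VII (symmetry properties), entries for the coefficient $C_{r,c}$ at row r, column c:

Coefficient symmetry: the coefficients at mirrored positions ($C_{-r,-c}$, $C_{-r,c}$, $C_{r,-c}$) equal $C_{r,c}$ multiplied by powers of $(-1)$.
Redundancy, for $i = j$ and $i \neq j$: 2-fold, 4-fold, or 8-fold, according to mask position ($r,c \neq 0$; $r,c = 0$; $r \neq c \neq 0$).
Zero terms: coefficients $C_{r,c} = 0$ occur when an index is odd.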

surface. The roughing or high-pass filters result from fitting the data set to a partial derivative of an arbitrary surface.

Equivalences, Symmetries, Multiplications. A number of equivalences exist between sets of filter coefficients for surfaces of adjacent orders and between the complementary transpose of high-pass filters for surfaces of the same order (Table VI). General symmetry properties exist for sets of filter coefficients associated with each regression term $A_{ij}$ (Table VII). These two sets of properties lead to a minimal number of unique filter coefficients per surface order. As a result, the overall speed with which these filtering operations can be carried out is optimized, and the computer storage of unique filter coefficients is minimized. These two properties lead to a minimal number of multiplications per filter (Table VIII). In Tables X-XII, coefficients are expressed in a compressed fashion as a result of the equivalences and symmetries described in Tables VI and VII. Only the upper left-hand quadrant of terms expressed in Table V is necessary to uniquely specify the filter.

Double Convolution: Peak Detection, Band-Pass Filter. The convolute integers generated so far have satisfied low-pass and high-pass filtering criteria. Combining these filters will result in band-pass filtering, or double convolution. The band-pass filter is effectively a smoothing filter convoluted with a first order or higher order partial derivative


Table VIII. Multiplications per Filter

Data mask of np points; number of multiplications, by i,j partial derivative:

Symmetric, $i = j$
    even, even: $(np + 1)(np + 3)/8$
    odd, odd: $(np - 1)(np + 1)/8$
Nonsymmetric, $i \neq j$
    even, even: $[(np - 1)(np + 1)/4] + 1$

Table X. Two-Dimensional Convolute Integers

9-point filter, 3 × 3 mask (upper left-hand quadrant, positions -1, 0)
    Surface order 0 & 1, function: smooth, $A_{00}$; coefficients 1 1 / 1 1; NORM 9
    Surface order 1 & 2, function: partial, $A_{01}$, $A_{10}^{T}$; coefficients -1 0 / -1 0; NORM 6

25-point filter, 5 × 5 mask
    Surface order, function