The New Physician as Unwitting Quantum Mechanic: Is Adapting Dirac’s Inference System Best Practice for Personalized Medicine, Genomics, and Proteomics?

Barry Robson*

IBM Global Innovation and Information Based Medicine, Somers Route 100, New York 10589, Department of Biostatistics & Epidemiology, St. Matthews University School of Medicine, Grand Cayman, and The Dirac Foundation, BrookStreet Des Roches LLP, 1 Des Roches Square, Witney, Oxfordshire UK, OX28 4LF

Received February 21, 2007

What is the Best Practice for automated inference in Medical Decision Support for personalized medicine? A known system already exists as Dirac’s inference system from quantum mechanics (QM), using bra-kets <A|B>, bras <A|, and kets |B>, where A and B are states, events, or measurements representing, say, clinical and biomedical rules. Dirac’s system should theoretically be the universal best practice for all inference, though QM is notorious for sometimes leading to bizarre conclusions that appear not to be applicable to the macroscopic world of everyday human experience and medical practice. It is here argued that this apparent difficulty vanishes if QM is assigned one new multiplication function @, which conserves conditionality appropriately, making QM applicable to classical inference including a quantitative form of the predicate calculus. An alternative interpretation with the same consequences is that every i = √(-1) in Dirac’s QM is replaced by h, an entity distinct from 1 and i and arguably a hidden root of 1 such that h² = 1. With that exception, this paper is thus primarily a review of the application of Dirac’s system, using linear algebra in the complex domain to help manipulate information about associations and ontology in complicated data. Any combined bra-ket can be shown to be composed only of the sum of QM-like bra and ket weights c(<A|) and c(|B>), times an exponential function of Fano’s mutual information measure I(A; B) about the association between A and B, that is, an association rule from data mining. With the weights and Fano measure re-expressed as expectations on finite data using Riemann’s Incomplete (i.e., Generalized) Zeta Functions, actual counts of observations for real-world sparse data can be readily utilized. Finally, the paper compares identical character, distinguishability of states, events, or measurements, correlation, mutual information, and orthogonal character, which are important issues in data mining and biomedical analytics, as in QM.
Keywords: medical inference • probabilistic inference • expert system • decision support • predicate calculus • quantum mechanics • Dirac • bra • ket

* To whom correspondence should be addressed. E-mail: robsonb@us.ibm.com.

Journal of Proteome Research 2007, 6, 3114-3126

Published on Web 07/03/2007

10.1021/pr070098h © 2007 American Chemical Society

1. Background

1.1. Diversity and Dilemma. The rise of personalized medicine increasingly embraces clinical, genomic, proteomic, and sophisticated imaging data (refs 1-4). The abundance, variety, high dimensionality, and underlying probabilistic nature of these data (refs 5-6) create an increasing need for information technology to aid the physician and biomedical researcher. However, there is an equally overwhelming choice of classical-statistical (refs 7-15) and information-based (refs 16-29) philosophies to interpret data (refs 8-23), draw inference, and so help make predictions and decisions (refs 24-27). In addition, for clinical applications there remains the need to combine formally the effects of uncertainty of both existential (associations, correlations) and universal (taxonomic, metadata, conditional, potential causality, regulatory) aspects. All this complicates the question of what constitutes Best Practice for inference, always important in research, but especially so when used to guide physicians in personalized medicine including genomics and proteomics. The present paper discusses adapting an inference system (ref 28) that already exists in physics and chemistry. It extends methods of the author’s laboratories first applied to mining protein sequence data for predictive rules from structure-sequence relationships (refs 29-31), and to massive clinical (refs 5, 6), clinical genomic (ref 21), proteomic (ref 22), and expression array (ref 23) data.

1.2. Quantum Mechanics Should Be Best Practice. Theoretical chemists and physicists who work as, or with, life scientists are aware that there is a system of observations, measurements, and inference that they may use in their calculations on, for example, the properties of molecules. Almost without exception, it is held in principle to be a universally applicable inference system from fundamental particles to the cosmological scale. Yet they neither use it nor recommend it in the laboratory, in clinical trials, in epidemiology, or in public health studies. This system is quantum mechanics (QM), perfected into an inference system by Paul A. M. Dirac (ref 28). Their hesitancy is due to the apparent lack of transferability to the everyday world. QM yields bizarre conclusions that seem inappropriate to the macroscopic world of everyday human life. A fundamental particle may be observed to be in several states at the same time with different weights, and by QM a patient may be calculated as being alive and dead at the same time. From one perspective, there is no discrepancy. In the present, the physician makes (or hears of) an observation on the patient. Observation in everyday life creates conditions for probabilities close to 0 or 1 for each possible state. But for future time, the physician will be very interested in the less certain probabilities that the patient will be alive or dead, conditional on potential therapeutic strategies.

1.3. Quantum Mechanics Uses Complex Numbers. Arguably, the real difficulty underlying both the seeming weirdness of QM and its difficulty for non-mathematicians is that QM is a system of inference based on the complex numbers, and hence on the complex plane in which they can be represented graphically in two dimensions. In QM, as one includes more particles or parameters, the description embracing them requires a higher dimensional space called Hilbert space made up of many such planes; however, for present purposes we can deal with the AND case implied here in the single complex plane. The reason for complex numbers is that QM deals in probability amplitudes, complex-valued functions such as wave functions that describe uncertain quantities and are more closely related to the square root of probabilities.
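Because the rest of the argument leans on complex arithmetic, it may help to try it directly. The following is a minimal sketch using Python's built-in complex type (the paper prescribes no programming language, and the variable names are illustrative), with the example values 6.3 + i1.6 and 3.1 + i0.2 used in this section. It shows addition, multiplication with i × i replaced by -1, and conjugation.

```python
# Illustrative only: Python writes the imaginary unit i as j.
z1 = complex(6.3, 1.6)   # 6.3 + i1.6
z2 = complex(3.1, 0.2)   # 3.1 + i0.2

total = z1 + z2              # real and imaginary parts add separately
product = z1 * z2            # (6.3*3.1 - 1.6*0.2) + i(6.3*0.2 + 1.6*3.1)
conjugate = z1.conjugate()   # flips the sign of the imaginary part

# Multiplying a number by its own conjugate gives a purely real value,
# the squared distance of the point (6.3, 1.6) from the origin.
squared_modulus = z1 * z1.conjugate()

print(round(total.real, 2), round(total.imag, 2))
print(round(product.real, 2), round(product.imag, 2))
print(conjugate)
print(round(squared_modulus.real, 2), squared_modulus.imag)
```

Run as written, the real part of the product comes out at about 19.21, that is, 19.53 - 0.32, and the imaginary part at about 6.22.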
But use of amplitudes and complex numbers is not peculiar to QM: for example, electrical engineers also use them to describe electrical oscillations. Life scientists typically do not use complex numbers. So in brief: a complex number α + iβ such as 6.3 + i1.6 has a real part α (here 6.3) and an imaginary part iβ (here i × 1.6), where i = √(-1). A graph of the imaginary part versus the real part describes the complex plane, placing the above complex value at (6.3, 1.6). For algebra and arithmetic, i can be initially considered as an algebraic variable like x. So adding (6.3 + i1.6) + (3.1 + i0.2) becomes (9.4 + i1.8). However, when i × i is produced by multiplication, it is replaced by -1. Hence (6.3 + i1.6) × (3.1 + i0.2) first becomes (6.3 × 3.1) + (6.3 × i0.2) + (i1.6 × 3.1) + (ii0.32), then (19.53) + (i1.26) + (i4.96) + (-0.32), and finally 19.21 + i6.22. More complicated functions of complex numbers are not required for comprehension of this paper. The complex conjugate is, however, required and is indicated by superscript *. It simply changes the sign of the imaginary part, so (6.3 + i1.6)* = 6.3 - i1.6, and (6.3 - i1.6)* = 6.3 + i1.6.

1.4. Predicate Calculus. Two-dimensionality in inference is not new: the Ancient Greeks developed the predicate calculus, a higher order logic. It combines universal statements or propositions such as “All A are B” (the modern form adds “if A exists”) and “No A are B”, and existential statements such as “Some A are B” and “Some A are not B” (the modern form adds “and A and B exist”). A and B are examples of states, events, properties, observations, samples, or measurements, such as “Diagnosis := mucinous adenocarcinoma”. Classically, they only take true or false values. It is intuitive to take P(“All A are B”) = 1 as implying P(B | A) = 1, i.e., the probability of B conditional on A; P(“No A are B”) = 1 as implying P(B | A) = 0; and P(“Some A are B”) = 1 as implying P(A & B) = 1, i.e., the joint probability of A and B. More deeply, logicians in their teachings would implicitly be using P(B | A) = P[P(“All A are B”) = 1 | H] = 1, where the outer P is analogous to a probability density function. The vertical bar means “conditional on” something, here H, a hypothesis or a position taken for the sake of demonstrating logical argument, i.e., inference. By “argument” is meant demonstrating how the values of the various combinations of logical statements of this type can affect each other’s values and the final conclusion. For example, if P(“No A are B”) = 1 and P(“Some C are A”) = 1 (the premises), then it must be that P(“Some C are not B”) = 1 (the conclusion; this example is a syllogism). This could be expressed as P[P(“Some C are not B”) = 1 | P(“No A are B”) = 1 & P(“Some C are A”) = 1] = 1. The final and outer = 1 really means valid, not true, since, given the contents, 1 is the answer irrespective of any hypothesis, position, or outright lie about the premises. However, the predicate calculus itself has only been developed for P being 0 or 1; at least, there is no widely accepted form that can handle uncertainty.

1.5. In Pursuit of a Quantitative Predicate Calculus (QPC). Though precise, QM does not allow absolute certainty, so that P(“All A are B”) < 1. Neither does the real world, in regard to such expressions, since typically their probabilities are based on examining finite data D. For example, P(B | A) = E[P(“All A are B”) = 1 | D], i.e., an expectation E conditional on data D = [n_AB, n_A], where n_AB is the number of times A and B are seen together and n_A = Σ_X n_AX is the number of times that A is seen, alone or with B. Expectation here implies integration over all possible values of P[P(“All A are B”) | D] (cf. ref 30).
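To make the idea of an expectation on finite data concrete, here is a minimal sketch under a loudly stated assumption: a uniform prior over the unknown probability p = P(B | A) and a binomial likelihood for the counts D = [n_AB, n_A]. Under that assumption, integrating over all values of p gives the closed form (n_AB + 1)/(n_A + 2), known as Laplace's rule of succession. This is not claimed to be the estimator developed in the paper; the function names are hypothetical, and the sketch only illustrates how an expectation differs from the raw ratio n_AB/n_A when counts are small.

```python
from fractions import Fraction

def raw_ratio(n_ab: int, n_a: int) -> Fraction:
    """Observed fraction of A cases that are also B (requires n_a > 0)."""
    return Fraction(n_ab, n_a)

def expected_conditional(n_ab: int, n_a: int) -> Fraction:
    """E[p | D] for D = [n_ab, n_a], ASSUMING a uniform prior on p.

    Integrating p * p**n_ab * (1 - p)**(n_a - n_ab) over 0..1 and
    normalising gives (n_ab + 1) / (n_a + 2).
    """
    if not 0 <= n_ab <= n_a:
        raise ValueError("need 0 <= n_ab <= n_a")
    return Fraction(n_ab + 1, n_a + 2)

# Three cases of A, all of them B: the raw ratio says certainty,
# the expectation does not.
print(raw_ratio(3, 3), expected_conditional(3, 3))  # prints: 1 4/5
```

With only three observations the expectation stays below 1, in line with the point that finite data never justify absolute certainty.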
The “frequentist” view (classical statistics) is that the appropriate measure given D is simply P(B | A) = P(A & B)/P(A) = n_AB/n_A when these numbers are indefinitely and unreachably large. For quantifying the Predicate Calculus, P(“All A are B”) and P(“Some A are B”) are so demanding and sensitive that a single observation would set the estimates to 0 (one counterexample) and 1 (one instance), respectively. In open sampling, we can never be sure that we will not eventually encounter a case in which an A is not B, nor that we can avoid an undetectable error that gives that impression. Arguably, the estimate P(“All A are B”) = 0, or some other fixed prior belief value, therefore always applies in practice, rendering the statement impractical. Also, P(“Some A are B”) = 1 would apply when just one case in which A occurs with B is seen, even if trillions of other observations show that it does not, similarly rendering it impractical. To overcome this, one should include the qualification “for all practical purposes” (FAPP), e.g., P(“All A are B FAPP”). Then E[P(“All A are B FAPP”) | D] > E[P(“All A are B”) | D], since we hold a stronger belief in the corresponding weaker or “fuzzier” FAPP statement. A specific choice common in inference engines, and particularly those using Fuzzy Logic, would here correspond to √P(B | A) = P(“All A are B FAPP”) = √P(“All A are B”). Note that √P > P for 0 < P < 1; at 0 and 1, √P = P. The formal origins reside in set theory and Fuzzy Logic. Further, P(“Some A are B FAPP”) is best rendered as √P(A & B)/√[P(A) P(B)], since it is departure from expected random concurrence that matters, not a single rare or spurious “A & B” event. Thus, the elements are √P(A | B), √P(B | A), and √P(A & B)/√[P(A) P(B)]. The significance here is that probability amplitudes of QM are also closely related to √P; see Discussion.

Journal of Proteome Research • Vol. 6, No. 8, 2007
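The three elements named above can be computed from co-occurrence counts. The sketch below is illustrative: it assumes probabilities are estimated by simple count ratios (the frequentist shortcut, adequate only for large counts), and the function name, argument names, and example counts are hypothetical rather than taken from the paper.

```python
import math

def qpc_elements(n_ab: int, n_a: int, n_b: int, n_total: int) -> dict:
    """Return sqrt(P(A|B)), sqrt(P(B|A)) and sqrt(P(A&B))/sqrt(P(A)P(B)).

    ASSUMPTION: probabilities are plain count ratios, with
    n_ab joint occurrences, n_a and n_b marginal counts, n_total records.
    """
    if min(n_ab, n_a, n_b, n_total) <= 0:
        raise ValueError("all counts must be positive in this sketch")
    p_a, p_b, p_ab = n_a / n_total, n_b / n_total, n_ab / n_total
    return {
        "sqrt_P(A|B)": math.sqrt(n_ab / n_b),
        "sqrt_P(B|A)": math.sqrt(n_ab / n_a),
        "association": math.sqrt(p_ab) / math.sqrt(p_a * p_b),
    }

# Hypothetical counts: 200 records, A seen 40 times, B 50 times, both 20 times.
elements = qpc_elements(n_ab=20, n_a=40, n_b=50, n_total=200)
for name, value in elements.items():
    print(f"{name}: {value:.4f}")
```

An association value above 1 indicates that A and B occur together more often than independence would predict, while a value of 1 corresponds to chance concurrence.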


Figure 1. The kind of inference architecture suggested by quantum mechanical inference. See text.

2. Theory: A Quantitative Predicate Calculus (QPC)

2.1. Borrowing from Dirac. The approach described here uses linear algebra in the complex domain to try to capture some aspects of information, such as correlations between dimensions, for example the columns of medical records (demographics, tests, proteomic and genomic results, diagnoses, treatments, outcomes, etc.). In essence, however, the paper merely reviews Dirac’s existing notation system and its inference calculus (with one extension), with an eye on facilitating domain-dependent design of inference engines. Figure 1 shows such an example. The objects resembling one-way left- and right-pointing arrows correspond to bras such as <A| and kets such as |B> of Dirac’s system. <A|B> stands for “All B are A”, but the qualification “for all practical purposes” (FAPP) should be understood throughout (Section 1.5). This reflects conditionality. Depending on use case, it can also signify causality, as in “IF B then A”. Symbols A and B represent states or events, and in general are also observations or measurements such as |Diagnosis := mucinous adenocarcinoma> and |male>. They can also be joint observations such as |Protein 26B expression level := 6 units & Immunological marker HLA177 := positive>. Whereas “greater than” or “less than” measures such as 200 pounds| are considered degenerate in QM, they are a valid state description in everyday life. Bras and kets actually represent rules drawn from analysis of many patients, and have an associated probability-like weight, or a rule incidence as information drawn from a specific patient record. For a rule deduced from many such records, the compound rules (A & B, etc.) are association rules, and the weight then expresses the degree of association. To handle lack of definition as to conditionality, and still be able to make incomplete inference, the bras and kets can be considered as vectors with discrete elements or even continuous distributions of probability-like quantities (typically complex numbers).
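As a minimal sketch of the vector reading of bras and kets, the code below stores a ket as a list of complex weights and pairs a bra with a ket through the standard QM inner product, in which the bra contributes complex-conjugated elements. The two-element vectors and the function name are hypothetical, and whether the paper's extended system (with its @ function) combines weights in exactly this way is not assumed here.

```python
def braket(ket_a: list[complex], ket_b: list[complex]) -> complex:
    """<A|B>: sum of conj(a_i) * b_i, a single complex number."""
    if len(ket_a) != len(ket_b):
        raise ValueError("vectors must have the same length")
    return sum((a.conjugate() * b for a, b in zip(ket_a, ket_b)), 0j)

# Hypothetical kets over two discrete elements.
ket_a = [1j, 0j]
ket_b = [0.6 + 0j, 0.8j]

ab = braket(ket_a, ket_b)   # <A|B>
ba = braket(ket_b, ket_a)   # <B|A>, the complex conjugate of <A|B>
bb = braket(ket_b, ket_b)   # <B|B>, real and close to 1 for a normalised ket

print(ab, ba, bb)
```

Swapping the roles of bra and ket conjugates the result, which is the vector counterpart of reading a rule in the opposite direction.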
They are used by combining a bra with a ket as an input value requesting return of a specific complex value, i.e., making <A| |B>, or <A|B> for short, which is a single complex number. These forms are Dirac’s bra-kets (from brackets). Many useful operations can be performed on bras, kets, and bra-kets. Bras, kets, and bra-kets as rules can be joined together by various operators or functions, e.g., |A> + |B> or Σ(|A>, |B>, |C>,...) or @. Not shown are e.g., or with other functions, e.g., forms of negation NOT and R discussed below. R and R* extract √P(A | B) and √P(B | A) respectively from bra-kets; a useful observation bra or ket,