Application of Analytical Detection Concepts to Immunogenicity Testing

Anal. Chem. 2007, 79, 8176-8184

Scott L. Klakamp,* Hong Lu, Mohammad Tabrizi, Cherryl Funelas, and Lorin K. Roskos*
Amgen Fremont, Inc., 6701 Kaiser Drive, Fremont, California 94555

David Coleman
4020 Benden Circle, Murrysville, Pennsylvania 15668

The cut point and detection limit of any immunogenicity assay are two of the most important quantities that define the adequacy of an assay for detecting anti-drug antibodies against therapeutic proteins. To date in the immunogenicity testing literature, only the type I (α) error (i.e., the false positive) rate of the assay has been considered for establishing cut points. The “sensitivity” of an immunogenicity assay is usually reported as the concentration of a monoclonal or polyclonal anti-drug antibody standard corresponding to the signal at the cut point. We propose that a more traditional and rigorous analytical chemistry definition of the detection capability be utilized wherein both type I and type II (β, false negative) error rates are considered. Specifically, the Hubaux-Vos technique of calculating cut points and limits of detection from prediction intervals on calibration curves is recommended as a statistically rigorous approach. The utility of using receiver-operator characteristic curves for managing the type I and II error rates of an immunogenicity assay is also presented. In addition, we illustrate how a soluble receptor, sMUC18, for the therapeutic mAb ABX-MA1 can result in false positives by Biacore methodology. This result suggests that immunogenicity confirmatory experiments must be carefully designed, preferably with a smaller type I and II error rate than in the primary screening if an acceptable limit of detection can be maintained.

A major concern in the development of any biological therapeutic is the induction of human anti-drug antibodies (ADA) in patients. ADA responses can adversely affect the safety and efficacy of a therapeutic product. Proper and careful design of immunogenicity assays is of the utmost importance in the development of biological therapeutics since any adverse effects from ADAs must be accurately reported. Not only is the choice of the proper analytical technique essential for accurate detection and characterization of ADAs, but application of the correct statistical procedures for data analysis can be equally important for valid conclusions to be drawn from immunogenicity assays.

* To whom correspondence should be addressed. Current address: AstraZeneca Pharmaceuticals LP, 24500 Clawiter Rd., Hayward, CA 94545. E-mail: [email protected]; [email protected].

8176 Analytical Chemistry, Vol. 79, No. 21, November 1, 2007

In the immunogenicity field and in the bioanalytical and clinical chemistry disciplines in general, nonstandard definitions and concepts have commonly been used when compared with conventional analytical chemistry terminology.1-6 This nonstandard usage of analytical nomenclature has created confusion as to the definition of certain terms and their significance for characterizing assay performance. Two common terms misused in bioanalytical settings are “sensitivity” and “limit of detection” (LOD), as pointed out by Rudy7 and Anderson.8 Linnet and Kondratovich9 and Brown et al.10 have proposed rigorous statistical methods for defining the LOD for clinical chemistry methods. Recently, a document has appeared from the Clinical and Laboratory Standards Institute11 recommending protocols for determining cut points (also called limit of blank), limits of detection, and limits of quantitation (LOQ) that standardizes the meaning of many important analytical concepts in clinical and bioanalytical chemistry. Although several papers have appeared describing operational definitions of analytical parameters specifically for immunogenicity assays, certain guidelines proposed are from a nontraditional analytical chemistry perspective and may not be as rigorous as needed for immunogenicity data analysis.12

One concept that is commonly used in immunogenicity studies, and whose weaknesses may not be fully understood, is that of the “cut point”. The cut point, the decision point for calling samples positive or negative for ADA, is equivalent to Currie’s critical value1-4 and is not actually an LOD. However, quite often the cut point is used in immunogenicity studies to define how low a concentration of ADA the assay can detect. It is important to realize that at the cut point there is a 50% false negative rate.3 The cut point only takes into account type I (α) errors. The detection limit concept of Currie3 is equally or more important for defining the lowest concentration of ADA that can be detected reliably by the assay. The LOD explicitly recognizes type I (α) and II (β) error rates. The concentration of ADA at the LOD can be reliably detected at the confidence level corresponding to 1 - β.

In this paper, we propose more rigorous statistical methods for analyzing immunogenicity data from ELISA and Biacore assays. We also suggest determining LOD values for these ADA assays using a technique developed by Hubaux and Vos13 that has been adapted to immunogenicity assays using monoclonal antibody (mAb) standards with defined equilibrium dissociation constants (KD). How to evaluate non-Gaussian data sets (the kind most commonly observed for immunogenicity data) and how to properly test for outliers are also discussed. How to choose reasonable type I and type II error rates for any given immunogenicity assay from receiver-operator curves (ROCs) is also presented. Last, we demonstrate how the soluble target of a therapeutic mAb in treatment-naive patient samples can erroneously lead to false positives unless appropriate confirmatory assays are established.

(1) Currie, L. A. Anal. Chem. 1968, 40, 586-593.
(2) Currie, L. A. Pure Appl. Chem. 1995, 67, 1699-1723.
(3) Currie, L. A. Chemom. Intell. Lab. Syst. 1997, 37, 151-181.
(4) Currie, L. A. Appl. Radiat. Isot. 2004, 61, 145-149.
(5) Gibbons, R. D.; Coleman, D. E. Statistical Methods for Detection and Quantification of Environmental Contamination; John Wiley & Sons: New York, 2001.
(6) Miller, J. C.; Miller, J. N. Statistics for Analytical Chemistry; Ellis Horwood Ltd.: Chichester, United Kingdom, 1992.
(7) Rudy, J. L. Clin. Chem. 1989, 35, 509.
(8) Anderson, D. J. Clin. Chem. 1989, 35, 2152-2153.
(9) Linnet, K.; Kondratovich, M. Clin. Chem. 2004, 50, 732-740.
(10) Brown, E. N.; McDermott, T. J.; Bloch, K. J.; McCollom, A. D. Clin. Chem. 1996, 42, 893-903.
(11) Tholen, D. W.; Linnet, K.; Kondratovich, M.; Armbruster, D. A.; Garrett, P. E.; Jones, R. L.; Kroll, M. H.; Lequin, R. M.; Pankratz, T. J.; Scassellati, G. A.; Schimmel, H.; Tsai, J. NCCLS Document EP17-A 2004, 24 (34), 1-39.
(12) Mire-Sluis, A. R.; Barrett, Y. C.; Devanarayan, V.; Koren, E.; Liu, H.; Maia, M.; Parish, T.; Scott, G.; Shankar, G.; Shores, E.; Swanson, S. J.; Taniguchi, G.; Wierda, D.; Zuckerman, L. A. J. Immunol. Methods 2004, 289, 1-16.

10.1021/ac071364d CCC: $37.00
© 2007 American Chemical Society. Published on Web 10/09/2007

MATERIALS AND METHODS

ABX-IL8, ABX-MA1, and sMUC18. ABX-IL8 is a fully human mAb14 against interleukin-8 and was produced from XenoMouse mice15 similarly to methods previously published.14,16 ABX-MA1 is a fully human mAb against MUC18.17 Soluble MUC18 antigen (sMUC18) was produced at Abgenix, Inc. (Fremont, CA) by recombinant expression.

Human Serum Samples.
Serum samples tested in the ABX-IL8 immunogenicity ELISA were collected from psoriasis patients enrolled in Abgenix clinical trial ABX-0204 and were collected prior to treatment with ABX-IL8 or placebo. Serum samples tested in the ABX-MA1 Biacore immunogenicity assay were collected from patients with malignant melanoma enrolled in Abgenix clinical trials ABX-0401 and ABX-0402 and were collected prior to treatment with ABX-MA1.

Human Anti-Human Antibody (HAHA) Screening ELISA for ABX-IL8. The ABX-IL8 immunogenicity ELISA for human serum samples was conducted in a 96-well format using a flat-bottom ELISA plate. ABX-IL8 was immobilized onto the bottom of the plate. The coating solution consisted of ABX-IL8 at a concentration of 100 ng/mL, and 100 µL was added to each well. Following incubation overnight at 2 to 8 °C, the plate was washed 3 times with commercially available washing buffer. A blocking solution (200 µL/well) was added and allowed to rest at room temperature for 1 h. Clinical study samples, diluted 10-fold with blocking buffer, along with positive control samples (1.3 ng/mL ABX-IL8 anti-id antibodies) and negative blank samples (10% normal human serum) were added to the plate in accordance with a preset plate map. The volume added was 100 µL for each well. The plate was sealed and kept at room temperature on a plate shaker for 2 h to allow binding of the serum sample to the immobilized antigen. Following the 2-h incubation, any unbound material was washed off the plate with three rinses. Subsequently, 100 µL per well of a diluted (1:50000 dilution) biotin-conjugated ABX-IL8 solution was added, and the plate was again incubated at room temperature for 1 h and washed three times before addition of the signaling reagents. Streptavidin-conjugated horseradish peroxidase, purchased from Southern Biotechnology, was diluted 1:4000, added to each well (100 µL/well), and incubated for 15 min at room temperature. The plate was washed 3 times before addition of 100 µL of TMB reagents (3,3′,5,5′-tetramethylbenzidine/hydrogen peroxide solution from Neogen) to each well. The plate was kept in the dark for 5 min at room temperature for color development. Fifty microliters of a 2 M sulfuric acid solution was added to each well for enzyme deactivation. The colorimetric absorbance was determined at 450 nm using a 96-well plate reader.

(13) Hubaux, A.; Vos, G. Anal. Chem. 1970, 42, 849-855.
(14) Yang, X. D.; Corvalan, J. R.; Wang, P.; Roy, C. M.; Davis, C. G. J. Leukocyte Biol. 1999, 66, 401-410.
(15) Mendez, M. J.; Green, L. L.; Corvalan, J. R.; Jia, X. C.; Maynard-Currie, C. E.; Yang, X. D.; Gallo, M. L.; Louie, D. M.; Lee, D. V.; Erickson, K. L.; Luna, J.; Roy, C. M.; Abderrahim, H.; Kirschenbaum, F.; Noguchi, M.; Smith, D. H.; Fukushima, A.; Hales, J. F.; Klapholz, S.; Finer, M. H.; Davis, C. G.; Zsebo, K. M.; Jakobovits, A. Nat. Genet. 1997, 15, 146-156.
(16) Davis, C. G.; Jia, X. C.; Feng, X.; Haak-Frendscho, M. Methods Mol. Biol. 2004, 248, 191-200.
(17) Mills, L.; Tellez, C.; Huang, S.; Baker, C.; McCarty, M.; Green, L.; Gudas, J. M.; Feng, X.; Bar-Eli, M. Cancer Res. 2002, 62, 5106-5114.

Surface Plasmon Resonance Measurements. All surface plasmon resonance studies were performed using a Biacore 2000 instrument. All 25% serum samples were prepared in vacuum-degassed, filtered, and concentrated HBS-P buffer containing soluble carboxymethyldextran (CM-dextran) to yield a final 1× concentration of HBS-PCM buffer consisting of 0.01 M HEPES, 0.15 M NaCl, 0.005% surfactant P-20, and 1 mg/mL CM-dextran (Biacore Inc., Uppsala, Sweden, and Fluka Chemical, Saint Louis, MO).
Biacore amine-coupling reagents, 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide (EDC), N-hydroxysuccinimide (NHS), and ethanolamine were purchased from Biacore, Inc. Phosphoric acid was from Acros Organics (Geel, Belgium). Research grade CM5 biosensor chips were purchased from Biacore, Inc. Standard EDC/NHS coupling chemistry18 was used to covalently immobilize ABX-MA1 to four flow cells of two CM5 sensor chips yielding 4000-4700 RU/flow cell. All Biacore experiments were conducted at 25.5 °C.

Two Biacore experiments were performed on two different CM5 sensor chips. In the first, 28 naive serum samples were diluted to 25% serum with concentrated HBS-PCM buffer to give a final concentration of 1× HBS-PCM buffer. Another set of 28 samples was prepared identically except for being spiked with 1 mg/mL ABX-MA1. In the second experiment, the same 28 samples, unspiked and spiked with an irrelevant control mAb against KLH (keyhole limpet hemocyanin) at 1 mg/mL, were also diluted to 25% serum with concentrated HBS-PCM buffer for a final 1× HBS-PCM concentration. In both experiments, each sample was injected for 15 min over all four flow cells at a flow rate of 10 µL/min and the dissociation phase of the sensorgrams was followed for 3 min before regeneration with a 12-s pulse of 146 mM phosphoric acid, pH 1.5. It was shown (data not presented) that the ABX-MA1 surface was stable to 60-70 regeneration cycles of acid. This was demonstrated by performing 60-70 injections of MUC18 in HBS-PCM buffer, which produced only a 5-7% decrease in the binding signal.

The sensorgrams were processed in Scrubber (Version 2.0, Biologic Software, Campbell, Australia). HBS-PCM buffer was injected after every five serum injections so any systematic drift caused by the instrument on each of the four flow cells could be followed over the course of the experiment. Subsequently, the sensorgram from the closest buffer injection to each serum sample was subtracted from the serum sensorgrams for that given flow cell. A report point was then taken 3 min into the dissociation phase of the fully processed sensorgram. Report points for any individual serum sample were normalized for each flow cell to the amount of covalently immobilized ABX-MA1 by dividing the raw data report point value by the amount of covalently immobilized ABX-MA1 on that particular flow cell and multiplying by an arbitrary factor (in this case 1000) to give a value in the double-digit to low triple-digit RU range. All normalized report points for a serum sample were averaged and used in all statistical data analyses.

For the confirmatory sMUC18 Biacore experiments, 25% serum samples in 1× HBS-PCM were flowed across a freshly prepared ABX-MA1 immobilized CM5 sensor surface for 15 min as described above. Immediately after the serum injection was complete, a noncompetitive mouse mAb, S-Endo-1 (Biocytex; Marseille, France), to sMUC18 at 12 µg/mL or a control anti-KLH mAb at 1 mg/mL was injected for 5 min, and the dissociation phase was followed for 3 min.

(18) BIAapplications Handbook; Biacore, Inc.: Uppsala, Sweden, 1998.

Soluble MUC18 ELISA Methods. Soluble MUC18 (sMUC18) concentrations were determined using the Cy-Quant Elisa sCD146 kit from Biocytex. The assay was performed as indicated in the manufacturer’s instructions. Two patient serum samples gave absorbance values corresponding to sMUC18 concentrations below the LOQ. In these two cases, the value of sMUC18 was assigned a concentration equal to the LOQ, 100 ng/mL.

Statistical Methods.
All statistical analyses were done in JMP (version 5.1.1, SAS Institute, Cary, NC). The normality of untransformed and log-transformed data was determined by the Shapiro-Wilk W test. After a significant correlation was demonstrated for duplicate optical density and quadruplicate resonance unit measurements in the ABX-IL8 ELISA and the ABX-MA1 Biacore assays, respectively, the optical densities and the Biacore response units were averaged prior to further analyses. The Rosner test5 was used as an objective method of testing for outliers in the Biacore data. The Hubaux-Vos method13 was conducted by fitting the blank and standard curve data from the ABX-IL8 immunogenicity ELISA to the following equation by weighted regression,

$$Y_i = X_i b_1 + b_2 \tag{1}$$

where $b_1$ is the slope of the calibration line, $b_2$ is the y-intercept, $X_i$ is $\log(C_i + b_3)$, $C_i$ is the concentration of the ith calibration standard, $b_3$ is an offset constant to allow log-transformation of blank (zero concentration) data, $Y_i$ is $\log(\mathrm{MSOD}_i)$, and $\mathrm{MSOD}_i$ is the mean sample optical density associated with the ith set of calibration standard replicates. For a more detailed discussion of the Hubaux-Vos procedure, please see the Supporting Information.
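As an illustration of eq 1, the following minimal Python sketch fits the log-log calibration line by weighted least squares. It is not the JMP analysis used in this work: the standard concentrations, the offset `B3`, the weights, and the "true" coefficients are hypothetical values chosen so that the fit can be checked, and natural logarithms are assumed.

```python
import math

def fit_log_log_calibration(conc, msod, b3, weights=None):
    """Weighted least-squares fit of Y = X*b1 + b2 (eq 1),
    with X = log(C + b3) and Y = log(MSOD). Returns (b1, b2)."""
    x = [math.log(c + b3) for c in conc]
    y = [math.log(m) for m in msod]
    w = weights if weights is not None else [1.0] * len(x)
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b2 = ybar - b1 * xbar
    return b1, b2

def predict_msod(c, b1, b2, b3):
    """Back-transform the fitted line to an optical density at concentration c."""
    return math.exp(b1 * math.log(c + b3) + b2)

# Hypothetical, noise-free standards (pg/mL) generated from known coefficients.
B3 = 50.0                      # assumed offset so that the blank (C = 0) can be logged
TRUE_B1, TRUE_B2 = 0.8, -6.0   # assumed slope and intercept
conc = [0.0, 50.0, 100.0, 200.0, 400.0, 800.0]
msod = [math.exp(TRUE_B1 * math.log(c + B3) + TRUE_B2) for c in conc]
weights = [3, 2, 2, 2, 2, 2]   # assumed weights, e.g. replicate counts

b1, b2 = fit_log_log_calibration(conc, msod, B3, weights)
```

Because the hypothetical data lie exactly on the line, the fit returns the assumed slope and intercept; with real replicates the weights would normally reflect the variance at each concentration level.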


RESULTS AND DISCUSSION

Immunogenicity ELISA for ABX-IL8. To demonstrate and compare methods for establishing the assay cut point and LOD, treatment-naïve serum samples collected from 264 subjects with psoriasis were tested in an ELISA that was developed to assess the immunogenicity of ABX-IL8. ABX-IL8 is a human monoclonal antibody to interleukin-8 that was in clinical development for the treatment of psoriasis, rheumatoid arthritis, and chronic obstructive pulmonary disease.19 A calibration curve was generated using an anti-idiotypic mAb to ABX-IL8 (KD = 200 pM).

To establish the cut point and LOD, parametric and nonparametric analyses of the blank sample optical densities were conducted. The MSOD values for the blank samples were determined to be log-normally distributed, so log-transformed values were used to establish the cut point and LOD. The type I error rate was set to 5% (α = 0.05) for the cut point, and the LOD was calculated for type I and type II error rates of 5% (α = 0.05, β = 0.05). The mean (SD) of the log-transformed MSOD values was -2.738 (0.224). The cut point of the blank samples was calculated as the mean of the blanks plus j × SD, where j = 1.645, the z-score for the 95th quantile of the normal distribution (meaning that only 5% of the blank distribution would fall above 1.645 SD from the mean of the blanks; see Figure 1 for a practical example of this procedure). Calculation of LOD exclusively from the blank sample distribution is the simplest and easiest statistical method available; however, it requires an assumption that the standard deviation and shape of the distribution of a sample containing ADA are identical to those of the blank.
Thus, the LOD was calculated by the blank mean plus (j - k) × SD, where, in this example, k = -1.645, the z-score for the fifth quantile of the normal distribution (meaning that only 5% of the values for the distribution of a sample having a concentration at the LOD will fall below -1.645 SD from the mean of that distribution as illustrated in Figure 1). Under these assumptions, the fifth quantile of the log-transformed MSOD values for samples containing ADA at a concentration equal to the LOD will equal the cut point, yielding a 5% false negative rate at the LOD and reliable detection of the analyte. From the antilogs, the cut point and LOD signals correspond to OD values of 0.093 and 0.135, respectively.

The frequency histogram and the frequency distribution function for the ABX-IL8 ELISA data (n = 264) are shown in Figure 1. In panel a, the frequency distribution of the blank samples was overlaid on the frequency distribution for an ADA present at a concentration equal to the cut point (α = 0.05). The figure illustrates that blank samples with ln(MSOD) values exceeding the 95th quantile of the blank distribution will incorrectly be called ADA-positive (5% false positive rate). Panel a demonstrates the weakness of the cut point in defining detection capability. When the ADA is present at a level equal to the cut point, the ADA is detected only 50% of the time (β = 0.5, 50% false negative rate). The cut point will retain a false negative rate of 50% at any level of α that is selected. Therefore, the cut point should not be used to define assay “sensitivity” because the ADA will not be reliably detected at that level. In panel b, the frequency distributions are overlaid for the blank samples and an ADA present at a level equal to the LOD. At the LOD, ADA will be detected with confidence (1 - β); in this case with a reliable 95% confidence, as only ADA-containing samples with ln(MSOD) signals below the fifth quantile of the signal distribution will be incorrectly determined to be negative. The LOD, not the cut point, is the appropriate quantity to define the detection capability of the method. While use of the LOD to define “sensitivity” sets a higher bar for assay performance, ADA assays frequently can achieve LODs superior to the 250-500 ng/mL level that has been suggested as the minimum detection capability for immunogenicity assays.12

Figure 1. (a) Histogram and frequency distribution function (dashed line) for ABX-IL8 blanks and theoretical frequency distribution function (solid line) for ADA at a level equal to the cut point (α = 0.05). (b) Histogram and frequency distribution function (dashed line) for ABX-IL8 blanks and theoretical frequency distribution function (solid line) for ADA at a level equal to the LOD (α = 0.05, β = 0.05).

(19) Mahler, D. A.; Huang, S.; Tabrizi, M.; Bell, G. M. Chest 2004, 126, 926-934.

The cut point and LOD can also be established by a nonparametric approach in cases where the distribution is not Gaussian. For illustration, this approach was applied to the untransformed ABX-IL8 optical density data. The data were rank-ordered, and the OD values closest to the fifth and 95th quantiles were determined by inspection. The assay signal corresponding to the cut point was set equal to the OD value closest to the 95th quantile. The signal corresponding to the LOD was determined by adding to the cut point the difference between the median OD and the OD value closest to the fifth quantile. This approach yielded an OD of 0.095 for the cut point and an OD of 0.114 for the LOD signal, which were comparable to the values calculated using the parametric approach. The nonparametric approach also requires the assumption that the shape of the signal distribution for ADA present in samples at a concentration equal to the cut point and LOD is identical to the signal distribution for the blank samples. An additional limitation of this method is that if the α of the assay is intended to be very low, then a large number of samples must be analyzed to ensure that enough values fall above the quantile specified by (1 - α) to permit accurate estimation of the cut point.
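The parametric and nonparametric calculations described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the blank mean and SD are the ln(MSOD) values quoted in the text, the rule used to pick the "closest" ranked value is an assumption (the text states only that the values were chosen by inspection), and the list of blank ODs is invented for the example.

```python
import math
from statistics import NormalDist, median

def parametric_cut_point_and_lod(mean_log, sd_log, alpha, beta):
    """Cut point = mean + j*SD and LOD = mean + (j - k)*SD on the log scale,
    returned as optical densities (antilogs)."""
    z = NormalDist()
    j = z.inv_cdf(1 - alpha)   # about 1.645 for alpha = 0.05
    k = z.inv_cdf(beta)        # about -1.645 for beta = 0.05
    cut_log = mean_log + j * sd_log
    lod_log = mean_log + (j - k) * sd_log
    return math.exp(cut_log), math.exp(lod_log)

def false_negative_rate(true_mean_log, cut_log, sd_log):
    """Probability that a sample whose log signal is N(true_mean_log, sd_log)
    falls below the cut point (same SD as the blanks is assumed)."""
    return NormalDist(true_mean_log, sd_log).cdf(cut_log)

def nonparametric_cut_point_and_lod(od_values, alpha, beta):
    """Rank-based version: cut point = value nearest the (1 - alpha) quantile;
    LOD = cut point + (median - value nearest the beta quantile)."""
    ranked = sorted(od_values)
    n = len(ranked)

    def nearest(q):
        # Assumed selection rule: nearest rank on a 0..n-1 index scale.
        return ranked[min(n - 1, max(0, round(q * (n - 1))))]

    cut = nearest(1 - alpha)
    return cut, cut + (median(ranked) - nearest(beta))

# Blank ln(MSOD) mean and SD as reported in the text.
MEAN_LOG, SD_LOG = -2.738, 0.224
cut_od, lod_od = parametric_cut_point_and_lod(MEAN_LOG, SD_LOG, 0.05, 0.05)
fn_at_cut = false_negative_rate(math.log(cut_od), math.log(cut_od), SD_LOG)
fn_at_lod = false_negative_rate(math.log(lod_od), math.log(cut_od), SD_LOG)

# Hypothetical blank ODs, evenly spaced from 0.040 to 0.140.
blank_od = [round(0.040 + 0.005 * i, 3) for i in range(21)]
np_cut, np_lod = nonparametric_cut_point_and_lod(blank_od, 0.05, 0.05)
```

With the reported mean and SD, `cut_od` and `lod_od` come out close to the OD values of 0.093 and 0.135 given in the text, and the false negative rate is 0.5 at the cut point but 0.05 at the LOD, which is the distinction drawn in panels a and b of Figure 1.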
A combined analysis of the blank data and standard curve data generated using an anti-idiotypic mAb to ABX-IL8 was conducted using the method of Hubaux and Vos,13 and LODs were calculated for various levels of α and β from the upper and lower prediction intervals of the calibration curve. An example of the curve fit and prediction intervals for α = 0.01 and β = 0.05 is shown in Figure 2. As illustrated, the standard deviations of the blanks and the calibration samples are not constant. In this case, the blanks had a higher standard deviation than the calibration samples, and the calibration samples exhibited a standard deviation that could be considered homogeneous (within expected statistical variation). An important advantage of the Hubaux-Vos approach is that heteroskedasticity can be accommodated, allowing for more accurate estimation of the LOD. In this case, the LOD (α = 0.01, β = 0.05) by Hubaux-Vos (199 pg/mL) is lower than that estimated from the blanks alone (311 pg/mL) and could be used to justify the choice of lower type I and type II error rates. The concentration corresponding to the cut point estimated by the blanks alone (141 pg/mL) was comparable to that estimated for the Hubaux-Vos cut point (137 pg/mL), as expected since the cut point is dependent only on the distribution of the blank samples.

An unfortunate sentiment is prevalent in the immunogenicity literature12 that the type I error rate must be fixed at 5% in order to guarantee the assay is able to detect “low positives”. The ability of the assay to avoid false negatives is specified by the type II error rate (β), where the assay can be demonstrated to detect true positive samples at the LOD concentration with a confidence of (1 - β). All assays should include positive controls at a concentration equal to the LOD and negative controls to provide confidence in assay performance. The equilibrium dissociation constant (KD) of the positive control mAb should also be stated since LOD will vary for standard mAbs with different KDs. The false positive and false negative rates can be monitored over time to ensure that the cut point and LOD have been accurately established and are maintained.
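The prediction-interval construction behind the Hubaux-Vos cut point and LOD can be sketched as follows. This is a simplified illustration, not the analysis performed here: it uses an unweighted least-squares line (the weighted regression used in this work is what accommodates heteroskedasticity), it substitutes standard normal quantiles for Student t quantiles, and the calibration points are hypothetical values on the transformed scale of eq 1. All function names are illustrative.

```python
import math
from statistics import NormalDist

def fit_line(x, y):
    """Ordinary least-squares line y = b1*x + b2 with residual SD s."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b2 = ybar - b1 * xbar
    s = math.sqrt(sum((yi - (b1 * xi + b2)) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    return {"b1": b1, "b2": b2, "s": s, "n": n, "xbar": xbar, "sxx": sxx}

def prediction_limit(fit, x0, q):
    """Fitted value at x0 shifted by q prediction standard errors."""
    se = fit["s"] * math.sqrt(1 + 1 / fit["n"] + (x0 - fit["xbar"]) ** 2 / fit["sxx"])
    return fit["b1"] * x0 + fit["b2"] + q * se

def hubaux_vos(fit, x_blank, x_max, alpha, beta):
    """Return (cut-point signal, x at the cut point, x at the LOD).
    Assumes a positive slope and alpha, beta below 0.5."""
    q_alpha = NormalDist().inv_cdf(1 - alpha)   # normal approximation to t
    q_beta = NormalDist().inv_cdf(1 - beta)
    # Cut point: upper prediction limit evaluated at the blank.
    y_cut = prediction_limit(fit, x_blank, q_alpha)
    x_cut = (y_cut - fit["b2"]) / fit["b1"]
    # LOD: x at which the lower prediction limit rises to the cut-point signal.
    if prediction_limit(fit, x_max, -q_beta) < y_cut:
        raise ValueError("LOD lies above the calibrated range")
    lo, hi = x_blank, x_max
    for _ in range(100):                        # bisection
        mid = (lo + hi) / 2
        if prediction_limit(fit, mid, -q_beta) < y_cut:
            lo = mid
        else:
            hi = mid
    return y_cut, x_cut, hi

# Hypothetical calibration data on the transformed (X, Y) scale of eq 1.
x = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3]
noise = [0.05, -0.05, 0.03, -0.03, 0.04, -0.04, 0.02, -0.02, 0.05, -0.05]
y = [0.1 + 0.5 * xi + e for xi, e in zip(x, noise)]
fit = fit_line(x, y)

y_cut, x_cut, x_lod = hubaux_vos(fit, 0.0, 3.0, alpha=0.01, beta=0.05)
# LOD for several false positive rates at a fixed 5% false negative rate.
lods = {a: hubaux_vos(fit, 0.0, 3.0, a, 0.05)[2] for a in (0.01, 0.05, 0.20)}
```

On the transformed scale the blank sits at X = log(b3), and a concentration is recovered as exp(X) - b3. Tabulating the LOD over a grid of α and β values, as in the last line, shows how relaxing α lowers the LOD for a fixed β.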
Figure 2. Hubaux-Vos method for calculation of cut point and LOD for the ABX-IL8 immunogenicity ELISA standard curve (α = 0.01, β = 0.05). Red symbols: log-transformed optical density values of replicates. Green line: calibration curve generated by linear regression of the data. Blue line: upper 1% prediction interval. Red line: lower 5% prediction interval. B3 is a constant concentration offset required for log-log regression (see eq 1).

From the Hubaux-Vos analyses, ROCs were generated for different concentrations of ADA (100, 200, 300, and 400 pg/mL in 10% serum) to show, for any acceptably low false negative rate, what false positive rate could be achieved while still maintaining reliable detection of the ADA. These ROC and LOD values are shown in Figure 3. For example, if a 5% false negative rate is desired for detection of 200 pg/mL ADA, then a 1% false positive rate could be allowed. However, if 100 pg/mL must be detected with 95% confidence, then a 20% false positive rate would have to be accepted, which illustrates the inefficiency of decreasing LOD by increasing false positive errors. As the ROC amply demonstrates, this assay has many possible LODs that are all superior, by ∼2 orders of magnitude, to the 250-500 ng/mL minimum detection capability in serum that has been recommended.12 As illustrated by Figure 3 and Table S-1 (Supporting Information), false positive and false negative error rates of 1 in 10 000 could be selected while still maintaining a 3.97 ng/mL LOD in 100% serum. From inspection of Figure 3, it can be seen easily that increasing the type I and II error rates to 5% improves the detection capability by only 2.45 ng/mL in 100% serum. In this case, forcing 5% false positive errors in screening or confirmatory assays would be hard to rationalize from the perspectives of accuracy and cost-effectiveness.

Figure 3. ROCs and LOD values generated from Hubaux-Vos analysis of the ABX-IL8 calibration curve in 10% serum. The ROCs illustrate corresponding false positive and false negative error rates needed to maintain LOD values equal to 100, 200, 300, or 400 pg/mL. The points represent calculations of LOD for various fixed values of α and β.

Biacore-Based Immunogenicity Assays for ABX-MA1. In some cases, the soluble target of a drug in serum (e.g., shed extracellular domain of a receptor) can generate false positive signals in screening assays. In double-antigen immunoassay formats, the target would have to be multimeric or aggregated to bridge. In Biacore-based assays, monomeric target could also generate false positives. Most confirmatory assays are conducted using the same assay used in screening, with drug (and a negative control) spiked into serum to confirm specificity by inhibition of signal. However, such an approach would not distinguish target from ADA. An additional confirmatory step is necessary to rule out target as the source of the signal. For double-antigen methods, the receptor/ligand partner of the target or an antibody to the target that competes with binding of drug to the target could be spiked into serum to inhibit signal if target is the cause. In Biacore assays, the same confirmatory approach could be used, but a spiked antibody to the target with a noncompeting epitope also could be used to increase signal if the target is responsible, as illustrated below for ABX-MA1 (Figure 4).

Figure 4. Format of confirmatory Biacore assays used to distinguish ABX-MA1 binding to ADA from binding to sMUC18. (a) Spiking serum samples with ABX-MA1 prior to testing. (b) Spiking samples with a monoclonal antibody to an epitope of sMUC18 that does not compete with ABX-MA1.

Figure 5. (a) Subtracted NARU values resulting from the difference of 25% serum samples unspiked and spiked with 1 mg/mL mAb ABX-MA1 plotted against sMUC18 concentration. The colored circles highlight the three 25% serum samples that were further tested as described in the text and in Figure 6. (b) Subtracted NARU values resulting from the difference of 25% serum samples unspiked or spiked with 1 mg/mL control anti-KLH mAb plotted against sMUC18 concentration.

To investigate whether soluble antigen present in patient serum samples could give false positives for HAHAs, we turned to a Biacore assay. In our experimental design, ABX-MA1 was immobilized to all 4 flow cells of a CM5 sensor surface and 28 naïve patient serum samples were flowed across the surface. A report point was taken at 3 min into the dissociation phase of each of the four sensorgrams resulting from the four flow cells to give a normalized average resonance unit report point (NARU) for any particular serum sample. To determine a cut point, we needed confidence that the 28 NARU values were normally distributed. The raw NARU values were not normally distributed, but were normally distributed when transformed by the natural logarithm function (Shapiro-Wilk W test, p = 0.35). It was also necessary to do outlier statistical testing using the Rosner test5 to reject 1 sample out of the 28 tested. The sample in question consistently gave extremely low NARU values relative to the other 27 measurements, and the Rosner test indicated this data point could be excluded at the 5% critical value. We suggest the Rosner test as a rigorous statistical outlier method for immunogenicity testing as long as there are at least 25 replicates of each sample for analysis. In practice with immunogenicity testing, the Rosner test would usually only be applied to the blank serum samples used to establish the cut point since this is most commonly the only data set with a sufficient number of replicates for outlier testing. As in all strict analytical data analyses, any decision to exclude data from a small data set (