

A Practical Guide to Analytical Method Validation

Doing a thorough method validation can be tedious, but the consequences of not doing it right are wasted time, money, and resources.

The ability to provide timely, accurate, and reliable data is central to the role of analytical chemists and is especially true in the discovery, development, and manufacture of pharmaceuticals. Analytical data are used to screen potential drug candidates, aid in the development of drug syntheses, support formulation studies, monitor the stability of bulk pharmaceuticals and formulated products, and test final products for release. The quality of analytical data is a key factor in the success of a drug development program. The process of method development and validation has a direct impact on the quality of these data.

Although a thorough validation cannot rule out all potential problems, the process of method development and validation should address the most common ones. Examples of typical problems that can be minimized or avoided are synthesis impurities that coelute with the analyte peak in an HPLC assay; a particular type of column that no longer produces the separation needed because the supplier of the column has changed the manufacturing process; an assay method that is transferred to a second laboratory where they are unable to achieve the same detection limit; and a quality assurance audit of a validation report that finds no documentation on how the method was performed during the validation.

Problems increase as additional people, laboratories, and equipment are used to perform the method. When the method is used in the developer's laboratory, a small adjustment can usually be made to make the method work, but the flexibility to change it is lost once the method is transferred to other laboratories or used for official product testing. This is especially true in the pharmaceutical industry, where methods are submitted to regulatory agencies and changes may require formal approval before they can be implemented for official testing. The best way to minimize method problems is to perform adequate validation experiments during development.

J. Mark Green, DuPont Merck Pharmaceutical Co. 0003-2700/96/0368-305A/$12.00/0 © 1996 American Chemical Society

What is method validation?

Method validation is the process of proving that an analytical method is acceptable for its intended purpose. For pharmaceutical methods, guidelines from the United States Pharmacopeia (USP) (1), International Conference on Harmonisation (ICH) (2), and the Food and Drug Administration (FDA) (3, 4) provide a framework for performing such validations. In general, methods for regulatory submission must include studies on specificity, linearity, accuracy, precision, range, detection limit, quantitation limit, and robustness.

Although there is general agreement about what type of studies should be done, there is great diversity in how they are performed (5). The literature contains diverse approaches to performing validations (as in References 6-10). This Report presents an approach to performing validation studies that encompasses much of the current literature and provides practical guidance. This approach should be viewed with the understanding that validation requirements are continually changing and vary widely depending on the type of drug being tested, the stage of drug development, and the regulatory group that will review the drug application. For our purposes, we will discuss validation studies as they apply to chromatographic methods, although the same principles apply to other analytical techniques.

Analytical Chemistry News & Features, May 1, 1996 305 A

In the early stages of drug development, it is usually not necessary to perform all of the various validation studies. Many researchers focus on specificity, linearity, accuracy, and precision studies for drugs in the preclinical through Phase II (preliminary efficacy) stages. The remaining studies are performed when the drug reaches the Phase III (efficacy) stage of development and has a higher probability of becoming a marketed product.

The process of validating a method cannot be separated from the actual development of the method conditions, because the developer will not know whether the method conditions are acceptable until validation studies are performed. The development and validation of a new analytical method may therefore be an iterative process. Results of validation studies may indicate that a change in the procedure is necessary, which may then require revalidation. During each validation study, key method parameters are determined and then used for all subsequent validation steps. To minimize repetitious studies and ensure that the validation data are generated under conditions equivalent to the final procedure, we recommend the following sequence of studies.

Establish minimum criteria

The first step in the method development and validation cycle should be to set minimum requirements, which are essentially acceptance specifications for the method. A complete list of criteria should be agreed on by the developer and the end users before the method is developed so that expectations are clear. For example, is it critical that method precision (RSD) be < 2%? Does the method need to be accurate to within 2% of the target concentration? Is it acceptable to have only one supplier of the HPLC column used in the analysis? During the actual studies and in the final validation report, these criteria will allow clear judgment about the acceptability of the analytical method.

Examples of minimum criteria are provided throughout this article that indicate practical ways to evaluate the acceptability of data from each validation study. The statistics generated for making comparisons are similar to what analysts will generate later in the routine use of the method and therefore can serve as a tool for evaluating later questionable data. More rigorous statistical evaluation techniques are available and should be used in some instances, but these may not allow as direct a comparison for method troubleshooting during routine use.

Demonstrate specificity

For chromatographic methods, developing a separation involves demonstrating specificity, which is the ability of the method to accurately measure the analyte response in the presence of all potential sample components. The response of the analyte in test mixtures containing the analyte and all potential sample components (placebo formulation, synthesis intermediates, excipients, degradation products, process impurities, etc.) is compared with the response of a solution containing only the analyte. Other potential sample components are generated by exposing the analyte to stress conditions sufficient to degrade it to 80-90% purity. For bulk pharmaceuticals, stress conditions such as heat (50 °C), light (600 FC), acid (0.1 N HCl), base (0.1 N NaOH), and oxidant (3% H2O2) are typical. For formulated products, heat, light, and humidity (85%) are often used.

The resulting mixtures are then analyzed, and the analyte peak is evaluated for peak purity and resolution from the nearest eluting peak. If an alternate chromatographic column is to be allowed in the final method procedure, it should be identified during these studies. Once acceptable resolution is obtained for the analyte and potential sample components, the chromatographic parameters, such as column type, mobile-phase composition, flow rate, and detection mode, are considered set.

An example of specificity criteria for an assay method is that the analyte peak will have baseline chromatographic resolution of at least 1.5 from all other sample components. If this cannot be achieved, the unresolved components at their maximum expected levels will not affect the final assay result by more than 0.5%. An example of specificity criteria for an impurity method is that all impurity peaks that are > 0.1% by area will have baseline chromatographic resolution from the main component peak(s) and, where practical, will have resolution from all other impurities.

Demonstrate linearity

A linearity study verifies that the sample solutions are in a concentration range where analyte response is linearly proportional to concentration. For assay methods, this study is generally performed by preparing standard solutions at five concentration levels, from 50 to 150% of the target analyte concentration. Five levels are required to allow detection of curvature in the plotted data. The standards are evaluated using the chromatographic conditions determined during the specificity studies. Standards should be prepared and analyzed a minimum of three times.

The 50 to 150% range for this study is wider than what is required by the FDA guidelines. In the final method procedure, a tighter range of three standards is generally used, such as 80, 100, and 120% of target; and in some instances, a single standard concentration is used. Validating over a wider range provides confidence that the routine standard levels are well removed from nonlinear response concentrations, that the method covers a wide enough range to incorporate the limits of content uniformity testing, and that it allows quantitation of crude samples in support of process development. For impurity methods, linearity is determined by preparing standard solutions at five concentration levels over a range such as 0.05-2.5 wt%.

Acceptability of linearity data is often judged by examining the correlation coefficient and y-intercept of the linear regression line for the response versus concentration plot. A correlation coefficient of > 0.999 is generally considered as evidence of acceptable fit of the data to the regression line. The y-intercept should be less than a few percent of the response obtained for the analyte at the target level. Although these are very practical ways of evaluating linearity data, they are not true measures of linearity (11, 12). These parameters, by themselves, can be misleading and should not be used without a visual examination of the response versus concentration plot.

An example of how the use of correlation coefficients can be misleading can be seen in data from an HPLC method for quantitation of mannitol. This method uses an internal standard, so the data are recorded as peak area ratios (mannitol area/internal standard area). Figure 1 is a plot of mannitol peak area ratio versus mannitol concentration for standards analyzed by the method. Although the correlation coefficient of the linear regression is > 0.999 (top), the plot indicates small deviations from linearity at low and high concentrations. An alternate way of evaluating the data is to plot response factor [(peak area ratio - y-intercept)/concentration] versus concentration (also shown in Figure 1).
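The two ways of judging linearity described above, regression statistics and the response factor, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the five standards are invented numbers with a slight built-in curvature (they are not the mannitol data from Figure 1), and an ordinary least-squares fit is assumed.

```python
from math import sqrt


def linear_fit(conc, resp):
    """Ordinary least-squares fit of resp = intercept + slope * conc.

    Returns (slope, intercept, correlation coefficient r).
    """
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(resp) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    syy = sum((y - mean_y) ** 2 for y in resp)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, resp))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def response_factors(conc, resp, intercept):
    """Response factor = (response - y-intercept) / concentration."""
    return [(y - intercept) / x for x, y in zip(conc, resp)]


def max_deviation_pct(factors, target_index):
    """Largest deviation of any response factor from the target-level one, in %."""
    target = factors[target_index]
    return max(abs(f - target) / target * 100.0 for f in factors)


# Invented illustration data: five standards at 50-150% of target,
# with a small amount of curvature in the response.
conc = [50.0, 75.0, 100.0, 125.0, 150.0]
resp = [0.49, 0.7275, 0.96, 1.1875, 1.41]

slope, intercept, r = linear_fit(conc, resp)
factors = response_factors(conc, resp, intercept)
max_dev = max_deviation_pct(factors, target_index=2)  # index 2 is the 100% level

print(f"slope={slope:.5f} intercept={intercept:.4f} r={r:.6f}")
print(f"max response-factor deviation from target level: {max_dev:.2f}%")
```

With these invented numbers the correlation coefficient exceeds 0.999 even though the response factors are not constant across the range, which is the kind of discrepancy the response factor plot is meant to expose.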
If an equivalent response was obtained at each concentration, the data points would form a straight line with a zero slope. The response factors plotted in Figure 1 (top) vary greatly over the range and fall only within 15% of the target concentration response. A second set of mannitol data, over a narrower range of concentrations, is shown in Figure 1 (bottom). The response factors for all concentrations in this range are within 1.5% of the target concentration response. The near-zero slope of the response factor plot indicates that a linear response is obtained over this concentration range.

At the completion of linearity studies, the appropriate concentration range for the standards and the injection volume should be set for all subsequent studies.

An example of linearity criteria for an assay method is that the correlation coefficient for each of three curves (five concentration levels each) will be > 0.99 for the range 80-120% of the target concentration. The y-intercept will be < 2% of the target concentration response. An alternate criterion is that a plot of response factor versus concentration will show all values within 2.5% of the target-level response factor for concentrations between 80 and 120% of the target concentration. For an impurity method, the correlation coefficient for each of three curves (five concentration levels each) will be > 0.98 for the range 0.1-2.5% of the main component concentration. The y-intercept will be < 10% of the response produced for a 2.5 wt% impurity. An alternate criterion is that a plot of response factor versus concentration will show all values within 5% of the mean response factor for concentrations > 0.5 wt% and within 10% of the mean response factor for concentrations < 0.5 wt%.

Demonstrate accuracy

The accuracy of a method is the closeness of the measured value to the true value for the sample. Accuracy is usually determined in one of four ways. First, accuracy can be assessed by analyzing a sample of known concentration and comparing the measured value to the true value. National Institute of Standards and Technology (NIST) reference standards are often used; however, such a well-characterized sample is usually not available for new drug-related analytes. The second approach is to compare test results from the method with results from an existing alternate method that is known to be accurate. Again, for pharmaceutical studies, such an alternate method is usually not available.

The third and fourth approaches are based on the recovery of known amounts of analyte spiked into sample matrix. The third approach, which is the most widely used recovery study, is performed by spiking analyte in blank matrices. For assay methods, spiked samples are prepared in triplicate at three levels over a range of 50-150% of the target concentration. If potential impurities have been isolated, they should be added to the matrix to mimic impure samples. For impurity methods, spiked samples are prepared in triplicate at three levels over a range that covers the expected impurity content of the sample, such as 0.1-2.5 wt%. The analyte levels in the spiked samples are determined using the same quantitation procedure as will be used in the final method procedure (i.e., same number and levels of standards, same number of sample and standard injections, etc.). The percent recovery should then be calculated.

Figure 1. Peak area ratio (circles) and response factor (squares) versus concentration for mannitol. (Top) Concentration range is 5-80 mg/mL. For peak area ratio line, y = 0.09775 + 0.080569x and correlation coefficient = 0.99952. (Bottom) Concentration range is 12-28 mg/mL. For peak area ratio line, y = 0.027 + 0.08625x and correlation coefficient = 0.99965.

The fourth approach is the technique of standard additions, which can also be used to determine recovery of spiked analyte. This approach is used if it is not possible to prepare a blank sample matrix without the presence of the analyte. This can occur, for example, with lyophilized material, in which the speciation in the lyophilized material is significantly different when the analyte is absent.

An example of accuracy criteria for an assay method is that the mean recovery will be 100 ± 2% at each concentration over the range of 80-120% of the target concentration. For an impurity method, the mean recovery will be within 0.1% absolute of the theoretical concentration or 10% relative, whichever is greater, for impurities in the range of 0.1-2.5 wt%.
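As a rough illustration of how the recovery criteria above might be checked, the following Python sketch computes mean percent recovery for triplicate spiked samples and applies the example assay and impurity limits. The function names and all sample values are hypothetical and invented for the example; the tolerances are the ones quoted in the example criteria.

```python
from statistics import mean


def percent_recovery(measured, spiked):
    """Percent recovery of a spiked amount."""
    return measured / spiked * 100.0


def assay_recovery_ok(mean_recovery_pct, tolerance_pct=2.0):
    """Example assay criterion: mean recovery within 100 +/- 2%."""
    return abs(mean_recovery_pct - 100.0) <= tolerance_pct


def impurity_recovery_ok(mean_measured_wt, theoretical_wt,
                         abs_tol_wt=0.1, rel_tol=0.10):
    """Example impurity criterion: within 0.1% absolute or 10% relative,
    whichever is greater."""
    allowed = max(abs_tol_wt, rel_tol * theoretical_wt)
    return abs(mean_measured_wt - theoretical_wt) <= allowed


# Invented triplicate results (measured % of target) at three spike levels.
assay_spikes = {
    80.0: [79.2, 80.5, 79.9],
    100.0: [100.8, 99.5, 100.3],
    120.0: [121.1, 119.0, 120.6],
}

assay_results = {}
for level, measured_values in assay_spikes.items():
    mean_rec = mean(percent_recovery(m, level) for m in measured_values)
    assay_results[level] = (mean_rec, assay_recovery_ok(mean_rec))
    print(f"{level:.0f}% level: mean recovery {mean_rec:.1f}%, "
          f"pass={assay_results[level][1]}")

# Invented impurity examples (wt%): mean measured versus theoretical.
low_level_ok = impurity_recovery_ok(0.58, 0.5)   # absolute limit applies
high_level_ok = impurity_recovery_ok(2.25, 2.0)  # relative limit applies
print(low_level_ok, high_level_ok)
```

The "whichever is greater" rule is expressed with `max`, so the absolute limit governs at low impurity levels and the relative limit takes over above 1 wt%.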

Determine the range

The range of an analytical method is the concentration interval over which acceptable accuracy, linearity, and precision are obtained. In practice, the range is determined using data from the linearity and accuracy studies. Assuming that acceptable linearity and accuracy (recovery) results were obtained as described earlier, the only remaining factor to be evaluated is precision. These precision data should be available from the triplicate analyses of spiked samples in the accuracy study.

Figure 2 illustrates how precision may change as a function of analyte level. The %RSD values for ethanol quantitation by GC increased significantly as the concentration decreased from 1000 ppm to 10 ppm. Higher variability is expected as the analyte levels approach the detection limit for the method. The developer must judge at what concentration the imprecision becomes too great for the intended use of the method.

Figure 2. %RSD versus concentration for a GC headspace analysis of ethanol.

An example of range criteria for an assay method is that the acceptable range will be defined as the concentration interval over which linearity and accuracy are obtained per previously discussed criteria and that yields a precision of < 3% RSD. For an impurity method, the acceptable range will be defined as the concentration interval over which linearity and accuracy are obtained per the above criteria, and that, in addition, yields a precision of < 10% RSD.

Determine precision, Round 1

The precision of an analytical method is the amount of scatter in the results obtained from multiple analyses of a homogeneous sample. To be meaningful, the precision study must be performed using the exact sample and standard preparation procedures that will be used in the final method.

The first type of precision study is instrument precision or injection repeatability (3). A minimum of 10 injections of one sample solution is made to test the performance of the chromatographic instrument. The second type is repeatability or intra-assay precision (2). Intra-assay precision data are obtained by repeatedly analyzing, in one laboratory on one day, aliquots of a homogeneous sample, each of which has been independently prepared according to the method procedure. From these precision studies, the sample preparation procedure, the number of replicate samples to be prepared, and the number of injections required for each sample in the final method procedure will be set. Two additional types of precision studies are described later in Round 2.

An example of precision criteria for an assay method is that the instrument precision (RSD) will be < 1% and the intra-assay precision will be < 2%. For an impurity method, at the limit of quantitation, the instrument precision will be < 5% and the intra-assay precision will be < 10%.

Widen the scope

Once these validation studies are complete, the method developers should be confident in the ability of the method to provide good quantitation in their own laboratories. This result may be sufficient for many methods, especially in the early phases of drug development. The remaining studies should provide greater assurance that the method will work well in other laboratories, where different operators, instruments, and reagents are involved and where it will be used over much longer periods of time.

This is a good time to begin accumulating data for two or more system suitability criteria, which are required prior to routine use of the method to ensure that it is performing appropriately. Typically, the process involves making five injections of a standard solution and evaluating several chromatographic parameters (1) such as resolution, area % reproducibility, number of theoretical plates, and tailing factor.

Establish the detection limit

The detection limit of a method is the lowest analyte concentration that produces a response detectable above the noise level of the system, typically, three times the noise level. The detection limit needs to be determined only for impurity methods in which chromatographic peaks near the detection limit will be observed. The detection limit should be estimated early in the method development-validation process and should be repeated using the specific wording of the final procedure if any changes have been made. It is important to test the method detection limit on different instruments, such as those used in the different laboratories to which the method will be transferred. An example of detection limit criteria is that, at the 0.05% level, an impurity will have S/N > 3.

Establish the quantitation limit

The quantitation limit is the lowest level of analyte that can be accurately and precisely measured. This limit is required only for impurity methods and is determined by reducing the analyte concentration until a level is reached where the precision of the method is unacceptable. If not determined experimentally, the quantitation limit is often calculated as the analyte concentration that gives S/N = 10. An example of quantitation limit criteria is that the limit will be defined as the lowest concentration level for which an RSD < 20% is obtained when an intra-assay precision study is performed.

Establish stability

During the earlier validation studies, the method developer gained some information on the stability of reagents, mobile phases, standards, and sample solutions. For routine testing in which many samples are prepared and analyzed each day, it is often essential that solutions be stable enough to allow for delays such as instrument breakdowns or overnight analyses using autosamplers. At this point, the limits of stability should be tested. Samples and standards should be tested over at least a 48-h period, and quantitation of components should be determined by comparison to freshly prepared standards. If the solutions are not stable over 48 h, storage conditions or additives should be identified that can improve stability.

An example of stability criteria for assay methods is that sample and standard solutions and the mobile phase will be stable for 48 h under defined storage conditions. Acceptable stability is < 2% change in standard or sample response, relative to freshly prepared standards. The mobile phase is considered to have acceptable stability if aged mobile phase produces equivalent chromatography (capacity factors, resolution, or tailing factor) and assay results are within 2% of the value obtained with fresh mobile phase. For impurity methods, the sample and standard solutions and mobile phase will be stable for 48 h under defined storage conditions. Acceptable stability is < 20% change in standard or sample response at the limit of quantitation, relative to freshly prepared standards. The mobile phase is considered to have acceptable stability if aged mobile phase produces equivalent chromatography and if impurity results at the limit of quantitation are within 20% of the values obtained with fresh mobile phase.

Establish precision, Round 2

The remaining precision studies comprise much of what historically has been called ruggedness. Intermediate precision (2) is the precision obtained when the assay is performed by multiple analysts, using multiple instruments, on multiple days, in one laboratory. Different sources of reagents and multiple lots of columns should also be included in this study. Intermediate precision results are used to identify which of the above factors contribute significant variability to the final result.
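The precision figures quoted in this article are relative standard deviations. As a minimal sketch of how intermediate precision data might be summarized, the Python below computes %RSD within each analyst/day group and for all results pooled together. The function name, the grouping by analyst and day, and the assay values are all invented for illustration, and a simple comparison of within-group and pooled RSD is assumed rather than a formal variance-components analysis.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (sample standard deviation / mean) in %."""
    return stdev(values) / mean(values) * 100.0


# Invented assay results (% of label claim), grouped by analyst and day.
results = {
    ("analyst A", "day 1"): [99.6, 100.2, 100.1],
    ("analyst B", "day 2"): [100.9, 101.3, 100.7],
}

# Within-group scatter, comparable to intra-assay precision.
group_rsd = {group: rsd_percent(vals) for group, vals in results.items()}

# Scatter across all analysts and days combined.
pooled = [v for vals in results.values() for v in vals]
overall_rsd = rsd_percent(pooled)

for group, value in group_rsd.items():
    print(f"{group}: RSD = {value:.2f}%")
print(f"all results pooled: RSD = {overall_rsd:.2f}%")
```

When the pooled RSD is clearly larger than the within-group values, as in this invented data set, the difference points to the analyst, instrument, or day as a source of variability worth investigating.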
The last type of precision study is reproducibility (2), which is determined by testing homogeneous samples in multiple laboratories, often as part of interlaboratory crossover studies. The evaluation of reproducibility results often focuses more on measuring bias in results than on determining differences in precision alone.

Statistical equivalence is often used as a measure of acceptable interlaboratory results. An alternative, more practical approach is the use of "analytical equivalence," in which a range of acceptable results is chosen prior to the study and used to judge the acceptability of the results obtained from the different laboratories. An example of reproducibility criteria for an assay method could be that the assay results obtained in multiple laboratories will be statistically equivalent or the mean results will be within 2% of the value obtained by the primary testing lab. For an impurity method, results obtained in multiple laboratories will be statistically equivalent or the mean results will be within 10% (relative) of the value obtained by the primary testing lab for impurities > 1 wt%, within 25% for impurities from 0.1-1.0 wt%, and within 50% for impurities