Numeric results from quality attributes testing of drug product and drug substance lots can be used for different statistical analyses. One such analysis is the calculation of statistical tolerance intervals from lot-release data to assist in the determination of specification acceptance criteria (1). Data from manufactured batches placed on stability at the recommended storage condition (RSC) can also provide useful information to estimate long-term variation. Below, I address potential concerns associated with pooling disparate data sources and illustrate a technique to perform appropriate calculations using statistical software.
Modeling Data from Two Sources
Often, only limited data are available for a biopharmaceutical product before large-scale manufacturing. My analysis considered the pooling of reportable values from two distinct data sources to estimate long-term manufacturing variability: lot-release results and results from lots that have been placed on a stability study and stored at the RSC.
Typically, one reportable result for each quality attribute is obtained for a released lot. That result is compared with the release specification acceptance criteria. The sample standard deviation calculated from n released lots estimates $(\sigma_L^2 + \sigma_E^2)^{1/2}$, in which $\sigma_L^2$ represents lot-to-lot variance and $\sigma_E^2$ represents method variance. Another data set that might be available comes from tracking analytical method performance. Those data typically are used in individuals (X) control charts, and the standard deviation used to calculate the limits of those charts provides an alternative estimate of $\sigma_E$.
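The text does not say how the X-chart standard deviation is obtained. A minimal sketch, assuming the common convention of dividing the average moving range (span of two) by the control-chart constant d2 = 1.128, is shown here. The function names and the example values are hypothetical.

```python
D2_SPAN_2 = 1.128  # control-chart constant d2 for moving ranges of two points


def xchart_sigma(values):
    """Estimate the method SD from consecutive method-monitoring results."""
    if len(values) < 2:
        raise ValueError("at least two results are needed for a moving range")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_moving_range = sum(moving_ranges) / len(moving_ranges)
    return mean_moving_range / D2_SPAN_2


def xchart_limits(values):
    """Return (lower limit, center line, upper limit) using 3-sigma limits."""
    center = sum(values) / len(values)
    sigma = xchart_sigma(values)
    return center - 3 * sigma, center, center + 3 * sigma


# Hypothetical control-sample results, for illustration only.
control_results = [10.0, 12.0, 11.0, 13.0]
print(xchart_sigma(control_results))
print(xchart_limits(control_results))
```

The value returned by `xchart_sigma` is the quantity that would serve as the alternative estimate of $\sigma_E$.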
To estimate long-term variation, it might be tempting to add the sample variance from method-monitoring data to the sample variance calculated from lot-release data. However, that approach is inappropriate because the sum estimates $\sigma_E^2 + (\sigma_L^2 + \sigma_E^2) = \sigma_L^2 + 2\sigma_E^2$. Thus, method variation is double-counted. An alternative approach is to combine the lot-release and stability data, then fit an appropriate mathematical model to estimate $\sigma_L^2$ and $\sigma_E^2$. The estimate of $\sigma_E$ derived for method X-chart limits can be used as an orthogonal confirmation of the estimate of $\sigma_E$ from the mathematical model.
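The double-counting can be illustrated with a small simulation. This is a sketch under stated assumptions: the standard deviations are hypothetical, both effects are drawn as independent normal variables, and the sample sizes are deliberately far larger than any real data set so that the sample variances sit close to their expected values.

```python
import random
import statistics


def simulate_variance_estimates(sigma_l, sigma_e, n_lots, n_monitor, seed=0):
    """Return (lot-release sample variance, method-monitoring sample variance)."""
    rng = random.Random(seed)
    # Lot release: one reportable value per lot = lot effect + method error.
    release = [rng.gauss(0.0, sigma_l) + rng.gauss(0.0, sigma_e)
               for _ in range(n_lots)]
    # Method monitoring: repeated tests of one control material = method error only.
    monitoring = [rng.gauss(0.0, sigma_e) for _ in range(n_monitor)]
    return statistics.variance(release), statistics.variance(monitoring)


SIGMA_L, SIGMA_E = 2.0, 1.0  # hypothetical true standard deviations
var_release, var_method = simulate_variance_estimates(SIGMA_L, SIGMA_E, 20000, 20000)

true_total = SIGMA_L**2 + SIGMA_E**2    # target: lot-to-lot plus method variance
naive_total = var_release + var_method  # adds method variance a second time

print(f"true long-term variance:       {true_total:.2f}")
print(f"lot-release sample variance:   {var_release:.2f}")
print(f"naive sum of the two variances: {naive_total:.2f}")
```

The lot-release variance alone already targets the long-term variance; the naive sum overshoots it by roughly one extra method variance.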
Estimating Method and Lot-to-Lot Variability
When compiling the combined stability and lot-release data set, the t = 0 month result could mistakenly be included twice for released lots that were subsequently placed on stability. Thus, the data set should be checked before statistical calculations to remove duplicates at t = 0 month. The recommended mathematical model follows the overarching strategy in ICH Q1E (2). The stability shelf-life working group of the Product Quality Research Institute (3) noted random variation among commercial stability lots regarding initial levels and trends (e.g., intercepts and slopes for a linear model). Thus, it is assumed that the number of batches placed on stability and used in the statistical analysis permits batch-related terms to be treated as random (instead of fixed) effects. A model that meets those requirements by allowing individual lots to have different intercepts and slopes is called a random coefficients model. It is defined as Model 1:
$$Y_{ij} = (\mu + \mathrm{lot}_i) + (\beta + B_i) \times \mathrm{time}_{ij} + E_{ij}$$
In that equation, $i = 1, \ldots, n$; $j = 1, \ldots, T_i$; $Y_{ij}$ is the response for lot i at time point j; $\mu + \mathrm{lot}_i$ is the y-intercept of the ith lot, with $\mathrm{lot}_i$ assumed to be $\sim N(0, \sigma_L^2)$; $\beta + B_i$ is the slope of the ith lot, with $B_i$ assumed to be $\sim N(0, \sigma_B^2)$; $\mathrm{time}_{ij}$ is the jth time point for the ith lot; $E_{ij}$ is a normal random error term arising from model misspecification and measurement error, assumed to be $\sim N(0, \sigma_E^2)$; n is the number of lots; and $T_i$ is the number of responses obtained for lot i.
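The duplicate check at t = 0 month can be sketched as follows. The record layout (`lot`, `source`, `time_months`, `value`) is hypothetical, and the rule is an assumption: a second t = 0 record for the same lot is treated as a duplicate only when its value matches the first exactly, so genuine retests with different values are left for manual review.

```python
def drop_duplicate_t0(records):
    """Keep the first copy of each (lot, value) pair recorded at t = 0 months."""
    seen_t0 = set()
    kept = []
    for rec in records:
        if rec["time_months"] == 0:
            key = (rec["lot"], rec["value"])
            if key in seen_t0:
                continue  # same lot and same t = 0 value: treat as a duplicate
            seen_t0.add(key)
        kept.append(rec)
    return kept


# Hypothetical combined data set: lot A001 was released and then placed on
# stability, so its t = 0 result appears in both sources.
combined = [
    {"lot": "A001", "source": "release", "time_months": 0, "value": 99.1},
    {"lot": "A001", "source": "stability", "time_months": 0, "value": 99.1},
    {"lot": "A001", "source": "stability", "time_months": 3, "value": 98.7},
    {"lot": "A002", "source": "release", "time_months": 0, "value": 100.4},
]
cleaned = drop_duplicate_t0(combined)
print(len(combined), "records before,", len(cleaned), "after")
```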
After fitting Model 1 to the combined stability and lot-release data, the residuals are examined to ensure that standard model assumptions of independence, normality, linearity, and homogeneity of variance are met. The estimates for $\sigma_L$, $\sigma_B$, and $\sigma_E$ can be calculated using statistical software as described by Burdick et al. (4). With data from stability lots manufactured using well-controlled and well-understood processes, the estimate of $\sigma_B$ frequently is zero or close to zero, implying that a common slope applies across stability lots at the RSC. For that reason, a random intercepts model, which is identical to Model 1 but assumes $\sigma_B$ to be zero, has been assumed a priori in other studies (5, 6).
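As one illustration of fitting such a model with statistical software, the sketch below uses the `mixedlm` function from the Python statsmodels package on simulated data. It is not the procedure described in reference 4. The column names, parameter values, and pull schedule are hypothetical, and the simulation includes release-only lots (a single t = 0 result) alongside stability lots.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def simulate_lots(n_stability=12, n_release_only=20, sigma_l=1.0, sigma_b=0.0,
                  sigma_e=0.5, mu=100.0, beta=-0.05, seed=0):
    """Simulate hypothetical combined release and stability data under Model 1."""
    rng = np.random.default_rng(seed)
    stability_times = [0, 3, 6, 9, 12, 18, 24]  # months; assumed pull schedule
    rows = []
    for i in range(n_stability + n_release_only):
        intercept = mu + rng.normal(0.0, sigma_l)   # mu + lot_i
        slope = beta + rng.normal(0.0, sigma_b)     # beta + B_i
        times = stability_times if i < n_stability else [0]
        for t in times:
            y = intercept + slope * t + rng.normal(0.0, sigma_e)
            rows.append({"lot": f"L{i:03d}", "time": float(t), "y": y})
    return pd.DataFrame(rows)


def fit_variance_components(df, random_slope=True):
    """Fit by REML and return estimated SDs for lot, slope, and error terms."""
    # "~time" adds a random slope per lot; None leaves a random intercept only.
    re_formula = "~time" if random_slope else None
    model = smf.mixedlm("y ~ time", df, groups=df["lot"], re_formula=re_formula)
    result = model.fit(reml=True)
    cov_re = np.asarray(result.cov_re)  # random-effects covariance matrix
    sd_lot = float(np.sqrt(max(cov_re[0, 0], 0.0)))
    sd_slope = float(np.sqrt(max(cov_re[1, 1], 0.0))) if random_slope else 0.0
    sd_error = float(np.sqrt(result.scale))  # residual (method) SD
    return {"sd_lot": sd_lot, "sd_slope": sd_slope, "sd_error": sd_error}
```

Calling `fit_variance_components(simulate_lots(), random_slope=False)` fits the reduced model with a common slope; `random_slope=True` corresponds to Model 1. With the random slope, statsmodels also estimates an intercept-slope covariance that Model 1 as written does not list, and the optimizer may warn about convergence when the slope variance is near zero.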
Application for Reliable Estimations
The technique described herein is one way to calculate an estimate of long-term variation from biopharmaceutical manufacturing data. Once reliable estimates of $\sigma_L$ and $\sigma_E$ (denoted as $S_L$ and $S_E$, respectively) are available, an estimate of long-term variability can be calculated as $(S_L^2 + S_E^2)^{1/2}$. That estimate can be used for multiple purposes, including for determining release and stability acceptance criteria (5).
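The final calculation is simple enough to show directly. In this sketch the function name and the input values are hypothetical; the inputs stand for the estimates $S_L$ and $S_E$ obtained from the fitted model.

```python
import math


def long_term_sd(s_lot, s_method):
    """Combine lot-to-lot and method SD estimates into a long-term SD."""
    if s_lot < 0 or s_method < 0:
        raise ValueError("standard deviations must be non-negative")
    return math.sqrt(s_lot**2 + s_method**2)


# Hypothetical estimates: S_L = 1.2 and S_E = 0.5 give sqrt(1.44 + 0.25),
# which is approximately 1.3.
print(long_term_sd(1.2, 0.5))
```

Because the variances are added before the square root is taken, the long-term SD is always smaller than the sum $S_L + S_E$ when both are positive.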
1 Dong X, Tsong Y, Shen M. Statistical Considerations in Setting Product Specifications. J. Biopharm. Stat. 25(2) 2015: 280–294.
2 Q1E: Evaluation for Stability Data. International Council on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use: Geneva, Switzerland, 2003.
3 Capen R, et al. On the Shelf Life of Pharmaceutical Products. AAPS PharmSciTech. 13(3) 2012: 911–918.
4 Burdick RK, et al. Statistical Applications for Chemistry, Manufacturing and Controls (CMC) in the Pharmaceutical Industry. Springer, Cham, Switzerland, 2017: 291–292.
5 Montes RO, Burdick RK, LeBlond DJ. Simple Approach to Calculate Random Effects Model Tolerance Intervals to Set Release and Shelf-Life Specification Limits of Pharmaceutical Products. PDA J. Pharm. Sci. Technol. 73(1) 2018: 39–59.
6 Schmelzer B, Mischo A, Innerbichler K. Statistically Significant Versus Practically Relevant Trend in Stability Data. PDA J. Pharm. Sci. Technol. 75(4) 2021: 341–356.