Data from clinical trials are frequently incomplete, particularly data sets collected during large, late‐phase trials or during routine clinical patient care or follow‐up visits. Portions of data may be missing or inaccurate as a result of factors such as study site noncompliance, patient noncompliance, inappropriate sample handling, data entry errors, and analytical problems. How “problematic” data are handled can impact their interpretation, especially when data sets used for population pharmacokinetic (PPK) modeling contain missing or erroneous records. Before beginning an analysis, pharmacometricians often spend a large portion of their time dealing with problematic data.
During data cleaning (data quality assurance [QA]), the first step is to identify missing or problematic data. Concentration‐time data and dosing records are often the primary concern, but other issues, such as missing or questionable covariate data, must also be considered. Once issues/discrepancies are identified, the next challenge is to evaluate the frequency of each type of problem and its likely cause in order to establish appropriate methods for handling the erroneous data. Prior studies have addressed handling of specific types of problematic data, although no set of broad recommendations spanning the various types of problematic data has been previously presented.
Through reviews of published methods, simulations of data sets with known errors, and evaluations using different methods for handling these errors, this tutorial aims to provide guidance for dealing with problematic clinical (and some nonclinical) concentration versus time, dosing, and covariate data.
This tutorial is intended to be used by scientists analyzing pharmacokinetic (PK) data in which data are missing or in which apparently questionable or erroneous data are present. Although QA and data quality control (QC) are essential to successful modeling, this tutorial assumes the data set has already undergone appropriate QC or was assembled from locked, clean data. Basic assessments include exploratory data analysis by plotting and summarizing existing data (e.g., summary statistics) to ensure all data are within expected ranges relative to the subject population. Communication with the clinical research staff, analytical laboratory scientists, and/or sponsor is important to attempt to explain any missing or apparently problematic data. For additional information on pharmacology data QA and QC, we refer the reader to previously published reviews.1,2
In addition to reviewing published studies evaluating methods for handling missing or erroneous data, simulation and estimation were used to fill in gaps or to expand on published information. A single 400‐subject data set was simulated using SimCyp3 to ensure appropriate ranges of weight, age, creatinine clearance (CrCL), and sex. This covariate data set was randomly sampled without replacement to generate 100 data sets of 200, 100, 50, or 25 unique subjects with the widest possible range of covariate values to evaluate the impact of missing or erroneous concentration data, data that are below the limit of quantification (BLQ), and missing or incorrect covariate data on the ability to accurately and precisely estimate PK parameter values. A total of 100 replicates of concentration‐time data were simulated with each of the data sets using multiple PK models, including intravenous (i.v.) bolus and oral dosing, one‐compartment and two‐compartment structural models, various scenarios of clearance (CL) and volume of distribution (V), and various covariate effect levels (covariate factor [FAC]) for continuous (weight [WT]) or categorical (sex [SEX]) covariate effects on CL. For each scenario, three sizes of the covariate effect on CL were evaluated (weak, 0.3 and 0.8; moderate, 0.5 and 0.7; strong, 0.75 and 0.5; for continuous and categorical covariates, respectively). The effect of WT on CL was modeled with a power function as in Equation (1):

$$\mathrm{CL} = \theta_{\mathrm{CL}} \cdot \left(\frac{\mathrm{WT}}{70}\right)^{\mathrm{FAC}} \tag{1}$$

where θCL represents the typical value of CL for a given value of WT normalized to 70 kg. The effect of SEX on CL was modeled according to Equation (2):

$$\mathrm{CL} = \theta_{\mathrm{CL}} \cdot \mathrm{FAC}^{\mathrm{SEX}} \tag{2}$$

where SEX was coded as 0 for females and 1 for males. Various percentages of data were removed or adjusted to simulate missing or erroneous data, respectively, for each evaluation. Table 1 provides a summary of the study subject characteristics. Additional details on the simulated data sets are in the Supplementary Material.
TABLE 1 Summary of the 400‐patient data set

| Covariate | Females, N = 200, median (min, max) | Males, N = 200, median (min, max) |
| --- | --- | --- |
| WT (kg) | 67 (40, 117) | 79 (53, 120) |
| AGE (years) | 41 (21, 60) | 43 (21, 65) |
| SECR (mg/dl) | 0.6 (0.4, 1.3) | 0.8 (0.5, 1.4) |
| CrCL (ml/min) | 123 (41, 315) | 127 (74, 234) |

Abbreviations: AGE, age; CrCL, creatinine clearance; SECR, serum creatinine; WT, baseline body weight.
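As a point of reference, a minimal NONMEM $PK sketch of the covariate models in Equations (1) and (2) is shown below. The tutorial's actual control streams are in the Supplementary Material; the THETA/ETA numbering and the combined use of both covariates (they were evaluated separately in the simulations) are illustrative only.

```
$PK
  FACWT  = (WT/70)**THETA(3)      ; Equation (1): power model for weight (FAC = THETA(3))
  FACSEX = THETA(4)**SEX          ; Equation (2): SEX = 0 (female), 1 (male); FAC = THETA(4)
  TVCL   = THETA(1)*FACWT*FACSEX  ; typical value of clearance
  CL     = TVCL*EXP(ETA(1))
  V      = THETA(2)*EXP(ETA(2))
  S1     = V
```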
Each simulated data set with either BLQ or missing or erroneous covariate data was evaluated using NONMEM (version 7.3; Icon PLC, Dublin, Ireland) with the first‐order conditional estimation method, implementing the M3 method, as appropriate (see the Concentration data BLQ section), to estimate model parameters. Model output was automatically tabulated and summarized graphically using an R script. Additional details on the data simulation and parameter estimation processes, including NONMEM code, are provided in the Supplementary Material. Overall flow diagrams of the simulation and estimation approaches are shown in Figure S1.
To evaluate the impact of various factors on the accuracy and precision of estimated PK parameters, multiple output variables between the experimental (i.e., 5%–50% BLQ or missing/erroneous data) and control (i.e., 0% BLQ or 0% missing/erroneous data) conditions were compared. Output parameters can be found in the Supplementary Material Output Tables sections. Model performance in all scenarios was evaluated by comparing parameter estimates, associated relative standard errors (RSEs), proportion of successful covariance steps, and measures of bias (relative mean error [RME]) and precision (root mean square error [RMSE]). Details on these metrics for evaluating the handling methods applied to the various missing or erroneous data scenarios are provided in the Supplementary Material.
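The exact computations used in this tutorial are given in the Supplementary Material; for orientation, RME (bias) and RMSE (precision) for a parameter estimate are commonly defined over the simulation replicates as

$$\mathrm{RME}\ (\%) = \frac{100}{N}\sum_{i=1}^{N}\frac{\hat{\theta}_i-\theta_{\mathrm{true}}}{\theta_{\mathrm{true}}}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{\theta}_i-\theta_{\mathrm{true}}\right)^{2}}$$

where $\hat{\theta}_i$ is the estimate from the ith replicate, $\theta_{\mathrm{true}}$ is the simulated (true) value, and N is the number of replicates.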
PPK models should be developed from data where all dosing times, concentration measurements, and sampling times are known. Unfortunately, that is rarely the case. Two commonly encountered types of problematic PK data are errors in concentration measurements and errors in recorded PK sampling times. Causes of erroneous concentrations may include problems with the bioanalytical assay (e.g., assay variability, presence of interfering endogenous compounds), differences between assay laboratories and/or changes to the analytical technique over time, sampling anomalies (e.g., collecting a PK sample from an i.v. line used to deliver the drug dose, poor or variable sample stability, mislabeling samples), or errors in recorded data. Causes of errors in sampling times may include failure to capture the actual sample time or recording an incorrect sample time.
The analytical method used to quantify a given compound and/or its metabolites is an essential element of PK studies. All analytical measurements have associated error, and Graves et al.4 investigated the impact of this error on PK parameters estimated using several methods for three different ratios of absorption rate constant to elimination rate constant (Ka/Ke). Although assay error is often a minor component of residual unexplained variability (RUV), the authors found that random assay error may alter the shape of the PK curve, leading to false inclusion of additional compartments. For the lowest Ka/Ke (Ka/Ke = 2), all parameter estimation methods except the nonlinear mixed‐effects approach underestimated Ke and overestimated Ka and V. These biases increased as assay error increased (up to 30%). Even at the ±20% allowed at the lower limit of quantification (LLOQ) by bioanalytical guidelines,5,6 Graves and colleagues found that PK parameters were already overestimated at 20% “assay error.” A review of the bioanalytical validation report before starting data analysis can signal the potential for biased PK estimates attributable to the accuracy and precision of the analytical assay used (e.g., overestimated PK parameters may be obtained if the bioanalytical method has acceptance criteria close to 20% at the LLOQ level).
When concentration data are confounded by interfering endogenous compounds (including the same analyte in cases where the drug itself is an endogenous compound, such as a glucocorticoid or a growth factor), baseline correction can be applied.7 However, this methodology ignores analytical assay variability, and postdose differences do not consider the measurement error associated with the endogenous baseline. Other methods have been proposed to model endogenous compound concentrations.8,9 With linear kinetics, when the rate of endogenous drug production is constant and there is no feedback control of production (the endogenous drug is at steady state), the endogenous drug represents a constant increment to the exogenous drug in the sampled compartment (Supplementary Code 1).
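A minimal sketch of this constant-baseline approach is shown below, assuming a one-compartment model with proportional residual error; the tutorial's own example is provided in Supplementary Code 1, and the parameter numbering here is illustrative.

```
$PK
  CL   = THETA(1)*EXP(ETA(1))
  V    = THETA(2)*EXP(ETA(2))
  BASE = THETA(3)*EXP(ETA(3))   ; endogenous concentration, assumed constant over time
  S1   = V
$ERROR
  IPRED = F + BASE              ; exogenous prediction plus constant endogenous increment
  Y     = IPRED*(1 + EPS(1))
```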
In some cases, however, the endogenous values may fluctuate over time because of various feedback or control systems. Circadian rhythms are evident in many physiological and endocrine systems, leading to concentrations that fluctuate over time. Circadian rhythms can be represented using trigonometric functions that oscillate over 24 h (Equation 3):

$$\mathrm{Circadian}(T) = \mathrm{amplitude} \cdot \cos\!\left(\frac{2\pi\,(T - \mathrm{phase})}{24}\right) \tag{3}$$

where amplitude is the maximum extent of the oscillation from the point of equilibrium, cos is the cosine function, T is time, and phase is the time shift of the oscillation within the 24‐h cycle. Circadian patterns may be evident, particularly if placebo data are collected over a period of time. However, if placebo data are collected at limited times (e.g., daily, monthly), and sampling only occurs at times that reflect the same part of the curve, a potential circadian rhythm may be hidden. If unrecognized, circadian patterns can obscure significant concentration‐effect relationships. Rose and colleagues10 aimed to characterize the PPK of a recombinant follicle‐stimulating hormone (FSH) taking into account endogenous FSH levels. In this case, the change in the endogenous FSH amount over time in the central compartment was described as a turnover model with a zero‐order production rate (Equation 4):

$$\frac{d\,\mathrm{FSH_{en}}}{dt} = K_{\mathrm{in}} - K \cdot \mathrm{FSH_{en}} \tag{4}$$
where FSHen is the endogenous FSH level, Kin is a zero‐order production rate, and K is the elimination rate constant. The authors demonstrated that, in subjects with endogenous FSH, endogenous FSH levels change over time as a consequence of feedback from ovarian hormones and that total (measured) FSH concentrations are affected accordingly. If this is not accounted for, the resulting PK parameter estimates are biased, leading to erroneous characterization of drug disposition. Therefore, whenever the initial conditions of the system are nonzero, regardless of the amount of the endogenous compound in circulation, the initial amount or concentration must be accounted for by the model so that its circulating concentration is considered during PK analysis. However, adjustments for an endogenous substance and any time‐dependent variation it exhibits will depend on the overall ability to measure the endogenous substance and on its impact relative to the administered exogenous substance.
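As an illustration only (not the model used by Rose et al.), a time-varying endogenous baseline following Equation (3) could be sketched in NONMEM as below, with MESOR, AMP, and PHS as hypothetical parameter names.

```
$PK
  MESOR = THETA(3)*EXP(ETA(3))   ; mean endogenous level
  AMP   = THETA(4)               ; amplitude of the circadian oscillation
  PHS   = THETA(5)               ; phase shift (h)
$ERROR
  PI    = 3.141593
  CIRC  = MESOR + AMP*COS(2*PI*(TIME - PHS)/24)   ; Equation (3) added to a mean level
  IPRED = F + CIRC               ; exogenous prediction plus time-varying endogenous baseline
  Y     = IPRED*(1 + EPS(1))
```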
Ideally, concentration data used for PK analysis should be obtained from the same laboratory using the same validated analytical assay, or when multiple analytical laboratories are used, those assays should be cross‐validated. However, when data originate from multiple assays, the interchangeability of results from those different laboratories or assays should be evaluated. A graphical visualization of dose‐normalized concentration (for drugs with linear PK) versus time data may reveal analytical discrepancies. Figure 1a shows an example where results from assays 1 and 2 are not in agreement. Notably, although we can clearly see the bias in the concentration‐time plots in this example, there is no obvious indication of bias in the residual or diagnostic plots for this analysis (Figure 1b–d). Furthermore, assay bias is more often subtle and less apparent. Assessing “assay” as a covariate (e.g., on CL) is therefore recommended and would also be a strategy for estimating the degree of difference or bias between the assays. If plots and/or covariate analysis suggest disagreement between assays, a scaling factor that eliminates method differences can be introduced (Supplementary Code 2). However, this scenario is undesirable because the pharmacometrician may not know which assay should be adjusted. In addition, even if the assay results are interchangeable, assay precisions may differ. This potential variability between assays should be taken into account in the analysis, and separate residual errors should be used by introducing an independent indicator variable (NONMEM Guide 201111). This variable (ASAY), which indicates the assay methodology for each concentration, is defined with a value of 0 for assay 1 and 1 for assay 2. Estimating different residual variabilities associated with each assay is demonstrated in Supplementary Code 3. The equation resolves to Equation (5):

$$Y = F + F \cdot \varepsilon_{1} \tag{5}$$

when ASAY = 0, and Equation (6):

$$Y = F + F \cdot \varepsilon_{2} \tag{6}$$

when ASAY = 1, where F is the prediction evaluated by the ADVAN model and Y is the user‐defined value relating F to the observed data. The estimate of ε1 (EPS(1)) reflects the RUV associated with assay 1, and ε2 (EPS(2)) reflects the RUV associated with assay 2. This strategy may improve model predictions by reducing bias in parameter estimates.12
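A minimal sketch of one way to implement assay-specific residual error is shown below, assuming ASAY is a column in the data set and a proportional error model for each assay; Supplementary Code 3 contains the tutorial's full example, and the variance values here are illustrative only.

```
$ERROR
  IPRED = F
  Y = F + F*(1 - ASAY)*EPS(1) + F*ASAY*EPS(2)   ; resolves to Equations (5) and (6)
$SIGMA
  0.04    ; proportional error variance for assay 1 (illustrative)
  0.09    ; proportional error variance for assay 2 (illustrative)
```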
FIGURE 1 Example quality control plots of a data set with different types of errors. Assay bias: (a) concentration versus time scatterplot (blue, assay 1; black, assay 2), (b) individual predicted versus observed, (c) population predicted versus observed, and (d) conditional weighted residuals (CWRES) versus time for a merged set of concentration data obtained from two different assays. Sampling time error: (e) concentration versus time scatterplot and (g) CWRES versus time for a data set where one sample time (2.1 h) was incorrectly recorded (as 6.7 h). Concentration error: (f) concentration versus time scatterplot and (h) CWRES versus time for a data set where one concentration (5.3 µg/ml) was incorrectly recorded (as 0.53 µg/ml).
Aside from analytical methodology, errors occurring before and after samples are processed and analyzed sometimes account for significant error in concentration measurements.13 Sample integrity may be compromised in several ways, including sample “contamination” with something that could interfere with drug measurement, sample storage or transportation under incorrect conditions, or mislabeling. For example, blood samples are often collected from a venous catheter, where contamination with i.v. fluid may occur if sampling procedures are not followed correctly; this dilutes the sample and lowers the measured drug concentration. Similarly, when a blood sample is collected from the i.v. line used to deliver a drug and the line is not flushed properly, the sample is contaminated with the drug in the i.v. line, and measured drug concentrations will be erroneously high.
Drug stability during sample processing and storage is an important consideration. Proper handling of samples after collection, including processing and storage within a specific time period and transportation at appropriate conditions (e.g., temperature, light) will minimize erroneous results, especially when dealing with chemically or metabolically unstable drugs. This is particularly important in clinical centers where samples are collected then sent to centralized laboratories for analysis.
One of the most common causes of postanalytical error is inaccurate data entry. PK data available for analysis may be susceptible to measurement error (ME) in sampling times. Specifically, the ME associated with the sampling time variable in PK modeling belongs to the Berkson error type (i.e., the variability in the true sampling time is larger than that of the scheduled time).14 This type of error is more commonly present in data obtained during routine clinical practice and less so in PK data collected during a well‐controlled clinical trial.
Actual sampling times may be missing or recorded incorrectly, and nominal protocol times are used instead. Studies using trough concentrations of drugs with a short half‐life for PK modeling provide a good example of blood drawn at times different than scheduled times. Santalo et al.15 reported vancomycin trough concentrations are often measured early, producing results that do not reflect actual troughs and may subsequently lead to inappropriate dosing. Similarly, as discussed in the next paragraph, dose times that are incorrectly recorded result in an overall shift of the PK curve.
Sampling times are assumed to be correct for most PK clinical pharmacology analyses. However, this may be a flawed assumption. The impact of recording errors for sampling times has been explored and reported previously.16–18 Based on a sensitivity analysis, Karlsson et al.17 demonstrated that random shifts of sampling times do not cause significant changes in parameter estimates aside from increasing residual error. These results are in accordance with Wang and Davidian.18 Recently, Choi et al.16 conducted simulation studies to evaluate the effects of using nominal time instead of actual sampling time on model parameters. The authors found the most important determinant of bias attributed to Berkson error in time is the degree of curvature (nonlinearity) of the concentration‐time profile. Although the magnitude of ME is another important factor, its effect may not become notable unless the drug exhibits nonlinear PK. Thus, the data should be reviewed first to evaluate the potential impact of sample time errors. The smaller the curvature around sampling times, regardless of the magnitude of ME, the lower the expected bias in PK parameter estimates. Conversely, when substantial nonlinearity is expected, use of actual sampling times becomes more important. The authors also found that bias decreases systematically as the number of samples within a dose interval increases (denser sampling design).17,18 Presumably, with dense sampling, the number of erroneous sample times becomes small relative to the number of correct sample times.
Errors occurring before or after the assay run are often identified during standard QC of a data set when looking at individual or overall concentration‐time plots. They are also sometimes identified when reviewing summary statistics of concentration‐time data. Errors in concentration measurements can sometimes be difficult to distinguish from errors in sample time, and in some cases, it may not be clear that erroneous data are present. For example, observing outlying concentrations at a given timepoint in one patient relative to others could be the result of an incorrectly recorded sampling time (e.g., 6.7 h instead of 2.1 h; Figure 1e) or may be explained by incorrect recording of a concentration (e.g., 0.53 µg/ml instead of 5.3 µg/ml; Figure 1f). These types of errors typically show up as outlier data points and may be identified in initial modeling through diagnostic plots, such as predicted versus observed concentrations or conditional weighted residual values (CWRES; Figure 1g,h), although sometimes errors are more subtle. In fact, sampling time errors may not be possible to identify without direct knowledge that a sampling time error occurred (e.g., a note made by the clinical site indicating the recorded time was an approximation of the actual sample time). In any case, concentration or time data that are identified or suspected to be erroneous should be replaced with correct values when available or removed from the analysis. If time needs to be scaled (e.g., samples from a particular center appear questionable), the variable TSCALE or XSCALE may be introduced in the $PK block19 (Supplementary Code 4).
In summary, both measured drug concentrations and recorded sampling times are key components of PK studies. Any interference or deviation from the true condition should be taken into account in the analysis and adjusted, if possible.
- Before initiating data analysis, reviewing the bioanalytical report can signal the potential for biased PK estimates attributable to the accuracy and precision of the analytical assay used.
- If concentration data come from different sites and/or assays, verify that the PK is not different between sites or assays.
- When a cross‐validation is unavailable, graphical visualization of PK data (e.g., dose‐normalized concentrations versus time) will help to identify agreement between results.
- If there is poor agreement between assays, a scaling factor that adjusts for assay differences should be introduced.
- If the drug to be modeled is also an endogenous compound, baseline endogenous concentrations should be accounted for. If the endogenous compound is known or believed to be time invariant, endogenous concentrations are modeled as a constant increment to exogenous drug. If it is known to vary over time, appropriate functions can be introduced to the model.
- If erroneous concentration or time data are identified or suspected, those data should be replaced with correct data, if available, or removed from the analysis.
- When nominal times are recorded instead of actual times, the modeler should check preclinical or other clinical data to understand curvature in the PK profile, which determines the potential for bias in parameter estimates.
BLQ data are commonly encountered during PK data analysis. In 2001, Beal described seven methods for handling BLQ observations (labeled M1–M7), several of which have since been widely adopted as standards of practice in pharmacometric modeling.20 Despite widespread acceptance of the M3 method as generating the least biased results, it increases the time a model takes to converge and often inflates residual error unless BLQ observations are excluded from the computation of residuals and weighted residuals (e.g., setting MDVRES = 1 [missing dependent variable for residual]). Furthermore, less information is available about when it is critical to use M3 (i.e., what percentage of BLQ observations requires this approach to avoid biased model parameters), the impact of both the number of samples per subject and the overall number of subjects in the data set, the impact of BLQ observations on parameter estimation for compounds that exhibit nonlinear PK, and the impact of BLQ observations on structural model selection (e.g., two‐compartment versus one‐compartment structural model). In addition, the influence of BLQ data on structural model features, such as the relative magnitudes of CL and volume, has not been fully explored.
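For reference, a minimal sketch of an M3 implementation with a proportional residual error model is shown below (the tutorial's control streams are in the Supplementary Material). BLQ and LLOQ are assumed to be data items, the parameter numbering is illustrative, and the Laplacian estimation method is typically required when F_FLAG is used.

```
$ERROR
  IPRED = F
  W     = THETA(4)*IPRED        ; proportional residual SD (SIGMA fixed to 1)
  IF (BLQ.EQ.0) THEN
    F_FLAG = 0
    Y = IPRED + W*EPS(1)        ; usual residual model for quantifiable observations
  ELSE
    F_FLAG = 1
    Y = PHI((LLOQ - IPRED)/W)   ; likelihood that the observation is below the LLOQ
  ENDIF
$SIGMA 1 FIX
$ESTIMATION METHOD=1 LAPLACE INTERACTION
```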
Several publications have examined various approaches to handling BLQ values, including deleting all BLQ observations (M1), the 'YLO' approach (M2), setting all BLQ observations to 0 (M7), setting all BLQ observations to half the LLOQ (LLOQ/2) (M5), setting the first BLQ observation in a subject to LLOQ/2 and removing the rest (M6), and likelihood‐based use of the M3 or M4 methods; these published results are summarized in Table 2.21–26 These published studies differ regarding the point at which ignoring/deleting BLQ observations was found to cause bias and degrade parameter precision. The differences can be partly attributed to methodological choices, such as the percentages of BLQ tested, use of the M5 versus the M6 method for incorporating LLOQ/2, and the approaches used to simulate data sets and to statistically evaluate the impact of the various BLQ‐handling methods. These dissimilarities limit the generalizability of the published conclusions. We therefore conducted a comprehensive, consistent analysis of the main BLQ data‐handling methods (i.e., M1, M3, M5, and M6) to summarize current knowledge and address apparent discrepancies between previously published studies.
TABLE 2 Summary of published literature evaluating methods for handling concentration data BLQ

| Study | Methods | Conclusions |
| --- | --- | --- |
| Hing et al.21 (NONMEM V) | One‐compartment model; rat study with one sample per animal; BLQ: 10%–50%; methods tested: M1, M7, and four substitution methods | M1 and M7 led to biased CL and IIV estimates; loss of precision for all methods occurred at BLQ >25%; no impact of the number of animals used at each sampling point |
| Duval et al.26 (NONMEM VI) | Two‐compartment model; data sets based on (a) the ratio of the AUC of the distribution phase to the total AUC and (b) the ratio of the half‐life of the distribution phase to the half‐life of the elimination phase; BLQ: 5%–50%; methods tested: M1 and M6 | A bias on CL of >20% was observed with M1 at BLQ ≥20%; no major trends were observed for Vc and Q between M1 and M6 substitution; IIV on CL was improved with M6, whereas a loss of information on IIV was observed for all other parameters, regardless of method |
| Keizer et al.22 (NONMEM VI) | i.v. one‐compartment model; i.v. two‐compartment model; oral one‐compartment model; BLQ: 10%, 20%, 40%; methods tested: M1, M6, and M3 | i.v. one‐compartment model: no bias was observed for any of the methods when BLQ was <20%; considerable bias occurred with M1 and M6 at 40%; IIV was comparable at 10% and 20%, except for M3, which showed higher variation. i.v. two‐compartment model: bias in fixed parameters was observed with M1 and M6 at 10% BLQ, except for the estimation of V and IIV in V. Oral one‐compartment model: at <20% BLQ, all methods except M1 provided reasonable and broadly similar performance for both fixed effects and IIV |
| Xu et al.23 (NONMEM VI) | One‐compartment model; two‐compartment model; BLQ: 1%, 2.5%, 7%, 10%; methods tested: M1 and M3 | One‐compartment model: the impact of ignoring BLQ <10% was minimal. Two‐compartment model: when BLQ was <5%, M1 did not create bias in the fixed‐effect parameters, whereas more pronounced bias in the IIV estimates was observed; the greatest impact was on Vp |
| Ahn et al.25 (NONMEM VI) | Two‐compartment model with first‐order absorption; BLQ: 10%–40%; methods tested: M1, M2, M3, and M4 | M3 and M4 produced similar results without log transformation; parameter estimates were biased with M1, especially when BLQ was 40%; clearance was more negatively biased as %BLQ increased, whereas Vp and Q were more positively biased; the most accurate and precise estimates were obtained with M3 |
| Bergstrand et al.24 (NONMEM VI) | Model A: one‐compartment model, transit compartments, BLQ in absorption phase; Model B: two‐compartment model; BLQ: 10%–30%; methods tested: M1, M2, and M3 | Model A: CL, Vc, and IIV on CL and Vc were not biased by the presence of BLQ samples and were similar for each method; Ka, mean transit time, and number of transit compartments were biased with M1; M3 generated the best performance. Model B: M1 led to substantial bias in CL, Q, and Vp; the M3 method was the least biased |
Abbreviations: AUC, area under the curve; BLQ, below the limit of quantification; CL, clearance; IIV, interindividual variability; i.v., intravenous; Q, intercompartmental clearance; V, volume of distribution; Vc, volume of central compartment; Vp, volume of peripheral compartment.
Across multiple scenarios (Supplementary Material: Handling of BLQ Data section), common BLQ data‐handling methods were compared. In the simplest case, a densely sampled (five samples/subject) i.v. bolus one‐compartment model with CL = 20 L/h and V = 70 L, the M3 method did not provide any added benefit for accurately estimating CL and its associated between‐subject variability (BSV) for CL (BSVCL) compared with M1 across all data sets (5%–50% BLQ). However, between 30% and 50% BLQ observations, M3 achieved less‐biased estimates of V and BSV for V (BSVV). When V remained unchanged (V = 70 L) and CL was reduced (from 20 to 10 L/h), the M3 method resulted in less‐biased BSVV and BSVCL estimates at 30% or more BLQ. No differences were detected for estimates of CL and V between the M3 and M1 methods from 5% up to 30% BLQ. At more than 30% BLQ, the M3 method produced significantly less‐biased results compared with M1.
The M5 method produced biased results in scenarios with 10%–50% BLQ. With M6, bias on CL and BSVCL was similar to that with M1 and M3 up to 20% BLQ, after which significant bias was observed. Similar trends were observed for V. A previous evaluation of the M6 method for a one‐compartment model showed considerable bias and imprecision on CL at 40% BLQ, whereas no systematic bias was observed at 10% and 20%.22 Of note, 30% BLQ was not assessed in this study. An additional study reported unpredictable results when using M6 for BLQ observations in the elimination phase and M5 for BLQ observations in the absorption phase, sometimes inflating and sometimes reducing the bias.24 Nonetheless, the M3 method was determined to be the best overall method. The size of the data set did not appear to influence the results, as similar trends were observed for all data set sizes of 25 to 400 subjects.
After simulations of an i.v. bolus one‐compartment model with sparse sampling (two samples/subject), bias was observed for V at 20%–50% BLQ for M1, whereas bias on CL was similar between M1 and M3 for all BLQ percentages. However, in smaller data sets (i.e., n = 25 subjects), bias on CL was apparent at 20%–50% BLQ with M1. These results are in agreement with previous assessments of one‐compartment models.21–23 Collectively, these data indicate that, for a simple i.v. one‐compartment model, the M1 and M3 methods are approximately equivalent with moderate or large data sets and when up to 30% of the observations are BLQ. However, the M3 method is superior when the proportion of BLQ observations is >30% or with small data sets. Furthermore, use of the M6 method generated biased estimates of CL or BSVCL at BLQ percentages as low as 20% (and as low as 10% for M5), suggesting these data‐handling methods should not be used for data sets with moderate to large percentages of BLQ.
The one‐compartment extravascular dosing scenario was similar to the i.v. dosing scenario. Accuracy of CL estimates was similar between the M3 and M1 methods up to 30% BLQ. At 50% BLQ, however, M3 produced less‐biased CL estimates. Not surprisingly, the time at which the BLQ observations occurred was important. When the percentage of BLQ in the absorption phase was 5%, M1 generated similar results to M3. When the percentage of BLQ in the absorption phase was ≥10%, the use of M3 was superior to M1. Using RMSE as the primary criterion for comparison, M3 is recommended for less‐biased estimates of V, BSVV, and BSVCL, regardless of the overall percentage of BLQ. Using the normalized RMSE (nRMSE) criterion (RMSE values normalized to 0% missing; see the Supplementary Material: Assessing Performance of Missing Data Handling Methods section), when the percentage of BLQ was 5% or greater in the absorption phase and the overall BLQ percentage was ≥20%, M3 generated the least‐biased results among the evaluated methods. Previous oral one‐compartment studies suggest that excluding observations produces bias on absorption parameters (e.g., mean transit time, number of transit compartments) at <30% BLQ.24 Similar to our findings, Bergstrand and colleagues also observed that accuracy and precision of CL estimates were comparable between M3 and M1, up to 30%.24
In an i.v. bolus two‐compartment model with CL = 70 L/h, bias in CL and V was similar between M1 and M3 at BLQ percentages up to 30%. However, estimates for peripheral compartments (volume [V2] and intercompartmental clearance [Q]) were more sensitive to missing data. Using RMSE, M3 yielded less bias in these parameters at ≥5% BLQ. Using nRMSE, M3 was superior for V2 and Q when the BLQ percentage was 30% or more. BSVCL was less biased with M3 when the BLQ percentage was 20% or more. When simulating slower CL values (CL = 20 L/h), the results were similar except at 50% BLQ, where the M3 method generated less‐biased estimates for CL than the other methods evaluated. The size of the data set had no influence on these results. Previous evaluations of two‐compartment models have observed bias with M1 and M6 with BLQ percentages as low as 5% or 10%.22–26 This is likely attributable to the loss of information on peripheral compartments resulting from omitting concentrations in the terminal phase, particularly in individuals with small peripheral volumes who would have declining concentrations below the LLOQ earlier than others.
Finally, nonlinear PK models were evaluated. M3 was superior at reducing bias on the maximum rate of metabolism (Vmax) and the Michaelis‐Menten rate constant (Km) at all BLQ percentages. Volume estimates were stable throughout all percentages between M1 and M3, whereas M5 and M6 resulted in biased parameters at ≥5%. These results suggest that M3 should be used for achieving less‐biased estimates of Vmax, Km, and BSV for Vmax and BSV for V at BLQ 5% or more.
In summary, the selection of an appropriate method to handle BLQ data is an important consideration in PK model development. Based on previous studies and our current analysis, when developing a simple i.v. bolus one‐compartment model with rich sampling, the M3 method should be used in cases with approximately ≥30% BLQ observations. Otherwise, excluding observations when the BLQ percentage is <30% may not excessively bias CL and V estimates, although the M3 method should still be tested. The M3 method should be implemented with smaller data sets (50 subjects or fewer) with sparse sampling at approximately ≥20% BLQ observations. In an oral one‐compartment model, the percentage of BLQ in the absorption phase is a determinant of using M3 versus excluding observations. If the percentage of BLQ in the absorption phase is >5%, M3 should be implemented for less‐biased absorption parameters. Nonetheless, bias on CL is stable through 30% BLQ. When using a two‐compartment model, CL and V tend to have similar bias up to 30% BLQ, regardless of the method used. However, ignoring all BLQ observations creates bias in the V2 and Q parameters, and the M3 method would therefore be warranted for BLQ percentages greater than 5%. The M3 method is also necessary for less‐biased estimates of Vmax and Km in nonlinear models when BLQ values are present. Most studies report bias when replacing all BLQ observations with LLOQ/2. If this method is implemented, replacing the first BLQ observation per subject with LLOQ/2 and discarding the remaining observations generates less bias than replacing all remaining BLQ observations with LLOQ/2. Finally, alternative methods for handling BLQ data can also be evaluated. For example, fixing the additive component to the variance of a uniform distribution of [0, LLOQ] in an additive‐plus‐proportional error model can potentially shorten run times and stabilize parameter estimation.27
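A minimal sketch of that alternative is shown below, assuming a combined additive‐plus‐proportional error model and an LLOQ of 0.1 mg/L (so that the fixed additive variance is 0.1²/12 ≈ 0.00083); the values and parameter ordering are illustrative only.

```
$ERROR
  IPRED = F
  Y = IPRED*(1 + EPS(1)) + EPS(2)
$SIGMA
  0.04            ; proportional error variance (illustrative)
  0.00083 FIX     ; additive variance fixed to LLOQ**2/12 (variance of a uniform [0, LLOQ])
```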
Model misspecification is an additional concern with BLQ observations. A previous investigation of one‐compartment i.v. model misspecification reported steadily increasing type I error rates (lower objective function value [OFV] for two‐compartment models compared with one‐compartment models at the α 0.05 and 0.01 levels) using M1 as the BLQ percentage increased. For example, when approximately 50% of the data was BLQ, 96% of 500 simulations had lower OFV for two‐compartment models. When the M2 method was implemented, this percentage significantly decreased to nominal levels, leading to recommendations of M2 implementation when the BLQ >10%.28 For nonlinear models, 25% of the simulations had lower OFV when modeled as linear elimination at 10% BLQ. At 30% BLQ, approximately 50% of the simulations would be incorrectly classified as linear elimination. Similar to one‐compartment versus two‐compartment models, these percentages decreased to nominal levels after M3 implementation. Taken together, model misspecification and parameter estimate bias warrant the use of likelihood‐based approaches to handling BLQ in most modeling scenarios.
- When the percentage of BLQ data is low (5% or less), all methods (M1, M3, M5, M6) should perform similarly, regardless of the model used and regardless of sampling density.
- If the percentage of BLQ data exceeds 5%, M5 should not be used.
- M6 generates biased CL, V, BSVCL, and BSVV at BLQ percentages greater than 20% for one‐compartment models with rich sampling.
- In i.v. bolus one‐compartment models, with 30% or less BLQ observations, the M3 method does not appear to provide a major advantage over M1 when rich sampling data are available.
- For i.v. bolus two‐compartment models with rich sampling, the M3 method should be implemented at 5% or more BLQ for less biased estimates of V2 and Q. CL and V tend to have similar bias up to 30% BLQ between M3 and M1. M6 generates biased results.
- For drugs that exhibit nonlinear PK, the M3 method should be implemented at 5% or more BLQ.
- A decision tree for choosing the best BLQ handling method is shown in Figure 2.
FIGURE 2 Decision tree for handling of concentration data below the limit of quantification (BLQ). CL, clearance; Q, intercompartmental clearance; V, volume of distribution; Vc, volume of central compartment; V2, peripheral volume.
Missing or erroneous dose information is commonly encountered during PK data analysis. During a clinical PK study, some dosing records, including administered dose, dose time, or duration, may be missing or recorded inaccurately. Often, dose and dose time are collected only at infrequent clinical visits. Common reasons for erroneous dosing information include incomplete or erroneous data capture, data loss during transition from the clinic into medical records or case report form, and so on. These errors are generally minimized during a study, although some study designs are inherently more prone to these types of problems. For example, in trials where study subjects self‐report the times and/or doses administered, the rate of missing or erroneous data is relatively high.29 Similarly, studies conducted within a general clinical setting, as opposed to a specialized research unit, tend to have higher rates of missed or misreported dosing data. In other cases, subjects may have started on study drug before enrollment in a particular trial.
Furthermore, identification of missing or erroneous dose records may be challenging and will depend on the data structure, amount of missing or erroneous data, PK properties of the drug, and so on. When possible, merging smaller studies with rich sampling into larger studies with sparse sampling can help to identify missing or erroneous data (concentration vs. time and dosing data) in the larger study. Comparing model results from densely sampled studies with parameters estimated from suspect data can also be insightful.
The choice of method for handling missing dose information can influence the quality of the PK data analysis. Ignoring missing records is never acceptable because observed drug concentrations are a direct result of dose administration (whether or not that dose is recorded) combined with drug distribution and elimination. If missing dose information is ignored, the analysis will inappropriately associate observed concentrations to distribution and/or clearance processes. Nedelman previously demonstrated the utility and flexibility of NONMEM in assessing the reliability of dose data, providing an example of how one might assess accuracy of observed or assumed dose times, the influence these assumptions may have on parameter estimates, and the stability of the model.30 The next several paragraphs highlight approaches, some simple and others more complex, that can be applied to address missing or unreliable dose data.
One method for treating missing dosing records, sometimes referred to as the omitting method (OM), is to exclude the “influenced” PK observations from analysis. This method tries to mask the problem by analyzing only the intact subset of data. Although not a preferred approach, this method may still be feasible if the remaining data carry sufficient information about the PK behavior under study. Empirical examples include when only a few dosing records are missing while most records are intact within a given study subject or when a dosing record is missing during later stages of a multidose study after steady‐state concentrations are achieved, provided the duration of missing dose information is sufficiently short to not impact steady‐state concentrations. Studies of self‐administered monoclonal antibodies often fall into this group, as patients administer most doses but may miss a few. Under these conditions, it may be acceptable to exclude the “influenced” observations from analysis. However, it is sometimes difficult to precisely determine which portions of the data were “influenced” and the extent of influence. There are, however, many situations in which this method should be used with caution, for example, when PK data are available only after a single dosing event or when only sparse PK data are available before steady‐state PK is achieved.
Several other methods for handling missing or erroneous dosing information have been reported.31–37 The prescribed dose method (PDM) assumes all prescribed doses were administered per prescription/study design.31 With this assumption, one simply fills in the missing or erroneous records and proceeds with analysis as if dosing data were intact. This is the approach often taken in larger, late‐phase studies, where dose level and time are intentionally not captured, and consequently they are assumed (e.g., using ADDL in NONMEM). Notably, in these cases the data are not considered missing since they were intentionally not captured. While straightforward, the PDM can be inaccurate, which inherently introduces error into the analysis. Capturing the dose information prior to a PK sample collection can prevent inaccuracies. In some large clinical trials, patient compliance data are collected (pill counts), which can be used to adjust the administered amount over intervals of time (a comprehensive summary of compliance [or adherence], its measurement, modeling adherence, and the impact of improper assumptions about adherence in the modeling and simulation processes was previously published by Kenna et al.38). Several handling methods exist where the shortcomings of OM and PDM can be minimized by estimating the missing or erroneous dosing information based on observed residual drug amounts. These methods are discussed in the subsequent paragraphs.
The concentration‐time method (CTM), which is similar to PDM, also assumes per‐protocol dose amounts and intervals.32 However, CTM specifically addresses incorrect shifting of the dosing time at the individual level. The individual time shift is parameterized and used to calculate the residual drug amount. For example, for a drug with repetitive oral administration and an empirical one‐compartment PK model, the residual amount CT is expressed as Equation (7):

$$CT = \frac{Dose \cdot k_a}{V\,(k_a - k)}\left[\frac{1 - e^{-ND \cdot k \cdot tau}}{1 - e^{-k \cdot tau}}\,e^{-k\,(TP + DT)} - \frac{1 - e^{-ND \cdot k_a \cdot tau}}{1 - e^{-k_a \cdot tau}}\,e^{-k_a\,(TP + DT)}\right] \tag{7}$$

where Dose is the dose amount, V is the volume of distribution, ka is the absorption rate constant, k is the elimination rate, ND is the number of doses, tau is the dosing interval, TP is the time past dose, and DT is the incorrect dosing time shift. DT is parameterized in the same way as the other PK parameters (i.e., in the form θDT·exp(ηDT)). The residual amount is then accounted for when fitting the current PK observations by incorporating it into the prediction of the observed concentrations (Equation 8):

$$Y = (F + CT)\,(1 + \varepsilon_{1}) + \varepsilon_{2} \tag{8}$$

where Y is the observed concentration, F is the predicted concentration, CT is the residual amount correction from the missing dosing history, ε1 is the proportional error, and ε2 is the additive error.
The extrapolation‐subtraction method (ESM) uses a “subtraction curve” estimated from the terminal‐elimination phase of current data.31 The method adjusts the influenced data by subtracting the residual amount over time.
The missing dose method (MDM) does not amend the data set31; rather, it parameterizes the residual drug amount as an individual PK parameter (A0; Equation 9):

$$A_{0} = \theta_{A_{0}} \cdot e^{\eta_{A_{0}}} \tag{9}$$

where A0 is the individual residual drug amount, θA0 is the population residual amount, and ηA0 is the individual variability. All drug concentrations after the missing records are treated as resulting from both A0 and the available dosing events. A modification of MDM, the missing dose mixture method (MDMM), combines MDM and PDM via a mixture model approach. In the original article by Soy et al.,31 the authors presented the MDM and MDMM methods and compared their performance with PDM and ESM in handling missing dosing histories. The MDM method was concluded to be superior overall in the demonstrated examples. An implementation of the method was later provided by Gupta et al.33 (Equation 10):

$$C = SDF + C_{0} \cdot e^{-k_{e} t} + \varepsilon \tag{10}$$

where C is the observed concentration, SDF denotes the concentration from the last (correct) dose (i.e., the concentration without correction, similar to F from the CTM method), C0 is the residual amount A0 expressed as a concentration, ke is the elimination rate, t is the time after dose, and ε is the residual error. The MDM method was subsequently applied in a practical analysis by Friberg et al.34
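A minimal sketch in the spirit of the Gupta et al. implementation is shown below; it assumes a one-compartment model, that TAD is a data item giving the time after the last reliably recorded dose, and illustrative parameter numbering.

```
$PK
  CL = THETA(1)*EXP(ETA(1))
  V  = THETA(2)*EXP(ETA(2))
  A0 = THETA(3)*EXP(ETA(3))     ; Equation (9): residual amount from unrecorded prior doses
  C0 = A0/V                     ; residual amount expressed as a concentration
  KE = CL/V
  S1 = V
$ERROR
  IPRED = F + C0*EXP(-KE*TAD)   ; Equation (10): prediction plus decaying residual concentration
  Y = IPRED*(1 + EPS(1))
```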
The CM methods37 (CM1 and CM2) provide two different approaches to address issues arising from erroneous dosing information (classified as “noncompliance history”). CM1 treats the noncompliant dosing history as an additional source of RUV and offers two submethods to label records that may carry it. One submethod (M3) places an interindividual variability (IIV) term (ησ) on RUV and estimates the individual ηi,σ. Please note that the Gibiansky et al.37 publication uses M1 through M6 as labels for six different submethods for handling erroneous dose data. Here we retain the same M1–M6 labeling as presented by Gibiansky et al., although we caution the reader not to confuse these submethods for handling erroneous dose data with the M1–M7 methods for handling BLQ data described in the Concentration data BLQ section of this tutorial. The model with proportional RUV is introduced as follows (Equation 11):

$$C_{ij} = \hat{C}_{ij}\left(1 + \sigma_{ij}\,e^{\eta_{i,\sigma}}\right) \tag{11}$$

where Cij is the jth observed concentration in individual i; Ĉij is the jth model‐predicted value in individual i; σij is the residual random variable; and ηi,σ is the individual value of the random effect on RUV, which is assumed to be normally distributed with mean zero and standard deviation ω. Records with post hoc estimates of ηi,σ > 0 are labeled as potentially noncompliant and can be excluded from the analysis to reduce estimation bias (M5).
The other submethod of CM1 (M4) models the RUV with a mixture model structure, assuming that a noncompliant dosing history leads to concentration records whose RUV follows a different distribution than that of normal records. The model with proportional RUV is introduced as follows (Equation 12):

$$C_{ij} = \hat{C}_{ij}\left(1 + \sigma_{ij}\right) \tag{12}$$

where Cij is the jth observed concentration in individual i, Ĉij is the jth model‐predicted value in individual i, and σij is the residual random variable. σij follows a normal distribution with mean zero but a variance that is estimated separately for each group of patients (i.e., compliant vs. noncompliant). A mixture model structure is needed to implement this method. Records associated with the more variable RUV are labeled as potentially noncompliant and can be excluded from the analysis to reduce estimation bias (M6).
Both combinations of M3/M5 and M4/M6 were reported to reduce the estimation bias from noncompliant dosing history, with similar performance.
The CM2 method works on the dosing records instead of the concentration records. First, one has to know which dosing records can be classified as noncompliant. This can come from knowledge of the data or be deduced by applying either M3 or M4 from CM1. Once this is done, the noncompliant dosing records can be included in the analysis but with a flag labeling them in the data set during the model run. The flagged dosing records are implemented with a relative bioavailability term to distinguish them from compliant dosing. The concentration at time t is then formulated as Equation (13):

$$C_{\mathrm{total}}(t) = C(t) + C(F_{NC}, t) \tag{13}$$

where C(FNC) is the concentration arising from noncompliant dosing, FNC is the relative bioavailability to be estimated, and C is the concentration arising from normal dosing records. It is worth noting that, to use the CM2 method, concentration records associated with noncompliant dosing need to be excluded from the analysis.
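A minimal sketch of the relative bioavailability flag is shown below, assuming dosing into compartment 1 and a hypothetical data item NC that equals 1 for dosing records flagged as noncompliant; parameter numbering is illustrative.

```
$PK
  CL = THETA(1)*EXP(ETA(1))
  V  = THETA(2)*EXP(ETA(2))
  F1 = 1                        ; compliant doses: full relative bioavailability
  IF (NC.EQ.1) F1 = THETA(3)    ; noncompliant doses: estimated relative bioavailability (FNC)
  S1 = V
```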
Recently, Wang et al. published another implementation of the MDM method and expanded the methodology (MDM2) to handle different scenarios of missing dosing information.35 First, A0 was parameterized as in MDM to represent the residual drug amount from missing dosing histories. Then, a mixture model structure is applied to put patients into different groups so that each group of patients will be identified with their specific A0pop values and ηA0 distributions. This approach addresses potential issues when the distribution of A0 cannot be described with a single log‐normal distribution. The authors also published a new method, called the compartment initialization method (CIM), which addressed the residual amount using a different approach. Rather than estimating the individual residual amount, A0, the CIM method “back calculates” A0 using the most recent concentration record and current individual estimate of volume V. Each individual A0 is then used to initialize the corresponding compartment if a missing dosing record precedes current observations. The CIM method demonstrated equal performance to MDM for handling missing dosing histories, although CIM outperformed all other tested methods in other missing dosing record scenarios.
Besides the aforementioned methods, another publication examined uncertain dosing information from a Bayesian perspective.36 Mu et al. examined the potential benefits of adopting a hierarchical Bayesian approach when part of the dosing history is uncertain. The study adopted a simplistic but representative scenario: multiple doses were given at known and equal intervals, with two doses (the first and the middle doses in the more realistic setting) at a known level but all others at uncertain levels. Each of the uncertain doses, independently, had certain probabilities of being totally missed, taken as the correct dose, or taken as twice the correct dose. With a simple one‐compartment PK model with bolus dosing, the model‐predicted concentration is expressed in a recursive form as follows (Equation 14):

$$f_{ijk} = \left(f_{ij(k-1)} + \frac{r_{ik}\,D}{V_{ij}}\right)e^{-\frac{CL_{ij}}{V_{ij}}\,t_{k}} \tag{14}$$

where fijk is the model‐predicted concentration for the ith individual at the jth occasion and kth timepoint; D is the dose given at certain dosing records (value 0 otherwise); CLij and Vij are the clearance and volume of distribution of the ith individual at the jth occasion, respectively; tk is the time after dose at the kth timepoint; and rik is the compliance factor, which serves as a switch specifying the amount of drug for an uncertain dose as 0, 1, or 2 times the nominal dose.
Incorporation of the compliance factor rik was shown to improve estimation bias on some parameters, whereas others were shown to be insensitive to this technique. It was speculated that this may be attributed to the specific sampling design that was used. It is worth noting that to find the appropriate model for rik to approximate distribution of the simulated uncertain dose levels, for example, 0:1:2 with 60%:35%:5%, the authors had to go through a process of trial and error. Eventually a function of two binomial distributions was found to provide reasonable estimates of compliance factors. However, because of the relative simplicity of tested scenarios, speculated sampling dependent results, and the use of trial and error, it is difficult to predict the generalizability of this method in other scenarios.
In summary, when dosing records are missing, all methods to replace them will inevitably adopt one or more assumptions. The PDM method and parts of the CTM method (regarding the prescribed dose amount and interval) apply direct and consistent assumptions based on the clinical protocol. It is important to recognize that those assumptions may not be accurate for all instances of the missing data. Alternative methods, including ESM, MDM, MDMM, MDM2, CM1/CM2, and CIM, also allow for estimation and adjustment of the missing dosing information based on the observed PK data. Each method has its strengths and weaknesses, depending on the specific data situation. Next we summarize our recommendations for implementing these methods based on the specific missing data scenarios or assumptions.
- PDM assumes dose times and/or dose amounts do not deviate from those specified in the study and/or that deviations in times and/or amounts are insignificant. PDM only works well when these assumptions are generally robust.
- CTM applies when the dosing amount is known and only dosing time is missing.
- All other methods for handling missing dose information assume the missing data deviate from the study design and that these deviations may significantly influence the PK analysis.
- When PK observations influenced by missing dose data constitute only a small portion of all available PK data for an individual, OM is likely to yield acceptable PK parameter estimates.
- When some of the dose level information is missing within a full dosing history, use of either the MDM or CIM method will provide the best opportunity for yielding accurate PK parameter estimates. When dosing records are missing more randomly throughout the data, CIM and the expanded MDM (MDM2) methods are more appropriate.
As with other data types, missing, erroneous, or outlier covariate data can negatively impact PK data analysis. Data QA strategies and exploratory data analysis used for concentration versus time data, such as descriptive statistics and generation/review of various plots (scatterplots of covariates vs. subject or time, histograms of covariates, etc.) can be useful for covariate data as well,1 and these should be employed prior to modeling and simulation efforts. In this section, published strategies for addressing covariate data that have been confirmed to be missing or are questionable are reviewed. The impact of missing or erroneous covariate data is also evaluated using simulation and estimation to systematically compare the three most common strategies for handling missing covariate data: the complete case (CC; elimination of the entire patient from the data set), imputing to a reference value (IRV), and joint (JM; continuous) or mixture modeling (MM; categorical) to estimate the covariate value. The results of these analyses are discussed next.
Missing covariate data are usually obvious, although they can sometimes be difficult to identify when a value of zero (0) may represent either a true zero or a missing value, as can occur with numerical assignments for patient sex or other categorical covariates; assigning a distinct coded value for missing covariates is therefore important. After covariate data have been confirmed to be missing and strategies to locate the missing data have been exhausted, appropriate methods for addressing the missing covariate data should be employed.
Several methods for handling missing covariate data in PPK analysis have been proposed in the literature. A population model of topotecan PK used the JM method to impute missing values for body weight (~20% missing) using observed values for body surface area, sex, and CrCL.39 This approach was successful at producing unbiased estimates of weight for utilization in the covariate model, as demonstrated by a linear regression of the observed versus estimated weights from the joint model (the data were uniformly distributed about the line of unity, and the regression coefficient was 0.88). Utilization of the JM method in this study allowed for an unbiased assessment of the effect of body weight on topotecan PK in the published data set, which led to a 5% reduction in interpatient variability on CL. One of the most comprehensive studies published recently evaluated six methods for handling missing data for the categorical covariate sex, with up to 50% missing data. These methods included CC, single imputation to the mode value (SImode), single imputation using weight to predict the missing value (SIWT), multiple imputation using both weight and estimated steady‐state concentration (MI), prediction of the missing covariate using a maximum likelihood mixture model developed from the individuals without missing covariate data (MOD), and a version of the MOD mixture model where the proportion of missing males and females was estimated as a fixed effect parameter in the model (EST).40 This study also evaluated these missing data handling methods in the contexts of data missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). The MI, MOD, and EST methods produced unbiased and precise estimates under the MCAR and MAR scenarios, although only the EST method performed well with the MNAR scenario. Although the SImode and SIWT methods were already known to produce more bias than the other methods evaluated, they along with the CC method remain among the most common methods employed in pharmacometric analysis. A subsequent study by the same group also evaluated the impact of η shrinkage on bias and precision for the MI method. This study found that when shrinkage is high (~40%), the estimates remain unbiased although they become less precise.41 About the same time, a separate group evaluated the CC method, estimation of a separate parameter for the effect of the missing covariate (EXTRA), and three mixture models fixing the likelihood of missing binary categorical data (sex) to the observed binary proportion, estimating the proportion using only PK data, and estimating the proportion using a mixture of observed proportion and PK data.42 These authors concluded that all methods performed similarly compared with each other with both low and high levels of missingness (10% or 50%) and with data having balanced and unbalanced levels of the binary categories (1:1 and 3:1). The CC and EXTRA methods were actually less biased than the mixture methods for the MNAR data.
When individual patient data are partially missing, the last observation carried forward (LOCF) or last observation carried backward (LOCB) approach is commonly used. The default behavior in NONMEM is LOCB for time‐varying covariates. Although this method performs reasonably well, it is generally less intuitive than LOCF, and it can be problematic in patient populations whose covariate values change rapidly (e.g., body weight in young children or infants). The LOCB approach in NONMEM can also produce different simulation profiles than the LOCF method, as demonstrated by Upton and Mould.43 That study showed that simulations from NONMEM were dependent on the value of TIME in the row immediately prior to the record where the covariate value changed. The authors showed that inserting a “dummy” row into the data set with the same TIME as the record where the covariate value changes is a useful strategy for implementing LOCF behavior in NONMEM.
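A minimal data-assembly sketch of these ideas is shown below (pandas, with hypothetical file and column names). The EVID = 2 convention for a non-dose, non-observation record is standard NONMEM usage, but whether the dummy row should carry the prior covariate value and how it should be ordered relative to the changed record are assumptions here and should be verified against the cited reference before use.

```python
import pandas as pd

# Hypothetical long-format NONMEM-style data set; file and column names are illustrative.
df = pd.read_csv("pk_dataset.csv")  # columns: ID, TIME, EVID, AMT, DV, WT
df = df.sort_values(["ID", "TIME"]).reset_index(drop=True)

# LOCF during data assembly: carry the last observed covariate value forward
# within each subject so later records never "see" a future value.
df["WT"] = df.groupby("ID")["WT"].ffill()

# Sketch of the "dummy row" idea: where WT changes within a subject, add an extra
# non-dose, non-observation record (EVID = 2) at the same TIME that still carries the
# previous WT value and sorts ahead of the record holding the new value, so the
# preceding interval is integrated with the old (LOCF) value.
prev_wt = df.groupby("ID")["WT"].shift()
changed = prev_wt.notna() & (df["WT"] != prev_wt)

dummies = df.loc[changed].copy()
dummies["WT"] = prev_wt[changed]   # keep the prior covariate value on the dummy row
dummies["EVID"] = 2
dummies["AMT"] = 0
dummies["DV"] = "."
dummies["SORT"] = 0                # sort key so the dummy precedes the changed record
df["SORT"] = 1

out = (
    pd.concat([df, dummies])
    .sort_values(["ID", "TIME", "SORT"])
    .drop(columns="SORT")
    .reset_index(drop=True)
)
out.to_csv("pk_dataset_locf.csv", index=False)
```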
Detailed methods for the covariate simulations, along with summaries of the results, can be found in the Supplementary Material. Here we describe the key findings of the simulations applying the CC, IRV, and JM methods for the effect of WT on CL, and the CC and MM methods for the effect of SEX on CL, across four scenarios for CL and V within data sets of various sizes, strengths of covariate effect, and levels of missing data.
Many of the trends observed during simulation were as expected. In general, the population parameters for CL and V were estimated well irrespective of the amount of missing covariate data. Overall, the smaller the data set and the weaker the covariate effect, the higher the RSEs, RMEs, and RMSEs of the covariate effect parameter (FAC) and the lower the likelihood that the covariate effect would be found significant. Weaker trends were also observed for the other estimated parameters, which became slightly less accurate and less precise with smaller data sets and greater amounts of missing covariate data, as expected, although these effects were very small compared with those affecting the FAC parameter. Accordingly, the FAC estimate (and its associated RSE, RME, and RMSE) was the model parameter most affected by missing data. In some scenarios the different handling methods performed equally, whereas in others one or two of the three methods had an advantage. The key factors influencing the accuracy and precision of the estimated parameters, the ability of the model to detect the covariate effect (i.e., significance of the likelihood ratio test at p = 0.05 based on the change in OFV), and the ability of the model to converge with a successful covariance step across the various scenarios are discussed in the following paragraphs. A summary of recommendations for the preferred methods of handling missing covariate data under various circumstances, based on the literature review and the results of the simulation exercises, is also provided.
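For reference, and assuming the conventional definitions used in simulation–estimation studies (the exact definitions applied here are given in the Supplementary Material), the relative mean error (RME) and root mean squared error (RMSE) of N replicate estimates of a true parameter value can be written as:

```latex
\mathrm{RME}\ (\%) \;=\; \frac{100}{N}\sum_{i=1}^{N}\frac{\hat{\theta}_i-\theta}{\theta},
\qquad
\mathrm{RMSE} \;=\; \sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl(\hat{\theta}_i-\theta\bigr)^{2}}
```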
In most scenarios, the JM and MM methods slightly outperformed the IRV or CC methods in correctly identifying a significant FAC effect (Figures 3 and 4, Tables S1–S6, and Figures S2–S5). The relative advantage of these preferred handling methods was more pronounced in the smaller data sets (200 subjects or fewer), with larger amounts of missing data (20%–50%), and for the weaker covariate effects (power of 0.5 or less for WT; difference in population value of 30% or less for SEX). The JM, IRV, and MM methods also slightly outperformed the CC method in terms of the RMSE of BSVCL, BSVV, and BSVKA with increasing amounts of missing covariate data (Tables S1–S4 and Figures S6–S9). However, there was no clear preference among the handling methods with respect to any of the other estimated model parameters.
FIGURE 3. Comparing the number of runs that failed to identify the covariate effect between handling methods. These data are summarized from all simulations involving a single intravascular or extravascular dose with WT as the covariate on CL. CC, complete case; CL, clearance; FAC, covariate effect size; JM, joint modeling; IRV, imputation to a reference value; WT, weight
FIGURE 4. Comparing the number of runs that failed to identify the covariate effect between handling methods. These data are summarized from all simulations involving a single intravascular or extravascular dose with SEX as the covariate on CL. CC, complete case; CL, clearance; FAC, covariate effect size; MM, mixture modeling; WT, weight
In some instances, the IRV and/or CC methods were preferable to the JM or MM methods. For example, the JM and MM methods failed to complete the covariance step more frequently than the other methods, especially in scenarios involving fewer patients (200 subjects or fewer), sparser sampling (4 vs. 7 timepoints), weaker covariate effects, and additional parameter estimates (full OMEGA block, etc.; Tables S1–S4). Similarly, use of the JM or MM methods was often associated with longer run times, although the increases in run times were modest for all of our tested scenarios. In cases where the data are of higher quality (larger data sets, 20% or less missing data), the CC or IRV methods would be preferable to either the JM or MM methods because there is minimal impact on correctly identifying a significant covariate effect.
Erroneous covariate data may also be encountered within a data set, and it may not be completely obvious that the erroneous data are present. An example of erroneous data that are encountered periodically is the mixing of units within a single covariate, such as weight (kg vs. lb). Because the ranges for weight in kg and weight in lb can overlap within a typical data set, it may not be obvious that some data are reported in lb, for example, when the expected unit is kg (Figure 5).
FIGURE 5. Diagnostic density plots demonstrating the manifestation of erroneous covariate data (units switched from kg to lb) in the distribution of WT. The results are stratified by the percentage of erroneous covariate data. WT, weight
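A simple programmatic counterpart to such a diagnostic is sketched below (hypothetical file and column names; the 150 kg screening threshold is an assumption appropriate only for a typical adult population, and 1 lb = 0.4536 kg). Flagged values should be queried with the clinical site before any conversion is applied.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical one-row-per-subject covariate table; column names are illustrative.
cov = pd.read_csv("covariates.csv")   # columns: ID, WT (expected in kg)

# Histogram diagnostic: a second cluster at roughly 2.2x the main mode can
# indicate a kg/lb mix (cf. Figure 5).
cov["WT"].plot(kind="hist", bins=40)
plt.xlabel("WT")
plt.savefig("wt_histogram.png")

# Crude screening rule (assumption: adult population, so values > 150 are implausible in kg)
suspect = cov["WT"] > 150
print(cov.loc[suspect, ["ID", "WT"]])

# Only after a record is confirmed to have been reported in lb, convert back to kg
cov.loc[suspect, "WT"] = cov.loc[suspect, "WT"] * 0.4536
```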
To evaluate the impact of unit switching on covariate analysis, we switched kg to lb for 5%, 10%, 20%, and 50% of subjects across the five data set sizes of 400, 200, 100, 50, and 25 patients. On average, even a 5% switch within the 400‐patient data sets increased the absolute value of the RME of FAC to roughly 40% (Tables S5 and S6 and Figures S10 and S11). As expected, accuracy declined further as the amount of unit switching increased (10%, 20%, 50%; Tables S5 and S6 and Figures S10 and S11). The difference between the estimated and true FAC values was approximately the same for the high and low FAC levels (FAC = 0.75 and FAC = 0.3, respectively) when units were switched (Tables S5 and S6 and Figures S10 and S11). A unit switch in weight from kg to lb significantly decreased the ability to identify WT as a covariate at both the low and high FAC levels (FAC = 0.3 and FAC = 0.75).
- The JM and MM methods typically performed as well as or better than the CC or IRV methods for identifying the covariate effect for WT and SEX, respectively. Therefore, the JM or MM methods would generally be a safe and conservative choice in all scenarios for identifying even weak continuous or categorical covariate effects, respectively.
- For large data sets (approximately 200 or more subjects), and with relatively minor amounts of missing covariate data (up to approximately 20% of the subjects have a missing covariate), all of the handling methods tested worked approximately equally in terms of identifying the covariate effect and estimating model parameter values accurately and precisely. Therefore, since the CC and IRV methods are generally easier to apply, these might be preferred over the JM or MM approaches in these scenarios.
- Although the JM and MM approaches still outperformed the other methods for detecting the covariate effect, a higher failure rate for the covariance step was observed with these methods compared with the CC and IRV methods.
- Caution should be exercised when there are apparently erroneous covariate data present within a data set, and the use of diagnostic plots such as histograms and/or density plots may be helpful to detect a units switch for continuous covariates. A decision tree for selecting the best handling method for missing or erroneous covariate data is displayed in Figure 6.
FIGURE 6. Decision tree for handling missing covariate data. CC, complete case; FAC, covariate effect size; IRV, imputation to a reference value; JM, joint modeling; MM, mixture modeling
The authors thank the expert reviewers and the authors of the previously published work referenced here; these contributions helped to refine this tutorial.
The authors declared no competing interests for this work.
© 2021. This work is published under http://creativecommons.org/licenses/by-nc/4.0/ (the “License”).
Abstract
Missing or erroneous information is a common problem in the analysis of pharmacokinetic (PK) data. This may present as missing or inaccurate dose level or dose time, drug concentrations below the analytical limit of quantification, missing sample times, or missing or incorrect covariate information. Several methods to handle problematic data have been evaluated, although no single, broad set of recommendations for commonly occurring errors has been published. In this tutorial, we review the existing literature and present the results of our simulation studies that evaluated common methods to handle known data errors to bridge the remaining gaps and expand on the existing knowledge. This tutorial is intended for any scientist analyzing a PK data set with missing or apparently erroneous data. The approaches described herein may also be useful for the analysis of nonclinical PK data.
1 Division of Pharmaceutics and Pharmacology, College of Pharmacy, The Ohio State University, Columbus, OH, USA
2 Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
3 Department of Experimental and Clinical Pharmacology, College of Pharmacy, University of Minnesota, Minneapolis, MN, USA
4 Division of Clinical Pharmacology, Department of Pediatrics, University of Utah School of Medicine, Salt Lake City, UT, USA
5 Projections Research Inc, Phoenixville, PA, USA; Division of Pharmaceutics and Pharmacology, College of Pharmacy, The Ohio State University, Columbus, OH, USA