Received Mar 26, 2018; Accepted May 5, 2018
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
Mixed effects models are widely used in the analysis of clustered data, especially longitudinal data and survival data. In a longitudinal study, some variables are measured repeatedly over time, and these variables may be used as either responses or covariates, depending on the study objectives. A common problem is that data on some of these variables may be left or right censored due to detection limits, may be missing at times of interest, or may be measured with error. For example, in HIV/AIDS studies, viral load values may be left censored due to lower detection limits and may also be missing or measured with substantial error. These “incomplete data” issues must be addressed for statistical inference to be valid. In this article, we consider the case where these incompletely observed, time-varying variables are used as important covariates in mixed effects models for longitudinal response data or for time-to-event response data. To simplify the discussion, we focus on time-dependent covariates with left censoring, since similar methods and models may be used for right censoring, missing data, or measurement error in the covariates.
Longitudinal data with left censoring have received increasing attention in the literature in recent years (e.g., [1–8]). A common approach is to assume an empirical model for the covariate of interest based on the observed data, such as a linear mixed effects model. The empirical model is then used to “predict” the true covariate values when these values are censored, assuming that the fitted model continues to hold for the unobserved censored values. A potential problem with this approach is that the empirical covariate model, which is fitted to the observed data, may not hold for the censored covariate values, because these “too small to observe” values may arise from a different data-generation mechanism. For example, in AIDS studies, censored viral loads below the detection limit may behave very differently from the observed values above the detection limit, due to a possibly different disease status for suppressed viral loads [6]. Moreover, the assumed model and distribution for...