SUMMARY. We develop a method for constructing adaptive regression spline models for the exploration of survival data. The method combines Cox's (1972, Journal of the Royal Statistical Society, Series B 34, 187-200) regression model with a weighted least-squares version of the multivariate adaptive regression spline (MARS) technique of Friedman (1991, Annals of Statistics 19, 1-141) to adaptively select the knots and covariates. The new technique can automatically fit models with terms that represent nonlinear effects and interactions among covariates. Applications based on simulated data and data from a clinical trial for myeloma are presented. Results from the myeloma application identified several important prognostic variables, including a possible nonmonotone relationship with survival in one laboratory variable. Results are compared to those from the adaptive hazard regression (HARE) method of Kooperberg, Stone, and Truong (1995, Journal of the American Statistical Association 90, 78-94).
KEY WORDS: Censored survival data; HARE; MARS; Proportional hazards.
1. Introduction
The linear proportional hazards model of Cox (1972) is a popular regression tool for the analysis of censored survival data. The model allows a parametric specification of the relative risk regression function of the predictor variables while making minimal assumptions about the baseline hazard function. The proportional hazards model has been extended to smooth additive functions of the relative risk by, e.g., Hastie and Tibshirani (1990), Gentleman and Crowley (1991), O'Sullivan (1988), Sasieni (1989), and Sleeper and Harrington (1990). In the context of regression trees and other piecewise constant regression functions, there has been work by Davis and Anderson (1989) and LeBlanc and Crowley (1992, 1995). Gray (1992) and Hastie and Tibshirani (1993) extend the Cox model to model smooth time-varying variable effects.
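For reference, the linear proportional hazards model mentioned at the start of this paragraph, and its smooth additive extension, are commonly written as below. The symbols (\lambda_0 for the unspecified baseline hazard, \beta for the coefficient vector, f_j for smooth component functions, p for the number of covariates) are standard notation assumed here for illustration and may differ from the notation used later in the paper.

```latex
% Linear proportional hazards model: parametric relative risk,
% baseline hazard \lambda_0(t) left unspecified.
\lambda(t \mid x) = \lambda_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Smooth additive extension: the linear predictor is replaced by a
% sum of smooth functions of the individual covariates.
\lambda(t \mid x) = \lambda_0(t)\,\exp\!\left(\sum_{j=1}^{p} f_j(x_j)\right)
```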
Recently, Kooperberg, Stone, and Truong (1995) developed the powerful hazard regression (HARE) method, which models the hazard function using piecewise linear regression splines whose knots and variables are adaptively selected on the basis of the outcome variable. The resulting model has similarities with the multivariate adaptive regression spline (MARS) model of Friedman (1991). However, because the HARE method focuses on the estimation of the entire hazard function, instead of just the relative risk regression function, the unconditional log hazard function must also be modeled parametrically as part of the procedure.
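The piecewise linear spline terms referred to above can be illustrated with a minimal sketch. It builds the truncated linear ("hinge") basis functions (x - t)_+ and (t - x)_+ used in MARS-type models, and forms an interaction term as a product of hinge functions in two different covariates. The function name `hinge_pair`, the toy data, and the knot locations are hypothetical and chosen by hand; no adaptive knot or variable selection is performed here.

```python
import numpy as np


def hinge_pair(x, knot):
    """Return the truncated linear basis pair (x - knot)_+ and (knot - x)_+.

    Hypothetical helper for illustration; the knot is supplied by the
    caller rather than selected adaptively from the data.
    """
    x = np.asarray(x, dtype=float)
    right = np.maximum(x - knot, 0.0)  # zero to the left of the knot
    left = np.maximum(knot - x, 0.0)   # zero to the right of the knot
    return right, left


# Toy covariate values (assumed, not from the paper).
x1 = np.array([0.0, 1.0, 2.0, 3.0])
x2 = np.array([2.0, 0.0, 1.0, 4.0])

# Main-effect terms for x1 with a knot at 1.5.
x1_right, x1_left = hinge_pair(x1, knot=1.5)

# Interaction term: product of hinge functions in two different covariates.
x2_right, _ = hinge_pair(x2, knot=0.5)
interaction = x1_right * x2_right

# Columns of a design matrix that a regression step could then use.
design = np.column_stack([x1_right, x1_left, interaction])
```

A linear combination of such columns is piecewise linear in each covariate with a change of slope at each knot, which is what allows nonlinear effects and interactions to be represented.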
We consider models that...