Abstract

Researchers can now collect and access data with large numbers of variables. Data sets with a large number of features and relatively few observations are referred to as high-dimensional data. Building statistical models and making statistical inferences from high-dimensional data is beyond the scope of well-developed classical statistical methods such as ordinary least squares. Penalized regression models have been among the most popular methods in this field. This thesis proposes novel approaches that improve the predictive performance of, and inference for, penalized models.

The first paper develops a novel testing procedure, Projection Inference for Penalized Regression Estimator (PIPE). Based on estimates from an initial penalized linear regression model, PIPE provides a computationally efficient way to compute test statistics that can be used for false discovery rate control. In the second paper, I extend the PIPE procedure to accommodate binary outcomes with penalized logistic regression. For both the linear and binary cases, the validity of the proposed PIPE procedure is studied carefully through its theoretical properties and empirical performance.
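
As a rough illustration of how per-feature test statistics can feed into false discovery rate control, the sketch below applies the standard Benjamini-Hochberg step-up procedure to two-sided p-values. This is a generic technique swapped in for illustration only: the abstract does not say which FDR procedure PIPE uses, and the assumption that the statistics are approximately standard normal under the null, along with the function names `two_sided_p` and `benjamini_hochberg`, are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a statistic assumed standard normal under the null."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(p_values, q=0.1):
    """Return the indices rejected by the Benjamini-Hochberg procedure at level q."""
    m = len(p_values)
    # Sort hypothesis indices by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under q * rank / m.
        if p_values[i] <= q * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical test statistics for five features (illustrative values only).
z_stats = [4.0, 0.2, -3.5, 1.0, -0.5]
selected = benjamini_hochberg([two_sided_p(z) for z in z_stats], q=0.1)
```

Here `selected` holds the positions of the features declared significant at the chosen FDR level.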

In the third paper, two novel cross-validation approaches, the cross-validated linear predictor and cross-validated deviance residuals, are developed for Cox regression, where cross-validation is inherently challenging because the models are built upon the partial likelihood. Both approaches can be used for model selection in penalized Cox regression models. I assess these methods and compare them with two existing approaches in a comprehensive set of simulations. The cross-validated linear predictor approach has the best overall performance.
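
A minimal sketch of one way to read the cross-validated linear predictor idea: each held-out observation receives a linear predictor from a model fit without its fold, and the pooled predictors are then scored once with the partial likelihood on the full data, so that risk sets are never restricted to a small fold. The abstract does not spell out these details, so this reading is an assumption. The `fit` argument is a hypothetical placeholder for any penalized Cox fitting routine, and the partial likelihood assumes no tied event times.

```python
import math


def cox_partial_loglik(times, events, eta):
    """Cox partial log-likelihood for linear predictors eta (assumes no tied event times)."""
    total = 0.0
    n = len(times)
    for i in range(n):
        if events[i]:
            # Risk set: everyone still under observation at the i-th event time.
            denom = sum(math.exp(eta[j]) for j in range(n) if times[j] >= times[i])
            total += eta[i] - math.log(denom)
    return total


def cv_linear_predictor(X, times, events, fit, n_folds=5):
    """Pool out-of-fold linear predictors, then score them on the full data set.

    `fit` is a hypothetical callable: fit(X_train, t_train, d_train) -> list of coefficients.
    """
    n = len(X)
    eta = [0.0] * n
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        beta = fit([X[i] for i in train],
                   [times[i] for i in train],
                   [events[i] for i in train])
        for i in test:
            # Linear predictor for a held-out row, using coefficients fit without it.
            eta[i] = sum(b * x for b, x in zip(beta, X[i]))
    return cox_partial_loglik(times, events, eta)
```

For model selection, this score would be computed for each candidate penalty level and the level with the largest value chosen.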

For all methods developed in this thesis, I illustrate their usage with real high-dimensional data sets.

Details

Title
Projection-Based Inference and Model Selection for Penalized Regression
Author
Dai, Biyue
Publication year
2019
Publisher
ProQuest Dissertations & Theses
ISBN
9781392422434
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
2378121178
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.