In this thesis, we develop computationally efficient algorithms with statistical guarantees for decision-making problems under uncertainty, particularly in the presence of large-scale, noisy, and high-dimensional data. In Chapter 2, we propose a kernelized projected Wasserstein distance for high-dimensional hypothesis testing, which finds the nonlinear mapping that maximizes the discrepancy between projected distributions. In Chapter 3, we provide an in-depth analysis of the computational and statistical guarantees of the kernelized projected Wasserstein distance. In Chapter 4, we study the variable selection problem in two-sample testing, aiming to select the most informative variables for determining whether two datasets follow the same distribution.

In Chapter 5, we present a novel framework for distributionally robust stochastic optimization (DRO), which seeks an optimal decision that minimizes the expected loss under the worst-case distribution within a specified ambiguity set; this worst-case distribution is modeled using a variant of the Wasserstein distance based on entropic regularization. In Chapter 6, we incorporate phi-divergence regularization into infinity-type Wasserstein DRO, a formulation particularly useful for adversarial machine learning tasks. Chapter 7 concludes with an overview of promising future research directions.
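To make the Chapter 2 construction concrete, the kernelized projected Wasserstein distance can be sketched in the following schematic form (the notation here is an assumption for illustration, not necessarily the thesis's own): for distributions $\mu$ and $\nu$ on $\mathbb{R}^d$,
\[
\mathrm{KPW}(\mu, \nu) \;=\; \max_{f \in \mathcal{F}} \; W_p\big(f_{\#}\mu,\; f_{\#}\nu\big),
\]
where $\mathcal{F}$ is a class of nonlinear low-dimensional mappings (for instance, functions built from a reproducing kernel), $f_{\#}\mu$ denotes the pushforward of $\mu$ under $f$, and $W_p$ is the order-$p$ Wasserstein distance. The maximizing $f$ plays the role of the discrepancy-maximizing nonlinear projection described above.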
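Similarly, the DRO framework of Chapter 5 can be sketched, again under assumed notation, as the minimax problem
\[
\min_{x \in \mathcal{X}} \; \sup_{Q:\; \mathcal{W}_{\varepsilon}(P, Q) \le \rho} \; \mathbb{E}_{\xi \sim Q}\big[\ell(x, \xi)\big],
\]
where $P$ is a nominal (for example, empirical) distribution, $\mathcal{W}_{\varepsilon}$ is an entropically regularized variant of the Wasserstein distance, $\rho$ is the radius of the ambiguity set, and $\ell(x, \xi)$ is the loss incurred by decision $x$ under outcome $\xi$.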