This dissertation focuses on the development of sampling policies and schedules that maximize the per-sample information utility for a specified objective. We are motivated by applications in science and engineering where data may result from expensive, time-consuming experiments or diagnostics, but where careful, strategic design of the experiments or system in question can yield more informative results. In particular, we explore applications that include measurement design for quantum state discrimination and anomaly detection for Markovian systems.
We first frame this problem as an active hypothesis test and develop the PostAC framework, which tracks a Bayesian posterior over the hypotheses and uses an actor-critic algorithm to learn a deep neural network testing policy as a function of this posterior. This is the first general-purpose framework that can readily be applied to sequential or fixed-length tests, discrete or continuous action spaces, and i.i.d. or dynamic environments.
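As a minimal sketch of the belief-tracking component described above, the snippet below performs the Bayesian posterior update over a set of hypotheses after each observation. The function and variable names (`update_posterior`, `likelihoods`) and the toy two-hypothesis coin model are our own illustrative assumptions, not the dissertation's implementation; in the actual framework, the resulting belief vector would be the input to a learned neural network policy.

```python
import numpy as np

def update_posterior(posterior, action, observation, likelihoods):
    """One Bayesian update of the belief over hypotheses.

    likelihoods[h](action, observation) returns
    p(observation | hypothesis h, action).
    """
    weights = np.array([lik(action, observation) for lik in likelihoods])
    unnorm = posterior * weights
    return unnorm / unnorm.sum()

# Toy example (our assumption): two hypotheses about a biased coin,
# with the action ignored for simplicity.
liks = [
    lambda a, y: 0.9 if y == 1 else 0.1,  # hypothesis 0: heads-heavy
    lambda a, y: 0.2 if y == 1 else 0.8,  # hypothesis 1: tails-heavy
]
belief = np.array([0.5, 0.5])
for y in [1, 1, 0, 1]:
    belief = update_posterior(belief, action=None, observation=y,
                              likelihoods=liks)
# The belief concentrates on hypothesis 0; a learned testing policy
# would map this belief vector to the next measurement action.
```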
In the next part of this work, we explore applications in quantum state discrimination, first applying traditional methods from active hypothesis testing to develop a heuristic testing strategy based on the extrinsic Jensen-Shannon information divergence. We then apply the PostAC framework to learn a physically realizable testing policy in the problem of discriminating coherent states (states of the electromagnetic field commonly used in optical communication), and show state-of-the-art performance in the low-photon regime.
We then consider a sequence of problems in Markov hypothesis testing, first developing an asymptotic analysis of the problem and an information divergence-based sampling policy that allows the tester to distinguish Markov chains with fewer samples. We again apply PostAC to this setting and show that the resulting policy provides an order of magnitude improvement in sample efficiency over passive sampling.
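To illustrate the idea of a divergence-based sampling policy for distinguishing Markov chains, the sketch below probes the state whose transition distributions differ most between the two hypothesized chains. The two-state chains `P0` and `P1` and the per-state KL criterion are illustrative assumptions of ours, not the dissertation's analysis; they simply show why active state selection can separate hypotheses with fewer samples than passive sampling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothesized 2-state Markov chains (rows are transition
# distributions). They differ only in state 1, so probing state 1
# is the informative choice.
P0 = np.array([[0.5, 0.5], [0.9, 0.1]])
P1 = np.array([[0.5, 0.5], [0.1, 0.9]])

def kl(p, q):
    """KL divergence between two discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

# Divergence-based policy: probe the state whose transition rows
# differ most across the two hypotheses.
per_state_kl = [kl(P0[s], P1[s]) for s in range(2)]
best_state = int(np.argmax(per_state_kl))

# Simulate transitions from the chosen state under the true chain P0
# and accumulate the log-likelihood ratio between the hypotheses.
llr = 0.0
for _ in range(50):
    nxt = rng.choice(2, p=P0[best_state])
    llr += np.log(P0[best_state, nxt] / P1[best_state, nxt])
# A large positive llr favors the true hypothesis P0; probing state 0
# instead would contribute nothing, since its rows are identical.
```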
These results focus on either minimizing the probability of error or minimizing the number of samples required to achieve a target probability of error. In the final part, we instead consider a fairness objective: we design a sampling policy such that a classifier trained on the resulting dataset achieves minimax optimal performance across different sub-populations of the data distribution.
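As a rough formalization of this fairness objective (the notation here is ours, not necessarily that of the dissertation), a minimax-optimal sampling policy $\pi^\star$ over sub-populations $g \in \mathcal{G}$ can be written as

\[
\pi^\star \in \arg\min_{\pi} \max_{g \in \mathcal{G}} \; \mathbb{E}\big[\,\ell\big(f_{\pi}(X), Y\big) \,\big|\, G = g\,\big],
\]

where $f_{\pi}$ denotes the classifier trained on the dataset collected under policy $\pi$ and $\ell$ is the classification loss, so the policy is chosen to minimize the worst-case expected loss over sub-populations.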