Content area
High-dimensional problems are increasingly prevalent across a wide range of scientific and engineering domains, including machine learning, computational biology, and data-driven decision systems. Solving these problems efficiently is of critical importance due to the overwhelmingly large volume of data that modern systems must process. In many practical scenarios, it is not necessary to compute exact solutions; instead, proximity solutions are sufficiently close to the optimal one, which is acceptable and desirable. This relaxation opens the door to significant computational speed-ups. In this thesis, we demonstrate four important high-dimensional proximity problems for which substantial acceleration can be achieved using advanced algorithmic techniques.
First, using the algebraic method, we design a faster average-case algorithm for the orthogonal vector problem and the closest pair problem. Both problems are high-dimensional problems, and they serve as cornerstones of fine-grained complexity, as one can refute SETH (strong exponential time hypothesis) by solving any one of them in subquadratic time.
Second, we develop a new method for approximate matrix multiplication, in which we can utilize any incomplete matrix multiplication tensors. We practice this new method and gain speed-ups on the correlation detection problem (i.e., the light bulb problem), a fundamental problem in learning theory.
Third, using geometric tools, we design the first high-dimensional Euclidean spanner with $(1+\eps)$ approximation and subquadratic size. Such a tool can embed high-dimensional vectors into a graph, enabling us to access all graph-based algorithms. We speed up the calculation of Earth-Mover distance via this tool.
Lastly, we design the world's fastest linear program solver. The key technical challenge is a dynamic matrix-vector multiplication maintenance problem. In our new algorithm, we combine many advanced techniques, such as robust central path, fast rectangular matrix multiplication, sketching algorithms, and multi-layer lazy updates on a dynamic data structure.