This paper proposes a decorrelation scheme based on product quantization, termed Reference-Vector Removed Product Quantization (RvRPQ), for approximate nearest neighbor (ANN) search. The core idea is to capture the redundancy among database vectors by representing them with compactly encoded reference-vectors, which are then subtracted from the original vectors to yield residual vectors. We provide a theoretical derivation for obtaining the optimal reference-vectors. This preprocessing step significantly improves the quantization accuracy of the subsequent product quantization applied to the residuals. To maintain low online computational complexity and control memory overhead, we apply vector quantization to the reference-vectors and allocate only a small number of additional bits to store their indices. Experimental results show that RvRPQ substantially outperforms state-of-the-art ANN methods in terms of retrieval accuracy, while preserving high search efficiency.
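
The pipeline described above (quantize reference-vectors, subtract them, then product-quantize the residuals) can be illustrated with a minimal NumPy sketch. Several parts are assumptions made only for illustration. The reference-vector codebook is learned here with plain k-means, which stands in for the optimal reference-vectors derived in the paper. The function names (`kmeans`, `remove_reference_vectors`, `pq_reconstruct`), the synthetic data, and all parameter values are hypothetical and not taken from the paper.

```python
import numpy as np


def kmeans(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns (centroids, nearest-centroid index per row)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points (requires k <= len(x)).
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.argmin(axis=1)


def remove_reference_vectors(x, n_ref, seed=0):
    """Vector-quantize x into n_ref reference-vectors and subtract them.

    Assumption: k-means is used as a stand-in for the paper's derivation.
    Only log2(n_ref) extra bits per vector are needed to store ref_idx.
    """
    ref_codebook, ref_idx = kmeans(x, n_ref, seed=seed)
    residual = x - ref_codebook[ref_idx]
    return ref_codebook, ref_idx, residual


def pq_reconstruct(x, n_sub, k):
    """Product quantization: split the dimensions into n_sub equal sub-vectors,
    quantize each with its own k-word codebook, and return the reconstruction.
    The dimensionality must be divisible by n_sub."""
    parts = np.split(x, n_sub, axis=1)
    recon = []
    for i, part in enumerate(parts):
        codebook, codes = kmeans(part, k, seed=i)
        recon.append(codebook[codes])
    return np.hstack(recon)


def mean_sq_error(x, recon):
    return float(np.mean(np.sum((x - recon) ** 2, axis=1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic clustered data (illustrative only, not a benchmark dataset).
    centers = rng.normal(scale=5.0, size=(16, 16))
    x = centers[rng.integers(16, size=2000)] + rng.normal(size=(2000, 16))

    # Baseline: product quantization applied directly to the vectors.
    err_pq = mean_sq_error(x, pq_reconstruct(x, n_sub=4, k=16))

    # Reference-vector removal followed by PQ on the residuals.
    ref_codebook, ref_idx, residual = remove_reference_vectors(x, n_ref=16)
    recon = ref_codebook[ref_idx] + pq_reconstruct(residual, n_sub=4, k=16)
    err_rvr = mean_sq_error(x, recon)

    print(f"PQ only: {err_pq:.3f}  reference removed + PQ: {err_rvr:.3f}")
```

The sketch compares reconstruction error only; it does not implement the search procedure, and the numbers it prints on synthetic data say nothing about the retrieval results reported in the paper.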
