Abstract

Pre-trained language models have been successful in many knowledge-intensive NLP tasks. However, recent work has shown that models such as BERT are not “structurally ready” to aggregate textual information into a [CLS] vector for dense passage retrieval (DPR). This “lack of readiness” results from the gap between language model pre-training and DPR fine-tuning. Previous solutions call for computationally expensive techniques such as hard negative mining, cross-encoder distillation, and further pre-training to learn a robust DPR model. In this work, we instead propose to fully exploit knowledge in a pre-trained language model for DPR by aggregating the contextualized token embeddings into a dense vector, which we call agg★. By concatenating vectors from the [CLS] token and agg★, our Aggretriever model substantially improves the effectiveness of dense retrieval models on both in-domain and zero-shot evaluations without introducing substantial training overhead. Code is available at https://github.com/castorini/dhr.
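
To make the aggregation idea concrete, below is a minimal sketch, not the authors' exact method, of encoding a passage by concatenating the [CLS] vector with a vector aggregated from the remaining contextualized token embeddings. The bert-base-uncased checkpoint, the encode helper, and the masked mean pooling that stands in for the paper's agg★ construction are all illustrative assumptions; the actual Aggretriever implementation is in the linked repository.

    # Sketch: concatenate the [CLS] vector with an aggregate of the token
    # embeddings to form a single dense vector for retrieval. Mean pooling
    # here is an assumed stand-in for the paper's agg* aggregation.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    encoder = AutoModel.from_pretrained("bert-base-uncased")

    def encode(text: str) -> torch.Tensor:
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        hidden = encoder(**inputs).last_hidden_state   # (1, seq_len, hidden)
        cls_vec = hidden[:, 0, :]                       # [CLS] representation
        # Masked mean pool over the remaining contextualized token embeddings
        mask = inputs["attention_mask"][:, 1:].unsqueeze(-1)
        tok_vec = (hidden[:, 1:, :] * mask).sum(1) / mask.sum(1).clamp(min=1)
        # Concatenate the two views into one dense passage vector
        return torch.cat([cls_vec, tok_vec], dim=-1)    # (1, 2 * hidden)

    q = encode("what is dense passage retrieval?")
    p = encode("Dense retrieval encodes queries and passages as vectors.")
    score = torch.einsum("bd,bd->b", q, p)              # dot-product relevance

In this sketch, relevance is scored with an inner product between the concatenated query and passage vectors, so the [CLS] and aggregated-token components each contribute additively to the final score.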

Details

Title
Aggretriever: A Simple Approach to Aggregate Textual Representations for Robust Dense Passage Retrieval
Author
Lin, Sheng-Chieh; Li, Minghan; Lin, Jimmy
Pages
436-452
Publication year
2023
Publication date
2023
Publisher
The MIT Press Journals
ISSN
2307-387X
Source type
Scholarly Journal
Language of publication
English
Copyright
© 2023. This work is published under the Creative Commons Attribution 4.0 License: https://creativecommons.org/licenses/by/4.0/legalcode.