Abstract

Efficiently searching and reusing code from large codebases is pivotal to developer productivity. Recently, deep-learning-based neural ranking models with large parameter counts and intricate interaction mechanisms have achieved strong accuracy in code search. In real-world scenarios, however, these models are computationally expensive: their high dimensionality makes inference slow, and interaction-based models must score the query against every code snippet in a voluminous corpus. As a result, their online retrieval is considerably slower than traditional Information Retrieval (IR) techniques. To address this, we introduce “ExCS”, a code search tool designed to accelerate retrieval without compromising accuracy. In an offline phase, ExCS performs code expansion: it predicts the queries each code snippet is likely to answer and uses them to enrich the snippet’s semantics. During online retrieval, ExCS first applies IR-based methods to pinpoint a concise set of promising candidates. Our evaluation on the Java dataset from CodeSearchNet shows that ExCS reduces retrieval time by 90% while maintaining 99% retrieval accuracy.
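The two-stage pipeline the abstract describes can be sketched with a toy BM25 retriever. This is a minimal illustration, not the paper's implementation: the corpus, the `predicted_queries` strings, and the tokenizer are hand-written stand-ins (in ExCS the query predictions would come from a trained model), and only the cheap IR stage is shown; the expensive neural ranker would score just the small candidate set this stage returns.

```python
import math
from collections import Counter

# Hypothetical corpus: each code snippet paired with predicted queries.
# In ExCS these predictions come from a learned model; here they are
# hand-written stand-ins for illustration.
corpus = {
    "sortUtils.java": {
        "code": "public static void quickSort(int[] a, int lo, int hi) { ... }",
        "predicted_queries": ["sort an array", "quick sort implementation"],
    },
    "fileUtils.java": {
        "code": "public static String readFile(Path p) { ... }",
        "predicted_queries": ["read file contents", "load text file"],
    },
}

def tokenize(text):
    # Crude whitespace tokenizer; a real system would split identifiers too.
    return text.lower().replace("(", " ").replace(")", " ").split()

# Offline phase: expand each code document with its predicted queries,
# enriching the lexical signal available to the IR stage.
docs = {
    name: tokenize(entry["code"])
    + [t for q in entry["predicted_queries"] for t in tokenize(q)]
    for name, entry in corpus.items()
}

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every expanded document against the query with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs.values()) / n
    df = Counter()  # document frequency per term
    for d in docs.values():
        df.update(set(d))
    scores = {}
    for name, d in docs.items():
        tf = Counter(d)
        s = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores[name] = s
    return scores

# Online phase: cheap IR retrieval narrows the corpus to a few candidates;
# only these would be passed to the expensive neural ranker.
scores = bm25_scores("how to sort an array", docs)
candidates = sorted(scores, key=scores.get, reverse=True)[:1]
print(candidates)  # → ['sortUtils.java']
```

Because the expansion injects natural-language phrases like "sort an array" into the code document, the purely lexical BM25 stage can already match a natural-language query against code, which is what lets the slow neural model run on only a handful of candidates.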

Details

Title
ExCS: accelerating code search with code expansion
Author
Huang, Siwei 1; Cai, Bo 1; Yu, Yaoxiang 1; Luo, Jian 1

1 Wuhan University, School of Cyber Science and Engineering, Wuhan, China (GRID:grid.49470.3e) (ISNI:0000 0001 2331 6153); Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, Wuhan, China (GRID:grid.419897.a) (ISNI:0000 0004 0369 313X)
Pages
29166
Publication year
2024
Publication date
2024
Publisher
Nature Publishing Group
e-ISSN
2045-2322
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3132709166
Copyright
© The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.