Abstract
Recent approaches to binary–source code matching often operate at the intermediate representation (IR) level. Some instead match at the binary level, compiling the source code to binary and comparing it directly with the target binary code. Others, though less common, match at the pseudo-code level, first decompiling the binary code into pseudo-code and then comparing it with the source code. However, all of these approaches lose semantic information present in the original source code and introduce noise during compilation and decompilation, which makes accurate matching challenging and often requires specialized expertise. To address these limitations, this study introduces BSMDG, a system for binary–source code matching based on decompilation techniques and Graph analysis, which matches binary code with source code at the source code level. Our method uses the Ghidra decompiler together with a custom-built transpiler to reconstruct high-level C++ source code from binary executables. Call graphs (CGs) and control flow graphs (CFGs) are then generated for both the original and the translated code to evaluate their structural and semantic similarity. For training and testing, we used a curated dataset of C++ source code and corresponding binary files collected from the AtCoder website. Additionally, we conducted a case study on the widely recognized POJ-104 benchmark dataset to assess the system's generalizability. The results demonstrate the effectiveness of combining decompilation with graph-based analysis: our system achieves 90% accuracy on POJ-104, highlighting its potential for code clone detection, vulnerability identification, and reverse engineering tasks.
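To make the graph-comparison step concrete, the following is a minimal sketch of one way two control flow graphs could be compared structurally. It is not the BSMDG similarity measure, which the abstract does not specify. It assumes each CFG is given as a list of directed edges and scores similarity as the multiset Jaccard index over per-node (in-degree, out-degree) pairs. The function names and both example graphs are hypothetical.

```python
from collections import Counter


def degree_signature(edges):
    """Return a multiset of (in_degree, out_degree) pairs, one per node."""
    indeg, outdeg = Counter(), Counter()
    nodes = set()
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
        nodes.update((src, dst))
    return Counter((indeg[n], outdeg[n]) for n in nodes)


def graph_similarity(edges_a, edges_b):
    """Multiset Jaccard similarity of the two degree signatures, in [0, 1].

    Assumption: this label-independent measure stands in for whatever
    structural comparison a real system would use.
    """
    sig_a = degree_signature(edges_a)
    sig_b = degree_signature(edges_b)
    intersection = sum((sig_a & sig_b).values())
    union = sum((sig_a | sig_b).values())
    # Two empty graphs are treated as identical.
    return intersection / union if union else 1.0


# Hypothetical CFG of a source function: a single if/else diamond.
source_cfg = [
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

# Hypothetical CFG of the decompiled counterpart: same diamond with
# different block labels and one extra trailing block (noise).
decompiled_cfg = [
    ("b0", "b1"),
    ("b1", "b2"),
    ("b1", "b3"),
    ("b2", "b4"),
    ("b3", "b4"),
    ("b4", "b5"),
]

if __name__ == "__main__":
    print(round(graph_similarity(source_cfg, decompiled_cfg), 3))
```

Because the signature ignores node labels, renamed basic blocks do not affect the score, while the extra block introduced by decompilation lowers it. Degree signatures are coarse, so structurally different graphs can still collide; a practical system would likely combine richer structural and semantic features.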