Neural Modeling of Reasoning About Program Behaviors

Abstract

Programming languages, much like natural languages, exhibit a high degree of repetitiveness and regularity, a property often referred to as the naturalness of software. This characteristic, combined with the improved ability of neural language models (NLMs) to learn statistically from such patterns, has led to their widespread adoption in software engineering (SE) tasks ranging from code generation to automated bug detection and program repair. While these automated software engineering applications offer a useful proxy for assessing the downstream performance of NLMs, the ability of NLMs to reason about intrinsic program properties, such as structure, semantics, and execution behaviors, remains underexplored.

This dissertation addresses this gap through the lens of program analysis, using its formalisms to probe the reasoning capabilities of NLMs over intrinsic program behaviors. In general, analyzing programs entails either examining all possible behaviors based on program semantics (static analysis) or establishing precise execution behaviors by running the entire test suite (dynamic analysis), each with trade-offs in generalizability and scalability. As an alternative, we introduce a new paradigm, predictive program analysis, which aims to learn to analyze program behaviors from similar analyses of open-source software repositories. This approximation helps extend such analyses to partial programs, enables static estimation of runtime behaviors, and facilitates multilingual program analysis, all at scale. Using dependence analysis as a representative setting, this dissertation investigates how NLMs can model program structure, semantics, and execution behaviors across three key dimensions: (i) the granularity of dependencies, ranging from inter-statement and variable-statement to inter-constraint dependencies; (ii) the nature of reasoning, spanning both static and dynamic program behaviors; and (iii) the reasoning modality, which involves reasoning either in the latent space or through verbalized natural language explanations. Overall, these contributions show that predictive analysis can generalize, bridging the gap between static and dynamic analysis, while offering insights into how language models internalize reasoning about program behaviors.

Details

Subject

Computer engineering;
Computer science;
Information science

Classification

0984: Computer science
0464: Computer engineering
0723: Information science

Identifier / keyword

Neural language models; Programming languages; Software engineering; Program semantics

Title

Neural Modeling of Reasoning About Program Behaviors

Author

Yadavally, Aashish

Number of pages

185

Publication year

2025

Degree date

2025

School code

0382

Source

DAI-A 87/6(E), Dissertation Abstracts International

ISBN

9798265455963

Advisor

Nguyen, Tien N.; Overzet, Lawrence

Committee member

Ray, Baishakhi; Yang, Wei; Wei, Shiyi

University/institution

The University of Texas at Dallas

Department

Computer Science

University location

United States -- Texas

Degree

Ph.D.

Source type

Dissertation or Thesis

Language

English

Document type

Dissertation/Thesis

Dissertation/thesis number

32435089

ProQuest document ID

3279300678

Document URL

https://www.proquest.com/dissertations-theses/neural-modeling-reasoning-about-program-behaviors/docview/3279300678/se-2?accountid=208611

Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.

Database

ProQuest One Academic
