
Abstract

Neuro-symbolic learning aims to combine neural networks and symbolic reasoning into a hybrid AI. It can offer many desiderata of human-like intelligence, including explainability, efficiency, compositionality, and robustness, that are sorely lacking in today's monolithic deep neural networks. A well-designed interface between deep learning and symbolic reasoning provides a structural learning prior that can lead to performance improvements and state-of-the-art experimental results. In the bigger picture, integrating deep learning and symbolic reasoning yields an algorithm that unifies empiricism and rationalism, two branches of epistemology that explain how humans acquire knowledge. This makes neuro-symbolic reasoning a fundamentally important problem that may one day lead to AGI. This dissertation explores problems and methods of neuro-symbolic learning with computer programs as the symbolic form, focusing on program generation and execution as its two main topics.

First, Neuron Dependency Graphs (NDGs) discover symbolic rules that exist commonly in trained neural networks and represent them as directed graphs, where each node corresponds to the boolean activation value of a neuron, and each edge models an approximate logical implication from one node to another. In addition to providing symbolic explanations of the neural network’s internal structure, an NDG can represent a Structural Causal Model (SCM) that is a causal abstraction of the corresponding neural network that “unfolds” the same way under interventions.
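
As a rough illustration of the edge definition above, the sketch below builds a graph over boolean neuron activations and keeps an edge from i to j when "i is active" approximately implies "j is active". The abstract does not say how NDGs measure an approximate implication, so the conditional-frequency criterion, the `threshold` parameter, and the function name `approximate_implications` are assumptions made for illustration only.

```python
from itertools import permutations


def approximate_implications(activations, threshold=0.95):
    """Return edges (i, j, confidence) where neuron i being active
    approximately implies neuron j being active.

    activations: list of rows, one per input sample; each row holds one
    boolean per neuron. The frequency threshold is an assumed criterion,
    not the dissertation's stated method.
    """
    n_neurons = len(activations[0])
    edges = []
    for i, j in permutations(range(n_neurons), 2):
        # Samples on which the antecedent neuron i is active.
        support = [row for row in activations if row[i]]
        if not support:
            continue  # no evidence for or against the implication
        confidence = sum(1 for row in support if row[j]) / len(support)
        if confidence >= threshold:
            edges.append((i, j, confidence))
    return edges


# Toy activation table: 4 samples, 3 neurons.
samples = [
    [True, True, False],
    [True, True, True],
    [False, True, False],
    [False, False, True],
]
edges = approximate_implications(samples)
print(edges)  # [(0, 1, 1.0)]
```

In this toy table, neuron 1 is active whenever neuron 0 is, so the only edge kept is 0 → 1.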

Then, NSEdit designs a domain-specific language (DSL) as the interface for Transformers to edit code. The DSL interface allows localization, insertion, and deletion, and a neuro-symbolic bi-modal decoder learns to perform bug localization and repair jointly, predicting mixed data types, including editing actions, locations, and words. When published, NSEdit achieved state-of-the-art program repair performance.
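
To make the idea of an editing DSL concrete, here is a minimal sketch of applying a sequence of located insert and delete actions to a token list. The tuple encoding of edits and the `apply_edits` helper are hypothetical; the abstract does not specify NSEdit's actual DSL syntax, and the neural decoder that predicts the edits is not modeled here.

```python
def apply_edits(tokens, edits):
    """Apply located edits to a list of tokens.

    Each edit is one of (hypothetical encoding):
      ("delete", start, end)   remove tokens[start:end]
      ("insert", pos, words)   insert the list `words` before tokens[pos]
    Positions refer to the ORIGINAL token list. Edits are applied from
    right to left so that earlier positions stay valid; at the same
    position, deletions run before insertions. Overlapping deletions
    are not handled.
    """
    result = list(tokens)
    ordered = sorted(
        edits,
        key=lambda e: (e[1], 1 if e[0] == "delete" else 0),
        reverse=True,
    )
    for action, pos, arg in ordered:
        if action == "delete":
            del result[pos:arg]
        elif action == "insert":
            result[pos:pos] = arg
        else:
            raise ValueError(f"unknown edit action: {action!r}")
    return result


buggy = ["if", "x", "=", "1", ":"]
fixed = apply_edits(buggy, [("insert", 2, ["=="]), ("delete", 2, 3)])
print(fixed)  # ['if', 'x', '==', '1', ':']
```

Because every edit names a location, the edit script is much shorter than regenerating the whole program, which is one plausible motivation for predicting edits instead of full code.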

Next, Neural Interpretation (NI) presents a neural model for procedural code execution, where each function is represented by a neural network, and every variable is represented by a vector. NI mirrors how humans reason abstractly about a program's top-to-bottom execution without knowing how the program will actually run step by step. In experiments, we show that the neuro-symbolic interpreter can be trained end-to-end with gradient descent. The method can be trained to “execute” library functions without test inputs, because the variables are represented as vectors and do not require the actual values or entry points.
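
The representation described above (one network per function, one vector per variable) can be sketched as a tiny interpreter that walks statements from top to bottom. Everything here is an illustrative assumption: the vector size `DIM`, the single tanh layer standing in for a function's network, the random untrained weights, and the statement encoding. The gradient-descent training mentioned in the abstract is omitted.

```python
import math
import random

DIM = 4  # assumed size of every variable vector


class NeuralFunction:
    """A stand-in network for one function: tanh(W @ concat(args))."""

    def __init__(self, arity, rng):
        self.arity = arity
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(DIM * arity)]
            for _ in range(DIM)
        ]

    def __call__(self, *args):
        if len(args) != self.arity:
            raise TypeError(f"expected {self.arity} arguments, got {len(args)}")
        x = [value for arg in args for value in arg]  # concatenate inputs
        return [
            math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in self.weights
        ]


def interpret(program, functions, env):
    """Run statements (target, function_name, argument_names) in order,
    binding each target variable to the vector its function produces."""
    env = dict(env)  # leave the caller's environment untouched
    for target, func_name, arg_names in program:
        env[target] = functions[func_name](*(env[name] for name in arg_names))
    return env


rng = random.Random(0)
functions = {"parse": NeuralFunction(1, rng), "merge": NeuralFunction(2, rng)}
inputs = {
    "x": [rng.uniform(-1.0, 1.0) for _ in range(DIM)],
    "y": [rng.uniform(-1.0, 1.0) for _ in range(DIM)],
}
# Corresponds to:  a = parse(x);  b = merge(a, y)
program = [("a", "parse", ["x"]), ("b", "merge", ["a", "y"])]
result = interpret(program, functions, inputs)
print(sorted(result))  # ['a', 'b', 'x', 'y']
```

Note that no concrete value of `x` or `y` is ever needed: the interpreter only propagates vectors, which is consistent with the abstract's point that library functions can be "executed" without test inputs.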

Following Neural Interpretation, a Neuro-symbolic Interpreter for Arithmetic Composition (NIAC) demonstrates the compositional generalization ability of NI when performing arithmetic calculation. NIAC learns a structure-preserving mapping between neural execution and arithmetic calculation. Unlike LLMs that lack compositional generalization with respect to productivity (length) and systematicity (format), NIAC guarantees perfect compositional generalization and uses constant memory for potentially infinite input length during inference.

Additionally, TableLabeler uses LLMs for automated tabular dataset construction, with quality validation of the LLM-annotated labels. The efficiency of the LLM annotation process enables TableLabeler to introduce the largest executable SQL dataset in the literature. The dataset construction method can also synthesize programs for new database languages with very little prior training data.

Finally, QualityFlow proposes an agentic workflow method for program synthesis, consisting of software engineering roles including a program generator, a test designer, and a self-debugger, all of which are controlled by a centralized quality checker. QualityFlow achieved state-of-the-art performance on various program synthesis benchmarks.

Details

1010268
Business indexing term
Title
Neuro-Symbolic Program Generation and Execution for Hybrid Reasoning
Author
Number of pages
261
Publication year
2025
Degree date
2025
School code
0097
Source
DAI-B 86/12(E), Dissertation Abstracts International
ISBN
9798286430420
Committee member
Jannesari, Ali; Gao, Hongyang; Martin, Ryan
University/institution
Iowa State University
Department
Computer Science
University location
United States -- Iowa
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
31995574
ProQuest document ID
3224180549
Document URL
https://www.proquest.com/dissertations-theses/neuro-symbolic-program-generation-execution/docview/3224180549/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic