
Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities across various domains, including code generation. However, complex inductive reasoning, the task of deriving general rules from limited observations, remains a significant challenge. Programming-by-Examples (PBE), which synthesizes programs from input-output examples, is an important inductive reasoning task in programming languages with practical applications. We propose an approach that enhances LLMs on PBE using code-grounded synthetic data generation, providing high-quality training data for fine-tuning and addressing the scarcity of domain-specific data. We further demonstrate that scaling test-time computation significantly improves inference results in this PBE setting. Our approach achieves state-of-the-art results on common PBE benchmarks spanning the string, number-sequence, and logo-graphics domains. We then extend our methods to ARC-AGI, a very challenging benchmark that requires visual inductive reasoning from a few examples involving concepts such as physics, objects, and symmetry. By applying our synthetic-data and test-time-scaling methods and combining them with transduction, we approach human-level performance on ARC-AGI, demonstrating the framework's effectiveness even in highly challenging, visually grounded domains.
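The test-time scaling described above rests on a property specific to PBE: sampled candidate programs can be checked directly against the input-output examples, so drawing more samples raises the chance that at least one survivor is correct. A minimal sketch of that filtering loop is below; the `candidates` list stands in for programs sampled from a fine-tuned LLM, and the function name `solve` is an illustrative convention, not the dissertation's interface.

```python
# Sketch of example-based validation, the step that makes test-time
# scaling effective in PBE: keep only candidates consistent with all examples.

def run_program(src, x):
    """Execute a candidate program (a Python function named `solve`) on input x."""
    env = {}
    exec(src, env)  # in a real system this execution would be sandboxed
    return env["solve"](x)

def consistent(src, examples):
    """A candidate survives only if it reproduces every input-output example."""
    try:
        return all(run_program(src, x) == y for x, y in examples)
    except Exception:
        return False  # crashes or wrong signatures count as failures

# Input-output examples from a toy string-transformation task.
examples = [("hello world", "Hello World"), ("a b", "A B")]

# Stand-ins for LLM samples; more samples -> higher chance one passes.
candidates = [
    "def solve(s):\n    return s.upper()",   # wrong: uppercases everything
    "def solve(s):\n    return s.title()",   # consistent with both examples
]

survivors = [c for c in candidates if consistent(c, examples)]
```

Because validation is exact, the search can be embarrassingly parallel: sample widely, execute, and discard, with no learned ranking needed in this simplest form.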

Unlike PBE and ARC-AGI tasks, where examples enable direct validation, real-world code generation often begins with an ambiguous natural language specification. This ambiguity creates uncertainty about code correctness. We develop an approach that samples both code and tests from LLMs and uses the execution results to build a classifier that estimates correctness probabilities. The method produces human-interpretable predicates explaining code behavior, a feature users preferred in our user study, and helps make program synthesis more trustworthy while maintaining state-of-the-art accuracy.
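The core idea in this paragraph, sampling both code and tests and grounding a correctness estimate in their execution agreement, can be sketched as follows. The scoring rule here (fraction of sampled tests passed) and all names are illustrative assumptions, not the dissertation's exact classifier; the sampled tests also double as the human-readable predicates mentioned above.

```python
# Hedged sketch: execute sampled tests against sampled code candidates and
# use the pass matrix as a simple execution-grounded correctness signal.

def passes(code_src, test_src):
    """Run one sampled test (an assertion over `solve`) against one candidate."""
    env = {}
    try:
        exec(code_src, env)  # define the candidate; would be sandboxed in practice
        exec(test_src, env)  # raise AssertionError (or crash) on disagreement
        return True
    except Exception:
        return False

def correctness_scores(codes, tests):
    """Score each candidate by the fraction of sampled tests it passes;
    a stand-in for a learned classifier over execution results."""
    return [sum(passes(c, t) for t in tests) / len(tests) for c in codes]

# Stand-ins for LLM-sampled code candidates for "sort a list".
codes = [
    "def solve(xs):\n    return sorted(xs)",
    "def solve(xs):\n    return xs",          # buggy: does not sort
]
# Stand-ins for LLM-sampled tests; each is a readable predicate on behavior.
tests = [
    "assert solve([3, 1, 2]) == [1, 2, 3]",
    "assert solve([]) == []",
]

scores = correctness_scores(codes, tests)
```

Because sampled tests can themselves be wrong, a real system would treat the pass matrix as noisy evidence (features for a calibrated classifier) rather than ground truth.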

Details

Title: Code Generation With Large Language Models: Inductive Reasoning and Calibration
Author:
Number of pages: 256
Publication year: 2025
Degree date: 2025
School code: 0058
Source: DAI-B 87/3(E), Dissertation Abstracts International
ISBN: 9798293822805
Advisor:
Committee member: Legunsen, Owolabi; Sampson, Adrian
University/institution: Cornell University
Department: Computer Science
University location: United States -- New York
Degree: Ph.D.
Source type: Dissertation or Thesis
Language: English
Document type: Dissertation/Thesis
Dissertation/thesis number: 32040139
ProQuest document ID: 3248434448
Document URL: https://www.proquest.com/dissertations-theses/code-generation-with-large-language-models/docview/3248434448/se-2?accountid=208611
Copyright: Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database: ProQuest One Academic