It appears you don't have support to open PDFs in this web browser. To view this file, Open with your PDF reader
Abstract
In recent years, rapid advancements in large language models (LLMs) have steadily shifted their applications from simple chatbots to increasingly complex, autonomous agents. Agentic applications require LLMs to interact with a broad range of external information sources, tools, and environments to solve intricate tasks with minimal human oversight—posing significant challenges to their reliability. This dissertation presents a series of contributions toward (more) reliable agentic LLMs.
Firstly, we explore how LLMs can be made more robust when incorporating external references—an essential capability for many agentic applications. We introduce chain-of-defensive-thought, a simple yet effective technique that instructs LLMs to generate a chain of thought mimicking a structured reasoning process of cross-checking. This highly accessible approach significantly improves the robustness of a wide range of LLMs against reference corruption. Importantly, it highlights a promising direction: exploiting the reasoning abilities of LLMs for robustness on tasks that are not necessarily reasoning-centric, which is a timely insight given the growing interest in LLM reasoning and the increasing reliability demands of agentic applications.
Secondly, we examine the reliability of tool use in agentic LLMs. While external tools can dramatically extend the capabilities of LLMs, the current paradigm—where models choose tools based solely on text descriptions—proves fragile. We demonstrate how strategic edits to tool descriptions can substantially bias tool usage, revealing a vulnerability in standard tool/function-calling protocols. These findings underscore the need for a grounded mechanism for agentic LLMs to select and utilize tools and resources.
Finally, we address the reliability of LLM evaluations, particularly in the presence of test set contamination, where models may (knowingly or not) train on test data prior to evaluation. We propose DyePack, a novel framework that repurposes backdoor techniques into a principled mechanism for identifying such contamination. DyePack operates without requiring access to model internals and supports both multiple-choice and open-ended tasks. More importantly, it provides provable guarantees by enabling exact false positive rate (FPR) computation before flagging any model as contaminated—effectively preventing false accusations while offering strong evidence for every case detected. This positions DyePack as a powerful tool for maintaining the integrity of open benchmarks and safeguarding our pathway toward reliable agentic LLMs.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer