We are in a period of rapidly increasing machine intelligence. Reviewing recent progress, we conclude there is a clear possibility that machines will perform a dominant fraction of economic activity within the next few decades, reaching "high-level machine intelligence" (HLMI) or "artificial general intelligence" (AGI). This possibility demands a focus on safety.
We contribute to this effort in two parts. In the first part, we focus on "Normative NLP": designing Natural Language Processing systems that follow certain norms. We first recognize that, as capabilities increase, machines risk deceiving humans by pretending to be human ("deceptive anthropomorphism"). We present work on this topic published at prominent NLP venues. In the R-U-A-Robot dataset (2021), we collected over 2,500 phrasings of the intent "Are you a robot?". We show that popular systems of the time often failed to confirm their non-human identity even when explicitly asked, and we build machine learning classifiers and conduct a user study to improve this behavior. In addition, we contribute the Robots-Dont-Cry dataset (2022), which studies implicit deceptive anthropomorphism. We collect judgments on over 900 dialogue turns from popular datasets of the time, showing that many are not viewed as plausible utterances for a machine. This work has since been used by other scientists to study anthropomorphism and robust NLP classifiers.
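The abstract does not specify how the intent classifiers work; as a hypothetical illustration only (the labels, training phrasings, and Naive Bayes approach below are invented for this sketch, not taken from the thesis), recognizing "Are you a robot?" phrasings can be framed as a small text-classification problem:

```python
import math
from collections import Counter, defaultdict

# Invented toy phrasings for illustration; the real R-U-A-Robot dataset
# contains over 2,500 crowd-sourced phrasings of the intent.
TRAIN = [
    ("are you a robot", "ruar"),
    ("am i talking to a machine", "ruar"),
    ("is this a real person or a bot", "ruar"),
    ("are you human", "ruar"),
    ("what is the weather today", "other"),
    ("can you book me a flight", "other"),
    ("tell me a joke", "other"),
    ("where is the nearest bank", "other"),
]

def train(examples):
    """Fit per-label word counts for a multinomial Naive Bayes model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        words = text.split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab

def classify(text, word_counts, label_counts, vocab):
    """Return the label with the highest Laplace-smoothed log posterior."""
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TRAIN)
print(classify("are you a real person", *model))  # prints "ruar"
```

A real system would train on the full dataset and handle paraphrases far outside the training vocabulary, which is what motivates learned classifiers over keyword rules.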
In the second part, we focus on connecting Software Engineering research to AGI safety. We discuss traditional Software Engineering research problems and their connection to AGI safety (2023), then focus on two problems. First, we contribute techniques for estimating confidence in the correctness of code-generating models (2025). This work aims to help determine when machine outputs should be audited. In other work (2020), we study code summarization, characterizing datasets of the time and helping to improve the rigor of evaluation metrics. Faithful, high-quality summaries of complex machine output might also help manage a world in which machines produce vast amounts of it.
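The confidence-estimation techniques are not detailed in this abstract. One common family of signals, sketched below purely for illustration (the interface and the threshold are assumptions, not the thesis's method), is the model's own token probabilities: a length-normalized sequence probability can serve as a score for deciding whether a generated program deserves a human audit.

```python
import math

def sequence_confidence(token_logprobs):
    """Length-normalized probability: exp of the mean token log-probability.

    token_logprobs: per-token log-probabilities a generation model reports
    for its own output (an assumed interface, for illustration).
    Returns a value in (0, 1]; lower values suggest auditing the output.
    """
    if not token_logprobs:
        raise ValueError("need at least one token")
    return math.exp(sum(token_logprobs) / len(token_logprobs))

# A confident generation (log-probs near 0) vs. an uncertain one.
confident = sequence_confidence([-0.05, -0.10, -0.02])
uncertain = sequence_confidence([-1.5, -2.0, -0.9])
print(f"{confident:.3f} vs {uncertain:.3f}")  # prints "0.945 vs 0.231"

AUDIT_THRESHOLD = 0.5  # illustrative cutoff, not a recommended value
print("audit needed:", uncertain < AUDIT_THRESHOLD)  # prints "audit needed: True"
```

In practice such raw scores are often poorly calibrated, so a deployed pipeline would calibrate them against observed correctness before choosing an audit threshold.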
These contributions are presented toward the UC Davis PhD requirements, and add to the knowledge needed to build a better future within AI, NLP, and Software Engineering.