Discovering Language Model Behaviors with Model-Written Evaluations

Perez, Ethan; Ringer, Sam; Lukošiūtė, Kamilė; Nguyen, Karina; Chen, Edwin; et al.  arXiv.org, Dec 19, 2022.