Abstract

Statistical learning algorithms provide a generally-applicable framework to sidestep time-consuming experiments, or accurate physics-based modeling, but they introduce a further source of error on top of the intrinsic limitations of the experimental or theoretical setup. Uncertainty estimation is essential to quantify this error, and to make application of data-centric approaches more trustworthy. To ensure that uncertainty quantification is used widely, one should aim for algorithms that are accurate, but also easy to implement and apply. In particular, including uncertainty quantification on top of an existing architecture should be straightforward, and add minimal computational overhead. Furthermore, it should be easy to manipulate or combine multiple machine-learning predictions, propagating uncertainty over further modeling steps. We compare several well-established uncertainty quantification frameworks against these requirements, and propose a practical approach, which we dub direct propagation of shallow ensembles, that provides a good compromise between ease of use and accuracy. We present benchmarks for generic datasets, and an in-depth study of applications to the field of atomistic machine learning for chemistry and materials. These examples underscore the importance of using a formulation that allows propagating errors without making strong assumptions on the correlations between different predictions of the model.

Details

Title
Uncertainty quantification by direct propagation of shallow ensembles
Author
Kellner, Matthias 1   VIAFID ORCID Logo  ; Ceriotti, Michele 1   VIAFID ORCID Logo 

 Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne , 1015 Lausanne, Switzerland 
First page
035006
Publication year
2024
Publication date
Sep 2024
Publisher
IOP Publishing
e-ISSN
26322153
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3076284744
Copyright
© 2024 The Author(s). Published by IOP Publishing Ltd. This work is published under http://creativecommons.org/licenses/by/4.0 (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.