© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

This paper investigates the post-hoc calibration of confidence for “exploratory” machine learning classification problems. The difficulty in these problems stems from the continuing desire, when curating datasets, to push the boundaries of which categories have enough examples to generalize from, and from confusion regarding the validity of those categories. We argue that for such problems the “one-versus-all” approach (top-label calibration) must be used rather than the “calibrate-the-full-response-matrix” approach advocated elsewhere in the literature. We introduce and test four new algorithms designed to handle the idiosyncrasies of category-specific confidence estimation using only the test set and the final model. Chief among these methods is the use of kernel density ratios for confidence calibration, including a novel algorithm for choosing the bandwidth. We test our claims and explore the limits of calibration on a bioinformatics application (PhANNs) as well as the classic MNIST benchmark. Finally, our analysis argues that post-hoc calibration should always be performed, may be performed using only the test dataset, and should be sanity-checked visually.
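
To make the idea of a kernel density ratio concrete, here is a minimal sketch of top-label calibration for a single category. It is an illustration under stated assumptions, not the authors' algorithm: the function names (`gaussian_kde`, `calibrate_top_label`), the Gaussian kernel, and the fixed default bandwidth of 0.05 are all hypothetical choices, and the paper's own bandwidth-selection algorithm is not reproduced. The sketch assumes that, for test examples assigned to one category, we have the model's raw confidence score and a flag saying whether the prediction was correct.

```python
import numpy as np


def gaussian_kde(samples, x, bandwidth):
    """Gaussian kernel density estimate of `samples`, evaluated at points `x`."""
    samples = np.asarray(samples, dtype=float)
    x = np.asarray(x, dtype=float)
    z = (x[:, None] - samples[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm


def calibrate_top_label(scores, correct, query, bandwidth=0.05):
    """Estimate P(correct | score) for one predicted category.

    scores    : raw model confidences for test examples predicted as this category
    correct   : booleans, True where that prediction was right
    query     : raw confidences to calibrate
    bandwidth : kernel width (a fixed value here, purely an assumption)
    """
    scores = np.asarray(scores, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    query = np.atleast_1d(np.asarray(query, dtype=float))

    n_pos = int(correct.sum())
    n_neg = int((~correct).sum())
    prior = n_pos / len(scores)
    if n_pos == 0 or n_neg == 0:
        # Only one outcome observed: fall back to the empirical accuracy.
        return np.full(query.shape, prior)

    # Class-weighted densities of the score among correct and incorrect predictions.
    pos = n_pos * gaussian_kde(scores[correct], query, bandwidth)
    neg = n_neg * gaussian_kde(scores[~correct], query, bandwidth)
    denom = pos + neg

    # Where both densities underflow to zero, fall back to the prior.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, pos / safe, prior)


if __name__ == "__main__":
    raw = [0.90, 0.95, 0.85, 0.30, 0.40]
    hit = [True, True, True, False, False]
    print(calibrate_top_label(raw, hit, [0.90, 0.35]))
```

In this toy data the correct predictions cluster at high scores and the incorrect ones at low scores, so a query near 0.9 is mapped close to 1 and a query near 0.35 close to 0. Plotting the calibrated value against the raw score is a simple way to carry out the visual sanity check the abstract recommends.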

Details

Title
Classification Confidence in Exploratory Learning: A User’s Guide
Author
Salamon, Peter 1; Salamon, David 1; Cantu, V Adrian 2; An, Michelle 3; Perry, Tyler 2; Edwards, Robert A 4; Segall, Anca M 5

1 Department of Mathematics, San Diego State University, San Diego, CA 92182, USA; [email protected]
2 Computational Science Research Center, San Diego State University, San Diego, CA 92182, USA; [email protected] (V.A.C.); [email protected] (T.P.)
3 Bioinformatics and Medical Informatics Program, San Diego State University, San Diego, CA 92182, USA; [email protected]
4 Flinders Accelerator for Microbiome Exploration, Flinders University, Flinders, Adelaide, SA 5001, Australia; [email protected]
5 Department of Biology, San Diego State University, San Diego, CA 92182, USA; [email protected]
First page
803
Publication year
2023
Publication date
2023
Publisher
MDPI AG
e-ISSN
2504-4990
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2869416132