Abstract

Machine learning promises to revolutionize clinical decision making and diagnosis. In medical diagnosis a doctor aims to explain a patient’s symptoms by determining the diseases causing them. However, existing machine learning approaches to diagnosis are purely associative, identifying diseases that are strongly correlated with a patient’s symptoms. We show that this inability to disentangle correlation from causation can result in sub-optimal or dangerous diagnoses. To overcome this, we reformulate diagnosis as a counterfactual inference task and derive counterfactual diagnostic algorithms. We compare our counterfactual algorithms to the standard associative algorithm and 44 doctors using a test set of clinical vignettes. While the associative algorithm achieves an accuracy placing in the top 48% of doctors in our cohort, our counterfactual algorithm places in the top 25% of doctors, achieving expert clinical accuracy. Our results show that causal reasoning is a vital missing ingredient for applying machine learning to medical diagnosis.

In medical diagnosis a doctor aims to explain a patient’s symptoms by determining the diseases causing them, while existing diagnostic algorithms are purely associative. Here, the authors reformulate diagnosis as a counterfactual inference task and derive new counterfactual diagnostic algorithms.

Details

Title
Improving the accuracy of medical diagnosis with causal machine learning
Author
Richens, Jonathan G 1; Lee, Ciarán M 2; Johri, Saurabh 3

1 Babylon Health, 60 Sloane Ave, Chelsea, London, UK
2 Babylon Health, 60 Sloane Ave, Chelsea, London, UK; University College London, Gower St, Bloomsbury, London, UK (GRID:grid.83440.3b) (ISNI:0000000121901201)
3 Babylon Health, 60 Sloane Ave, Chelsea, London, UK (GRID:grid.83440.3b)
Publication year
2020
Publication date
2020
Publisher
Nature Publishing Group
e-ISSN
2041-1723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2432684307
Copyright
© The Author(s) 2020. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.