Comparison of Post-hoc Calibration Methods for Neural Network Likelihood Scores
| Contribution to a conference proceedings/Contribution to a book | DZNE-2026-00661 |
2026
Springer Fachmedien Wiesbaden
Wiesbaden
ISBN: 978-3-658-51099-2 (print), 978-3-658-51100-5 (electronic)
Please use a persistent id in citations: doi:10.1007/978-3-658-51100-5_87
Abstract: This study investigates how to improve the reliability of probability estimates produced by deep learning models for detecting Alzheimer’s disease from MRI data. Although convolutional neural networks (CNNs) can accurately classify neurodegenerative diseases, their softmax outputs often misrepresent the true class probabilities. We evaluated four post-hoc calibration methods, two parametric (logistic and probit regression) and two non-parametric (isotonic regression and Bayesian binning into quantiles), on data from 474 participants. All four methods noticeably improved the CNN’s calibration without reducing accuracy. The non-parametric methods achieved the best calibration (expected calibration error ≈ 0.014 and maximum calibration error ≈ 0.025). These findings suggest that non-parametric calibration provides more reliable and clinically useful probability estimates.
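For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how expected and maximum calibration error are commonly computed for a binary classifier. It is not the authors' implementation: the function name `calibration_errors`, the use of ten equal-width bins, and the toy inputs are all assumptions made for illustration, and the record does not state which binning scheme the study used.

```python
def calibration_errors(probs, labels, n_bins=10):
    """Return (ECE, MCE) for binary predictions.

    probs  : predicted probabilities of the positive class, each in [0, 1]
    labels : true labels, 0 or 1
    n_bins : number of equal-width probability bins (an assumed default)
    """
    # Group each prediction into an equal-width bin by its probability.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    n = len(probs)
    ece, mce = 0.0, 0.0
    for b in bins:
        if not b:
            continue  # empty bins contribute nothing
        mean_conf = sum(p for p, _ in b) / len(b)  # average predicted probability
        frac_pos = sum(y for _, y in b) / len(b)   # observed positive rate
        gap = abs(mean_conf - frac_pos)
        ece += len(b) / n * gap  # gap weighted by the bin's share of samples
        mce = max(mce, gap)      # largest gap over non-empty bins
    return ece, mce


# Toy example (hypothetical values, not study data).
ece, mce = calibration_errors([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1])
```

Under this definition, a perfectly calibrated model has both errors equal to zero, so values such as those reported in the abstract would indicate that predicted probabilities deviate from observed frequencies by only a few percentage points.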