Aleatoric and Epistemic Uncertainty in Statistics and Machine Learning and its use towards Explainable AI
The tutorial has a separate website, please check it here
The notion of uncertainty is of major importance in machine learning and constitutes a key element of modern machine learning methodology. In recent years, it has gained in importance due to the increasing relevance of machine learning for practical applications, many of which come with safety requirements. In this regard, machine learning scholars have identified new problems and challenges, many of which call for novel methodological developments. Indeed, while uncertainty has a long tradition in statistics, and many useful concepts for representing and quantifying uncertainty have been developed on the basis of probability theory, recent research has gone beyond traditional approaches and also leverages more general formalisms and uncertainty calculi.
In the statistics literature, two inherently different sources of uncertainty are commonly distinguished, referred to as aleatoric and epistemic uncertainty.
Obviously, the learner does not have full knowledge of the true conditional probability \(p(\cdot \mid x)\). Instead, it produces a “guess” \(\hat{p}(\cdot \mid x)\) on the basis of the sample data provided for training. Broadly speaking, epistemic uncertainty is uncertainty about the true probability, and hence about the discrepancy between \(p\) and \(\hat{p}\). This (second-order) uncertainty can be captured and represented in different ways. One approach is to train a probabilistic predictor in a more or less standard way, and to quantify the uncertainty of that predictor post hoc. An example is the estimation of epistemic uncertainty in terms of the mutual information between the target variable and the model parameters.
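To make the mutual-information idea concrete, here is a minimal sketch for a single input \(x\) in a classification setting. It assumes that the uncertainty about the model parameters is approximated by a small ensemble (or a set of posterior samples), each of which returns a class-probability vector. The function names `entropy` and `decompose_uncertainty` and the toy numbers are illustrative assumptions, not part of the tutorial. Under this approximation, the mutual information equals the entropy of the averaged prediction minus the average entropy of the individual predictions.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(member_probs):
    """Entropy-based uncertainty decomposition for one input x.

    member_probs: list of class-probability vectors, one per ensemble
    member (hypothetical stand-in for samples of the model parameters).
    Returns (total, aleatoric, epistemic), where epistemic is the
    estimated mutual information between target and parameters.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    # Averaged (first-order) prediction across members.
    mean_probs = [
        sum(m[k] for m in member_probs) / n_members for k in range(n_classes)
    ]
    total = entropy(mean_probs)                                   # total uncertainty
    aleatoric = sum(entropy(m) for m in member_probs) / n_members  # expected entropy
    epistemic = total - aleatoric                                  # mutual information
    return total, aleatoric, epistemic


# Toy cases (made-up numbers): members agree on a coin flip vs. disagree confidently.
agree = [[0.5, 0.5], [0.5, 0.5]]
disagree = [[1.0, 0.0], [0.0, 1.0]]
print(decompose_uncertainty(agree))
print(decompose_uncertainty(disagree))
```

In the first toy case all uncertainty is attributed to the aleatoric part, because every member predicts the same coin flip; in the second, the members are individually certain but contradict each other, so the uncertainty is attributed to the epistemic part.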
Another idea is to estimate uncertainty in a more direct way, and to let the learner itself predict not only the target variable, but also its own uncertainty about the prediction.
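One possible way to realize such direct prediction, shown here only as an illustration and not necessarily the approach the tutorial has in mind, is to let a classifier output the parameters of a Dirichlet distribution over class probabilities, as done in evidential-style methods. The sketch below assumes a hypothetical model that emits a non-negative "evidence" value per class; the function name `dirichlet_prediction` and the example numbers are assumptions made for this illustration.

```python
def dirichlet_prediction(evidence):
    """Turn per-class evidence into a prediction plus an uncertainty score.

    evidence: non-negative numbers, one per class (assumed to be the
    output of some hypothetical second-order learner).
    Returns (expected class probabilities, uncertainty mass in (0, 1]).
    """
    alpha = [e + 1.0 for e in evidence]       # Dirichlet concentration parameters
    strength = sum(alpha)                     # total concentration
    probs = [a / strength for a in alpha]     # mean of the Dirichlet distribution
    uncertainty_mass = len(alpha) / strength  # large when little evidence was seen
    return probs, uncertainty_mass


# No evidence at all: uniform prediction, maximal uncertainty mass.
print(dirichlet_prediction([0.0, 0.0, 0.0]))
# Strong evidence for the first class: confident prediction, small uncertainty mass.
print(dirichlet_prediction([97.0, 0.0, 0.0]))
```

The point of the design is that the learner returns two things at once: a first-order prediction (the expected probabilities) and a score that shrinks as the evidence supporting that prediction grows.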