In-Depth Overview of UNIQUE
UNIQUE implements various input types, UQ methods, and evaluation metrics, enabling end-to-end uncertainty quantification benchmarking.
Each input type object corresponds to a particular real-world input in the user’s data; each UQ method either consists directly of, or is derived from, the input type values; and each UQ method is associated with one or more evaluation benchmarks and their corresponding metrics.
Input Types
The above schema shows a detailed map of UNIQUE’s workflow and objects. From a user’s input dataset, UNIQUE abstracts two different input type objects: data-based (or features-based) and model-based input types, which represent the input values needed to estimate and quantify the uncertainty in the model’s predictions.
See also
Check out Input Type Specification for more details about data- vs model-based input types.
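As a rough illustration of the two input types (this is not UNIQUE’s actual API, and all column names below are made up), data-based inputs are the features the original model was trained on, while model-based inputs are quantities produced by the model itself:

```python
# Illustrative sketch only -- not UNIQUE code. Shows the two kinds of inputs
# described above as columns of a single table (column names are assumptions).
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 5

df = pd.DataFrame({
    # Data-based (features-based) inputs: descriptors the model was trained on.
    "feature_1": rng.normal(size=n),
    "feature_2": rng.normal(size=n),
    # Model-based inputs: quantities produced by the model itself,
    # e.g. the mean and variance of an ensemble of predictions.
    "ensemble_mean": rng.normal(size=n),
    "ensemble_variance": rng.gamma(shape=2.0, scale=0.1, size=n),
    # Ground-truth labels, needed later to evaluate the UQ methods.
    "label": rng.normal(size=n),
})
print(df)
```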
UQ Methods
Each input type object can be used either directly as a representation of the model’s uncertainty or to compute a UQ proxy via the associated UQ methods.
These methods can either directly derive the UQ estimates from the input data (base UQ methods), or combine several base UQ methods to generate more complex and holistic measures of uncertainty (transformed UQ methods).
UNIQUE distinguishes between “base” UQ methods and “transformed” UQ methods:
| Base UQ Methods | Transformed UQ Methods |
|---|---|
| UQ methods directly computable from the input data and/or original model (e.g., …) | Combinations of base UQ methods (e.g., …) |
See also
Check out Available Inputs & UQ Methods for more details about available UQ methods.
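As a conceptual sketch of the base vs. transformed distinction (plain NumPy/scikit-learn, not UNIQUE code), ensemble variance and distance to the training set stand in for base UQ methods, and a min-max-normalized sum stands in for a transformed one; the combination rule shown is just one possible choice:

```python
# Conceptual sketch of base vs. transformed UQ methods -- not UNIQUE's implementation.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 8))        # training features (data-based input)
X_test = rng.normal(size=(20, 8))          # features of new samples
ensemble_preds = rng.normal(size=(5, 20))  # 5 ensemble members' predictions (model-based input)

# Base UQ method 1: ensemble variance, derived directly from a model-based input.
ensemble_variance = ensemble_preds.var(axis=0)

# Base UQ method 2: mean distance to the k nearest training points,
# derived directly from a data-based input.
knn = NearestNeighbors(n_neighbors=5).fit(X_train)
distances, _ = knn.kneighbors(X_test)
mean_knn_distance = distances.mean(axis=1)

# Transformed UQ method: combine several base methods into a single score,
# here by summing their min-max-normalized values.
def minmax(x):
    return (x - x.min()) / (x.max() - x.min() + 1e-12)

combined_uq = minmax(ensemble_variance) + minmax(mean_knn_distance)
```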
Error Models
Error models are a novel way to measure uncertainty and an example of a transformed UQ method: they combine several input features and base UQ methods to predict the error of the original model’s predictions, and the predicted error itself serves as a UQ proxy.
See also
Check out Error Models and Available Inputs & UQ Methods for more details on error models.
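A hedged sketch of the idea, using scikit-learn rather than UNIQUE’s own implementation (the data, the choice of regressor, and the input layout are illustrative assumptions): a second model is fit to the absolute errors of the original model, and its predicted errors are used as the UQ proxy.

```python
# Error-model sketch -- not UNIQUE's implementation.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                  # input features
y_true = X[:, 0] + 0.1 * rng.normal(size=200)  # targets
y_pred = X[:, 0] + 0.3 * rng.normal(size=200)  # original model's predictions (simulated here)
base_uq = np.abs(rng.normal(size=(200, 2)))    # base UQ values (e.g. variance, distance)

# The error model takes input features and/or base UQ methods as inputs
# and learns to predict the absolute error of the original model.
error_inputs = np.hstack([X, base_uq])
abs_error = np.abs(y_true - y_pred)

error_model = GradientBoostingRegressor().fit(error_inputs, abs_error)
predicted_error = error_model.predict(error_inputs)  # used as the UQ proxy
```

In practice, such an error model would be fit and evaluated on held-out data (e.g. via cross-validation) to avoid optimistic error estimates.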
Evaluation Benchmarks
Lastly, each UQ method can be evaluated against three different evaluation benchmarks: ranking-based, calibration-based, and proper scoring rules evaluations.
Each of these encompasses multiple evaluation metrics: established scores, concepts, and functions that assess the quality of the UQ methods with respect to the original data and model.
UNIQUE then generates a comprehensive report of all the UQ methods (base and transformed) across the different evaluation benchmarks, highlighting the best-performing UQ method for each benchmark (according to a selected scoring function).
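The following sketch hints at what each benchmark family measures, using generic NumPy/SciPy code rather than UNIQUE’s own metrics (the exact metrics and scoring functions in the report may differ): Spearman rank correlation for ranking, empirical coverage of a nominal 95% interval for calibration, and a Gaussian negative log-likelihood as a proper scoring rule.

```python
# Generic examples of the three evaluation families -- not UNIQUE's metric implementations.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
abs_error = np.abs(rng.normal(size=100))     # absolute errors of the original model
uq = abs_error + 0.5 * rng.normal(size=100)  # some UQ estimate to evaluate
variance = np.clip(uq, 1e-6, None) ** 2      # interpret the UQ as a predictive std -> variance

# Ranking-based: does the UQ rank high-error predictions first?
rank_score, _ = spearmanr(uq, abs_error)

# Calibration-based: does a nominal 95% interval (~1.96 std) cover ~95% of the errors?
coverage_95 = np.mean(abs_error <= 1.96 * np.sqrt(variance))

# Proper scoring rule: Gaussian negative log-likelihood of the observed residuals
# under the predicted variance (lower is better).
nll = 0.5 * np.mean(np.log(2 * np.pi * variance) + abs_error**2 / variance)

print(rank_score, coverage_95, nll)
```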
See also
Check out Evaluation Benchmarks for more details about evaluation benchmarks and corresponding metrics.