Test UNIQUE with a Random Forest regressor on the California Housing dataset

In this notebook, we showcase how to use UNIQUE to assess uncertainty quantification (UQ) methods for a random forest (RF) regressor and a multilayer perceptron (MLP), both trained on the California Housing dataset provided by the scikit-learn package.

As model-based UQ methods, we derive the variance of the predictions across the 8 trees of the RF and use Monte Carlo dropout for the MLP.
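
A minimal sketch of how these two model-based UQ estimates can be computed is shown below. It assumes a fitted scikit-learn `RandomForestRegressor` and a PyTorch model containing `nn.Dropout` layers; the actual code used in this example lives in the accompanying `preparation.py` module, and the helper names here are illustrative.

```python
# Illustrative sketch (the notebook's actual logic is in `preparation.py`):
# per-sample variance across the individual trees of a fitted RandomForestRegressor,
# and Monte Carlo dropout variance for a torch model with nn.Dropout layers.
import numpy as np
import torch


def rf_prediction_variance(rf, X):
    # Stack each tree's predictions -> shape (n_trees, n_samples), then take the per-sample variance.
    per_tree = np.stack([tree.predict(X) for tree in rf.estimators_])
    return per_tree.var(axis=0)


def mc_dropout_variance(model, X, n_samples=100):
    # Keep dropout active at inference time by leaving the model in train() mode,
    # then take the variance over repeated stochastic forward passes.
    model.train()
    with torch.no_grad():
        draws = torch.stack([model(X).squeeze() for _ in range(n_samples)])
    return draws.var(dim=0)
```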

The UNIQUE pipeline has been configured with the following parameters:

  • Regression Task

  • UQ metrics:

    • Ensemble Variance (from the property model output)

    • ManhattanDistance

    • EuclideanDistance

  • Error models (a conceptual sketch follows this list):

    • UniqueRandomForestRegressor

    • UniqueLASSO
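
The error models are auxiliary models that UNIQUE trains to predict the primary model's error from combinations of the input features, the computed UQ metrics, and the predictions. The snippet below is only a conceptual illustration with plain scikit-learn (it is not UNIQUE's `UniqueRandomForestRegressor`/`UniqueLASSO` implementation, and `fit_error_model` is a hypothetical helper):

```python
# Conceptual illustration of an "error model" (not UNIQUE's implementation):
# an auxiliary regressor trained to predict the primary model's absolute (L1) error
# from the input features, the UQ metrics, and the predictions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def fit_error_model(features, uq_metrics, predictions, labels):
    X_err = np.column_stack([features, uq_metrics, predictions])
    y_err = np.abs(labels - predictions)  # L1 error target
    return RandomForestRegressor(random_state=0).fit(X_err, y_err)
```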

UNIQUE Input data generation

import json
import os
import yaml
from pathlib import Path

import numpy as np
import pandas as pd

# Install torch to prepare the California Housing data
try:
    import torch
except ImportError:
    # Replace with conda if mamba not available
    # %mamba install pytorch::pytorch -y
    %pip install torch --index-url https://download.pytorch.org/whl/cpu

from unique import Pipeline

# Set the project's directory
PROJECT_PATH = os.environ.get("PROJECT_PATH", os.path.abspath("")) # ALTERNATIVELY, REPLACE `os.path.abspath("")` WITH YOUR PATH TO THE SYNTHETIC EXAMPLE FOLDER
%cd $PROJECT_PATH

from preparation import SyntheticDataExamplePreparation
Looking in indexes: https://download.pytorch.org/whl/cpu
Collecting torch
  Downloading https://download.pytorch.org/whl/cpu/torch-2.4.1%2Bcpu-cp312-cp312-linux_x86_64.whl (194.8 MB)
Requirement already satisfied: filelock in /home/runner/work/UNIQUE/UNIQUE/.conda/unique/lib/python3.12/site-packages (from torch) (3.16.1)
Requirement already satisfied: typing-extensions>=4.8.0 in /home/runner/work/UNIQUE/UNIQUE/.conda/unique/lib/python3.12/site-packages (from torch) (4.12.2)
Collecting sympy (from torch)
  Downloading https://download.pytorch.org/whl/sympy-1.12-py3-none-any.whl (5.7 MB)
Collecting networkx (from torch)
  Downloading https://download.pytorch.org/whl/networkx-3.2.1-py3-none-any.whl (1.6 MB)
Requirement already satisfied: jinja2 in /home/runner/work/UNIQUE/UNIQUE/.conda/unique/lib/python3.12/site-packages (from torch) (3.1.3)
Collecting fsspec (from torch)
  Downloading https://download.pytorch.org/whl/fsspec-2024.2.0-py3-none-any.whl (170 kB)
Requirement already satisfied: setuptools in /home/runner/work/UNIQUE/UNIQUE/.conda/unique/lib/python3.12/site-packages (from torch) (75.1.0)
Requirement already satisfied: MarkupSafe>=2.0 in /home/runner/work/UNIQUE/UNIQUE/.conda/unique/lib/python3.12/site-packages (from jinja2->torch) (2.1.5)
Collecting mpmath>=0.19 (from sympy->torch)
  Downloading https://download.pytorch.org/whl/mpmath-1.3.0-py3-none-any.whl (536 kB)
Installing collected packages: mpmath, sympy, networkx, fsspec, torch
Successfully installed fsspec-2024.2.0 mpmath-1.3.0 networkx-3.2.1 sympy-1.12 torch-2.4.1+cpu
Note: you may need to restart the kernel to use updated packages.
/home/runner/work/UNIQUE/UNIQUE/notebooks/california_housing
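
For reference, `SyntheticDataExamplePreparation` works with the scikit-learn California Housing data; a minimal sketch of that kind of loading and splitting step is shown below. The split ratio is an assumption for illustration only, and the actual preprocessing lives in `preparation.py`.

```python
# Illustrative sketch of loading and splitting the California Housing data
# (the real preparation logic lives in `preparation.py`; the 70/15/15 split is assumed).
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

data = fetch_california_housing()
X, y = data.data, data.target  # 20640 samples, 8 numeric features; target = median house value

X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)
```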
unique_input_data_path = f'{PROJECT_PATH}/unique_input_data.json'

s = SyntheticDataExamplePreparation()

if Path(unique_input_data_path).is_file():
    print('UNIQUE input data already generated.')
else:
    train_indices, val_indices, test_indices, \
    X_train, X_val, X_test, y_train, y_val, y_test, \
    rf_all_predictions, rf_variances, \
    mlp_all_predictions, mlp_variances = s.run()
    
    unique_dict = {
        'ID': np.concatenate([train_indices, val_indices, test_indices]).tolist(),
        'feature': [X_train[i, :].tolist() for i in range(X_train.shape[0])] +
                    [X_val[i, :].tolist() for i in range(X_val.shape[0])] +
                    [X_test[i, :].tolist() for i in range(X_test.shape[0])],
        'label': np.concatenate([y_train, y_val, y_test]).tolist(),
        'which_set': ['TRAIN'] * len(train_indices) +
                    ['CALIBRATION'] * len(val_indices) +
                    ['TEST'] * len(test_indices),
        'rf_predictions': rf_all_predictions.tolist(),
        'rf_variances': rf_variances,
        'mlp_predictions': mlp_all_predictions.detach().squeeze().numpy().tolist(),
        'mlp_variances': mlp_variances.tolist()
    }


    with open(unique_input_data_path, 'w') as f:
        json.dump(unique_dict, f)
RF training...
Random Forest Test performance:
MAE: 0.3653
RMSE: 0.5515
R2: 0.7729

MLP training...
Epoch: 0 | train_loss: 1.219 | val_loss: 0.450
Epoch: 1 | train_loss: 0.674 | val_loss: 0.413
Epoch: 2 | train_loss: 0.611 | val_loss: 0.378
Epoch: 3 | train_loss: 0.582 | val_loss: 0.391
Epoch: 4 | train_loss: 0.539 | val_loss: 0.357
Epoch: 5 | train_loss: 0.518 | val_loss: 0.367
Epoch: 6 | train_loss: 0.494 | val_loss: 0.364
Epoch: 7 | train_loss: 0.477 | val_loss: 0.338
Epoch: 8 | train_loss: 0.456 | val_loss: 0.330
Epoch: 9 | train_loss: 0.449 | val_loss: 0.325
MLP Test performance:
MAE: 0.3879
RMSE: 0.5503
R2: 0.7739

Collecting RF variance...
Dropout Monte Carlo...: 100%|██████████| 100/100 [00:01<00:00, 85.99it/s]
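
Optionally, the generated input file can be reloaded to sanity-check the columns and split sizes before running the pipeline (a quick check, not part of UNIQUE itself):

```python
# Optional sanity check (not part of UNIQUE): reload the generated input file
# and inspect the expected columns and split sizes.
with open(unique_input_data_path) as f:
    check_df = pd.DataFrame(json.load(f))

print(check_df.columns.tolist())             # ID, feature, label, which_set, rf_*/mlp_* columns
print(check_df["which_set"].value_counts())  # TRAIN / CALIBRATION / TEST counts
```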

UNIQUE Pipeline

To evaluate the UQ methods of interest, together with the additional UQ methods generated by the UNIQUE pipeline itself, run the pipeline's fit() method. This assesses their performance using three main UQ evaluation types: Ranking, Proper scoring rules, and Calibration curves.

The summary tables report, for each UQ method, scores on a set of UQ evaluation metrics representative of each evaluation type. The UQ method with the highest score is highlighted in green as the best-performing method.

Following the summary tables, you will find individual plots showcasing the performance of the best UQ methods.

Additionally, you can explore the summary plots generated for all the evaluated UQ methods, providing a comprehensive overview of their performance.
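
As a rough intuition for the ranking-based evaluation (a toy illustration, not UNIQUE's implementation), an informative UQ estimate should rank samples similarly to the true absolute error:

```python
# Toy illustration of ranking-based evaluation (not UNIQUE's implementation):
# an informative UQ estimate ranks samples similarly to the true absolute error.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
true_error = rng.gamma(shape=2.0, scale=0.5, size=1000)  # pretend per-sample absolute errors
good_uq = true_error + rng.normal(0.0, 0.2, size=1000)   # correlated with the error
bad_uq = rng.random(1000)                                 # uninformative

rho_good, _ = spearmanr(good_uq, true_error)
rho_bad, _ = spearmanr(bad_uq, true_error)
print(f"Spearman (good UQ): {rho_good:.2f} | Spearman (bad UQ): {rho_bad:.2f}")
```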

def overwrite_paths(yaml_file: str, project_path: str, input_data_file: str = "unique_input_data.json"):
	"""Given a yaml UNIQUE config file, overwrite the `data_path` and `output_path` fields."""
	# Use ruamel.yaml to preserve comments
	from ruamel.yaml import YAML
	yaml = YAML()

	# Read
	with open(yaml_file, "r") as f:
		# If you want the equivalent of yaml.safe_load use `typ="safe"`
		config = yaml.load(f) # defaults to `typ="rt"` (round-trip) argument. 

	# Overwrite
	config["data_path"] = os.path.join(project_path, input_data_file)
	config["output_path"] = os.path.join(project_path, "output")

	# Save
	with open(yaml_file, "w") as f:
		yaml.dump(config, f)
config_file = f'{PROJECT_PATH}/config_mlp.yaml'

# Replace `data_path` and `output_path` to be able to run the notebook automatically
overwrite_paths(config_file, PROJECT_PATH) # COMMENT TO DISABLE OVERWRITING

pipeline = Pipeline.from_config(config_file)

# Compute UQ metrics, train error models (if any), evaluate UQ metrics
output, eval_results = pipeline.fit()
[2024-10-04 18:15:00] | [UNIQUE - INFO]: ************************ UNIQUE - INITIALIZING PIPELINE ************************
[2024-10-04 18:15:00] | [UNIQUE - INFO]: Loaded Pipeline configuration from: 'config_mlp.yaml'
[2024-10-04 18:15:00] | [UNIQUE - INFO]: Loading data from 'unique_input_data.json'...
[2024-10-04 18:15:00] | [UNIQUE - INFO]: Dataset with 20640 entries correctly loaded.
[2024-10-04 18:15:00] | [UNIQUE - INFO]: UQ inputs initialization...
[2024-10-04 18:15:00] | [UNIQUE - INFO]: UQ inputs summary: 
                                             1. [Data-Based Feature] Column: 'feature' | UQ methods to compute: Manhattan Distance, Euclidean Distance
                                             2. [Model-Based Feature] Column: 'mlp_variances' | UQ methods to compute: Ensemble Variance
[2024-10-04 18:15:00] | [UNIQUE - INFO]: Selected error model(s): 
                                             1. UniqueRandomForestRegressor
                                             2. UniqueLASSO
[2024-10-04 18:15:00] | [UNIQUE - INFO]: ************************ UNIQUE - COMPUTING UQ METHODS *************************
[2024-10-04 18:15:00] | [UNIQUE - INFO]: Computing UQ methods for provided inputs...
[2024-10-04 18:15:00] | [UNIQUE - INFO]: Combining 'base' UQ methods and computing 'transformed' UQ methods...
[2024-10-04 18:15:01] | [UNIQUE - INFO]: Initializing error models...
[2024-10-04 18:15:01] | [UNIQUE - INFO]: Preparing error models inputs...
[2024-10-04 18:15:01] | [UNIQUE - INFO]: Training error models...
[2024-10-04 18:15:06] | [UNIQUE - INFO]: Collected and computed 16 UQ methods.
[2024-10-04 18:15:06] | [UNIQUE - INFO]: Note: UQ method 'SumOfVariancesAndDistances' summed the input variance(s) and the following distances (converted to variances):
                                              1. Dist2Var[EuclideanDistance[feature]]
[2024-10-04 18:15:06] | [UNIQUE - INFO]: ************************ UNIQUE - EVALUATING UQ METHODS ************************
[2024-10-04 18:15:06] | [UNIQUE - INFO]: Evaluating and benchmarking 16 UQ methods by bootstrapping (n=500) on the test set...
[2024-10-04 18:18:52] | [UNIQUE - INFO]: Evaluated 5 UQ methods out of 16...
[2024-10-04 18:23:35] | [UNIQUE - INFO]: Evaluated 10 UQ methods out of 16...
[2024-10-04 18:29:21] | [UNIQUE - INFO]: Evaluated 15 UQ methods out of 16...
[2024-10-04 18:30:30] | [UNIQUE - INFO]: Evaluated 16 UQ methods out of 16.
[2024-10-04 18:30:31] | [UNIQUE - INFO]: Generating summary tables...
[2024-10-04 18:30:31] | [UNIQUE - INFO]: Summary evaluation tables saved to: .../california_housing/output/summary.
[2024-10-04 18:30:42] | [UNIQUE - INFO]: Generating summary plots...
[2024-10-04 18:30:56] | [UNIQUE - INFO]: Summary plots saved to: .../california_housing/output/summary.
[2024-10-04 18:30:56] | [UNIQUE - INFO]: Summary of best UQ method for each UQ evaluation type:
                                             1. [TEST] RankingBasedEvaluation: UniqueRandomForestRegressor[feature+UQmetrics+predictions](l1)
                                             2. [TEST] CalibrationBasedEvaluation: Dist2Var[EuclideanDistance[feature]]
                                             3. [TEST] ProperScoringRulesEvaluation: Dist2Var[ManhattanDistance[feature]]
[2024-10-04 18:30:56] | [UNIQUE - INFO]: ********************************* UNIQUE - END *********************************
[2024-10-04 18:30:56] | [UNIQUE - INFO]: Time elapsed: 00h:15m:56s
| UQ Method | Subset | AUC Difference: UQ vs. True Error | Spearman Correlation | Decreasing Coefficient | Performance Drop: High UQ vs. Low UQ (3-Bins) | Increasing Coefficient | Performance Drop: All vs. Low UQ (10-Bins) | Performance Drop: High UQ vs. Low UQ (10-Bins) | Performance Drop: All vs. Low UQ (3-Bins) |
|---|---|---|---|---|---|---|---|---|---|
| ManhattanDistance[feature] | TEST | 4.342 | -0.147 | 2.616 | 1.011 | 2.706 | 0.986 | 1.044 | 0.997 |
| EuclideanDistance[feature] | TEST | 4.221 | -0.122 | 3.304 | 1.017 | 2.273 | 0.991 | 1.029 | 1.002 |
| EnsembleVariance[mlp_variances] | TEST | 7.595 | -0.593 | 0.272 | 0.859 | 4.468 | 0.832 | 0.809 | 0.925 |
| Diff5NN[ManhattanDistance[feature], EnsembleVariance[mlp_variances]] | TEST | 6.194 | -0.375 | 0.110 | 0.902 | 4.455 | 0.936 | 0.898 | 0.952 |
| Diff5NN[ManhattanDistance[feature], predictions] | TEST | 3.778 | 0.075 | 4.432 | 1.034 | 0.825 | 1.015 | 1.092 | 1.012 |
| Diff5NN[EuclideanDistance[feature], EnsembleVariance[mlp_variances]] | TEST | 6.149 | -0.369 | 0.102 | 0.904 | 4.432 | 0.940 | 0.899 | 0.953 |
| Diff5NN[EuclideanDistance[feature], predictions] | TEST | 3.776 | 0.080 | 4.460 | 1.038 | 0.674 | 1.014 | 1.092 | 1.015 |
| Dist2Var[ManhattanDistance[feature]] | TEST | 4.342 | -0.147 | 2.616 | 1.011 | 2.706 | 0.986 | 1.044 | 0.997 |
| Dist2Var[EuclideanDistance[feature]] | TEST | 4.221 | -0.122 | 3.304 | 1.017 | 2.273 | 0.991 | 1.029 | 1.002 |
| SumOfVariances[Dist2Var[EuclideanDistance[feature]]] | TEST | 7.557 | -0.597 | 0.272 | 0.857 | 4.468 | 0.851 | 0.828 | 0.924 |
| UniqueRandomForestRegressor[feature+UQmetrics+predictions](l1) | TEST | 0.023 | 0.992 | 4.468 | 1.268 | 0.000 | 1.195 | 1.539 | 1.126 |
| UniqueRandomForestRegressor[UQmetrics+predictions](l1) | TEST | 0.052 | 0.984 | 4.468 | 1.266 | 0.000 | 1.194 | 1.536 | 1.125 |
| UniqueRandomForestRegressor[transformedUQmetrics+predictions](l1) | TEST | 0.051 | 0.984 | 4.468 | 1.266 | 0.000 | 1.194 | 1.536 | 1.125 |
| UniqueLASSO[feature+UQmetrics+predictions](l1) | TEST | 0.031 | 0.990 | 4.468 | 1.268 | 0.000 | 1.195 | 1.538 | 1.126 |
| UniqueLASSO[UQmetrics+predictions](l1) | TEST | 0.061 | 0.981 | 4.468 | 1.266 | 0.000 | 1.193 | 1.534 | 1.125 |
| UniqueLASSO[transformedUQmetrics+predictions](l1) | TEST | 0.081 | 0.976 | 4.468 | 1.264 | 0.000 | 1.192 | 1.531 | 1.124 |
| UQ Method | Subset | MACE | RMSCE |
|---|---|---|---|
| EnsembleVariance[mlp_variances] | TEST | 0.381 | 0.476 |
| Dist2Var[ManhattanDistance[feature]] | TEST | 0.314 | 0.388 |
| Dist2Var[EuclideanDistance[feature]] | TEST | 0.310 | 0.383 |
| SumOfVariances[Dist2Var[EuclideanDistance[feature]]] | TEST | 0.379 | 0.474 |
| UniqueRandomForestRegressor[feature+UQmetrics+predictions](l1) | TEST | 0.400 | 0.502 |
| UniqueRandomForestRegressor[UQmetrics+predictions](l1) | TEST | 0.400 | 0.502 |
| UniqueRandomForestRegressor[transformedUQmetrics+predictions](l1) | TEST | 0.400 | 0.502 |
| UniqueLASSO[feature+UQmetrics+predictions](l1) | TEST | 0.400 | 0.502 |
| UniqueLASSO[UQmetrics+predictions](l1) | TEST | 0.400 | 0.502 |
| UniqueLASSO[transformedUQmetrics+predictions](l1) | TEST | 0.400 | 0.502 |
| UQ Method | Subset | NLL | CheckScore | CRPS | IntervalScore |
|---|---|---|---|---|---|
| EnsembleVariance[mlp_variances] | TEST | 5.965 | 20.093 | 39.794 | 187.592 |
| Dist2Var[ManhattanDistance[feature]] | TEST | 5.517 | 18.278 | 36.196 | 157.577 |
| Dist2Var[EuclideanDistance[feature]] | TEST | 5.517 | 18.278 | 36.197 | 157.589 |
| SumOfVariances[Dist2Var[EuclideanDistance[feature]]] | TEST | 5.604 | 18.497 | 36.633 | 173.948 |
| UniqueRandomForestRegressor[feature+UQmetrics+predictions](l1) | TEST | 32.847 | 27.800 | 55.416 | 503.113 |
| UniqueRandomForestRegressor[UQmetrics+predictions](l1) | TEST | 32.887 | 27.802 | 55.419 | 503.188 |
| UniqueRandomForestRegressor[transformedUQmetrics+predictions](l1) | TEST | 32.886 | 27.802 | 55.419 | 503.187 |
| UniqueLASSO[feature+UQmetrics+predictions](l1) | TEST | 32.846 | 27.800 | 55.416 | 503.101 |
| UniqueLASSO[UQmetrics+predictions](l1) | TEST | 32.844 | 27.800 | 55.415 | 503.067 |
| UniqueLASSO[transformedUQmetrics+predictions](l1) | TEST | 32.804 | 27.798 | 55.411 | 502.971 |
[Summary plots for the evaluated UQ methods and for the best UQ method per evaluation type, saved under output/summary; 25 figures not shown here.]
# Optionally save the computed UQ metrics
pd.DataFrame.from_dict(output).to_csv(pipeline.output_path / "uq_metrics_values.csv", index=False)

# `eval_results` is a dict containing the evaluation data used to generate the plots
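
The returned objects can also be inspected directly; a quick look is sketched below (no specific keys of `eval_results` are assumed here, as its exact structure depends on the UNIQUE version):

```python
# Quick inspection of the objects returned by `pipeline.fit()`.
uq_values = pd.DataFrame.from_dict(output)
print(uq_values.shape)            # values of each computed UQ method
print(list(eval_results.keys()))  # groups of evaluation data used for the summary plots
```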