Evaluation of the rejection processes

The following functions output all the datapoints needed to plot accuracy-rejection curves for evaluating (partial) rejection. For rejection with hierarchical classification, a parallelised function is also available, because predicting over many thresholds can be slow.

Evaluation.Functions_Accuracy_Reject.Evaluate_AR_Flat(clf_list, Xtests, ytests, predictions, probabilities, b, scores)
Generates the datapoints for accuracy-rejection curves with flat classification.

The rejection threshold is varied in steps of 0.01.

Parameters:
  • clf_list (list of scikit-learn classifiers) – Contains the trained classifiers, one per fold, that are applied to the test sets over the K folds

  • Xtests (list of matrices) – Contains the test data per fold

  • ytests (list of lists) – Contains the actual labels of the test data per fold

  • predictions (list of lists) – Contains the predictions per fold

  • probabilities (list of matrices) – Contains the probabilities of the predictions per fold over all the classes

  • b (boolean) – Is the hierarchy balanced?

  • scores (boolean) – Does the trained scikit-learn classifier output scores or probabilities (with predict_proba)?

Returns:

results – For every fold, the following metrics are saved in a dictionary under the key ‘Try fold_number’: the accuracies (acc) and rejection percentages (perc:) for every rejection threshold, the rejection thresholds themselves (steps), the actual values (ytest), the predictions (preds), the probabilities (probs), and the lengths of the predictions (lp) and of the actual values (lt) per rejection threshold.

Return type:

nested dictionary
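
As an illustration of what one accuracy-rejection datapoint means, the sketch below sweeps a threshold in steps of 0.01 over toy data. It is not the implementation of Evaluate_AR_Flat: the function name accuracy_rejection_points, the rule "reject when the highest class probability is below the threshold", and the toy labels and probabilities are all assumptions made for this example.

```python
def accuracy_rejection_points(y_true, y_pred, probabilities, n_steps=100):
    """Return (thresholds, accuracies, rejected_percentages).

    Assumed rule: a prediction is rejected when its highest class
    probability is below the threshold. Accuracy is computed on the
    predictions that are kept.
    """
    thresholds, accuracies, rejected = [], [], []
    for i in range(n_steps + 1):
        threshold = i / n_steps  # 0.00, 0.01, ..., 1.00
        kept = [
            (truth, pred)
            for truth, pred, probs in zip(y_true, y_pred, probabilities)
            if max(probs) >= threshold
        ]
        if not kept:
            break  # everything is rejected, accuracy is undefined
        thresholds.append(threshold)
        accuracies.append(sum(t == p for t, p in kept) / len(kept))
        rejected.append(100 * (1 - len(kept) / len(y_true)))
    return thresholds, accuracies, rejected


# Toy data for a single fold (hypothetical values).
y_true = ["a", "b", "a", "b"]
y_pred = ["a", "b", "b", "b"]
probabilities = [[0.9, 0.1], [0.2, 0.8], [0.45, 0.55], [0.4, 0.6]]

thresholds, accuracies, rejected = accuracy_rejection_points(
    y_true, y_pred, probabilities
)
```

Plotting accuracies against rejected then gives one accuracy-rejection curve for the fold. In this toy case the only wrong prediction has the lowest confidence, so accuracy rises to 1.0 as soon as it is rejected.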

Evaluation.Functions_Accuracy_Reject.Evaluate_AR(clf_list, Xtests, ytests, predictions, greedy=True)
Generates the datapoints for accuracy-rejection curves with hierarchical classification.

The rejection threshold is varied in steps of 0.01.

Parameters:
  • clf_list (list of scikit-learn classifiers) – Contains the trained classifiers, one per fold, that are applied to the test sets over the K folds

  • Xtests (list of matrices) – Contains the test data per fold

  • ytests (list of lists) – Contains the actual labels of the test data per fold

  • predictions (list of lists) – Contains the predictions per fold

  • greedy (boolean, optional) – Perform greedy (True) or non-greedy (False) hierarchical classification, by default True

Returns:

results – For every fold, the following metrics are saved in a dictionary under the key ‘Try fold_number’: the accuracies (acc) and rejection percentages (perc:) for every rejection threshold, the rejection thresholds themselves (steps), the actual values (ytest), the predictions (preds), the probabilities (probs), and the lengths of the predictions (lp) and of the actual values (lt) per rejection threshold.

Return type:

nested dictionary
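
To make "(partial) rejection" in a hierarchy concrete, here is a minimal sketch of greedy top-down prediction that stops at an internal node once the classifier is no longer confident enough. It is only an assumption about how such a rule can look, not the code behind Evaluate_AR: the hierarchy, the local probabilities, and the function greedy_predict_with_rejection are hypothetical, and the library may, for example, threshold path probabilities instead of local ones.

```python
# Toy hierarchy: each internal node maps to its children (hypothetical).
hierarchy = {"root": ["A", "B"], "A": ["A1", "A2"]}

# Hypothetical local probabilities P(child | parent) for one sample.
local_probs = {"A": 0.9, "B": 0.1, "A1": 0.55, "A2": 0.45}


def greedy_predict_with_rejection(hierarchy, local_probs, threshold, root="root"):
    """Follow the most probable child at each level (greedy descent).

    The descent stops as soon as the best child's probability is below
    the threshold, so the returned label can be an internal node
    (partial rejection) or the root (full rejection).
    """
    node = root
    while node in hierarchy:  # leaves are not keys of the hierarchy
        best = max(hierarchy[node], key=lambda child: local_probs[child])
        if local_probs[best] < threshold:
            break  # not confident enough to go deeper
        node = best
    return node


# One label per threshold: 0.5 -> "A1", 0.6 -> "A", 0.95 -> "root"
labels = {
    t: greedy_predict_with_rejection(hierarchy, local_probs, t)
    for t in (0.5, 0.6, 0.95)
}
```

Raising the threshold moves the prediction up the hierarchy, which is why every threshold needs its own prediction pass when the curve is built.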

Evaluation.Functions_Accuracy_Reject.Evaluate_AR_parallel(clf_list, Xtests, ytests, predictions, all_jobs, greedy)
Generates the datapoints for accuracy-rejection curves with hierarchical classification, running the calculations in parallel.

The rejection threshold is varied in steps of 0.01.

Parameters:
  • clf_list (list of scikit-learn classifiers) – Contains the trained classifiers, one per fold, that are applied to the test sets over the K folds

  • Xtests (list of matrices) – Contains the test data per fold

  • ytests (list of lists) – Contains the actual labels of the test data per fold

  • predictions (list of lists) – Contains the predictions per fold

  • all_jobs (int) – Number of CPU cores to parallelise the calculations over

  • greedy (boolean, optional) – Perform greedy (True) or non-greedy (False) hierarchical classification, by default True

Returns:

results – For every fold, the following metrics are saved in a dictionary under the key ‘Try fold_number’: the accuracies (acc) and rejection percentages (perc:) for every rejection threshold, the rejection thresholds themselves (steps), the actual values (ytest), the predictions (preds), the probabilities (probs), and the lengths of the predictions (lp) and of the actual values (lt) per rejection threshold.

Return type:

nested dictionary
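
Because every threshold can be evaluated independently, the sweep is easy to spread over several workers. The sketch below shows that idea with the standard library only. It does not reproduce Evaluate_AR_parallel: the helper names, the n_jobs argument (standing in for all_jobs), and the toy data are assumptions, and a thread pool is used only to keep the example runnable anywhere, whereas CPU-bound prediction would normally call for a process pool.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def accuracy_at_threshold(threshold, y_true, y_pred, confidences):
    """Evaluate a single rejection threshold (assumed rejection rule)."""
    kept = [
        (truth, pred)
        for truth, pred, conf in zip(y_true, y_pred, confidences)
        if conf >= threshold
    ]
    if not kept:
        return threshold, None, 100.0  # everything rejected
    accuracy = sum(t == p for t, p in kept) / len(kept)
    rejected = 100 * (1 - len(kept) / len(y_true))
    return threshold, accuracy, rejected


def evaluate_thresholds_parallel(y_true, y_pred, confidences, n_jobs=2, n_steps=100):
    """Distribute the thresholds 0.00, 0.01, ..., 1.00 over n_jobs workers."""
    thresholds = [i / n_steps for i in range(n_steps + 1)]
    worker = partial(
        accuracy_at_threshold,
        y_true=y_true,
        y_pred=y_pred,
        confidences=confidences,
    )
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # map() returns the results in the order of the thresholds
        return list(pool.map(worker, thresholds))


# Toy data for a single fold (hypothetical values).
points = evaluate_thresholds_parallel(
    ["a", "b", "a", "b"], ["a", "b", "b", "b"], [0.9, 0.8, 0.55, 0.6]
)
```

Each entry of points is a (threshold, accuracy, rejected percentage) tuple, so the list can be unpacked directly into the two axes of an accuracy-rejection curve.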