galaxy.metrics
Script for model performance evaluation and metrics visualization.
Functions
- probabilities_hist: Plots histogram of prediction probabilities.
- plot_roc_curve: Plots the ROC curve.
- plot_pr_curve: Plots the precision-recall curve.
- plot_confusion_matrices: Plots confusion matrices.
- plot_red_shift
- plot_loss_by_model: Plots loss curves for training and validation.
- plot_accuracies_by_model: Plots accuracy curves for training and validation.
- modelPerformance: Plots class probability distributions, ROC and precision-recall curves, loss and accuracy over training, and confusion matrices.
- combine_metrics: Combines metrics for all selected models into a single CSV file.
Module Contents
- galaxy.metrics.probabilities_hist(predictions: pandas.DataFrame, pdf: matplotlib.backends.backend_pdf.PdfPages) -> None
Plots histogram of prediction probabilities.
- Args:
predictions (pd.DataFrame): DataFrame containing prediction probabilities for the cluster and non-cluster classes.
pdf (PdfPages): PDF file to save the plots.
- galaxy.metrics.plot_roc_curve(pdf: matplotlib.backends.backend_pdf.PdfPages, predictions: pandas.DataFrame) -> None
Plots the ROC curve.
- Args:
pdf (PdfPages): PDF file to save the plots.
predictions (pd.DataFrame): DataFrame containing true labels and predicted probabilities.
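The ROC computation behind such a plot can be sketched in plain Python (a simplified version for binary labels, assuming 1 = cluster and 0 = non-cluster; not the module's actual implementation):

```python
def roc_points(y_true, y_score):
    """Return (FPR, TPR) points for every score threshold, highest score first.

    Simplification: tied scores are not grouped into a single threshold step.
    """
    pairs = sorted(zip(y_score, y_true), reverse=True)  # descending by score
    pos = sum(y_true)             # total positives
    neg = len(y_true) - pos       # total negatives
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for score, label in pairs:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points
```

Plotting these points with the false positive rate on the x-axis and the true positive rate on the y-axis yields the ROC curve.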
- galaxy.metrics.plot_pr_curve(pdf: matplotlib.backends.backend_pdf.PdfPages, predictions: pandas.DataFrame) -> float
Plots the precision-recall curve.
- Args:
pdf (PdfPages): PDF file to save the plots.
predictions (pd.DataFrame): DataFrame containing true labels and predicted probabilities.
- Returns:
float: Area under the precision-recall curve (PR AUC).
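The returned PR AUC can be illustrated with the step-wise sum used for average precision, sum over thresholds of (R_k - R_{k-1}) * P_k (a sketch under the same binary-label assumption as above, not necessarily the module's exact formula):

```python
def pr_auc(y_true, y_score):
    """Area under the precision-recall curve via the average-precision sum."""
    pairs = sorted(zip(y_score, y_true), reverse=True)  # descending by score
    pos = sum(y_true)
    tp = fp = 0
    auc = prev_recall = 0.0
    for score, label in pairs:
        if label == 1:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / pos
        auc += (recall - prev_recall) * precision  # step-wise area increment
        prev_recall = recall
    return auc
```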
- galaxy.metrics.plot_confusion_matrices(pdf: matplotlib.backends.backend_pdf.PdfPages, predictions: pandas.DataFrame, classes: List[str]) -> Tuple[int, int, int, int]
Plots confusion matrices.
- Args:
pdf (PdfPages): PDF file to save the plots.
predictions (pd.DataFrame): DataFrame containing true and predicted labels.
classes (List[str]): Class labels.
- Returns:
Tuple[int, int, int, int]: TN, FP, FN, TP counts
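The four returned counts correspond to the cells of the binary confusion matrix; a minimal sketch of how they are tallied (assuming 1 = positive class, 0 = negative class):

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp), matching the documented return order."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1          # true positive
        elif t == 0 and p == 1:
            fp += 1          # false positive
        elif t == 1 and p == 0:
            fn += 1          # false negative
        else:
            tn += 1          # true negative
    return tn, fp, fn, tp
```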
- galaxy.metrics.plot_red_shift(pdf, predictions: pandas.DataFrame)
- galaxy.metrics.plot_loss_by_model(train_loss_data: List[Tuple[int, float]], val_loss_data: List[Tuple[int, float]], pdf: matplotlib.backends.backend_pdf.PdfPages) -> None
Plots loss curves for training and validation.
- Args:
train_loss_data (List[Tuple[int, float]]): Training loss data by epoch.
val_loss_data (List[Tuple[int, float]]): Validation loss data by epoch.
pdf (PdfPages): PDF file to save the plots.
- galaxy.metrics.plot_accuracies_by_model(train_acc_data: List[Tuple[int, float]], val_acc_data: List[Tuple[int, float]], pdf: matplotlib.backends.backend_pdf.PdfPages)
Plots accuracy curves for training and validation.
- Args:
train_acc_data (List[Tuple[int, float]]): Training accuracy data by epoch.
val_acc_data (List[Tuple[int, float]]): Validation accuracy data by epoch.
pdf (PdfPages): PDF file to save the plots.
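Both curve-plotting helpers take per-epoch (epoch, value) tuples; such lists can be split into x/y series for plotting like this (the sample values are illustrative only):

```python
# Hypothetical per-epoch (epoch, loss) pairs, as the helpers expect.
train_loss_data = [(1, 0.92), (2, 0.61), (3, 0.44)]
val_loss_data = [(1, 0.98), (2, 0.70), (3, 0.55)]

# Unzip into separate epoch and value series, e.g. for matplotlib's plot(x, y).
train_epochs, train_losses = zip(*train_loss_data)
val_epochs, val_losses = zip(*val_loss_data)
```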
- galaxy.metrics.modelPerformance(model_name: str, optimizer_name: str, predictions: pandas.DataFrame, train_table_data: List[Tuple[int, float, float]] | None = None, val_table_data: List[Tuple[int, float, float]] | None = None, f_beta: float = 2.0) -> None
Plots class probability distributions, ROC and precision-recall curves, loss and accuracy over training, and the confusion matrix with its weighted version, saving them as .png files. Also computes accuracy, precision, recall, false positive rate and f1-score and saves them in a .txt file.
- Args:
model_name (str): Name of the model.
optimizer_name (str): Name of the optimizer.
predictions (pd.DataFrame): DataFrame with true labels, predicted labels, and probabilities.
train_table_data (Optional[List[Tuple[int, float, float]]], optional): Training data for plotting; each tuple holds (epoch, loss, accuracy).
val_table_data (Optional[List[Tuple[int, float, float]]], optional): Validation data for plotting; each tuple holds (epoch, loss, accuracy).
f_beta (float, optional): Beta value for the F-beta score calculation. Defaults to 2.0.
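The scalar metrics this function reports all derive from the confusion counts; a minimal sketch of those formulas, including the F-beta score with the default beta of 2.0 (an illustration, not the module's actual code):

```python
def summary_metrics(tn, fp, fn, tp, f_beta=2.0):
    """Accuracy, precision, recall, FPR and F-beta from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    b2 = f_beta ** 2
    # F-beta weights recall beta times as heavily as precision.
    fbeta = ((1 + b2) * precision * recall / (b2 * precision + recall)
             if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "fpr": fpr, "f_beta": fbeta}
```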
- galaxy.metrics.combine_metrics(selected_models: List[Tuple[str, Any]], optimizer_name: str) -> pandas.DataFrame
Combines metrics for all selected models into a single CSV file.
- Args:
selected_models (List[Tuple[str, Any]]): List of selected models.
optimizer_name (str): Name of the optimizer.
- Returns:
pd.DataFrame: Combined metrics DataFrame.
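The combining step amounts to stacking one metrics row per model into a DataFrame and writing it out; a sketch with hypothetical model names and columns (the real schema is defined by galaxy.metrics):

```python
import pandas as pd

# Hypothetical per-model metric rows; column names are illustrative.
rows = [
    {"model": "ResNet18", "optimizer": "Adam", "accuracy": 0.91, "pr_auc": 0.88},
    {"model": "DenseNet", "optimizer": "Adam", "accuracy": 0.89, "pr_auc": 0.90},
]
combined = pd.DataFrame(rows)
combined.to_csv("combined_metrics.csv", index=False)  # single CSV for all models
```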