galaxy.metrics

Script for model performance evaluation and metrics visualization.

Functions

probabilities_hist(→ None)

Plots histogram of prediction probabilities.

plot_roc_curve(→ None)

Plots the ROC curve.

plot_pr_curve(→ float)

Plots the precision-recall curve.

plot_confusion_matrices(→ Tuple[int, int, int, int])

Plots confusion matrices.

plot_red_shift(pdf, predictions)

plot_loss_by_model(→ None)

Plots loss curves for training and validation.

plot_accuracies_by_model(train_acc_data, val_acc_data, pdf)

Plots accuracy curves for training and validation.

modelPerformance(→ None)

Plots class-probability distributions, ROC and precision-recall curves, loss and accuracy over training, and confusion matrices; saves summary metrics.

combine_metrics(→ pandas.DataFrame)

Combines metrics for all selected models into a single CSV file.

Module Contents

galaxy.metrics.probabilities_hist(predictions: pandas.DataFrame, pdf: matplotlib.backends.backend_pdf.PdfPages) → None

Plots histogram of prediction probabilities.

Args:

predictions (pd.DataFrame): DataFrame containing the predicted probabilities for the cluster and non-cluster classes.
pdf (PdfPages): PDF file to save the plots.
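
The pattern of appending a histogram figure to a shared PdfPages report can be sketched as follows. The column names and the synthetic probabilities are assumptions for illustration, not the module's actual schema:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

# Hypothetical frame: one probability column per class (names are assumptions)
rng = np.random.default_rng(0)
predictions = pd.DataFrame({
    "prob_cluster": rng.beta(5, 2, 200),
    "prob_non_cluster": rng.beta(2, 5, 200),
})

out_path = os.path.join(tempfile.mkdtemp(), "metrics.pdf")
with PdfPages(out_path) as pdf:
    fig, ax = plt.subplots()
    for column in predictions.columns:
        ax.hist(predictions[column], bins=20, range=(0, 1), alpha=0.5, label=column)
    ax.set_xlabel("Predicted probability")
    ax.set_ylabel("Count")
    ax.legend()
    pdf.savefig(fig)  # appends the figure as one page of the PDF
    plt.close(fig)
```

Reusing one `PdfPages` handle across several plotting calls is what lets all figures land in a single report file.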

galaxy.metrics.plot_roc_curve(pdf: matplotlib.backends.backend_pdf.PdfPages, predictions: pandas.DataFrame) → None

Plots the ROC curve.

Args:

pdf (PdfPages): PDF file to save the plots.
predictions (pd.DataFrame): DataFrame containing true labels and predicted probabilities.
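
The threshold sweep behind an ROC curve can be sketched without plotting, assuming hypothetical `y_true` and `y_prob` columns (the real column names are not documented here):

```python
import numpy as np
import pandas as pd

# Toy frame with the columns the signature implies (names are assumptions)
predictions = pd.DataFrame({
    "y_true": [1, 1, 0, 1, 0, 0, 1, 0],
    "y_prob": [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.1],
})

# Sweep thresholds from high to low score: each prefix of the sorted
# labels yields one (FPR, TPR) point on the ROC curve.
order = np.argsort(-predictions["y_prob"].to_numpy())
labels = predictions["y_true"].to_numpy()[order]
tpr = np.cumsum(labels) / labels.sum()
fpr = np.cumsum(1 - labels) / (1 - labels).sum()

# ROC AUC by trapezoidal integration, starting from the (0, 0) corner
x = np.concatenate(([0.0], fpr))
y = np.concatenate(([0.0], tpr))
roc_auc = float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2))  # → 0.75 here
```

Plotting `tpr` against `fpr` with these values reproduces the curve the function saves to the PDF.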

galaxy.metrics.plot_pr_curve(pdf: matplotlib.backends.backend_pdf.PdfPages, predictions: pandas.DataFrame) → float

Plots the precision-recall curve.

Args:

pdf (PdfPages): PDF file to save the plots.
predictions (pd.DataFrame): DataFrame containing true labels and predicted probabilities.

Returns:

float: Area under the precision-recall curve (PR AUC).
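
The returned PR AUC can be computed by the same threshold sweep, integrating precision over recall increments. Column names are again assumptions:

```python
import numpy as np
import pandas as pd

predictions = pd.DataFrame({
    "y_true": [1, 1, 0, 1, 0, 0, 1, 0],
    "y_prob": [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.1],
})

order = np.argsort(-predictions["y_prob"].to_numpy())
labels = predictions["y_true"].to_numpy()[order]
tp = np.cumsum(labels)
fp = np.cumsum(1 - labels)
precision = tp / (tp + fp)
recall = tp / labels.sum()

# Step-wise area: precision at each point times the recall increment
# (the same convention as average precision)
pr_auc = float(np.sum(np.diff(np.concatenate(([0.0], recall))) * precision))
```

Unlike the ROC AUC, this step-wise sum avoids trapezoidal interpolation, which is known to be optimistic for precision-recall curves.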

galaxy.metrics.plot_confusion_matrices(pdf: matplotlib.backends.backend_pdf.PdfPages, predictions: pandas.DataFrame, classes: List[str]) → Tuple[int, int, int, int]

Plots confusion matrices.

Args:

pdf (PdfPages): PDF file to save the plots.
predictions (pd.DataFrame): DataFrame containing true and predicted labels.
classes (List[str]): Class labels.

Returns:

Tuple[int, int, int, int]: TN, FP, FN, TP counts
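
For a binary problem, the four returned counts reduce to boolean masks over the DataFrame; a minimal sketch, assuming hypothetical `y_true` and `y_pred` columns:

```python
import pandas as pd

predictions = pd.DataFrame({
    "y_true": [1, 0, 1, 1, 0, 0, 1, 0],
    "y_pred": [1, 0, 0, 1, 1, 0, 1, 0],
})

# Each cell of the 2x2 confusion matrix is the count of one
# (true label, predicted label) combination.
tn = int(((predictions.y_true == 0) & (predictions.y_pred == 0)).sum())
fp = int(((predictions.y_true == 0) & (predictions.y_pred == 1)).sum())
fn = int(((predictions.y_true == 1) & (predictions.y_pred == 0)).sum())
tp = int(((predictions.y_true == 1) & (predictions.y_pred == 1)).sum())
```

The return order (TN, FP, FN, TP) matches the row-major flattening of the matrix with the negative class first.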

galaxy.metrics.plot_red_shift(pdf, predictions: pandas.DataFrame)

galaxy.metrics.plot_loss_by_model(train_loss_data: List[Tuple[int, float]], val_loss_data: List[Tuple[int, float]], pdf: matplotlib.backends.backend_pdf.PdfPages) → None

Plots loss curves for training and validation.

Args:

train_loss_data (List[Tuple[int, float]]): training loss by epoch.
val_loss_data (List[Tuple[int, float]]): validation loss by epoch.
pdf (PdfPages): PDF file to save the plots.
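
Unpacking the (epoch, loss) tuples into plottable series can be sketched as follows; the tuple values here are made up, and the same shape applies to the accuracy curves:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Hypothetical (epoch, loss) tuples, as the signature describes
train_loss_data = [(1, 0.90), (2, 0.60), (3, 0.45), (4, 0.40)]
val_loss_data = [(1, 0.95), (2, 0.70), (3, 0.60), (4, 0.58)]

out_path = os.path.join(tempfile.mkdtemp(), "loss.pdf")
with PdfPages(out_path) as pdf:
    fig, ax = plt.subplots()
    for data, label in [(train_loss_data, "train"), (val_loss_data, "validation")]:
        epochs, losses = zip(*data)  # split tuples into two aligned series
        ax.plot(epochs, losses, marker="o", label=label)
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Loss")
    ax.legend()
    pdf.savefig(fig)
    plt.close(fig)
```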

galaxy.metrics.plot_accuracies_by_model(train_acc_data: List[Tuple[int, float]], val_acc_data: List[Tuple[int, float]], pdf: matplotlib.backends.backend_pdf.PdfPages)

Plots accuracy curves for training and validation.

Args:

train_acc_data (List[Tuple[int, float]]): training accuracy by epoch.
val_acc_data (List[Tuple[int, float]]): validation accuracy by epoch.
pdf (PdfPages): PDF file to save the plots.

galaxy.metrics.modelPerformance(model_name: str, optimizer_name: str, predictions: pandas.DataFrame, train_table_data: List[Tuple[int, float, float]] | None = None, val_table_data: List[Tuple[int, float, float]] | None = None, f_beta: float = 2.0) → None

Plots the class-probability distributions, the ROC and precision-recall curves, the change of loss and accuracy over training, and the confusion matrix together with its weighted version, saving them as .png files; also computes accuracy, precision, recall, false positive rate, and F1-score and writes them to a .txt file.

Args:

model_name (str): name of the model.
optimizer_name (str): name of the optimizer.
predictions (pd.DataFrame): DataFrame with true labels, predicted labels, and probabilities.
train_table_data (Optional[List[Tuple[int, float, float]]], optional): training data for plotting; each tuple is (epoch, loss, accuracy).
val_table_data (Optional[List[Tuple[int, float, float]]], optional): validation data for plotting; each tuple is (epoch, loss, accuracy).
f_beta (float, optional): beta value for the F-beta score calculation.
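
With the default beta of 2.0, the F-beta score weights recall more heavily than precision, which suits cluster detection where missed clusters are costlier than false alarms. A minimal sketch of the standard formula:

```python
def f_beta_score(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta: harmonic-style mean of precision and recall, with recall
    weighted beta times as much as precision."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Example values (made up): precision 0.75, recall 0.60
score = f_beta_score(0.75, 0.60, beta=2.0)  # → 0.625
```

Setting `beta=1.0` recovers the ordinary F1-score.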

galaxy.metrics.combine_metrics(selected_models: List[Tuple[str, Any]], optimizer_name: str) → pandas.DataFrame

Combines metrics for all selected models into a single CSV file.

Args:

selected_models (List[Tuple[str, Any]]): List of selected models.
optimizer_name (str): Name of the optimizer.

Returns:

pd.DataFrame: Combined metrics DataFrame.
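
The combining step presumably amounts to collecting one metric row per model and writing the result to CSV; a sketch with invented model names and metric columns:

```python
import os
import tempfile

import pandas as pd

# Hypothetical per-model metric rows; the real function gathers one row
# per selected model (model and metric names here are assumptions).
rows = [
    {"model": "resnet18", "accuracy": 0.91, "precision": 0.88, "recall": 0.85},
    {"model": "densenet121", "accuracy": 0.93, "precision": 0.90, "recall": 0.89},
]
combined = pd.DataFrame(rows).set_index("model")

# Write one CSV holding every model's metrics side by side
out_path = os.path.join(tempfile.mkdtemp(), "combined_metrics.csv")
combined.to_csv(out_path)
```

Indexing by model name keeps each row addressable (e.g. `combined.loc["resnet18"]`) while the CSV stays a flat table.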