galaxy.metrics

Script for model performance evaluation and metrics visualization.

Functions

probabilities_hist(→ None)

Plots histogram of prediction probabilities.

plot_roc_curve(→ None)

Plots the ROC curve.

plot_pr_curve(→ float)

Plots the precision-recall curve.

plot_confusion_matrices(→ Tuple[int, int, int, int])

Plots confusion matrices.

plot_red_shift(pdf, predictions)

plot_loss_by_model(→ None)

Plots loss curves for training and validation.

plot_accuracies_by_model(train_acc_data, val_acc_data, pdf)

Plots accuracy curves for training and validation.

modelPerformance(→ None)

Plots class-probability distributions, ROC and precision-recall curves, loss and accuracy over training, and confusion matrices; saves summary metrics.

combine_metrics(→ pandas.DataFrame)

Combines metrics for all selected models into a single CSV file.

Module Contents

galaxy.metrics.probabilities_hist(predictions: pandas.DataFrame, pdf: matplotlib.backends.backend_pdf.PdfPages) → None

Plots histogram of prediction probabilities.

Args:

predictions (pd.DataFrame): DataFrame containing the predicted probabilities for the cluster and non-cluster classes.
pdf (PdfPages): PDF file to save the plots.
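
The pattern of appending a histogram figure to a shared PdfPages report can be sketched as follows. The column names and the synthetic probabilities are assumptions for illustration, not the module's actual schema:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.backends.backend_pdf import PdfPages

# Hypothetical frame: one probability column per class (names are assumptions)
rng = np.random.default_rng(0)
predictions = pd.DataFrame({
    "prob_cluster": rng.beta(5, 2, 200),
    "prob_non_cluster": rng.beta(2, 5, 200),
})

out_path = os.path.join(tempfile.mkdtemp(), "metrics.pdf")
with PdfPages(out_path) as pdf:
    fig, ax = plt.subplots()
    for column in predictions.columns:
        ax.hist(predictions[column], bins=20, range=(0, 1), alpha=0.5, label=column)
    ax.set_xlabel("Predicted probability")
    ax.set_ylabel("Count")
    ax.legend()
    pdf.savefig(fig)  # appends the figure as one page of the PDF
    plt.close(fig)
```

Reusing one `PdfPages` handle across several plotting calls is what lets all figures land in a single report file.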

galaxy.metrics.plot_roc_curve(pdf: matplotlib.backends.backend_pdf.PdfPages, predictions: pandas.DataFrame) → None

Plots the ROC curve.

Args:

pdf (PdfPages): PDF file to save the plots.
predictions (pd.DataFrame): DataFrame containing true labels and predicted probabilities.
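
The threshold sweep behind an ROC curve can be sketched without plotting, assuming hypothetical `y_true` and `y_prob` columns (the real column names are not documented here):

```python
import numpy as np
import pandas as pd

# Toy frame with the columns the signature implies (names are assumptions)
predictions = pd.DataFrame({
    "y_true": [1, 1, 0, 1, 0, 0, 1, 0],
    "y_prob": [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.1],
})

# Sweep thresholds from high to low score: each prefix of the sorted
# labels yields one (FPR, TPR) point on the ROC curve.
order = np.argsort(-predictions["y_prob"].to_numpy())
labels = predictions["y_true"].to_numpy()[order]
tpr = np.cumsum(labels) / labels.sum()
fpr = np.cumsum(1 - labels) / (1 - labels).sum()

# ROC AUC by trapezoidal integration, starting from the (0, 0) corner
x = np.concatenate(([0.0], fpr))
y = np.concatenate(([0.0], tpr))
roc_auc = float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2))  # → 0.75 here
```

Plotting `tpr` against `fpr` with these values reproduces the curve the function saves to the PDF.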

galaxy.metrics.plot_pr_curve(pdf: matplotlib.backends.backend_pdf.PdfPages, predictions: pandas.DataFrame) → float

Plots the precision-recall curve.

Args:

pdf (PdfPages): PDF file to save the plots.
predictions (pd.DataFrame): DataFrame containing true labels and predicted probabilities.

Returns:

float: Area under the precision-recall curve (PR AUC).
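
The returned PR AUC can be computed by the same threshold sweep, integrating precision over recall increments. Column names are again assumptions:

```python
import numpy as np
import pandas as pd

predictions = pd.DataFrame({
    "y_true": [1, 1, 0, 1, 0, 0, 1, 0],
    "y_prob": [0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.1],
})

order = np.argsort(-predictions["y_prob"].to_numpy())
labels = predictions["y_true"].to_numpy()[order]
tp = np.cumsum(labels)
fp = np.cumsum(1 - labels)
precision = tp / (tp + fp)
recall = tp / labels.sum()

# Step-wise area: precision at each point times the recall increment
# (the same convention as average precision)
pr_auc = float(np.sum(np.diff(np.concatenate(([0.0], recall))) * precision))
```

Unlike the ROC AUC, this step-wise sum avoids trapezoidal interpolation, which is known to be optimistic for precision-recall curves.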

galaxy.metrics.plot_confusion_matrices(pdf: matplotlib.backends.backend_pdf.PdfPages, predictions: pandas.DataFrame, classes: List[str]) → Tuple[int, int, int, int]

Plots confusion matrices.

Args:

pdf (PdfPages): PDF file to save the plots.
predictions (pd.DataFrame): DataFrame containing true and predicted labels.
classes (List[str]): Class labels.

Returns:

Tuple[int, int, int, int]: TN, FP, FN, TP counts
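
For a binary problem, the four returned counts reduce to boolean masks over the DataFrame; a minimal sketch, assuming hypothetical `y_true` and `y_pred` columns:

```python
import pandas as pd

predictions = pd.DataFrame({
    "y_true": [1, 0, 1, 1, 0, 0, 1, 0],
    "y_pred": [1, 0, 0, 1, 1, 0, 1, 0],
})

# Each cell of the 2x2 confusion matrix is the count of one
# (true label, predicted label) combination.
tn = int(((predictions.y_true == 0) & (predictions.y_pred == 0)).sum())
fp = int(((predictions.y_true == 0) & (predictions.y_pred == 1)).sum())
fn = int(((predictions.y_true == 1) & (predictions.y_pred == 0)).sum())
tp = int(((predictions.y_true == 1) & (predictions.y_pred == 1)).sum())
```

The return order (TN, FP, FN, TP) matches the row-major flattening of the matrix with the negative class first.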

galaxy.metrics.plot_red_shift(pdf, predictions: pandas.DataFrame)

galaxy.metrics.plot_loss_by_model(train_loss_data: List[Tuple[int, float]], val_loss_data: List[Tuple[int, float]], pdf: matplotlib.backends.backend_pdf.PdfPages) → None

Plots loss curves for training and validation.

Args:

train_loss_data (List[Tuple[int, float]]): training loss by epoch.
val_loss_data (List[Tuple[int, float]]): validation loss by epoch.
pdf (PdfPages): PDF file to save the plots.
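
Unpacking the (epoch, loss) tuples into plottable series can be sketched as follows; the tuple values here are made up, and the same shape applies to the accuracy curves:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# Hypothetical (epoch, loss) tuples, as the signature describes
train_loss_data = [(1, 0.90), (2, 0.60), (3, 0.45), (4, 0.40)]
val_loss_data = [(1, 0.95), (2, 0.70), (3, 0.60), (4, 0.58)]

out_path = os.path.join(tempfile.mkdtemp(), "loss.pdf")
with PdfPages(out_path) as pdf:
    fig, ax = plt.subplots()
    for data, label in [(train_loss_data, "train"), (val_loss_data, "validation")]:
        epochs, losses = zip(*data)  # split tuples into two aligned series
        ax.plot(epochs, losses, marker="o", label=label)
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Loss")
    ax.legend()
    pdf.savefig(fig)
    plt.close(fig)
```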

galaxy.metrics.plot_accuracies_by_model(train_acc_data: List[Tuple[int, float]], val_acc_data: List[Tuple[int, float]], pdf: matplotlib.backends.backend_pdf.PdfPages)

Plots accuracy curves for training and validation.

Args:

train_acc_data (List[Tuple[int, float]]): training accuracy by epoch.
val_acc_data (List[Tuple[int, float]]): validation accuracy by epoch.
pdf (PdfPages): PDF file to save the plots.

galaxy.metrics.modelPerformance(model_name: str, optimizer_name: str, predictions: pandas.DataFrame, train_table_data: List[Tuple[int, float, float]] | None = None, val_table_data: List[Tuple[int, float, float]] | None = None, f_beta: float = 2.0) → None

Plots the class-probability distributions, the ROC and precision-recall curves, the change of loss and accuracy over training, and the confusion matrix together with its weighted version, saving them as .png files; also computes accuracy, precision, recall, false positive rate, and F1-score and writes them to a .txt file.

Args:

model_name (str): name of the model.
optimizer_name (str): name of the optimizer.
predictions (pd.DataFrame): DataFrame with true labels, predicted labels, and probabilities.
train_table_data (Optional[List[Tuple[int, float, float]]], optional): training data for plotting; each tuple is (epoch, loss, accuracy).
val_table_data (Optional[List[Tuple[int, float, float]]], optional): validation data for plotting; each tuple is (epoch, loss, accuracy).
f_beta (float, optional): beta value for the F-beta score calculation.
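
With the default beta of 2.0, the F-beta score weights recall more heavily than precision, which suits cluster detection where missed clusters are costlier than false alarms. A minimal sketch of the standard formula:

```python
def f_beta_score(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta: harmonic-style mean of precision and recall, with recall
    weighted beta times as much as precision."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Example values (made up): precision 0.75, recall 0.60
score = f_beta_score(0.75, 0.60, beta=2.0)  # → 0.625
```

Setting `beta=1.0` recovers the ordinary F1-score.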

galaxy.metrics.combine_metrics(selected_models: List[Tuple[str, Any]], optimizer_name: str) → pandas.DataFrame

Combines metrics for all selected models into a single CSV file.

Args:

selected_models (List[Tuple[str, Any]]): List of selected models.
optimizer_name (str): Name of the optimizer.

Returns:

pd.DataFrame: Combined metrics DataFrame.
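
The combining step presumably amounts to collecting one metric row per model and writing the result to CSV; a sketch with invented model names and metric columns:

```python
import os
import tempfile

import pandas as pd

# Hypothetical per-model metric rows; the real function gathers one row
# per selected model (model and metric names here are assumptions).
rows = [
    {"model": "resnet18", "accuracy": 0.91, "precision": 0.88, "recall": 0.85},
    {"model": "densenet121", "accuracy": 0.93, "precision": 0.90, "recall": 0.89},
]
combined = pd.DataFrame(rows).set_index("model")

# Write one CSV holding every model's metrics side by side
out_path = os.path.join(tempfile.mkdtemp(), "combined_metrics.csv")
combined.to_csv(out_path)
```

Indexing by model name keeps each row addressable (e.g. `combined.loc["resnet18"]`) while the CSV stays a flat table.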