Changelog
=========

v0.1 (May 7, 2024)
------------------

Implemented Features
********************

**main.py**

- **Dataloader Creation**: Uses ``data.py`` to build a dataloader, then splits it into training and validation loaders.
- **Model Selection**: Offers a list of easily modifiable models for training.
- **Optimizer Selection**: Provides a choice of optimizers from a predefined list.
- **Training Script**: Runs training with the selected model, number of epochs, optimizer, and optimizer settings such as learning rate (``lr``) and momentum (``mm``).
- **Training Progress**: Displays training progress and statistics during execution.
- **Results Saving**: Saves the results of training for further analysis.
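
The train/validation split can be pictured with a minimal, library-independent sketch. The function name ``split_indices``, the 20% validation fraction, and the fixed seed are assumptions made for illustration; they are not taken from ``main.py``.

.. code-block:: python

    import random


    def split_indices(n_samples, val_fraction=0.2, seed=0):
        """Return (train_indices, val_indices) for a dataset of n_samples items."""
        indices = list(range(n_samples))
        # A seeded shuffle keeps the split reproducible between runs.
        random.Random(seed).shuffle(indices)
        n_val = int(round(n_samples * val_fraction))
        # The first n_val shuffled indices form the validation set.
        return indices[n_val:], indices[:n_val]


    train_idx, val_idx = split_indices(10, val_fraction=0.2)

Each index list would then back its own loader, so the two subsets never share a sample.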

**train.py**

- **train()**: Displays progress and statistics during training. Saves the best model weights, checkpoints, and results.
- **validate()**: Validates the model and provides statistics.
- **continue_training()**: Allows resuming training from checkpoints.
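
The checkpoint-and-resume behaviour described above might look roughly like the following framework-agnostic sketch. The names ``save_checkpoint``, ``load_checkpoint``, and ``run_training``, the ``step`` callable, and the use of ``pickle`` are all assumptions for illustration; the project's own ``train()`` and ``continue_training()`` may store different fields in a different format.

.. code-block:: python

    import pickle
    from pathlib import Path


    def save_checkpoint(path, epoch, state, best_acc):
        """Write the training state after finishing `epoch` (1-based)."""
        with open(path, "wb") as fh:
            pickle.dump({"epoch": epoch, "state": state, "best_acc": best_acc}, fh)


    def load_checkpoint(path):
        """Return the saved dict, or None if no checkpoint exists yet."""
        if not Path(path).exists():
            return None
        with open(path, "rb") as fh:
            return pickle.load(fh)


    def run_training(path, num_epochs, step, state=None):
        """Train up to `num_epochs`, resuming from `path` when possible.

        `step(state)` is a hypothetical callable that runs one epoch and
        returns (new_state, validation_accuracy).
        """
        start, best_acc = 0, 0.0
        ckpt = load_checkpoint(path)
        if ckpt is not None:
            # Resume after the last completed epoch.
            start, state, best_acc = ckpt["epoch"], ckpt["state"], ckpt["best_acc"]
        for epoch in range(start + 1, num_epochs + 1):
            state, acc = step(state)
            best_acc = max(best_acc, acc)
            save_checkpoint(path, epoch, state, best_acc)
        return state, best_acc

Calling ``run_training`` a second time with a larger ``num_epochs`` continues from the stored epoch instead of starting over.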

**data.py**

- Queries GAIA Star Catalog asynchronously (``read_gaia``).
- Reads data from **ACT_DR5** and **MaDCoWS** catalogs, pre-downloading if necessary.
- Creates positive and negative samples for training (``createNegativeClassDR5``, ``create_data_dr5``).
- Includes data transformations: resizing, rotation, reflection, and normalization.
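
To make the rotation, reflection, and normalization transformations concrete, here is a small pure-Python sketch that operates on an image stored as a list of rows. The helper names are hypothetical, resizing is left out for brevity, and ``data.py`` most likely relies on a dedicated image library rather than code like this.

.. code-block:: python

    def rotate90(image):
        """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
        return [list(row) for row in zip(*image[::-1])]


    def reflect_horizontal(image):
        """Mirror the image left to right."""
        return [row[::-1] for row in image]


    def normalize(image, mean, std):
        """Shift and scale every pixel; `std` is assumed to be non-zero."""
        return [[(px - mean) / std for px in row] for row in image]


    image = [[1, 2],
             [3, 4]]
    augmented = normalize(reflect_horizontal(rotate90(image)), mean=2.5, std=1.0)

Rotations and reflections leave the set of pixel values unchanged, which is why they are commonly used to enlarge a training set.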

**segmentation.py**

- Implements an image dataset class and segmentation map creation (``create_samples``, ``formSegmentationMaps``, ``printSegMaps``, ``printBigSegMap``).
- Predicts probabilities for images using ``predict_folder`` and ``predict_tests``.

**legacy_for_img.py**

- Downloads image cutouts using multithreading (``grab_cutouts``).
- Supports **VLASS** and **unWISE** image downloads.
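
A multithreaded download of this kind can be sketched with the standard library's thread pool. The function ``download_all`` and its ``fetch`` argument are hypothetical stand-ins; the real ``grab_cutouts`` signature and the survey URLs are not shown here.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor, as_completed


    def download_all(targets, fetch, max_workers=8):
        """Call fetch(target) for every target on a pool of threads.

        Returns (results, errors): `results` maps each successful target to
        the value returned by `fetch`, `errors` maps each failed target to
        the exception it raised.
        """
        results, errors = {}, {}
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {pool.submit(fetch, target): target for target in targets}
            for future in as_completed(futures):
                target = futures[future]
                try:
                    results[target] = future.result()
                except Exception as exc:
                    # One failed cutout should not abort the whole batch.
                    errors[target] = exc
        return results, errors

Threads suit this workload because each worker spends most of its time waiting on the network rather than computing.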

**metrics.py**

- Includes plotting functions: ROC curve (``plot_roc_curve``) and precision-recall curve (``plot_precision_recall``).
- ``modelPerformance``: Calculates and displays metrics such as accuracy, precision, recall, and F1-score.
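
For reference, the four metrics named above follow directly from the confusion-matrix counts of a binary classifier. The sketch below uses a hypothetical helper, ``classification_metrics``, and assumes labels are encoded as 0 and 1; it is not the body of ``modelPerformance``.

.. code-block:: python

    def classification_metrics(y_true, y_pred):
        """Compute accuracy, precision, recall, and F1 for 0/1 labels."""
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        tn = sum(1 for t, p in pairs if t == 0 and p == 0)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)

        total = tp + tn + fp + fn
        accuracy = (tp + tn) / total if total else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        # F1 is the harmonic mean of precision and recall.
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        return {"accuracy": accuracy, "precision": precision,
                "recall": recall, "f1": f1}


    metrics = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])

The zero-denominator guards return 0.0 when a class is never predicted or never present, which is one common convention among several.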

Known Issues
************

- Not all optimizers are functional.
- Loss and accuracy are calculated incorrectly when the Adam optimizer is used.
- **segmentation.py**: Paths for loaded weights need fixing.
- **data.py**: Does not yet ensure that the data folder exists.
- **main.py**: All models are loaded simultaneously; memory usage needs optimizing.
- **main.py**: Some models need fixes (e.g., ``ViTL16``).
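
One possible way to address the simultaneous-loading issue is to register model constructors and build only the one that was requested. This is a sketch of the idea, not the planned fix; ``build_model`` and the ``factories`` mapping are hypothetical names.

.. code-block:: python

    def build_model(name, factories):
        """Construct only the model called `name`.

        `factories` maps a model name to a zero-argument callable, so no
        model is instantiated until it is actually requested.
        """
        try:
            factory = factories[name]
        except KeyError:
            known = ", ".join(sorted(factories))
            raise ValueError(f"unknown model {name!r}; choose from: {known}") from None
        return factory()

Because the mapping stores callables rather than instances, memory is allocated for a single model per run.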

Directory Structure
*******************

.. code-block:: text

    .
    ├── data
    │   └── DATA
    │       ├── test_dr5
    │       ├── test_madcows
    │       ├── train
    │       └── val
    ├── models
    ├── notebooks
    ├── results
    ├── screenshots
    ├── state_dict
    └── trained_models

Performance
***********

- **Initial startup**: ~1h 9m 43s (depends on Internet speed).
- **Subsequent runs**: ~6m 55s.

Usage
*****

Run the script with the following command:

.. code-block:: bash

    python3 main.py --model MODEL_NAME --epoch NUM_EPOCH --optimizer OPTIMIZER --lr LR --mm MOMENTUM

Supported models:

- ``ResNet18``, ``AlexNet_VGG``, ``SpinalNet_VGG``, ``SpinalNet_ResNet``.

Supported optimizers:

- ``SGD``, ``Adam``, ``RMSprop``, ``AdamW``, ``Adadelta``, ``DiffGrad``.
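
The command-line interface above could be parsed along the following lines. The flag names and the lists of models and optimizers come from this changelog; the default values and the helper name ``build_parser`` are assumptions, and the actual parsing code in ``main.py`` may differ.

.. code-block:: python

    import argparse

    MODELS = ["ResNet18", "AlexNet_VGG", "SpinalNet_VGG", "SpinalNet_ResNet"]
    OPTIMIZERS = ["SGD", "Adam", "RMSprop", "AdamW", "Adadelta", "DiffGrad"]


    def build_parser():
        """Create a parser for the flags shown in the usage line."""
        parser = argparse.ArgumentParser(description="Train a selected model.")
        parser.add_argument("--model", choices=MODELS, required=True)
        # The defaults below are illustrative, not the project's real defaults.
        parser.add_argument("--epoch", type=int, default=10)
        parser.add_argument("--optimizer", choices=OPTIMIZERS, default="SGD")
        parser.add_argument("--lr", type=float, default=0.001)
        parser.add_argument("--mm", type=float, default=0.9)
        return parser


    args = build_parser().parse_args(
        ["--model", "ResNet18", "--epoch", "5", "--optimizer", "SGD",
         "--lr", "0.01", "--mm", "0.9"]
    )

Restricting ``--model`` and ``--optimizer`` with ``choices`` makes ``argparse`` reject unsupported names before any training starts.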

Example output for AlexNet training with the SGD optimizer:

.. code-block:: text

    Epoch 1/10. Training AlexNet with SGD optimizer: acc=0.683, loss=0.598
    Validation Loss: 0.0075, Validation Accuracy: 0.7926
    ...
    Epoch 10/10. Training AlexNet with SGD optimizer: acc=0.884, loss=0.286
    Validation Loss: 0.0057, Validation Accuracy: 0.8445

Bugs
****

- Not all optimizers work.
- Loss and accuracy are calculated incorrectly when the Adam optimizer is used.
- Additional feedback from users is welcome.