galaxy.normalize

Classes

DataChannel

str(object='') -> str

Functions

get_channel_file(file_path, output_folder, ...)

separate_channels(dataset)

Separates the data by frequency so that the mean and std of each frequency channel can later be computed for normalization.

normalize_asym(i_data, p, n_bins, outlier_thr)

Normalize data with an asymmetrical distribution.

load_existing_normalization_values()

store_normalization_values(existing_data)

normalize(dataset)

TODO: look at data.py

Module Contents

class galaxy.normalize.DataChannel

Bases: str, enum.Enum

str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or errors is specified, then the object must expose a data buffer that will be decoded using the given encoding and error handler. Otherwise, returns the result of object.__str__() (if defined) or repr(object). encoding defaults to 'utf-8'. errors defaults to 'strict'.

WISE_W1 = 'W1'
WISE_W2 = 'W2'
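
Because DataChannel mixes str into enum.Enum, its members double as plain channel-name strings. A minimal re-statement of the enum with an illustrative usage (the assert and print lines are examples, not part of the module):

```python
from enum import Enum

class DataChannel(str, Enum):
    WISE_W1 = "W1"
    WISE_W2 = "W2"

# The str mixin makes members compare equal to their plain string values,
# so they can be used directly wherever a channel-name string is expected.
assert DataChannel.WISE_W1 == "W1"
print([channel.value for channel in DataChannel])  # ['W1', 'W2']
```
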
galaxy.normalize.get_channel_file(file_path, output_folder, channel_idx, fits_idx)
galaxy.normalize.separate_channels(dataset)

Separates the data by frequency so that the mean and std of each frequency channel can later be computed for normalization.
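
A minimal sketch of what such a separation could look like, assuming each dataset element is a cutout whose first axis indexes the frequency channels; the data layout and the returned dict of per-channel statistics are assumptions, not the module's actual implementation:

```python
import numpy as np

def separate_channels_sketch(dataset):
    """Group pixel values by frequency channel so a per-channel mean/std
    can be computed for normalization (assumed data layout)."""
    per_channel = {}
    for cutout in dataset:
        for channel_idx, channel_data in enumerate(cutout):
            per_channel.setdefault(channel_idx, []).append(
                np.asarray(channel_data, dtype=float).ravel())
    # Per-channel statistics used later for normalization.
    stats = {}
    for channel_idx, chunks in per_channel.items():
        values = np.concatenate(chunks)
        stats[channel_idx] = (values.mean(), values.std())
    return stats
```
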

galaxy.normalize.normalize_asym(i_data: numpy.ndarray, p: Tuple[float] = (10**-3, 0.95), n_bins: int = 500, outlier_thr: float = 10**4) → numpy.ndarray

Normalize data with an asymmetrical distribution.

(By fitting a Gaussian curve to the left wing of the distribution.)

Parameters:
  • i_data (np.ndarray) – Data with asymmetrical distribution.

  • p (Tuple[float]) – Probability range for the quantiles.

  • n_bins (int) – Number of bins for histogram.

  • outlier_thr (float) – Threshold for finding the outliers.

Return type:

np.ndarray
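
A minimal sketch of this kind of asymmetric normalization, assuming p clips the data to a quantile range, outlier_thr removes extreme values before histogramming, and the fitted Gaussian's mean and width are used for the final scaling; these details and the (x - mu) / sigma step are assumptions, not the module's actual code:

```python
from typing import Tuple

import numpy as np
from scipy.optimize import curve_fit

def normalize_asym_sketch(i_data: np.ndarray,
                          p: Tuple[float, float] = (10**-3, 0.95),
                          n_bins: int = 500,
                          outlier_thr: float = 10**4) -> np.ndarray:
    # Drop non-finite values and extreme outliers before histogramming.
    data = i_data[np.isfinite(i_data)]
    data = data[np.abs(data) < outlier_thr]

    # Clip to the quantile range given by p.
    lo, hi = np.quantile(data, p)
    clipped = data[(data >= lo) & (data <= hi)]

    # Build a histogram and locate its mode.
    counts, edges = np.histogram(clipped, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    peak = int(np.argmax(counts))

    # Fit a Gaussian to the left wing only (bins up to the mode);
    # assumes the mode does not sit in the very first bins.
    def gauss(x, a, mu, sigma):
        return a * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

    x_left, y_left = centers[:peak + 1], counts[:peak + 1]
    p0 = (counts[peak], centers[peak], clipped.std())
    (_, mu, sigma), _ = curve_fit(gauss, x_left, y_left, p0=p0)

    # Scale the full input with the fitted background level and width.
    return (i_data - mu) / sigma
```
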

galaxy.normalize.load_existing_normalization_values()
galaxy.normalize.store_normalization_values(existing_data)
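
Neither of the two functions above is documented here. Purely as an illustration of a possible load/store round-trip for per-channel normalization values, a hypothetical JSON-based sketch (file name, schema, and return types are assumptions):

```python
import json
from pathlib import Path

# Hypothetical storage location; the real module may use a different file or format.
NORMALIZATION_FILE = Path("normalization_values.json")

def load_existing_normalization_values() -> dict:
    """Return previously stored per-channel mean/std, or an empty dict."""
    if NORMALIZATION_FILE.exists():
        return json.loads(NORMALIZATION_FILE.read_text())
    return {}

def store_normalization_values(existing_data: dict) -> None:
    """Persist the per-channel mean/std so later runs can reuse them."""
    NORMALIZATION_FILE.write_text(json.dumps(existing_data, indent=2))
```
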
galaxy.normalize.normalize(dataset)

TODO: look at data.py. def ddos() grabs cutouts in the following way: the i-th element of the description table (which is the table of train, val, or test elements) is put into the corresponding folder, e.g. train, as train/i.fits.

IDEA: create RANDOM_BASED = "rand_based" in DataSource; all random-based elements belong to this enum member (consider def generate_random_sample). After ddos() has run, everything is ready in create_dataloaders: go to the descriptions/part folder, find the elements whose source is random_based, send these pictures to normalization, calculate the mean and std, and return them to create_dataloaders.
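
A hypothetical sketch of the idea above: a DataSource enum with a RANDOM_BASED member and a helper that filters a description table for random-based elements and computes the mean and std to hand back to create_dataloaders (the table schema and helper names are assumptions):

```python
from enum import Enum

import numpy as np

class DataSource(str, Enum):
    # Proposed member from the idea above; other members omitted here.
    RANDOM_BASED = "rand_based"

def normalization_stats_for_random_based(description, load_cutout):
    """description: iterable of dicts with 'source' and 'path' keys (assumed schema).
    load_cutout: callable returning the pixel array of a FITS cutout."""
    pixels = [np.asarray(load_cutout(row["path"]), dtype=float).ravel()
              for row in description
              if row["source"] == DataSource.RANDOM_BASED]
    values = np.concatenate(pixels)
    # Mean and std to hand back to create_dataloaders for normalization.
    return values.mean(), values.std()
```
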