wass2s submodules

wass2s.was_download module

wass2s.was_transformdata module

wass2s.was_compute_predictand module

class wass2s.was_compute_predictand.WAS_compute_HWSDI

Bases: object

A class to compute the Heat Wave Severity Duration Index (HWSDI). It calculates TXin90 (the 90th percentile of daily maximum temperature) and the annual count of heatwave days, where a heatwave is a run of at least 6 consecutive hot days.

static calculate_TXin90(temperature_data, base_period_start='1961', base_period_end='1990')

Calculate the daily 90th percentile temperature (TXin90) centered on a 5-day window for each calendar day based on the base period.

Parameters:

temperature_data (xarray.DataArray) – Daily maximum temperature with time dimension.
base_period_start (str, optional) – Start year of the base period (default is ‘1961’).
base_period_end (str, optional) – End year of the base period (default is ‘1990’).

Returns:

TXin90 for each day of the year.

Return type:

xarray.DataArray
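The windowed percentile described above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the `tmax_by_year` layout (year mapped to a list of 365 daily maxima), the helper names, the integer base years, and the wrap-around at the year boundary are all assumptions made for illustration.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def txin90_for_day(tmax_by_year, day_index, base_years, half_window=2):
    """90th percentile for one calendar day, pooling a centred 5-day window.

    Values from day_index - 2 .. day_index + 2 are pooled over all base years.
    Wrapping around the year boundary is an assumption of this sketch.
    """
    pooled = []
    for year in base_years:
        series = tmax_by_year[year]
        for offset in range(-half_window, half_window + 1):
            pooled.append(series[(day_index + offset) % len(series)])
    return percentile(pooled, 90)


# Toy data: two base years in which the temperature equals the day index.
tmax_by_year = {
    1961: [float(d) for d in range(365)],
    1962: [float(d) for d in range(365)],
}
threshold = txin90_for_day(tmax_by_year, day_index=100, base_years=[1961, 1962])
print(threshold)  # 102.0
```

Pooling neighbouring days gives each calendar day a larger sample (5 values per base year instead of 1), which makes the percentile estimate less noisy.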

compute(temperature_data, base_period_start='1961', base_period_end='1990', nb_cores=4)

Compute the Heat Wave Severity Duration Index (HWSDI) for each pixel in a given daily temperature DataArray.

Parameters:

temperature_data (xarray.DataArray) – Daily maximum temperature data, coords = (T, Y, X).
base_period_start (str, optional) – Start year of the base period for TXin90 calculation (default is ‘1961’).
base_period_end (str, optional) – End year of the base period for TXin90 calculation (default is ‘1990’).
nb_cores (int, optional) – Number of parallel processes to use (default is 4).

Returns:

HWSDI computed for each pixel.

Return type:

xarray.DataArray

count_hot_days(temperature_data, TXin90)

Count, for each year, the days that fall within runs of at least 6 consecutive days on which the daily maximum temperature is above the 90th percentile.

Parameters:

temperature_data (xarray.DataArray) – Daily maximum temperature with time dimension.
TXin90 (xarray.DataArray) – 90th percentile temperature for each day of the year.

Returns:

Annual count of hot days.

Return type:

xarray.DataArray
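The counting rule can be sketched for a single year and a single pixel as follows. This is an illustrative sketch rather than the library's code; the function name and the plain-list inputs are assumptions.

```python
def count_heatwave_days(tmax, thresholds, min_run=6):
    """Count days that belong to runs of at least `min_run` consecutive hot days.

    A day is "hot" when its temperature is strictly above that day's threshold.
    """
    total = 0
    run = 0
    for temp, limit in zip(tmax, thresholds):
        if temp > limit:
            run += 1
        else:
            if run >= min_run:
                total += run
            run = 0
    if run >= min_run:  # a qualifying run that reaches the end of the series
        total += run
    return total


tmax = [35, 36, 36, 37, 36, 35, 36, 30, 36, 36, 36, 29]
thresholds = [34] * len(tmax)
print(count_heatwave_days(tmax, thresholds))  # 7
```

The first seven days form one qualifying run; the later three-day run is too short and contributes nothing.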

class wass2s.was_compute_predictand.WAS_compute_HWSDI_Seasonal

Bases: object

A class to compute the Heat Wave Severity Duration Index (HWSDI) for a given season.

static calculate_TXin90(temperature_data, base_period_start='1961', base_period_end='1990', season=[6, 7, 8])

Calculate the daily 90th percentile temperature (TXin90) for each calendar day based on the base period, but only considering the specified season.

Parameters:

temperature_data (xarray.DataArray) – Daily maximum temperature with time dimension.
base_period_start (str, optional) – Start year of the base period (default is ‘1961’).
base_period_end (str, optional) – End year of the base period (default is ‘1990’).
season (list, optional) – List of months to include in the calculation (default is [6, 7, 8] for JJA).

Returns:

TXin90 for each day of the selected season.

Return type:

xarray.DataArray

compute(temperature_data, base_period_start='1961', base_period_end='1990', nb_cores=4, season=[6, 7, 8])

Compute the HWSDI for each pixel in a given daily temperature DataArray for a specific season.

Parameters:

temperature_data (xarray.DataArray) – Daily maximum temperature data, coords = (T, Y, X).
base_period_start (str, optional) – Start year of the base period for TXin90 calculation (default is ‘1961’).
base_period_end (str, optional) – End year of the base period for TXin90 calculation (default is ‘1990’).
nb_cores (int, optional) – Number of parallel processes to use (default is 4).
season (list, optional) – List of months to include in the calculation (default is [6, 7, 8] for JJA).

Returns:

HWSDI computed for each pixel for the given season.

Return type:

xarray.DataArray

count_hot_days(temperature_data, TXin90)

Count, for each season, the days that fall within runs of at least 6 consecutive days on which the daily maximum temperature is above the 90th percentile.

Parameters:

temperature_data (xarray.DataArray) – Daily maximum temperature with time dimension.
TXin90 (xarray.DataArray) – 90th percentile temperature for each day of the year.

Returns:

Seasonal count of hot days.

Return type:

xarray.DataArray

class wass2s.was_compute_predictand.WAS_compute_HWSDI_monthly

Bases: object

A class to compute the Heat Wave Severity Duration Index (HWSDI) on a monthly basis. It calculates TXin90 (the 90th percentile of daily maximum temperature) and the count of heatwave days for each month, where a heatwave is a run of at least 6 consecutive hot days.

static calculate_TXin90(temperature_data, base_period_start='1961', base_period_end='1990')

Calculate the monthly 90th percentile temperature (TXin90) centered on a 5-day window for each calendar day based on the base period.

Parameters:

temperature_data (xarray.DataArray) – Daily maximum temperature with time dimension.
base_period_start (str, optional) – Start year of the base period (default is ‘1961’).
base_period_end (str, optional) – End year of the base period (default is ‘1990’).

Returns:

TXin90 for each month of the year.

Return type:

xarray.DataArray

compute(temperature_data, base_period_start='1961', base_period_end='1990', nb_cores=4)

Compute the Monthly Heat Wave Severity Duration Index (HWSDI) for each pixel in a given daily temperature DataArray.

Parameters:

temperature_data (xarray.DataArray) – Daily maximum temperature data, coords = (T, Y, X).
base_period_start (str, optional) – Start year of the base period for TXin90 calculation (default is ‘1961’).
base_period_end (str, optional) – End year of the base period for TXin90 calculation (default is ‘1990’).
nb_cores (int, optional) – Number of parallel processes to use (default is 4).

Returns:

HWSDI computed for each pixel per month.

Return type:

xarray.DataArray

count_hot_days(temperature_data, TXin90)

Count, for each month, the days that fall within runs of at least 6 consecutive days on which the daily maximum temperature is above the 90th percentile.

Parameters:

temperature_data (xarray.DataArray) – Daily maximum temperature with time dimension.
TXin90 (xarray.DataArray) – 90th percentile temperature for each month.

Returns:

Monthly count of hot days.

Return type:

xarray.DataArray

class wass2s.was_compute_predictand.WAS_compute_cessation(user_criteria=None)

Bases: object

A class to compute cessation dates based on soil moisture balance for different regions and criteria, leveraging parallel computation for efficiency.

static adjust_duplicates(series, increment=1e-05): If any values in the Series repeat, nudge them by a tiny increment so that all are unique (to avoid indexing collisions).

cessation_function(x, ijour_dem_cal, idebut, ETP, Cap_ret_maxi, irch_fin): Compute cessation date using soil moisture balance criteria.
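A soil moisture balance of this kind can be sketched as below. This is a hedged illustration, not the library's algorithm: it assumes the soil starts dry at `start_balance` (playing the role of `ijour_dem_cal`), gains daily rainfall, loses `etp` each day, is capped at `max_capacity` (`Cap_ret_maxi`), and that cessation is the first day in the search window (`idebut` to `irch_fin`) on which the reservoir is empty.

```python
import math


def cessation_sketch(rain, start_balance, start_search, etp, max_capacity, end_search):
    """Return the index of the first day in the search window with empty soil.

    All argument names are hypothetical stand-ins for the documented parameters.
    Returns NaN when the soil never empties inside the window.
    """
    soil_water = 0.0  # assumption: the soil is dry when the balance starts
    last_day = min(end_search, len(rain) - 1)
    for day in range(start_balance, last_day + 1):
        # Add rainfall, remove evapotranspiration, keep within [0, capacity].
        soil_water = min(max(soil_water + rain[day] - etp, 0.0), max_capacity)
        if day >= start_search and soil_water == 0.0:
            return day
    return math.nan


# Five wet days fill the reservoir to its 70 mm cap, then it drains at 5 mm/day.
rain = [20.0] * 5 + [0.0] * 30
print(cessation_sketch(rain, 0, 3, 5.0, 70.0, 34))  # 18
```

With a 70 mm reservoir and 5 mm of daily evapotranspiration, the soil needs 14 dry days to empty, which places the cessation at index 18.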

compute(daily_data, nb_cores): Compute cessation dates for each pixel using criteria based on regions.

compute_insitu(daily_df)

static day_of_year(i, dem_rech1): Given a year ‘i’ and a month-day string ‘dem_rech1’ (e.g., ‘07-23’), return the 1-based day of the year.
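The conversion can be reproduced with the standard library. This is a sketch of the documented behaviour, not necessarily how the method is implemented.

```python
from datetime import datetime


def day_of_year(year, month_day):
    """Return the 1-based day of the year for a year and an 'MM-DD' string."""
    return datetime.strptime(f"{year}-{month_day}", "%Y-%m-%d").timetuple().tm_yday


print(day_of_year(2023, "07-23"))  # 204
print(day_of_year(2024, "07-23"))  # 205 (leap year)
```

The result depends on the year because dates after February shift by one day in leap years.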

default_criteria = {
    0: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'date_dry_soil': '01-01', 'end_search': '09-30', 'start_search': '09-01', 'zone_name': 'Sahel100_0mm'},
    1: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'date_dry_soil': '01-01', 'end_search': '10-05', 'start_search': '09-01', 'zone_name': 'Sahel200_100mm'},
    2: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'date_dry_soil': '01-01', 'end_search': '11-10', 'start_search': '09-01', 'zone_name': 'Sahel400_200mm'},
    3: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'date_dry_soil': '01-01', 'end_search': '11-15', 'start_search': '09-15', 'zone_name': 'Sahel600_400mm'},
    4: {'Cap_ret_maxi': 70, 'ETP': 4.5, 'date_dry_soil': '01-01', 'end_search': '11-30', 'start_search': '10-01', 'zone_name': 'Soudan'},
    5: {'Cap_ret_maxi': 70, 'ETP': 4.0, 'date_dry_soil': '01-01', 'end_search': '12-01', 'start_search': '10-15', 'zone_name': 'Golfe_Of_Guinea'},
}

rainf_zone(daily_data)

static transform_cdt(df)

Transform a DataFrame with:

Row 0 = LON
Row 1 = LAT
Row 2 = ELEV
Rows 3+ = daily data (or any date) with ‘ID’ column containing dates.

Returns an xarray DataArray with coords = (T, Y, X), variable = ‘Observation’.

class wass2s.was_compute_predictand.WAS_compute_cessation_dry_spell(user_criteria=None)

Bases: object

A class for computing the longest dry spell length after the onset of a rainy season, based on user-defined criteria.

static adjust_duplicates(series, increment=1e-05): If any values in the Series repeat, nudge them by a tiny increment so that all are unique (to avoid indexing collisions).

compute(daily_data, nb_cores)

Compute the longest dry spell length after the rainy season onset for each pixel in the given daily rainfall DataArray, using different criteria (both for onset and cessation) based on isohyet zones.

Parameters:

daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).
nb_cores (int) – Number of parallel processes (workers) to use.

Returns:

Array with the longest dry spell length per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df)

static day_of_year(i, dem_rech1): Convert year i and MM-DD string dem_rech1 (e.g., ‘07-23’) into a 1-based day of the year.

default_criteria = {
    0: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'cumulative': 10, 'date_dry_soil': '01-01', 'end_search1': '08-15', 'end_search2': '09-30', 'nbjour': 40, 'number_dry_days': 25, 'start_search1': '05-01', 'start_search2': '09-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel100_0mm'},
    1: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'cumulative': 15, 'date_dry_soil': '01-01', 'end_search1': '08-15', 'end_search2': '10-05', 'nbjour': 40, 'number_dry_days': 25, 'start_search1': '05-15', 'start_search2': '09-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel200_100mm'},
    2: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'cumulative': 15, 'date_dry_soil': '01-01', 'end_search1': '07-31', 'end_search2': '11-10', 'nbjour': 40, 'number_dry_days': 20, 'start_search1': '05-01', 'start_search2': '09-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel400_200mm'},
    3: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'cumulative': 20, 'date_dry_soil': '01-01', 'end_search1': '07-31', 'end_search2': '11-15', 'nbjour': 45, 'number_dry_days': 20, 'start_search1': '03-15', 'start_search2': '09-15', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel600_400mm'},
    4: {'Cap_ret_maxi': 70, 'ETP': 4.5, 'cumulative': 20, 'date_dry_soil': '01-01', 'end_search1': '07-31', 'end_search2': '11-30', 'nbjour': 50, 'number_dry_days': 10, 'start_search1': '03-15', 'start_search2': '10-01', 'thrd_rain_day': 0.85, 'zone_name': 'Soudan'},
    5: {'Cap_ret_maxi': 70, 'ETP': 4.0, 'cumulative': 20, 'date_dry_soil': '01-01', 'end_search1': '06-15', 'end_search2': '12-01', 'nbjour': 50, 'number_dry_days': 10, 'start_search1': '02-01', 'start_search2': '10-15', 'thrd_rain_day': 0.85, 'zone_name': 'Golfe_Of_Guinea'},
}

dry_spell_cessation_function(x, idebut1, cumul, nbsec, jour_pluvieux, irch_fin1, idebut2, ijour_dem_cal, ETP, Cap_ret_maxi, irch_fin2, nbjour)

Computes the longest dry spell length after the onset and determines the cessation date (when soil water returns to 0) based on water balance, then checks for a dry spell.

Parameters:

x (array-like) – Daily rainfall or similar values.
idebut1 (int) – Start index to begin searching for the onset.
cumul (float) – Cumulative rainfall threshold to trigger onset.
nbsec (int) – Maximum number of dry days allowed in the sequence.
jour_pluvieux (float) – Minimum rainfall to consider a day as rainy.
irch_fin1 (int) – Maximum index limit for the onset search.
idebut2 (int) – Start index for the cessation search.
ijour_dem_cal (int) – Start index from which the water balance is calculated.
ETP (float) – Daily evapotranspiration (mm).
Cap_ret_maxi (float) – Maximum soil water retention capacity (mm).
irch_fin2 (int) – Maximum index limit for the cessation search.
nbjour (int) – Number of days after onset to check for the dry spell.

Returns:

Length of the longest dry spell sequence after onset and before soil water returns to zero, or NaN if not found.

Return type:

float

rainf_zone(daily_data)

static transform_cdt(df)

Transform a DataFrame with:

Row 0 = LON
Row 1 = LAT
Row 2 = ELEV
Rows 3+ = daily data (or any date) with ‘ID’ column containing dates.

Returns an xarray DataArray with coords = (T, Y, X), variable = ‘Observation’.

class wass2s.was_compute_predictand.WAS_compute_onset(user_criteria=None)

Bases: object

A class that encapsulates methods for transforming precipitation data from different formats (CPT, CDT) and computing onset dates based on rainfall criteria.

static adjust_duplicates(series, increment=1e-05): If any values in the Series repeat, nudge them by a tiny increment so that all are unique (to avoid indexing collisions).

compute(daily_data, nb_cores)

Compute onset dates for each pixel in a given daily rainfall DataArray using different criteria based on isohyet zones.

Parameters:

daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).
nb_cores (int) – Number of parallel processes to use.

Returns:

Array with onset dates computed per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df)

static day_of_year(i, dem_rech1): Given a year ‘i’ and a month-day string ‘dem_rech1’ (e.g., ‘07-23’), return the day of the year (1-based).

default_criteria = {
    0: {'cumulative': 10, 'end_search': '08-30', 'number_dry_days': 25, 'start_search': '06-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel100_0mm'},
    1: {'cumulative': 15, 'end_search': '08-15', 'number_dry_days': 25, 'start_search': '05-15', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel200_100mm'},
    2: {'cumulative': 15, 'end_search': '07-31', 'number_dry_days': 20, 'start_search': '05-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel400_200mm'},
    3: {'cumulative': 20, 'end_search': '07-31', 'number_dry_days': 20, 'start_search': '03-15', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel600_400mm'},
    4: {'cumulative': 20, 'end_search': '07-31', 'number_dry_days': 10, 'start_search': '03-15', 'thrd_rain_day': 0.85, 'zone_name': 'Soudan'},
    5: {'cumulative': 20, 'end_search': '06-15', 'number_dry_days': 10, 'start_search': '02-01', 'thrd_rain_day': 0.85, 'zone_name': 'Golfe_Of_Guinea'},
}

onset_function(x, idebut, cumul, nbsec, jour_pluvieux, irch_fin)

Calculate the onset date of a season based on cumulative rainfall criteria.

Parameters:

x (array-like) – Daily rainfall or similar values.
idebut (int) – Start index to begin searching for the onset.
cumul (float) – Cumulative rainfall threshold to trigger onset.
nbsec (int) – Maximum number of dry days allowed in the sequence.
jour_pluvieux (float) – Minimum rainfall to consider a day as rainy.
irch_fin (int) – Maximum index limit for the onset.

Returns:

Index of the onset date or NaN if onset not found.

Return type:

int or float
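One plausible reading of these parameters is sketched below. It is an assumption-laden illustration, not the library's algorithm: the 3-day accumulation window (`accum_days`) and the 30-day follow-up window (`check_days`) are hypothetical values that do not appear in this documentation, and the argument names are stand-ins for `idebut`, `cumul`, `nbsec`, `jour_pluvieux` and `irch_fin`.

```python
import math


def onset_sketch(rain, start, cumul, max_dry_days, wet_threshold, end,
                 accum_days=3, check_days=30):
    """Return the first index that satisfies a cumulative-rainfall onset rule.

    A candidate day must be rainy, start an `accum_days` window totalling at
    least `cumul`, and not be followed within `check_days` by a dry spell
    longer than `max_dry_days`. Returns NaN if no day qualifies.
    """
    last_day = min(end, len(rain) - 1)
    for day in range(start, last_day + 1):
        if rain[day] < wet_threshold or sum(rain[day:day + accum_days]) < cumul:
            continue
        dry_run = 0
        false_start = False
        for value in rain[day + 1:day + 1 + check_days]:
            dry_run = dry_run + 1 if value < wet_threshold else 0
            if dry_run > max_dry_days:
                false_start = True  # too long a dry spell after the candidate
                break
        if not false_start:
            return day
    return math.nan


# Ten dry days, a wet burst, then regular rain every third day.
rain = [0.0] * 10 + [12.0, 5.0, 6.0] + [0.0, 0.0, 3.0] * 10
print(onset_sketch(rain, 0, 20, 7, 0.85, 40))  # 10
```

The follow-up check is what separates a true onset from a false start, where an early wet burst is followed by a long dry spell.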

rainf_zone(daily_data)

static transform_cdt(df)

Transform a DataFrame with:

Row 0 = LON
Row 1 = LAT
Row 2 = ELEV
Rows 3+ = daily data (or any date) with ‘ID’ column containing dates.

Returns an xarray DataArray with coords = (T, Y, X), variable = ‘Observation’.

class wass2s.was_compute_predictand.WAS_compute_onset_dry_spell(user_criteria=None)

Bases: object

A class for computing the longest dry spell length after the onset of a rainy season, based on user-defined criteria.

static adjust_duplicates(series, increment=1e-05): If any values in the Series repeat, nudge them by a tiny increment so that all are unique (to avoid indexing collisions).

compute(daily_data, nb_cores)

Compute the longest dry spell length after the onset for each pixel in a given daily rainfall DataArray, using different criteria based on isohyet zones.

Parameters:

daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).
nb_cores (int) – Number of parallel processes to use.

Returns:

Array with the longest dry spell length per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df)

static day_of_year(i, dem_rech1): Given a year ‘i’ and a month-day string ‘dem_rech1’ (e.g., ‘07-23’), return the 1-based day of the year.

default_criteria = {
    0: {'cumulative': 10, 'end_search': '08-30', 'nbjour': 40, 'number_dry_days': 25, 'start_search': '06-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel100_0mm'},
    1: {'cumulative': 15, 'end_search': '08-15', 'nbjour': 40, 'number_dry_days': 25, 'start_search': '05-15', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel200_100mm'},
    2: {'cumulative': 15, 'end_search': '07-31', 'nbjour': 40, 'number_dry_days': 20, 'start_search': '05-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel400_200mm'},
    3: {'cumulative': 20, 'end_search': '07-31', 'nbjour': 45, 'number_dry_days': 20, 'start_search': '03-15', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel600_400mm'},
    4: {'cumulative': 20, 'end_search': '07-31', 'nbjour': 50, 'number_dry_days': 10, 'start_search': '03-15', 'thrd_rain_day': 0.85, 'zone_name': 'Soudan'},
    5: {'cumulative': 20, 'end_search': '06-15', 'nbjour': 50, 'number_dry_days': 10, 'start_search': '02-01', 'thrd_rain_day': 0.85, 'zone_name': 'Golfe_Of_Guinea'},
}

dry_spell_onset_function(x, idebut, cumul, nbsec, jour_pluvieux, irch_fin, nbjour): Calculate the onset date of a season based on cumulative rainfall criteria, and determine the longest dry spell sequence within a specified period after the onset.

dry_spell_onset_function_(x, idebut, cumul, nbsec, jour_pluvieux, irch_fin, nbjour)

Calculate the onset date of a season based on cumulative rainfall criteria, and determine the longest dry spell sequence within a specified period after the onset.

Parameters:

x (array-like) – Daily rainfall or similar values.
idebut (int) – Start index to begin searching for the onset.
cumul (float) – Cumulative rainfall threshold to trigger onset.
nbsec (int) – Maximum number of dry days allowed in the sequence.
jour_pluvieux (float) – Minimum rainfall to consider a day as rainy.
irch_fin (int) – Maximum index limit for the onset.
nbjour (int) – Number of days to check for the longest dry spell after onset.

Returns:

Length of the longest dry spell sequence after onset or NaN if onset not found.

Return type:

float
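Once an onset index is known, the dry-spell part of the calculation reduces to a run-length scan. The sketch below assumes the window starts on the day after onset and covers `nb_days` days; both choices, along with the function and argument names, are illustrative assumptions rather than confirmed library behaviour.

```python
def longest_dry_spell(rain, onset, nb_days, wet_threshold):
    """Length of the longest run of dry days in the `nb_days` after `onset`.

    A day is dry when its rainfall is below `wet_threshold`.
    """
    longest = 0
    current = 0
    for value in rain[onset + 1:onset + 1 + nb_days]:
        if value < wet_threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


rain = [10, 0, 0, 5, 0, 0, 0, 0, 2, 0]
print(longest_dry_spell(rain, onset=0, nb_days=8, wet_threshold=0.85))  # 4
```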

rainf_zone(daily_data)

static transform_cdt(df)

Transform a DataFrame with:

Row 0 = LON
Row 1 = LAT
Row 2 = ELEV
Rows 3+ = daily data (or any date) with ‘ID’ column containing dates.

Returns an xarray DataArray with coords = (T, Y, X), variable = ‘Observation’.

class wass2s.was_compute_predictand.WAS_count_dry_spells

Bases: object

A class to compute the number of dry spells within a specified period (onset to cessation) for each pixel or station in a daily rainfall dataset.

static adjust_duplicates(series, increment=1e-05): If any values in the Series repeat, nudge them by a tiny increment so that all are unique (to avoid indexing collisions).

compute(daily_data, onset_date, cessation_date, dry_spell_length, dry_threshold, nb_cores)

Compute the number of dry spells for each pixel within the onset and cessation period in a daily xarray DataArray.

Parameters:

daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).
onset_date (xarray.DataArray) – DataArray containing onset dates for each pixel.
cessation_date (xarray.DataArray) – DataArray containing cessation dates for each pixel.
dry_spell_length (int) – The length of a dry spell to count.
dry_threshold (float) – Rainfall threshold to classify a day as “dry.”
nb_cores (int) – Number of parallel processes to use.

Returns:

An array with the count of dry spells per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df, onset_df_cpt, cessation_df_cpt, dry_spell_length, dry_threshold=1.0)

Compute the number of dry spells (of length = dry_spell_length) between the onset and cessation dates for in-situ stations (CDT format).

Returns a DataFrame in CPT format:

Row 0: [“LAT”, lat_stn1, lat_stn2, …]
Row 1: [“LON”, lon_stn1, lon_stn2, …]
Subsequent rows: [year, station1_value, station2_value, …]

Parameters:

daily_df (pd.DataFrame) – CDT rainfall data (ID column = date, station columns).
onset_df_cpt (pd.DataFrame) – CPT-format DataFrame containing onset dates (for example, the output of WAS_compute_onset.compute_insitu).
cessation_df_cpt (pd.DataFrame) – CPT-format DataFrame containing cessation dates.
dry_spell_length (int) – The length of the dry spell to look for.
dry_threshold (float, optional) – Rainfall threshold below which a day is considered “dry.” Defaults to 1.0 mm.

Returns:

Final dry-spell counts in CPT pivot format.

Return type:

pd.DataFrame

static count_dry_spells(x, onset, cessation, dry_spell_length, dry_threshold)

Count the number of dry spells of a specific length between onset and cessation dates.

Parameters:

x (array-like) – Daily rainfall values.
onset (int) – Start index for the calculation (onset date).
cessation (int) – End index for the calculation (cessation date).
dry_spell_length (int) – The length of a dry spell to count.
dry_threshold (float) – Rainfall threshold to classify a day as “dry.”

Returns:

The number of dry spells of the specified length (NaN if invalid).

Return type:

int or float
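A minimal sketch of this kind of count follows. It assumes that a spell is a maximal run of dry days and that runs of at least `spell_length` days are counted once each; whether the library counts exact-length or minimum-length runs is not stated here, so treat that choice, the inclusive cessation index, and the function name as assumptions.

```python
import math


def count_dry_spells_sketch(rain, onset, cessation, spell_length, dry_threshold):
    """Count maximal dry runs of at least `spell_length` days in [onset, cessation].

    A day is dry when rainfall is below `dry_threshold`. Returns NaN when either
    bound is NaN or the period is empty.
    """
    if math.isnan(float(onset)) or math.isnan(float(cessation)):
        return math.nan
    if cessation <= onset:
        return math.nan
    count = 0
    run = 0
    for value in rain[int(onset):int(cessation) + 1]:
        if value < dry_threshold:
            run += 1
        else:
            if run >= spell_length:
                count += 1
            run = 0
    if run >= spell_length:  # a dry run that lasts until cessation
        count += 1
    return count


rain = [5, 0, 0, 0, 4, 0, 0, 6, 0, 0, 0, 0, 3]
print(count_dry_spells_sketch(rain, 0, 12, 3, 1.0))  # 2
```

The dry runs in the toy series have lengths 3, 2 and 4, so two of them meet the 3-day minimum.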

static transform_cdt(df)

Transform a DataFrame with:

Row 0 = LON
Row 1 = LAT
Row 2 = ELEV
Rows 3+ = daily data with ‘ID’ column containing dates.

Returns a DataFrame with columns like:

static transform_cpt(df, missing_value=None)

Transform a DataFrame in CPT format with:

Row 0 = LAT
Row 1 = LON
Rows 2+ = numeric year data in wide format (stations in columns).

Returns a DataFrame with columns like:

YEAR | STATION | VALUE | LAT | LON
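The wide-to-long reshaping can be illustrated without pandas. In this sketch the `stations` list and the `rows` list of lists are hypothetical stand-ins for the DataFrame's station columns and body; the real method operates on a DataFrame and may differ in detail.

```python
def cpt_wide_to_long(stations, rows):
    """Flatten a CPT-style wide table into one record per (year, station).

    rows[0] holds latitudes, rows[1] longitudes, and each later row holds a
    year followed by one value per station.
    """
    lat_row, lon_row, year_rows = rows[0], rows[1], rows[2:]
    records = []
    for year_row in year_rows:
        year = int(year_row[0])
        for col, station in enumerate(stations, start=1):
            records.append({
                "YEAR": year,
                "STATION": station,
                "VALUE": year_row[col],
                "LAT": lat_row[col],
                "LON": lon_row[col],
            })
    return records


stations = ["ST1", "ST2"]
rows = [
    ["LAT", 12.5, 13.1],
    ["LON", -1.5, 2.1],
    [1991, 3, 5],
    [1992, 4, 2],
]
long_records = cpt_wide_to_long(stations, rows)
print(long_records[0])
```

Each output record carries the station coordinates alongside the value, so later steps can filter or group by year or station without going back to the header rows.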

class wass2s.was_compute_predictand.WAS_count_rainy_days

Bases: object

A class to compute the number of rainy days between onset and cessation dates for each pixel or station in a daily rainfall dataset.

compute(daily_data, onset_date, cessation_date, rain_threshold, nb_cores)

Compute the number of rainy days for each pixel between onset and cessation dates.

Parameters:

daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).
onset_date (xarray.DataArray) – DataArray containing onset dates for each pixel.
cessation_date (xarray.DataArray) – DataArray containing cessation dates for each pixel.
rain_threshold (float) – Rainfall threshold to classify a day as “rainy.”
nb_cores (int) – Number of parallel processes to use.

Returns:

Array with the count of rainy days per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df, onset_df_cpt, cessation_df_cpt, rain_threshold=0.85)

Compute, for in-situ stations (CDT data), the number of rainy days between onset and cessation, for each station and year.

Parameters:

daily_df (pd.DataFrame) – CDT precipitation data (ID column = date; columns = stations). Follows the standard CDT format.
onset_df_cpt (pd.DataFrame) – Result of WAS_compute_onset.compute_insitu(…) for onset (CPT format).
cessation_df_cpt (pd.DataFrame) – Same format for cessation (CPT format).
rain_threshold (float, optional) – Precipitation threshold for counting a day as “rainy,” by default 0.85 mm.

Returns:

df_final – The count of rainy days in CPT pivot format.

Return type:

pd.DataFrame

static count_rainy_days(x, onset_date, cessation_date, rain_threshold)

Count the number of rainy days between onset and cessation dates.

Parameters:

x (array-like) – Daily rainfall values.
onset_date (int) – Start index for the calculation (onset date).
cessation_date (int) – End index for the calculation (cessation date).
rain_threshold (float) – Rainfall threshold to classify a day as “rainy.”

Returns:

Number of rainy days (returns NaN if data is invalid).

Return type:

int or float

static transform_cdt(df)

Transform a DataFrame in CDT format into a standardized long DataFrame.

CDT format assumptions:

Row 0 = LON
Row 1 = LAT
Row 2 = ELEV
Rows 3+ = daily data with ‘ID’ column holding dates in YYYYMMDD format.

This method returns a DataFrame with columns:

DATE, STATION, VALUE, LON, LAT, ELEV, (optional) MEAN_ANNUAL_RAINFALL, zonename

class wass2s.was_compute_predictand.WAS_count_wet_spells

Bases: object

A class to compute the number of wet spells within a specified period (onset to cessation) for each pixel or station in a daily rainfall dataset.

compute(daily_data, onset_date, cessation_date, wet_spell_length, wet_threshold, nb_cores)

Compute the number of wet spells for each pixel within the onset and cessation period in a daily xarray DataArray.

Parameters:

daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).
onset_date (xarray.DataArray) – DataArray containing onset dates for each pixel.
cessation_date (xarray.DataArray) – DataArray containing cessation dates for each pixel.
wet_spell_length (int) – The length of a wet spell to count.
wet_threshold (float) – Rainfall threshold to classify a day as “wet.”
nb_cores (int) – Number of parallel processes to use.

Returns:

Array with the count of wet spells per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df, onset_df_cpt, cessation_df_cpt, wet_spell_length, wet_threshold=1.0)

Compute the number of wet spells (of length = wet_spell_length) between onset and cessation for in-situ stations (CDT data).

Returns a DataFrame in CPT format:

Row 0: [“LAT”, lat_station1, lat_station2, …]
Row 1: [“LON”, lon_station1, lon_station2, …]
Then one row per year: [year, station1_value, station2_value, …]

Parameters:

daily_df (pd.DataFrame) – CDT rainfall data (ID column = date, station columns).
onset_df_cpt (pd.DataFrame) – CPT-format DataFrame with onset dates (same station columns).
cessation_df_cpt (pd.DataFrame) – CPT-format DataFrame with cessation dates (same station columns).
wet_spell_length (int) – The length of a wet spell to count.
wet_threshold (float, optional) – Rainfall threshold classifying a day as “wet.” Defaults to 1.0 mm.

Returns:

Final wet-spell counts in CPT pivot format.

Return type:

pd.DataFrame

static count_wet_spells(x, onset_date, cessation_date, wet_spell_length, wet_threshold)

Count the number of wet spells of a specific length between onset and cessation dates.

Parameters:

x (array-like) – Daily rainfall values.
onset_date (int) – Start index for the calculation (onset date).
cessation_date (int) – End index for the calculation (cessation date).
wet_spell_length (int) – The length of a wet spell to count.
wet_threshold (float) – Rainfall threshold to classify a day as “wet.”

Returns:

The number of wet spells of the specified length (NaN if data is invalid).

Return type:

int or float

static transform_cdt(df)

Transform a CDT-format DataFrame into a standard table.

CDT format assumptions:

Row 0 = LON
Row 1 = LAT
Row 2 = ELEV
Rows 3+ = daily data, ‘ID’ column has dates in YYYYMMDD.

Returns a DataFrame with columns:

[DATE, STATION, VALUE, LON, LAT, ELEV, MEAN_ANNUAL_RAINFALL, zonename]

class wass2s.was_compute_predictand.WAS_r95_99p(base_period: slice, season: list = None)

Bases: object

A class to compute the R95p and R99p climate indices using either:

Dask-enabled xarray for large raster/time-series data
An “insitu” method for station-based (CDT) data

compute_insitu_r95p(df_cdt: DataFrame) → DataFrame

Compute R95p index (total precipitation on days above the daily 95th percentile) for station-based data in CDT format.

Parameters:

df_cdt (pd.DataFrame) – CDT-format DataFrame (rows 0..2 = LON/LAT/ELEV, rows 3+ = daily data).

Returns:

df_final – A DataFrame in CPT format with the R95p values pivoted by station vs. year.

Return type:

pd.DataFrame
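The R95p idea can be sketched for a single station as follows. This is an illustration under stated assumptions, not the class's implementation: the threshold is taken as the 95th percentile of wet days (at least 1 mm) in the base period, which follows the common definition of the index but is not confirmed by this documentation, and the `daily_by_year` layout and helper names are hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def r95p_sketch(daily_by_year, base_years, wet_threshold=1.0, q=95):
    """Yearly precipitation total on days above the base-period percentile."""
    wet_base = [p for year in base_years
                for p in daily_by_year[year] if p >= wet_threshold]
    limit = percentile(wet_base, q)
    return {year: sum(p for p in values if p > limit)
            for year, values in daily_by_year.items()}


daily_by_year = {
    2000: [0, 2, 4, 6, 8, 10],
    2001: [0, 1, 20, 3, 0, 12],
}
totals = r95p_sketch(daily_by_year, base_years=[2000])
print(totals)  # {2000: 10, 2001: 32}
```

Changing `q` to 99 gives the corresponding R99p sketch.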

compute_insitu_r99p(df_cdt: DataFrame) → DataFrame

Compute R99p index (total precipitation on days above the daily 99th percentile) for station-based data in CDT format.

Parameters:

df_cdt (pd.DataFrame) – CDT-format DataFrame.

Returns:

df_final – A DataFrame in CPT format.

Return type:

pd.DataFrame

compute_r95p(pr: DataArray) → DataArray: Compute the R95p index for xarray-based (gridded) data.

compute_r99p(pr: DataArray) → DataArray: Compute the R99p index for xarray-based (gridded) data.

static transform_cdt(df)

Transform a DataFrame in CDT format into a standardized long DataFrame.

CDT format assumptions:

Row 0 = LON
Row 1 = LAT
Row 2 = ELEV
Rows 3+ = daily data with ‘ID’ column holding dates in YYYYMMDD format.

Returns a DataFrame with columns:

[DATE, STATION, VALUE, LON, LAT, ELEV]

wass2s.was_bias_correction module

class wass2s.was_bias_correction.WAS_Qmap

Bases: object

Bias correction methods using quantile mapping techniques, adapted from the qmap R package.

This class provides static methods for fitting and applying various bias correction techniques, including empirical quantile mapping (QUANT), robust quantile mapping (RQUANT), smoothing splines (SSPLIN), parametric transformations (PTF), and distribution-based methods (DIST). The methods support both NumPy arrays (1D, 2D, or 3D) and xarray DataArrays (3D: T, Y, X or similar).

All methods can optionally handle wet/dry day corrections and are designed for precipitation or similar non-negative variables.

Notes

Inputs are expected to be non-negative.
For gridded data, computations are performed column-wise (per grid cell).
xarray support preserves coordinates and attributes.

static doQmap(x, fobj, **kwargs)

Apply the fitted bias correction to new data.

Parameters:

x (array_like or xarray.DataArray) – New modeled data to correct, same format and shape structure as fitting data.
fobj (dict) – Fitted object from fitQmap.
**kwargs – Additional keyword arguments passed to the specific application method.

Returns:

Bias-corrected data, same type and shape as x.

Return type:

array_like or xarray.DataArray

Raises:

ValueError – If the input type or dimensions do not match the fitted object.

See also

fitQmap: Fit the bias correction model.

static doQmapDIST(x, fobj)

Apply distribution-based (DIST) correction.

Parameters:

x (ndarray) – 2D data to correct (time, grid).
fobj (dict) – Fitted object from fitQmapDIST.

Returns:

Corrected data.

Return type:

ndarray

static doQmapPTF(x, fobj)

Apply parametric transformation (PTF) correction.

Parameters:

x (ndarray) – 2D data to correct (time, grid).
fobj (dict) – Fitted object from fitQmapPTF.

Returns:

Corrected data.

Return type:

ndarray

Raises:

ValueError – If fobj contains an unknown transfun.

static doQmapQUANT(x, fobj, type='linear')

Apply empirical quantile mapping (QUANT) correction.

Parameters:

x (ndarray) – 2D data to correct (time, grid).
fobj (dict) – Fitted object from fitQmapQUANT.
type (str, optional) – Interpolation kind for interp1d (default ‘linear’).

Returns:

Corrected data.

Return type:

ndarray
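Empirical quantile mapping for a single grid cell can be sketched in pure Python as below. This illustrates the general technique, not WAS_Qmap's code: the `modq`/`fitq` dictionary keys, the number of quantiles, and the constant-offset treatment of values outside the fitted range are assumptions made for the example.

```python
from bisect import bisect_right


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def fit_quantiles(obs, mod, n_quantiles=11):
    """Store matching quantiles of the modeled and observed samples."""
    percents = [100.0 * i / (n_quantiles - 1) for i in range(n_quantiles)]
    return {"modq": [percentile(mod, p) for p in percents],
            "fitq": [percentile(obs, p) for p in percents]}


def apply_quantile_map(x, fobj):
    """Map each value through the piecewise-linear modq -> fitq transfer function."""
    modq, fitq = fobj["modq"], fobj["fitq"]
    corrected = []
    for value in x:
        if value <= modq[0]:       # below the fitted range: constant offset
            corrected.append(value + fitq[0] - modq[0])
        elif value >= modq[-1]:    # above the fitted range: constant offset
            corrected.append(value + fitq[-1] - modq[-1])
        else:
            hi = bisect_right(modq, value)  # first quantile above the value
            lo = hi - 1
            weight = (value - modq[lo]) / (modq[hi] - modq[lo])
            corrected.append(fitq[lo] + weight * (fitq[hi] - fitq[lo]))
    return corrected


obs = [0, 2, 4, 6, 8, 10]
mod = [0, 1, 2, 3, 4, 5]   # the toy model underestimates by half
fobj = fit_quantiles(obs, mod, n_quantiles=6)
print(apply_quantile_map([2.5, 5, 7], fobj))  # [5.0, 10, 12]
```

Inside the fitted range the correction doubles the modeled value, matching the bias built into the toy data; outside it, the offset at the nearest end is reused.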

static doQmapRQUANT(x, fobj, type='linear', slope_bound=[0, inf])

Apply robust quantile mapping (RQUANT) correction.

Parameters:

x (ndarray) – 2D data to correct (time, grid).
fobj (dict) – Fitted object from fitQmapRQUANT.
type (str, optional) – Interpolation kind for interp1d (default ‘linear’).
slope_bound (list of float, optional) – Bounds for extrapolation slopes [min, max] (default [0, inf]).

Returns:

Corrected data.

Return type:

ndarray

static doQmapSSPLIN(x, fobj)

Apply quantile mapping correction using a PCHIP interpolator.

Parameters:

x (ndarray) – 2D data to correct (time, grid).
fobj (dict) – Fitted object from fitQmapSSPLIN.

Returns:

Corrected data.

Return type:

ndarray

static evaluate_bias_correction(obs, mod, corrected, wet_threshold=0.1, extreme_quantiles=[0.95, 0.99])

Evaluate bias correction performance with metrics for dry/wet days and extremes.

Parameters:

obs (array_like or xarray.DataArray) – Observed data, shape (T, Y, X) or (T,).
mod (array_like or xarray.DataArray) – Modeled (uncorrected) data (same shape as obs).
corrected (array_like or xarray.DataArray) – Bias-corrected data (same shape as obs).
wet_threshold (float, optional) – Threshold for wet days (default 0.1 mm).
extreme_quantiles (list of float, optional) – List of quantiles for extremes (default [0.95, 0.99]).

Returns:

A dictionary with evaluation metrics, or xarray.Dataset if input is DataArray.

Return type:

dict or xarray.Dataset

static fitQmap(obs, mod, method, **kwargs)

Fit a bias correction model using the specified quantile mapping method.

Parameters:

obs (array_like or xarray.DataArray) – Observed data. If array_like, can be 1D (time), 2D (time, grid), or 3D (T, Y, X). If xarray.DataArray, must be 3D with dimensions (T, Y, X).
mod (array_like or xarray.DataArray) – Modeled data to fit against, same shape as obs.
method (str) – Bias correction method. Options: ‘QUANT’, ‘RQUANT’, ‘SSPLIN’, ‘PTF’, ‘DIST’ (case-insensitive).
**kwargs – Additional keyword arguments passed to the specific fitting method.

Returns:

Fitted object containing parameters, class identifier, and metadata for applying correction.

Return type:

dict

Raises:

ValueError – If shapes mismatch, invalid dimensions, or unknown method.

See also

doQmap: Apply the fitted bias correction to new data.

static fitQmapDIST(obs, mod, distr='berngamma', qstep=None, **kwargs)

Fit distribution-based quantile mapping (DIST).

Uses Bernoulli for wet/dry and a continuous distribution for wet values.

Parameters:

obs (ndarray) – 2D observed data (time, grid).
mod (ndarray) – 2D modeled data (time, grid).
distr (str, optional) – Wet distribution: ‘berngamma’, ‘bernexp’, ‘bernlnorm’, ‘bernweibull’ (default ‘berngamma’).
qstep (float, optional) – If set, fit on quantiles; else on full data.
**kwargs – Additional arguments (unused).

Returns:

Fitted parameters including distributions and transfer functions.

Return type:

dict

Raises:

ValueError – If unknown distr.
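
To make the "Bernoulli for wet/dry plus a continuous distribution for wet values" idea concrete, the sketch below uses the simplest case, a Bernoulli-exponential mixture, with NumPy only. All function names are hypothetical, and the parameterisation (wet probability and exponential rate estimated from the sample) is an assumption; the library's 'bernexp' option may be fitted differently.

```python
import numpy as np

def fit_bernexp(values):
    """Estimate wet probability and exponential rate of the wet amounts."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]
    wet = values[values > 0]
    return wet.size / values.size, 1.0 / wet.mean()

def bernexp_cdf(x, p_wet, rate):
    """Mixture CDF: point mass (1 - p_wet) at zero, exponential above zero."""
    x = np.asarray(x, dtype=float)
    wet_part = (1 - p_wet) + p_wet * (1 - np.exp(-rate * np.maximum(x, 0.0)))
    return np.where(x > 0, wet_part, 1 - p_wet)

def bernexp_ppf(u, p_wet, rate):
    """Inverse of the mixture CDF; probabilities in the dry mass map to zero."""
    u = np.asarray(u, dtype=float)
    wet_u = np.clip((u - (1 - p_wet)) / p_wet, 0.0, 1.0 - 1e-12)
    return np.where(u > 1 - p_wet, -np.log1p(-wet_u) / rate, 0.0)

def map_bernexp(x, mod_par, obs_par):
    """Transfer function: observed inverse CDF of the modeled CDF."""
    return bernexp_ppf(bernexp_cdf(x, *mod_par), *obs_par)

rng = np.random.default_rng(1)
obs = rng.exponential(4.0, size=1000)
obs[::3] = 0.0                  # one third of the days are dry
mod = 2.0 * obs                 # synthetic model: twice too wet
corrected = map_bernexp(mod, fit_bernexp(mod), fit_bernexp(obs))
```

In this synthetic case the wet fractions match and the bias is a pure scale factor, so the mapping returns the observed amounts and keeps dry days dry.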

static fitQmapPTF(obs, mod, transfun='power', parini=None, cost='RSS', wet_day=False, qstep=None)

Fit parametric transformation function (PTF) for bias correction.

Parameters:

obs (ndarray) – 2D observed data (time, grid).
mod (ndarray) – 2D modeled data (time, grid).
transfun (str or callable, optional) – Transformation function: ‘power’, ‘power.x0’, ‘expasympt’, ‘expasympt.x0’, ‘scale’, ‘linear’, or a custom callable (default ‘power’).
parini (list, optional) – Initial parameter guesses for optimization.
cost (str, optional) – Cost function for optimization: ‘RSS’ or ‘MAE’ (default ‘RSS’).
wet_day (bool or float, optional) – Wet day handling (default False).
qstep (float, optional) – If set, fit on quantiles with this step; else fit on sorted data.

Returns:

Fitted parameters including transformation and thresholds.

Return type:

dict

Raises:

ValueError – If unknown transfun or cost.
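
A parametric transfer function can be sketched for the simplest case. The example below assumes that the 'linear' option has the form obs = a + b * mod and that it is fitted on quantile pairs with a residual-sum-of-squares cost; both are assumptions for illustration, and `fit_linear_ptf` is a hypothetical helper, not part of wass2s.

```python
import numpy as np

def fit_linear_ptf(obs, mod, qstep=0.01):
    """Least-squares fit of obs_q = a + b * mod_q on matched quantiles."""
    probs = np.linspace(0.0, 1.0, int(round(1.0 / qstep)) + 1)
    obs_q = np.nanquantile(obs, probs)
    mod_q = np.nanquantile(mod, probs)
    # polyfit with deg=1 minimises the residual sum of squares (RSS)
    b, a = np.polyfit(mod_q, obs_q, deg=1)
    return a, b

rng = np.random.default_rng(2)
mod = rng.normal(20.0, 5.0, size=2000)
obs = 2.0 + 0.5 * mod            # synthetic "truth": a = 2, b = 0.5
a, b = fit_linear_ptf(obs, mod)
corrected = a + b * mod
```

Non-linear forms such as a power function would need an iterative optimiser and starting values, which is presumably what parini is for.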

static fitQmapQUANT(obs, mod, wet_day=False, qstep=0.01, nboot=1)

Fit empirical quantile mapping (QUANT) with optional bootstrapping.

Parameters:

obs (ndarray) – 2D observed data (time, grid).
mod (ndarray) – 2D modeled data (time, grid).
wet_day (bool or float, optional) – Wet day handling (default False).
qstep (float, optional) – Quantile step size (default 0.01).
nboot (int, optional) – Number of bootstrap samples for observed quantiles (default 1, no bootstrap).

Returns:

Fitted parameters including quantiles and thresholds.

Return type:

dict
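
One common way to bootstrap the observed quantiles is to resample the series with replacement, compute the quantiles of each resample, and average them. The sketch below shows that approach under the assumption that nboot works this way; `bootstrap_quantiles` and its seed argument are hypothetical.

```python
import numpy as np

def bootstrap_quantiles(values, probs, nboot=10, seed=0):
    """Average quantile estimates over nboot resamples drawn with replacement."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]
    if nboot <= 1:
        # No bootstrap: plain empirical quantiles
        return np.quantile(values, probs)
    rng = np.random.default_rng(seed)
    samples = rng.choice(values, size=(nboot, values.size), replace=True)
    # Result of np.quantile has shape (n_probs, nboot); average over resamples
    return np.quantile(samples, probs, axis=1).mean(axis=1)

probs = np.linspace(0.0, 1.0, 101)       # corresponds to qstep = 0.01
rng = np.random.default_rng(3)
series = rng.gamma(2.0, 3.0, size=400)
q_plain = bootstrap_quantiles(series, probs, nboot=1)
q_boot = bootstrap_quantiles(series, probs, nboot=10)
```

Averaging over resamples mainly smooths the extreme quantiles, which are the least stable part of an empirical distribution.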

static fitQmapRQUANT(obs, mod, wet_day=True, qstep=0.01, nlls=10, nboot=10)

Fit robust quantile mapping (RQUANT) with local linear fitting.

Parameters:

obs (ndarray) – 2D observed data (time, grid).
mod (ndarray) – 2D modeled data (time, grid).
wet_day (bool or float, optional) – Wet day handling (default True).
qstep (float, optional) – Quantile step size (default 0.01).
nlls (int, optional) – Number of local quantiles for linear fit (default 10).
nboot (int, optional) – Number of bootstrap samples (default 10).

Returns:

Fitted parameters including quantiles, slopes, and thresholds.

Return type:

dict

static fitQmapSSPLIN(obs, mod, wet_day=False, qstep=0.01)

Fit quantile mapping for the spline-based method (SSPLIN), which is applied with a PCHIP interpolator.

Parameters:

obs (ndarray) – 2D observed data (time, grid).
mod (ndarray) – 2D modeled data (time, grid).
wet_day (bool or float, optional) – Wet day handling (default False).
qstep (float, optional) – Quantile step size (default 0.01).

Returns:

Fitted parameters including quantiles and thresholds.

Return type:

dict

static plot_extreme_quantiles_group(ds, extent=None, robust=True)

Plots extreme_quantiles_* variables, faceting by variable (columns) and quantile (rows). Assumes dims (‘quantile’, ‘Y’, ‘X’).

static plot_fraction_group(ds, group_prefix, extent=None, robust=True)

Plots dry/wet fraction groups (e.g., ‘dry_fraction_’ or ‘wet_fraction_’).

static plot_mean_wet_group(ds, extent=None, robust=True)

Plots mean_wet_* variables.

class wass2s.was_bias_correction.WAS_bias_correction

Bases: object

Bias correction methods for climate variables such as temperature or wind speed.

This class provides static methods for fitting and applying bias correction to continuous variables that may include negative values or skewed positive values. The available techniques are mean adjustment, variance scaling, empirical quantile mapping (non-parametric), and parametric mapping assuming one of several distributions (normal, lognormal, gamma, weibull). The methods support both NumPy arrays (1D, 2D, or 3D) and xarray DataArrays (3D: time, lat, lon or similar).

Notes

Inputs can be negative and are treated as continuous, but for positive skewed data like wind speed, use distributions like ‘lognormal’, ‘gamma’, or ‘weibull’.
No handling for wet/dry days.
For gridded data, computations are performed column-wise (per grid cell).
xarray support preserves coordinates and attributes.
Non-parametric method: ‘QUANT’ (empirical quantile mapping).
Parametric methods: ‘NORM’ (normal), or ‘DIST’ with specified distribution.
NaN handling: NaNs are filtered out before fitting. If a grid cell has fewer than 2 valid points in obs or mod, it is flagged as all_nan and the correction outputs NaN for that cell. NaNs in the data passed for correction propagate to the output.
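
The per-cell NaN rule in the notes above can be sketched as follows. This is an illustration rather than the library code: `valid_series_per_cell` is a hypothetical helper, and filtering obs and mod independently of each other is an assumption.

```python
import numpy as np

def valid_series_per_cell(obs, mod, min_valid=2):
    """Drop NaNs column by column and flag cells with too few valid points."""
    n_grid = obs.shape[1]
    all_nan = np.zeros(n_grid, dtype=bool)
    cleaned = []
    for j in range(n_grid):
        o = obs[:, j][~np.isnan(obs[:, j])]
        m = mod[:, j][~np.isnan(mod[:, j])]
        if o.size < min_valid or m.size < min_valid:
            all_nan[j] = True        # nothing to fit; outputs stay NaN here
            cleaned.append(None)
        else:
            cleaned.append((o, m))
    return cleaned, all_nan

obs = np.array([[1.0, np.nan, 3.0],
                [2.0, np.nan, np.nan],
                [3.0, 5.0, 1.0]])
mod = np.ones((3, 3))
cleaned, all_nan = valid_series_per_cell(obs, mod)
```

Here the middle cell has a single valid observation, so it is flagged, while the other two cells keep their filtered series.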

static doBC(x, fobj, **kwargs)

Apply the fitted bias correction to new data.

Parameters:

x (array_like or xarray.DataArray) – New modeled data to correct, with the same type and dimensionality as the data used for fitting.
fobj (dict) – Fitted object from fitBC.
**kwargs – Additional keyword arguments passed to the specific application method.

Returns:

Bias-corrected data, same type and shape as x.

Return type:

array_like or xarray.DataArray

Raises:

ValueError – If input types or dimensions mismatch the fitted object.

See also

fitBC: Fit the bias correction model.

static doDist(x, fobj)

Apply parametric distribution bias correction.

Parameters:

x (ndarray) – 2D data to correct (time, grid).
fobj (dict) – Fitted object from fitDist.

Returns:

Corrected data.

Return type:

ndarray

static doMean(x, fobj)

Apply mean additive bias correction.

Parameters:

x (ndarray) – 2D data to correct (time, grid).
fobj (dict) – Fitted object from fitMean.

Returns:

Corrected data.

Return type:

ndarray

static doQuant(x, fobj, type='linear')

Apply empirical quantile mapping correction (non-parametric).

Parameters:

x (ndarray) – 2D data to correct (time, grid).
fobj (dict) – Fitted object from fitQuant.
type (str, optional) – Interpolation kind for interp1d (default ‘linear’).

Returns:

Corrected data.

Return type:

ndarray

static doVarscale(x, fobj)

Apply variance scaling bias correction.

Parameters:

x (ndarray) – 2D data to correct (time, grid).
fobj (dict) – Fitted object from fitVarscale.

Returns:

Corrected data.

Return type:

ndarray

static fitBC(obs, mod, method, **kwargs)

Fit a bias correction model using the specified method.

Parameters:

obs (array_like or xarray.DataArray) – Observed data. If array_like, can be 1D (time), 2D (time, grid), or 3D (time, y, x). If xarray.DataArray, must be 3D with dimensions (time, y, x).
mod (array_like or xarray.DataArray) – Modeled data to fit against, same shape as obs.
method (str) – Bias correction method. Options: ‘MEAN’, ‘VARSCALE’, ‘QUANT’, ‘NORM’, ‘DIST’ (case-insensitive).
**kwargs – Additional keyword arguments passed to the specific fitting method. For ‘DIST’, include ‘distr’ (e.g., ‘lognormal’, ‘gamma’, ‘weibull’, default ‘normal’).

Returns:

Fitted object containing parameters, class identifier, and metadata for applying correction.

Return type:

dict

Raises:

ValueError – If shapes mismatch, invalid dimensions, or unknown method.

See also

doBC: Apply the fitted bias correction to new data.
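
The per-method routines in this class take 2D (time, grid) arrays, while fitBC and doBC accept 1D, 2D, or 3D input. One plausible way to bridge the two, shown here as an assumption-laden sketch with hypothetical helper names, is to flatten the spatial dimensions before fitting and restore them afterwards.

```python
import numpy as np

def to_time_grid(arr):
    """Return a (time, grid) view of 1D, 2D, or 3D input plus the original shape."""
    arr = np.asarray(arr, dtype=float)
    if arr.ndim == 1:
        return arr[:, None], arr.shape
    if arr.ndim == 2:
        return arr, arr.shape
    if arr.ndim == 3:
        # Flatten (y, x) into a single grid axis; time stays first
        return arr.reshape(arr.shape[0], -1), arr.shape
    raise ValueError(f"expected 1D, 2D, or 3D data, got {arr.ndim}D")

def from_time_grid(arr2d, original_shape):
    """Undo to_time_grid, allowing a different number of time steps."""
    return arr2d.reshape((arr2d.shape[0],) + tuple(original_shape[1:]))

cube = np.arange(24.0).reshape(4, 3, 2)       # (time, y, x)
flat, shape = to_time_grid(cube)
restored = from_time_grid(flat, shape)
```

Keeping only the spatial part of the original shape when restoring means the data being corrected may cover a different period than the data used for fitting.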

static fitDist(obs, mod, distr='normal')

Fit parametric bias correction assuming a specified distribution.

Suitable for skewed data like wind speed with ‘lognormal’, ‘gamma’, or ‘weibull’.

Parameters:

obs (ndarray) – 2D observed data (time, grid).
mod (ndarray) – 2D modeled data (time, grid).
distr (str, optional) – Distribution: ‘normal’, ‘lognormal’, ‘gamma’, ‘weibull’ (default ‘normal’).

Returns:

Fitted parameters including distribution parameters.

Return type:

dict

Raises:

ValueError – If unknown distribution.
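
For the lognormal case, parametric mapping has a closed form: passing a value through the modeled CDF and then the observed inverse CDF is the same as matching standard-normal scores in log space. The sketch below illustrates this for one grid cell. The function names are hypothetical and the moment-based fit is an assumption, so it should not be read as the library's fitting procedure.

```python
import numpy as np

def fit_lognormal(values):
    """Mean and standard deviation of the log-values (requires values > 0)."""
    values = np.asarray(values, dtype=float)
    logs = np.log(values[~np.isnan(values)])
    return logs.mean(), logs.std(ddof=1)

def map_lognormal(x, mod_par, obs_par):
    """Apply F_obs^{-1}(F_mod(x)) for two lognormal fits."""
    mu_m, sd_m = mod_par
    mu_o, sd_o = obs_par
    z = (np.log(x) - mu_m) / sd_m      # standard-normal score under the model fit
    return np.exp(mu_o + sd_o * z)     # value with the same score under the obs fit

rng = np.random.default_rng(4)
obs = rng.lognormal(mean=1.0, sigma=0.5, size=1000)   # e.g. wind speed
mod = 3.0 * obs ** 1.2                                # synthetic biased model
corrected = map_lognormal(mod, fit_lognormal(mod), fit_lognormal(obs))
```

Gamma and Weibull mappings follow the same CDF-then-inverse-CDF pattern but need a numerical fit and numerical inverse, which this sketch leaves out.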

static fitMean(obs, mod)

Fit mean additive bias correction.

Parameters:

obs (ndarray) – 2D observed data (time, grid).
mod (ndarray) – 2D modeled data (time, grid).

Returns:

Fitted parameters including delta (mean difference).

Return type:

dict
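
Mean additive correction is simple enough to sketch in a few lines. In this illustration the sign convention (delta = mean of obs minus mean of mod) and the function names are assumptions; only the presence of a `delta` entry is taken from the description above.

```python
import numpy as np

def fit_mean(obs, mod):
    """Per-cell mean difference between observations and model."""
    return {"delta": np.nanmean(obs, axis=0) - np.nanmean(mod, axis=0)}

def do_mean(x, fobj):
    """Shift every value of each grid cell by that cell's delta."""
    return x + fobj["delta"]

rng = np.random.default_rng(5)
obs = rng.normal(25.0, 3.0, size=(200, 4))    # (time, grid)
mod = obs + 2.0                               # model is 2 units too warm
fobj = fit_mean(obs, mod)
corrected = do_mean(mod, fobj)
```

The correction only moves the mean; the spread and shape of the modeled distribution are left unchanged.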

static fitQuant(obs, mod, qstep=0.01, nboot=1)

Fit empirical quantile mapping (non-parametric).

Parameters:

obs (ndarray) – 2D observed data (time, grid).
mod (ndarray) – 2D modeled data (time, grid).
qstep (float, optional) – Quantile step size (default 0.01).
nboot (int, optional) – Number of bootstrap samples for observed quantiles (default 1, no bootstrap).

Returns:

Fitted parameters including quantiles.

Return type:

dict

static fitVarscale(obs, mod)

Fit variance scaling bias correction.

Parameters:

obs (ndarray) – 2D observed data (time, grid).
mod (ndarray) – 2D modeled data (time, grid).

Returns:

Fitted parameters including means and standard deviations.

Return type:

dict
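
Variance scaling extends the mean adjustment by also matching the standard deviation. The following sketch shows the usual formula, corrected = (x - mean_mod) * sd_obs / sd_mod + mean_obs, with hypothetical function names and dictionary keys. The library may store its fitted parameters differently.

```python
import numpy as np

def fit_varscale(obs, mod):
    """Per-cell means and standard deviations of obs and mod."""
    return {
        "mean_obs": np.nanmean(obs, axis=0),
        "mean_mod": np.nanmean(mod, axis=0),
        "sd_obs": np.nanstd(obs, axis=0, ddof=1),
        "sd_mod": np.nanstd(mod, axis=0, ddof=1),
    }

def do_varscale(x, f):
    """Centre on the model mean, rescale the spread, shift to the obs mean."""
    return (x - f["mean_mod"]) * (f["sd_obs"] / f["sd_mod"]) + f["mean_obs"]

rng = np.random.default_rng(6)
obs = rng.normal(25.0, 3.0, size=(300, 2))    # (time, grid)
mod = 2.0 * obs + 1.0                         # wrong mean and wrong spread
fobj = fit_varscale(obs, mod)
corrected = do_varscale(mod, fobj)
```

After correction the first two moments agree with the observations, but any difference in skewness or tail behaviour remains, which is where quantile or distribution mapping becomes useful.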
