API Reference

This page contains the API reference for the wass2s package.

Module Documentation

wass2s.was_download module

wass2s.was_transformdata module

wass2s.was_compute_predictand module

class wass2s.was_compute_predictand.ETCCDIHeatWaveIndices[source]

Bases: object

Factory for creating standard ETCCDI heat wave indices.

static compound_heat_wave(base_period: slice, tx_percentile: float = 90, tn_percentile: float = 90, min_consecutive_days: int = 3, season: List[int] | None = None) WAS_HeatWaveIndices[source]

Create calculator for compound heat waves (both TX and TN).

static heat_wave_frequency(base_period: slice, tx_percentile: float = 90, min_consecutive_days: int = 3, season: List[int] | None = None) WAS_HeatWaveIndices[source]

Create calculator for Heat Wave Frequency.

static wsdi(base_period: slice, tx_percentile: float = 90, min_consecutive_days: int = 6, season: List[int] | None = None) WAS_HeatWaveIndices[source]

Create calculator for WSDI (Warm Spell Duration Index).

class wass2s.was_compute_predictand.ETCCDITempIndices[source]

Bases: object

Factory for creating standard ETCCDI temperature indices.

static cold_days(base_period: slice, season: List[int] | None = None, percentile: float = 10) WAS_TempPercentileIndices[source]

Create calculator for cold days (TX10p).

static cold_nights(base_period: slice, season: List[int] | None = None, percentile: float = 10) WAS_TempPercentileIndices[source]

Create calculator for cold nights (TN10p).

static hot_days(base_period: slice, season: List[int] | None = None, percentile: float = 90) WAS_TempPercentileIndices[source]

Create calculator for hot days (TX90p).

static hot_nights(base_period: slice, season: List[int] | None = None, percentile: float = 90) WAS_TempPercentileIndices[source]

Create calculator for hot nights (TN90p).

class wass2s.was_compute_predictand.ExtremeType(*values)[source]

Bases: Enum

Type of temperature extreme.

COLD = 'cold'
HOT = 'hot'
classmethod __contains__(value)

Return True if value is in cls.

value is in cls if: 1) value is a member of cls, or 2) value is the value of one of the cls’s members.

classmethod __getitem__(name)

Return the member matching name.

classmethod __iter__()

Return members in definition order.

classmethod __len__()

Return the number of members (no aliases)
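The inherited Enum helpers above behave as in the Python standard library. A minimal sketch of how they can be used follows; ExtremeTypeSketch is a stand-in that only mirrors the two documented members and is not imported from wass2s.

```python
from enum import Enum


class ExtremeTypeSketch(Enum):
    """Stand-in mirroring the documented members of ExtremeType."""

    COLD = "cold"
    HOT = "hot"


# Look up a member by value (call syntax) or by name (__getitem__).
by_value = ExtremeTypeSketch("hot")
by_name = ExtremeTypeSketch["COLD"]

# __iter__ yields members in definition order; __len__ counts them.
member_names = [member.name for member in ExtremeTypeSketch]
member_count = len(ExtremeTypeSketch)

# Testing an actual member with `in` works on every Python version.
is_member = ExtremeTypeSketch.HOT in ExtremeTypeSketch
```

On older Python versions a raw string in a containment test can raise TypeError, so looking a member up by value is the more portable way to validate user input.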

class wass2s.was_compute_predictand.HeatWaveDefinition(start_date: Timestamp, end_date: Timestamp, duration: int, max_temp: float, mean_temp: float)[source]

Bases: object

Definition of a heat wave event.

__init__(start_date: Timestamp, end_date: Timestamp, duration: int, max_temp: float, mean_temp: float) None
duration: int
end_date: Timestamp
max_temp: float
mean_temp: float
start_date: Timestamp
class wass2s.was_compute_predictand.HeatWaveMetric(*values)[source]

Bases: Enum

ETCCDI Heat Wave Indices.

HWDI = 'HWDI'
HWF = 'HWF'
HWN = 'HWN'
WSDI = 'WSDI'
classmethod __contains__(value)

Return True if value is in cls.

value is in cls if: 1) value is a member of cls, or 2) value is the value of one of the cls’s members.

classmethod __getitem__(name)

Return the member matching name.

classmethod __iter__()

Return members in definition order.

classmethod __len__()

Return the number of members (no aliases)

class wass2s.was_compute_predictand.WAS_HeatWaveIndices(base_period: slice, tx_percentile: float = 90, tn_percentile: float | None = None, min_consecutive_days: int = 3, max_break_days: int = 1, season: List[int] | None = None, require_both_tx_tn: bool = False, min_intensity: float | None = None)[source]

Bases: object

Correct implementation of ETCCDI heat wave indices.

Standard ETCCDI Indices:

  1. WSDI (Warm Spell Duration Index): Annual count of days with at least 6 consecutive days when TX > 90th percentile

  2. HWF (Heat Wave Frequency): Annual count of heat wave events

  3. HWDI (Heat Wave Duration Index): Annual maximum length of heat waves

Reference:

  • ETCCDI Climate Change Indices (2009)

  • Perkins & Alexander (2013): On the measurement of heat waves
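The WSDI definition (days belonging to runs of at least 6 consecutive hot days) can be illustrated with a small pure-Python sketch. The function name count_days_in_warm_spells and the boolean input are hypothetical; the package itself works on DataFrames and xarray objects, and its internals may differ.

```python
from itertools import groupby
from typing import Sequence


def count_days_in_warm_spells(is_hot: Sequence[bool], min_run: int = 6) -> int:
    """Count days that fall inside runs of at least `min_run` hot days."""
    total = 0
    for flag, run in groupby(is_hot):
        length = sum(1 for _ in run)
        if flag and length >= min_run:
            total += length
    return total


# One 7-day spell qualifies; the 3-day spell is too short for min_run=6.
flags = [True] * 7 + [False] * 2 + [True] * 3
wsdi_days = count_days_in_warm_spells(flags)
```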

__init__(base_period: slice, tx_percentile: float = 90, tn_percentile: float | None = None, min_consecutive_days: int = 3, max_break_days: int = 1, season: List[int] | None = None, require_both_tx_tn: bool = False, min_intensity: float | None = None)[source]
Parameters:
  • base_period (slice) – Base period for percentile calculation, e.g., slice("1961", "1990")

  • tx_percentile (float) – Percentile for daily maximum temperature (TX)

  • tn_percentile (float, optional) – Percentile for daily minimum temperature (TN) for compound heat waves

  • min_consecutive_days (int) – Minimum consecutive days for a heat wave (ETCCDI WSDI uses 6)

  • max_break_days (int) – Maximum number of break days allowed within a heat wave

  • season (list, optional) – Months to consider for heat wave analysis

  • require_both_tx_tn (bool) – If True, requires both TX and TN to exceed percentiles (compound heat wave)

  • min_intensity (float, optional) – Minimum intensity (e.g., temperature anomaly) for a heat wave

_calculate_temperature_thresholds(df_temp: DataFrame, percentile: float, var_name: str = 'TX') DataFrame[source]

Calculate temperature thresholds using a 5-day centered window.

Parameters:
  • df_temp (pd.DataFrame) – Temperature data with columns: DATE, STATION, VALUE

  • percentile (float) – Percentile to calculate (e.g., 90 for 90th percentile)

  • var_name (str) – Variable name for metadata

Returns:

Thresholds for each day of year and station

Return type:

pd.DataFrame
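A 5-day centered window pools, for each calendar day, the base-period values from that day and the two days on either side before taking the percentile. The sketch below shows the idea in plain Python under stated assumptions: a fixed 365-day calendar (leap days ignored), a linear-interpolation percentile, and hypothetical names (percentile, windowed_thresholds). It is not the package's implementation.

```python
from typing import Dict, List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between closest ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values to rank")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


def windowed_thresholds(
    values_by_doy: Dict[int, List[float]],
    q: float,
    half_window: int = 2,
    days_in_year: int = 365,
) -> Dict[int, float]:
    """Threshold per day of year from values pooled over a centered window."""
    thresholds = {}
    for doy in values_by_doy:
        pooled: List[float] = []
        for offset in range(-half_window, half_window + 1):
            # Wrap around the year so early January also sees late December.
            neighbour = (doy - 1 + offset) % days_in_year + 1
            pooled.extend(values_by_doy.get(neighbour, []))
        thresholds[doy] = percentile(pooled, q)
    return thresholds


# Toy base period: one value per day, equal to the day of year.
toy = {d: [float(d)] for d in range(1, 366)}
medians = windowed_thresholds(toy, 50)
```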

_detect_heat_waves(df_hot_days: DataFrame, intensity_col: str | None = None) DataFrame[source]

Detect heat wave events from sequence of hot days.

Parameters:
  • df_hot_days (pd.DataFrame) – DataFrame with IS_HOT column (0/1 for non-hot/hot days)

  • intensity_col (str, optional) – Column with intensity values for filtering

Returns:

DataFrame with heat wave events detected

Return type:

pd.DataFrame
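Event detection from a 0/1 sequence, with a minimum duration and a tolerated number of break days, can be sketched as follows. This is an illustration only: detect_events is a hypothetical name, and measuring duration from the first to the last hot day (break days included) is an assumption about how min_consecutive_days and max_break_days interact.

```python
from typing import List, Sequence, Tuple


def detect_events(
    is_hot: Sequence[int], min_days: int = 3, max_break: int = 1
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of events, with inclusive ends."""
    events = []
    start = None      # index of the first hot day of the open event
    last_hot = None   # index of the most recent hot day
    for i, flag in enumerate(is_hot):
        if flag:
            if start is None:
                start = i
            last_hot = i
        elif start is not None and i - last_hot > max_break:
            # The gap is too long: close the open event.
            if last_hot - start + 1 >= min_days:
                events.append((start, last_hot))
            start = None
    # Close an event that is still open at the end of the series.
    if start is not None and last_hot - start + 1 >= min_days:
        events.append((start, last_hot))
    return events


flags = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
with_breaks = detect_events(flags, min_days=3, max_break=1)
no_breaks = detect_events(flags, min_days=3, max_break=0)
```

With one break day allowed, the first four days merge into a single event; with no breaks allowed, only the final three-day run qualifies.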

_format_to_cdt(df: DataFrame, metric: str) DataFrame[source]

Convert to CDT format.

_get_index_definition(metric: str) str[source]

Get definition string for the index.

_get_metadata(metric: str) Dict[source]

Get metadata dictionary for the index.

_identify_hot_days(df_temp: DataFrame, thresholds: DataFrame, var_name: str = 'TX') DataFrame[source]

Identify days when temperature exceeds threshold.

_standardize_dims(da: DataArray) DataArray[source]

Standardize dimension names.

_validate_inputs()[source]

Validate all input parameters.

compute_insitu(df_cdt_tx: DataFrame, df_cdt_tn: DataFrame | None = None, metric: str = 'WSDI') DataFrame[source]

Compute heat wave indices for in-situ data.

Parameters:
  • df_cdt_tx (pd.DataFrame) – Daily maximum temperature in CDT format

  • df_cdt_tn (pd.DataFrame, optional) – Daily minimum temperature in CDT format (for compound heat waves)

  • metric (str) – Heat wave metric to compute: “WSDI”, “HWF”, or “HWDI”

Returns:

Results in CDT format

Return type:

pd.DataFrame

compute_xarray(ds_tx: Dataset | DataArray, ds_tn: Dataset | DataArray | None = None, metric: str = 'WSDI', parallel: bool = True, nb_cores: int = 4) DataArray[source]

Compute heat wave indices for xarray data.

Parameters:
  • ds_tx (xr.Dataset or xr.DataArray) – Daily maximum temperature

  • ds_tn (xr.Dataset or xr.DataArray, optional) – Daily minimum temperature (for compound heat waves)

  • metric (str) – Heat wave metric to compute

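  • chunk_size (dict, optional) – Chunk sizes for parallel processing (listed in the docstring but absent from the signature above)

  • nb_cores (int) – Number of parallel processes to use (default 4)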

  • parallel (bool) – Whether to use Dask for parallel processing

Returns:

Heat wave index values

Return type:

xr.DataArray

static transform_cdt(df: DataFrame) DataFrame[source]

Transform CDT format to long format DataFrame.

class wass2s.was_compute_predictand.WAS_PrecipIndices(base_period: slice, percentile: float = 95, season: List[int] | None = None, wet_day_threshold: float = 1.0, min_base_years: int = 15)[source]

Bases: object

Correct implementation of ETCCDI precipitation indices (R95p, R99p, etc.)

Parameters:
  • base_period (slice) – Slice for base period years, e.g., slice("1991", "2020")

  • percentile (float) – Percentile value (95 for R95p, 99 for R99p)

  • season (list, optional) – Months to consider (e.g., [6, 7, 8, 9] for JJAS)

  • wet_day_threshold (float) – Minimum precipitation for a wet day (default 1.0 mm)

  • min_base_years (int) – Minimum years required in base period (default 15)

__init__(base_period: slice, percentile: float = 95, season: List[int] | None = None, wet_day_threshold: float = 1.0, min_base_years: int = 15)[source]
_compute_percentile_threshold(data: DataFrame) DataFrame[source]

Compute percentile threshold from base period wet days.

Returns:

DataFrame with threshold per station

Return type:

pd.DataFrame
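The threshold step can be sketched in plain Python: keep only base-period wet days, then take the requested percentile. The names below are hypothetical, the wet-day test (value >= wet_day_threshold) and the interpolation rule are assumptions, and total_above reflects the ETCCDI definition of R95p as the annual precipitation total on days above the threshold.

```python
from typing import Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between closest ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values to rank")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])


def wet_day_percentile(
    base_precip: Sequence[float], q: float = 95.0, wet_day_threshold: float = 1.0
) -> float:
    """Percentile of base-period precipitation restricted to wet days."""
    wet_days = [v for v in base_precip if v >= wet_day_threshold]
    return percentile(wet_days, q)


def total_above(year_precip: Sequence[float], threshold: float) -> float:
    """Sum of daily precipitation on days exceeding the threshold."""
    return sum(v for v in year_precip if v > threshold)


# Two dry days plus wet days of 1..21 mm in the toy base period.
base = [0.0, 0.5] + [float(v) for v in range(1, 22)]
threshold = wet_day_percentile(base)
annual_total = total_above([0.0, 5.0, 20.0, 25.0, 30.0], threshold)
```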

_format_to_cdt(df: DataFrame) DataFrame[source]

Convert long format DataFrame to CDT format.

Parameters:

df (pd.DataFrame) – Long format DataFrame with columns: STATION, YEAR, index_name, LAT, LON

Returns:

DataFrame in CDT format

Return type:

pd.DataFrame

_standardize_dims(da: DataArray) DataArray[source]

Standardize dimension names.

_validate_base_period(years: ndarray) None[source]

Validate that base period has sufficient data.

compute_insitu(df_cdt: DataFrame) DataFrame[source]

Compute index for in-situ (station) data in CDT format.

Parameters:

df_cdt (pd.DataFrame) – Input data in CDT format

Returns:

Result in CDT format

Return type:

pd.DataFrame

compute_xarray(da: DataArray, parallel: bool = True, nb_cores: int = 4) DataArray[source]

Compute index for xarray DataArray (gridded data).

Parameters:
  • da (xr.DataArray) – Precipitation DataArray with dimensions (time, y, x) or (time, lat, lon)

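  • chunk_size (dict, optional) – Chunk sizes for parallel processing, e.g., {'y': 100, 'x': 100} (listed in the docstring but absent from the signature above)

  • nb_cores (int) – Number of parallel processes to use (default 4)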

  • parallel (bool) – Whether to use Dask for parallel processing

Returns:

Annual index values

Return type:

xr.DataArray

get_index_definition() Dict[source]

Return index definition metadata.

static transform_cdt(df: DataFrame) DataFrame[source]

Transform CDT format to long format DataFrame.

Parameters:

df (pd.DataFrame) – Input DataFrame in CDT format

Returns:

Long format DataFrame with columns: DATE, STATION, VALUE, LAT, LON

Return type:

pd.DataFrame

class wass2s.was_compute_predictand.WAS_TempPercentileIndices(base_period: slice, percentile: float = 90, season: List[int] | None = None, var_type: str = 'TMAX', extreme_type: str = 'hot', bootstrap_samples: int = 10, min_base_years: int = 15)[source]

Bases: object

Correct implementation of ETCCDI temperature percentile indices.

Standard ETCCDI Indices:

  • Hot Days: TX90p (daily max temperature > 90th percentile)

  • Hot Nights: TN90p (daily min temperature > 90th percentile)

  • Cold Days: TX10p (daily max temperature < 10th percentile)

  • Cold Nights: TN10p (daily min temperature < 10th percentile)

Reference: ETCCDI Climate Change Indices (2009)

__init__(base_period: slice, percentile: float = 90, season: List[int] | None = None, var_type: str = 'TMAX', extreme_type: str = 'hot', bootstrap_samples: int = 10, min_base_years: int = 15)[source]
Parameters:
  • base_period (slice) – Slice for base period years, e.g., slice("1961", "1990")

  • percentile (float) – Percentile value:

    • For hot extremes: 90, 95, 99 (days above percentile)

    • For cold extremes: 10, 5, 1 (days below percentile)

  • season (list, optional) – Months to consider (e.g., [6, 7, 8] for JJA)

  • var_type (str) – Temperature variable type: ‘TMAX’ (TX) or ‘TMIN’ (TN)

  • extreme_type (str) – Type of extreme: ‘hot’ or ‘cold’

  • bootstrap_samples (int) – Number of bootstrap samples for confidence intervals

  • min_base_years (int) – Minimum years required in base period

_calculate_extreme_days(temp_data: DataFrame, thresholds: DataFrame) DataFrame[source]

Calculate number of extreme temperature days.

For hot extremes: days when temperature > percentile (e.g., TX90p, TN90p).

For cold extremes: days when temperature < percentile (e.g., TX10p, TN10p).
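The comparison rule can be shown with a short sketch. count_extreme_days is a hypothetical helper that takes one threshold per day; the package's own method works on DataFrames.

```python
from typing import Sequence


def count_extreme_days(
    temps: Sequence[float], thresholds: Sequence[float], extreme_type: str = "hot"
) -> int:
    """Count days above (hot) or below (cold) their daily threshold."""
    if extreme_type not in ("hot", "cold"):
        raise ValueError("extreme_type must be 'hot' or 'cold'")
    if extreme_type == "hot":
        return sum(1 for t, thr in zip(temps, thresholds) if t > thr)
    return sum(1 for t, thr in zip(temps, thresholds) if t < thr)


tx = [31.0, 35.5, 36.0, 29.0]
tx90 = [34.0, 34.0, 35.0, 35.0]   # toy 90th-percentile thresholds
tx10 = [30.0, 30.0, 30.0, 30.0]   # toy 10th-percentile thresholds
hot_day_count = count_extreme_days(tx, tx90, "hot")
cold_day_count = count_extreme_days(tx, tx10, "cold")
```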

_calculate_percentile_thresholds(temp_data: DataFrame, confidence: bool = False) DataFrame | Tuple[DataFrame, DataFrame, DataFrame][source]

Calculate percentile thresholds using a 5-day centered window.

For hot extremes: calculate the upper percentile (e.g., 90th).

For cold extremes: calculate the lower percentile (e.g., 10th).

_format_to_cdt(df: DataFrame) DataFrame[source]

Convert to CDT format.

_generate_index_name() str[source]

Generate the proper ETCCDI index name.

_get_etccdi_id() str[source]

Get ETCCDI official ID for the index.

_get_metadata() Dict[source]

Get metadata dictionary for the index.

_standardize_dims(da: DataArray) DataArray[source]

Standardize dimension names.

_validate_base_period(years: ndarray) None[source]

Validate that base period has sufficient data.

_validate_inputs()[source]

Validate all input parameters.

compute_insitu(df_cdt: DataFrame, return_confidence: bool = False) DataFrame | Tuple[DataFrame, DataFrame, DataFrame][source]

Compute index for in-situ (station) data.

compute_xarray(ds: Dataset | DataArray, var_name: str | None = None, parallel: bool = True, nb_cores: int = 4) DataArray[source]

Compute index for xarray data (gridded).

get_index_definition() Dict[source]

Return index definition metadata.

static transform_cdt(df: DataFrame) DataFrame[source]

Transform CDT format to long format DataFrame.

class wass2s.was_compute_predictand.WAS_compute_cessation(user_criteria=None)[source]

Bases: object

A class to compute cessation dates based on soil moisture balance for different regions and criteria, leveraging parallel computation for efficiency.

__init__(user_criteria=None)[source]

Initialize the WAS_compute_cessation class with user-defined or default criteria.

Parameters:

user_criteria (dict, optional) – A dictionary containing zone-specific criteria. If not provided, the class will use the default criteria.

static adjust_duplicates(series, increment=1e-05)[source]

If any values in the Series repeat, nudge them by a tiny increment so that all are unique (to avoid indexing collisions).
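One way to make repeated values unique is to add a growing multiple of the increment to each repeat. The sketch below works on a plain list rather than a pandas Series, and nudge_duplicates is a hypothetical name; the package's method may use a different scheme.

```python
from typing import Dict, List, Sequence


def nudge_duplicates(values: Sequence[float], increment: float = 1e-5) -> List[float]:
    """Shift the k-th repeat of a value by k * increment."""
    seen: Dict[float, int] = {}
    result = []
    for value in values:
        repeats = seen.get(value, 0)
        result.append(value + repeats * increment)
        seen[value] = repeats + 1
    return result


nudged = nudge_duplicates([10.0, 10.0, 12.5, 10.0])
```

The first occurrence of each value is left unchanged. A nudged value could in principle collide with another value that is already within one increment of it, which this sketch does not check.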

cessation_function(x, ijour_dem_cal, idebut, ETP, Cap_ret_maxi, irch_fin)[source]

Compute cessation date using soil moisture balance criteria.
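A soil moisture balance of this kind can be sketched as a daily bucket model: rainfall fills the soil up to Cap_ret_maxi, ETP drains it, and cessation is the first day in the search window on which the store is empty. The sketch reuses the documented argument names, but the update rule, the assumption that the soil is dry at ijour_dem_cal, and the absence of missing values in x are all assumptions, not the package's confirmed logic.

```python
import math
from typing import Sequence


def cessation_sketch(
    x: Sequence[float],
    ijour_dem_cal: int,
    idebut: int,
    ETP: float,
    Cap_ret_maxi: float,
    irch_fin: int,
) -> float:
    """Return the index of the first empty-soil day in [idebut, irch_fin]."""
    soil = 0.0  # assumption: the soil is dry when the balance starts
    last = min(irch_fin, len(x) - 1)
    for day in range(ijour_dem_cal, last + 1):
        # Add rain, remove evapotranspiration, clamp to [0, capacity].
        soil = min(max(soil + x[day] - ETP, 0.0), Cap_ret_maxi)
        if day >= idebut and soil == 0.0:
            return day
    return math.nan


rain = [0, 0, 30, 30] + [0] * 12
cessation_index = cessation_sketch(rain, 0, 4, 5.0, 70, 15)
never_dry = cessation_sketch([30] * 10, 0, 2, 5.0, 70, 9)
```

In the first series the soil holds 50 mm after day 3 and loses 5 mm per day, so it empties on day 13. Note that this sketch returns idebut itself if the soil is already dry there.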

compute(daily_data, nb_cores)[source]

Compute cessation dates for each pixel using criteria based on regions.

compute_insitu(daily_df)[source]
static day_of_year(i, dem_rech1)[source]

Given a year ‘i’ and a month-day string ‘dem_rech1’ (e.g., ‘07-23’), return the 1-based day of the year.
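The conversion can be reproduced with the standard library alone. This is a minimal sketch of the same idea, not the package's code:

```python
from datetime import datetime


def day_of_year(i: int, dem_rech1: str) -> int:
    """Return the 1-based day of the year for year `i` and an 'MM-DD' string."""
    return datetime.strptime(f"{i}-{dem_rech1}", "%Y-%m-%d").timetuple().tm_yday


doy_common_year = day_of_year(2021, "07-23")
doy_leap_year = day_of_year(2020, "07-23")
```

The same month-day string maps to a different day number in leap years, which matters when these values are used as array indices.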

default_criteria = {0: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'date_dry_soil': '01-01', 'end_search': '09-30', 'start_search': '09-01', 'zone_name': 'Sahel100_0mm'}, 1: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'date_dry_soil': '01-01', 'end_search': '10-05', 'start_search': '09-01', 'zone_name': 'Sahel200_100mm'}, 2: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'date_dry_soil': '01-01', 'end_search': '11-10', 'start_search': '09-01', 'zone_name': 'Sahel400_200mm'}, 3: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'date_dry_soil': '01-01', 'end_search': '11-15', 'start_search': '09-15', 'zone_name': 'Sahel600_400mm'}, 4: {'Cap_ret_maxi': 70, 'ETP': 4.5, 'date_dry_soil': '01-01', 'end_search': '11-30', 'start_search': '10-01', 'zone_name': 'Soudan'}, 5: {'Cap_ret_maxi': 70, 'ETP': 4.0, 'date_dry_soil': '01-01', 'end_search': '12-01', 'start_search': '10-15', 'zone_name': 'Golfe_Of_Guinea'}}
rainf_zone(daily_data)[source]
static transform_cdt(df)[source]
Transform a DataFrame with:
  • Row 0 = LON

  • Row 1 = LAT

  • Row 2 = ELEV

  • Rows 3+ = daily data (or any date) with ‘ID’ column containing dates.

Returns an xarray DataArray with coords = (T, Y, X), variable = ‘Observation’.

class wass2s.was_compute_predictand.WAS_compute_cessation_dry_spell(user_criteria=None)[source]

Bases: object

A class for computing the longest dry spell length between the onset of a rainy season and its cessation, based on user-defined criteria.

__init__(user_criteria=None)[source]

Initialize the WAS_compute_cessation_dry_spell class with user-defined or default criteria.

Parameters:

user_criteria (dict, optional) – A dictionary containing zone-specific criteria. If not provided, the class will use the default criteria.

static adjust_duplicates(series, increment=1e-05)[source]

If any values in the Series repeat, nudge them by a tiny increment so that all are unique (to avoid indexing collisions).

compute(daily_data, nb_cores)[source]

Compute the longest dry spell length after the rainy season onset for each pixel in the given daily rainfall DataArray, using different criteria (both for onset and cessation) based on isohyet zones.

Parameters:
  • daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).

  • nb_cores (int) – Number of parallel processes (workers) to use.

Returns:

Array with the longest dry spell length per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df)[source]
static day_of_year(i, dem_rech1)[source]

Convert year i and MM-DD string dem_rech1 (e.g., ‘07-23’) into a 1-based day of the year.

default_criteria = {0: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'cumulative': 10, 'date_dry_soil': '01-01', 'end_search1': '08-15', 'end_search2': '09-30', 'nbjour': 40, 'number_dry_days': 25, 'start_search1': '05-01', 'start_search2': '09-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel100_0mm'}, 1: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'cumulative': 15, 'date_dry_soil': '01-01', 'end_search1': '08-15', 'end_search2': '10-05', 'nbjour': 40, 'number_dry_days': 25, 'start_search1': '05-15', 'start_search2': '09-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel200_100mm'}, 2: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'cumulative': 15, 'date_dry_soil': '01-01', 'end_search1': '07-31', 'end_search2': '11-10', 'nbjour': 40, 'number_dry_days': 20, 'start_search1': '05-01', 'start_search2': '09-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel400_200mm'}, 3: {'Cap_ret_maxi': 70, 'ETP': 5.0, 'cumulative': 20, 'date_dry_soil': '01-01', 'end_search1': '07-31', 'end_search2': '11-15', 'nbjour': 45, 'number_dry_days': 20, 'start_search1': '03-15', 'start_search2': '09-15', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel600_400mm'}, 4: {'Cap_ret_maxi': 70, 'ETP': 4.5, 'cumulative': 20, 'date_dry_soil': '01-01', 'end_search1': '07-31', 'end_search2': '11-30', 'nbjour': 50, 'number_dry_days': 10, 'start_search1': '03-15', 'start_search2': '10-01', 'thrd_rain_day': 0.85, 'zone_name': 'Soudan'}, 5: {'Cap_ret_maxi': 70, 'ETP': 4.0, 'cumulative': 20, 'date_dry_soil': '01-01', 'end_search1': '06-15', 'end_search2': '12-01', 'nbjour': 50, 'number_dry_days': 10, 'start_search1': '02-01', 'start_search2': '10-15', 'thrd_rain_day': 0.85, 'zone_name': 'Golfe_Of_Guinea'}}
dry_spell_cessation_function(x, idebut1, cumul, nbsec, jour_pluvieux, irch_fin1, idebut2, ijour_dem_cal, ETP, Cap_ret_maxi, irch_fin2, nbjour)[source]

Computes the longest dry spell length after the onset and determines the cessation date (when soil water returns to 0) based on water balance, then checks for a dry spell.

Parameters:
  • x (array-like) – Daily rainfall or similar values.

  • idebut1 (int) – Start index to begin searching for the onset.

  • cumul (float) – Cumulative rainfall threshold to trigger onset.

  • nbsec (int) – Maximum number of dry days allowed in the sequence.

  • jour_pluvieux (float) – Minimum rainfall to consider a day as rainy.

  • irch_fin1 (int) – Maximum index limit for the onset search.

  • idebut2 (int) – Start index for the cessation search.

  • ijour_dem_cal (int) – Start index from which the water balance is calculated.

  • ETP (float) – Daily evapotranspiration (mm).

  • Cap_ret_maxi (float) – Maximum soil water retention capacity (mm).

  • irch_fin2 (int) – Maximum index limit for the cessation search.

  • nbjour (int) – Number of days after onset to check for the dry spell.

Returns:

Length of the longest dry spell sequence after onset and before soil water returns to zero, or NaN if not found.

Return type:

float

rainf_zone(daily_data)[source]
static transform_cdt(df)[source]
Transform a DataFrame with:
  • Row 0 = LON

  • Row 1 = LAT

  • Row 2 = ELEV

  • Rows 3+ = daily data (or any date) with ‘ID’ column containing dates.

Returns an xarray DataArray with coords = (T, Y, X), variable = ‘Observation’.

class wass2s.was_compute_predictand.WAS_compute_onset(user_criteria=None)[source]

Bases: object

A class that encapsulates methods for transforming precipitation data from different formats (CPT, CDT) and computing onset dates based on rainfall criteria.

__init__(user_criteria=None)[source]

Initialize the WAS_compute_onset class with user-defined or default criteria.

Parameters:

user_criteria (dict, optional) – A dictionary containing zone-specific criteria. If not provided, the class will use the default criteria.

static adjust_duplicates(series, increment=1e-05)[source]

If any values in the Series repeat, nudge them by a tiny increment so that all are unique (to avoid indexing collisions).

compute(daily_data, nb_cores)[source]

Compute onset dates for each pixel in a given daily rainfall DataArray using different criteria based on isohyet zones.

Parameters:
  • daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).

  • nb_cores (int) – Number of parallel processes to use.

Returns:

Array with onset dates computed per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df)[source]
static day_of_year(i, dem_rech1)[source]

Given a year ‘i’ and a month-day string ‘dem_rech1’ (e.g., ‘07-23’), return the day of the year (1-based).

default_criteria = {0: {'cumulative': 10, 'end_search': '08-30', 'number_dry_days': 25, 'start_search': '06-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel100_0mm'}, 1: {'cumulative': 15, 'end_search': '08-15', 'number_dry_days': 25, 'start_search': '05-15', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel200_100mm'}, 2: {'cumulative': 15, 'end_search': '07-31', 'number_dry_days': 20, 'start_search': '05-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel400_200mm'}, 3: {'cumulative': 20, 'end_search': '07-31', 'number_dry_days': 20, 'start_search': '03-15', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel600_400mm'}, 4: {'cumulative': 20, 'end_search': '07-31', 'number_dry_days': 10, 'start_search': '03-15', 'thrd_rain_day': 0.85, 'zone_name': 'Soudan'}, 5: {'cumulative': 20, 'end_search': '06-15', 'number_dry_days': 10, 'start_search': '02-01', 'thrd_rain_day': 0.85, 'zone_name': 'Golfe_Of_Guinea'}}
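A user_criteria dictionary presumably follows the same layout as default_criteria. The sketch below copies the documented zone 0 entry, applies a hypothetical override, and converts the MM-DD search bounds into day-of-year values; search_window is an illustrative helper, not part of the package.

```python
import copy
from datetime import date
from typing import Dict, Tuple

# Zone 0 entry copied from the documented default_criteria of WAS_compute_onset.
documented_zone0 = {
    "zone_name": "Sahel100_0mm",
    "start_search": "06-01",
    "end_search": "08-30",
    "cumulative": 10,
    "number_dry_days": 25,
    "thrd_rain_day": 0.85,
}

# Hypothetical override: start the onset search earlier in zone 0.
user_criteria: Dict[int, dict] = {0: copy.deepcopy(documented_zone0)}
user_criteria[0]["start_search"] = "05-15"


def search_window(year: int, zone: dict) -> Tuple[int, int]:
    """Convert a zone's MM-DD search bounds into 1-based days of the year."""

    def to_doy(month_day: str) -> int:
        month, day = (int(part) for part in month_day.split("-"))
        return date(year, month, day).timetuple().tm_yday

    return to_doy(zone["start_search"]), to_doy(zone["end_search"])


window_2021 = search_window(2021, user_criteria[0])
```

This page does not state whether a partial dictionary is merged with the defaults, so supplying an entry for every zone may be the safer choice.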
onset_function(x, idebut, cumul, nbsec, jour_pluvieux, irch_fin)[source]

Calculate the onset date of a season based on cumulative rainfall criteria.

Parameters:
  • x (array-like) – Daily rainfall or similar values.

  • idebut (int) – Start index to begin searching for the onset.

  • cumul (float) – Cumulative rainfall threshold to trigger onset.

  • nbsec (int) – Maximum number of dry days allowed in the sequence.

  • jour_pluvieux (float) – Minimum rainfall to consider a day as rainy.

  • irch_fin (int) – Maximum index limit for the onset.

Returns:

Index of the onset date or NaN if onset not found.

Return type:

int or float
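A rule of this shape can be sketched as follows: the onset is the first rainy day in the search window on which rainfall accumulated over a few days reaches cumul, provided no dry run longer than nbsec follows. The 3-day accumulation window and 30-day look-ahead are assumptions introduced for illustration (hypothetical parameters window_days and lookahead_days); the documented signature does not expose them, and the package's exact rule may differ.

```python
import math
from typing import Sequence


def longest_dry_run(values: Sequence[float], rainy_threshold: float) -> int:
    """Length of the longest run of days below the rainy-day threshold."""
    longest = current = 0
    for value in values:
        current = current + 1 if value < rainy_threshold else 0
        longest = max(longest, current)
    return longest


def onset_sketch(
    x: Sequence[float],
    idebut: int,
    cumul: float,
    nbsec: int,
    jour_pluvieux: float,
    irch_fin: int,
    window_days: int = 3,      # assumption, not in the documented signature
    lookahead_days: int = 30,  # assumption, not in the documented signature
) -> float:
    """Return the onset index, or NaN if no day satisfies the criteria."""
    last = min(irch_fin, len(x) - window_days)
    for day in range(idebut, last + 1):
        if x[day] < jour_pluvieux:
            continue  # the candidate day itself must be rainy
        if sum(x[day:day + window_days]) < cumul:
            continue  # not enough accumulated rainfall
        following = x[day + 1:day + 1 + lookahead_days]
        if longest_dry_run(following, jour_pluvieux) <= nbsec:
            return day
    return math.nan


# A false start on day 5 (followed by 11 dry days), then a real onset on day 18.
series = [0] * 5 + [12, 3, 0] + [0] * 10 + [15, 5] + [2, 0, 0] * 10
onset_index = onset_sketch(series, 0, 10, 7, 0.85, 30)
```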

rainf_zone(daily_data)[source]
static transform_cdt(df)[source]
Transform a DataFrame with:
  • Row 0 = LON

  • Row 1 = LAT

  • Row 2 = ELEV

  • Rows 3+ = daily data (or any date) with ‘ID’ column containing dates.

Returns an xarray DataArray with coords = (T, Y, X), variable = ‘Observation’.

class wass2s.was_compute_predictand.WAS_compute_onset_dry_spell(user_criteria=None)[source]

Bases: object

A class for computing the longest dry spell length after the onset of a rainy season, based on user-defined criteria.

__init__(user_criteria=None)[source]

Initialize the WAS_compute_onset_dry_spell class with user-defined or default criteria.

Parameters:

user_criteria (dict, optional) – A dictionary containing zone-specific criteria. If not provided, the class will use the default criteria.

static adjust_duplicates(series, increment=1e-05)[source]

If any values in the Series repeat, nudge them by a tiny increment so that all are unique (to avoid indexing collisions).

compute(daily_data, nb_cores)[source]

Compute the longest dry spell length after the onset for each pixel in a given daily rainfall DataArray, using different criteria based on isohyet zones.

Parameters:
  • daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).

  • nb_cores (int) – Number of parallel processes to use.

Returns:

Array with the longest dry spell length per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df)[source]
static day_of_year(i, dem_rech1)[source]

Given a year ‘i’ and a month-day string ‘dem_rech1’ (e.g., ‘07-23’), return the 1-based day of the year.

default_criteria = {0: {'cumulative': 10, 'end_search': '08-30', 'nbjour': 40, 'number_dry_days': 25, 'start_search': '06-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel100_0mm'}, 1: {'cumulative': 15, 'end_search': '08-15', 'nbjour': 40, 'number_dry_days': 25, 'start_search': '05-15', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel200_100mm'}, 2: {'cumulative': 15, 'end_search': '07-31', 'nbjour': 40, 'number_dry_days': 20, 'start_search': '05-01', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel400_200mm'}, 3: {'cumulative': 20, 'end_search': '07-31', 'nbjour': 45, 'number_dry_days': 20, 'start_search': '03-15', 'thrd_rain_day': 0.85, 'zone_name': 'Sahel600_400mm'}, 4: {'cumulative': 20, 'end_search': '07-31', 'nbjour': 50, 'number_dry_days': 10, 'start_search': '03-15', 'thrd_rain_day': 0.85, 'zone_name': 'Soudan'}, 5: {'cumulative': 20, 'end_search': '06-15', 'nbjour': 50, 'number_dry_days': 10, 'start_search': '02-01', 'thrd_rain_day': 0.85, 'zone_name': 'Golfe_Of_Guinea'}}
dry_spell_onset_function(x, idebut, cumul, nbsec, jour_pluvieux, irch_fin, nbjour)[source]

Calculate the onset date of a season based on cumulative rainfall criteria, and determine the longest dry spell sequence within a specified period after the onset.

dry_spell_onset_function_(x, idebut, cumul, nbsec, jour_pluvieux, irch_fin, nbjour)[source]

Calculate the onset date of a season based on cumulative rainfall criteria, and determine the longest dry spell sequence within a specified period after the onset.

Parameters:
  • x (array-like) – Daily rainfall or similar values.

  • idebut (int) – Start index to begin searching for the onset.

  • cumul (float) – Cumulative rainfall threshold to trigger onset.

  • nbsec (int) – Maximum number of dry days allowed in the sequence.

  • jour_pluvieux (float) – Minimum rainfall to consider a day as rainy.

  • irch_fin (int) – Maximum index limit for the onset.

  • nbjour (int) – Number of days to check for the longest dry spell after onset.

Returns:

Length of the longest dry spell sequence after onset or NaN if onset not found.

Return type:

float
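The second half of this computation, the longest dry run within nbjour days after the onset, can be sketched on its own. Unlike the documented method, the hypothetical helper below takes an already computed onset index as input, and it assumes the window starts on the day after onset.

```python
import math
from typing import Sequence


def longest_dry_spell_after(
    x: Sequence[float], onset: float, jour_pluvieux: float, nbjour: int
) -> float:
    """Longest run of dry days in the `nbjour` days following `onset`."""
    if isinstance(onset, float) and math.isnan(onset):
        return math.nan  # no onset, so no dry spell can be measured
    start = int(onset) + 1
    longest = current = 0
    for rain in x[start:start + nbjour]:
        current = current + 1 if rain < jour_pluvieux else 0
        longest = max(longest, current)
    return float(longest)


rain = [12, 0, 0, 3, 0, 0, 0, 0, 1, 0]
spell_9_days = longest_dry_spell_after(rain, 0, 0.85, 9)
spell_3_days = longest_dry_spell_after(rain, 0, 0.85, 3)
```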

rainf_zone(daily_data)[source]
static transform_cdt(df)[source]
Transform a DataFrame with:
  • Row 0 = LON

  • Row 1 = LAT

  • Row 2 = ELEV

  • Rows 3+ = daily data (or any date) with ‘ID’ column containing dates.

Returns an xarray DataArray with coords = (T, Y, X), variable = ‘Observation’.

class wass2s.was_compute_predictand.WAS_count_dry_spells[source]

Bases: object

A class to compute the number of dry spells within a specified period (onset to cessation) for each pixel or station in a daily rainfall dataset.

static _parse_cpt_to_long(df_cpt, value_name='onset_or_cessation')[source]
Convert a DataFrame in CPT-like format to a long DataFrame with columns:

[year, station, value_name, lat, lon]

Assumes:
  • Row 0 = [“LAT”, lat_stn1, lat_stn2, …]

  • Row 1 = [“LON”, lon_stn1, lon_stn2, …]

  • Rows 2+ = [year, station1_val, station2_val, …]

Parameters:
  • df_cpt (pd.DataFrame) – CPT-like DataFrame (as returned by, e.g., compute_insitu).

  • value_name (str) – Name to give to the column containing the value (e.g. “onset”, “cessation”).

Returns:

Columns: [station, year, <value_name>, lat, lon]

Return type:

pd.DataFrame
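The CPT-like layout described above can be flattened with a few lines of plain Python. The sketch uses lists instead of a DataFrame, and cpt_rows_to_long plus the station names are hypothetical.

```python
from typing import List, Sequence


def cpt_rows_to_long(
    stations: Sequence[str], rows: Sequence[Sequence], value_name: str = "onset"
) -> List[dict]:
    """Flatten CPT-style rows into one record per (year, station)."""
    lat_row, lon_row, *year_rows = rows
    lats = dict(zip(stations, lat_row[1:]))
    lons = dict(zip(stations, lon_row[1:]))
    records = []
    for year, *values in year_rows:
        for station, value in zip(stations, values):
            records.append(
                {
                    "station": station,
                    "year": int(year),
                    value_name: value,
                    "lat": lats[station],
                    "lon": lons[station],
                }
            )
    return records


stations = ["STN_A", "STN_B"]
rows = [
    ["LAT", 12.5, 13.0],
    ["LON", -1.5, 2.0],
    [1991, 160, 170],
    [1992, 158, 175],
]
long_records = cpt_rows_to_long(stations, rows)
```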

static adjust_duplicates(series, increment=1e-05)[source]

If any values in the Series repeat, nudge them by a tiny increment so that all are unique (to avoid indexing collisions).

compute(daily_data, onset_date, cessation_date, dry_spell_length, dry_threshold, nb_cores)[source]

Compute the number of dry spells for each pixel within the onset and cessation period in a daily xarray DataArray.

Parameters:
  • daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).

  • onset_date (xarray.DataArray) – DataArray containing onset dates for each pixel.

  • cessation_date (xarray.DataArray) – DataArray containing cessation dates for each pixel.

  • dry_spell_length (int) – The length of a dry spell to count.

  • dry_threshold (float) – Rainfall threshold to classify a day as “dry.”

  • nb_cores (int) – Number of parallel processes to use.

Returns:

An array with the count of dry spells per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df, onset_df_cpt, cessation_df_cpt, dry_spell_length, dry_threshold=1.0)[source]

Compute the number of dry spells (of length = dry_spell_length) between the onset and cessation dates for in-situ stations (CDT format).

Returns a DataFrame in CPT format:
  • Row 0: [“LAT”, lat_stn1, lat_stn2, …]

  • Row 1: [“LON”, lon_stn1, lon_stn2, …]

  • Subsequent rows: [year, station1_value, station2_value, …]

Parameters:
  • daily_df (pd.DataFrame) – CDT rainfall data (ID column = date, station columns).

  • onset_df_cpt (pd.DataFrame) – CPT-format DataFrame containing onset dates (e.g., as returned by WAS_compute_onset.compute_insitu).

  • cessation_df_cpt (pd.DataFrame) – CPT-format DataFrame containing cessation dates.

  • dry_spell_length (int) – The length of the dry spell to look for.

  • dry_threshold (float, optional) – Rainfall threshold below which a day is considered “dry.” Defaults to 1.0 mm.

Returns:

Final dry-spell counts in CPT pivot format.

Return type:

pd.DataFrame

static count_dry_spells(x, onset, cessation, dry_spell_length, dry_threshold)[source]

Count the number of dry spells of a specific length between onset and cessation dates.

Parameters:
  • x (array-like) – Daily rainfall values.

  • onset (int) – Start index for the calculation (onset date).

  • cessation (int) – End index for the calculation (cessation date).

  • dry_spell_length (int) – The length of a dry spell to count.

  • dry_threshold (float) – Rainfall threshold to classify a day as “dry.”

Returns:

The number of dry spells of the specified length (NaN if invalid).

Return type:

int or float
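Counting dry spells amounts to splitting the season into runs of dry and wet days and keeping the dry runs of the requested length. In the sketch below, "of a specific length" is read as exactly dry_spell_length days, a day is dry when its value is below dry_threshold, and both bounds are inclusive; all three are assumptions, and count_dry_spells_sketch is a hypothetical name.

```python
import math
from itertools import groupby
from typing import Sequence


def count_dry_spells_sketch(
    x: Sequence[float],
    onset: float,
    cessation: float,
    dry_spell_length: int,
    dry_threshold: float,
) -> float:
    """Count maximal dry runs of exactly `dry_spell_length` days."""
    if any(isinstance(v, float) and math.isnan(v) for v in (onset, cessation)):
        return math.nan
    start, end = int(onset), int(cessation)
    if not 0 <= start <= end < len(x):
        return math.nan
    count = 0
    season = x[start:end + 1]
    for is_dry, run in groupby(season, key=lambda rain: rain < dry_threshold):
        if is_dry and sum(1 for _ in run) == dry_spell_length:
            count += 1
    return count


rain = [5, 0, 0, 0, 4, 0, 0, 6, 0, 0, 0, 2]
three_day_spells = count_dry_spells_sketch(rain, 0, 11, 3, 1.0)
two_day_spells = count_dry_spells_sketch(rain, 0, 11, 2, 1.0)
```

Replacing `==` with `>=` would count every dry spell at least that long instead.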

static transform_cdt(df)[source]
Transform a DataFrame with:
  • Row 0 = LON

  • Row 1 = LAT

  • Row 2 = ELEV

  • Rows 3+ = daily data with ‘ID’ column containing dates.

Returns a DataFrame with columns like:

DATE | STATION | VALUE | LON | LAT | ELEV | MEAN_ANNUAL_RAINFALL | zonename
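The CDT layout (three header rows, then one row per date) can be flattened as sketched below. The sketch uses plain lists, assumes date labels in YYYYMMDD form, and omits the MEAN_ANNUAL_RAINFALL and zonename columns; cdt_rows_to_long and the station names are hypothetical.

```python
from datetime import datetime
from typing import List, Sequence


def cdt_rows_to_long(
    stations: Sequence[str], rows: Sequence[Sequence], date_format: str = "%Y%m%d"
) -> List[dict]:
    """Flatten CDT-style rows into one record per (date, station)."""
    lon_row, lat_row, elev_row, *data_rows = rows
    records = []
    for label, *values in data_rows:
        day = datetime.strptime(str(label), date_format).date()
        for idx, station in enumerate(stations):
            records.append(
                {
                    "DATE": day,
                    "STATION": station,
                    "VALUE": values[idx],
                    # Header rows carry a label in position 0, hence idx + 1.
                    "LON": lon_row[idx + 1],
                    "LAT": lat_row[idx + 1],
                    "ELEV": elev_row[idx + 1],
                }
            )
    return records


stations = ["STN_A", "STN_B"]
rows = [
    ["LON", -1.5, 2.0],
    ["LAT", 12.5, 13.0],
    ["ELEV", 300, 250],
    ["19910101", 0.0, 4.2],
    ["19910102", 1.1, 0.0],
]
long_records = cdt_rows_to_long(stations, rows)
```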

static transform_cpt(df, missing_value=None)[source]
Transform a DataFrame in CPT format with:
  • Row 0 = LAT

  • Row 1 = LON

  • Rows 2+ = numeric year data in wide format (stations in columns).

Returns a DataFrame with columns like:

YEAR | STATION | VALUE | LAT | LON

class wass2s.was_compute_predictand.WAS_count_rainy_days[source]

Bases: object

A class to compute the number of rainy days between onset and cessation dates for each pixel or station in a daily rainfall dataset.

static _parse_cpt_to_long(df_cpt, value_name='onset_or_cessation')[source]

Convert a DataFrame in CPT format (like the one returned by ‘compute_insitu’) into a long format DataFrame: columns = [year, station, value_name, lat, lon].

Parameters:
  • df_cpt (pd.DataFrame) –

    • Row 0: [“LAT”, lat_stn1, lat_stn2, …]

    • Row 1: [“LON”, lon_stn1, lon_stn2, …]

    • Rows 2+: [year, station1_value, station2_value, …]

  • value_name (str) – Name for the column containing the values (e.g., “onset”, “cessation”).

Returns:

df_long – Columns = [station, year, value_name, lat, lon]

Return type:

pd.DataFrame

compute(daily_data, onset_date, cessation_date, rain_threshold, nb_cores)[source]

Compute the number of rainy days for each pixel between onset and cessation dates.

Parameters:
  • daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).

  • onset_date (xarray.DataArray) – DataArray containing onset dates for each pixel.

  • cessation_date (xarray.DataArray) – DataArray containing cessation dates for each pixel.

  • rain_threshold (float) – Rainfall threshold to classify a day as “rainy.”

  • nb_cores (int) – Number of parallel processes to use.

Returns:

Array with the count of rainy days per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df, onset_df_cpt, cessation_df_cpt, rain_threshold=0.85)[source]

Compute, for in-situ stations (CDT data), the number of rainy days between onset and cessation, for each station and year.

Parameters:
  • daily_df (pd.DataFrame) – CDT precipitation data (ID column = date; columns = stations). Follows the standard CDT format.

  • onset_df_cpt (pd.DataFrame) – Result of WAS_compute_onset.compute_insitu(…) for onset (CPT format).

  • cessation_df_cpt (pd.DataFrame) – Same format for cessation (CPT format).

  • rain_threshold (float, optional) – Precipitation threshold for counting a day as “rainy,” by default 0.85 mm.

Returns:

df_final – The count of rainy days in CPT pivot format.

Return type:

pd.DataFrame

static count_rainy_days(x, onset_date, cessation_date, rain_threshold)[source]

Count the number of rainy days between onset and cessation dates.

Parameters:
  • x (array-like) – Daily rainfall values.

  • onset_date (int) – Start index for the calculation (onset date).

  • cessation_date (int) – End index for the calculation (cessation date).

  • rain_threshold (float) – Rainfall threshold to classify a day as “rainy.”

Returns:

Number of rainy days (returns NaN if data is invalid).

Return type:

int or float
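
As a rough illustration of the counting rule, the sketch below counts days at or above the threshold in the window between the two indices. It assumes the window is inclusive at both ends, that a day is "rainy" when rainfall is greater than or equal to the threshold, and that missing or inconsistent indices yield NaN. The real method may differ on each of these points, and the name `count_rainy_days_sketch` is hypothetical.

```python
import math


def _is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_rainy_days_sketch(x, onset, cessation, rain_threshold):
    """Count days with rainfall >= rain_threshold in x[onset..cessation]."""
    if _is_missing(onset) or _is_missing(cessation):
        return math.nan
    onset, cessation = int(onset), int(cessation)
    # Reject windows that are reversed or fall outside the series
    if onset < 0 or cessation >= len(x) or onset > cessation:
        return math.nan
    window = x[onset:cessation + 1]
    return sum(1 for rain in window if rain >= rain_threshold)


daily_rain = [0.0, 2.5, 0.4, 10.0, 0.0, 0.9, 3.1, 0.0]
n_rainy = count_rainy_days_sketch(daily_rain, onset=1, cessation=6,
                                  rain_threshold=0.85)
```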

static transform_cdt(df)[source]

Transform a DataFrame in CDT format into a standardized long DataFrame.

CDT format assumptions:
  • Row 0 = LON

  • Row 1 = LAT

  • Row 2 = ELEV

  • Rows 3+ = daily data with ‘ID’ column holding dates in YYYYMMDD format.

This method returns a DataFrame with columns:

DATE, STATION, VALUE, LON, LAT, ELEV, (optional) MEAN_ANNUAL_RAINFALL, zonename

class wass2s.was_compute_predictand.WAS_count_wet_spells[source]

Bases: object

A class to compute the number of wet spells within a specified period (onset to cessation) for each pixel or station in a daily rainfall dataset.

static _parse_cpt_to_long(df_cpt, value_name='onset_or_cessation')[source]
Convert a CPT-format DataFrame into a long DataFrame with columns:

[year, station, value_name, lat, lon]

Assumes:
  • Row 0: [“LAT”, lat_stn1, lat_stn2, …]

  • Row 1: [“LON”, lon_stn1, lon_stn2, …]

  • Rows 2+: [year, station1_val, station2_val, …]

Parameters:
  • df_cpt (pd.DataFrame) – DataFrame in CPT-wide format (as returned by certain compute_insitu methods).

  • value_name (str) – Name for the output column containing the values (e.g. “onset”, “cessation”).

Returns:

Columns: [station, year, <value_name>, lat, lon]

Return type:

pd.DataFrame

compute(daily_data, onset_date, cessation_date, wet_spell_length, wet_threshold, nb_cores)[source]

Compute the number of wet spells for each pixel within the onset and cessation period in a daily xarray DataArray.

Parameters:
  • daily_data (xarray.DataArray) – Daily rainfall data, coords = (T, Y, X).

  • onset_date (xarray.DataArray) – DataArray containing onset dates for each pixel.

  • cessation_date (xarray.DataArray) – DataArray containing cessation dates for each pixel.

  • wet_spell_length (int) – The length of a wet spell to count.

  • wet_threshold (float) – Rainfall threshold to classify a day as “wet.”

  • nb_cores (int) – Number of parallel processes to use.

Returns:

Array with the count of wet spells per pixel.

Return type:

xarray.DataArray

compute_insitu(daily_df, onset_df_cpt, cessation_df_cpt, wet_spell_length, wet_threshold=1.0)[source]

Compute the number of wet spells (of length = wet_spell_length) between onset and cessation for in-situ stations (CDT data).

Returns a DataFrame in CPT format:
  • Row 0: [“LAT”, lat_station1, lat_station2, …]

  • Row 1: [“LON”, lon_station1, lon_station2, …]

  • Then one row per year: [year, station1_value, station2_value, …]

Parameters:
  • daily_df (pd.DataFrame) – CDT rainfall data (ID column = date, station columns).

  • onset_df_cpt (pd.DataFrame) – CPT-format DataFrame with onset dates (same station columns).

  • cessation_df_cpt (pd.DataFrame) – CPT-format DataFrame with cessation dates (same station columns).

  • wet_spell_length (int) – The length of a wet spell to count.

  • wet_threshold (float, optional) – Rainfall threshold classifying a day as “wet.” Defaults to 1.0 mm.

Returns:

Final wet-spell counts in CPT pivot format.

Return type:

pd.DataFrame

static count_wet_spells(x, onset_date, cessation_date, wet_spell_length, wet_threshold)[source]

Count the number of wet spells of a specific length between onset and cessation dates.

Parameters:
  • x (array-like) – Daily rainfall values.

  • onset_date (int) – Start index for the calculation (onset date).

  • cessation_date (int) – End index for the calculation (cessation date).

  • wet_spell_length (int) – The length of a wet spell to count.

  • wet_threshold (float) – Rainfall threshold to classify a day as “wet.”

Returns:

The number of wet spells of the specified length (NaN if data is invalid).

Return type:

int or float
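
One way to picture the spell count is as a run-length scan over the onset-to-cessation window. The sketch below is only an assumption about the rule: it counts every maximal run of consecutive wet days whose length is at least `wet_spell_length`, with "wet" meaning rainfall greater than or equal to the threshold. The actual method might instead count runs of exactly that length. The function name is hypothetical.

```python
def count_wet_spells_sketch(x, onset, cessation, wet_spell_length, wet_threshold):
    """Count maximal runs of >= wet_spell_length wet days in x[onset..cessation]."""
    spells = 0
    run = 0  # length of the current run of wet days
    for rain in x[onset:cessation + 1]:
        if rain >= wet_threshold:
            run += 1
        else:
            # A dry day closes the current run; keep it if it was long enough
            if run >= wet_spell_length:
                spells += 1
            run = 0
    # Account for a run that reaches the end of the window
    if run >= wet_spell_length:
        spells += 1
    return spells


daily_rain = [0.0, 2.0, 3.0, 0.0, 1.5, 4.0, 6.0, 0.2, 5.0]
n_spells = count_wet_spells_sketch(daily_rain, 0, 8,
                                   wet_spell_length=2, wet_threshold=1.0)
```

In this example the wet runs have lengths 2, 3, and 1, so two spells meet the minimum length of 2.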

static transform_cdt(df)[source]

Transform a CDT-format DataFrame into a standard table.

CDT format assumptions:
  • Row 0 = LON

  • Row 1 = LAT

  • Row 2 = ELEV

  • Rows 3+ = daily data, ‘ID’ column has dates in YYYYMMDD.

Returns a DataFrame with columns:

[DATE, STATION, VALUE, LON, LAT, ELEV, MEAN_ANNUAL_RAINFALL, zonename]

wass2s.was_bias_correction module

class wass2s.was_bias_correction.WAS_Qmap[source]

Bases: object

Bias correction methods using quantile mapping techniques, adapted from the qmap R package.

This class provides static methods for fitting and applying various bias correction techniques, including empirical quantile mapping (QUANT), robust quantile mapping (RQUANT), smoothing splines (SSPLIN), parametric transformations (PTF), and distribution-based methods (DIST). The methods support both NumPy arrays (1D, 2D, or 3D) and xarray DataArrays (3D: T, Y, X or similar).

All methods can optionally apply wet/dry-day corrections and are designed for precipitation or similar non-negative variables.

Notes

  • Inputs are expected to be non-negative.

  • For gridded data, computations are performed column-wise (per grid cell).

  • xarray support preserves coordinates and attributes.

static _add_basemap(ax, extent=None)[source]

static _collect_vars(ds, prefix)[source]

static _doQmap_internal(x, fobj, **kwargs)[source]

Internal helper to apply bias correction based on fitted class.

Parameters:
  • x (ndarray) – 2D array of data to correct (time, grid).

  • fobj (dict) – Fitted object.

  • **kwargs – Additional arguments for specific methods.

Returns:

Corrected 2D array.

Return type:

ndarray

Raises:

ValueError – If unknown fitted class.

static _to_2d(arr)[source]

Convert input array to 2D (time, grid) format.

Parameters:

arr (array_like) – Input array (0D to 3D).

Returns:

2D array.

Return type:

ndarray

Raises:

ValueError – If more than 3 dimensions.

static _wet_day_threshold(obs, mod, wet_day)[source]

Compute wet day thresholds for observations and model.

Parameters:
  • obs (ndarray) – Observed data column.

  • mod (ndarray) – Modeled data column.

  • wet_day (bool or float) – If False, no threshold is applied (a threshold of 0 is used). If True, the thresholds are computed from the wet-day fraction. If a float, the value is used as the observation threshold and the model threshold is computed to match it.

Returns:

(model_threshold, obs_threshold)

Return type:

tuple
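
The frequency-matching idea behind a wet-day threshold can be sketched as follows. This is an assumption modelled on the approach used by the qmap R package, not a transcription of this method: the model threshold is chosen so that the model has the same number of wet days as the observations. The function name and the sample data are hypothetical.

```python
import math


def wet_day_thresholds_sketch(obs, mod, wet_day):
    """Return (model_threshold, obs_threshold) for one grid cell."""
    if wet_day is False:
        return 0.0, 0.0
    if wet_day is True:
        # Assumed rule: any strictly positive observation is a wet day
        obs_threshold = 0.0
        n_obs_wet = sum(1 for v in obs if v > 0.0)
    else:
        # Assumed rule: observations at or above the given value are wet
        obs_threshold = float(wet_day)
        n_obs_wet = sum(1 for v in obs if v >= obs_threshold)
    # Number of model days that should be wet to match the observed frequency
    n_mod_wet = (n_obs_wet * len(mod)) // len(obs)
    if n_mod_wet == 0:
        return math.inf, obs_threshold
    # The n-th largest model value is the smallest value still counted as wet
    model_threshold = sorted(mod, reverse=True)[n_mod_wet - 1]
    return model_threshold, obs_threshold


obs = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 5.0, 1.0, 3.0]
mod = [0.1, 0.2, 0.0, 0.3, 0.1, 0.5, 1.0, 4.0, 2.0, 0.2]  # drizzle-prone model
thresholds = wet_day_thresholds_sketch(obs, mod, wet_day=True)
```

Here 4 of 10 observed days are wet, so the model threshold becomes the 4th largest model value and the model's light drizzle falls below it.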

static doQmap(x, fobj, **kwargs)[source]

Apply the fitted bias correction to new data.

Parameters:
  • x (array_like or xarray.DataArray) – New modeled data to correct, same format and shape structure as fitting data.

  • fobj (dict) – Fitted object from fitQmap.

  • **kwargs – Additional keyword arguments passed to the specific application method.

Returns:

Bias-corrected data, same type and shape as x.

Return type:

array_like or xarray.DataArray

Raises:

ValueError – If the input type or dimensions do not match the fitted object.

See also

fitQmap

Fit the bias correction model.

static doQmapDIST(x, fobj)[source]

Apply distribution-based (DIST) correction.

Parameters:
  • x (ndarray) – 2D data to correct (time, grid).

  • fobj (dict) – Fitted object from fitQmapDIST.

Returns:

Corrected data.

Return type:

ndarray

static doQmapPTF(x, fobj)[source]

Apply parametric transformation (PTF) correction.

Parameters:
  • x (ndarray) – 2D data to correct (time, grid).

  • fobj (dict) – Fitted object from fitQmapPTF.

Returns:

Corrected data.

Return type:

ndarray

Raises:

ValueError – If unknown transfun in fobj.

static doQmapQUANT(x, fobj, type='linear')[source]

Apply empirical quantile mapping (QUANT) correction.

Parameters:
  • x (ndarray) – 2D data to correct (time, grid).

  • fobj (dict) – Fitted object from fitQmapQUANT.

  • type (str, optional) – Interpolation kind for interp1d (default ‘linear’).

Returns:

Corrected data.

Return type:

ndarray

static doQmapRQUANT(x, fobj, type='linear', slope_bound=[0, inf])[source]

Apply robust quantile mapping (RQUANT) correction.

Parameters:
  • x (ndarray) – 2D data to correct (time, grid).

  • fobj (dict) – Fitted object from fitQmapRQUANT.

  • type (str, optional) – Interpolation kind for interp1d (default ‘linear’).

  • slope_bound (list of float, optional) – Bounds for extrapolation slopes [min, max] (default [0, inf]).

Returns:

Corrected data.

Return type:

ndarray

static doQmapSSPLIN(x, fobj)[source]

Apply quantile mapping correction using PCHIP interpolator.

Parameters:
  • x (ndarray) – 2D data to correct (time, grid).

  • fobj (dict) – Fitted object from fitQmapSSPLIN.

Returns:

Corrected data.

Return type:

ndarray

static evaluate_bias_correction(obs, mod, corrected, wet_threshold=0.1, extreme_quantiles=[0.95, 0.99])[source]

Evaluate bias correction performance with metrics for dry/wet days and extremes.

Parameters:
  • obs (array_like or xarray.DataArray) – Observed data, with shape (T, Y, X) or (T,).

  • mod (array_like or xarray.DataArray) – Modeled (uncorrected) data (same shape as obs).

  • corrected (array_like or xarray.DataArray) – Bias-corrected data (same shape as obs).

  • wet_threshold (float, optional) – Threshold for wet days (default 0.1 mm).

  • extreme_quantiles (list of float, optional) – List of quantiles for extremes (default [0.95, 0.99]).

Returns:

A dictionary of evaluation metrics, or an xarray.Dataset if the inputs are DataArrays.

Return type:

dict or xarray.Dataset
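
For a single 1D series, metrics of this kind could be computed roughly as below. The metric names are guesses based on the variable prefixes used by the plotting helpers in this class (dry_fraction_, wet_fraction_, mean_wet_, extreme_quantiles_). The wet-day rule (greater than or equal to the threshold) and the linear-interpolation quantile are assumptions, and the function name is hypothetical.

```python
import math


def precip_metrics_sketch(series, wet_threshold=0.1, extreme_quantiles=(0.95, 0.99)):
    """Compute dry/wet fractions, mean wet-day amount, and upper quantiles."""
    n = len(series)
    wet = [v for v in series if v >= wet_threshold]
    ordered = sorted(series)

    def quantile(p):
        # Linear interpolation between the two nearest order statistics
        pos = p * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return {
        "wet_fraction": len(wet) / n,
        "dry_fraction": (n - len(wet)) / n,
        "mean_wet": sum(wet) / len(wet) if wet else math.nan,
        "extreme_quantiles": {p: quantile(p) for p in extreme_quantiles},
    }


series = [0.0, 0.0, 0.05, 1.0, 3.0]
metrics = precip_metrics_sketch(series)
```

Computing the same dictionary for obs, mod, and corrected makes it easy to check whether the correction moved each metric toward the observed value.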

static fitQmap(obs, mod, method, **kwargs)[source]

Fit a bias correction model using the specified quantile mapping method.

Parameters:
  • obs (array_like or xarray.DataArray) – Observed data. If array_like, can be 1D (time), 2D (time, grid), or 3D (T, Y, X). If xarray.DataArray, must be 3D with dimensions (T, Y, X).

  • mod (array_like or xarray.DataArray) – Modeled data to fit against, same shape as obs.

  • method (str) – Bias correction method. Options: ‘QUANT’, ‘RQUANT’, ‘SSPLIN’, ‘PTF’, ‘DIST’ (case-insensitive).

  • **kwargs – Additional keyword arguments passed to the specific fitting method.

Returns:

Fitted object containing parameters, class identifier, and metadata for applying correction.

Return type:

dict

Raises:

ValueError – If the shapes do not match, the dimensions are invalid, or the method is unknown.

See also

doQmap

Apply the fitted bias correction to new data.

static fitQmapDIST(obs, mod, distr='berngamma', qstep=None, **kwargs)[source]

Fit distribution-based quantile mapping (DIST).

Uses Bernoulli for wet/dry and a continuous distribution for wet values.

Parameters:
  • obs (ndarray) – 2D observed data (time, grid).

  • mod (ndarray) – 2D modeled data (time, grid).

  • distr (str, optional) – Wet distribution: ‘berngamma’, ‘bernexp’, ‘bernlnorm’, ‘bernweibull’ (default ‘berngamma’).

  • qstep (float, optional) – If set, fit on quantiles; otherwise fit on the full data.

  • **kwargs – Additional arguments (unused).

Returns:

Fitted parameters including distributions and transfer functions.

Return type:

dict

Raises:

ValueError – If unknown distr.

static fitQmapPTF(obs, mod, transfun='power', parini=None, cost='RSS', wet_day=False, qstep=None)[source]

Fit parametric transformation function (PTF) for bias correction.

Parameters:
  • obs (ndarray) – 2D observed data (time, grid).

  • mod (ndarray) – 2D modeled data (time, grid).

  • transfun (str or callable, optional) – Transformation function: ‘power’, ‘power.x0’, ‘expasympt’, ‘expasympt.x0’, ‘scale’, ‘linear’, or a custom callable (default ‘power’).

  • parini (list, optional) – Initial parameter guesses for optimization.

  • cost (str, optional) – Cost function for optimization: ‘RSS’ or ‘MAE’ (default ‘RSS’).

  • wet_day (bool or float, optional) – Wet day handling (default False).

  • qstep (float, optional) – If set, fit on quantiles with this step; else fit on sorted data.

Returns:

Fitted parameters including transformation and thresholds.

Return type:

dict

Raises:

ValueError – If unknown transfun or cost.
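
To give a feel for a parametric transfer function, the sketch below fits a power law of the assumed form obs ≈ b · mod^c to the sorted values. It deliberately swaps in a closed-form log-log least-squares fit instead of the iterative optimisation with initial guesses and an RSS or MAE cost that the parameters above describe, so it is an illustration only. The functional form and the function name are assumptions.

```python
import math


def fit_power_sketch(obs, mod):
    """Estimate (b, c) in obs ~ b * mod**c from sorted, strictly positive pairs."""
    # Pair sorted values (empirical quantile matching) and drop non-positive ones,
    # since the log transform is undefined there
    pairs = [
        (math.log(m), math.log(o))
        for o, m in zip(sorted(obs), sorted(mod))
        if o > 0 and m > 0
    ]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    c = sxy / sxx                      # slope in log-log space = exponent
    b = math.exp(mean_y - c * mean_x)  # intercept in log-log space = log(b)
    return b, c


mod = [1.0, 2.0, 4.0, 8.0]
obs = [2.0 * m ** 1.5 for m in mod]  # synthetic data that follows the power law
b, c = fit_power_sketch(obs, mod)
corrected = [b * m ** c for m in mod]
```

On this synthetic data the fit recovers b ≈ 2 and c ≈ 1.5. On real data, a least-squares fit in log space weights errors differently from an RSS fit in the original units.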

static fitQmapQUANT(obs, mod, wet_day=False, qstep=0.01, nboot=1)[source]

Fit empirical quantile mapping (QUANT) with optional bootstrapping.

Parameters:
  • obs (ndarray) – 2D observed data (time, grid).

  • mod (ndarray) – 2D modeled data (time, grid).

  • wet_day (bool or float, optional) – Wet day handling (default False).

  • qstep (float, optional) – Quantile step size (default 0.01).

  • nboot (int, optional) – Number of bootstrap samples for observed quantiles (default 1, no bootstrap).

Returns:

Fitted parameters including quantiles and thresholds.

Return type:

dict
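
Empirical quantile mapping can be sketched for a single series as follows: estimate matching quantiles of the modeled and observed data, then map new values by piecewise-linear interpolation between them. This is a simplified illustration without wet-day handling or bootstrapping. The dictionary keys `modq` and `fitq`, the clamping outside the fitted range, and the function names are assumptions and do not describe the fitted object's actual layout.

```python
def _quantile(sorted_values, p):
    """Linear-interpolation quantile of pre-sorted values, 0 <= p <= 1."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def fit_quant_sketch(obs, mod, qstep=0.25):
    """Store modeled and observed quantiles at probabilities 0, qstep, ..., 1."""
    n_steps = round(1 / qstep)
    probs = [i / n_steps for i in range(n_steps + 1)]
    obs_sorted, mod_sorted = sorted(obs), sorted(mod)
    return {
        "modq": [_quantile(mod_sorted, p) for p in probs],
        "fitq": [_quantile(obs_sorted, p) for p in probs],
    }


def apply_quant_sketch(x, fobj):
    """Map each value from the modeled to the observed quantile curve."""
    modq, fitq = fobj["modq"], fobj["fitq"]
    corrected = []
    for value in x:
        if value <= modq[0]:
            corrected.append(fitq[0])    # clamp below the fitted range
        elif value >= modq[-1]:
            corrected.append(fitq[-1])   # clamp above the fitted range
        else:
            for k in range(len(modq) - 1):
                if modq[k] <= value <= modq[k + 1]:
                    span = modq[k + 1] - modq[k]
                    w = (value - modq[k]) / span if span > 0 else 0.0
                    corrected.append(fitq[k] + w * (fitq[k + 1] - fitq[k]))
                    break
    return corrected


mod = [1.0, 2.0, 3.0, 4.0, 5.0]
obs = [2.0, 4.0, 6.0, 8.0, 10.0]  # observations are twice the model here
fobj = fit_quant_sketch(obs, mod)
corrected = apply_quant_sketch([1.5, 3.0, 4.5], fobj)
```

A smaller qstep gives a finer transfer function but makes the tail quantiles noisier for short records.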

static fitQmapRQUANT(obs, mod, wet_day=True, qstep=0.01, nlls=10, nboot=10)[source]

Fit robust quantile mapping (RQUANT) with local linear fitting.

Parameters:
  • obs (ndarray) – 2D observed data (time, grid).

  • mod (ndarray) – 2D modeled data (time, grid).

  • wet_day (bool or float, optional) – Wet day handling (default True).

  • qstep (float, optional) – Quantile step size (default 0.01).

  • nlls (int, optional) – Number of local quantiles for linear fit (default 10).

  • nboot (int, optional) – Number of bootstrap samples (default 10).

Returns:

Fitted parameters including quantiles, slopes, and thresholds.

Return type:

dict

static fitQmapSSPLIN(obs, mod, wet_day=False, qstep=0.01)[source]

Fit quantile mapping for the SSPLIN method using a PCHIP interpolator.

Parameters:
  • obs (ndarray) – 2D observed data (time, grid).

  • mod (ndarray) – 2D modeled data (time, grid).

  • wet_day (bool or float, optional) – Wet day handling (default False).

  • qstep (float, optional) – Quantile step size (default 0.01).

Returns:

Fitted parameters including quantiles and thresholds.

Return type:

dict

static plot_extreme_quantiles_group(ds, extent=None, robust=True)[source]

Plots extreme_quantiles_* variables, faceting by variable (columns) and quantile (rows). Assumes dims (‘quantile’, ‘Y’, ‘X’).

static plot_fraction_group(ds, group_prefix, extent=None, robust=True)[source]

Plots dry/wet fraction groups (e.g., ‘dry_fraction_’ or ‘wet_fraction_’).

static plot_mean_wet_group(ds, extent=None, robust=True)[source]

Plots mean_wet_* variables.

class wass2s.was_bias_correction.WAS_bias_correction[source]

Bases: object

Bias correction methods for climate variables such as temperature or wind speed.

This class provides static methods for fitting and applying bias correction techniques suited to continuous variables that may take negative values or be positively skewed. The techniques are mean adjustment, variance scaling, empirical quantile mapping (non-parametric), and parametric mapping under an assumed distribution (normal, lognormal, gamma, or weibull). The methods support both NumPy arrays (1D, 2D, or 3D) and xarray DataArrays (3D: time, lat, lon or similar).

Notes

  • Inputs may be negative and are treated as continuous. For positively skewed data such as wind speed, use a distribution such as ‘lognormal’, ‘gamma’, or ‘weibull’.

  • No handling for wet/dry days.

  • For gridded data, computations are performed column-wise (per grid cell).

  • xarray support preserves coordinates and attributes.

  • Non-parametric method: ‘QUANT’ (empirical quantile mapping).

  • Parametric methods: ‘NORM’ (normal), or ‘DIST’ with specified distribution.

  • NaN handling: NaNs are filtered out during fitting. If a grid cell has fewer than 2 valid points in obs or mod, it is flagged as all_nan and the correction returns NaN for that cell. NaNs in the data being corrected propagate to the output.

static _doBC_internal(x, fobj, **kwargs)[source]

Internal helper to apply bias correction based on fitted class.

Parameters:
  • x (ndarray) – 2D array of data to correct (time, grid).

  • fobj (dict) – Fitted object.

  • **kwargs – Additional arguments for specific methods.

Returns:

Corrected 2D array.

Return type:

ndarray

Raises:

ValueError – If unknown fitted class.

static _to_2d(arr)[source]

Convert input array to 2D (time, grid) format.

Parameters:

arr (array_like) – Input array (0D to 3D).

Returns:

2D array.

Return type:

ndarray

Raises:

ValueError – If more than 3 dimensions.

static doBC(x, fobj, **kwargs)[source]

Apply the fitted bias correction to new data.

Parameters:
  • x (array_like or xarray.DataArray) – New modeled data to correct, same format and shape structure as fitting data.

  • fobj (dict) – Fitted object from fitBC.

  • **kwargs – Additional keyword arguments passed to the specific application method.

Returns:

Bias-corrected data, same type and shape as x.

Return type:

array_like or xarray.DataArray

Raises:

ValueError – If the input type or dimensions do not match the fitted object.

See also

fitBC

Fit the bias correction model.

static doDist(x, fobj)[source]

Apply parametric distribution bias correction.

Parameters:
  • x (ndarray) – 2D data to correct (time, grid).

  • fobj (dict) – Fitted object from fitDist.

Returns:

Corrected data.

Return type:

ndarray

static doMean(x, fobj)[source]

Apply mean additive bias correction.

Parameters:
  • x (ndarray) – 2D data to correct (time, grid).

  • fobj (dict) – Fitted object from fitMean.

Returns:

Corrected data.

Return type:

ndarray

static doQuant(x, fobj, type='linear')[source]

Apply empirical quantile mapping correction (non-parametric).

Parameters:
  • x (ndarray) – 2D data to correct (time, grid).

  • fobj (dict) – Fitted object from fitQuant.

  • type (str, optional) – Interpolation kind for interp1d (default ‘linear’).

Returns:

Corrected data.

Return type:

ndarray

static doVarscale(x, fobj)[source]

Apply variance scaling bias correction.

Parameters:
  • x (ndarray) – 2D data to correct (time, grid).

  • fobj (dict) – Fitted object from fitVarscale.

Returns:

Corrected data.

Return type:

ndarray

static fitBC(obs, mod, method, **kwargs)[source]

Fit a bias correction model using the specified method.

Parameters:
  • obs (array_like or xarray.DataArray) – Observed data. If array_like, can be 1D (time), 2D (time, grid), or 3D (time, y, x). If xarray.DataArray, must be 3D with dimensions (time, y, x).

  • mod (array_like or xarray.DataArray) – Modeled data to fit against, same shape as obs.

  • method (str) – Bias correction method. Options: ‘MEAN’, ‘VARSCALE’, ‘QUANT’, ‘NORM’, ‘DIST’ (case-insensitive).

  • **kwargs – Additional keyword arguments passed to the specific fitting method. For ‘DIST’, pass ‘distr’ (‘normal’ by default; other options include ‘lognormal’, ‘gamma’, and ‘weibull’).

Returns:

Fitted object containing parameters, class identifier, and metadata for applying correction.

Return type:

dict

Raises:

ValueError – If the shapes do not match, the dimensions are invalid, or the method is unknown.

See also

doBC

Apply the fitted bias correction to new data.

static fitDist(obs, mod, distr='normal')[source]

Fit parametric bias correction assuming a specified distribution.

Suitable for skewed data like wind speed with ‘lognormal’, ‘gamma’, or ‘weibull’.

Parameters:
  • obs (ndarray) – 2D observed data (time, grid).

  • mod (ndarray) – 2D modeled data (time, grid).

  • distr (str, optional) – Distribution: ‘normal’, ‘lognormal’, ‘gamma’, ‘weibull’ (default ‘normal’).

Returns:

Fitted parameters including distribution parameters.

Return type:

dict

Raises:

ValueError – If unknown distribution.
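
Parametric mapping sends a modeled value through the fitted model CDF and then through the inverse of the fitted observation CDF. The sketch below shows this for the normal case only, using `statistics.NormalDist` from the standard library. The function names and the dictionary layout are hypothetical, and the way the package estimates distribution parameters is not specified here.

```python
from statistics import NormalDist


def fit_dist_normal_sketch(obs, mod):
    """Fit one normal distribution to each series (sample mean and stdev)."""
    return {
        "obs": NormalDist.from_samples(obs),
        "mod": NormalDist.from_samples(mod),
    }


def apply_dist_normal_sketch(x, fobj):
    """Map x -> F_obs^{-1}(F_mod(x)) for each value."""
    return [fobj["obs"].inv_cdf(fobj["mod"].cdf(v)) for v in x]


obs = [10.0, 14.0, 18.0, 22.0]
mod = [11.0, 13.0, 15.0, 17.0]
fobj = fit_dist_normal_sketch(obs, mod)
corrected = apply_dist_normal_sketch([14.0, 15.0], fobj)
```

For two normal distributions this mapping reduces to shifting and rescaling. Skewed distributions such as gamma or weibull bend the transfer function instead.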

static fitMean(obs, mod)[source]

Fit mean additive bias correction.

Parameters:
  • obs (ndarray) – 2D observed data (time, grid).

  • mod (ndarray) – 2D modeled data (time, grid).

Returns:

Fitted parameters including delta (mean difference).

Return type:

dict
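
A mean additive correction amounts to shifting the modeled values by the difference between the observed and modeled means. The sketch below assumes the sign convention delta = mean(obs) − mean(mod), which the description above does not state, and uses hypothetical function names.

```python
from statistics import fmean


def fit_mean_sketch(obs, mod):
    """Store the mean difference between observations and model."""
    return {"delta": fmean(obs) - fmean(mod)}


def apply_mean_sketch(x, fobj):
    """Shift every value by the fitted mean difference."""
    return [v + fobj["delta"] for v in x]


obs = [25.0, 27.0, 26.0, 28.0]  # e.g. observed temperatures
mod = [22.0, 24.0, 23.0, 25.0]  # model runs 3 degrees too cold
fobj = fit_mean_sketch(obs, mod)
corrected = apply_mean_sketch([23.5, 24.5], fobj)
```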

static fitQuant(obs, mod, qstep=0.01, nboot=1)[source]

Fit empirical quantile mapping (non-parametric).

Parameters:
  • obs (ndarray) – 2D observed data (time, grid).

  • mod (ndarray) – 2D modeled data (time, grid).

  • qstep (float, optional) – Quantile step size (default 0.01).

  • nboot (int, optional) – Number of bootstrap samples for observed quantiles (default 1, no bootstrap).

Returns:

Fitted parameters including quantiles.

Return type:

dict

static fitVarscale(obs, mod)[source]

Fit variance scaling bias correction.

Parameters:
  • obs (ndarray) – 2D observed data (time, grid).

  • mod (ndarray) – 2D modeled data (time, grid).

Returns:

Fitted parameters including means and standard deviations.

Return type:

dict
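
Variance scaling is commonly written as corrected = mean_obs + (x − mean_mod) · sd_obs / sd_mod. The sketch below implements that textbook form for a single series. Whether the package uses population or sample standard deviations is not stated, so the use of `pstdev` is an assumption, as are the dictionary keys and function names.

```python
from statistics import fmean, pstdev


def fit_varscale_sketch(obs, mod):
    """Store means and (population) standard deviations of both series."""
    return {
        "obs_mean": fmean(obs), "obs_sd": pstdev(obs),
        "mod_mean": fmean(mod), "mod_sd": pstdev(mod),
    }


def apply_varscale_sketch(x, fobj):
    """Centre on the model mean, rescale the spread, re-centre on the obs mean."""
    scale = fobj["obs_sd"] / fobj["mod_sd"]
    return [fobj["obs_mean"] + (v - fobj["mod_mean"]) * scale for v in x]


obs = [10.0, 14.0, 18.0, 22.0]
mod = [11.0, 13.0, 15.0, 17.0]  # too little spread, mean too low
fobj = fit_varscale_sketch(obs, mod)
corrected = apply_varscale_sketch(mod, fobj)
```

Applied to the training model series itself, the correction reproduces the observed mean and spread.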

wass2s.was_merge_predictand module

wass2s.was_cross_validate module

wass2s.was_linear_models module

wass2s.was_eof module

wass2s.was_pcr module

wass2s.was_cca module

wass2s.was_machine_learning module

wass2s.was_analog module

wass2s.was_verification module

wass2s.was_mme module

wass2s.utils module