ens_crps

hydrostats.ens_metrics.ens_crps(obs, fcst_ens, adj=nan, remove_neg=False, remove_zero=False, llvm=True)

Calculate the ensemble-adjusted Continuous Ranked Probability Score (CRPS)

Parameters:
  • obs (1D ndarray) – array of observations for each start date.
  • fcst_ens (2D ndarray) – Array of ensemble forecast of dimension n x M, where n = number of start dates and M = number of ensemble members.
  • adj (float or int) – A positive number representing ensemble size for which the scores should be adjusted. If np.nan (default) scores will not be adjusted. This value can be ‘np.inf‘, in which case the adjusted (or fair) crps values will be calculated as per equation 6 in Leutbecher et al. (2018).
  • remove_neg (bool) – If True, when a negative value is found at the i-th position in the observed OR ensemble array, the i-th value of the observed AND ensemble array are removed before the computation.
  • remove_zero (bool) – If true, when a zero value is found at the i-th position in the observed OR ensemble array, the i-th value of the observed AND ensemble array are removed before the computation.
  • llvm (bool) – If true (default) the crps will be calculated using the numba module, which uses the LLVM compiler infrastructure for enhanced performance. If this is not wanted, then set it to false for a pure python implementation.
Returns:

Dictionary contains two keys, crps and crpsMean. The value of crps is a list of the crps values. Note that if the ensemble forecast or the observed values contained NaN or inf values, or negative or zero values if specified, these start dates will not show up in the crps values. The crpsMean value is the arithmatic mean of the crps values.

Return type:

dict

Examples

>>> import numpy as np
>>> import hydrostats.ens_metrics as em
>>> np.random.seed(3849590438)

Creating an observed 1D array and an ensemble 2D array with all random numbers

>>> ens_array_random = (np.random.rand(15, 52) + 1) * 100  # 52 Ensembles
>>> obs_array_random = (np.random.rand(15) + 1) * 100

Creating an observed 1D array and an ensemble 2D array with noise.

>>> noise = np.random.normal(scale=1, size=(15, 52))
>>> x = np.linspace(1, 10, 15)
>>> observed_array = np.sin(x) + 10
>>> ensemble_array_noise = (np.ones((15, 52)).T * observed_array).T + noise  # 52 Ensembles

Computing the crps values between the ensemble mean and the observed data with the random data. Note that the crps is relatively high because it is random.

>>> crps_dictionary_rand = em.ens_crps(obs_array_random, ens_array_random)
>>> print(crps_dictionary_rand['crps'])
[ 7.73360237  9.59248626 34.46719655 30.10271075  7.451665   16.07882352
 14.59543529  8.55181637 15.4833089   8.32422363 16.55108154 19.20821296
  8.39452279 12.59949378 27.82543302]
>>> crps_dictionary_rand['crpsMean']
15.797334183277709

Computing the crps values between the ensemble mean and the observed data with noise in the ensemble data. Note that the crps values are better because the forecast is closer to observed values.

>>> crps_dictionary_noise = em.ens_crps(obs=observed_array, fcst_ens=ensemble_array_noise)
>>> print(crps_dictionary_noise['crps'])
[0.26921152 0.21388687 0.24927151 0.26047667 0.30234843 0.1996493
 0.2779844  0.29478927 0.275383   0.25682693 0.21485236 0.22824711
 0.2813889  0.21264652 0.18141063]
>>> crps_dictionary_noise['crpsMean']
0.24789156041214638

References

  • Gneiting, T. and Raftery, A. E. (2007) Strictly proper scoring rules, prediction and estimation, J. Amer. Stat. Asoc., 102, 359-378.
  • Leutbecher, M. (2018) Ensemble size: How suboptimal is less than infinity?, Q. J. R. Meteorol., Accepted.
  • Ferro CAT, Richardson SR, Weigel AP (2008) On the effect of ensemble size on the discrete and continuous ranked probability scores. Meteorological Applications. doi: 10.1002/met.45
  • Stefan Siegert (2017). SpecsVerification: Forecast Verification Routines for Ensemble Forecasts of Weather and Climate. R package version 0.5-2. https://CRAN.R-project.org/package=SpecsVerification