qqplot

hydrostats.visual.qqplot(merged_data_df: DataFrame | None = None, sim_array: ndarray[tuple[Any, ...], dtype[floating | integer]] | Sequence[int | float] | None = None, obs_array: ndarray[tuple[Any, ...], dtype[floating | integer]] | Sequence[int | float] | None = None, interpolate: Literal['inverted_cdf', 'averaged_inverted_cdf', 'closest_observation', 'interpolated_inverted_cdf', 'hazen', 'weibull', 'linear', 'median_unbiased', 'normal_unbiased', 'lower', 'higher', 'midpoint', 'nearest'] = 'linear', title: str | None = None, xlabel: str = 'Simulated Data Quantiles', ylabel: str = 'Observed Data Quantiles', legend: bool = False, replace_nan: float | None = None, replace_inf: float | None = None, remove_neg: bool = False, remove_zero: bool = False, figsize: tuple[float, float] = (12, 8)) → Figure

Plot a Quantile-Quantile plot of the simulated and observed data.

Useful for checking whether the two datasets come from the same distribution.
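
The idea behind a Q-Q plot can be sketched with NumPy alone. This is a minimal illustration and not the hydrostats implementation: the helper name qq_points and the 1st to 99th percentile grid are assumptions made for the example, and the method keyword of numpy.percentile requires NumPy 1.22 or later. If both samples come from the same distribution, the paired quantiles fall close to the 1:1 line.

```python
import numpy as np


def qq_points(sim, obs, method="linear"):
    """Hypothetical helper: return paired quantiles of two samples."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Assumed percentile grid (1st to 99th); the library may use a different one.
    q = np.arange(1, 100)
    sim_q = np.percentile(sim, q, method=method)
    obs_q = np.percentile(obs, q, method=method)
    return sim_q, obs_q


# Synthetic data drawn from one distribution, so the points should
# lie near the 1:1 line when obs_q is plotted against sim_q.
rng = np.random.default_rng(0)
sim = rng.gamma(shape=2.0, scale=50.0, size=500)
obs = rng.gamma(shape=2.0, scale=50.0, size=500)
sim_q, obs_q = qq_points(sim, obs)
```

Plotting obs_q against sim_q gives the scatter of a Q-Q plot; systematic departures from the diagonal indicate that the two distributions differ.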

Parameters:
  • merged_data_df (DataFrame) – Dataframe must contain a datetime type index and floating point type numbers in two columns. The left column must be simulated data and the right column must be observed data. If given, sim_array and obs_array must be None.

  • sim_array (1D ndarray) – Array of simulated data. If given, merged_data_df parameter must be None and obs_array must be given.

  • obs_array (1D ndarray) – Array of observed data. If given, merged_data_df parameter must be None and sim_array must be given.

  • interpolate (str) – Specifies the interpolation type when computing quantiles. Available options can be found at https://numpy.org/doc/stable/reference/generated/numpy.percentile.html#numpy-percentile. See the “method” argument.

  • title (str) – If given, sets the title of the plot.

  • xlabel (str) – The label for the x axis that holds the simulated data quantiles.

  • ylabel (str) – The label for the y axis that holds the observed data quantiles.

  • legend (bool) – If True, a legend to explain the elements on the plot will be added.

  • replace_nan (float, optional) – If given, the value that replaces NaN values in the two arrays. If None, when a NaN value is found at the i-th position in the observed OR simulated array, the i-th values of both the observed AND simulated arrays are removed before the computation.

  • replace_inf (float, optional) – If given, the value that replaces Inf values in the two arrays. If None, when an Inf value is found at the i-th position in the observed OR simulated array, the i-th values of both the observed AND simulated arrays are removed before the computation.

  • remove_neg (boolean, optional) – If True, when a negative value is found at the i-th position in the observed OR simulated array, the i-th values of both the observed AND simulated arrays are removed before the computation.

  • remove_zero (boolean, optional) – If True, when a zero value is found at the i-th position in the observed OR simulated array, the i-th values of both the observed AND simulated arrays are removed before the computation.

  • figsize (tuple of float) – Tuple of length two that specifies the horizontal and vertical lengths of the plot in inches, respectively.
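
The pairwise removal described for replace_nan, replace_inf, remove_neg and remove_zero can be sketched as follows. This is an illustration under stated assumptions, not the library's code: drop_pairs is a hypothetical helper, and it models only the case where replace_nan and replace_inf are None, so that NaN and Inf entries are dropped rather than replaced.

```python
import numpy as np


def drop_pairs(sim, obs, remove_neg=False, remove_zero=False):
    """Hypothetical helper: drop position i from BOTH arrays when either
    array holds an unwanted value at position i."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Keep only positions where both values are finite (no NaN, no Inf).
    keep = np.isfinite(sim) & np.isfinite(obs)
    if remove_neg:
        keep &= (sim >= 0) & (obs >= 0)
    if remove_zero:
        keep &= (sim != 0) & (obs != 0)
    return sim[keep], obs[keep]


sim = np.array([1.0, np.nan, 3.0, -4.0, 0.0, 6.0])
obs = np.array([1.5, 2.5, np.inf, 4.5, 5.5, 6.5])
clean_sim, clean_obs = drop_pairs(sim, obs, remove_neg=True, remove_zero=True)
# Only positions 0 and 5 survive: the others contain NaN, Inf,
# a negative value, or a zero in at least one of the arrays.
```

Removing values in pairs keeps the two arrays the same length and aligned, which matters when they originate from a merged, time-indexed dataset.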

Returns:

  • A matplotlib figure handle, which can be viewed with the matplotlib.pyplot.show() command.

Examples

>>> import hydrostats.data as hd
>>> import hydrostats.visual as hv
>>> import matplotlib.pyplot as plt
>>> sfpt_url = r"https://github.com/waderoberts123/Hydrostats/raw/master/Sample_data/sfpt_data/magdalena-calamar_interim_data.csv"
>>> glofas_url = r"https://github.com/waderoberts123/Hydrostats/raw/master/Sample_data/GLOFAS_Data/magdalena-calamar_ECMWF_data.csv"
>>> merged_df = hd.merge_data(sfpt_url, glofas_url, column_names=("SFPT", "GLOFAS"))
>>> hv.qqplot(
...     merged_data_df=merged_df,
...     title="Quantile-Quantile Plot of Data",
...     xlabel="SFPT Data Quantiles",
...     ylabel="GLOFAS Data Quantiles",
...     legend=True,
...     figsize=(8, 6),
... )
>>> plt.show()