qqplot
- hydrostats.visual.qqplot(merged_data_df: DataFrame | None = None, sim_array: ndarray[tuple[Any, ...], dtype[floating | integer]] | Sequence[int | float] | None = None, obs_array: ndarray[tuple[Any, ...], dtype[floating | integer]] | Sequence[int | float] | None = None, interpolate: Literal['inverted_cdf', 'averaged_inverted_cdf', 'closest_observation', 'interpolated_inverted_cdf', 'hazen', 'weibull', 'linear', 'median_unbiased', 'normal_unbiased', 'lower', 'higher', 'midpoint', 'nearest'] = 'linear', title: str | None = None, xlabel: str = 'Simulated Data Quantiles', ylabel: str = 'Observed Data Quantiles', legend: bool = False, replace_nan: float | None = None, replace_inf: float | None = None, remove_neg: bool = False, remove_zero: bool = False, figsize: tuple[float, float] = (12, 8)) -> Figure
Create a quantile-quantile (Q-Q) plot of the simulated and observed data.
Useful for checking whether the two datasets come from the same distribution.
- Parameters:
merged_data_df (DataFrame) – DataFrame with a datetime index and two columns of floating-point values. The left column must hold the simulated data and the right column the observed data. If given, sim_array and obs_array must be None.
sim_array (1D ndarray) – Array of simulated data. If given, merged_data_df parameter must be None and obs_array must be given.
obs_array (1D ndarray) – Array of observed data. If given, merged_data_df parameter must be None and sim_array must be given.
interpolate (str) – Specifies the interpolation method used when computing quantiles. Defaults to 'linear'. The available options are those listed for the “method” argument at https://numpy.org/doc/stable/reference/generated/numpy.percentile.html#numpy-percentile.
title (str) – If given, sets the title of the plot.
xlabel (str) – The label for the x axis that holds the simulated data quantiles.
ylabel (str) – The label for the y axis that holds the observed data quantiles.
legend (bool) – If True, a legend to explain the elements on the plot will be added.
replace_nan (float, optional) – If given, the value used to replace NaN values in the two arrays. If None, whenever a NaN value is found at the i-th position of either the observed or the simulated array, the i-th value is removed from both arrays before the computation.
replace_inf (float, optional) – If given, the value used to replace Inf values in the two arrays. If None, whenever an Inf value is found at the i-th position of either the observed or the simulated array, the i-th value is removed from both arrays before the computation.
remove_neg (bool, optional) – If True, whenever a negative value is found at the i-th position of either the observed or the simulated array, the i-th value is removed from both arrays before the computation.
remove_zero (bool, optional) – If True, whenever a zero value is found at the i-th position of either the observed or the simulated array, the i-th value is removed from both arrays before the computation.
figsize (tuple of float) – Tuple of length two that specifies the horizontal and vertical lengths of the plot in inches, respectively.
- Returns:
A matplotlib figure handle, which can be displayed with matplotlib.pyplot.show().
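The pairwise handling described for replace_nan, replace_inf, remove_neg and remove_zero can be illustrated with a small NumPy sketch. This is not the library's internal code: clean_pairs is a hypothetical helper written only to show the documented behavior, namely that a bad value in either array drops that position from both arrays unless a replacement value is supplied.

```python
import numpy as np


def clean_pairs(sim, obs, replace_nan=None, replace_inf=None,
                remove_neg=False, remove_zero=False):
    """Hypothetical helper mirroring the documented pairwise cleaning."""
    # np.array copies the inputs, so the caller's data is left untouched.
    sim = np.array(sim, dtype=float)
    obs = np.array(obs, dtype=float)
    keep = np.ones(sim.shape, dtype=bool)  # positions that survive cleaning

    for arr in (sim, obs):
        if replace_nan is not None:
            arr[np.isnan(arr)] = replace_nan
        else:
            keep &= ~np.isnan(arr)
        if replace_inf is not None:
            arr[np.isinf(arr)] = replace_inf
        else:
            keep &= ~np.isinf(arr)
        if remove_neg:
            keep &= ~(arr < 0)
        if remove_zero:
            keep &= arr != 0

    # The same mask is applied to both arrays, so the pairs stay aligned.
    return sim[keep], obs[keep]


sim = [1.0, np.nan, 3.0, -4.0, 0.0, 6.0]
obs = [1.5, 2.5, np.inf, 4.5, 5.5, 6.5]

# Drop NaN, Inf, negative and zero positions from both arrays.
sim_clean, obs_clean = clean_pairs(sim, obs, remove_neg=True, remove_zero=True)

# Replace NaN with 0.0 instead of dropping it; the Inf position is still dropped.
sim_filled, obs_filled = clean_pairs(sim, obs, replace_nan=0.0)
```

In the first call only positions 0 and 5 survive, because every other position has a NaN, Inf, negative or zero value in at least one of the two arrays.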
Examples
>>> import hydrostats.data as hd
>>> import hydrostats.visual as hv
>>> import matplotlib.pyplot as plt

>>> sfpt_url = r"https://github.com/waderoberts123/Hydrostats/raw/master/Sample_data/sfpt_data/magdalena-calamar_interim_data.csv"
>>> glofas_url = r"https://github.com/waderoberts123/Hydrostats/raw/master/Sample_data/GLOFAS_Data/magdalena-calamar_ECMWF_data.csv"
>>> merged_df = hd.merge_data(sfpt_url, glofas_url, column_names=("SFPT", "GLOFAS"))

>>> hv.qqplot(
...     merged_data_df=merged_df,
...     title="Quantile-Quantile Plot of Data",
...     xlabel="SFPT Data Quantiles",
...     ylabel="GLOFAS Data Quantiles",
...     legend=True,
...     figsize=(8, 6),
... )
>>> plt.show()
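To make the role of the interpolate option more concrete, the sketch below computes matched quantiles for two small arrays with numpy.percentile, whose method argument accepts the same option names. It is only an illustration of the idea behind a Q-Q plot: qq_quantiles is a hypothetical helper, and the evenly spaced percentile grid is an assumption, not necessarily what hydrostats uses internally. The method keyword requires NumPy 1.22 or later.

```python
import numpy as np


def qq_quantiles(sim, obs, method="linear", n_points=101):
    """Hypothetical helper: matched percentiles of two 1D datasets."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Assumed grid: n_points evenly spaced percentiles from 0 to 100.
    pcts = np.linspace(0, 100, n_points)
    sim_q = np.percentile(sim, pcts, method=method)
    obs_q = np.percentile(obs, pcts, method=method)
    return pcts, sim_q, obs_q


sim = [1.0, 2.0, 3.0, 4.0, 5.0]
obs = [2.0, 4.0, 6.0, 8.0, 10.0]
pcts, sim_q, obs_q = qq_quantiles(sim, obs, n_points=5)

# The choice of method matters when a percentile falls between two samples.
data = [1.0, 2.0, 3.0, 4.0]
median_linear = np.percentile(data, 50, method="linear")  # 2.5
median_lower = np.percentile(data, 50, method="lower")    # 2.0
median_higher = np.percentile(data, 50, method="higher")  # 3.0
```

Plotting sim_q against obs_q gives the points of a Q-Q plot. Here obs is exactly twice sim, so the points fall on a straight line with slope 2 rather than on the 1:1 line, which indicates distributions of the same shape but different scale.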