hist
hydrostats.visual.hist(merged_data_df=None, sim_array=None, obs_array=None, num_bins=100, z_norm=False, legend=('Simulated', 'Observed'), grid=False, title=None, labels=None, prob_dens=False, figsize=(12, 6))
Plots a histogram comparing simulated and observed data.
This function plots the histograms of two time series on the same figure so that their distributions can be compared. The data can optionally be Z-score normalized (z_norm), and the histograms can optionally be scaled to form a probability density (prob_dens).
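As a rough illustration of what the two options mean numerically, the sketch below applies a Z-score normalization and a probability density scaling with NumPy alone. It does not call hydrostats and is not taken from its source: the z_score helper is hypothetical, the sample data are made up, and the use of the population standard deviation is an assumption.

```python
import numpy as np


def z_score(values):
    """Hypothetical helper: shift to zero mean and scale to unit standard deviation."""
    values = np.asarray(values, dtype=float)
    # Assumption: population standard deviation (ddof=0), ignoring NaN values.
    return (values - np.nanmean(values)) / np.nanstd(values)


# Made-up streamflow-like samples, for illustration only.
rng = np.random.default_rng(seed=0)
sim = rng.gamma(shape=2.0, scale=50.0, size=1000)
obs = rng.gamma(shape=2.5, scale=45.0, size=1000)

# Z-score normalization puts both series on a common, unitless scale.
sim_z = z_score(sim)
obs_z = z_score(obs)

# density=True rescales the bin counts so that the bar areas sum to 1.
density, edges = np.histogram(sim_z, bins=100, density=True)
area = float(np.sum(density * np.diff(edges)))
```

Normalizing first makes the shapes of the two distributions comparable even when the series differ in magnitude, and the density scaling makes them comparable when they differ in length.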
Parameters:
- merged_data_df (DataFrame) – DataFrame with a datetime type index and two columns of floating point type numbers. The left column must be simulated data and the right column must be observed data. If given, sim_array and obs_array must be None.
- sim_array (1D ndarray) – Array of simulated data. If given, merged_data_df parameter must be None and obs_array must be given.
- obs_array (1D ndarray) – Array of observed data. If given, merged_data_df parameter must be None and sim_array must be given.
- num_bins (int) – Specifies the number of bins in the histogram.
- z_norm (bool) – If True, the data will be Z-score normalized.
- legend (tuple of str) – Tuple of length two with str inputs. Adds a legend in the ‘best’ location determined by matplotlib. The entries in the tuple label the simulated and observed data, in that order (e.g. (‘Simulated Data’, ‘Observed Data’)).
- grid (bool) – If True, adds a grid to the plot.
- title (str) – If given, sets the title of the plot.
- labels (tuple of str) – Tuple of two strings that set the x-axis label and the y-axis label, respectively.
- prob_dens (bool) – If True, normalizes both histograms to form a probability density, i.e., the area (or integral) under each histogram will sum to 1.
- figsize (tuple of float) – Tuple of length two that specifies the horizontal and vertical lengths of the plot in inches, respectively.
Returns: fig – A matplotlib figure handle is returned, which can be viewed with the matplotlib.pyplot.show() command.
Return type: Matplotlib figure instance
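The input rule described in the parameters (pass either merged_data_df, or both sim_array and obs_array) can be sketched as a small check. This is a minimal sketch under stated assumptions, not the library's implementation: select_inputs is a hypothetical name, and the error messages are invented for illustration.

```python
import numpy as np


def select_inputs(merged_data_df=None, sim_array=None, obs_array=None):
    """Hypothetical helper: return (sim, obs) as 1D float arrays."""
    arrays_given = sim_array is not None or obs_array is not None

    if merged_data_df is not None:
        if arrays_given:
            raise ValueError(
                "Pass either merged_data_df or sim_array/obs_array, not both."
            )
        # Works for a two-column DataFrame or any two-column 2D array-like.
        data = np.asarray(merged_data_df, dtype=float)
        if data.ndim != 2 or data.shape[1] != 2:
            raise ValueError("merged_data_df must have exactly two columns.")
        # Left column is simulated data, right column is observed data.
        return data[:, 0], data[:, 1]

    if sim_array is None or obs_array is None:
        raise ValueError("sim_array and obs_array must be given together.")

    return np.asarray(sim_array, dtype=float), np.asarray(obs_array, dtype=float)


# The two accepted call styles give the same pair of arrays.
sim_a, obs_a = select_inputs(sim_array=[1.0, 2.0], obs_array=[1.5, 2.5])
sim_b, obs_b = select_inputs(merged_data_df=[[1.0, 1.5], [2.0, 2.5]])
```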
Examples
In this example, the histograms of two models are compared to check their distributions.
>>> import hydrostats.data as hd
>>> import hydrostats.visual as hv
>>> import matplotlib.pyplot as plt
>>> sfpt_url = r'https://github.com/waderoberts123/Hydrostats/raw/master/Sample_data/sfpt_data/magdalena-calamar_interim_data.csv'
>>> glofas_url = r'https://github.com/waderoberts123/Hydrostats/raw/master/Sample_data/GLOFAS_Data/magdalena-calamar_ECMWF_data.csv'
>>> merged_df = hd.merge_data(sfpt_url, glofas_url, column_names=('SFPT', 'GLOFAS'))
The histogram with 100 bins is plotted below.
>>> hv.hist(merged_data_df=merged_df,
...         num_bins=100,
...         title='Histogram of Streamflows',
...         legend=('SFPT', 'GLOFAS'),
...         labels=('Bins', 'Frequency'),
...         grid=True)
>>> plt.show()