hist

hydrostats.visual.hist(merged_data_df=None, sim_array=None, obs_array=None, num_bins=100, z_norm=False, legend=('Simulated', 'Observed'), grid=False, title=None, labels=None, prob_dens=False, figsize=(12, 6))

Plots a histogram comparing simulated and observed data.

Use this function to compare the distributions of two time series by overlaying their histograms. The data can optionally be Z-score normalized, and the histograms can be scaled so that each forms a probability density.

Parameters:
    merged_data_df (DataFrame) – DataFrame with a datetime index and two columns of floating point numbers. The left column must be simulated data and the right column must be observed data. If given, sim_array and obs_array must be None.
    sim_array (1D ndarray) – Array of simulated data. If given, merged_data_df must be None and obs_array must be given.
    obs_array (1D ndarray) – Array of observed data. If given, merged_data_df must be None and sim_array must be given.
    num_bins (int) – Number of bins in the histogram.
    z_norm (bool) – If True, the data will be Z-score normalized.
    legend (tuple of str) – Tuple of two strings that label the simulated and observed data, in that order (e.g. ('Simulated Data', 'Observed Data')). The legend is placed in the 'best' location determined by matplotlib.
    grid (bool) – If True, adds a grid to the plot.
    title (str) – If given, sets the title of the plot.
    labels (tuple of str) – Tuple of two strings that set the x-axis label and the y-axis label, respectively.
    prob_dens (bool) – If True, normalizes both histograms to form a probability density, i.e., the area (integral) under each histogram equals 1.
    figsize (tuple of float) – Tuple of length two that specifies the horizontal and vertical lengths of the plot in inches, respectively.
Returns:	fig – A matplotlib figure handle is returned, which can be viewed with the matplotlib.pyplot.show() command.
Return type:	Matplotlib figure instance
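
The z_norm and prob_dens options can be illustrated outside of hydrostats with plain NumPy. The snippet below is a minimal sketch, not the library's implementation: the sample arrays, the helper name z_score, and the use of the population standard deviation (ddof=0) are assumptions made for illustration, and hist may compute these quantities differently.

```python
import numpy as np

# Hypothetical sample data; not taken from the hydrostats sample files.
sim = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
obs = np.array([1.0, 3.0, 4.0, 5.0, 5.0, 6.0, 8.0, 10.0])


def z_score(values):
    """Shift to zero mean and scale to unit standard deviation.

    Assumption: population standard deviation (ddof=0) is used here.
    """
    return (values - values.mean()) / values.std()


sim_z = z_score(sim)
obs_z = z_score(obs)

# Shared bin edges so both histograms are drawn on the same bins.
num_bins = 4
low = min(sim_z.min(), obs_z.min())
high = max(sim_z.max(), obs_z.max())
edges = np.linspace(low, high, num_bins + 1)

# density=True rescales the counts so that sum(density * bin_width) == 1.
sim_density, _ = np.histogram(sim_z, bins=edges, density=True)
obs_density, _ = np.histogram(obs_z, bins=edges, density=True)

bin_widths = np.diff(edges)
sim_area = float(np.sum(sim_density * bin_widths))
obs_area = float(np.sum(obs_density * bin_widths))
```

After Z-score normalization both series have zero mean and unit standard deviation, so their shapes can be compared even when their magnitudes differ. With density scaling, the bar heights are no longer counts, and the area under each histogram is 1.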

Examples

In this example, the histograms of two models are compared to check their distributions.

>>> import hydrostats.data as hd
>>> import hydrostats.visual as hv
>>> import matplotlib.pyplot as plt

>>> sfpt_url = r'https://github.com/waderoberts123/Hydrostats/raw/master/Sample_data/sfpt_data/magdalena-calamar_interim_data.csv'
>>> glofas_url = r'https://github.com/waderoberts123/Hydrostats/raw/master/Sample_data/GLOFAS_Data/magdalena-calamar_ECMWF_data.csv'
>>> merged_df = hd.merge_data(sfpt_url, glofas_url, column_names=('SFPT', 'GLOFAS'))

The histogram with 100 bins is plotted below.

>>> hv.hist(merged_data_df=merged_df,
...         num_bins=100,
...         title='Histogram of Streamflows',
...         legend=('SFPT', 'GLOFAS'),
...         labels=('Bins', 'Frequency'),
...         grid=True)
>>> plt.show()