hist

hydrostats.visual.hist(merged_data_df=None, sim_array=None, obs_array=None, num_bins=100, z_norm=False, legend=('Simulated', 'Observed'), grid=False, title=None, labels=None, prob_dens=False, figsize=(12, 6))

Plots a histogram comparing simulated and observed data.

This function compares the histograms of two time series. The data can optionally be Z-score normalized, and the histograms can be normalized to form a probability density.

Parameters:
  • merged_data_df (DataFrame) – Dataframe must contain a datetime type index and floating point type numbers in two columns. The left column must be simulated data and the right column must be observed data. If given, sim_array and obs_array must be None.
  • sim_array (1D ndarray) – Array of simulated data. If given, merged_data_df parameter must be None and obs_array must be given.
  • obs_array (1D ndarray) – Array of observed data. If given, merged_data_df parameter must be None and sim_array must be given.
  • num_bins (int) – Specifies the number of bins in the histogram.
  • z_norm (bool) – If True, the data will be Z-score normalized.
  • legend (tuple of str) – Tuple of length two with str inputs. Adds a legend in the ‘best’ location determined by matplotlib. The entries in the tuple label the simulated and observed data, in that order (e.g. (‘Simulated Data’, ‘Observed Data’)).
  • grid (bool) – If True, adds a grid to the plot.
  • title (str) – If given, sets the title of the plot.
  • labels (tuple of str) – Tuple of two strings that set the x-axis label and the y-axis label, respectively.
  • prob_dens (bool) – If True, normalizes both histograms to form a probability density, i.e., the area (or integral) under each histogram will equal 1.
  • figsize (tuple of float) – Tuple of length two that specifies the horizontal and vertical lengths of the plot in inches, respectively.
Returns:

fig – A matplotlib figure handle is returned, which can be viewed with the matplotlib.pyplot.show() command.

Return type:

Matplotlib figure instance
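
The z_norm and prob_dens options can be illustrated with plain NumPy. The following is a minimal sketch of the two normalizations, not the hydrostats implementation: the synthetic flows array is a hypothetical stand-in for a streamflow series, and the use of the population standard deviation (ddof=0) is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for a streamflow series (not hydrostats sample data).
flows = rng.gamma(shape=2.0, scale=50.0, size=1000)

# Z-score normalization: subtract the mean and divide by the standard deviation.
# Population std (ddof=0) is an assumption; hydrostats may use another convention.
z = (flows - flows.mean()) / flows.std()

# Density normalization: bar heights are scaled so the total bar area equals 1.
heights, edges = np.histogram(z, bins=100, density=True)
area = float(np.sum(heights * np.diff(edges)))
```

After Z-score normalization the series has zero mean and unit standard deviation, which makes two series with different magnitudes comparable on one axis. With density normalization the bar heights are no longer counts, so the y-axis should be read as a density rather than a frequency.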

Examples

In this example, the histograms of two models are compared to check their distributions.

>>> import hydrostats.data as hd
>>> import hydrostats.visual as hv
>>> import matplotlib.pyplot as plt
>>> sfpt_url = r'https://github.com/waderoberts123/Hydrostats/raw/master/Sample_data/sfpt_data/magdalena-calamar_interim_data.csv'
>>> glofas_url = r'https://github.com/waderoberts123/Hydrostats/raw/master/Sample_data/GLOFAS_Data/magdalena-calamar_ECMWF_data.csv'
>>> merged_df = hd.merge_data(sfpt_url, glofas_url, column_names=('SFPT', 'GLOFAS'))

The histogram with 100 bins is plotted below.

>>> hv.hist(merged_data_df=merged_df,
...         num_bins=100,
...         title='Histogram of Streamflows',
...         legend=('SFPT', 'GLOFAS'),
...         labels=('Bins', 'Frequency'),
...         grid=True)
>>> plt.show()
[Figure: hist1.png, histogram of SFPT and GLOFAS streamflows with 100 bins]
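
For readers who want to see what such a comparison involves, a roughly similar plot can be sketched with matplotlib alone. This is an illustration under stated assumptions, not how hydrostats builds the figure: the sim and obs arrays are synthetic stand-ins, and the shared bin edges and 0.5 transparency are choices made for this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-ins for the simulated and observed series (assumption).
sim = rng.gamma(shape=2.0, scale=50.0, size=500)
obs = rng.gamma(shape=2.2, scale=45.0, size=500)

# Shared bin edges (100 bins) so both histograms are directly comparable.
low = min(sim.min(), obs.min())
high = max(sim.max(), obs.max())
edges = np.linspace(low, high, 101)

fig, ax = plt.subplots(figsize=(12, 6))
ax.hist(sim, bins=edges, alpha=0.5, label="Simulated")
ax.hist(obs, bins=edges, alpha=0.5, label="Observed")
ax.set_xlabel("Bins")
ax.set_ylabel("Frequency")
ax.set_title("Histogram of Streamflows")
ax.legend(loc="best")
ax.grid(True)
```

Using one set of bin edges for both series keeps the bars aligned, so differences in bar height reflect differences in the distributions rather than in the binning.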