seasonal_period

hydrostats.data.seasonal_period(merged_dataframe, daily_period, time_range=None, numpy=False)

Creates a dataframe containing only the rows that fall within a specified seasonal period.

Parameters:
    merged_dataframe (DataFrame) – A pandas DataFrame with a datetime index and columns containing float type values.
    daily_period (tuple of str) – A tuple of length two with strings representing the start and end dates of the seasonal period, in MM-DD format (e.g. ('01-01', '01-31') for Jan 1 to Jan 31).
    time_range (tuple of str) – A tuple of string values representing the start and end dates of the time range. Format is YYYY-MM-DD. Defaults to None.
    numpy (bool) – If True, two numpy arrays will be returned instead of a pandas DataFrame. Defaults to False.
Returns:	Pandas dataframe that has been truncated to fit the parameters specified for the seasonal period.
Return type:	DataFrame

Examples

>>> import pandas as pd
>>> pd.options.display.max_rows = 15
>>> import numpy as np
>>> import hydrostats.data as hd

Here an example DataFrame is made with approximately three years of daily data (1000 rows).

>>> example_df = pd.DataFrame(data=np.random.rand(1000, 2), index=pd.date_range('2000-01-01', periods=1000), columns=['Simulated', 'Observed'])
>>> example_df
            Simulated  Observed
2000-01-01   0.862726  0.056597
2000-01-02   0.979643  0.915072
2000-01-03   0.857667  0.965057
2000-01-04   0.011238  0.033678
2000-01-05   0.011390  0.401728
2000-01-06   0.056505  0.047417
2000-01-07   0.615151  0.134103
               ...       ...
2002-09-20   0.883156  0.272355
2002-09-21   0.595319  0.406609
2002-09-22   0.415106  0.826873
2002-09-23   0.399449  0.656040
2002-09-24   0.243404  0.561899
2002-09-25   0.879932  0.551347
2002-09-26   0.787526  0.887288
[1000 rows x 2 columns]

Using this function, a new DataFrame containing only the data values in January is returned.

>>> seasonal_df_jan = hd.seasonal_period(example_df, ('01-01', '01-31'))
>>> seasonal_df_jan
            Simulated  Observed
2000-01-01   0.862726  0.056597
2000-01-02   0.979643  0.915072
2000-01-03   0.857667  0.965057
2000-01-04   0.011238  0.033678
2000-01-05   0.011390  0.401728
2000-01-06   0.056505  0.047417
2000-01-07   0.615151  0.134103
               ...       ...
2002-01-25   0.230580  0.363213
2002-01-26   0.579899  0.370847
2002-01-27   0.317925  0.120410
2002-01-28   0.196034  0.035715
2002-01-29   0.245429  0.974162
2002-01-30   0.156166  0.544797
2002-01-31   0.158595  0.311630
[93 rows x 2 columns]
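
To make the selection rule concrete, the following is a minimal pandas-only sketch of a month-day filter that yields the same 93 January rows for this index. It is not the hydrostats implementation; the function name seasonal_slice_sketch and the variable demo_df are hypothetical, and the sketch assumes the period does not wrap around the end of the year.

```python
import numpy as np
import pandas as pd


def seasonal_slice_sketch(df, daily_period):
    """Keep rows whose month-day falls inside daily_period (inclusive).

    Hypothetical helper for illustration only; assumes start <= end,
    i.e. the period does not cross from December into January.
    """
    start, end = daily_period
    # Zero-padded 'MM-DD' strings sort in calendar order, so plain
    # string comparison is enough to test membership in the period.
    month_day = df.index.strftime('%m-%d')
    mask = (month_day >= start) & (month_day <= end)
    return df[mask]


# Same shape of data as the example above: 1000 daily rows from 2000-01-01.
demo_df = pd.DataFrame(
    data=np.random.rand(1000, 2),
    index=pd.date_range('2000-01-01', periods=1000),
    columns=['Simulated', 'Observed'],
)

jan_sketch = seasonal_slice_sketch(demo_df, ('01-01', '01-31'))
print(len(jan_sketch))  # 93: January of 2000, 2001 and 2002
```

The 1000-day index ends on 2002-09-26, so it covers three complete Januaries of 31 days each, which accounts for the 93 rows.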

We can also specify a time range if we only want the months of January in the years 2000 and 2001.

>>> seasonal_df_jan = hd.seasonal_period(example_df, ('01-01', '01-31'), time_range=('2000-01-01', '2001-12-31'))
>>> seasonal_df_jan
            Simulated  Observed
2000-01-01   0.862726  0.056597
2000-01-02   0.979643  0.915072
2000-01-03   0.857667  0.965057
2000-01-04   0.011238  0.033678
2000-01-05   0.011390  0.401728
2000-01-06   0.056505  0.047417
2000-01-07   0.615151  0.134103
               ...       ...
2001-01-25   0.119188  0.043076
2001-01-26   0.896280  0.282883
2001-01-27   0.659078  0.230265
2001-01-28   0.667826  0.383687
2001-01-29   0.298459  0.738100
2001-01-30   0.336499  0.189036
2001-01-31   0.571562  0.783718
[62 rows x 2 columns]
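
The effect of the time range can be reproduced with plain pandas slicing, which may help when checking results by hand. The sketch below is an illustration under stated assumptions, not the library's code: demo_df, in_range, and jan_in_range are hypothetical names, and the final step assumes that the two arrays returned with numpy=True correspond to the two columns of the DataFrame, which the description above does not spell out.

```python
import numpy as np
import pandas as pd

# 1000 daily rows starting 2000-01-01, as in the example above.
demo_df = pd.DataFrame(
    data=np.random.rand(1000, 2),
    index=pd.date_range('2000-01-01', periods=1000),
    columns=['Simulated', 'Observed'],
)

# Label-based slicing on a DatetimeIndex includes both endpoints.
in_range = demo_df.loc['2000-01-01':'2001-12-31']

# Keep only the January rows inside that range.
jan_in_range = in_range[in_range.index.month == 1]
print(jan_in_range.shape)  # (62, 2): January 2000 and January 2001

# Assumption: one array per column, in column order.
sim_values = jan_in_range['Simulated'].to_numpy()
obs_values = jan_in_range['Observed'].to_numpy()
```

Restricting the range to the end of 2001 drops January 2002, which is why the row count falls from 93 to 62.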