seasonal_period
- hydrostats.data.seasonal_period(merged_dataframe: DataFrame, daily_period: tuple[str, str], time_range: tuple[str, str] | None = None) -> DataFrame
- hydrostats.data.seasonal_period(merged_dataframe: DataFrame, daily_period: tuple[str, str], time_range: tuple[str, str] | None = None, numpy: Literal[True] = False) -> tuple[ndarray[tuple[Any, ...], dtype[floating]], ndarray[tuple[Any, ...], dtype[floating]]]
- hydrostats.data.seasonal_period(merged_dataframe: DataFrame, daily_period: tuple[str, str], time_range: tuple[str, str] | None = None, numpy: Literal[False] = False) -> DataFrame
Create a dataframe with a specified seasonal period.
- Parameters:
merged_dataframe (DataFrame) – A pandas DataFrame with a datetime index and columns containing float type values.
daily_period (tuple of str) – A tuple of two strings giving the start and end dates of the seasonal period in MM-DD format (e.g. ("01-01", "01-31") for Jan 1 to Jan 31).
time_range (tuple of str, optional) – A tuple of two strings giving the start and end dates of the time range, in YYYY-MM-DD format. Defaults to None, in which case all years in the index are considered.
numpy (bool) – If True, two numpy arrays are returned instead of a pandas DataFrame. Defaults to False.
- Returns:
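Pandas DataFrame truncated to the rows that fall within the specified seasonal period (and time range, if given). If numpy is True, a tuple of two numpy arrays is returned instead.
- Return type:
DataFrame, or tuple of two ndarray when numpy is True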
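As a rough illustration of what the parameters describe, the selection can be sketched with plain pandas. This is not the hydrostats implementation: the helper name `select_season` is hypothetical, it assumes the period stays within one calendar year (no wrap-around such as December to February), it treats both end points as inclusive, and it assumes that with `numpy=True` the two arrays correspond to the first and second columns.

```python
import numpy as np
import pandas as pd


def select_season(df, daily_period, time_range=None, numpy=False):
    """Hypothetical stand-in for the behaviour described above (not hydrostats code)."""
    start, end = daily_period  # "MM-DD" strings, e.g. ("01-01", "01-31")
    if time_range is not None:
        # Label-based slicing on a sorted DatetimeIndex includes both end points.
        df = df.loc[time_range[0]:time_range[1]]
    # Zero-padded "MM-DD" strings sort in calendar order, so a plain string
    # comparison works for a period that does not cross the year boundary.
    month_day = df.index.strftime("%m-%d")
    season = df[(month_day >= start) & (month_day <= end)]
    if numpy:
        # Assumption: first column and second column are returned as arrays.
        return season.iloc[:, 0].to_numpy(), season.iloc[:, 1].to_numpy()
    return season


# Same shape of data as in the examples: 1000 daily rows, two float columns.
demo_df = pd.DataFrame(
    data=np.random.rand(1000, 2),
    index=pd.date_range("2000-01-01", periods=1000),
    columns=["Simulated", "Observed"],
)
january = select_season(demo_df, ("01-01", "01-31"))
sim, obs = select_season(demo_df, ("01-01", "01-31"), numpy=True)
```

Comparing formatted month-day strings keeps the sketch short; a real implementation may handle leap days, wrap-around periods, or input validation differently.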
Examples
>>> import pandas as pd
>>> pd.options.display.max_rows = 15
>>> import numpy as np
>>> import hydrostats.data as hd
Here an example DataFrame is made with 1000 days (just under three years) of data.
>>> example_df = pd.DataFrame(
...     data=np.random.rand(1000, 2),
...     index=pd.date_range("2000-01-01", periods=1000),
...     columns=["Simulated", "Observed"],
... )
>>> example_df
            Simulated  Observed
2000-01-01   0.862726  0.056597
2000-01-02   0.979643  0.915072
2000-01-03   0.857667  0.965057
2000-01-04   0.011238  0.033678
2000-01-05   0.011390  0.401728
2000-01-06   0.056505  0.047417
2000-01-07   0.615151  0.134103
...               ...       ...
2002-09-20   0.883156  0.272355
2002-09-21   0.595319  0.406609
2002-09-22   0.415106  0.826873
2002-09-23   0.399449  0.656040
2002-09-24   0.243404  0.561899
2002-09-25   0.879932  0.551347
2002-09-26   0.787526  0.887288
[1000 rows x 2 columns]
Using this function, a new dataframe containing only the data values in January is returned.
>>> seasonal_df_jan = hd.seasonal_period(example_df, ("01-01", "01-31"))
>>> seasonal_df_jan
            Simulated  Observed
2000-01-01   0.862726  0.056597
2000-01-02   0.979643  0.915072
2000-01-03   0.857667  0.965057
2000-01-04   0.011238  0.033678
2000-01-05   0.011390  0.401728
2000-01-06   0.056505  0.047417
2000-01-07   0.615151  0.134103
...               ...       ...
2002-01-25   0.230580  0.363213
2002-01-26   0.579899  0.370847
2002-01-27   0.317925  0.120410
2002-01-28   0.196034  0.035715
2002-01-29   0.245429  0.974162
2002-01-30   0.156166  0.544797
2002-01-31   0.158595  0.311630
[93 rows x 2 columns]
It is also possible to specify a time range to extract only the months of January from the years 2000 and 2001.
>>> seasonal_df_jan = hd.seasonal_period(
...     example_df, ("01-01", "01-31"), time_range=("2000-01-01", "2001-12-31")
... )
>>> seasonal_df_jan
            Simulated  Observed
2000-01-01   0.862726  0.056597
2000-01-02   0.979643  0.915072
2000-01-03   0.857667  0.965057
2000-01-04   0.011238  0.033678
2000-01-05   0.011390  0.401728
2000-01-06   0.056505  0.047417
2000-01-07   0.615151  0.134103
...               ...       ...
2001-01-25   0.119188  0.043076
2001-01-26   0.896280  0.282883
2001-01-27   0.659078  0.230265
2001-01-28   0.667826  0.383687
2001-01-29   0.298459  0.738100
2001-01-30   0.336499  0.189036
2001-01-31   0.571562  0.783718
[62 rows x 2 columns]
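The row counts reported in these examples (93 and 62) follow directly from the calendar: the 1000-day index covers three Januaries, and the time range keeps two of them. A small check using only pandas, independent of hydrostats, might look like this; the variable names are illustrative.

```python
import pandas as pd

# The same index as the example DataFrame: 1000 consecutive days from 2000-01-01.
index = pd.date_range("2000-01-01", periods=1000)
last_day = index[-1]  # the final date covered by the example data

# All January dates in the index: 31 days in each of 2000, 2001 and 2002.
january_days = index[index.month == 1]

# Restricting to 2000-01-01 through 2001-12-31 drops January 2002.
lower, upper = pd.Timestamp("2000-01-01"), pd.Timestamp("2001-12-31")
january_in_range = january_days[(january_days >= lower) & (january_days <= upper)]

print(last_day.date(), len(january_days), len(january_in_range))
```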