Pandas Frequencies

How this guide benefits you

This guide covers how to use the pandas frequency strings within pytimetk. Once you understand key frequencies, you can apply them to manipulate time series data like a pro.

1 Pandas Frequencies

Pandas offers a variety of frequency strings, also known as offset aliases, to define the frequency of a time series. Here are some common frequency strings used in pandas:

  1. 'B': Business day
  2. 'D': Calendar day
  3. 'W': Weekly
  4. 'M': Month end
  5. 'BM': Business month end
  6. 'MS': Month start
  7. 'BMS': Business month start
  8. 'Q': Quarter end
  9. 'BQ': Business quarter end
  10. 'QS': Quarter start
  11. 'BQS': Business quarter start
  12. 'A' or 'Y': Year end
  13. 'BA' or 'BY': Business year end
  14. 'AS' or 'YS': Year start
  15. 'BAS' or 'BYS': Business year start
  16. 'H': Hourly
  17. 'T' or 'min': Minutely
  18. 'S': Secondly
  19. 'L' or 'ms': Milliseconds
  20. 'U': Microseconds
  21. 'N': Nanoseconds
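
To make a few of these aliases concrete, here is a minimal sketch that builds short date ranges with pd.date_range. The variable names are illustrative only. Recent pandas releases (2.2 and later) rename several of the aliases above, for example 'ME' for month end and 'h' for hourly, so this sketch sticks to aliases that are spelled the same across versions.

```python
import pandas as pd

# Five business days. 2023-01-01 is a Sunday, so the range starts on Monday the 2nd.
business_days = pd.date_range(start="2023-01-01", periods=5, freq="B")

# First day of three consecutive months.
month_starts = pd.date_range(start="2023-01-01", periods=3, freq="MS")

# First day of three consecutive quarters.
quarter_starts = pd.date_range(start="2023-01-01", periods=3, freq="QS")

print(business_days.strftime("%Y-%m-%d").tolist())
print(month_starts.strftime("%Y-%m-%d").tolist())
print(quarter_starts.strftime("%Y-%m-%d").tolist())
```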

Custom Frequencies:

  • You can also create custom frequencies by prefixing a base frequency with an integer multiple, like:
    • '2D': Every 2 days
    • '3W': Every 3 weeks
    • '4H': Every 4 hours
    • '1H30T': Every 1 hour and 30 minutes
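
As a quick check of how multiples behave, the sketch below builds ranges at 2-day, 3-week, and 4-hour steps. The variable names are illustrative. The hourly alias is written in lowercase ('4h'), which is the spelling pandas 2.2 and later prefer; on older versions '4H' is the usual form.

```python
import pandas as pd

# Every 2 days, starting on 2023-01-01.
every_two_days = pd.date_range(start="2023-01-01", periods=4, freq="2D")

# Every 3 weeks. 'W' anchors on Sundays, and 2023-01-01 is a Sunday.
every_three_weeks = pd.date_range(start="2023-01-01", periods=3, freq="3W")

# Every 4 hours, starting at midnight.
every_four_hours = pd.date_range(start="2023-01-01", periods=3, freq="4h")

print(every_two_days.strftime("%Y-%m-%d").tolist())
print(every_three_weeks.strftime("%Y-%m-%d").tolist())
print(every_four_hours.strftime("%H:%M").tolist())
```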

Compound Frequencies:

  • You can combine multiple frequencies by adding them together.
    • '1D1H': 1 day and 1 hour
    • '1H30T': 1 hour and 30 minutes
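
A compound frequency behaves like the sum of its parts. The sketch below, with illustrative names, uses the lowercase spelling '1h30min' (equivalent to '1H30T' on older pandas versions) and confirms that consecutive timestamps are 90 minutes apart.

```python
import pandas as pd

# One hour plus thirty minutes gives a 90-minute step.
ninety_minute_steps = pd.date_range(start="2023-01-01", periods=4, freq="1h30min")

# The gap between consecutive timestamps is the combined duration.
step = ninety_minute_steps[1] - ninety_minute_steps[0]

print(ninety_minute_steps.strftime("%H:%M").tolist())
print(step == pd.Timedelta(minutes=90))
```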

Example:

import pandas as pd

# Creating a date range with daily frequency
date_range_daily = pd.date_range(start='2023-01-01', end='2023-01-10', freq='D')

date_range_daily
DatetimeIndex(['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04',
               '2023-01-05', '2023-01-06', '2023-01-07', '2023-01-08',
               '2023-01-09', '2023-01-10'],
              dtype='datetime64[ns]', freq='D')

# Creating a date range with 2 days frequency
date_range_two_days = pd.date_range(start='2023-01-01', end='2023-01-10', freq='2D')

date_range_two_days
DatetimeIndex(['2023-01-01', '2023-01-03', '2023-01-05', '2023-01-07',
               '2023-01-09'],
              dtype='datetime64[ns]', freq='2D')

These frequency strings help in resampling, creating date ranges, and handling time-series data efficiently in pandas.
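
Resampling uses the same strings. As a minimal sketch with made-up values, the code below sums a daily series into 2-day buckets by passing '2D' to resample.

```python
import pandas as pd

# Ten daily observations with the values 1 through 10 (illustrative data).
daily = pd.Series(
    range(1, 11),
    index=pd.date_range(start="2023-01-01", end="2023-01-10", freq="D"),
)

# Group into 2-day buckets and sum each bucket.
two_day_totals = daily.resample("2D").sum()

print(two_day_totals)
```

Each bucket is labeled by its first day, so the result has one row each for January 1, 3, 5, 7, and 9.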

2 Timetk Incorporates Pandas Frequencies

Now that you've seen pandas frequencies, you'll see them pop up in many of the pytimetk functions.

Example: Padding Dates

This example shows how to use Pandas frequencies inside of pytimetk functions.

We'll use pad_by_time to show how to use freq to fill in missing dates.

# DataFrame with missing dates
import pandas as pd

data = {
    # '2023-09-05' is missing
    'datetime': ['2023-09-01', '2023-09-02', '2023-09-03', '2023-09-04', '2023-09-06'],  
    'value': [10, 30, 40, 50, 60]
}

df = pd.DataFrame(data)
df['datetime'] = pd.to_datetime(df['datetime'])
df
datetime value
0 2023-09-01 10
1 2023-09-02 30
2 2023-09-03 40
3 2023-09-04 50
4 2023-09-06 60

We can resample to fill in the missing day using pad_by_time with freq = 'D'.

import pytimetk as tk

df.pad_by_time('datetime', freq = 'D')
datetime value
0 2023-09-01 10.0
1 2023-09-02 30.0
2 2023-09-03 40.0
3 2023-09-04 50.0
4 2023-09-05 NaN
5 2023-09-06 60.0
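
For comparison, a rough plain-pandas analogue of this padding step can be sketched with asfreq. This is an illustration of how the freq string drives the result, not a description of how pad_by_time is implemented. The variable name padded is illustrative.

```python
import pandas as pd

# Same data as above: 2023-09-05 is missing.
df = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2023-09-01", "2023-09-02", "2023-09-03", "2023-09-04", "2023-09-06"]
    ),
    "value": [10, 30, 40, 50, 60],
})

# Reindex onto a regular daily grid; dates with no observation get NaN.
padded = df.set_index("datetime").asfreq("D").reset_index()

print(padded)
```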

What about resampling every 12 hours? Just set freq = '12H'.

import pytimetk as tk

df.pad_by_time('datetime', freq = '12H')
datetime value
0 2023-09-01 00:00:00 10.0
1 2023-09-01 12:00:00 NaN
2 2023-09-02 00:00:00 30.0
3 2023-09-02 12:00:00 NaN
4 2023-09-03 00:00:00 40.0
5 2023-09-03 12:00:00 NaN
6 2023-09-04 00:00:00 50.0
7 2023-09-04 12:00:00 NaN
8 2023-09-05 00:00:00 NaN
9 2023-09-05 12:00:00 NaN
10 2023-09-06 00:00:00 60.0

You'll see these pandas frequencies come up as the parameter freq in many pytimetk functions.
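
Because freq is just a string, it can help to check that pandas can parse it before passing it along. The sketch below uses pandas' to_offset for that check. The helper is_valid_freq is hypothetical and is not part of pandas or pytimetk.

```python
from pandas.tseries.frequencies import to_offset


def is_valid_freq(freq: str) -> bool:
    """Hypothetical helper: return True if pandas can parse `freq` as an offset alias."""
    try:
        to_offset(freq)
    except ValueError:
        # pandas raises ValueError for strings it cannot interpret as a frequency.
        return False
    return True


print(is_valid_freq("D"))
print(is_valid_freq("2D"))
print(is_valid_freq("fortnightly"))
```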

3 Next Steps

Check out the Data Wrangling Guide next.

4 More Coming Soon…

We are in the early stages of development, but the potential for pytimetk in Python is already clear. 🐍