Apply filtering expressions inside periods (windows)
Source: R/dplyr-filter_period.R
Applies a dplyr filtering expression inside a time-based period (window). See filter_by_time() for filtering continuous ranges defined by start/end dates.
filter_period() enables filtering expressions like:
- Filtering to the maximum value each month.
- Filtering the first date each month.
- Filtering all rows with value greater than a monthly average.
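The idea of filtering inside a period can be sketched with plain dplyr and lubridate. This is an illustration of the concept, not the package's actual implementation: the toy data and the helper column .period_start are hypothetical, and it assumes that rows are bucketed by the start of their month with lubridate::floor_date() before the filter is applied.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data: six observations spanning two months
toy <- tibble(
    date  = as.Date(c("2020-01-05", "2020-01-15", "2020-01-25",
                      "2020-02-05", "2020-02-15", "2020-02-25")),
    value = c(10, 30, 20, 5, 15, 50)
)

# Label each row with the start of its month, filter within each label,
# then drop the helper column again
monthly_max <- toy %>%
    group_by(.period_start = floor_date(date, unit = "month")) %>%
    filter(value == max(value)) %>%
    ungroup() %>%
    select(-.period_start)

# Keeps one row per month: 2020-01-15 (value 30) and 2020-02-25 (value 50)
monthly_max
```

Because the expression is evaluated once per period, max(value) here refers to the monthly maximum, not the maximum of the whole series.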
Arguments
- .data
A tbl object or data.frame
- ...
Filtering expression. Expressions that return a logical value, and are defined in terms of the variables in .data. If multiple expressions are included, they are combined with the & operator. Only rows for which all conditions evaluate to TRUE are kept.
- .date_var
A column containing date or date-time values. If missing, the function attempts to auto-detect the date column.
- .period
A period to filter within. Time units are grouped using lubridate::floor_date() or lubridate::ceiling_date().
The value can be:
- second
- minute
- hour
- day
- week
- month
- bimonth
- quarter
- season
- halfyear
- year
Arbitrary unique English abbreviations as in the lubridate::period() constructor are allowed:
- "1 year"
- "2 months"
- "30 seconds"
See also
Time-Based dplyr functions:
- summarise_by_time() - Easily summarise using a date column.
- mutate_by_time() - Simplifies applying mutations by time windows.
- pad_by_time() - Insert time series rows with regularly spaced timestamps.
- filter_by_time() - Quickly filter using date ranges.
- filter_period() - Apply filtering expressions inside periods (windows).
- slice_period() - Apply slice inside periods (windows).
- condense_period() - Convert to a different periodicity.
- between_time() - Range detection for date or date-time sequences.
- slidify() - Turn any function into a sliding (rolling) function.
Examples
# Libraries
library(dplyr)
library(timetk) # provides filter_period() and the m4_daily dataset
# Max value in each month
m4_daily %>%
    group_by(id) %>%
    filter_period(.period = "1 month", value == max(value))
#> .date_var is missing. Using: date
#> # A tibble: 350 × 3
#> # Groups: id [4]
#> id date value
#> <fct> <date> <dbl>
#> 1 D10 2014-07-03 2076.
#> 2 D10 2014-08-08 2028.
#> 3 D10 2014-09-30 2024.
#> 4 D10 2014-10-12 2155.
#> 5 D10 2014-11-13 2245.
#> 6 D10 2014-12-30 2345.
#> 7 D10 2015-01-09 2369.
#> 8 D10 2015-02-09 2341.
#> 9 D10 2015-03-31 2392.
#> 10 D10 2015-04-13 2500.
#> # ℹ 340 more rows
# First date each month
m4_daily %>%
    group_by(id) %>%
    filter_period(.period = "1 month", date == first(date))
#> .date_var is missing. Using: date
#> # A tibble: 323 × 3
#> # Groups: id [4]
#> id date value
#> <fct> <date> <dbl>
#> 1 D10 2014-07-03 2076.
#> 2 D10 2014-08-01 1923.
#> 3 D10 2014-09-01 1908.
#> 4 D10 2014-10-01 2049.
#> 5 D10 2014-11-01 2133.
#> 6 D10 2014-12-01 2244.
#> 7 D10 2015-01-01 2351
#> 8 D10 2015-02-01 2286.
#> 9 D10 2015-03-01 2291.
#> 10 D10 2015-04-01 2396.
#> # ℹ 313 more rows
# All observations that are greater than a monthly average
m4_daily %>%
    group_by(id) %>%
    filter_period(.period = "1 month", value > mean(value))
#> .date_var is missing. Using: date
#> # A tibble: 4,880 × 3
#> # Groups: id [4]
#> id date value
#> <fct> <date> <dbl>
#> 1 D10 2014-07-03 2076.
#> 2 D10 2014-07-04 2073.
#> 3 D10 2014-07-05 2049.
#> 4 D10 2014-07-06 2049.
#> 5 D10 2014-07-07 2006.
#> 6 D10 2014-07-08 2018.
#> 7 D10 2014-07-09 2019.
#> 8 D10 2014-07-10 2007.
#> 9 D10 2014-07-11 2010
#> 10 D10 2014-07-12 2002.
#> # ℹ 4,870 more rows
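The ... argument accepts several conditions, which are combined with &, and .date_var can be supplied explicitly instead of relying on auto-detection. A minimal sketch, assuming the timetk package (which provides filter_period() and m4_daily) is installed; the particular pair of conditions and the "quarter" period are chosen only for illustration, and no output is shown because it was not run here.

```r
library(dplyr)
library(timetk)

# Keep rows above the quarterly mean but below the quarterly maximum.
# Both conditions must be TRUE for a row to be kept.
above_mean_below_max <- m4_daily %>%
    group_by(id) %>%
    filter_period(
        value > mean(value),
        value < max(value),
        .date_var = date,
        .period   = "quarter"
    )

above_mean_below_max
```

Naming .date_var explicitly should avoid the ".date_var is missing" message seen in the examples above.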