The TimeSeriesCVSplitter is a scikit-learn compatible cross-validator using TimeSeriesCV.
This cross-validator generates splits based on time values, making it suitable for time series data.
Parameters:
frequency: str
    The frequency of the time series (e.g., "days", "hours").
train_size: int
    Minimum number of time units in the training set.
forecast_horizon: int
    Number of time units to forecast in each split.
time_series: pd.Series
    A pandas Series or Index representing the time values.
gap: int
    Number of time units to skip between training and testing sets.
stride: int
    Number of time units to move forward after each split.
window: str
    Type of window, either "rolling" or "expanding".
mode: str
    Order of split generation, "forward" or "backward".
start_dt: pd.Timestamp
    Start date for the time period.
end_dt: pd.Timestamp
    End date for the time period.
split_limit: int
    Maximum number of splits to generate. If None, all possible splits will be generated.
Raises:
ValueError: If the input arrays are incompatible in length with the time series.
Returns:
A generator of tuples of arrays containing the training and forecast data.
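The interaction between train_size, forecast_horizon, gap, stride, and window can be easier to see on plain integer positions. The following is a minimal sketch under stated assumptions, not the library's implementation: `sketch_splits` is a hypothetical helper, it works on positions `0..n_units-1` instead of real timestamps, and it only covers the "forward" order.

```python
def sketch_splits(n_units, train_size, forecast_horizon,
                  gap=0, stride=1, window="rolling", split_limit=None):
    """Yield (train, test) position lists, oldest split first.

    Hypothetical helper for illustration; not the library's implementation.
    """
    if window not in ("rolling", "expanding"):
        raise ValueError("window must be 'rolling' or 'expanding'")
    train_end = train_size  # exclusive end of the first training window
    n_yielded = 0
    while train_end + gap + forecast_horizon <= n_units:
        if split_limit is not None and n_yielded >= split_limit:
            break
        # An expanding window always starts at 0; a rolling one keeps a fixed length.
        train_start = 0 if window == "expanding" else train_end - train_size
        test_start = train_end + gap  # skip `gap` units after training
        yield (list(range(train_start, train_end)),
               list(range(test_start, test_start + forecast_horizon)))
        train_end += stride  # move forward by `stride` units
        n_yielded += 1


rolling = list(sketch_splits(10, train_size=4, forecast_horizon=2, gap=1, stride=2))
expanding = list(sketch_splits(10, train_size=4, forecast_horizon=2, gap=1, stride=2,
                               window="expanding"))
```

With these inputs both variants produce two splits. In the second split the rolling training window covers positions 2 to 5, while the expanding window covers positions 0 to 5; the test positions (7 and 8) are the same in both.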
# Inspect the cross-validation splits
cv.splitter.plot(y, time_series=time_series)
# Using the TimeSeriesCVSplitter in a scikit-learn CV model
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import RandomizedSearchCV

# Fit and get best estimator
param_grid = {
    "alpha": np.linspace(0.1, 2, 10),
    "fit_intercept": [True, False],
    "positive": [True, False],
}

random_search_cv = RandomizedSearchCV(
    estimator=Ridge(),
    param_distributions=param_grid,
    cv=cv,
    n_jobs=-1,
).fit(X, y)

random_search_cv.best_estimator_
Ridge(alpha=np.float64(0.1), fit_intercept=False)
Generates train and test indices for cross-validation.
Parameters:
X: Optional input features (ignored, for compatibility with scikit-learn).
y: Optional target variable (ignored, for compatibility with scikit-learn).
groups: Optional group labels (ignored, for compatibility with scikit-learn).
Yields:
Tuple[np.ndarray, np.ndarray]: Tuples of train and test indices.
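The yielded index pairs can also be consumed directly in a manual evaluation loop. The sketch below assumes only the contract described above (a `split(X, y, groups)` generator yielding train and test index arrays); `TinySplitter` is a hypothetical stand-in used to keep the example self-contained, and the mean forecast is just a placeholder model.

```python
import numpy as np


class TinySplitter:
    """Hypothetical stand-in that follows the same split() contract."""

    def __init__(self, n_samples, train_size, test_size):
        self.n_samples = n_samples
        self.train_size = train_size
        self.test_size = test_size

    def split(self, X=None, y=None, groups=None):
        # X, y and groups are accepted but ignored, as in the method above.
        start = 0
        while start + self.train_size + self.test_size <= self.n_samples:
            train_stop = start + self.train_size
            yield (np.arange(start, train_stop),
                   np.arange(train_stop, train_stop + self.test_size))
            start += self.test_size


y = np.arange(12, dtype=float)  # toy target: 0.0, 1.0, ..., 11.0
splitter = TinySplitter(n_samples=12, train_size=6, test_size=3)

errors = []
for train_idx, test_idx in splitter.split(y=y):
    forecast = y[train_idx].mean()  # naive forecast: mean of the training window
    errors.append(float(np.abs(y[test_idx] - forecast).mean()))
```

Because the indices are positional arrays, the same loop should work with any splitter that honors this contract, including one passed as `cv` to a scikit-learn search object.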