Sample Time Series Retail Data from the Walmart Recruiting Store Sales Forecasting Competition
Source: R/data-walmart_sales_weekly.R
The Kaggle "Walmart Recruiting - Store Sales Forecasting" competition used retail data for combinations of stores and departments within each store. The competition began February 20th, 2014 and ended May 5th, 2014. It included data from 45 retail stores located in different regions, along with external features such as holiday information, temperature, fuel price, and markdowns. This dataset is a sample of 7 departments from Store ID 1 (7 total time series).
Format
A tibble: 1,001 x 17
id
Factor. Unique series identifier (7 total).
Store
Numeric. Store ID.
Dept
Numeric. Department ID.
Date
Date. Weekly timestamp.
Weekly_Sales
Numeric. Sales for the given department in the given store.
IsHoliday
Logical. Whether the week is a "special" holiday for the store.
Type
Character. Type identifier of the store.
Size
Numeric. Store square footage.
Temperature
Numeric. Average temperature in the region.
Fuel_Price
Numeric. Cost of fuel in the region.
MarkDown1, MarkDown2, MarkDown3, MarkDown4, MarkDown5
Numeric. Anonymized data related to promotional markdowns that Walmart is running. MarkDown data is only available after Nov 2011, and is not available for all stores all the time. Any missing value is marked with an NA.
CPI
Numeric. The consumer price index.
Unemployment
Numeric. The unemployment rate in the region.
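The column descriptions above can be checked against the data directly. The following is a minimal sketch, assuming the timetk package (which ships this dataset) and dplyr are installed. The object names `series_summary` and `markdown_na` are illustrative only and are not part of the package.

```r
library(dplyr)
library(timetk)  # assumed to provide walmart_sales_weekly

# One row per series: number of weekly observations and the date range covered
series_summary <- walmart_sales_weekly %>%
  group_by(id) %>%
  summarise(
    n_weeks    = n(),
    first_week = min(Date),
    last_week  = max(Date),
    .groups    = "drop"
  )
series_summary

# Share of missing values in each MarkDown column
# (the MarkDown columns are NA where no markdown data is available)
markdown_na <- walmart_sales_weekly %>%
  summarise(across(starts_with("MarkDown"), ~ mean(is.na(.x))))
markdown_na
```

Grouping by `id` rather than by `Store` and `Dept` keeps the summary aligned with the series identifier documented above.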
Details
This is a sample of 7 weekly time series from the Kaggle Walmart Recruiting Store Sales Forecasting competition.
Holiday Features
The four holidays fall within the following weeks (not all of these weeks are present in the data):
Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13
Labor Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13
Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13
Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13
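The holiday weeks listed above can be turned into a small lookup table for labelling rows by holiday name, which the logical `IsHoliday` column alone does not provide. This is a sketch in base R. `holiday_weeks` and `label_holiday()` are hypothetical names introduced here for illustration, and the sketch assumes the listed dates correspond to values of the `Date` column (as with the 2010-02-12 row, which has `IsHoliday` equal to `TRUE` in the example output).

```r
# Holiday weeks as listed above, one row per holiday and year
holiday_weeks <- data.frame(
  holiday = rep(c("Super Bowl", "Labor Day", "Thanksgiving", "Christmas"), each = 4),
  Date = as.Date(c(
    "2010-02-12", "2011-02-11", "2012-02-10", "2013-02-08",  # Super Bowl
    "2010-09-10", "2011-09-09", "2012-09-07", "2013-09-06",  # Labor Day
    "2010-11-26", "2011-11-25", "2012-11-23", "2013-11-29",  # Thanksgiving
    "2010-12-31", "2011-12-30", "2012-12-28", "2013-12-27"   # Christmas
  )),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the holiday name for each date,
# or NA when the date is not one of the listed holiday weeks
label_holiday <- function(dates) {
  holiday_weeks$holiday[match(dates, holiday_weeks$Date)]
}

label_holiday(as.Date(c("2010-02-05", "2010-02-12", "2010-11-26")))
# returns NA, "Super Bowl", "Thanksgiving"
```

Applied to the dataset, `label_holiday(walmart_sales_weekly$Date)` would give a character vector that can be added as a new column.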
Examples
walmart_sales_weekly
#> # A tibble: 1,001 × 17
#>    id    Store  Dept Date       Weekly_Sales IsHoliday Type    Size Temperature
#>    <fct> <dbl> <dbl> <date>            <dbl> <lgl>     <chr>  <dbl>       <dbl>
#>  1 1_1       1     1 2010-02-05       24924. FALSE     A     151315        42.3
#>  2 1_1       1     1 2010-02-12       46039. TRUE      A     151315        38.5
#>  3 1_1       1     1 2010-02-19       41596. FALSE     A     151315        39.9
#>  4 1_1       1     1 2010-02-26       19404. FALSE     A     151315        46.6
#>  5 1_1       1     1 2010-03-05       21828. FALSE     A     151315        46.5
#>  6 1_1       1     1 2010-03-12       21043. FALSE     A     151315        57.8
#>  7 1_1       1     1 2010-03-19       22137. FALSE     A     151315        54.6
#>  8 1_1       1     1 2010-03-26       26229. FALSE     A     151315        51.4
#>  9 1_1       1     1 2010-04-02       57258. FALSE     A     151315        62.3
#> 10 1_1       1     1 2010-04-09       42961. FALSE     A     151315        65.9
#> # ℹ 991 more rows
#> # ℹ 8 more variables: Fuel_Price <dbl>, MarkDown1 <dbl>, MarkDown2 <dbl>,
#> #   MarkDown3 <dbl>, MarkDown4 <dbl>, MarkDown5 <dbl>, CPI <dbl>,
#> #   Unemployment <dbl>