
Sample Time Series Retail Data from the Walmart Recruiting Store Sales Forecasting Competition
Source: R/data-walmart_sales_weekly.R
The Kaggle "Walmart Recruiting - Store Sales Forecasting" Competition used retail data for combinations of stores and departments within each store. The competition began February 20th, 2014 and ended May 5th, 2014. The competition included data from 45 retail stores located in different regions. The dataset included various external features, including Holiday information, Temperature, Fuel Price, and Markdown. This dataset includes a sample of 7 departments from Store ID 1 (7 total time series).
Format
A tibble: 1,001 x 17
id: Factor. Unique series identifier (7 total).
Store: Numeric. Store ID.
Dept: Numeric. Department ID.
Date: Date. Weekly timestamp.
Weekly_Sales: Numeric. Sales for the given department in the given store.
IsHoliday: Logical. Whether the week is a "special" holiday for the store.
Type: Character. Type identifier of the store.
Size: Numeric. Store square footage.
Temperature: Numeric. Average temperature in the region.
Fuel_Price: Numeric. Cost of fuel in the region.
MarkDown1, MarkDown2, MarkDown3, MarkDown4, MarkDown5: Numeric. Anonymized data related to promotional markdowns that Walmart is running. MarkDown data is only available after Nov 2011, and is not available for all stores all the time. Any missing value is marked with an NA.
CPI: Numeric. The consumer price index.
Unemployment: Numeric. The unemployment rate in the region.
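Because the MarkDown columns are missing before Nov 2011 and only partly filled afterwards, it can help to measure the gaps before modeling. The following is a minimal base R sketch, not part of the package: the `toy` data frame and its MarkDown values are made up for illustration, and only the column names come from the format above.

```r
# Toy rows standing in for the real data; the MarkDown values are made up.
toy <- data.frame(
  Date      = as.Date(c("2011-10-28", "2011-11-04", "2011-11-11", "2011-11-18")),
  MarkDown1 = c(NA, NA, 10382.9, 6074.1),
  MarkDown2 = c(NA, NA, 6115.7, NA)
)

# Select every column whose name starts with "MarkDown".
markdown_cols <- grep("^MarkDown", names(toy), value = TRUE)

# Number of missing values in each MarkDown column.
na_counts <- colSums(is.na(toy[markdown_cols]))

# Earliest date with a non-missing value in each MarkDown column.
first_available <- lapply(
  toy[markdown_cols],
  function(x) min(toy$Date[!is.na(x)])
)
```

Replacing `toy` with `walmart_sales_weekly` should give the same summaries for the real columns, assuming the data set is loaded.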
Details
This is a sample of 7 weekly time series from the Kaggle Walmart Recruiting Store Sales Forecasting competition.
Holiday Features
The four holidays fall within the following weeks in the dataset (not all holidays are in the data):
Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13
Labor Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13
Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13
Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13
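The holiday weeks listed above can be turned into a lookup table, which makes it possible to tell which holiday a given week belongs to rather than only whether it is a holiday. This is a minimal base R sketch: the names `holiday_weeks` and `label_holiday` are hypothetical helpers, not functions from the package, and the dates are copied from the list above.

```r
# Lookup table of holiday weeks, using the dates listed in this section.
holiday_weeks <- data.frame(
  holiday = rep(c("Super Bowl", "Labor Day", "Thanksgiving", "Christmas"), each = 4),
  Date = as.Date(c(
    "2010-02-12", "2011-02-11", "2012-02-10", "2013-02-08",  # Super Bowl
    "2010-09-10", "2011-09-09", "2012-09-07", "2013-09-06",  # Labor Day
    "2010-11-26", "2011-11-25", "2012-11-23", "2013-11-29",  # Thanksgiving
    "2010-12-31", "2011-12-30", "2012-12-28", "2013-12-27"   # Christmas
  )),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the holiday name for each weekly date,
# or NA when the week is not one of the listed holiday weeks.
label_holiday <- function(dates) {
  holiday_weeks$holiday[match(dates, holiday_weeks$Date)]
}

labels <- label_holiday(as.Date(c("2010-02-05", "2010-02-12")))
labels
#> [1] NA           "Super Bowl"
```

Applied to the `Date` column of the data set, the same helper would produce a holiday-name column that could be used alongside `IsHoliday`.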
Examples
walmart_sales_weekly
#> # A tibble: 1,001 × 17
#> id Store Dept Date Weekly_Sales IsHoliday Type Size Temperature
#> <fct> <dbl> <dbl> <date> <dbl> <lgl> <chr> <dbl> <dbl>
#> 1 1_1 1 1 2010-02-05 24924. FALSE A 151315 42.3
#> 2 1_1 1 1 2010-02-12 46039. TRUE A 151315 38.5
#> 3 1_1 1 1 2010-02-19 41596. FALSE A 151315 39.9
#> 4 1_1 1 1 2010-02-26 19404. FALSE A 151315 46.6
#> 5 1_1 1 1 2010-03-05 21828. FALSE A 151315 46.5
#> 6 1_1 1 1 2010-03-12 21043. FALSE A 151315 57.8
#> 7 1_1 1 1 2010-03-19 22137. FALSE A 151315 54.6
#> 8 1_1 1 1 2010-03-26 26229. FALSE A 151315 51.4
#> 9 1_1 1 1 2010-04-02 57258. FALSE A 151315 62.3
#> 10 1_1 1 1 2010-04-09 42961. FALSE A 151315 65.9
#> # ℹ 991 more rows
#> # ℹ 8 more variables: Fuel_Price <dbl>, MarkDown1 <dbl>, MarkDown2 <dbl>,
#> # MarkDown3 <dbl>, MarkDown4 <dbl>, MarkDown5 <dbl>, CPI <dbl>,
#> # Unemployment <dbl>