
The Kaggle "Walmart Recruiting - Store Sales Forecasting" competition used retail data for combinations of stores and the departments within each store. The competition ran from February 20th, 2014 to May 5th, 2014 and included data from 45 retail stores located in different regions. The data also carries several external features, including holiday information, temperature, fuel price, and markdowns. This dataset is a sample of 7 departments from Store ID 1 (7 time series in total).

Usage

walmart_sales_weekly

Format

A tibble: 1,001 x 17

  • id Factor. Unique series identifier (7 total)

  • Store Numeric. Store ID.

  • Dept Numeric. Department ID.

  • Date Date. Weekly timestamp.

  • Weekly_Sales Numeric. Sales for the given department in the given store.

  • IsHoliday Logical. Whether the week is a "special" holiday for the store.

  • Type Character. Type identifier of the store.

  • Size Numeric. Store square footage.

  • Temperature Numeric. Average temperature in the region.

  • Fuel_Price Numeric. Cost of fuel in the region.

  • MarkDown1, MarkDown2, MarkDown3, MarkDown4, MarkDown5 Numeric. Anonymized data related to promotional markdowns that Walmart is running. MarkDown data is only available after Nov 2011, and is not available for all stores all the time. Any missing value is marked with an NA.

  • CPI Numeric. The consumer price index.

  • Unemployment Numeric. The unemployment rate in the region.
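Because the MarkDown columns contain NA values, it can help to count the missing entries before modeling. The following is a minimal sketch: `count_markdown_na()` is a hypothetical helper (not part of any package), and the toy data frame uses made-up values purely to show the behavior.

```r
# Hypothetical helper: count NA values in each MarkDown column of a data frame.
count_markdown_na <- function(data) {
  # Select only the columns named MarkDown1 ... MarkDown5 that are present.
  markdown_cols <- grep("^MarkDown[1-5]$", names(data), value = TRUE)
  vapply(data[markdown_cols], function(x) sum(is.na(x)), integer(1))
}

# Toy data (made-up values, NOT taken from the dataset) to illustrate the call.
toy <- data.frame(
  Date      = as.Date(c("2011-10-28", "2011-11-04", "2011-11-11")),
  MarkDown1 = c(NA, NA, 100),
  MarkDown2 = c(NA, 5, 7)
)

na_counts <- count_markdown_na(toy)
print(na_counts)
```

The same call should work on the real data, for example `count_markdown_na(walmart_sales_weekly)`, once the dataset is available in the session.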

Details

This is a sample of 7 weekly time series from the Kaggle Walmart Recruiting Store Sales Forecasting competition.

Holiday Features

The four holidays fall within the following weeks. Not all of these weeks are present in the data:

  • Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13

  • Labor Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13

  • Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13

  • Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13
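The holiday weeks listed above can be kept as a small lookup table, which makes it possible to label which holiday a flagged week belongs to. This is a sketch in base R: the names `holiday_weeks`, `holiday_tbl`, and `is_holiday_week()` are hypothetical, and the dates are the ones from the list rewritten in ISO format.

```r
# Holiday week dates from the list above, rewritten as ISO dates.
holiday_weeks <- list(
  "Super Bowl"   = c("2010-02-12", "2011-02-11", "2012-02-10", "2013-02-08"),
  "Labor Day"    = c("2010-09-10", "2011-09-09", "2012-09-07", "2013-09-06"),
  "Thanksgiving" = c("2010-11-26", "2011-11-25", "2012-11-23", "2013-11-29"),
  "Christmas"    = c("2010-12-31", "2011-12-30", "2012-12-28", "2013-12-27")
)

# Flatten into a lookup table with one row per holiday week.
holiday_tbl <- data.frame(
  holiday = rep(names(holiday_weeks), lengths(holiday_weeks)),
  Date    = as.Date(unlist(holiday_weeks, use.names = FALSE)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when a date is one of the listed holiday weeks.
is_holiday_week <- function(dates) {
  as.Date(dates) %in% holiday_tbl$Date
}

is_holiday_week(c("2010-02-05", "2010-02-12"))
```

Assuming the Date column uses the same week-ending dates, `merge(walmart_sales_weekly, holiday_tbl, by = "Date", all.x = TRUE)` would attach the holiday name to each row and leave NA for non-holiday weeks.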

Examples

walmart_sales_weekly
#> # A tibble: 1,001 × 17
#>    id    Store  Dept Date       Weekly_Sales IsHoliday Type    Size Temperature
#>    <fct> <dbl> <dbl> <date>            <dbl> <lgl>     <chr>  <dbl>       <dbl>
#>  1 1_1       1     1 2010-02-05       24924. FALSE     A     151315        42.3
#>  2 1_1       1     1 2010-02-12       46039. TRUE      A     151315        38.5
#>  3 1_1       1     1 2010-02-19       41596. FALSE     A     151315        39.9
#>  4 1_1       1     1 2010-02-26       19404. FALSE     A     151315        46.6
#>  5 1_1       1     1 2010-03-05       21828. FALSE     A     151315        46.5
#>  6 1_1       1     1 2010-03-12       21043. FALSE     A     151315        57.8
#>  7 1_1       1     1 2010-03-19       22137. FALSE     A     151315        54.6
#>  8 1_1       1     1 2010-03-26       26229. FALSE     A     151315        51.4
#>  9 1_1       1     1 2010-04-02       57258. FALSE     A     151315        62.3
#> 10 1_1       1     1 2010-04-09       42961. FALSE     A     151315        65.9
#> # ℹ 991 more rows
#> # ℹ 8 more variables: Fuel_Price <dbl>, MarkDown1 <dbl>, MarkDown2 <dbl>,
#> #   MarkDown3 <dbl>, MarkDown4 <dbl>, MarkDown5 <dbl>, CPI <dbl>,
#> #   Unemployment <dbl>