Data Visualization

How this guide benefits you

This guide covers how to use plot_timeseries() for data visualization. Once you understand how it works, you can explore time series data more easily.

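This tutorial focuses on plot_timeseries(), a workhorse time-series plotting function that generates interactive plotly charts by default and can switch to static plotnine or matplotlib plots.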

1 Libraries

Run the following code to set up for this tutorial.

Code
# Import packages
import pytimetk as tk
import pandas as pd

2 Plotting Time Series

Let’s start with a popular time series, taylor_30_min, which records energy demand in megawatts at a 30-minute sampling interval. This is a single time series.

Code
# Import a Time Series Data Set
taylor_30_min = tk.load_dataset("taylor_30_min", parse_dates = ['date'])
taylor_30_min
                          date  value
0    2000-06-05 00:00:00+00:00  22262
1    2000-06-05 00:30:00+00:00  21756
2    2000-06-05 01:00:00+00:00  22247
3    2000-06-05 01:30:00+00:00  22759
4    2000-06-05 02:00:00+00:00  22549
...                        ...    ...
4027 2000-08-27 21:30:00+00:00  27946
4028 2000-08-27 22:00:00+00:00  27133
4029 2000-08-27 22:30:00+00:00  25996
4030 2000-08-27 23:00:00+00:00  24610
4031 2000-08-27 23:30:00+00:00  23132

4032 rows × 2 columns
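
Before plotting, it can help to confirm the sampling interval described above. The following is a minimal sketch that uses only pandas and a synthetic stand-in frame (an assumption, so that it runs without downloading anything); the placeholder values are not real energy demand. With the real data, the same calls can be applied to taylor_30_min directly.

```python
import pandas as pd

# Synthetic stand-in with the same shape as taylor_30_min.
# The values are placeholders, not real energy demand.
dates = pd.date_range("2000-06-05", periods=4032, freq="30min", tz="UTC")
demo = pd.DataFrame({"date": dates, "value": range(4032)})

# Differences between consecutive timestamps reveal the sampling interval.
intervals = demo["date"].diff().dropna()
interval_counts = intervals.value_counts()

# A regular series has exactly one distinct interval.
print(interval_counts)
print(demo["date"].min(), demo["date"].max())
```

If more than one distinct interval shows up, the series has gaps or irregular timestamps that are worth inspecting before plotting.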

The plot_timeseries() function generates an interactive plotly chart by default.

  • Simply provide the date variable (the time-based column, date_column) and the numeric variable that changes over time (value_column) as the first 2 arguments.
  • By default, the plotting engine is plotly, which is interactive and excellent for data exploration and apps. However, if you require static plots for reports, you can set the engine to engine = 'plotnine' or engine = 'matplotlib'.

Interactive plot

Code
taylor_30_min.plot_timeseries('date', 'value')

Static plot

Code
taylor_30_min.plot_timeseries(
    'date', 'value',
    engine = 'plotnine'
)


2.1 Plotting Groups

Next, let’s move on to a dataset with time series groups, m4_monthly, which is a sample of 4 time series from the M4 competition that are sampled at a monthly frequency.

Code
# Import a Time Series Data Set
m4_monthly = tk.load_dataset("m4_monthly", parse_dates = ['date'])
m4_monthly
         id       date  value
0        M1 1976-06-01   8000
1        M1 1976-07-01   8350
2        M1 1976-08-01   8570
3        M1 1976-09-01   7700
4        M1 1976-10-01   7080
...     ...        ...    ...
1569  M1000 2015-02-01    880
1570  M1000 2015-03-01    800
1571  M1000 2015-04-01   1140
1572  M1000 2015-05-01    970
1573  M1000 2015-06-01   1430

1574 rows × 3 columns

Visualizing grouped data is as simple as grouping the data set with groupby() before passing it to the plot_timeseries() function. Here are the key points:

  • Groups can be added using the pandas groupby().
  • These groups are then converted into facets.
  • Using facet_ncol = 2 returns a 2-column faceted plot.
  • Setting facet_scales = "free" allows the x- and y-axes of each plot to scale independently of the other plots.

Code
m4_monthly.groupby('id').plot_timeseries(
    'date', 'value', 
    facet_ncol = 2, 
    facet_scales = "free"
)
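
To see what the grouping step hands over, the sketch below inspects a grouped frame with pandas alone. It uses a small synthetic stand-in for m4_monthly (the ids, dates, and values are placeholders), and the row count is computed under the assumption that facets fill a grid row by row; it illustrates the idea and is not a description of pytimetk internals.

```python
import math
import pandas as pd

# Synthetic stand-in for m4_monthly: 4 ids, 3 monthly observations each.
# The ids and values are placeholders for illustration only.
ids = ["M1", "M2", "M750", "M1000"]
dates = pd.date_range("2015-04-01", periods=3, freq="MS")
demo = pd.DataFrame(
    {
        "id": [i for i in ids for _ in dates],
        "date": list(dates) * len(ids),
        "value": range(len(ids) * len(dates)),
    }
)

# Each group produced by groupby() corresponds to one facet.
grouped = demo.groupby("id")
n_groups = grouped.ngroups

# With facet_ncol = 2, the panels would need ceil(n_groups / 2) rows
# (assumes a row-wise grid layout).
facet_ncol = 2
facet_nrow = math.ceil(n_groups / facet_ncol)

# Per-group date and value ranges: with "free" scales, each panel
# can use its own range instead of a shared one.
summary = grouped.agg(
    start=("date", "min"),
    end=("date", "max"),
    low=("value", "min"),
    high=("value", "max"),
)
print(n_groups, facet_nrow)
print(summary)
```

When the per-group ranges differ widely, as they do for M1 and M1000 in the preview above, independent scales keep the smaller series readable.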

The groups can also be visualized in the same plot using the color_column parameter. Let’s come back to the taylor_30_min dataframe.

Code
# load data
taylor_30_min = tk.load_dataset("taylor_30_min", parse_dates = ['date'])

# extract the month using pandas
taylor_30_min['month'] = pd.to_datetime(taylor_30_min['date']).dt.month

# plot groups
taylor_30_min.plot_timeseries(
    'date', 'value', 
    color_column = 'month'
)
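
It can be useful to check what the month column contains before using it for color. The sketch below again relies on a synthetic stand-in (an assumption) that covers the same date range as taylor_30_min, and shows that the data spans three calendar months. The month_name column is an optional extra, not part of the original example, that may give more readable labels.

```python
import pandas as pd

# Synthetic stand-in covering the same date range as taylor_30_min.
dates = pd.date_range("2000-06-05", periods=4032, freq="30min", tz="UTC")
demo = pd.DataFrame({"date": dates})

# Integer month number, as in the example above.
demo["month"] = demo["date"].dt.month

# Optional: month names, which may read better as labels.
demo["month_name"] = demo["date"].dt.month_name()

# Distinct months and the number of observations in each.
months = sorted(demo["month"].unique().tolist())
rows_per_month = demo.groupby("month").size()
print(months)
print(rows_per_month)
```

June and August are partial months in this range, so their groups contain fewer observations than July.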

3 Next steps

Check out the Pytimetk Basics Guide next.

4 More Coming Soon…

We are in the early stages of development, but the potential for pytimetk in Python is already clear. 🐍