Data Visualization

How this guide benefits you

This guide covers how to use plot_timeseries() for data visualization. Once you understand how it works, you can explore time series data more easily than ever.

This tutorial focuses on plot_timeseries(), a workhorse time-series plotting function that:

Generates interactive plotly plots (great for exploring & streamlit/shiny apps)
Consolidates 20+ lines of plotnine/matplotlib & plotly code
Scales well to many time series
Can be converted from interactive plotly to static plotnine/matplotlib plots

1 Libraries

Run the following code to set up for this tutorial.

Code

# Import packages
import pytimetk as tk
import pandas as pd

2 Plotting Time Series

The main function is plot_timeseries(). We’ll cover some key functionality for easy time series visualization for single and grouped time series.

2.1 Plotting a Single Time Series

Let’s start with a popular time series, taylor_30_min, which contains energy demand in megawatts sampled at 30-minute intervals. This is a single time series.

Code

# Import a Time Series Data Set
taylor_30_min = tk.load_dataset("taylor_30_min", parse_dates = ['date'])
taylor_30_min

	date	value
0	2000-06-05 00:00:00+00:00	22262
1	2000-06-05 00:30:00+00:00	21756
2	2000-06-05 01:00:00+00:00	22247
3	2000-06-05 01:30:00+00:00	22759
4	2000-06-05 02:00:00+00:00	22549
...	...	...
4027	2000-08-27 21:30:00+00:00	27946
4028	2000-08-27 22:00:00+00:00	27133
4029	2000-08-27 22:30:00+00:00	25996
4030	2000-08-27 23:00:00+00:00	24610
4031	2000-08-27 23:30:00+00:00	23132

4032 rows × 2 columns
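As a quick sanity check on the sampling interval, the sketch below rebuilds a stand-in for the date column with pandas and confirms that every step is 30 minutes. It uses a synthetic index rather than the real dataset (an assumption made so the snippet runs without pytimetk), and the names demo and steps are illustrative only.

```python
import pandas as pd

# Synthetic stand-in for the 'date' column: 4032 half-hourly UTC timestamps
# starting at the first timestamp shown in the table above.
dates = pd.date_range("2000-06-05", periods=4032, freq="30min", tz="UTC")
demo = pd.DataFrame({"date": dates})

# Differences between consecutive timestamps; a regularly sampled
# series has exactly one distinct step size.
steps = demo["date"].diff().dropna().unique()

print(len(steps))               # number of distinct step sizes
print(pd.Timedelta(steps[0]))   # the step itself
print(demo["date"].iloc[-1])    # last timestamp in the index
```

The same diff() check can be run on the real taylor_30_min['date'] column to spot gaps or irregular sampling before plotting.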

The plot_timeseries() function generates an interactive plotly chart by default.

Simply provide the date variable (time-based column, date_column) and the numeric variable (value_column) that changes over time as the first 2 arguments.
The default plotly engine is interactive and excellent for data exploration and apps. However, if you require static plots for reports, you can set engine = 'plotnine' or engine = 'matplotlib'.

Interactive plot

Code

taylor_30_min.plot_timeseries('date', 'value')

Static plot

Code

taylor_30_min.plot_timeseries(
    'date', 'value',
    engine = 'plotnine'
)

<Figure Size: (700 x 500)>

2.2 Plotting Groups

Next, let’s move on to a dataset with time series groups, m4_monthly, which is a sample of 4 time series from the M4 competition that are sampled at a monthly frequency.

Code

# Import a Time Series Data Set
m4_monthly = tk.load_dataset("m4_monthly", parse_dates = ['date'])
m4_monthly

	id	date	value
0	M1	1976-06-01	8000
1	M1	1976-07-01	8350
2	M1	1976-08-01	8570
3	M1	1976-09-01	7700
4	M1	1976-10-01	7080
...	...	...	...
1569	M1000	2015-02-01	880
1570	M1000	2015-03-01	800
1571	M1000	2015-04-01	1140
1572	M1000	2015-05-01	970
1573	M1000	2015-06-01	1430

1574 rows × 3 columns

Visualizing grouped data is as simple as grouping the data set with groupby() before passing it to the plot_timeseries() function. There are 2 methods:

Facets
Plotly Dropdown
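Before plotting, it can help to confirm how many groups groupby() will produce, since each group becomes one facet or one dropdown entry. The sketch below uses a small made-up long-format frame (the ids "A" through "D" and the values are hypothetical, not taken from m4_monthly) and only pandas calls.

```python
import pandas as pd

# Hypothetical long-format data: one row per (id, date) pair.
# The ids "A".."D" and the values are made up for illustration.
dates = pd.date_range("2020-01-01", periods=3, freq="MS")
long_df = pd.DataFrame({
    "id": [g for g in ["A", "B", "C", "D"] for _ in dates],
    "date": list(dates) * 4,
    "value": list(range(12)),
})

grouped = long_df.groupby("id")

print(grouped.ngroups)  # number of separate series
print(grouped.size())   # rows available per series
```

Running the same two calls on m4_monthly.groupby('id') shows how many series there are and how long each one is.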

Facets (Subgroups on one plot)

This is great for seeing all time series in one plot. Here are the key points:

Groups can be added using the pandas groupby().
These groups are then converted into facets.
Using facet_ncol = 2 returns a 2-column faceted plot.
Setting facet_scales = "free" allows the x and y-axes of each plot to scale independently of the other plots.

Code

m4_monthly.groupby('id').plot_timeseries(
    'date', 'value', 
    facet_ncol = 2, 
    facet_scales = "free"
)
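To reason about how a facet_ncol value lays out the panels, the usual facet-grid arithmetic is enough: the number of rows is the group count divided by the column count, rounded up. The helper below is a hypothetical illustration of that arithmetic, not pytimetk's internal code.

```python
import math

def facet_grid_shape(n_groups: int, facet_ncol: int) -> tuple:
    """Return (rows, cols) for a facet grid filled row by row.

    Hypothetical helper for illustration; not part of pytimetk.
    """
    if n_groups < 1 or facet_ncol < 1:
        raise ValueError("n_groups and facet_ncol must be at least 1")
    n_cols = min(facet_ncol, n_groups)     # never more columns than panels
    n_rows = math.ceil(n_groups / n_cols)  # round up so every panel fits
    return n_rows, n_cols

# Four groups with facet_ncol = 2 give a 2 x 2 grid.
print(facet_grid_shape(4, 2))  # (2, 2)
```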

Plotly Dropdown

Sometimes you have many groups and would prefer to see one plot per group. This can be accomplished with plotly_dropdown. You can adjust the x and y position of the dropdown as follows:

Code

m4_monthly.groupby('id').plot_timeseries(
    'date', 'value', 
    plotly_dropdown=True,
    plotly_dropdown_x=0,
    plotly_dropdown_y=1
)

The groups can also be visualized in the same plot using the color_column parameter. Let’s come back to the taylor_30_min dataframe.

Code

# load data
taylor_30_min = tk.load_dataset("taylor_30_min", parse_dates = ['date'])

# extract the month using pandas
taylor_30_min['month'] = pd.to_datetime(taylor_30_min['date']).dt.month

# plot groups
taylor_30_min.plot_timeseries(
    'date', 'value', 
    color_column = 'month'
)
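To see which groups the month column produces, the sketch below derives it on a synthetic half-hourly index covering the same date range as taylor_30_min. The synthetic index is an assumption made so the snippet runs without pytimetk; the month extraction is the same pandas call used above.

```python
import pandas as pd

# Synthetic stand-in for the 'date' column (same span as taylor_30_min).
dates = pd.date_range("2000-06-05", periods=4032, freq="30min", tz="UTC")
demo = pd.DataFrame({"date": dates})

# Same month extraction as in the tutorial code.
demo["month"] = pd.to_datetime(demo["date"]).dt.month

# Distinct month numbers, i.e. the groups passed to color_column.
months = sorted(demo["month"].unique().tolist())
print(months)  # [6, 7, 8]
```

The data spans June through August, so the month column defines three groups.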

3 Next steps

Check out the Pytimetk Basics Guide next.

4 More Coming Soon…

We are in the early stages of development, but the potential for pytimetk in Python is already obvious. 🐍

Please ⭐ us on GitHub (it takes 2 seconds and means a lot).
To make requests, please see our Project Roadmap GH Issue #2.
Want to contribute? See our contributing guide here.