The plot_correlation_funnel function generates a correlation funnel plot using either Plotly or plotnine in Python.
Parameters
Name
Type
Description
Default
data
pd.DataFrame
The data parameter is a pandas DataFrame that contains the correlation values and corresponding features. It should have two columns: ‘correlation’ and ‘feature’.
required
limits
tuple
The limits parameter is a tuple that specifies the lower and upper limits of the x-axis in the correlation funnel plot. By default, the limits are set to (-1, 1), which means the x-axis will range from -1 to 1.
(-1, 1)
alpha
float
The alpha parameter determines the transparency of the data points in the plot. A value of 1.0 means the points are fully opaque, while a value less than 1.0 makes the points more transparent.
1.0
title
str
The title of the plot.
'Correlation Funnel Plot'
x_lab
str
The x_lab parameter is used to specify the label for the x-axis of the plot. It represents the label for the correlation values.
'Correlation'
y_lab
str
The y_lab parameter is used to specify the label for the y-axis in the correlation funnel plot. It represents the name or description of the feature being plotted.
'Feature'
base_size
float
The base_size parameter is used to set the base font size for the plot. It is multiplied by different factors to determine the font sizes for various elements of the plot, such as the title, axis labels, tick labels, legend, and annotations.
11
width
Optional[int]
The width parameter is used to specify the width of the plot in pixels. It determines the horizontal size of the plot.
None
height
Optional[int]
The height parameter is used to specify the height of the plot in pixels. It determines the vertical size of the plot when it is rendered.
None
engine
str
The engine parameter determines the plotting engine to be used. It can be set to either “plotly” or “plotnine”. If set to “plotly”, the function will generate an interactive plot using the Plotly library. If set to “plotnine”, it will generate a static plot using the plotnine library. The default value is “plotly”.
'plotly'
Returns
Type
Description
The function plot_correlation_funnel returns a plotly figure object if the engine parameter is
set to ‘plotly’, and a plotnine object if the engine parameter is set to ‘plotnine’.
See Also
binarize(): Binarize the dataset into 1’s and 0’s.
correlate(): Calculate the correlation between features in a pandas DataFrame.
Examples
# NON-TIMESERIES EXAMPLE ----import pandas as pdimport numpy as npimport pytimetk as tk# Set a random seed for reproducibilitynp.random.seed(0)# Define the number of rows for your DataFramenum_rows =200# Create fake data for the columnsdata = {'Age': np.random.randint(18, 65, size=num_rows),'Gender': np.random.choice(['Male', 'Female'], size=num_rows),'Marital_Status': np.random.choice(['Single', 'Married', 'Divorced'], size=num_rows),'City': np.random.choice(['New York', 'Los Angeles', 'Chicago', 'Houston', 'Miami'], size=num_rows),'Years_Playing': np.random.randint(0, 30, size=num_rows),'Average_Income': np.random.randint(20000, 100000, size=num_rows),'Member_Status': np.random.choice(['Bronze', 'Silver', 'Gold', 'Platinum'], size=num_rows),'Number_Children': np.random.randint(0, 5, size=num_rows),'Own_House_Flag': np.random.choice([True, False], size=num_rows),'Own_Car_Count': np.random.randint(0, 3, size=num_rows),'PersonId': range(1, num_rows +1), # Add a PersonId column as a row count'Client': np.random.choice(['A', 'B'], size=num_rows) # Add a Client column with random values 'A' or 'B'}# Create a DataFramedf = pd.DataFrame(data)# Binarize the datadf_binarized = df.binarize(n_bins=4, thresh_infreq=0.01, name_infreq="-OTHER", one_hot=True)df_binarized.glimpse()