Cross-filtering Visualizations with Dash Plotly

Many BI tools provide cross-filtering between widgets. This feature lets every widget on a dashboard respond to an interaction with any one of them. When a user clicks on a visualization, the other widgets show data filtered by the category the user clicked on the triggering widget.

It is a bit tricky to implement this feature with Dash Plotly. Here is a glimpse of what we will build:


The code is available in this GitHub repo: https://github.com/dvinayakn/cross-filtering-visualizations-dash

We will work with fictitious data in a CSV file, sales_data.csv, included in the assets folder. Here is a glimpse of the data:

Execution starts in index.py, where a Dash object is created and a layout is assigned to it. The layout is defined in layout.py and contains an html.Div with the following children:
  1. A dcc.Store that holds the session id.
  2. A dcc.Store that holds the application data in memory.
  3. A reset button.
  4. Three visualizations showing revenue by product, region and month respectively.

Let's look at the core of this application: the callbacks. The first callback is responsible for the interactive behaviour of the application:

# Define the callback for cross filtering
@app.callback(
    Output('revenue-by-product-graph-id', 'figure'),
    Output('revenue-by-region-graph-id', 'figure'),
    Output('revenue-by-month-graph-id', 'figure'),
    Output('report-data', 'data'),

    Input('session-id', 'data'),
    Input('revenue-by-product-graph-id', 'clickData'),
    Input('revenue-by-region-graph-id', 'clickData'),
    Input('revenue-by-month-graph-id', 'clickData'),

    State('report-data', 'data')
)

This states that the callback is triggered whenever the session id is generated or any of the three visualizations is clicked.

The callback function updates the "figure" property of all three visualizations and saves the filtered application data in the store with id 'report-data'. The current content of that store is also passed in as State, so reading it does not trigger the callback.

Let's look at the callback function.

The code snippet below decides when to load the data from the source and when to use the filtered data:

    # Check if this callback is being called for the first time
    # or the session id has been recreated.
    # The session id is configured to be recreated whenever the 'Reset'
    # button is clicked.
    # Hence a click on the 'Reset' button also satisfies this condition and
    # makes the app load the data from source.
    if report_data is None or ctx.triggered_id == 'session-id':
        # Read the data from source
        print('Reading data from source')
        data_df = pd.read_csv('assets/sales_data.csv')

    # If the above condition is not satisfied, the callback was
    # triggered by an interaction with one of the charts.
    # In this case, read the data from the report data store.
    # 'report-data' contains the data filtered via earlier interactions.
    else:
        # Read the data from store - report_data
        # (newer pandas versions expect a file-like object here,
        # i.e. pd.read_json(io.StringIO(report_data)))
        print('Reading data from dcc.Store')
        data_df = pd.read_json(report_data)
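
To make the branching easier to follow, the condition can be pulled out into a small pure function and tried on the three situations described in the comments. This is only an illustrative sketch: the helper name should_reload_from_source is hypothetical and not part of the repo.

```python
def should_reload_from_source(report_data, triggered_id):
    """Return True when the callback should re-read the CSV file.

    Mirrors the condition used in the callback: either nothing has been
    stored yet, or the session id was (re)created.
    """
    return report_data is None or triggered_id == 'session-id'


# First page load: the store is still empty
first_load = should_reload_from_source(None, None)

# 'Reset' was clicked: the store holds data, but the session id changed
after_reset = should_reload_from_source('{"Revenue": {"0": 100}}', 'session-id')

# A chart was clicked: keep working with the already filtered data
after_chart_click = should_reload_from_source(
    '{"Revenue": {"0": 100}}', 'revenue-by-region-graph-id'
)

print(first_load, after_reset, after_chart_click)
```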


The next code snippet determines which visualization has triggered this callback and filters the data based on the "clickData" property. 

    # Filter the data based on user interaction.
    # Look at bottom of file for example of click data.
    if ctx.triggered_id == 'revenue-by-product-graph-id':
        clicked_product = product_click_data['points'][0]['x']
        data_df = data_df[data_df['Product Name'] == clicked_product]

    elif ctx.triggered_id == 'revenue-by-region-graph-id':
        clicked_region = region_click_data['points'][0]['x']
        data_df = data_df[data_df['Region Name'] == clicked_region]

    elif ctx.triggered_id == 'revenue-by-month-graph-id':
        clicked_month = month_click_data['points'][0]['x']
        data_df = data_df[data_df['Month Name'] == clicked_month]
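
Because each click filters data that was already filtered by earlier clicks, the filters accumulate until the dashboard is reset. The sketch below shows this on a few made-up rows and replaces the if/elif chain with a lookup table. The helper apply_click_filter, the dictionary FILTER_COLUMN_BY_GRAPH and the sample values are hypothetical; only the graph ids and column names come from the snippet above.

```python
import pandas as pd

# Maps each graph id to the column shown on its x-axis
FILTER_COLUMN_BY_GRAPH = {
    'revenue-by-product-graph-id': 'Product Name',
    'revenue-by-region-graph-id': 'Region Name',
    'revenue-by-month-graph-id': 'Month Name',
}


def apply_click_filter(data_df, triggered_id, click_data):
    """Keep only the rows matching the category clicked on the triggering graph."""
    column = FILTER_COLUMN_BY_GRAPH.get(triggered_id)
    if column is None:
        # Not triggered by one of the charts: nothing to filter
        return data_df
    clicked_value = click_data['points'][0]['x']
    return data_df[data_df[column] == clicked_value]


# Made-up rows, only for illustration
sample_df = pd.DataFrame({
    'Product Name': ['Product 1', 'Product 1', 'Product 2'],
    'Region Name': ['North', 'South', 'North'],
    'Month Name': ['Jan', 'Feb', 'Jan'],
    'Revenue': [100, 200, 300],
})

# First click: a bar on the product chart
by_product = apply_click_filter(
    sample_df, 'revenue-by-product-graph-id', {'points': [{'x': 'Product 1'}]}
)

# Second click: a bar on the region chart, applied to the already filtered data
by_product_and_region = apply_click_filter(
    by_product, 'revenue-by-region-graph-id', {'points': [{'x': 'North'}]}
)

print(by_product_and_region)
```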

Here is an example of a "clickData" object to make the above filtering easier to understand:

# Example of click data object:
# {
#     'points': [
#         {
#             'curveNumber': 0,
#             'pointNumber': 4,
#             'pointIndex': 4,
#             'x': 'Product 5',
#             'y': 677597,
#             'label': 'Product 5',
#             'value': 677597,
#             'text': 677597,
#             'bbox': {
#                 'x0': 619.49,
#                 'x1': 726.51,
#                 'y0': 67.25,
#                 'y1': 67.25
#             }
#         }
#     ]
# }
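
The only part of this object the filtering relies on is the 'x' value of the first point. Since a graph's clickData is None until the user has clicked on it, a defensive reader can be useful. The helper below is a hypothetical sketch, not code from the repo:

```python
def clicked_category(click_data):
    """Return the x value of the first clicked point, or None if there is none."""
    if not click_data:
        # No click has happened on this graph yet
        return None
    points = click_data.get('points') or []
    if not points:
        return None
    return points[0].get('x')


# Trimmed-down version of the example object above
example_click_data = {
    'points': [
        {'curveNumber': 0, 'pointNumber': 4, 'x': 'Product 5', 'y': 677597}
    ]
}

print(clicked_category(example_click_data))
print(clicked_category(None))
```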

The next code snippet grabs the filtered data and creates visualizations with it:

    # Revenue by product graph
    revenue_by_product_df = (
        data_df[['Product Name', 'Revenue']]
        .groupby(by=['Product Name']).sum().reset_index()
    )
    revenue_by_product_chart = px.bar(
        data_frame=revenue_by_product_df,
        x='Product Name',
        y='Revenue',
        text='Revenue',
    )
    revenue_by_product_chart.update_xaxes(title=None)
    revenue_by_product_chart.update_layout(margin=dict(l=10, r=10, t=10, b=10))

    # Revenue by region graph
    revenue_by_region_df = (
        data_df[['Region Name', 'Revenue']]
        .groupby(by=['Region Name']).sum().reset_index()
    )
    revenue_by_region_chart = px.bar(
        data_frame=revenue_by_region_df,
        x='Region Name',
        y='Revenue',
        text='Revenue',
    )
    revenue_by_region_chart.update_xaxes(title=None)
    revenue_by_region_chart.update_layout(margin=dict(l=10, r=10, t=10, b=10))

    # Revenue by month graph
    revenue_by_month_df = (
        data_df[['Month Name', 'Revenue']]
        .groupby(by=['Month Name']).sum().reset_index()
    )
    revenue_by_month_chart = px.bar(
        data_frame=revenue_by_month_df,
        x='Month Name',
        y='Revenue',
        text='Revenue',
    )
    revenue_by_month_chart.update_xaxes(title=None)
    revenue_by_month_chart.update_layout(margin=dict(l=10, r=10, t=10, b=10))
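
The three blocks differ only in the column they group by, so the aggregation step could be factored into one function. The sketch below shows that step on made-up rows; the helper name revenue_by and the sample values are hypothetical.

```python
import pandas as pd


def revenue_by(data_df, category_column):
    """Sum 'Revenue' per value of category_column and return a flat DataFrame."""
    return (
        data_df[[category_column, 'Revenue']]
        .groupby(by=[category_column])
        .sum()
        .reset_index()
    )


# Made-up rows, only for illustration
sample_df = pd.DataFrame({
    'Product Name': ['Product 1', 'Product 1', 'Product 2'],
    'Region Name': ['North', 'South', 'North'],
    'Revenue': [100, 200, 300],
})

revenue_by_product_df = revenue_by(sample_df, 'Product Name')
revenue_by_region_df = revenue_by(sample_df, 'Region Name')

print(revenue_by_product_df)
print(revenue_by_region_df)
```

Each resulting frame has one row per category and can be passed to px.bar as data_frame, exactly as in the snippet above.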

Finally, the three visualizations and the filtered data are returned. The visualizations are assigned to the "figure" property of the dcc.Graph objects and therefore appear on the user's screen. The filtered data goes to the data property of the dcc.Store object with id 'report-data', which makes it available to the next invocation of the callback.
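
The return statement is not shown in this article. Since the stored data is read back with pd.read_json, a reasonable assumption is that the filtered DataFrame is serialized with DataFrame.to_json() before it is returned. The following sketch shows that round trip on made-up rows:

```python
from io import StringIO

import pandas as pd

# Made-up filtered data, only for illustration
filtered_df = pd.DataFrame({
    'Product Name': ['Product 1', 'Product 2'],
    'Revenue': [100, 300],
})

# What the callback could return for Output('report-data', 'data'):
# a JSON string, which dcc.Store can hold
report_data = filtered_df.to_json()

# What the next invocation could do with State('report-data', 'data').
# Wrapping the string in StringIO avoids the deprecation warning that
# newer pandas versions raise for literal JSON strings.
restored_df = pd.read_json(StringIO(report_data))

print(restored_df)
```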

The second callback simply recreates the session id with uuid4 and assigns it to the data property of the dcc.Store object with id 'session-id'. Any change in the 'data' property of that store triggers the first callback, which in turn loads the data from source because it was triggered by the session id being recreated.

# Callback to reset the dashboard to original state when "Reset" button
# is clicked
# This callback recreates the session id and assigns the outcome
# to dcc.Store object with id 'session-id'
@app.callback(
    Output('session-id', 'data'),
    Input('reset-button-id', 'n_clicks'),
    prevent_initial_call=True
)
def reset_dashboard(n_clicks):
    return str(uuid4())
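
A short standalone check of why uuid4 suits this purpose: every call produces a new random identifier, so the value written to the 'session-id' store differs from the previous one on each click. The variable names below are only illustrative.

```python
from uuid import UUID, uuid4

# Simulate two consecutive clicks on the "Reset" button
first_session_id = str(uuid4())
second_session_id = str(uuid4())

# The two ids differ, so the stored value changes on every reset
ids_differ = first_session_id != second_session_id

# The strings parse back as version-4 (random) UUIDs
uuid_version = UUID(first_session_id).version

print(ids_differ, uuid_version)
```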
 
