Use normalization on seaborn heatmap


Sometimes a normalization step is necessary to reveal the patterns in your heatmap. This post shows how to normalize a data frame before plotting a heatmap with seaborn, so that a single column or row does not absorb all the color variation.

In the first chart of the first example, one column appears yellow while the rest of the heatmap appears green: that column absorbs all the color variation. To avoid this, you can normalize the data frame, either by column or by row. Several formulas can be used; choose the one that fits your data.
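As a rough illustration of two formulas commonly used for this, the sketch below applies z-score standardization and min-max scaling column by column. The helper names `zscore_by_column` and `minmax_by_column` are made up for this example; they are not part of pandas or seaborn, and other formulas may suit your data better.

```python
import numpy as np
import pandas as pd

def zscore_by_column(df):
    # Hypothetical helper: center each column on 0 and scale it to unit standard deviation
    return (df - df.mean()) / df.std()

def minmax_by_column(df):
    # Hypothetical helper: rescale each column so its values span the range [0, 1]
    return (df - df.min()) / (df.max() - df.min())

# Small example where the second column lives on a much larger scale
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(loc=3, scale=4, size=(10, 3)))
data[1] = data[1] + 40

z = zscore_by_column(data)
mm = minmax_by_column(data)
```

Either result can be passed to `sns.heatmap` in place of the raw data frame. The z-score keeps information about how far each value is from its column mean, while min-max scaling forces every column onto the same 0 to 1 range.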

Column normalization

Compare the two charts below to see the difference between the initial data frame and its normalized version.

# libraries
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
 
# Create a dataframe where the average value of the second column is higher than the others
df = pd.DataFrame(np.random.randn(10, 10) * 4 + 3)
df[1] = df[1] + 40

# A heatmap of the raw data only shows that one column has higher values than the others
sns.heatmap(df, cmap='viridis')
plt.show()

# Normalize by column: subtract each column's mean and divide by its standard deviation
df_norm_col = (df - df.mean()) / df.std()
sns.heatmap(df_norm_col, cmap='viridis')
plt.show()
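
To check numerically what the second chart suggests, the following sketch rebuilds a similar data frame and inspects the column statistics before and after normalization. The seeded generator and the variable names `col_means` and `col_stds` are assumptions added for reproducibility; they do not appear in the example above.

```python
import numpy as np
import pandas as pd

# Seeded generator so the numbers are reproducible (an assumption of this sketch)
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.standard_normal((10, 10)) * 4 + 3)
df[1] = df[1] + 40

# Same column normalization as in the example above
df_norm_col = (df - df.mean()) / df.std()

# Before: the shifted column has a much larger mean than the others
print(df.mean().round(1))

# After: every column has a mean close to 0 and a standard deviation of 1
col_means = df_norm_col.mean()
col_stds = df_norm_col.std()
print(col_means.round(2))
print(col_stds.round(2))
```

Note that pandas computes the sample standard deviation (`ddof=1`) by default, so the normalized columns have unit standard deviation under that same convention.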

Row normalization

The same principle works for row normalization.

# libraries
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
 
# Create a dataframe where the average value of the third row (position 2) is higher than the others
df = pd.DataFrame(np.random.randn(10, 10) * 4 + 3)
df.iloc[2] = df.iloc[2] + 40
 
# If we do a heatmap, we just observe that one row has higher values than others:
sns.heatmap(df, cmap='viridis')
plt.show()
 
# Normalize by row: subtract each row's mean and divide by its standard deviation
df_norm_row = df.apply(lambda x: (x - x.mean()) / x.std(), axis=1)

# Plot the normalized data
sns.heatmap(df_norm_row, cmap='viridis')
plt.show()
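
The `apply` call above runs a Python function once per row. A possible alternative, sketched here under the assumption that the data frame is fully numeric, computes the row statistics first and aligns them along the index with `DataFrame.sub` and `DataFrame.div`. The names `norm_apply`, `norm_vec`, `row_mean` and `row_std` are illustrative only.

```python
import numpy as np
import pandas as pd

# Seeded generator so the comparison is reproducible (an assumption of this sketch)
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.standard_normal((10, 10)) * 4 + 3)
df.iloc[2] = df.iloc[2] + 40

# Row-wise version with apply, as in the example above
norm_apply = df.apply(lambda x: (x - x.mean()) / x.std(), axis=1)

# Vectorized version: one mean and one standard deviation per row
row_mean = df.mean(axis=1)
row_std = df.std(axis=1)

# axis=0 aligns the per-row statistics with the row index of df
norm_vec = df.sub(row_mean, axis=0).div(row_std, axis=0)
```

Both versions should give the same values up to floating-point error; the vectorized form tends to be faster on large data frames.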

Contact & Edit

👋 This document is a work by Yan Holtz. Any feedback is highly encouraged. You can file an issue on GitHub, drop me a message on Twitter, or send an email pasting yan.holtz.data with gmail.com.
