Hexbin

A hexbin plot is useful for representing the relationship between two numerical variables when you have a lot of data points. Instead of drawing overlapping points, the plotting window is split into several hexagonal bins (hexbins), and the color of each hexbin denotes the number of points it contains. This can be easily done using the hexbin() function of matplotlib. Note that you can change the size of the bins using the gridsize argument. The parameters of the hexbin() function used in the example are:

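  • x, y: the data positions
  • gridsize: the number of hexagons in the x-direction and the y-direction, given as a tuple (if a single integer is given, it sets the x-direction and the y-direction is chosen so that the hexagons are approximately regular)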

Libraries & Dataset

Let's start by importing a few libraries and creating a dataset:

# libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
 
# create data
df = pd.DataFrame({
   'x': np.random.normal(size=100000),
   'y': np.random.normal(size=100000)
})
df.head()
          x         y
0 -0.802614 -0.745659
1 -0.861153  0.013962
2 -1.847799  1.002161
3  0.263749  0.264361
4  2.192508  0.464232

Make the plot

Making a hexbin plot is quite straightforward with the hexbin() function from matplotlib:

fig, axs = plt.subplots(ncols=2, figsize=(8,4))
 
# Make the plot
axs[0].hexbin(df['x'], df['y'], gridsize=(15,15))
 
# We can control the size of the bins:
axs[1].hexbin(df['x'], df['y'], gridsize=(150,150))

plt.show()
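
To check that the color of a hexbin really encodes a count, you can inspect the object returned by hexbin(). The sketch below assumes a smaller, seeded random dataset (10,000 points) rather than the one above. hexbin() returns a collection whose get_array() method gives one value per hexagon, which with the default settings is the number of points that fell into it.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: a small seeded dataset, used only for illustration
rng = np.random.default_rng(42)
n_points = 10_000
x = rng.normal(size=n_points)
y = rng.normal(size=n_points)

fig, ax = plt.subplots(figsize=(4, 4))
hb = ax.hexbin(x, y, gridsize=(15, 15))

# One value per hexagon: the number of points inside it
counts = hb.get_array()
total_counted = int(counts.sum())

print("number of hexagons:", len(counts))
print("largest count in a single hexagon:", int(counts.max()))
print("points counted across all hexagons:", total_counted)
```

With the default extent, the grid spans the full range of the data, so the counts should add up to the number of points. Call plt.show() at the end to display the figure.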

Color

It is possible to change the color palette applied to the plot with the cmap argument. Matplotlib ships with many color palettes, so you can pick the one that suits your data.

fig, axs = plt.subplots(ncols=2, nrows=2, figsize=(8,8))
 
# reversed red colormap
axs[0,0].hexbin(df['x'], df['y'], gridsize=(15,15), cmap=plt.cm.Reds_r)
axs[0,0].set_title('cmap=plt.cm.Reds_r')

# reversed blue colormap
axs[0,1].hexbin(df['x'], df['y'], gridsize=(15,15), cmap=plt.cm.Blues_r)
axs[0,1].set_title('cmap=plt.cm.Blues_r')

# reversed green colormap
axs[1,0].hexbin(df['x'], df['y'], gridsize=(15,15), cmap=plt.cm.Greens_r)
axs[1,0].set_title('cmap=plt.cm.Greens_r')

# reversed grey colormap
axs[1,1].hexbin(df['x'], df['y'], gridsize=(15,15), cmap=plt.cm.Greys_r)
axs[1,1].set_title('cmap=plt.cm.Greys_r')

plt.show()
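
The colormaps used above all carry the _r suffix, which selects the reversed version of a palette: the densest hexbins get the lightest color instead of the darkest. The sketch below, which assumes a small seeded dataset of its own, draws the same data with Reds and Reds_r side by side. It also passes the colormap by name as a string, which matplotlib accepts as an alternative to plt.cm.Reds.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: a small seeded dataset, used only for illustration
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = rng.normal(size=10_000)

fig, axs = plt.subplots(ncols=2, figsize=(8, 4))

# Regular palette: dense hexbins are dark red
axs[0].hexbin(x, y, gridsize=(15, 15), cmap="Reds")
axs[0].set_title("cmap='Reds'")

# Reversed palette: dense hexbins are light
axs[1].hexbin(x, y, gridsize=(15, 15), cmap="Reds_r")
axs[1].set_title("cmap='Reds_r'")

# The first color of the reversed map is the last color of the regular map
first_reversed = plt.cm.Reds_r(0.0)
last_regular = plt.cm.Reds(1.0)
```

Call plt.show() at the end to display the figure.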

Colorbar and legend

Note that you can easily add a color bar beside the plot using the colorbar() function.

plt.hexbin(df['x'], df['y'], gridsize=(15,15), cmap=plt.cm.Greys_r)
plt.colorbar()
plt.show()
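
When working with explicit figure and axes objects, as in the earlier examples, the same result can be obtained by passing the object returned by hexbin() to fig.colorbar(). The sketch below assumes a small seeded dataset of its own, and the label text is only illustrative. Labeling the color bar makes it clear that the colors stand for counts.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: a small seeded dataset, used only for illustration
rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = rng.normal(size=10_000)

fig, ax = plt.subplots(figsize=(5, 4))
hb = ax.hexbin(x, y, gridsize=(15, 15), cmap=plt.cm.Greys_r)

# Pass the hexbin result so the color bar describes this plot's colors
cb = fig.colorbar(hb, ax=ax)

# Illustrative label: each color corresponds to a number of points
cb.set_label("Number of points per hexbin")
```

Call plt.show() at the end to display the figure.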

This document is a work by Yan Holtz.