Adding jitter to a boxplot distribution


A boxplot is a great way to study distributions. However, very different distributions can hide behind the same box. It is therefore advisable to display every observation on top of your boxplot, so that you do not miss an interesting pattern. Violin plots can be a good alternative if you have a large number of observations.
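
To make the point about hidden distributions concrete, here is a minimal sketch using made-up numbers (not the iris data). It builds two samples, one evenly spread and one concentrated in two tight clusters, and compares the five values a boxplot is drawn from. The helper five_number_summary and both sample names are hypothetical, introduced only for this illustration.

```python
import numpy as np

def five_number_summary(values):
    """Return min, Q1, median, Q3 and max: the values a boxplot is built from."""
    return np.percentile(values, [0, 25, 50, 75, 100])

# Sample A (made up): 101 evenly spread values between 0 and 10.
evenly_spread = np.linspace(0, 10, 101)

# Sample B (made up): almost every value sits in one of two tight clusters,
# at 2.5 and at 7.5, with single values at 0, 5 and 10.
two_clusters = np.array([0.0] + [2.5] * 49 + [5.0] + [7.5] * 49 + [10.0])

# Both summaries come out the same, so the two boxes would look alike
# even though the shapes of the samples are very different.
print(five_number_summary(evenly_spread))
print(five_number_summary(two_clusters))
```

Only the individual points, drawn over the box, would reveal that the second sample is made of two clusters.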

Seaborn's boxplot() function has no argument for displaying the individual points directly. Instead, we draw a seaborn boxplot() and then a seaborn swarmplot() on the same matplotlib Axes object. The swarmplot adds the individual points to the figure.
Overall, such a figure conveys much the same information as a violin plot.

# libraries & dataset
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="darkgrid")
df = sns.load_dataset('iris')

# Usual boxplot; keep the returned Axes so the points can be drawn on it
ax = sns.boxplot(x='species', y='sepal_length', data=df)

# Overlay every observation with the swarmplot function, on the same Axes
sns.swarmplot(x='species', y='sepal_length', data=df, color="grey", ax=ax)
plt.show()
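
The swarmplot above positions the points so that they do not overlap. If random jitter is preferred, for instance when a swarm becomes too crowded, seaborn's stripplot() is one possible option. The following is a minimal sketch under stated assumptions: it uses synthetic data instead of iris so that it runs offline, the column names group and value are made up for illustration, and showfliers=False is passed so that the boxplot's own outlier markers are not drawn on top of the jittered points.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic stand-in data (an assumption, not the iris dataset):
# three groups of 50 normally distributed values each.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "group": np.repeat(["a", "b", "c"], 50),
    "value": np.concatenate([
        rng.normal(5.0, 0.4, 50),
        rng.normal(6.0, 0.5, 50),
        rng.normal(6.5, 0.6, 50),
    ]),
})

fig, ax = plt.subplots()

# Boxplot without its own outlier markers, since every point is drawn below
sns.boxplot(x="group", y="value", data=df, showfliers=False, ax=ax)

# Individual observations with random horizontal jitter, on the same Axes
sns.stripplot(x="group", y="value", data=df, color="grey",
              jitter=0.2, size=3, ax=ax)
```

Call plt.show() afterwards to display the figure. The jitter width of 0.2 and the marker size of 3 are arbitrary choices here and can be tuned to the number of observations.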

Contact & Edit

👋 This document is a work by Yan Holtz. Any feedback is highly encouraged. You can file an issue on GitHub, drop me a message on Twitter, or send an email by joining yan.holtz.data with gmail.com.

This page is a Jupyter notebook that you can edit. Please help me make this website better 🙏!
