Boxplot


A boxplot summarizes the distribution of a numeric variable for one or several groups. It allows to quickly get the median, quartiles and outliers but also hides the dataset individual data points. In python, boxplots are most of time done thanks to the boxplot function of the Seaborn library.

⏱ Quick start

Seaborn is definitely the best library to quickly build a boxplot. It offers a dedicated boxplot() function that roughly works as follows:🔥

# library & dataset
import seaborn as sns
df = sns.load_dataset('iris')

sns.boxplot( x=df["species"], y=df["sepal_length"] )

⚠️ Mind the boxplot

A boxplot is an awesome way to summarize the distribution of a variable. However it hides the real distribution and the sample size. Check the 3 charts below that are based on the exact same dataset.

To read more about this, visit data-to-viz.com that has a dedicated article.

🔎 violin function parameters→ see full doc

stringcolor under the curve


Contact

👋 This document is a work by Yan Holtz. Any feedback is highly encouraged. You can fill an issue on Github, drop me a message onTwitter, or send an email pasting yan.holtz.data with gmail.com.

Violin

Density

Histogram

Boxplot

Ridgeline

Scatterplot

Heatmap

Correlogram

Bubble

Connected Scatter

2D Density

Barplot

Spider / Radar

Wordcloud

Parallel

Lollipop

Circular Barplot

Treemap

Venn Diagram

Donut

Pie Chart

Dendrogram

Circular Packing

Line chart

Area chart

Stacked Area

Streamgraph

Map

Choropleth

Hexbin

Cartogram

Connection

Bubble

Chord Diagram

Network

Sankey

Arc Diagram

Edge Bundling

Colors

Interactivity

Animation with python

Animation

Cheat sheets

Caveats

3D