Histogram
A histogram represents the distribution of a numeric variable for one or several groups. The values are split into bins, and each bin is represented as a bar. This page showcases many histograms built with Python, using both the seaborn and the matplotlib libraries.
⏱ Quick start (Seaborn)
Seaborn is a convenient library for quickly building a histogram thanks to its histplot() function, which replaces the older distplot() (deprecated since seaborn 0.11).
Note the importance of the bins parameter: try several values to see which one represents your data best. 🔥
# library & dataset
import seaborn as sns
df = sns.load_dataset('iris')
# Plot the histogram with histplot (the replacement for the deprecated distplot)
# kde=False draws the bars only, without a density curve
sns.histplot(data=df, x="sepal_length", kde=False)
Histogram charts with Seaborn
Seaborn is a Python library that makes it easy to build good-looking charts. It is well suited to histograms thanks to its histplot() function (older examples use distplot(), which is now deprecated). The following charts will guide you through its usage, going from a very basic histogram to something much more customized.
Quick start (Matplotlib)
Matplotlib can also build decent histograms easily. It provides a hist() function that accepts a vector of numeric values as input.
It also provides many options to customize the binning and the general appearance.
# library & dataset
import matplotlib.pyplot as plt
hours = [17, 20, 22, 25, 26, 27, 30, 31, 32, 38, 40, 40, 45, 55]
# Initialize layout
fig, ax = plt.subplots(figsize = (9, 9))
# Plot the histogram with 5 equal-width bins
ax.hist(hours, bins=5, edgecolor="black")
plt.show()
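To make the idea of "splitting values into bins" concrete, the sketch below counts the hours values by hand using 5 equal-width bins between the minimum and the maximum, which is the default behaviour of hist() when bins is an integer. The helper name equal_width_counts is hypothetical and written for illustration only; it assumes the data contains at least two distinct values.

```python
def equal_width_counts(values, bins):
    """Count values in `bins` equal-width intervals spanning min..max.

    Hypothetical helper for illustration; assumes max(values) > min(values).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Index of the bin containing v; the maximum value is clamped
        # into the last bin, whose right edge is inclusive
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


hours = [17, 20, 22, 25, 26, 27, 30, 31, 32, 38, 40, 40, 45, 55]
counts = equal_width_counts(hours, bins=5)
print(counts)  # -> [3, 6, 1, 3, 1]
```

Each count is the height of one bar in the chart above: the bins are 7.6 hours wide, starting at 17 and ending at 55.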
Histograms with Matplotlib
As usual, matplotlib is perfectly capable of building a nice histogram, but it requires more work than seaborn to get a good-looking figure.
The examples below should help you get started with matplotlib histograms. They start from a very basic version and then show how to customize it, for example by adding annotations.
Contact
👋 This document is a work by Yan Holtz. Any feedback is highly encouraged. You can file an issue on GitHub, drop me a message on Twitter, or send an email by pasting yan.holtz.data with gmail.com.