One numerical variable only

If you have only one numerical variable, a histogram or a density plot is usually the better choice. You can still use the violinplot function to describe the distribution of this variable, as follows:

# libraries & dataset
import seaborn as sns
import matplotlib.pyplot as plt
  
sns.set_theme(style="darkgrid")
df = sns.load_dataset('iris')
 
# Make a violin plot for a single numerical variable
sns.violinplot(y=df["sepal_length"])
plt.show()

One variable and several groups

Violinplots are usually used in the same situations as boxplots: when you have one numerical variable and several groups. They let you compare the distribution of the variable from one group to another. The data typically come in two columns, one giving the value of the variable and the other one the group:

# libraries & dataset
import seaborn as sns
import matplotlib.pyplot as plt
  
sns.set_theme(style="darkgrid")
df = sns.load_dataset('iris')
 
# plot
sns.violinplot(x=df["species"], y=df["sepal_length"])
plt.show()
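
To make the two-column layout concrete, here is a minimal sketch that builds such a table by hand and plots it. The group labels, the means, and the column names `group` and `value` are all hypothetical, and the values are synthetic.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme(style="darkgrid")

# Synthetic long-form data (assumption): one row per observation,
# one column for the group label and one for the measured value
rng = np.random.default_rng(1)
group_means = {"a": 5.0, "b": 6.0, "c": 6.5}
df = pd.DataFrame({
    "group": np.repeat(list(group_means), 50),
    "value": np.concatenate(
        [rng.normal(loc=mean, scale=0.5, size=50) for mean in group_means.values()]
    ),
})

# Passing the DataFrame plus column names draws one violin per group
ax = sns.violinplot(data=df, x="group", y="value")

plt.show()
```

Passing `data=df` with column names is equivalent to passing the two columns directly, and it labels the axes with the column names.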

Several variables

Violinplots are also useful for comparing several variables. In the iris dataset, we can compare the first two numerical variables, sepal_length and sepal_width:

# libraries & dataset
import seaborn as sns
import matplotlib.pyplot as plt
  
sns.set_theme(style="darkgrid")
df = sns.load_dataset('iris')
 
# plot: one violin per column (the first two columns of the dataset)
sns.violinplot(data=df.iloc[:, 0:2])
plt.show()

This document is a work by Yan Holtz.