Libraries
First, we need to load a few libraries:
- seaborn: for creating the scatterplot
- matplotlib: for displaying the plot
- pandas: for data manipulation
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Dataset
Since scatter plots are made for visualizing the relationship between two numerical variables, we need a dataset that contains at least two numerical columns.
Here, we will use the iris dataset, which we load directly from the gallery:
path = 'https://raw.githubusercontent.com/holtzy/The-Python-Graph-Gallery/master/static/data/iris.csv'
df = pd.read_csv(path)
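Before plotting, it can help to confirm which columns are numerical. The snippet below is a minimal sketch on a small hypothetical stand-in frame, not the real CSV: the values are made up, and the species column is an assumption about the file's layout. The same select_dtypes() call should work on df once it is loaded.

```python
import pandas as pd

# Hypothetical stand-in for the iris data (made-up rows; "species" is an assumed column)
sample = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 6.3],
    "sepal_width": [3.5, 3.0, 3.3],
    "species": ["setosa", "setosa", "virginica"],
})

# Keep only the numerical columns, which are the candidates for x and y
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```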
Color
You can customize the appearance of the regression fit in a seaborn scatterplot with the line_kws argument of regplot(), a dictionary of keyword arguments that seaborn forwards to the matplotlib call drawing the line.
Let's start by customizing the color:
fig, ax = plt.subplots(figsize=(8, 6))
sns.regplot(
    x=df["sepal_length"],
    y=df["sepal_width"],
    line_kws={"color": "r"},
    ax=ax
)
plt.show()
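A color set in line_kws applies to the regression line only, while the points keep the base color. The sketch below illustrates this with a separate color argument for the points. It runs on a small hypothetical data frame rather than the downloaded CSV, and the steelblue and #d62728 colors are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgb
import pandas as pd
import seaborn as sns

# Hypothetical stand-in data, so the sketch runs without downloading the CSV
demo = pd.DataFrame({
    "sepal_length": [4.9, 5.1, 5.8, 6.3, 6.7, 7.1],
    "sepal_width": [3.0, 3.5, 2.7, 3.3, 3.1, 3.0],
})

fig, ax = plt.subplots(figsize=(8, 6))
sns.regplot(
    x=demo["sepal_length"],
    y=demo["sepal_width"],
    color="steelblue",              # base color, used for the points
    line_kws={"color": "#d62728"},  # overrides the base color for the line only
    ax=ax,
)

# The regression line is the only Line2D on the axes; the points are the first collection
line_color = to_rgb(ax.lines[0].get_color())
point_color = tuple(ax.collections[0].get_facecolor()[0][:3])
print(line_color, point_color)
```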
Opacity
You can also customize the opacity of the line with the alpha key, which ranges from 0 (fully transparent) to 1 (fully opaque):
fig, ax = plt.subplots(figsize=(8, 6))
sns.regplot(
    x=df["sepal_length"],
    y=df["sepal_width"],
    line_kws={
        "color": "r",
        "alpha": 0.4
    },
    ax=ax
)
plt.show()
Line width and style
You can also customize the line width and style with the linewidth and linestyle keys. The example below uses their shorter matplotlib aliases, lw and ls:
fig, ax = plt.subplots(figsize=(8, 6))
sns.regplot(
    x=df["sepal_length"],
    y=df["sepal_width"],
    line_kws={
        "color": "r",
        "alpha": 0.4,
        "lw": 5,
        "ls": "--"
    },
    ax=ax
)
plt.show()
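One way to check that the entries in line_kws reach the regression line is to inspect the matplotlib Line2D object that regplot() adds to the axes. The following is a minimal sketch under a few assumptions: it uses synthetic data instead of the iris CSV, and it sets ci=None to skip the confidence band so that it runs quickly.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Hypothetical synthetic data with a roughly linear trend
rng = np.random.default_rng(0)
x = rng.uniform(4, 8, size=50)
y = 0.5 * x + rng.normal(scale=0.3, size=50)

fig, ax = plt.subplots(figsize=(8, 6))
sns.regplot(
    x=x,
    y=y,
    ci=None,  # no confidence band, to keep the sketch fast
    line_kws={"color": "r", "alpha": 0.4, "lw": 5, "ls": "--"},
    ax=ax,
)

# The regression fit is the only Line2D on the axes
reg_line = ax.lines[0]
print(reg_line.get_linewidth(), reg_line.get_linestyle(), reg_line.get_alpha())
```

The same getters work on the axes from the examples above, which makes them a quick way to debug a style that does not seem to apply.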
Going further
This post explains how to customize the appearance of a regression fit in a scatter plot with seaborn.
You might be interested in a more advanced example on how to visualize linear regression and how to color dots according to a third variable.