Data Visualization using Matplotlib in Python
Matplotlib is a widely used Python library for creating static, animated and interactive data visualizations. It is built on top of NumPy and can handle large datasets to create various types of plots such as line charts, bar charts and scatter plots. These visualizations help us understand data better by presenting it clearly through graphs and charts. In this article, we will see how to create different types of plots and customize them in Matplotlib.
Installing Matplotlib for Data Visualization
To install Matplotlib, we use the pip command. If pip is not installed on your system, please refer to our article Download and install pip Latest Version to set it up.
To install Matplotlib, type the command below in the terminal:
pip install matplotlib
If we are working on a Jupyter Notebook, we can install Matplotlib directly inside a notebook cell by running:
!pip install matplotlib
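After installation, a quick check can confirm that the library imports correctly. This is a minimal sketch; the version string it prints depends on your environment.

```python
import matplotlib

# Print the installed version to confirm that the import works
print(matplotlib.__version__)
```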
Visualizing Data with Pyplot using Matplotlib
Matplotlib provides a module called pyplot which offers a MATLAB-like interface for creating plots and charts. It simplifies the process of generating various types of visualizations by providing a collection of functions that handle common plotting tasks. Let’s explore some examples with simple code to understand how to use it effectively.
1. Line Chart
A line chart is one of the most basic plots and can be created using the plot() function. It represents the relationship between two variables, X and Y, plotted on different axes.
Syntax:
matplotlib.pyplot.plot(x, y, color=None, linestyle='-', marker=None, linewidth=None, markersize=None)
Example:
import matplotlib.pyplot as plt
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
plt.plot(x, y)
plt.title("Line Chart")
plt.ylabel('Y-Axis')
plt.xlabel('X-Axis')
plt.show()
Output:
2. Bar Chart
A bar chart represents categorical data with rectangular bars whose lengths or heights are proportional to the values they represent. Bars can be plotted horizontally or vertically. A bar chart compares different categories and can be created using the bar() method.
In the example below, we will use the Pandas library to load the tips dataset. It records the tips given by customers in a restaurant over two and a half months in the early 1990s and contains 6 columns. You can download the dataset from here.
Syntax:
matplotlib.pyplot.bar(x, height, width=0.8, bottom=None, color=None, edgecolor=None, linewidth=None)
Example:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('/content/tip.csv')
x = data['day']
y = data['total_bill']
plt.bar(x, y)
plt.title("Bar chart")
plt.ylabel('Total Bill')
plt.xlabel('Day')
plt.show()
Output:
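The example above passes every row of the dataset to bar(), so rows that share the same day are drawn on top of each other and only the tallest bar per day stays visible. One possible alternative is to aggregate the values first so that each category gets exactly one bar. The sketch below assumes a small hand-made DataFrame with hypothetical values (not the actual tips dataset) that reuses the 'day' and 'total_bill' column names.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display with plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows that mimic the 'day' and 'total_bill' columns
data = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Sat", "Sat", "Sun"],
    "total_bill": [10.0, 20.0, 15.0, 30.0, 25.0, 40.0],
})

# Sum the bills per day; sort=False keeps the order of first appearance
totals = data.groupby("day", sort=False)["total_bill"].sum()

fig, ax = plt.subplots()
bars = ax.bar(totals.index, totals.values)
ax.set_title("Total Bill per Day")
ax.set_xlabel("Day")
ax.set_ylabel("Total Bill")
```

Whether to sum, average or count per category depends on the question being asked; sum() is used here only as an illustration.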
3. Histogram
A histogram represents data grouped into ranges. It is a type of bar plot where the X-axis represents the bin ranges and the Y-axis gives the frequency of values in each bin. The hist() function computes and draws the histogram of x.
Syntax:
matplotlib.pyplot.hist(x, bins=None, range=None, density=False, color=None, edgecolor=None, alpha=None)
Example:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('/content/tip.csv')
x = data['total_bill']
plt.hist(x)
plt.title("Histogram")
plt.ylabel('Frequency')
plt.xlabel('Total Bill')
plt.show()
Output:
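Besides drawing the bars, hist() also returns the counts and bin edges it computed, which helps when checking how the data was grouped. The sketch below uses a short list of hypothetical bill amounts rather than the tips dataset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display with plt.show()
import matplotlib.pyplot as plt

# Hypothetical bill amounts used only for illustration
bills = [8.5, 10.2, 12.0, 15.5, 16.1, 18.3, 21.0, 24.7, 30.4, 48.2]

fig, ax = plt.subplots()
# hist() returns the per-bin counts, the bin edges and the drawn bars
counts, edges, patches = ax.hist(bills, bins=4)

# Bin i covers the range from edges[i] to edges[i + 1]
for i, count in enumerate(counts):
    print(f"{edges[i]:.2f} to {edges[i + 1]:.2f}: {int(count)} values")
```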

4. Scatter Plot
Scatter plots are used to observe relationships between variables. The scatter() method in the matplotlib library is used to draw a scatter plot.
Syntax:
matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, linewidths=None, edgecolors=None, alpha=None)
Example:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('/content/tip.csv')
x = data['day']
y = data['total_bill']
plt.scatter(x, y)
plt.title("Scatter Plot")
plt.ylabel('Total Bill')
plt.xlabel('Day')
plt.show()
Output:

5. Pie Chart
A pie chart is a circular chart used to display a single series of data. The area of each slice represents that part's percentage of the whole. The slices are called wedges. A pie chart can be created using the pie() method.
Syntax:
matplotlib.pyplot.pie(data, explode=None, labels=None, colors=None, autopct=None, shadow=False)
Example:
import matplotlib.pyplot as plt
# Labels and values for each wedge
cars = ['AUDI', 'BMW', 'FORD', 'TESLA', 'JAGUAR']
data = [23, 10, 35, 15, 12]
plt.pie(data, labels=cars)
plt.title("Pie Chart")
plt.show()
Output:

6. Box Plot
A box plot, also known as a box-and-whisker plot, is a standardized way of displaying the distribution of data based on a five-number summary: minimum, first quartile (Q1), median (Q2), third quartile (Q3) and maximum. It can also show outliers.
Syntax:
matplotlib.pyplot.boxplot(x, notch=False, vert=True, patch_artist=False, showmeans=False, showcaps=True, showbox=True)
Example:
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(10)
data = [np.random.normal(0, std, 100) for std in range(1, 4)]
plt.boxplot(data, vert=True, patch_artist=True,
            boxprops=dict(facecolor='skyblue'),
            medianprops=dict(color='red'))
plt.xlabel('Data Set')
plt.ylabel('Values')
plt.title('Box Plot')
plt.show()
Output:

The box shows the interquartile range (IQR), and the line inside the box shows the median. The "whiskers" extend to the lowest and highest data values that lie within 1.5 * IQR of the first and third quartiles. Any points outside this range are considered outliers and are plotted as individual points.
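The quantities a box plot displays can also be computed by hand. The following sketch, which uses randomly generated data, calculates the quartiles, the IQR, the whisker ends and the outliers with NumPy and then compares them with matplotlib.cbook.boxplot_stats(), the helper that computes box plot statistics. The variable names are illustrative only.

```python
import numpy as np
from matplotlib import cbook

np.random.seed(10)
values = np.random.normal(0, 1, 100)

# Quartiles and interquartile range
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Limits at 1.5 * IQR beyond the first and third quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points that still lie inside the limits
inside = values[(values >= lower_fence) & (values <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Everything outside the limits is treated as an outlier
outliers = values[(values < lower_fence) | (values > upper_fence)]

# Cross-check against the statistics computed by Matplotlib
stats = cbook.boxplot_stats(values, whis=1.5)[0]
print(q1, median, q3, whisker_low, whisker_high, len(outliers))
```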
7. Heatmap
A heatmap represents data in matrix form, where individual values are represented as colors. Heatmaps are useful for visualizing the magnitude of multiple features on a two-dimensional surface and for identifying patterns, correlations and concentrations.
Syntax:
matplotlib.pyplot.imshow(X, cmap=None, interpolation=None, aspect=None)
Example:
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(0)
data = np.random.rand(10, 10)
plt.imshow(data, cmap='viridis', interpolation='nearest')
plt.colorbar()
plt.xlabel('X-axis Label')
plt.ylabel('Y-axis Label')
plt.title('Heatmap')
plt.show()
Output:

The color bar on the side provides a scale for interpreting the colors. With the viridis colormap used here, darker colors represent lower values and lighter colors represent higher values. This type of plot is used in fields like data analysis, bioinformatics and finance to visualize data correlations and distributions across a matrix.
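A common use of a heatmap is to display a correlation matrix. The sketch below builds three hypothetical features (the names a, b and c are assumptions for illustration), computes their correlations with NumPy and draws the matrix with imshow(), writing each coefficient inside its cell. The coolwarm colormap is one possible choice, not a requirement.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display with plt.show()
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)
# Hypothetical features: b is derived from a, c is independent noise
a = np.random.rand(50)
b = 2 * a + 0.1 * np.random.rand(50)
c = np.random.rand(50)
names = ["a", "b", "c"]

# 3 x 3 matrix of pairwise correlation coefficients
corr = np.corrcoef([a, b, c])

fig, ax = plt.subplots()
im = ax.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
fig.colorbar(im, ax=ax)

# Label rows and columns with the feature names
ax.set_xticks(list(range(len(names))))
ax.set_xticklabels(names)
ax.set_yticks(list(range(len(names))))
ax.set_yticklabels(names)

# Write each coefficient inside its cell
for i in range(len(names)):
    for j in range(len(names)):
        ax.text(j, i, f"{corr[i, j]:.2f}", ha="center", va="center")
ax.set_title("Correlation Heatmap")
```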
How to Customize Matplotlib Visualizations?
Matplotlib offers many ways to customize and style our plots. We can change colors, add labels, adjust styles and much more. By applying these customization techniques to basic plots, we can make our visualizations clearer and more informative. Let’s look at some ways to customize plots:
1. Customizing Line Chart
We can customize line charts using these properties:
- Color: Change the color of the line
- Linewidth: Adjust the width of the line
- Marker: Change the style of plotted points
- Markersize: Change the size of the markers
- Linestyle: Define the style of the line like solid, dashed, etc.
Example:
import matplotlib.pyplot as plt
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
plt.plot(x, y, color='green', linewidth=3, marker='o',
         markersize=15, linestyle='--')
plt.title("Customizing Line Chart")
plt.ylabel('Y-Axis')
plt.xlabel('X-Axis')
plt.show()
Output:
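The color, marker and line style can also be given together as a short format string. The sketch below is one way to write a similar customization; 'go--' stands for green ('g'), circle markers ('o') and a dashed line ('--').

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display with plt.show()
import matplotlib.pyplot as plt

x = [10, 20, 30, 40]
y = [20, 25, 35, 55]

fig, ax = plt.subplots()
# Format string: color 'g', marker 'o', linestyle '--'
(line,) = ax.plot(x, y, "go--", linewidth=3, markersize=15)
ax.set_title("Customizing Line Chart")
ax.set_xlabel("X-Axis")
ax.set_ylabel("Y-Axis")
```

Keyword arguments are usually easier to read for longer customizations, while the format string is convenient for quick plots.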

2. Customizing Bar Chart
Bar charts can be made more informative and visually appealing by customizing:
- Color: Fill color of the bars
- Edgecolor: Color of the bar edges
- Linewidth: Thickness of the edges
- Width: Width of each bar
Example:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('/content/tip.csv')
x = data['day']
y = data['total_bill']
plt.bar(x, y, color='green', edgecolor='blue',
        linewidth=2)
plt.title("Customizing Bar Chart")
plt.ylabel('Total Bill')
plt.xlabel('Day')
plt.show()
Output:

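Because the dataset contains many rows for each day, bar() draws one bar per row and overlays them at the same X position. The horizontal lines inside each bar are the edges of these overlaid bars, each at the Y-axis value of one row.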
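The width property from the list above is not used in the example. As a minimal sketch with hypothetical categories and values, it can be set like this to make the bars narrower than the default of 0.8:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display with plt.show()
import matplotlib.pyplot as plt

# Hypothetical categories and values
categories = ["A", "B", "C"]
values = [5, 9, 3]

fig, ax = plt.subplots()
# width=0.5 leaves more space between neighbouring bars
bars = ax.bar(categories, values, width=0.5, color="green",
              edgecolor="blue", linewidth=2)
ax.set_title("Customizing Bar Width")
```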
3. Customizing Histogram Plot
To make histogram plots more effective we can apply various customizations:
- Bins: Number of groups (bins) to divide data into
- Color: Bar fill color
- Edgecolor: Bar edge color
- Linestyle: Style of the edges like solid, dashed, etc.
- Alpha: Transparency level (0 = transparent, 1 = opaque)
Example:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('/content/tip.csv')
x = data['total_bill']
plt.hist(x, bins=25, color='green', edgecolor='blue',
         linestyle='--', alpha=0.5)
plt.title("Customizing Histogram Plot")
plt.ylabel('Frequency')
plt.xlabel('Total Bill')
plt.show()
Output:

4. Customizing Scatter Plot
Scatter plots can be enhanced with:
- s: Marker size (single value or array)
- c: Color of markers or sequence of colors
- Marker: Marker style like circle, diamond, etc.
- Linewidths: Width of marker borders
- Edgecolors: Color of marker borders
- Alpha: Blending value, between 0 (transparent) and 1 (opaque)
Example:
import matplotlib.pyplot as plt
import pandas as pd
data = pd.read_csv('/content/tip.csv')
x = data['day']
y = data['total_bill']
plt.scatter(x, y, c=data['size'], s=data['total_bill'],
            marker='D', alpha=0.5)
plt.title("Customizing Scatter Plot")
plt.ylabel('Total Bill')
plt.xlabel('Day')
plt.show()
Output:
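When c is given as a sequence of numbers, the values are mapped to colors through a colormap, and a color bar can explain that mapping. The sketch below uses hypothetical bill, tip and party-size values; the variable names are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display with plt.show()
import matplotlib.pyplot as plt

# Hypothetical values used only for illustration
total_bill = [10.3, 16.9, 21.0, 23.7, 24.6, 48.3]
tip = [1.7, 1.0, 3.5, 3.3, 3.6, 9.0]
party_size = [2, 2, 3, 2, 4, 6]

fig, ax = plt.subplots()
# Numeric c values are mapped to colors through the colormap
points = ax.scatter(total_bill, tip, c=party_size, cmap="viridis",
                    s=80, edgecolors="black", linewidths=1, alpha=0.7)

# The color bar shows which color corresponds to which party size
colorbar = fig.colorbar(points, ax=ax)
colorbar.set_label("Party size")
ax.set_xlabel("Total Bill")
ax.set_ylabel("Tip")
```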

5. Customizing Pie Chart
To make our pie charts more effective and visually appealing, we can consider the following customizations:
- Explode: Offset wedges from the center of the pie
- Autopct: Label each wedge with its numerical percentage
- Colors: Colors of the slices
- Shadow: Create a shadow effect
Example:
import matplotlib.pyplot as plt
cars = ['AUDI', 'BMW', 'FORD', 'TESLA', 'JAGUAR']
data = [23, 13, 35, 15, 12]
# Offset the first two wedges from the center
explode = [0.1, 0.5, 0, 0, 0]
colors = ("orange", "cyan", "yellow", "grey", "green")
plt.pie(data, labels=cars, explode=explode, autopct='%1.2f%%',
        colors=colors, shadow=True)
plt.show()
Output:
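The autopct format string is applied to each wedge's share of the total, expressed as a percentage. The sketch below, which reuses the values from the example, computes those shares by hand next to the labels that pie() generates.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display with plt.show()
import matplotlib.pyplot as plt

cars = ["AUDI", "BMW", "FORD", "TESLA", "JAGUAR"]
data = [23, 13, 35, 15, 12]

fig, ax = plt.subplots()
# With autopct set, pie() also returns the percentage labels it drew
wedges, texts, autotexts = ax.pie(data, labels=cars, autopct="%1.1f%%")

# Each wedge's share of the total, computed by hand for comparison
shares = [100 * value / sum(data) for value in data]
for name, share, label in zip(cars, shares, autotexts):
    print(name, round(share, 1), label.get_text())
```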
Matplotlib’s Core Components: Figures and Axes
Before we proceed let’s understand two classes which are important for working with Matplotlib.
1. Figure Class
The Figure class represents the entire canvas or window where all plots are drawn. Think of it as the overall page or frame that can contain one or more plots. We can create a Figure using the figure() function. It controls the size, background color and other properties of the whole drawing area.
Syntax:
matplotlib.figure.Figure(figsize=None, dpi=None, facecolor=None, edgecolor=None, linewidth=0.0, ...)
Example:
import matplotlib.pyplot as plt
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
# Create a figure with a green background and a blue border
fig = plt.figure(figsize=(7, 5), facecolor='g',
                 edgecolor='b', linewidth=7)
# [left, bottom, width, height] as fractions of the figure size
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
ax.plot(x, y)
plt.title("Linear graph", fontsize=25, color="yellow")
plt.ylabel('Y-Axis')
plt.xlabel('X-Axis')
plt.ylim(0, 80)
plt.xticks(x, labels=["one", "two", "three", "four"])
plt.legend(["GFG"])
plt.show()
Output:
2. Axes Class
The Axes class represents the actual plotting area where data is drawn. It is the most basic and flexible unit for creating plots or subplots within a figure. A single figure can contain multiple Axes, but each Axes object belongs to only one figure. We can create an Axes object using the axes() function.
Syntax:
axes([left, bottom, width, height])
Like pyplot, the Axes class provides methods to customize our plot, including:
- ax.set_title(): Add a title to the plot
- ax.set_xlabel(), ax.set_ylabel(): Add labels to the X and Y axes
- ax.set_xlim(), ax.set_ylim(): Set limits for the axes
- ax.set_xticklabels(), ax.set_yticklabels(): Customize tick labels
- ax.legend(): Add a legend to describe plot elements
Example:
import matplotlib.pyplot as plt
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
fig = plt.figure(figsize=(5, 4))
# [left, bottom, width, height] as fractions of the figure size
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
# Draw two lines on the same Axes
line1 = ax.plot(x, y)
line2 = ax.plot(y, x)
ax.set_title("Linear Graph")
ax.set_xlabel("X-Axis")
ax.set_ylabel("Y-Axis")
ax.legend(labels=('line 1', 'line 2'))
plt.show()
Output:
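The Axes methods listed above are often combined with plt.subplots(), which creates a Figure and an Axes in a single call. The following is a minimal sketch of that style; the axis limits are arbitrary values chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display with plt.show()
import matplotlib.pyplot as plt

x = [10, 20, 30, 40]
y = [20, 25, 35, 55]

# Create the Figure and a single Axes together
fig, ax = plt.subplots(figsize=(5, 4))

# Labels given here are picked up by ax.legend()
ax.plot(x, y, label="line 1")
ax.plot(y, x, label="line 2")

ax.set_title("Linear Graph")
ax.set_xlabel("X-Axis")
ax.set_ylabel("Y-Axis")
ax.set_xlim(0, 60)
ax.set_ylim(0, 60)
ax.legend()
```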
Advanced Techniques for Visualizing Subplots
We have learned how to add basic elements to a graph to show more information. One way to show several datasets is to call the plot function repeatedly with different sets of values, as shown in the example above. Now let’s see how to draw multiple graphs in one figure using Matplotlib functions and how to create subplots.
Method 1: Using add_axes()
The add_axes() method allows us to manually add axes to a figure in Matplotlib. It takes a list of four values, [left, bottom, width, height], to specify the position and size of the axes as fractions of the figure width and height.
Example:
import matplotlib.pyplot as plt
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
fig = plt.figure(figsize=(10, 4))
# Two side-by-side axes, each given as [left, bottom, width, height]
ax1 = fig.add_axes([0.05, 0.1, 0.4, 0.8])
ax2 = fig.add_axes([0.55, 0.1, 0.4, 0.8])
ax1.plot(x, y)
ax2.plot(y, x)
plt.show()
Output:
Method 2: Using subplot()
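The subplot() method adds a plot at a specified grid position within the current figure. It takes three arguments: the number of rows, the number of columns and the plot index. When all three are single digits, they can be combined into one integer, so subplot(121) is the same as subplot(1, 2, 1).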
Example:
import matplotlib.pyplot as plt
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
plt.figure()
plt.subplot(121)
plt.plot(x, y)
plt.subplot(122)
plt.plot(y, x)
plt.show()
Output:
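A related function, plt.subplots(), creates the whole grid at once and returns the Axes objects in an array. The sketch below draws the same two plots as the example above on a grid of one row and two columns; the figure size is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); remove to display with plt.show()
import matplotlib.pyplot as plt

x = [10, 20, 30, 40]
y = [20, 25, 35, 55]

# One row and two columns; axes is an array holding both Axes objects
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))
axes[0].plot(x, y)
axes[0].set_title("y against x")
axes[1].plot(y, x)
axes[1].set_title("x against y")

# Adjust the spacing so titles and tick labels do not overlap
fig.tight_layout()
```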
Method 3: Using subplot2grid()
The subplot2grid() function creates an Axes object at a specified location inside a grid and also allows the Axes object to span multiple rows or columns.
Example:
import matplotlib.pyplot as plt
x = [10, 20, 30, 40]
y = [20, 25, 35, 55]
# Grid of 7 rows and 1 column; each Axes spans 2 rows
axes1 = plt.subplot2grid(
    (7, 1), (0, 0), rowspan=2, colspan=1)
axes2 = plt.subplot2grid(
    (7, 1), (2, 0), rowspan=2, colspan=1)
axes1.plot(x, y)
axes2.plot(y, x)
plt.show()
Output:
Saving Plots Using savefig()
When we create plots using Matplotlib, we often want to save them as image files so that we can use them later in reports or presentations, or share them with others. Matplotlib provides the savefig() method to save the current plot to a file on our computer. We can save a plot in different formats like .png, .jpg, .pdf, .svg and more just by changing the file extension.
Example:
import matplotlib.pyplot as plt
year = ['2010', '2002', '2004', '2006', '2008']
production = [25, 15, 35, 30, 10]
plt.bar(year, production)
plt.savefig("output.jpg")
# Without a file extension, the default format (PNG) is used
plt.savefig("output1", facecolor='y', bbox_inches="tight",
            pad_inches=0.3, transparent=True)
Output:
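To illustrate how the file extension selects the format, the sketch below saves one figure as both PNG and PDF into a temporary directory and checks that the files were written. The file names, the dpi value and the data are assumptions chosen for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend (assumption); saving works without a display
import matplotlib.pyplot as plt

# Hypothetical production figures
year = ["2002", "2004", "2006", "2008", "2010"]
production = [15, 35, 30, 10, 25]

fig, ax = plt.subplots()
ax.bar(year, production)

# Write into a temporary directory so no existing files are touched
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "production.png")
pdf_path = os.path.join(out_dir, "production.pdf")

# dpi sets the resolution of raster formats such as PNG
fig.savefig(png_path, dpi=150, bbox_inches="tight")
# The .pdf extension selects a vector format
fig.savefig(pdf_path, bbox_inches="tight")

print(os.path.getsize(png_path) > 0, os.path.getsize(pdf_path) > 0)
```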
With these Matplotlib functions and techniques we can create clear, customized and insightful visualizations that bring our data to life.