Data Visualization with Pandas
Pandas has been aiding us so far in the phase of Data Preprocessing. In one instance, though, while creating Histograms, we also used another module from Pandas: plotting.
We've purposefully avoided it so far, because introducing it earlier would have raised more questions than it answered. Namely, Pandas and Matplotlib have been such a common and ubiquitous duo that Pandas started integrating Matplotlib's functionality. By default, it relies heavily on Matplotlib to do the actual plotting, and you'll find many Matplotlib functions wrapped in its source code. Alternatively, you can use other backends for plotting, such as Plotly and Bokeh.
However, Pandas also introduces us to a couple of plots that aren't a part of Matplotlib's standard plot types, such as KDEs, Andrews Curves, Bootstrap Plots and Scatter Matrices.
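As a quick orientation, here is a minimal sketch of three of these plots drawn through the pandas.plotting module. The DataFrame, its column names (a, b, c, group) and the random data are made up purely for illustration, and the Agg backend is selected only so the sketch runs without a display. KDE plots are left out here because they additionally require SciPy.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import andrews_curves, bootstrap_plot, scatter_matrix

# Hypothetical data: three numeric features and one categorical column
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(60, 3)), columns=["a", "b", "c"])
df["group"] = np.repeat(["x", "y", "z"], 20)

# Scatter Matrix: pairwise scatter plots, with histograms on the diagonal
axes_grid = scatter_matrix(df[["a", "b", "c"]], diagonal="hist")

# Andrews Curves: one curve per row, colored by the class column.
# A fresh Axes is passed in so the curves don't land on the scatter matrix figure.
fig, ax = plt.subplots()
curves_ax = andrews_curves(df, "group", ax=ax)

# Bootstrap Plot: resamples a Series and plots mean, median and midrange
boot_fig = bootstrap_plot(df["a"], size=30, samples=100)

plt.close("all")  # release the figures once we're done with them
```

In an interactive session, you'd typically call plt.show() instead of closing the figures.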
The plot() function of a Pandas DataFrame uses the backend specified by plotting.backend and, depending on the kind argument, generates a plot using the given library. Since a lot of these plot types overlap with Matplotlib's, there's no point in covering types such as line, bar, hist and scatter. They'll produce much the same plots, with much the same code, as we've been writing so far with Matplotlib.
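To make the overlap concrete, here is a small sketch of the plot() function with a few values of the kind argument. The DataFrame and its x and y columns are hypothetical, and the sketch assumes the plotting backend hasn't been changed from its default.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2, 4, 3, 5]})

# The active backend; "matplotlib" unless it has been changed
backend = pd.get_option("plotting.backend")

# The kind argument selects the plot type; each call returns a Matplotlib Axes
line_ax = df.plot(kind="line", x="x", y="y")
scatter_ax = df.plot(kind="scatter", x="x", y="y")
hist_ax = df.plot(kind="hist", y="y", bins=4)

# Each kind is also available as a method on the plot accessor
bar_ax = df.plot.bar(x="x", y="y")

plt.close("all")  # release the figures once we're done with them
```

Switching to another library would be a matter of setting the same option, for example pd.set_option("plotting.backend", "plotly"), provided that library is installed.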
We'll only briefly take a look at the plot() function, since the underlying mechanism has already been explored. Instead, let's focus on some of the plots that we can't readily do with Matplotlib alone.