
Data can be represented in many different ways. We quantify observable phenomena to generate data which can then be represented through mathematical formulas, music, text, visualizations, etc.
Python has become one of the preferred languages in the world of Data Science over the years. Its simplicity and ease of use lowered the barrier to entry for people from other professions and opened up collaboration between many research groups. Even without deep knowledge of Python, many scientists are now able to visualize their work and share it with their colleagues and peers, and this new wave of users necessitated a new wave of tools. The easier it is to get started - the larger the impact on the community.
Many libraries have been created to make working with data easier, and as of writing this course, there's no shortage of powerful libraries that allow even the unpracticed folk to step in and try their hand at extracting knowledge from data. Data Visualization is one of the techniques used in all aspects of research and science. It's an interdisciplinary field that represents data through various graphical elements, such as lines and markers. Though - Data Visualization is much more than graphs and charts you've learned in elementary school. You can plot and explore relationships between data, their distributions, summaries, and put things into perspective. Data Visualization has become much more - it's become a storytelling substrate. Each plot you make can tell a story, and you're the artist shaping it.
An artist can also choose to put things out of context - and it's easy to lie with data. As a matter of fact, many people do - however, the statement is not as cynical as it may sound. Many people lie to themselves, not others. Without proper knowledge of how to approach problems, you can &quot;hallucinate&quot; relationships where they aren't present, and infer causation from correlation, which more often than not doesn't hold water.
Data Visualization has a long list of applications. It's a technique used within Data Science, Data Analysis and Descriptive Statistics, and it sits at the core of Exploratory Data Analysis, which is at the heart of practically all research. Whether you're a biologist, molecular physicist, machine learning engineer, software engineer, psychologist or philosopher - your findings are backed by data, and data in the form of numbers is hard for humans to interpret. We use this crucial step of interpreting our own results as input for our Bayesian best guess of the world and reality around us, so it's essential that we produce clear, concise, impactful and interpretable results, lest we end up fooling ourselves.
<h3 id="howthecourseisformatted">How the Course Is Formatted</h3>
This course will cover 9 different libraries used in the Python ecosystem for Data Visualization, going over the most relevant and unique attributes and features of each. This course will also cover the different types of data you can visualize in Python, in addition to common visualization techniques, tools, and plot types.
Note: The course assumes prior knowledge of Python's syntax and basic use of the language. We won't assume prior knowledge of the additional libraries used in the course, such as Pandas and NumPy. Chapter 3 is dedicated to a brief introduction to Pandas and how it can be used to load, manipulate and visualize data. NumPy will be used extensively throughout the course for its helper methods, to generate ranges and dummy values, as well as to calculate aggregate statistics. In most cases, these methods are fairly self-explanatory and a dedicated section on NumPy isn't required to get the hang of them. Whenever new methods are used - a short description in an additional paragraph will be added to explain their usage.
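As a rough illustration of the kind of NumPy helpers meant here, the sketch below generates ranges, some dummy values, and a few aggregate statistics. The variable names, the seed and the sample values are arbitrary choices for this example, not something taken from the later lessons.

```python
import numpy as np

# Evenly spaced ranges, handy as x-axis values
x = np.arange(0, 10, 2)          # 0, 2, 4, 6, 8 (the stop value is excluded)
grid = np.linspace(0.0, 1.0, 5)  # 5 points from 0 to 1, both endpoints included

# Reproducible dummy values from a seeded random generator
rng = np.random.default_rng(seed=42)
noise = rng.normal(loc=0.0, scale=1.0, size=100)

# Aggregate statistics over a small array
values = np.array([1.0, 2.0, 3.0, 4.0])
mean = values.mean()
median = np.median(values)
spread = values.std()  # population standard deviation (ddof=0)
```

Seeding the generator is optional, but it keeps dummy data identical between runs, which makes plots easier to compare.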
Each lesson in the course will start with an introduction to the library covered in the lesson, followed by its internal representation and terminology. The strengths and weaknesses of each library as well as some of the baseline (common) and unique plots will be covered, but we won't be diving into every plot of every library, as that would be extremely repetitive and tiresome. Instead, we'll be building a holistic intuition for each library.
For most libraries in the course, we'll end the lesson with a hands-on, end-to-end project featuring a new dataset or domain, and this is the focus of the course. Diversity is one of the most important aspects of practice in the field, and each dataset needs to be preprocessed in a different manner. Additionally, someone who does Data Visualization has to be able to understand a wide variety of topics, at least on a surface level. You can't infer from data if you don't understand what it means, and you're much more liable to interpret something wrong and arrive at false conclusions if you're not familiar with the domain you're dealing with.
Throughout the course - we'll be using a baseline, simple dataset or a couple of simple datasets to start out with the libraries, followed by a new dataset or multiple datasets in the hands-on section. We'll be exploring mathematical structures and generator functions, EEG (electroencephalogram) brainwave data, spatial data, etc.
Some of these domains use completely different data formats - for instance, GeoJSON is oftentimes used for spatial data, EEG uses several data formats (we'll be working with CSV), genomic data can be represented through various formats such as FASTA, bedGraph, bowtie, etc.
Many formats can be boiled down to the good old CSV format most people are acquainted with, and when possible - we'll be falling back to it for simplicity's sake, as this is the most common one you'll be using in your day-to-day work.
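To make the idea of boiling a format down to CSV a bit more concrete, here is a minimal sketch that flattens point features from a GeoJSON document into CSV rows, using only the standard library. The GeoJSON content, the `points_to_csv` helper and the column names are made up for this illustration, and real datasets with other geometry types would need more handling.

```python
import csv
import io
import json

# A tiny, made-up GeoJSON document with two point features
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [20.46, 44.82]},
     "properties": {"name": "A"}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [2.35, 48.86]},
     "properties": {"name": "B"}}
  ]
}
"""


def points_to_csv(text):
    """Flatten GeoJSON point features into CSV text (name, lon, lat)."""
    collection = json.loads(text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "lon", "lat"])
    for feature in collection["features"]:
        # GeoJSON stores point coordinates as [longitude, latitude]
        lon, lat = feature["geometry"]["coordinates"]
        writer.writerow([feature["properties"]["name"], lon, lat])
    return buffer.getvalue()


csv_text = points_to_csv(geojson_text)
```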
Many of these datasets belong to completely different families of data, and you need some exposition and context to understand them properly. The course will have an introductory section on each domain right before we dive into the project. This should be enough for you to grasp and understand the visualizations and analysis done in the practical sections. Don't forget, doing your own research of the fields you're visualizing data from is part of the Data Visualization process.
The course is written by two authors - David Landup, who authored the hands-on, end-to-end sections and edited the rest of the course and Daniel Nelson, who authored the introductions to the libraries.
<h4 id="landscapeofpythonslibraries">Landscape of Python's Libraries</h4>
Before delving too deeply into the libraries themselves, it would be helpful to gain an intuition of how the landscape of Python’s visualization libraries breaks down. To put that another way, it’s helpful to understand how the different Python libraries are designed and related to one another. Understanding how the different libraries operate will help you choose the best library for your visualization project.
There are a number of different data visualization libraries and modules compatible with Python. Most of the Python data visualization libraries can be placed into one of four groups, separated based on their origins and focus.
The groups are:
<ul>
<li>Matplotlib-based libraries</li>
<li>JavaScript libraries</li>
<li>JSON libraries</li>
<li>WebGL libraries</li>
</ul>
<h4 id="matplotlibbasedlibraries">Matplotlib-based Libraries</h4>
The first major group of libraries is those based on Matplotlib. Matplotlib is one of the oldest Python data visualization libraries, and thanks to its wealth of features and ease of use it is still one of the most widely used ones. Matplotlib was first released back in 2003 and has been continuously updated since.
Matplotlib contains a large number of visualization tools, plot types, and output types. It produces mainly static visualizations. While the library does have some 3D visualization options, these options are far more limited than those possessed by other libraries like Plotly and VisPy. It is also limited in the field of interactive plots, unlike Bokeh, which we'll cover in a later lesson.
Because of Matplotlib’s success as a visualization library, various other libraries have expanded on its core features over the years. These libraries are Matplotlib-based, using Matplotlib as an engine for their own visualization functions.
The libraries based upon Matplotlib add new functionality to the library by specializing in the rendering of certain data types or domains, adding new types of plots, or creating new high-level APIs for Matplotlib’s functions.
They're used alongside Matplotlib, not instead, to expand its styling and plotting capabilities.
<h4 id="javascriptbasedlibraries">JavaScript-based Libraries</h4>
There are a number of JavaScript-based libraries for Python that specialize in data visualization. The adoption of HTML5 by web browsers enabled interactivity for graphs and visualizations, instead of only static 2D plots. Styling HTML pages with CSS can net beautiful visualizations.
These libraries wrap JavaScript/HTML5 functions and tools in Python, allowing the user to create new interactive plots. The libraries provide high-level APIs for the JavaScript functions, and the JavaScript primitives can often be edited to create new types of plots, all from within Python.
<h4 id="jsonbasedlibraries">JSON-based Libraries</h4>
JavaScript Object Notation (JSON) is a data interchange format, containing data in a simple structured format that can be interpreted not only by JavaScript libraries but by almost any language. It's also human-readable.
There are various Python libraries designed to interpret and display JSON data. With JSON-based libraries, the data is fully contained in a JSON data file. This makes it possible to integrate plots with various visualization tools and techniques.
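As a small sketch of what "data fully contained in JSON" can look like, the snippet below builds a plot description with inline data as a plain dictionary and round-trips it through JSON. The keys are loosely modeled on the Vega-Lite style that we'll meet with Altair, but treat the exact structure here as an illustrative assumption rather than a complete specification.

```python
import json

# A plot description with its data embedded inline (illustrative structure)
spec = {
    "data": {"values": [
        {"category": "A", "count": 3},
        {"category": "B", "count": 7},
    ]},
    "mark": "bar",
    "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "count", "type": "quantitative"},
    },
}

# Serialize to text that any language or tool can read...
serialized = json.dumps(spec, indent=2)

# ...and parse it back into Python objects without losing anything
restored = json.loads(serialized)
```

Because the description is just text, it can be saved to a file, sent over the network, or handed to a renderer written in another language.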
<h4 id="webglbasedlibraries">WebGL-based Libraries</h4>
The WebGL standard is a graphics standard that enables interactivity for 3D plots. Much like how HTML5 made interactivity for 2D plots possible (and plotting libraries were developed as a result), the WebGL standard gave rise to 3D interactive plotting libraries.
Python has several plotting libraries that are focused on the development of WebGL plots. Most of these 3D plotting libraries allow for easy integration and sharing via Jupyter notebooks and remote manipulation through the web.
<h4 id="otherlibraries">Other Libraries</h4>
There are also a variety of other Python plotting libraries, many of which create Python wrappers for other languages and visualization platforms.
<h3 id="popularpythondatavisualizationlibraries">Popular Python Data Visualization Libraries</h3>
This course will cover the most popular data visualization libraries for Python, which fall into the categories defined above: the four main groups, plus the catch-all of other libraries. The libraries covered in this course are: Matplotlib, Pandas, Seaborn, Bokeh, Plotly, Altair, GGPlot, GeoPandas, and VisPy.
You’ll need to know what these different libraries are capable of, in order to choose the proper library for your project’s needs. Let’s take a quick look at these different libraries, some of their unique distinctive features, and what they're used for.
<h4 id="matplotlibbasedpythonlibraries">Matplotlib-based Python Libraries</h4>
<h5 id="matplotlib">Matplotlib</h5>
As already stated above, Matplotlib is one of the most common and widely used visualization libraries, used to create static 2D plots, although it does have some support for 3D visualizations. Matplotlib is structured in a fashion that allows the user to create and customize multiple plots for a single image, achieved through the creation of subplots. It's intended to make producing both simple and advanced plots straightforward and intuitive and has support for both static and interactive visualization modes. Though, it's relatively limited when it comes to interactive visualization.
Matplotlib is able to generate numerous different plot types and styles, and it can work along with general-purpose Python GUI libraries like Qt and Tkinter.
<h5 id="pandas">Pandas</h5>
Pandas is a data analysis and manipulation library. While Pandas does come with some visualization and plotting functions, the main reason Pandas is so popular and widely used is that the library makes manipulating data simple and straightforward. Pandas can read data in many different formats, and it creates a Python data object filled with rows and columns, called a <code>DataFrame</code>.
These rows and columns are easy to manipulate through built-in functions that let the user merge, split, view, filter, sort, and otherwise alter the data within them, all done with relatively simple commands.
For these reasons, Pandas is frequently used alongside the other data visualization libraries - to prepare the data in question for analysis.
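The kind of manipulation described above could look roughly like the following sketch. The two small tables and their column names are invented for the example.

```python
import pandas as pd

# Made-up sample data for illustration
sales = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo", "Cairo"],
    "units": [12, 7, 5, 9],
})
regions = pd.DataFrame({
    "city": ["Oslo", "Lima", "Cairo"],
    "region": ["Europe", "South America", "Africa"],
})

# Filter rows with a boolean mask, then sort the result
large = sales[sales["units"] > 6].sort_values("units", ascending=False)

# Merge two tables on a shared column
merged = sales.merge(regions, on="city", how="left")

# Summarize per group, ready to be handed to a plotting library
totals = merged.groupby("region")["units"].sum()
```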
<h5 id="seaborn">Seaborn</h5>
Seaborn is a visualization library that builds on Matplotlib’s basic functions. Seaborn is intended to enable the easy creation of informative and attractive visualizations. Because it sits on top of Matplotlib, it doesn't replace it, but it gives the user convenient control over their plots and lets them do things that would take considerably more code in plain Matplotlib.
This includes the ability to easily produce less common types of visualizations such as heatmaps, violin plots, and joint plots, amongst other plots. Seaborn's goal is to abstract away many of Matplotlib's low-level functions and methods, letting the user create visually impressive plots with less code compared to Matplotlib.
Seaborn gives you more customization options for your plots as well, allowing you to use preset themes or customize the plots to your liking. It also enables efficient handling of dataframes and time-series data.
<h5 id="geopandas">GeoPandas</h5>
GeoPandas is an extension of the Pandas library designed to make it easier to work with geospatial/geographical data. GeoPandas enables the types of data manipulation possible in Pandas on geometric data, letting you easily carry out spatial operations that would typically require a spatial database.
GeoPandas allows you to specify the shape of graph regions using special shapefiles, and to clip points and lines to the boundary mask.
<h4 id="javascriptbasedlibraries">JavaScript-based Libraries</h4>
<h5 id="bokeh">Bokeh</h5>
Bokeh is a visualization library that allows the user to create interactive visualizations that can be displayed in Jupyter notebooks and web browsers. Bokeh is focused on the production of highly interactive visualizations, unlike Matplotlib which has just a handful of interactive options. Visualizations in Bokeh are based around objects called &quot;glyphs&quot;, which you can render in numerous different shapes and styles.
Bokeh lets you choose different tools to include alongside your visualization. These tools let you select groups of data points, hover over points to see more information about them, zoom in on multiple graphs at once, and more.
It also allows you to construct numerous different plots with various styles, all the while maintaining high performance across large datasets. Bokeh supports HTML formatting and exporting and has native Pandas integration, allowing you to edit dataframes and the resulting visualizations easily.
With Bokeh, it's easy to create a well-styled interactive HTML file which you can then embed into a page or presentation.
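A minimal sketch of this workflow, assuming a reasonably recent Bokeh release, might look like the following. The data, the title and the chosen tools are arbitrary, and the exact tool names and glyph methods available can differ between Bokeh versions.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

x = [1, 2, 3, 4, 5]
y = [2, 5, 3, 8, 6]

# The tools string picks which interactive tools appear next to the plot
plot = figure(title="Glyph sketch", tools="pan,wheel_zoom,box_select,hover,reset")

# Each call below adds one glyph renderer to the figure
plot.line(x, y, line_width=2)
plot.scatter(x, y, size=10, marker="circle")

# Render a standalone HTML document as a string, loading BokehJS from a CDN
html = file_html(plot, CDN, "Glyph sketch")
```

The resulting string can be written to a file and opened in a browser, or embedded in a larger page.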
<h5 id="plotly">Plotly</h5>
Like Bokeh, Plotly is designed specifically with the purpose of creating interactive plots. Plotly supports numerous use cases like statistical, geographic, scientific, and even 3D datasets. Similar to Bokeh's use of glyphs, the fundamental unit of a Plotly plot is the &quot;trace&quot;. You can combine multiple traces and display them all on a single figure.
Plotly for Python is based on JavaScript's Plotly library and it can be used to create more than 40 different types of plots and charts, each of which can be displayed in a Jupyter notebook or saved in an HTML file. Plotly allows the user to save their plots in the cloud or as a file on their device.
Plotly plots are interactive by default, and they can be created with JSON charts as well as easily embedded in web pages. You can also export Plotly graphs in a variety of different formats, such as PNG, SVG, PDF, and HTML to your local machine.
<h4 id="jsonbasedlibraries">JSON-based Libraries</h4>
<h5 id="altair">Altair</h5>
Altair is a Python library designed explicitly for the visualization of statistical data. Altair is based on the Vega and Vega-Lite standards, meaning that you use visualization grammar (specific phrases) that allow you to specify the level of interactivity and style you want your graph to have. Vega specifications are used to define how interactive visualizations are created in JavaScript Object Notation (JSON). Altair is a declarative library, and all you need to do is declare which kind of graph you'd like to create along with some desired features for it.
With Altair, you can produce effective visualizations with minimal code. You can often create complex plots with just a single line of code. However, Altair does lack some of the more advanced customization features of the other libraries.
Altair is designed to quickly create interactive statistical visualizations that can be integrated with IPython notebooks. Altair also lets you create compound charts comprised of different layers.
<h4 id="webglbased">WebGL-Based</h4>
<h5 id="vispy">VisPy</h5>
VisPy is a 2D and 3D visualization library, created primarily to assist in the visualization of big data. Unlike the other libraries mentioned here, VisPy makes use of Graphics Processing Units (GPUs) to display the visualization of large datasets.
VisPy supports visualizations of scientific and statistical plots featuring millions of data points. It's intended to be scalable, easy to use, and fast. With having both low-level and high-level interfaces, VisPy makes it possible to create visualizations with relatively few lines of code and then edit those visualizations to your needed specifications.
It has OpenGL support, on which it currently bases some of its functionality, though using its lower-level interfaces does require knowledge of the OpenGL Shading Language (GLSL).
<h4 id="other">Other</h4>
<h5 id="ggplot">GGplot</h5>
GGplot is intended to make producing plots simple and efficient, rendering them with minimal code. It uses the “Grammar of Graphics” standard, borrowed from R. GGplot graphs contain consistent basic elements, which makes graphs uniform and easy to read.
GGplot lets you perform aesthetics mapping, meaning that you can control how variables within your dataset are mapped onto visual properties, defining mappings for different variables and layers of your graph.

David Landup

Dan Nelson

An Introduction To Data Visualization In Python

Now that we've covered the different libraries that we'll explore, let's take a bit of time to discuss some of the plots and visualizations you can create with these libraries. As discussed in the section introducing the different libraries, Python can be used to visualize everything from simple, static graphs and plots to complex, interactive, and even 3D plots.
Covering the variety of plots you can create will help you get a better idea of different ways you can visualize your data and how to choose the right plot type for the job.
We’ll divide the plot types into several categories:
<ul>
<li>Statistical plots</li>
<li>Images</li>
<li>Networks/Graphs</li>
<li>Geographical</li>
<li>3D and Interactive</li>
<li>Grids and Meshes</li>
</ul>
<h3 id="statisticalplots">Statistical Plots</h3>
Statistical plots are, arguably, the most common type of plots you'll come across. These plots are commonly used in statistics to visualize datasets, making comparisons between different features and observing trends in the data.
They are generally simple and really useful for basic descriptive statistics and data analysis. Some of these plots are learned in elementary school and fall under the category of common knowledge, though some, like Scatterplots and Violin Plots might be unfamiliar to you without prior knowledge on the subject.
<h4 id="bargraph">Bar Graph</h4>

Types of Plots

Pandas is one of the most commonly used data science and analysis libraries in Python. The popularity of Pandas comes from the fact that it lets you easily create and edit data structures, making both data visualization and manipulation very straightforward.
Pandas allows the user to make dataframes out of either an entire dataset or subsets of that dataset, and then do things like cut, filter, merge, and otherwise edit those dataframes.
<blockquote>
A Pandas <code>DataFrame</code> is a frame of two-dimensional, size-mutable, potentially heterogeneous tabular data.
</blockquote>
More on that later.
Note that Pandas is more of a data manipulation library than a visualization library. While Pandas does allow you to create some plots with its methods and functions, it relies heavily on Matplotlib.
We will cover Matplotlib in the following lesson, but the preparation of data for visualization is a critical first step - which is a breeze to do with Pandas.
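One way to see Pandas' reliance on Matplotlib is to look at what its plotting method hands back. In the sketch below (with made-up data, and a non-interactive backend selected so that no window is needed), `DataFrame.plot()` returns a regular Matplotlib `Axes` object that can be customized further.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no GUI window is needed

import pandas as pd
from matplotlib.axes import Axes

df = pd.DataFrame({"x": [1, 2, 3], "y": [2, 4, 8]})

# DataFrame.plot draws through Matplotlib and returns its Axes object
ax = df.plot(x="x", y="y", kind="line")

# From here on, it's plain Matplotlib
ax.set_title("Plotted via Pandas, styled via Matplotlib")
is_matplotlib_axes = isinstance(ax, Axes)
```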
Pandas gives you complete control over a dataset, allowing you to select the entire dataset, just a single element in the dataset or anything in-between.
It supports a number of dataset manipulation techniques, like giving you the ability to merge, concatenate, join, and pivot tables. You can iterate through rows of a table and apply transformations to them, like stripping out unnecessary characters or dropping duplicate rows. All of these features make Pandas an excellent tool for data preprocessing.
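A few of these preprocessing steps are sketched below on a deliberately messy, made-up table. The column names and the stray characters are assumptions for the sake of the example.

```python
import pandas as pd

raw = pd.DataFrame({
    "name": [" Ada ", "Bob#", " Ada ", "Cy"],
    "year": [2020, 2020, 2020, 2021],
    "score": [90, 75, 90, 60],
})

# Strip surrounding whitespace and a trailing '#' from a text column
raw["name"] = raw["name"].str.strip().str.rstrip("#")

# Drop rows that are exact duplicates of an earlier row
clean = raw.drop_duplicates().reset_index(drop=True)

# Concatenate with additional rows
extra = pd.DataFrame({"name": ["Ada"], "year": [2021], "score": [95]})
combined = pd.concat([clean, extra], ignore_index=True)

# Pivot into a name-by-year table; missing combinations become NaN
table = combined.pivot(index="name", columns="year", values="score")
```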

Manipulating and Visualizing Data with Pandas

Matplotlib is the most widely used data visualization and plotting library in all of Python. In fact, as we've said before, many of the other libraries in this course utilize attributes of Matplotlib to display the plots they generate.
Much of Matplotlib's popularity comes from the fact that it is highly customizable, with users able to edit almost every aspect of a Matplotlib plot.
Matplotlib plots are comprised of a hierarchy of objects. At the top level of the plot, the <code>Figure</code> is what contains the rest of the plot elements. The intermediate and lower level plot elements are objects and elements like the <code>Axes</code>, <code>Labels</code>, <code>Ticks</code>, and <code>Legends</code>. All of these elements can be tweaked by the user.
In this section, we'll cover the features of Matplotlib, and when you would want to use it. We'll then move on to covering the layout and elements that comprise a Matplotlib plot, demonstrating how to customize these elements. We'll then go over some examples of the visualizations that you can create with Matplotlib.
Finally, we'll explore the Collatz Conjecture and learn how simple number sequences can have profound visualizations.
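For reference, the Collatz rule itself is simple: halve the number if it's even, otherwise triple it and add one, and repeat until reaching 1. A minimal sketch of a sequence generator follows. The function name is our own choice here and not necessarily the one used in the project itself.

```python
def collatz_sequence(n):
    """Return the Collatz sequence starting at n and ending at 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    sequence = [n]
    while n != 1:
        # Halve even numbers; map odd numbers to 3n + 1
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        sequence.append(n)
    return sequence


steps = collatz_sequence(6)  # [6, 3, 10, 5, 16, 8, 4, 2, 1]
```

Plotting the values of such a sequence against their position is one simple way to start visualizing it.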
<h2 id="featuresofmatplotlib">Features of Matplotlib</h2>
One reason for Matplotlib's enduring popularity is the fact that every element of a Matplotlib plot can be customized. Plots in Matplotlib are all based on <code>Figure</code>s. The <code>Figure</code> is the whole window which holds a single plot or even multiple plots.
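As a small sketch of this idea, the snippet below creates one `Figure` holding two plots and tweaks a few of their elements. The data and titles are made up, and the figure is rendered to an in-memory PNG so the example doesn't depend on a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen; no GUI window required

import matplotlib.pyplot as plt

# One Figure holding two Axes (plots) side by side
fig, (left, right) = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))

# Customize the first plot: line, title, axis label and legend
left.plot([0, 1, 2, 3], [0, 1, 4, 9], label="squares")
left.set_title("Line plot")
left.set_xlabel("x")
left.legend()

# Customize the second plot: bars, title and rotated tick labels
right.bar(["a", "b", "c"], [3, 5, 2], color="tab:orange")
right.set_title("Bar plot")
right.tick_params(axis="x", labelrotation=45)

# Elements can also be set on the Figure as a whole
fig.suptitle("One Figure, two Axes")

# Render the whole Figure into an in-memory PNG
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```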

Matplotlib    

Seaborn is a statistical infographics library, which builds on the capabilities of Matplotlib. Seaborn was designed to augment Matplotlib’s functions and tools, addressing some of the common issues users have with Matplotlib, with the goal of making the creation of useful, aesthetically pleasing visualizations quicker and easier.
Matplotlib’s high-level API is low-level compared to Seaborn, and as a result the user often needs to write a fair amount of boilerplate code. Seaborn attempts to reduce much of the redundancy in Matplotlib and make the more difficult Matplotlib visualization tasks easier.
In this lesson, we’ll go over the features of Seaborn, discuss the process of creating and styling plots with Seaborn, and then look at some sample visualizations produced with it. We'll top it off with a hands-on project, exploring the Confused Students EEG Dataset.
<h2 id="featuresofseaborn">Features of Seaborn</h2>
There are some notable features of Seaborn that make it many people's preferred plotting choice over pure Matplotlib. Seaborn allows the user to create statistical graphics easily thanks to features like: a high-level interface, aesthetically pleasing themes, easy comparison between multiple variables, multi-plot grids, univariate and bivariate visualization, automatic estimation for regressions, and easy plotting of time series data.
Compared to Matplotlib, Seaborn’s API is a much higher-level API, meaning that it takes fewer lines of code to produce visualizations with Seaborn.
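To give a feel for that higher-level API, here is a hedged sketch in which a themed violin plot is produced with a single plotting call. The data frame is invented for the example, and the snippet assumes a Seaborn version that provides `set_theme()`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no GUI window is needed

import pandas as pd
import seaborn as sns
from matplotlib.axes import Axes

# Made-up sample data: two groups of measurements
df = pd.DataFrame({
    "group": ["a"] * 5 + ["b"] * 5,
    "value": [1, 2, 2, 3, 4, 3, 4, 5, 5, 7],
})

# Apply one of the preset themes
sns.set_theme(style="whitegrid")

# A single call groups the data, estimates the densities and draws the plot
ax = sns.violinplot(data=df, x="group", y="value")

# The result is still a Matplotlib Axes, so Matplotlib methods keep working
ax.set_title("Violin plot in one call")
```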

Seaborn

Bokeh is a JavaScript-based data visualization library that specializes in the creation of interactive plots. While Matplotlib and libraries based around it are the most popular data visualization libraries in Python, the JavaScript counterparts (Bokeh and Plotly) are quickly catching up. While Plotly has been starting to &quot;steal the spotlight&quot; in the JS-based ecosystem, Bokeh is still very well worth covering. We'll cover Plotly in Lesson 8.
The advantage of using Matplotlib is that the plots produced with it are consistent and easily reproducible by others. It's widely-used and many are familiar with it, but Bokeh is able to create visualizations that are much more interactive and optimized for display on the web.
In the following sections of this lesson, we’ll cover the most notable features of Bokeh, explore how interactive plots in Bokeh are created, and then explore some different examples of the plots you can create with Bokeh. Finally, we'll dive into another mini project!
<h2 id="featuresofbokeh">Features of Bokeh</h2>
As alluded to above, Bokeh was designed to create interactive visualizations, and these visualizations can be embedded in websites. The visualizations are created with “glyphs”, which can be individually stylized and customized. Bokeh visualizations can be formatted and exported using simple HTML formatting, allowing you to ensure that your visualizations mesh nicely with any web pages you are going to display them on. This makes it a great companion for data-driven applications that want to display visualizations to their users! Matplotlib's and Seaborn's output would have to be saved as a file and then added as an image (or SVG), which is a hassle.

Bokeh

Altair is a Python library engineered to facilitate the visualization of statistical data. What sets Altair apart from other visualization libraries is that it is based on the Vega and Vega-Lite standards, which are a type of grammar related to visualizations.
The result is that you can simply describe what you want a plot to look like, Altair turns that description into a JSON specification, and the corresponding visualization is rendered from it. This makes creating plots in Altair quite intuitive and easy to grasp.
In the coming lesson, we will cover the basic functions of Altair’s declarative API, examine some methods of customizing your plots in Altair, and look at some examples of the different plots you can create with Altair. Please be aware that at the time of this writing much of Altair is still under development.
The course will be updated with future Altair releases accordingly.
<h2 id="altairsdeclarativeapi">Altair’s Declarative API</h2>
The idea behind Altair is that it’s a declarative library. In order to create visualizations, all you need to do is declare which type of visualization you’d like and declare a few arguments that tell Altair to create the visualization with certain desired features.
When you declare a plot in Altair, you typically chain together your declarations, starting with the all-purpose <code>Chart</code> object and then declaring what type of chart you want, followed by how the data should be encoded on that chart.

Altair

The Plotly Python library is an open-source data visualization library for Python. Similar to Altair, Plotly makes use of the JSON data format to visualize data. Plotly supports the creation of over 40 different chart types to visualize different data types like statistical data, geographic data, financial data and scientific data.
Plotly has been rising in popularity in recent years, and is one of the most widely-used libraries in the ecosystem. It comes with Plotly Express (the official high-level API), which looks a fair bit like Seaborn, making for a fairly easy transition for most people who are already acquainted with that library. This made adoption much more intuitive and simple - and Plotly has been increasingly showing up in web-based visualization. You're well versed with Matplotlib/Seaborn and want to incorporate those visualizations online? Good luck. The best you'll be able to do is save an image, and then load it into an HTML page - but it stays an image.
While images are fine for many use cases, the web is interactive, and having at least a rudimentary level of interaction really helps improve the user's experience. This is where Plotly comes in, making the creation of beautiful, highly-performant visualizations both easy to do and to integrate in an HTML page.

Plotly

GGplot/GGplot2 is focused on making the creation of data visualizations intuitive and straightforward. GGplot uses a certain plotting standard that ensures that all plots are comprised of the same basic elements. This plotting format is called the “Grammar of Graphics” standard, and it is adapted from R. The advantage of this consistent plotting standard is that the code used to render the graphs is uniform and easy to interpret.
In this lesson, we’ll cover the most notable features of GGplot, learn the process of creating plots with GGplot, and then explore some of the different types of visualizations we can create with GGplot.
GGplot makes use of the “Grammar of Graphics” plotting style. The concept behind the Grammar of Graphics is that there are different layers or attributes that comprise every plot, and that these layers have consistent names that can easily be referenced, created, and updated.
GGplot also allows you to easily create subsets or facets of data and then visualize these facets on the same plot, all with relatively little code. This aids in exploratory data analysis.
<h2 id="thegrammarofgraphics">The Grammar of Graphics</h2>
Before we delve into how to create plots with GGplot, we should take some time to understand the Grammar of Graphics plotting style. The plotting style is comprised of several different components. You can think of the different plot components as layers that build on top of one another to create the entire visualization.

GGplot2/Plotnine

GeoPandas is an extension of the Pandas library designed to make it easier to work with geospatial/geographical data. Much like how regular Pandas allows the user to create and manipulate DataFrames, GeoPandas is intended to facilitate these operations on geospatial data.
GeoPandas was designed to make manipulating and visualizing geospatial data simpler, as it is a geographically focused version of Pandas. Similarly, Geoplot is a plotting API that is a geographic counterpart to Matplotlib.
In this lesson, we’ll look at the data types that GeoPandas makes use of, explore different ways GeoPandas can be used to prepare data for plots, and look at some examples of plots you can make with GeoPandas and GeoPlot. We'll also learn how to make interactive geodata maps using Folium.
<h2 id="notablefeaturesofgeopandas">Notable Features of GeoPandas</h2>
GeoPandas has some useful features intended to make working with geospatial data easier.
Spatial Data Handling
Just like Pandas was created to facilitate the manipulation and cleaning of regular data, GeoPandas is intended to make manipulating geospatial data easier. It allows the user to handle spatial data with the types of functions observed in regular Pandas, letting the user merge, split, select, and perform set operations on geospatial data.
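A minimal sketch of this Pandas-style handling of spatial data is shown below. The points, the square region and the population figures are invented, and the coordinate reference system is an assumption made for the example.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, Polygon

# A GeoDataFrame is a DataFrame with a geometry column (made-up points)
cities = gpd.GeoDataFrame(
    {"name": ["A", "B", "C"]},
    geometry=[Point(1, 1), Point(5, 5), Point(2, 3)],
    crs="EPSG:4326",
)

# A square region to select against
area = Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])

# Spatial selection: keep only the points that fall within the region
inside = cities[cities.within(area)]

# Regular Pandas-style merge with a non-spatial table
population = pd.DataFrame({"name": ["A", "B", "C"], "population": [10, 20, 30]})
merged = cities.merge(population, on="name")
```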

GeoPandas and GeoPlot

VisPy is a library designed for data scientists, intended to make creating complex, interactive visualizations as fast as possible. Instead of using just the CPU to process data, VisPy takes advantage of the extra processing power of Graphics Processing Units (GPUs). This makes rendering large datasets faster than in libraries that rely on the CPU alone. VisPy graphs and charts scale well, so you can build visualizations out of thousands or even millions of data points.
At the time of this writing, VisPy is still under development, so this lesson will be somewhat different from the other lessons. We'll focus on presenting some of the things VisPy is capable of doing, and take a look at the architecture that underlies VisPy. We’ll go over the general process of creating plots with VisPy, as well as creating and saving animations and object files.
<h2 id="notablefeaturesofvispy">Notable Features of VisPy</h2>
Before diving into the process of creating plots with VisPy, let's cover the notable features of VisPy and get a sense of why you would want to use it over other libraries in certain situations.
OpenGL Support
OpenGL is one of the most widely used graphics APIs in the world, used to render both 2D and 3D vector graphics. VisPy has an object-oriented OpenGL API, which lets the user employ the various functions of OpenGL through just a few lines of Python code. OpenGL is the interface through which VisPy makes use of the GPU.

VisPy

That concludes this course - &quot;Data Visualization in Python&quot;. Thank you for taking a ride with us!
<blockquote>
Online education is spreading through the world, and is becoming an increasingly important part of many lives. We believe that accessible, high-quality resources can help empower people that build tomorrow, and remain guided by that goal.
</blockquote>
At StackAbuse, we believe that learning is not a one-time investment. It's life-long, especially in the volatile and rapidly changing world of Computer Science and Software Engineering. So, we've pledged to update our courses, guides, and other upcoming material to keep pace with progress in the field. Software keeps being updated - it's only fitting that learning resources are updated as well.
Thank you for purchasing &quot;Data Visualization in Python&quot;! We hope that it has brought a ton of value to you so far, and that it will continue to do so as you dive further into this topic.
<blockquote>
Now, we'd like to ask you to get involved in improving the next version of the book and our courses.
</blockquote>
We believe that high-quality resources and education are community-driven, and that minor (or major) contributions from each member result in a wonderful learning oasis. For this, feedback is crucial.

Thank You for Supporting Independent Publishers and Online Education

Data Visualization in Python, a course for beginner to intermediate Python developers, will guide you through simple data manipulation with Pandas, cover core plotting libraries like Matplotlib and Seaborn, and show you how to take advantage of declarative and experimental libraries like Altair.
Before diving too deep into the libraries themselves, we'll help you gain a better understanding of how the landscape of Python’s visualization libraries breaks down. To put that another way, it’s helpful to understand how the different Python libraries are designed and related to one another. Understanding how the different libraries operate will help you choose the best library for your visualization project.
We'll be covering:
<ul>
<li>
Matplotlib-based libraries
</li>
<li>
JavaScript libraries
</li>
<li>
JSON libraries
</li>
<li>
WebGL libraries
</li>
</ul>
More specifically, over the span of 11 chapters this course will cover 9 Python libraries: Pandas, Matplotlib, Seaborn, Bokeh, Altair, Plotly, GGplot, GeoPandas, and VisPy. Each library has its own unique features and quirks. Some are related to each other, while others are based on completely different technologies and ideas. That being said, this course will act as a one-stop in-depth resource for learning the ins and outs of each.
Whether you're a student or a seasoned developer, this course aims to get you on board with the current landscape of Data Visualization libraries in Python and up to speed with some of the most popular and powerful tools out there.
<h3 id="contents">Contents:</h3>
<ul>
<li>
Introduction to Data Visualization
</li>
<li>
Types of Plots
</li>
<li>
Manipulating and Visualizing Data with Pandas
</li>
<li>
Matplotlib
</li>
<li>
Seaborn
</li>
<li>
Bokeh
</li>
<li>
Altair
</li>
<li>
Plotly
</li>
<li>
GGplot
</li>
<li>
GeoPandas
</li>
<li>
VisPy
</li>
</ul>