Domain Research

David Landup

Note: This Guided Project is combined from parts of our Data Visualization in Python and Data Visualization in Python with Matplotlib and Pandas courses, and is additionally made available as a standalone project. For the full experience, please enroll in the respective course(s).

Visualizing EEG Data with Python - Matplotlib and Seaborn

In this Guided Project, we'll be building a project based on EEG scans. Remember - when doing data visualization, you have to get familiar with the domain you're working with at least superficially. This will allow you to understand the format of the data you're working with and to interpret the visualizations you're making. We'll be diving into a topic that not everyone has been acquainted with, and it's an extensive topic in and of itself.

We'll be scratching the surface of what EEG is to gain a basic intuition on how it works and how we can interpret the data, before performing any visualizations.

More specifically - we'll be working with two datasets:

The first dataset was created in a study trying to figure out whether EEG correlates with genetic predisposition to alcoholism, while the second was created to figure out whether EEG correlates with the level of confusion of a student while watching MOOC clips of differing complexity.

Both of these tasks are best done with machine learning algorithms, which are great at making inferences from data. However, the first step in any such project is to properly explore the data visually, through data visualization techniques.

What is EEG?

Electroencephalography (EEG) is the process of recording an individual's brain activity on a macroscopic scale. It's a non-invasive (external) procedure and collects aggregate, not individual, neuronal data. This by no means implies that the procedure is of low quality or inaccurate. What it means is that we see activation data of huge clumps of neurons, corresponding to a single electrode placed in a certain area. To collect data on individual neurons, we'd need an invasive technology that inserts electrodes directly into the brain, in physical contact with the neurons themselves. By using EEG and collecting data from a bunch of neurons that fire together - we've got a fairly effective way to correlate neuron activation to certain stimuli without having to perform invasive surgery on a patient.

The electrical signals produced by these groups of neurons are strong enough to actually travel through the skull, over a small range. By placing many electrodes on the human scalp - we can capture these signals. EEG also lets us track these signals through time, which allows us to directly correlate certain stimuli with activations of groups of neurons.

There are various EEG electrode configurations - and each channel stems from one or more electrodes. The more channels we have, the more data points we can acquire and analyze. 64-channel EEG headsets aren't cheap and are mainly found in funded labs. However, 1-5 channel headsets can be found commercially and purchased for under $300, at the time of writing.
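To make the link between channel count and data volume concrete, here is a minimal sketch using synthetic values. The sampling rate, the duration, the `ch_N` channel names and both helper functions are assumptions made up for illustration, not properties of either dataset or of any particular headset.

```python
import math


def synthetic_recording(n_channels, sampling_rate_hz, duration_s):
    """Build a fake recording: one list of voltage samples per channel."""
    n_samples = int(sampling_rate_hz * duration_s)
    recording = {}
    for ch in range(n_channels):
        # Give each channel a slightly different sine frequency so the
        # channels are distinguishable. Real EEG is far noisier than this.
        freq_hz = 8 + ch * 0.1
        recording[f"ch_{ch}"] = [
            math.sin(2 * math.pi * freq_hz * i / sampling_rate_hz)
            for i in range(n_samples)
        ]
    return recording


def count_data_points(recording):
    """Total number of samples across all channels."""
    return sum(len(samples) for samples in recording.values())


# Hypothetical settings: 256 samples per second, one second of recording
lab_headset = synthetic_recording(n_channels=64, sampling_rate_hz=256, duration_s=1)
home_headset = synthetic_recording(n_channels=1, sampling_rate_hz=256, duration_s=1)

print(count_data_points(lab_headset))   # 64 channels * 256 samples = 16384
print(count_data_points(home_headset))  # 1 channel * 256 samples = 256
```

Under these assumed settings, one second from the 64-channel setup yields 64 times as many values as the single-channel one, which is why the two kinds of recordings call for different visualization approaches.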

While not cheap by most standards - the technology is getting more refined and cheaper year-by-year, and at the very least, it's getting more accessible to the average consumer who'd like to play around with the data they can extract from themselves.

A 64-channel headset typically has an electrode placement that looks like this:

Brylie Christopher Oxley, CC0, via Wikimedia Commons

The first dataset we'll be working with was made with a device of 64 electrodes, while the second dataset was made with a much more "home-like" device with a single channel that measured the activity over the frontal lobe. That channel stems from voltage differences between three electrodes - one on the forehead and two on the ears. The frontal lobe is very important for voluntary (planned) movement, expressive language, memory, emotions, problem solving and social interaction. In short - it plays a big role in the higher-order functions at play in the human body.
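The idea of a channel built from voltage differences can be sketched in a few lines. This is only an illustration under a loud assumption: it treats the two ear electrodes as a shared reference and subtracts their average from the forehead electrode. How the actual device combines its three electrodes isn't specified here, so the `derive_channel` function and all voltage values below are hypothetical.

```python
def derive_channel(forehead, left_ear, right_ear):
    """Return one channel: forehead voltage minus the mean ear voltage.

    Assumption: the two ear electrodes act as a shared reference.
    The real device may combine its electrodes differently.
    """
    if not (len(forehead) == len(left_ear) == len(right_ear)):
        raise ValueError("All electrodes must have the same number of samples")
    return [
        f - (l + r) / 2
        for f, l, r in zip(forehead, left_ear, right_ear)
    ]


# Made-up voltages in microvolts, one value per time step
forehead = [12.0, 15.0, 9.0]
left_ear = [2.0, 3.0, 1.0]
right_ear = [4.0, 5.0, 3.0]

channel = derive_channel(forehead, left_ear, right_ear)
print(channel)  # [9.0, 11.0, 7.0]
```

The takeaway is that a single "channel" value is a relative measurement between electrodes, not an absolute reading from one spot on the scalp.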

Note: Keep in mind - we're here for Data Visualization. Our task is to visualize the signals provided in the datasets, and analyze them. Our task isn't to check the validity of the experiment's hypothesis - whatever it may be. Sometimes, datasets like these are released without an accompanying "goal". For instance, a dataset might be created as part of a research endeavor, or it might be created to democratize data and allow others to attempt finding correlations and relationships in it.

That being said - let's hop in!


© 2013-2024 Stack Abuse. All rights reserved.