How to Check for NaN Values in Python

Introduction

Today we're going to explore how to check for NaN (Not a Number) values in Python. NaN values can be quite a nuisance when processing data, and knowing how to identify them can save you from a lot of potential headaches down the road.

Why Checking for NaN Values is Important

NaN values can be a real pain, especially when you're dealing with numerical computations or data analysis. They can skew your results, cause errors, and generally make your life as a developer more difficult. For instance, if you're calculating the average of a list of numbers and a NaN value sneaks in, your result will also be NaN, regardless of the other numbers. It's almost as if it "poisons" the result - a single NaN can throw everything off.

Note: NaN stands for 'Not a Number'. It is a special floating-point value that cannot be converted to any other type than float.

NaN Values in Mathematical Operations

When performing mathematical operations, NaN values can cause lots of issues. They can lead to unexpected results or even errors. Python's math and numpy libraries typically propagate NaN values in mathematical operations, which can lead to entire computations being invalidated.

For example, in numpy, any arithmetic operation involving a NaN value will result in NaN:

import numpy as np

a = np.array([1, 2, np.nan])
print(a.sum())

Output:

nan

In such cases, you might want to consider using functions that can handle NaN values appropriately. Numpy provides nansum(), nanmean(), and others, which ignore NaN values:

print(np.nansum(a))

Output:

3.0

Pandas, on the other hand, generally excludes NaN values in its mathematical operations by default.

How to Check for NaN Values in Python

There are many ways to check for NaN values in Python, and we'll cover some of the most common methods used in different libraries. Let's start with the built-in math library.

Using the math.isnan() Function

The math.isnan() function is an easy way to check if a value is NaN. This function returns True if the value is NaN and False otherwise. Here's a simple example:

import math

value = float('nan')
print(math.isnan(value))  # True

value = 5
print(math.isnan(value))  # False
Get free courses, guided projects, and more

No spam ever. Unsubscribe anytime. Read our Privacy Policy.

As you can see, when we pass a NaN value to the math.isnan() function, it returns True. When we pass a non-NaN value, it returns False.

The benefit of using this particular function is that the math module is built-in to Python, so no third party packages need to be installed.

Using the numpy.isnan() Function

If you're working with arrays or matrices, the numpy.isnan() function can be a nice tool as well. It operates element-wise on an array and returns a Boolean array of the same shape. Here's an example:

import numpy as np

array = np.array([1, np.nan, 3, np.nan])
print(np.isnan(array))
# array([False,  True, False,  True])

In this example, we have an array with two NaN values. When we use numpy.isnan(), it returns a Boolean array where True corresponds to the positions of NaN values in the original array.

You'd want to use this method when you're already using NumPy in your code and need a function that works well with other NumPy structures, like np.array.

Using the pandas.isnull() Function

Pandas provides an easy-to-use function, isnull(), to check for NaN values in the DataFrame or Series. Let's take a look at an example:

import pandas as pd

# Create a DataFrame with NaN values
df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

print(df.isnull())

The output will be a DataFrame that mirrors the original, but with True for NaN values and False for non-NaN values:

       A      B      C
0  False  False  False
1  False   True  False
2   True   True  False

One thing you'll notice if you test this method out is that it also returns True for None values, hence why it refers to null in the method name. It will return True for both NaN and None.

Comparing the Different Methods

Each method we've discussed — math.isnan(), numpy.isnan(), and pandas.isnull() — has its own strengths and use-cases. The math.isnan() function is a straightforward way to check if a number is NaN, but it only works on individual numbers.

On the other hand, numpy.isnan() operates element-wise on arrays, making it a good choice for checking NaN values in numpy arrays.

Finally, pandas.isnull() is perfect for checking NaN values in pandas Series or DataFrame objects. It's worth mentioning that pandas.isnull() also considers None as NaN, which can be very useful when dealing with real-world data.

Conclusion

Checking for NaN values is an important step in data preprocessing. We've explored three methods — math.isnan(), numpy.isnan(), and pandas.isnull() — each with its own strengths, depending on the type of data you're working with.

We've also discussed the impact of NaN values on mathematical operations and how to handle them using numpy and pandas functions.

Last Updated: September 22nd, 2023
Was this helpful?
Project

Building Your First Convolutional Neural Network With Keras

# python# artificial intelligence# machine learning# tensorflow

Most resources start with pristine datasets, start at importing and finish at validation. There's much more to know. Why was a class predicted? Where was...

David Landup
David Landup
Details

© 2013-2024 Stack Abuse. All rights reserved.

AboutDisclosurePrivacyTerms