In this guide, we'll take a look at *how to calculate the Euclidean distance between two points in Python, using NumPy*.

## What is Euclidean Distance?

*Euclidean distance* is a fundamental distance metric pertaining to systems in *Euclidean space*.

Euclidean space is the classical geometrical space that you get familiar with in Math class, typically bound to 3 dimensions. It can, however, be defined for any non-negative integer number of dimensions as well.

Euclidean distance is the length of the shortest line between two points in Euclidean space.

The name comes from Euclid, who is widely recognized as *"the father of geometry"*, as this was the only space people at the time would typically conceive of. Over time, other types of spaces have been studied in Physics and Mathematics, such as *Affine space*, as well as non-Euclidean spaces and geometries, which are quite unintuitive for our cognitive perception.

In Euclidean space of any dimension, the shortest line between two points will always be a straight line between them.

That said, Euclidean distance isn't always the most useful metric to keep track of when dealing with many dimensions, since distances between points tend to become less informative as the number of dimensions grows. In this guide, we'll focus on 2D and 3D Euclidean space to calculate the Euclidean distance.

Measuring distance for high-dimensional data is often done with other distance metrics, such as *Manhattan distance*.

Generally speaking, Euclidean distance is *widely* used in the development of 3D worlds, as well as in Machine Learning algorithms that rely on distance metrics, such as K-Nearest Neighbors. Typically, Euclidean distance will represent how similar two data points are - assuming some clustering based on other data has already been performed.

### Mathematical Formula

The mathematical formula for calculating the Euclidean distance between 2 points in 2D space is:

$$

d(p,q) = \sqrt[2]{(q_1-p_1)^2 + (q_2-p_2)^2 }

$$

The formula is easily adapted to 3D space, as well as any dimension:

$$

d(p,q) = \sqrt[2]{(q_1-p_1)^2 + (q_2-p_2)^2 + (q_3-p_3)^2 }

$$

For *n* dimensions, the formula generalizes to:

$$

d(p,q) = \sqrt[2]{(q_1-p_1)^2 + ... + (q_n-p_n)^2 }

$$

A sharp eye may notice the similarity between Euclidean distance and Pythagoras' Theorem:

$$

C^2 = A^2 + B^2

$$

$$

d(p,q)^2 = (q_1-p_1)^2 + (q_2-p_2)^2

$$

There in fact *is* a relationship between these - Euclidean distance is calculated via Pythagoras' Theorem, given the Cartesian coordinates of two points.
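To make the relationship concrete, here is a small sketch using two 2D points, (1, 2) and (4, 6). These points are made up for illustration, chosen so that the legs of the right triangle are 3 and 4. It only relies on Python's standard `math` module:

```python
import math

# Two illustrative 2D points (not the ones used in the rest of this guide)
p = (1, 2)
q = (4, 6)

# The legs of the right triangle are the per-axis differences
a = q[0] - p[0]  # 3
b = q[1] - p[1]  # 4

# Pythagoras' Theorem: C^2 = A^2 + B^2
c_squared = a**2 + b**2

# The hypotenuse is the Euclidean distance between p and q
distance = math.sqrt(c_squared)
print(distance)  # 5.0
```

The hypotenuse of the classic 3-4-5 triangle is exactly the Euclidean distance between the two points.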

Because of this, Euclidean distance is sometimes known as Pythagoras' distance as well, though the former name is much more well-known.

**Note**: The two points are vectors, but the output should be a scalar (which is the distance).

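We'll be using NumPy to calculate this distance for two points, and the same approach is used for 2D and 3D spaces. The two points we'll be working with, (0, 0, 0) and (3, 3, 3), can be plotted in 3D with Matplotlib: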

```
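import matplotlib.pyplot as plt
# Registers the '3d' projection on older Matplotlib versions
from mpl_toolkits.mplot3d import Axes3D

# Create a figure with a single 3D subplot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Plot the two points: (0, 0, 0) and (3, 3, 3)
ax.scatter(0, 0, 0)
ax.scatter(3, 3, 3)
plt.show()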
```

## Calculating Euclidean Distance in Python with NumPy

First, we'll need to install the NumPy library:

```
$ pip install numpy
```

Now, let's import it and set up our two points, with the Cartesian coordinates as (0, 0, 0) and (3, 3, 3):

```
import numpy as np
# Initializing the points
point_1 = np.array((0, 0, 0))
point_2 = np.array((3, 3, 3))
```

Now, instead of performing the calculation manually, let's utilize the helper methods of NumPy to make this even easier!

### *np.sqrt()* and *np.sum()*

The operations and mathematical functions required to calculate Euclidean Distance are pretty simple: *subtraction*, *squaring*, *addition*, and the *square root function*. Multiple additions can be replaced with a *sum*, as well:

$$

d(p,q) = \sqrt[2]{(q_1-p_1)^2 + (q_2-p_2)^2 + (q_3-p_3)^2 }

$$

NumPy provides us with a `np.sqrt()` function, representing the square root function, as well as a `np.sum()` function, which represents a sum, and a `np.square()` function, which squares each element of an array. With these, calculating the Euclidean Distance in Python is simple and intuitive:

```
# Get the square of the difference of the 2 vectors
square = np.square(point_1 - point_2)
# Get the sum of the squares
sum_square = np.sum(square)
```

This gives us a pretty simple result:

```
(0-3)^2 + (0-3)^2 + (0-3)^2
```

Which is equal to *27*. All that's left is to get the square root of that number:

```
# The last step is to get the square root and print the Euclidean distance
distance = np.sqrt(sum_square)
print(distance)
```

This results in:

```
5.196152422706632
```

In true Pythonic spirit, this can be shortened to just a single line:

```
distance = np.sqrt(np.sum(np.square(point_1 - point_2)))
```
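If you need this calculation in more than one place, the one-liner can be wrapped in a small helper. The sketch below is just one possible way to do it: the function name `euclidean_distance` is our own choice rather than part of NumPy, and it assumes both inputs have the same shape:

```python
import numpy as np

def euclidean_distance(p, q):
    """Return the Euclidean distance between two points of equal dimension."""
    # np.asarray lets the helper accept lists and tuples as well as arrays
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sqrt(np.sum(np.square(p - q)))

print(euclidean_distance((0, 0, 0), (3, 3, 3)))  # 5.196152422706632
print(euclidean_distance([0, 0], [3, 4]))        # 5.0
```

Converting the inputs with `np.asarray()` keeps the subtraction element-wise even when plain Python lists or tuples are passed in.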

And you can even use Python's built-in `pow()` and `sum()` functions instead of their NumPy counterparts, though they require you to hack around a bit with the input, which NumPy conveniently abstracts away. The `pow()` function only works with scalars, so it has to be applied to each element of the array individually, and it accepts a second argument - the power you're raising the number to.

This approach, though, intuitively *looks* more like the formula we've used before:

```
from math import sqrt
# pow() and sum() are Python built-ins, applied element by element
distance = sqrt(sum(pow(a - b, 2) for a, b in zip(point_1, point_2)))
print(distance)
```

This also results in:

```
5.196152422706632
```

### *np.linalg.norm()*

The `np.linalg.norm()` function represents a *Mathematical norm*. In essence, a *norm* of a vector is its *length*. This length doesn't necessarily have to be the *Euclidean distance*, and can be other distances as well. The Euclidean distance between two points is the **L2 norm of the vector** between them (sometimes known as the **Euclidean norm**), and by default, the `norm()` function uses L2 for vectors - the `ord` parameter defaults to `None`, which yields the 2-norm.

If you were to set the `ord` parameter to some other value *p*, you'd calculate other *p-norms*. For instance, the *L1 norm of a vector is the Manhattan distance*!
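As a quick illustration of the `ord` parameter, the sketch below computes three different norms of the same difference vector. The variable names are our own, and the vector is the difference between the points (3, 3, 3) and (0, 0, 0):

```python
import numpy as np

diff = np.array((3, 3, 3)) - np.array((0, 0, 0))

# Default (ord=None): L2 / Euclidean norm
l2 = np.linalg.norm(diff)
# ord=1: L1 norm, the sum of absolute values (Manhattan distance)
l1 = np.linalg.norm(diff, ord=1)
# ord=np.inf: the largest absolute value among the components
l_inf = np.linalg.norm(diff, ord=np.inf)

print(l2)     # 5.196152422706632
print(l1)     # 9.0
print(l_inf)  # 3.0
```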

With that in mind, we can use the `np.linalg.norm()` function to calculate the Euclidean distance easily, and much more cleanly than using other functions:

```
distance = np.linalg.norm(point_1-point_2)
print(distance)
```

This results in the L2/Euclidean distance being printed:

```
5.196152422706632
```
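The same function can also handle many points at once. As a sketch (the `points` and `target` arrays below are made-up sample data), passing `axis=1` computes one norm per row, which gives the distance from each point to `target`:

```python
import numpy as np

points = np.array([[0, 0, 0], [1, 1, 1], [3, 3, 3]])
target = np.array([3, 3, 3])

# Broadcasting subtracts target from every row;
# axis=1 then takes the norm of each row separately
distances = np.linalg.norm(points - target, axis=1)
print(distances)

# Index of the point closest to target
nearest = np.argmin(distances)
print(nearest)  # 2
```

This avoids an explicit Python loop, which tends to matter once the number of points grows.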

*L2 normalization* and *L1 normalization* are heavily used in Machine Learning to normalize input data.
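As a rough illustration of what that means, normalizing a vector divides it by its norm, so that the result has a norm of 1. The vector `v` below is an arbitrary example:

```python
import numpy as np

v = np.array([3.0, 4.0])

# L2 normalization: divide by the Euclidean (L2) norm, which is 5 here
v_l2 = v / np.linalg.norm(v)

# L1 normalization: divide by the sum of absolute values, which is 7 here
v_l1 = v / np.linalg.norm(v, ord=1)

print(v_l2)  # [0.6 0.8]
print(v_l1)
```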

If you'd like to learn more about feature scaling - read our Guide to Feature Scaling Data with Scikit-Learn!

### *np.dot()*

We can also use a *Dot Product* to calculate the Euclidean distance. In Mathematics, the *Dot Product* is the result of multiplying two equal-length vectors element by element and summing the products, so the result is a single number - a scalar value. Because of the return type, it's sometimes also known as a **"scalar product"**. This operation is often called the **inner product** for the two vectors.

To calculate the dot product between 2 vectors you can use the following formula:

$$

\vec{p} \cdot \vec{q} = p_1 q_1 + p_2 q_2 + p_3 q_3

$$

With NumPy, we can use the `np.dot()` function, passing in two vectors.
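For instance, with two small example vectors (the values are arbitrary and only meant for illustration), the element-wise products are summed into a single scalar:

```python
import numpy as np

p = np.array([1, 2, 3])
q = np.array([4, 5, 6])

# 1*4 + 2*5 + 3*6 = 4 + 10 + 18 = 32
result = np.dot(p, q)
print(result)  # 32
```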

If we calculate a Dot Product of the difference between both points, with that same difference - we get the sum of the squared differences, which is the square of the Euclidean Distance between those two points. Extracting the square root of that number nets us the distance we're searching for:

```
# Take the difference between the 2 points
diff = point_1 - point_2
# Take the dot product of the difference with itself to get the sum of the squares
sum_square = np.dot(diff, diff)
# Get the square root of the result
distance = np.sqrt(sum_square)
print(distance)
```

Of course, you can shorten this to a one-liner as well:

```
distance = np.sqrt(np.dot(point_1-point_2, point_1-point_2))
print(distance)
```

```
5.196152422706632
```

## Using the Built-In *math.dist()*

Python has a built-in function, in the `math` module, that calculates the Euclidean distance between 2 points of the same dimension, whether that's 2D, 3D or higher. However, this only works with Python 3.8 or later.

`math.dist()` takes in two parameters, which are the two points, and returns the Euclidean distance between those points.

**Note**: The two points must have the same number of dimensions (i.e. both in 2D or both in 3D space).

Now, to calculate the Euclidean Distance between these two points, we just chuck them into the `dist()` function:

```
import math
distance = math.dist(point_1, point_2)
print(distance)
```

```
5.196152422706632
```
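`math.dist()` also works for points in other dimensions, as long as both points have the same number of coordinates. The following sketch uses made-up 2D points and then deliberately passes points of mismatched dimensions, which raises a `ValueError`. It assumes Python 3.8 or later:

```python
import math

# 2D points work just as well as 3D ones
distance_2d = math.dist((0, 0), (3, 4))
print(distance_2d)  # 5.0

# Points with a different number of coordinates raise a ValueError
try:
    math.dist((0, 0), (3, 3, 3))
except ValueError as error:
    print(f"ValueError: {error}")
```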

## Conclusion

*Euclidean distance* is a fundamental distance metric pertaining to systems in *Euclidean space*.

Euclidean space is the classical geometrical space that you get familiar with in Math class, typically bound to 3 dimensions. It can, however, be defined for any non-negative integer number of dimensions as well.

Euclidean distance is the length of the shortest line between two points in Euclidean space.

The metric is used in many contexts within data mining, machine learning, and several other fields, and is one of the fundamental distance metrics.