OpenCV Edge Detection in Python with cv2.Canny()

Introduction

Edge detection is something we do naturally, but it isn't as easy when it comes to defining rules for computers. While various methods have been devised, the reigning method was developed by John F. Canny in 1986, and is aptly named the Canny method.

It's fast, fairly robust, and works about as well as a technique of its kind can. By the end of the guide, you'll know how to perform real-time edge detection on videos, and produce something along the lines of:

Canny Edge Detection

What is the Canny method? It consists of four distinct operations:

  • Gaussian smoothing
  • Computing gradients
  • Non-Max Suppression
  • Hysteresis Thresholding

Gaussian smoothing is used as the first step to "iron out" the input image, and soften the noise, making the final output much cleaner.
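
To make the smoothing step concrete, here is a minimal NumPy sketch. The helper names `gaussian_kernel` and `filter2d` are made up for this illustration (they are not OpenCV functions), and the naive loop is only meant to show the idea: a normalized Gaussian kernel averages each pixel with its neighbours, which lowers the noise level.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Return a normalized size x size Gaussian kernel (hypothetical helper)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def filter2d(image, kernel):
    """Naive same-size filtering with replicated borders (illustration only)."""
    kh, kw = kernel.shape
    h, w = image.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

# Flat gray image with synthetic Gaussian noise on top
rng = np.random.default_rng(0)
noisy = 128.0 + rng.normal(0.0, 20.0, size=(64, 64))
smooth = filter2d(noisy, gaussian_kernel(5, 1.0))

# The smoothed image varies much less than the noisy one
print(noisy.std(), smooth.std())
```

In practice you'd call `cv2.GaussianBlur()` instead of a hand-written loop, as the later sections do.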

Image gradients were used for edge detection well before Canny's method. Most notably, Sobel and Scharr filters rely on image gradients. The Sobel filter boils down to two kernels (Gx and Gy), where Gx detects horizontal changes, while Gy detects vertical changes:

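$$
G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} \quad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix}
$$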

When you slide them over an image, they'll each "pick up" (emphasize) the lines in their respective orientation. Scharr kernels work in the same way, with different values:

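$$
G_x = \begin{bmatrix} +3 & 0 & -3 \\ +10 & 0 & -10 \\ +3 & 0 & -3 \end{bmatrix} \quad G_y = \begin{bmatrix} +3 & +10 & +3 \\ 0 & 0 & 0 \\ -3 & -10 & -3 \end{bmatrix}
$$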

These filters, once convolved over the image, will produce feature maps:

Image credit: Wikipedia
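
As a rough sketch of how such feature maps come about, the snippet below slides the two Sobel kernels over a tiny synthetic image with a single vertical edge. The `correlate3x3` helper is hypothetical and written for this example only. It slides the kernel without flipping it (correlation rather than true convolution), which only affects the sign of the response.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T  # [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def correlate3x3(image, kernel):
    """Slide a 3x3 kernel over the image (valid region only, no padding)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * image[dy:dy + h - 2, dx:dx + w - 2]
    return out

# 6x6 image: dark left half, bright right half, so one vertical edge
img = np.zeros((6, 6), dtype=float)
img[:, 3:] = 255.0

gx = correlate3x3(img, SOBEL_X)  # responds strongly around the edge
gy = correlate3x3(img, SOBEL_Y)  # stays at zero: nothing changes vertically

print(gx)
print(gy)
```

Only the Gx map lights up here, because the intensity changes from left to right and not from top to bottom.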

For these feature maps, you can compute the gradient magnitude and gradient orientation - i.e. how intense the change is (how likely it is that something is an edge) and in which direction the change is pointing. Since Gy denotes the vertical change (Y-gradient), and Gx denotes the horizontal change (X-gradient) - you can calculate the magnitude by simply applying the Pythagorean theorem, to get the hypotenuse of the triangle formed by the horizontal and vertical components:

$$
G = \sqrt{G_x^2 + G_y^2}
$$
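
In NumPy terms, the magnitude and orientation can be sketched as below. The `gx` and `gy` arrays are made-up values standing in for real Sobel responses, and measuring the orientation with `arctan2` is the usual choice, assumed here rather than taken from the text above.

```python
import numpy as np

# Made-up gradient responses for a 2x2 patch
gx = np.array([[3.0, 0.0],
               [0.0, -5.0]])
gy = np.array([[4.0, 2.0],
               [0.0, 12.0]])

magnitude = np.hypot(gx, gy)                   # sqrt(gx**2 + gy**2)
orientation = np.degrees(np.arctan2(gy, gx))   # gradient angle in degrees

print(magnitude)
print(orientation)
```

A pixel with `gx = 3` and `gy = 4` gets a magnitude of 5, the hypotenuse of the 3-4-5 triangle.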

Using the magnitude and orientation, you can produce an image with its edges highlighted:

Image credit: Wikipedia

However - you can see how much noise was also caught from the texture of the bricks! Image gradients are very sensitive to noise. This is why Sobel and Scharr filters serve as one component of Canny's method, rather than being the whole approach. Gaussian smoothing helps here as well.

Non-Max Suppression

A noticeable issue with the Sobel filter is that edges aren't really clear. It's not like someone took a pencil and drew a line to create line art of the image. The edges usually aren't so clear cut in images, as light diffuses gradually. However, we can find the common line in the edges, and suppress the rest of the pixels around it, yielding a clean, thin separation line instead. This is known as Non-Max Suppression! The non-max pixels (ones smaller than their neighbors in a small local field, such as a 3x3 window, typically compared along the gradient direction) get suppressed. The concept is applicable to more tasks than this, but let's bind it to this context for now.
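
The idea can be sketched in a few lines of NumPy. This is a simplified illustration and not OpenCV's implementation: the function name is hypothetical, the gradient direction is rounded to one of four directions, and border pixels are simply left at zero.

```python
import numpy as np

def non_max_suppression(magnitude, angle_deg):
    """Keep a pixel only if it is >= both neighbours along its gradient direction."""
    h, w = magnitude.shape
    out = np.zeros_like(magnitude)
    angle = angle_deg % 180  # direction only, sign ignored
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = angle[y, x]
            if a < 22.5 or a >= 157.5:   # gradient points left/right
                n1, n2 = magnitude[y, x - 1], magnitude[y, x + 1]
            elif a < 67.5:               # diagonal (down-right / up-left)
                n1, n2 = magnitude[y - 1, x - 1], magnitude[y + 1, x + 1]
            elif a < 112.5:              # gradient points up/down
                n1, n2 = magnitude[y - 1, x], magnitude[y + 1, x]
            else:                        # diagonal (down-left / up-right)
                n1, n2 = magnitude[y - 1, x + 1], magnitude[y + 1, x - 1]
            if magnitude[y, x] >= n1 and magnitude[y, x] >= n2:
                out[y, x] = magnitude[y, x]
    return out

# A blurry vertical edge: the response ramps up to 9 and back down
row = np.array([0, 2, 6, 9, 6, 2, 0], dtype=float)
magnitude = np.tile(row, (5, 1))
angle = np.zeros_like(magnitude)  # gradient points horizontally everywhere

thin = non_max_suppression(magnitude, angle)
print(thin)
```

The five-pixel-wide ramp collapses into a one-pixel-wide line at the peak, which is the thinning effect described above.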

Hysteresis Thresholding

Many non-edges can and likely will be evaluated as edges, due to lighting conditions, the materials in the image, etc. Because of the various reasons these miscalculations occur - it's hard to make an automated evaluation of what an edge certainly is and isn't. You can threshold gradients, and only include the stronger ones, assuming that "real" edges are more intense than "fake" edges.

Thresholding works in much the same way as usual - if the gradient is below a lower threshold, remove it (zero it out), and if it's above a given top threshold, keep it. Everything in-between the lower bound and upper bound is in the "gray zone". If a pixel in the gray zone is connected to a definitive edge (one above the upper threshold) - it's also considered an edge. If it's not connected, it's likely an artifact of a miscalculated edge.

That's hysteresis thresholding! In effect, it helps clean up the final output and remove false edges, depending on what you classify as a false edge. To find good threshold values, you'll generally experiment with different lower and upper bounds for the thresholds, or employ an automated method such as Otsu's method or the Triangle method.
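
A minimal sketch of the gray-zone logic, assuming 8-connectivity and a made-up magnitude map (the `hysteresis` function is hypothetical and not how OpenCV exposes this step): strong pixels seed the edges, and gray-zone pixels are kept only if they can be reached from a seed.

```python
from collections import deque
import numpy as np

def hysteresis(magnitude, low, high):
    """Return a boolean edge map: strong pixels plus gray-zone pixels connected to them."""
    strong = magnitude >= high
    candidate = magnitude >= low
    edges = strong.copy()
    queue = deque(zip(*np.nonzero(strong)))
    h, w = magnitude.shape
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and candidate[ny, nx] and not edges[ny, nx]:
                    edges[ny, nx] = True
                    queue.append((ny, nx))
    return edges

mag = np.array([
    [0,  0,  0,  0, 0,  0],
    [0, 90, 40, 40, 0,  0],   # one strong pixel followed by two gray-zone pixels
    [0,  0,  0,  0, 0, 40],   # an isolated gray-zone pixel
    [0,  0,  0,  0, 0,  0],
], dtype=float)

edges = hysteresis(mag, low=30, high=80)
print(edges.astype(int))
```

The two gray-zone pixels next to the strong one survive, while the isolated one in the third row is dropped.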

Let's load an image in and grayscale it (Canny, just like Sobel/Scharr, works on grayscale images here):

import cv2
import numpy as np  # used later for np.median()
import matplotlib.pyplot as plt

img = cv2.imread('finger.jpg', cv2.IMREAD_GRAYSCALE)
img_blur = cv2.GaussianBlur(img, (3,3), 0)

plt.imshow(img_blur, cmap='gray')

The closeup image of a finger will serve as a good testing ground for edge detection - it's not easy to discern a fingerprint from the image, but we can approximate one.

Edge Detection on Images with cv2.Canny()

Canny's algorithm can be applied using OpenCV's Canny() method:

cv2.Canny(input_img, lower_bound, upper_bound)

Finding the right balance between the lower bound and upper bound can be tricky. If both are low - weak gradients pass as edges and you'll have a lot of noise. If both are high - only the strongest gradients survive and you'll have few edges. If the lower bound is low and the upper is high - few edges get seeded, but long stretches of weaker pixels connected to them are kept. The right spot has just enough gap between the bounds, and has them on the right scale. Experiment!

Canny's algorithm starts with a Gaussian blur, commonly with a 5x5 kernel. cv2.Canny() itself only exposes the thresholds, the Sobel aperture size and the gradient norm, and doesn't appear to smooth the input on its own, so noise that isn't removed beforehand can seep into the edge map. That's why we've blurred the image before feeding it into the algorithm:

# Canny Edge Detection
edge = cv2.Canny(img_blur, 20, 30)

fig, ax = plt.subplots(1, 2, figsize=(18, 6), dpi=150)
ax[0].imshow(img, cmap='gray')
ax[1].imshow(edge, cmap='gray')

This results in:

The values of 20 and 30 here aren't arbitrary - I've tested the method on various parameters, and chose a set that seemed to produce a decent result. Can we try to automate this?

Automated Thresholding for cv2.Canny()?

Can you find an optimal set of threshold values? Yes, but it doesn't always work. You can make your own calculation for some good value, and then adjust the range with a sigma around that threshold:

lower_bound = (1-sigma)*threshold
upper_bound = (1+sigma)*threshold

When sigma is, say, 0.33 - the bounds will be 0.67*threshold and 1.33*threshold, allowing a ~1/3 range around it. Though, finding the threshold is what's more difficult. OpenCV provides us with Otsu's method (works great for bi-modal images) and the Triangle method. Let's try both of them out, as well as taking a simple median of the pixel values as the third option:

otsu_thresh, _ = cv2.threshold(img_blur, 0, 255, cv2.THRESH_OTSU)
triangle_thresh, _ = cv2.threshold(img_blur, 0, 255, cv2.THRESH_TRIANGLE)
manual_thresh = np.median(img_blur)

def get_range(threshold, sigma=0.33):
    return (1-sigma) * threshold, (1+sigma) * threshold

otsu_thresh = get_range(otsu_thresh)
triangle_thresh = get_range(triangle_thresh)
manual_thresh = get_range(manual_thresh)

print(f"Otsu's Threshold: {otsu_thresh} \nTriangle Threshold: {triangle_thresh} \nManual Threshold: {manual_thresh}")

This results in:

Otsu's Threshold: (70.35, 139.65) 
Triangle Threshold: (17.419999999999998, 34.58) 
Manual Threshold: (105.18999999999998, 208.81)

Advice: For a detailed guide on thresholding with OpenCV, read our "OpenCV Thresholding in Python with cv2.threshold()"!

These are pretty different! From the values we've seen before, we can anticipate the Triangle method working the best here. The manual threshold isn't very informed, since it just takes the median pixel value, and ends up having a high base threshold which is further multiplied into a wide range for this image. Otsu's method suffers less from this, but suffers nonetheless.
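
If you reuse the sigma range elsewhere, it can help to round and clamp the bounds. The helper below is a hypothetical variation on `get_range()`. Clamping to the 0-255 pixel range is a common convention assumed here, not something `cv2.Canny()` requires. The inputs 26 and 157 are the base thresholds implied by the Triangle and median ranges printed above.

```python
def canny_bounds(threshold, sigma=0.33):
    """Hypothetical helper: sigma range around a threshold, clamped to [0, 255]."""
    lower = int(max(0, (1 - sigma) * threshold))
    upper = int(min(255, (1 + sigma) * threshold))
    return lower, upper

print(canny_bounds(26))    # (17, 34)
print(canny_bounds(157))   # (105, 208)
print(canny_bounds(250))   # (167, 255), the upper bound is clamped
```

A fixed sigma widens the range in proportion to the base threshold, which is why the median-based range ends up so much wider than the Triangle one.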

If we run the Canny() method with these threshold ranges:

edge_otsu = cv2.Canny(img_blur, *otsu_thresh)
edge_triangle = cv2.Canny(img_blur, *triangle_thresh)
edge_manual = cv2.Canny(img_blur, *manual_thresh)

fig, ax = plt.subplots(1, 3, figsize=(18, 6), dpi=150)
ax[0].imshow(edge_otsu, cmap='gray')
ax[1].imshow(edge_triangle, cmap='gray')
ax[2].imshow(edge_manual, cmap='gray')

Note: The function expects multiple arguments, and our thresholds are a single tuple. We can unpack the tuple into multiple arguments by prefixing it with *. This works on lists and other iterables as well (keep in mind that sets have no guaranteed order, so they're a poor fit for positional arguments), and is a great way of supplying multiple arguments after obtaining them by programmatic means.

This results in:

The Triangle method worked pretty well here! This is no guarantee that it'll work well in other cases as well.

Real-Time Edge Detection on Videos with cv2.Canny()

Finally, let's apply Canny edge detection to a video in real-time! We'll display the video being processed (each frame as it's done) using cv2.imshow() which displays a window with the frame we'd like to display. Though, we'll also save the video into an MP4 file that can later be inspected and shared.

To load a video using OpenCV, we use the VideoCapture class. If we pass in 0 - it'll capture from the default webcam, so you can run the code on your webcam as well! If you pass in a filename, it'll load the file:

def edge_detection_video(filename):
    cap = cv2.VideoCapture(filename)
    if not cap.isOpened():
        raise IOError(f"Could not open video source: {filename}")

    # Match the output resolution to the input
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # VideoWriter for saving the video
    # isColor=False because Canny returns a single-channel image
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter('output.mp4', fourcc, 30.0, (width, height), isColor=False)

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            # No more frames (or the read failed)
            break

        frame = cv2.GaussianBlur(frame, (3, 3), 0)
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        edge = cv2.Canny(frame, 50, 100)
        out.write(edge)
        cv2.imshow('Edge detection', edge)

        # Press 'q' to stop early
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

    cap.release()
    out.release()
    cv2.destroyAllWindows()

edge_detection_video('secret_video.mp4')

The VideoWriter accepts several parameters - the output filename, the FourCC (a four-character code identifying the codec used to encode the video), the frame rate and the resolution as a tuple. Since the edge maps are single-channel, we also pass isColor=False. To not guess or resize the video - we've used the width and height of the original video, obtained through the VideoCapture instance that contains data about the video itself, such as the width, height, total number of frames, etc.
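
To see what a FourCC actually is, the sketch below packs four characters into a single integer, least significant byte first. As far as I'm aware, that mirrors the value `cv2.VideoWriter_fourcc()` returns. Both helper names are hypothetical and written for this illustration.

```python
def fourcc_to_int(code):
    """Hypothetical helper: pack a 4-character code into one integer, first char lowest."""
    if len(code) != 4:
        raise ValueError("A FourCC must be exactly four characters long")
    return sum(ord(ch) << (8 * i) for i, ch in enumerate(code))

def int_to_fourcc(value):
    """Hypothetical helper: unpack the integer back into its four characters."""
    return "".join(chr((value >> (8 * i)) & 0xFF) for i in range(4))

packed = fourcc_to_int("mp4v")
print(hex(packed))            # 0x7634706d
print(int_to_fourcc(packed))  # mp4v
```

The reverse direction is handy for decoding the codec of an input video from the numeric value a capture property gives you.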

While the capture is opened, we try to read the next frame with cap.read(), which returns a result code and the next frame. The result code is True or False, denoting the presence of the next frame or a lack thereof. We only process a frame when there is one, and break out of the loop otherwise. For each valid frame, we run it through a Gaussian blur, convert it to grayscale, run cv2.Canny() on it, write it to the disk using the VideoWriter, and display it using cv2.imshow() for a live view.

Finally, we release the capture and video writer, as they're both working with files on the disk, and destroy all existing windows.

When you run the method with a secret_video.mp4 input - you'll see a window pop up and once it's finished, a file in your working directory:

Conclusion

In this guide, we've taken a look at how Canny edge detection works, and its constituent parts - Gaussian smoothing, Sobel filters and image gradients, Non-Max Suppression and Hysteresis Thresholding. Finally, we've explored methods for automated threshold range search for Canny edge detection with cv2.Canny(), and employed the technique on a video, providing real-time edge detection and saving the results in a video file.

Last Updated: November 17th, 2023

David Landup, Editor

© 2013-2024 Stack Abuse. All rights reserved.
