Facial Detection in Python with OpenCV

Introduction

Facial detection is a powerful and common use case of machine learning. It can be used to automate manual tasks such as taking school attendance or assisting law enforcement, and it also underpins biometric authorization.

In this article, we'll perform facial detection in Python, using OpenCV.

OpenCV

OpenCV is one of the most popular computer vision libraries. It is written in C and C++ and provides bindings for Python, Java, and MATLAB. While it's not the fastest library out there, it's easy to work with and provides a high-level interface, allowing developers to write stable code.

Let's install OpenCV so that we can use it in our Python code:

$ pip install opencv-contrib-python

Alternatively, you can install opencv-python for just the main modules of OpenCV. The opencv-contrib-python package contains the main modules as well as the contrib modules, which provide extended functionality. Install only one of the two, since both provide the same cv2 module.

Detecting Faces in an Image Using OpenCV

With OpenCV installed, we can import it as cv2 in our code.

To read an image, we will use the imread() function, passing it the path to the image we want to process. The imread() function loads the image from the specified file into a NumPy ndarray. If the image cannot be read, for example because the file is missing or the format is unsupported, the function returns None.

We will be using an image from a Kaggle dataset:

import cv2

path_to_image = 'Parade_12.jpg'
original_image = cv2.imread(path_to_image)

The full color information isn't necessary for facial detection. Color adds a lot of data that is irrelevant to the task, so it's more efficient to drop it and work with a grayscale image. Additionally, the Viola-Jones algorithm, which works under the hood in OpenCV's Haar cascade classifiers, compares the intensity of different areas of an image, and a grayscale image exposes these differences directly.

Note: In the case of color images, the decoded images will have the channels stored in BGR order, so when changing them to grayscale, we need to use the cv2.COLOR_BGR2GRAY flag:

image = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)

This could have been done directly when using imread(), by setting the cv2.IMREAD_GRAYSCALE flag:

original_image = cv2.imread(path_to_image, cv2.IMREAD_GRAYSCALE)
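
To see why the channel order matters for the conversion, it helps to know that the gray value is a weighted sum of the three channels rather than a plain average. The following is a pure-Python sketch for a single pixel, using the weights documented for OpenCV's color-to-gray conversion (0.299 for red, 0.587 for green, 0.114 for blue). The function name bgr_pixel_to_gray is our own and is not part of OpenCV:

```python
def bgr_pixel_to_gray(pixel):
    """Convert one (blue, green, red) pixel to a single gray intensity."""
    blue, green, red = pixel
    # Weighted sum: green contributes the most, blue the least
    return round(0.299 * red + 0.587 * green + 0.114 * blue)


# The same triple gives a different gray level depending on channel order
print(bgr_pixel_to_gray((255, 0, 0)))  # pure blue when read as BGR
print(bgr_pixel_to_gray((0, 0, 255)))  # pure red when read as BGR
```

Because red and blue carry different weights, treating a BGR image as RGB would produce a slightly wrong grayscale image.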

The OpenCV library comes with several pre-trained classifiers that are trained to find different things, like faces, eyes, smiles, upper bodies, etc.

The Haar features for detecting these objects are stored as XML, and depending on how you installed OpenCV, can most often be found in Lib\site-packages\cv2\data. They can also be found in the OpenCV GitHub repository.

In order to access them from code, you can use cv2.data.haarcascades, which holds the path to that directory, and append the name of the XML file you'd like to use.

We can choose which Haar features we want to use for our object detection by passing the file path to the CascadeClassifier() constructor, which uses pre-trained models for object detection:

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
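
A mistyped filename is easy to miss here, because CascadeClassifier() typically doesn't raise an error for a bad path. It creates an empty classifier instead, and the failure only surfaces later, when detectMultiScale() is called. As a minimal sketch, a small helper can check the path up front. The name cascade_path is hypothetical, not part of OpenCV:

```python
import os


def cascade_path(cascade_dir, filename):
    """Return the full path to a cascade XML file, or raise if it is missing."""
    path = os.path.join(cascade_dir, filename)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Cascade file not found: {path}")
    return path
```

It would be used as cv2.CascadeClassifier(cascade_path(cv2.data.haarcascades, "haarcascade_frontalface_default.xml")). Calling face_cascade.empty() after construction is another way to confirm that the model was actually loaded.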

Now, we can use this face_cascade object to detect faces in the image:

detected_faces = face_cascade.detectMultiScale(image=image, scaleFactor=1.3, minNeighbors=4)

When object detection models are trained, they are trained to detect faces of a certain size and might miss faces that are bigger or smaller than they expect. With this in mind, the image is resized several times in the hope that a face will end up being a "detectable" size. The scaleFactor parameter tells OpenCV how much to rescale the image at each step. In our case, 1.3 means that the image is shrunk by a factor of 1.3 (roughly 30%) at each step. Smaller values, such as 1.1, search more scales and can find more faces, at the cost of speed.
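
To get a feel for what scaleFactor costs, the sketch below counts how many scales are searched for a given image size. It is a simplified illustration, not OpenCV's actual loop, which also honors the minSize and maxSize arguments. The 24-pixel base window is an assumption about the default frontal face cascade, and pyramid_scales is a hypothetical name:

```python
def pyramid_scales(image_width, image_height, scale_factor, window=24):
    """List the scale multipliers at which a window x window detector still fits."""
    scales = []
    scale = 1.0
    # Stop once the scaled detection window no longer fits inside the image
    while window * scale <= min(image_width, image_height):
        scales.append(scale)
        scale *= scale_factor
    return scales


for factor in (1.1, 1.3):
    levels = pyramid_scales(640, 480, factor)
    print(f"scaleFactor={factor}: {len(levels)} levels")
```

A factor closer to 1 visits many more scales, which is why it tends to find more faces but takes longer.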

As for the minNeighbors parameter, it's used to control the number of false positives and false negatives. It defines the minimum number of positive rectangles (detected facial features) that need to be adjacent to a positive rectangle in order for it to be considered actually positive. If minNeighbors is set to 0, the slightest hint of a face will be counted as a definitive face, even if no other facial features are detected near it.
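
The idea behind minNeighbors can be sketched in pure Python. The function below keeps a candidate rectangle only if enough other candidates lie close to it. It is a simplified illustration under our own assumptions (the name filter_by_neighbors, the 0.2 tolerance, and the sample data are all hypothetical). OpenCV's real grouping step also merges each cluster into a single averaged rectangle, which this sketch doesn't do:

```python
def filter_by_neighbors(candidates, min_neighbors, tolerance=0.2):
    """Keep candidate (x, y, w, h) boxes backed by enough similar boxes."""

    def similar(a, b):
        # Two boxes are "neighbors" when all four edges lie within a
        # tolerance proportional to the smaller box's size
        delta = tolerance * (min(a[2], b[2]) + min(a[3], b[3])) / 2
        return (abs(a[0] - b[0]) <= delta
                and abs(a[1] - b[1]) <= delta
                and abs(a[0] + a[2] - b[0] - b[2]) <= delta
                and abs(a[1] + a[3] - b[1] - b[3]) <= delta)

    kept = []
    for i, box in enumerate(candidates):
        neighbors = sum(
            1 for j, other in enumerate(candidates)
            if i != j and similar(box, other)
        )
        if neighbors >= min_neighbors:
            kept.append(box)
    return kept


# Three overlapping hits on one face, plus one isolated hit
raw = [(100, 100, 50, 50), (102, 101, 50, 50), (98, 99, 52, 52), (300, 40, 30, 30)]
print(filter_by_neighbors(raw, min_neighbors=2))  # isolated hit is dropped
print(filter_by_neighbors(raw, min_neighbors=0))  # every hit is kept
```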

Both the scaleFactor and minNeighbors parameters are somewhat arbitrary and set experimentally. We have chosen values that worked well for us, and gave no false positives, with the trade-off of more false negatives (undetected faces).

The detectMultiScale() method returns the bounding rectangles of all the detected objects (faces, in our case). Each element represents one detected face and has the form (x, y, w, h), where the x, y values are the coordinates of the rectangle's top-left corner, while the w, h values are its width and height, respectively.
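
As a small illustration of working with this format, the sketch below converts a rectangle to its two corner points and picks the largest detection. The helper names and the sample coordinates are our own, not part of OpenCV:

```python
def to_corners(rect):
    """Convert (x, y, w, h) to ((left, top), (right, bottom))."""
    x, y, w, h = rect
    return (x, y), (x + w, y + h)


def largest_face(rects):
    """Return the rectangle with the biggest area, or None if there are none."""
    if len(rects) == 0:
        return None
    return max(rects, key=lambda r: r[2] * r[3])


# Made-up detections for illustration
faces = [(10, 20, 30, 40), (200, 50, 80, 80)]
print(to_corners(faces[0]))
print(largest_face(faces))
```

Picking the largest rectangle is a common shortcut when only the face closest to the camera is of interest.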

We can pass each of the returned rectangles to the cv2.rectangle() function to draw a box where a face was detected. Keep in mind that the color provided needs to be a tuple in BGR order, matching the channel order of images loaded with imread():

for (x, y, width, height) in detected_faces:
    cv2.rectangle(
        image,
        (x, y),
        (x + width, y + height),
        color,
        thickness=2
    )
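
OpenCV's drawing functions read color tuples in blue, green, red order when drawing on images loaded with imread(), so a color copied from an RGB color picker needs to be reversed first. A minimal sketch, with rgb_to_bgr being our own helper rather than an OpenCV function:

```python
def rgb_to_bgr(color):
    """Reverse an (R, G, B) tuple into the (B, G, R) order OpenCV draws with."""
    red, green, blue = color
    return (blue, green, red)


# Pure red is (255, 0, 0) in RGB, but has to be passed as (0, 0, 255)
color = rgb_to_bgr((255, 0, 0))
print(color)
```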

Now, let's put that all together:

import cv2

def draw_found_faces(detected, image, color: tuple):
    for (x, y, width, height) in detected:
        cv2.rectangle(
            image,
            (x, y),
            (x + width, y + height),
            color,
            thickness=2
        )

path_to_image = 'Parade_12.jpg'
original_image = cv2.imread(path_to_image)

if original_image is not None:
    # Convert image to grayscale
    image = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)

    # Create Cascade Classifiers
    face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    profile_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_profileface.xml")
    
    # Detect faces using the classifiers
    detected_faces = face_cascade.detectMultiScale(image=image, scaleFactor=1.3, minNeighbors=4)
    detected_profiles = profile_cascade.detectMultiScale(image=image, scaleFactor=1.3, minNeighbors=4)

    # Keep only the profile detections that the frontal model didn't also report
    # (rows are converted to tuples so that whole rectangles are compared)
    profiles_not_faces = [x for x in detected_profiles if tuple(x) not in map(tuple, detected_faces)]

    # Draw rectangles around faces on the original, colored image
    draw_found_faces(detected_faces, original_image, (0, 255, 0)) # BGR - green
    draw_found_faces(profiles_not_faces, original_image, (0, 0, 255)) # BGR - red

    # Open a window to display the results
    cv2.imshow(f'Detected Faces in {path_to_image}', original_image)
    # The window will close as soon as any key is pressed (not a mouse click)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
else:
    print(f'An error occurred while trying to load {path_to_image}')

We used two different models on this picture: the default model for detecting front-facing faces, and a model built to better detect faces looking to the side.

Faces detected with the frontalface model are outlined in green, and faces detected with the profileface model are outlined in red. Most of the faces the first model found would have also been found by the second, so we only drew red rectangles where the profileface model detected a face but frontalface didn't:

profiles_not_faces = [x for x in detected_profiles if tuple(x) not in map(tuple, detected_faces)]
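
Two different cascades seldom return pixel-identical rectangles for the same face, so an exact-match filter tends to keep most profile detections. One possible alternative, sketched here in pure Python, is to drop a profile rectangle when it overlaps a frontal one substantially, measured by intersection over union (IoU). The function names and the 0.5 threshold are assumptions chosen for illustration:

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union else 0.0


def profiles_without_overlap(profiles, faces, threshold=0.5):
    """Keep profile rectangles that do not substantially overlap any face."""
    return [p for p in profiles
            if all(iou(p, f) < threshold for f in faces)]


# Made-up detections: the first profile hit nearly coincides with the face
faces = [(100, 100, 50, 50)]
profiles = [(104, 102, 50, 50), (300, 80, 40, 40)]
print(profiles_without_overlap(profiles, faces))
```

The threshold would need tuning per image set, just like scaleFactor and minNeighbors.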

The imshow() function shows the passed image in a window with the provided title. With the picture we selected, this produces the following output:

[Image: frontal and profile face detection]

Using different values for scaleFactor and minNeighbors will give us different results. For example, using scaleFactor = 1.1 and minNeighbors = 4 gives us more true positives, but also more false positives, with both models:

[Image: face detection with a lower scale factor]

We can see that the algorithm isn't perfect, but it is very efficient. This is most notable when working with real-time data, such as a video feed from a webcam.

Real-Time Face Detection Using a Webcam

Video streams are simply streams of images. With the efficiency of the Viola-Jones algorithm, we can do face detection in real-time.

The steps we need to take are very similar to the previous example with only one image - we'll be performing this on each image in the stream.

To get the video stream, we'll use the cv2.VideoCapture class. The constructor for this class takes an integer parameter representing the video stream. On most machines, the webcam can be accessed by passing 0, but on machines with several video streams, you might need to try out different values.

Next, we need to read individual images from the input stream. This is done with the read() method, which returns retval and image. The image is simply the retrieved frame. The retval return value indicates whether a frame has been retrieved, and will be False if it hasn't.

However, this value tends to be unreliable with live video input streams (for example, it may not detect that the webcam has been disconnected), so we will be ignoring it.
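
One consequence of ignoring retval is that a failed read hands None to cvtColor(), which raises an error. If a cleaner shutdown is preferred, a defensive loop can check both signals. The sketch below is one way to do that. read_frames is a hypothetical helper, and FakeCapture is a stand-in so that the example runs without a webcam. Any object with a compatible read() method, such as a cv2.VideoCapture instance, could be passed instead:

```python
def read_frames(capture):
    """Yield frames from a capture object until a read fails."""
    while True:
        ok, frame = capture.read()
        # Treat either signal as "no frame": a False flag or a missing image
        if not ok or frame is None:
            break
        yield frame


class FakeCapture:
    """Stand-in for cv2.VideoCapture that runs out after a few frames."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


print(list(read_frames(FakeCapture(["frame-1", "frame-2"]))))
```

With a real webcam, the body of the detection loop would run once per frame yielded by read_frames(video_capture).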

Let's go ahead and modify the previous code to handle a video stream:

import cv2

def draw_found_faces(detected, image, color: tuple):
    for (x, y, width, height) in detected:
        cv2.rectangle(
            image,
            (x, y),
            (x + width, y + height),
            color,
            thickness=2
        )

# Capturing the Video Stream
video_capture = cv2.VideoCapture(0)

# Creating the cascade objects
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye_tree_eyeglasses.xml")

while True:
    # Get individual frame
    _, frame = video_capture.read()
    # Convert the frame to grayscale
    grayscale_image = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect all the faces and eyes in that frame
    detected_faces = face_cascade.detectMultiScale(image=grayscale_image, scaleFactor=1.3, minNeighbors=4)
    detected_eyes = eye_cascade.detectMultiScale(image=grayscale_image, scaleFactor=1.3, minNeighbors=4)
    draw_found_faces(detected_faces, frame, (0, 0, 255))
    draw_found_faces(detected_eyes, frame, (0, 255, 0))

    # Display the updated frame as a video stream
    cv2.imshow('Webcam Face Detection', frame)

    # Press the ESC key to exit the loop
    # 27 is the code for the ESC key
    if cv2.waitKey(1) == 27:
        break

# Releasing the webcam resource
video_capture.release()

# Destroy the window that was showing the video stream
cv2.destroyAllWindows()

Conclusion

In this article, we've created a facial detection application using Python and OpenCV.

Using the OpenCV library is very straightforward for basic object detection programs. Experimentally adjusting the scaleFactor and minNeighbors parameters for the types of images you'd like to process can give fairly accurate results very efficiently.