Face Recognition with Python and OpenCV

Source: Deep Learning on Medium

Today, we’ll perform face recognition with Python and OpenCV, with help from a pre-trained deep learning model.

We’ll be using the dlib library to create a 128-dimensional vector space in which images of the same person lie near each other and images of different people lie far apart. We’ll also use the face_recognition library, which wraps dlib’s facial recognition functionality and makes it easier to work with. Finally, we’ll use OpenCV for various image processing operations, along with the imutils package.
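
The idea of “near” and “far” in that vector space can be illustrated with a tiny self-contained sketch. The names and the 3-dimensional vectors below are made up for illustration (real face_recognition embeddings have 128 dimensions):

```python
import math

# Stand-in embeddings: two images of "alice" and one of "bob"
alice_1 = [0.10, 0.20, 0.30]
alice_2 = [0.12, 0.19, 0.31]
bob = [0.90, 0.10, 0.50]

def distance(a, b):
    # Euclidean distance between two embedding vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Embeddings of the same person should be closer than those of different people
print(distance(alice_1, alice_2) < distance(alice_1, bob))  # True
```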

Installing the packages is simple: we can use pip to install them on any system.
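
For reference, a typical installation looks like the following (these are the common PyPI package names; note that dlib usually needs CMake and a C++ build toolchain available on your system):

```shell
# install OpenCV, dlib, face_recognition and imutils from PyPI
pip install opencv-python dlib face_recognition imutils
```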

Encoding the faces

Before we can recognize faces in images and videos, we first need to encode the faces in our training set; these encodings will later be used for face recognition. However, we won’t actually be training a network here: the network has already been trained to create 128-dimensional embeddings on a dataset of almost 3 million images, thanks to Davis King.

We’ll use the pre-trained model to construct a 128-dimensional embedding for each face in our dataset.

During classification, we can use a simple k-NN model plus voting to make the final face classification.

We’ll start by creating an argument parser to parse the command line arguments for encode_faces.py.

import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--dataset", required=True, help="path to input directory of images")
ap.add_argument("-e", "--encodings", required=True, help="path to serialized db of facial encodings")
ap.add_argument("-d", "--detection-method", type=str, default="cnn", help="face detection model to use: either 'hog' or 'cnn'")
args = vars(ap.parse_args())
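
One detail worth noting: argparse converts hyphens in long flag names to underscores in the parsed keys, so --detection-method is read back as args["detection_method"]. A quick self-contained sanity check (passing an explicit argv list instead of real command line input):

```python
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--dataset", required=True)
ap.add_argument("-e", "--encodings", required=True)
ap.add_argument("-d", "--detection-method", type=str, default="cnn")

# parse a hand-built argv list; the default "cnn" fills in detection_method
args = vars(ap.parse_args(["-i", "dataset", "-e", "encodings.pickle"]))
print(args["detection_method"])  # cnn
```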

The different argument flags along with their explanations are given below:

--dataset : The path to our dataset. All the images of a single person should be in a separate folder, with the person’s name as the folder’s name. These folders should reside in the dataset folder, the path to which is passed as the --dataset argument.

--encodings : The 128-dimensional face encodings for every image in our dataset are written to the file that this argument points to.

--detection-method : Before we can encode faces in images, we first need to detect them. The face_recognition library supports two face detection methods, “hog” and “cnn”, which can be selected with the --detection-method argument.

Next, we’ll grab the paths to the input images in our dataset.

from imutils import paths

imagePaths = list(paths.list_images(args["dataset"]))

Then, we’ll initialize the lists of known encodings and known names. These are the encodings/names that we’ll create for every image in the next phase.

knownEncodings = []
knownNames = []

Now, we’ll loop over all the images in our dataset and create a 128-dimensional embedding for each one.

# additional imports needed for this snippet
import os

import cv2
import face_recognition

# loop over the image paths
for (i, imagePath) in enumerate(imagePaths):
    # extract the person name from the image path
    print("[INFO] processing image {}/{}".format(i + 1, len(imagePaths)))
    name = imagePath.split(os.path.sep)[-2]
    # load the input image and convert it from BGR (OpenCV ordering)
    # to dlib ordering (RGB)
    image = cv2.imread(imagePath)
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # detect the (x, y)-coordinates of the bounding boxes
    # corresponding to each face in the input image
    boxes = face_recognition.face_locations(rgb, model=args["detection_method"])
    # compute the facial embedding for each face
    encodings = face_recognition.face_encodings(rgb, boxes)
    # loop over the encodings
    for encoding in encodings:
        # add each encoding + name to our set of known names and encodings
        knownEncodings.append(encoding)
        knownNames.append(name)

The loop cycles over all the images in the dataset. Each image is read using the imagePath extracted earlier and then converted into the RGB color space, which is what dlib expects. Then, we detect the coordinates of the bounding boxes using the face_locations function of the face_recognition library. The encoding for each face is extracted using the face_encodings function. Finally, the name and the encoding are appended to the knownNames and knownEncodings lists initialized earlier.
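
The name-extraction step relies on the directory layout described earlier: the person’s name is the parent folder of each image. A minimal sketch (the "alice"/"img01.jpg" names are made up for illustration):

```python
import os

# build a path like dataset/alice/img01.jpg in an OS-independent way
imagePath = os.path.join("dataset", "alice", "img01.jpg")

# the second-to-last path component is the enclosing folder, i.e. the name
name = imagePath.split(os.path.sep)[-2]
print(name)  # alice
```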

We now need to save all the encodings and their respective names which can be done using the code block below:

data = {"encodings": knownEncodings, "names": knownNames}
f = open(args["encodings"], "wb")
f.write(pickle.dumps(data))
f.close()
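
The same pickle round trip can be sketched in a self-contained way, using toy lists in place of real 128-dimensional embeddings and a temporary file in place of the --encodings path:

```python
import os
import pickle
import tempfile

# toy stand-ins for the real encodings and names
data = {"encodings": [[0.1, 0.2], [0.3, 0.4]], "names": ["alice", "bob"]}

# serialize the dictionary to disk, then load it back
path = os.path.join(tempfile.mkdtemp(), "encodings.pickle")
with open(path, "wb") as f:
    f.write(pickle.dumps(data))

with open(path, "rb") as f:
    loaded = pickle.loads(f.read())

print(loaded["names"])  # ['alice', 'bob']
```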

The complete script for encoding a given dataset of images simply combines the snippets above.

Save the code in a file named encode_faces.py and run it from the command line:

python encode_faces.py --dataset dataset --encodings encodings.pickle

The images should be in a separate folder for every person inside the dataset folder.

We now have a file named encodings.pickle, which contains the 128-dimensional face embedding for every image in our dataset.

Working with a GPU speeds up encoding and detection dramatically, but we can still perform this operation on a standard CPU.

Recognizing faces in images

After encoding the faces using encode_faces.py and storing the 128-dimensional face encodings, we are now ready to recognize faces in an image using OpenCV, Python, and deep learning.

First we need to import the required dependencies.

import face_recognition
import argparse
import pickle
import cv2

Then, just like with the encoding script, we need to create an argument parser in order to parse the command line arguments.

ap = argparse.ArgumentParser()
ap.add_argument("-e", "--encodings", required=True,
help="path to serialized db of facial encodings")
ap.add_argument("-i", "--image", required=True,
help="path to input image")
ap.add_argument("-d", "--detection-method", type=str, default="cnn",
help="face detection model to use: either `hog` or `cnn`")
args = vars(ap.parse_args())

The command line arguments along with their brief description are given below:

--encodings : The path to the pickle file containing our face encodings.

--image : The image in which we will perform face recognition.

--detection-method : We can use either ‘hog’ or ‘cnn’. The ‘hog’ detection method is faster but less accurate, while ‘cnn’ is slower but more accurate. ‘hog’ is a good choice on slower machines. The default detection method is ‘cnn’.

We now need to load the previously computed face encodings.

data = pickle.loads(open(args["encodings"],"rb").read())

The path to the encodings file was passed through the command line; we access it through the args dictionary and load the encodings into the variable data.

Then, we need to load the input image and convert it from BGR to RGB. OpenCV represents images in the BGR color space, whereas dlib expects RGB. Since we are using dlib, we have to convert the image from BGR to RGB.

image = cv2.imread(args["image"])
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

Now, we’ll detect a bounding box corresponding to each face in the input image, then compute the facial embedding for each face.

boxes = face_recognition.face_locations(rgb, model=args["detection_method"])
encodings = face_recognition.face_encodings(rgb, boxes)
# initialize the list of names for each face detected
names = []

Now, we’ll loop over the encodings of all the detected faces and compare them with our known encodings dataset (contained in data[“encodings”]) using face_recognition.compare_faces().

This function returns a list of True/False values, one for each known encoding in our dataset.

Head over to the compare_faces implementation for more details.

We’ll use the matches list to compute the number of “votes” for each name, tally up the votes, and select the person’s name with the most corresponding votes.

for encoding in encodings:
    # attempt to match each face in the input image to our known
    # encodings
    matches = face_recognition.compare_faces(data["encodings"], encoding)
    name = "Unknown"
    # check to see if we have found a match
    if True in matches:
        # find the indexes of all matched faces then initialize a
        # dictionary to count the total number of times each face
        # was matched
        matchedIdxs = [i for (i, b) in enumerate(matches) if b]
        counts = {}
        # loop over the matched indexes and maintain a count for
        # each recognized face
        for i in matchedIdxs:
            name = data["names"][i]
            counts[name] = counts.get(name, 0) + 1
        # determine the recognized face with the largest number of
        # votes (note: in the event of an unlikely tie Python will
        # select the first entry in the dictionary)
        name = max(counts, key=counts.get)
    # update the list of names
    names.append(name)
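
The voting step can be illustrated with a small self-contained example; the True/False matches and the names below are made up for illustration:

```python
# one boolean per known encoding, plus the corresponding known names
matches = [True, False, True, True, False]
knownNames = ["alice", "bob", "alice", "bob", "bob"]

# indexes where compare_faces reported a match
matchedIdxs = [i for (i, b) in enumerate(matches) if b]

# tally one vote per matched index
counts = {}
for i in matchedIdxs:
    name = knownNames[i]
    counts[name] = counts.get(name, 0) + 1

# the name with the most votes wins
print(max(counts, key=counts.get))  # alice (2 votes vs. 1 for bob)
```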

Then we’ll loop over the bounding boxes and labeled names for each person and draw them on our output image.

# loop over the recognized faces
for ((top, right, bottom, left), name) in zip(boxes, names):
    # draw the predicted face name on the image
    cv2.rectangle(image, (left, top), (right, bottom), (0, 255, 0), 2)
    y = top - 15 if top - 15 > 15 else top + 15
    cv2.putText(image, name, (left, y), cv2.FONT_HERSHEY_SIMPLEX,
        0.75, (0, 255, 0), 2)
# show the output image and wait for a keypress
cv2.imshow("Image", image)
cv2.waitKey(0)

To run the face recognition Python script, enter the following command in the command line:

python recognize_faces_image.py --encodings encodings.pickle --image example.png

The given command line arguments assume that both the encodings.pickle and example.png files are stored in the same directory as recognize_faces_image.py; otherwise, adjust the paths accordingly.

The complete implementation of recognize_faces_image.py simply combines the snippets above.


In this tutorial, we learned how to perform face recognition with OpenCV, Python and deep learning.

A big shout-out to pyimagesearch, which has been pivotal for the development of these scripts. I have used various techniques and tricks learned from the website.

I will also be posting soon on how to use the face recognizer to recognize faces in real time, from a camera or from a pre-saved video file.