Source: Deep Learning on Medium
‘facelib’: a modular framework for face recognition pipelines in images and videos
Do you want to perform face recognition in videos and images with a choice of detectors, embedding models and matching methods? facelib is a software solution built for exactly that. It also comes with a tracking method that improves performance in video and live-stream face recognition.
What is facelib?
The project tackles face recognition by solving each of the following problems individually: face detection, face alignment, face embedding, and face matching (recognition). The code reflects this academic partition, so each step can be carried out independently with whatever framework or programming language the user prefers and then integrated into the pipeline. The project also works with video or live streams and implements object tracking to get better results and to reduce the computation cost when dealing with high frame rates.
The project aims to provide a software solution that lets developers quickly customize a face recognition pipeline to fit their needs. It also helps researchers evaluate models developed for a specific task (such as face embedding) with different detection and alignment methods. The project has other use cases as well.
The code is organized and documented so that developers can read it and make adjustments relatively easily. The docs are built using Sphinx, so the user can search for any documentation they need.
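To make the four-stage partition concrete, here is a minimal sketch of a pipeline whose stages are swappable functions. It is not facelib's actual API: the `Pipeline` class, its field names and the toy stages are all hypothetical, and a nested list of pixel values stands in for a real image.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height)
Image = List[List[int]]           # stand-in for a real image array


@dataclass
class Pipeline:
    """Hypothetical pipeline: each stage is an independent, replaceable callable."""
    detect: Callable[[Image], List[Box]]
    align: Callable[[Image, Box], Image]
    embed: Callable[[Image], Sequence[float]]
    match: Callable[[Sequence[float]], str]

    def recognize(self, image: Image) -> List[Tuple[Box, str]]:
        results = []
        for box in self.detect(image):
            face = self.align(image, box)       # crop / normalize the face
            vector = self.embed(face)           # face -> embedding vector
            results.append((box, self.match(vector)))
        return results


# Toy stages, only there to show how the pieces plug together.
def toy_detect(image: Image) -> List[Box]:
    return [(0, 0, len(image[0]), len(image))]  # pretend the whole image is one face


def toy_align(image: Image, box: Box) -> Image:
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def toy_embed(face: Image) -> Sequence[float]:
    pixels = [p for row in face for p in row]
    return [sum(pixels) / len(pixels)]          # one-dimensional "embedding"


def toy_match(vector: Sequence[float]) -> str:
    return "bright" if vector[0] > 127 else "dark"


pipeline = Pipeline(toy_detect, toy_align, toy_embed, toy_match)
```

Swapping a detector or an embedding model then means passing a different function for that one stage, while the rest of the pipeline stays the same.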
The code architecture consists of independent units that perform independent tasks. The main block is the class FaceCore, which handles the pipeline operation but can also be used for single tasks such as detection or embedding. FaceCore is built to keep the models in memory in order to avoid loading overhead, which boosts the performance of facelib in applications. Each class also has the option to “clean” the model it loaded, so at any time you can use the clean option to free up memory.
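The idea of keeping a model in memory until it is explicitly cleaned can be sketched roughly as follows. This is an illustration of the pattern only: `CachedModel`, `get`, `clean` and `load_count` are hypothetical names, not facelib's real interface.

```python
class CachedModel:
    """Hypothetical wrapper: load a model on first use, keep it until clean()."""

    def __init__(self, loader):
        self._loader = loader    # function that performs the expensive load
        self._model = None
        self.load_count = 0      # only here to make the caching visible

    def get(self):
        # Load once, then reuse the in-memory model on every later call.
        if self._model is None:
            self._model = self._loader()
            self.load_count += 1
        return self._model

    def clean(self):
        # Drop the reference so the memory can be reclaimed.
        self._model = None
```

Calling `get()` repeatedly pays the loading cost once; after `clean()`, the next `get()` loads the model again.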
The FaceCore class can perform video face recognition and tracking using the function “process stream”, which can handle both live streams and video. It ignores small faces that appear in the video. Once a face is bigger than a threshold, it runs the recognition pipeline on that face for N frames (N is a variable that you can control), accumulates the decisions made about the identity of the face, and then makes a final decision. Once a final decision is made, no more recognition is performed on that face and the function only continues to track it.
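The per-face logic described above can be sketched as a small state machine. This is a simplified illustration under several assumptions, not the real “process stream” function: the tracker is assumed to have already assigned a `track_id` to each face, `min_size` and `n_frames` are hypothetical parameter names for the size threshold and N, and the final decision here is a simple majority vote.

```python
from collections import Counter


def process_stream(frames, recognize, min_size=40, n_frames=3):
    """Sketch of gated recognition on tracked faces.

    frames: iterable of per-frame lists of (track_id, face_size, face) tuples.
    recognize: function mapping a face to an identity label.
    Returns {track_id: final identity, or None if never decided}.
    """
    votes = {}       # track_id -> identities seen so far
    identity = {}    # track_id -> final decision (None until decided)
    for detections in frames:
        for track_id, size, face in detections:
            identity.setdefault(track_id, None)
            if identity[track_id] is not None:
                continue                      # already decided: track only
            if size < min_size:
                continue                      # face too small: ignore for now
            seen = votes.setdefault(track_id, [])
            seen.append(recognize(face))      # the expensive step
            if len(seen) == n_frames:
                identity[track_id] = Counter(seen).most_common(1)[0][0]
    return identity
```

The point of the design is that `recognize` runs at most `n_frames` times per tracked face, no matter how long the face stays in the video.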
The decision accumulation can be done in two ways, voting and feature fusion, which were found to have similar performance. In voting, each of the first N frames has one vote on the final decision. In feature fusion, we take the mean of the N embedding vectors extracted over the N frames for a given face and then make a single final match using the new embedding vector. Accumulating decisions this way is important for reducing the computation cost and for making face recognition more reliable.
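Here is a minimal sketch of the two accumulation strategies side by side. It assumes that matching means picking the gallery identity with the highest cosine similarity, which is a common choice but not necessarily what facelib does; the function names and the `gallery` dictionary are hypothetical, and embeddings are assumed to be non-zero vectors.

```python
import math
from collections import Counter


def cosine(a, b):
    # Cosine similarity between two non-zero vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match(embedding, gallery):
    # gallery: {identity name: reference embedding}
    return max(gallery, key=lambda name: cosine(embedding, gallery[name]))


def decide_by_voting(embeddings, gallery):
    # Match every frame on its own, then take the most frequent identity.
    votes = Counter(match(e, gallery) for e in embeddings)
    return votes.most_common(1)[0][0]


def decide_by_fusion(embeddings, gallery):
    # Average the N embeddings element-wise, then match once.
    n = len(embeddings)
    mean = [sum(column) / n for column in zip(*embeddings)]
    return match(mean, gallery)
```

Voting runs N matches and counts the results, while fusion runs a single match on the averaged vector, so fusion saves some matching work when the gallery is large.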
A more technical explanation is coming soon, so stay tuned and have a good day.