
Simple Face Recognition Project Using OpenCV, Python, and Deep Learning

Okay, not completely from scratch, but in this article you are going to learn to build a simple face detection and recognition console-based application using OpenCV, Python, and deep learning.

Before Starting:

  • If you don't have enough time to read the whole article (or you'd rather skip straight to the code), scroll all the way down: the source code is under the last heading, Resources.
  • If you prefer to learn step by step, there are lots of comments inside the code. I highly recommend reading through them.
  • And finally, don't panic :D
Let's start:

Installing Libraries:

We need three libraries: dlib, face_recognition, and imutils. imutils provides convenience functions that make basic image processing with OpenCV easier, such as translation, rotation, resizing, skeletonization, and displaying Matplotlib images, but here we will use it mainly to handle directories and paths. I won't talk too much about these libraries, because their creators have written detailed articles about them (links are in the References). The dlib library by Davis King contains a "deep metric learning" model. Using it, a 128-d embedding will be created for each face in the dataset images and later used for face recognition.

Before Diving right into Coding

  • Because a lot of the work needs a GPU, we will be using Google Colab. If you want to learn more about Google Colab, I will be writing about it soon, and you can also learn from here.
  • If you have a GPU-capable local machine and want to work on it instead, please install dlib with GPU support, and I highly recommend using a virtual environment.

Now the wait is over, Let's Get Started

Open up a new tab in your browser and search for Google Colab.
  • From File (top left corner), open a new Python 3 notebook.
Let's start by installing the requirements.
First we need to install dlib. In the first cell, type:

!pip install dlib and hit shift + enter

Now we need to install face_recognition module

!pip install face_recognition and hit shift + enter

Let us now install imutils.
!pip install imutils

Please note that some of these requirements may already be satisfied in your Colab notebook.

Now you need a face dataset. But don't worry, I have created one for you. This dataset consists of 90 images in total, of 7 of my college friends, hosted publicly on GitHub, so let's clone it!
!git clone https://github.com/puri-gagan/dataset.git

Once the environment is ready, we have to train the model using this dataset (that is, encode the faces in the dataset images into 128-d embeddings), and the output of training should be dumped (saved) to a file in .pickle format. After training we'll need some input images to recognize, i.e., to figure out whose face is in each image. For this I have maintained a directory 'colabrequiredfiles', which is also hosted on GitHub, so let's clone that as well. (This might sound weird and confusing, but worry not, you will understand all of it as we go forward.)
Let's clone those images now.
!git clone https://github.com/puri-gagan/colabrequiredfiles.git

All set... let's do some real coding now. In a new cell, import the necessary libraries:
from imutils import paths
import face_recognition
import pickle
import cv2
import os

Now grab the paths to the image files in our dataset into the variable imgPaths:
# grab the paths to the input images in our dataset
print("Initializing ...")
imgPaths = list(paths.list_images("dataset/dataset"))

Now let's initialize two list variables, knownEncodings[] and knownNames[]. These will hold the 128-d encoding of each face in an image and the corresponding name of the directory (the person's name) containing that image. (Note: name each folder after the person pictured in it, and make a separate folder for each person; please see the dataset structure. For example, the 11 images containing the face of 'Gagan Bhattrai' are placed in the folder named 'Gagan_bhattrai'. This name will be shown on the input image when the person is recognized.)
knownEncodings = []
knownNames = []

It's time to encode each face in the image to 128-d embedding for all images in the dataset. So let's loop over the images.
# loop over all the images (90 iterations for the 90 images in the dataset)
for (i, imgPath) in enumerate(imgPaths):
    # extract the person name from the image path
    print("processing image {}/{}".format(i + 1, len(imgPaths)))
    name = imgPath.split(os.path.sep)[-2]

    # load the image at the current imgPath
    image = cv2.imread(imgPath)
    # dlib expects images in RGB channel order,
    # but OpenCV's imread reads images in BGR,
    # so we convert the image from BGR to RGB
    # and please note that rgb is an image-type variable
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # detects the faces in the image (i.e. rgb) and returns each face's
    # bounding-box coordinates; only the face regions are kept in boxes so that
    # encodings of only the faces are computed by face_encodings from face_recognition
    boxes = face_recognition.face_locations(rgb, model="cnn")
    encodings = face_recognition.face_encodings(rgb, boxes)

    # encoding is done for each face in the image 'rgb'; you can visualize the encoding data using print(encodings)
    # now for all encodings of the faces (a 128-d embedding of one face is one encoding;
    # there can be multiple faces in an image, so different encodings for different faces)
    # we append the encodings and names to the lists initialized earlier
    for encoding in encodings:
        # add each encoding + name to our set of known names and encodings
        knownEncodings.append(encoding)
        knownNames.append(name)
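The name-extraction line near the top of the loop is worth a closer look. Here is a toy illustration (with a made-up file name) of how splitting the path on os.path.sep recovers the person's name from the folder one level above the image file:

```python
import os

# a hypothetical path following the dataset layout described above:
# dataset/dataset/<person_name>/<image file>
img_path = os.path.join("dataset", "dataset", "Gagan_bhattrai", "00001.jpg")

# the folder one level above the file is the person's label
name = img_path.split(os.path.sep)[-2]
print(name)  # Gagan_bhattrai
```

This is exactly why the folder names in the dataset must match the people's names.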

Lots of frustrating # comments, right? But please read them to be clear on every step :)
Finally, the encoding and its corresponding name are appended to the lists for every image. Now we will store the encoded data in a .pickle file.

# save the facial encodings + names to disk
print("saving the encodings to colabrequiredfiles/encodings.pickle ...")
data = {"encodings": knownEncodings, "names": knownNames}
f = open("colabrequiredfiles/encodings.pickle", "wb")
f.write(pickle.dumps(data))
f.close()
print("Done!!")

The first part is done! Please keep all the code from the imports through print("Done!!") in one cell of your Google Colab notebook and hit Shift + Enter. If you had done this on a CPU-only local machine, it could take several minutes to finish the encoding, which is why we are using Google Colab's GPU-backed cloud platform. You can see encodings.pickle by clicking the arrow-like icon on the left and opening the Files tab. You can also download and upload files and use them in Google Colab.
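To see what encodings.pickle actually holds, here is a minimal round-trip of the same structure, with toy values in place of real 128-d embeddings (the temp-file path is only for illustration):

```python
import os
import pickle
import tempfile

# toy stand-in for the dictionary saved above: a list of embeddings
# and a parallel list of person names
data = {"encodings": [[0.1, 0.2, 0.3]], "names": ["Gagan_bhattrai"]}

# dump it to a .pickle file, exactly as in the training cell
path = os.path.join(tempfile.gettempdir(), "encodings_demo.pickle")
with open(path, "wb") as f:
    f.write(pickle.dumps(data))

# load it back, exactly as the second part will do
loaded = pickle.loads(open(path, "rb").read())
print(loaded["names"])  # ['Gagan_bhattrai']
```

The second part of the tutorial starts by reading this file back with the same pickle.loads call.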

 

Second Part:

The second part is to work with the encoded data so that a specific person's face can be recognized. This part is a bit more involved, as we have to build logic from numbers like 0.10356407 stored in the .pickle file. Such pickled data :D But working with it is a piece of cake, as the data carries real meaning.
Without further ado, let's get started.
In a new cell, import the necessary packages:
import face_recognition
import pickle
import cv2
# this package is used to show images in a Google Colab notebook
from google.colab.patches import cv2_imshow

Let's load the .pickle encoding data and take an input image from the colabrequiredfiles/ directory, with a few more things explained in the # comments:

#open the stored encodings and names in previous code and store in data 
data = pickle.loads(open("colabrequiredfiles/encodings.pickle", "rb").read())
# load the input image and convert it from BGR to RGB as done earlier
image = cv2.imread("colabrequiredfiles/bikram.jpg")
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
 
# detects the faces in the image (i.e. rgb) and returns each face's
# bounding-box coordinates; only the face regions are kept in boxes so that
# encodings of only the faces are computed by face_encodings from face_recognition
print("recognizing faces...")
boxes = face_recognition.face_locations(rgb, model="cnn")
#encode detected face from input image to 128-d embedding
encodings = face_recognition.face_encodings(rgb, boxes)
 
# initialize the list of names which will hold the names for each face detected
names = []

It's time to loop over the encodings of each face in the input image and check whether they match the encodings stored during the dataset training.
for encoding in encodings:
    # check whether each face encoding in the input image matches our known
    # encodings, using the compare_faces method from the face_recognition package.
    # this returns a list of True/False values indicating which known face
    # encodings match the input encoding. since our dataset yielded 94 faces,
    # and hence 94 encodings of 128-d embeddings each, matches will be a list
    # of 94 True/False values.
    # when True and when False? compare_faces computes the Euclidean distance
    # between the input embedding and each dataset embedding; the default
    # tolerance is 0.6, so a distance below that gives True and a greater
    # distance gives False
    matches = face_recognition.compare_faces(data["encodings"], encoding)
    name = "Unknown"
    # check to see if we have found a match
    if True in matches:
        # returns a list of index values that has value True in the matches list
        matchedIdxs = [i for (i, b) in enumerate(matches) if b]
        # dictionary which will have name as key and vote or count as value
        counts = {}
        # each i from matchedIdxs indexes into the list stored under the 'names' key of the dictionary variable 'data'
        for i in matchedIdxs:
            name = data["names"][i]
            counts[name] = counts.get(name, 0) + 1    
        # determine the recognized face with the largest number of
        # votes (note: in the event of an unlikely tie Python will
        # select first entry in the dictionary)
        name = max(counts, key=counts.get)
    # update the list of names
    names.append(name)
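The tolerance rule described in the comments can be sketched in plain Python. This is only a toy illustration of the distance check that compare_faces performs internally, using made-up 3-d vectors instead of real 128-d embeddings:

```python
import math

# a face matches when the Euclidean distance between embeddings
# is at most the tolerance (0.6 by default in face_recognition)
def is_match(known, candidate, tolerance=0.6):
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(known, candidate)))
    return dist <= tolerance

known = [0.1, 0.2, 0.3]     # toy "embedding" from the dataset
close = [0.12, 0.19, 0.31]  # nearly identical face
far = [0.9, -0.5, 0.7]      # very different face

print(is_match(known, close))  # True
print(is_match(known, far))    # False
```

Lowering the tolerance makes the recognizer stricter (fewer false matches, more "Unknown" results); raising it does the opposite.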

Yeah, I got you! That logic is a little hard to understand, right? Don't worry; as I already told you, it's a piece of cake once you visualize the data. I have prepared a visualization file for these steps. Read the # comments carefully and follow along with the visualized data from the link here.
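To make the voting concrete, here is a toy walk-through of the loop above with made-up names and match results:

```python
# suppose compare_faces returned these matches against a 4-encoding dataset
matches = [True, True, False, True]
dataset_names = ["Gagan", "Bikram", "Bikram", "Gagan"]

# indexes of the True entries, exactly as in the loop above
matchedIdxs = [i for (i, b) in enumerate(matches) if b]

# count one vote per matched encoding for the person it belongs to
counts = {}
for i in matchedIdxs:
    name = dataset_names[i]
    counts[name] = counts.get(name, 0) + 1

print(counts)                       # {'Gagan': 2, 'Bikram': 1}
print(max(counts, key=counts.get))  # Gagan
```

'Gagan' wins with 2 votes to 1, so that is the name attached to this face.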
Now it's time to draw a box with the recognized name around each detected face.
for ((top, right, bottom, left), name) in zip(boxes, names):
    # draw the predicted face name on the image
    cv2.rectangle(image, (left, top), (right, bottom), (0, 255, 0), 2)
    y = top - 15 if top - 15 > 15 else top + 15
    cv2.putText(image, name, (left, y), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 255, 0), 2)

cv2_imshow(image)
cv2.waitKey(0)
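One small detail in the drawing loop: the line y = top - 15 if top - 15 > 15 else top + 15 places the name just above the box when there is room, and just below the box's top edge when the face is near the top of the image. A standalone sketch of that rule:

```python
# label-position rule from the drawing loop above
def label_y(top):
    return top - 15 if top - 15 > 15 else top + 15

print(label_y(100))  # 85: enough room, draw the name above the box
print(label_y(10))   # 25: box is near the image edge, draw inside it
```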

Please keep all the code from the imports through cv2.waitKey(0) in one cell of your Google Colab notebook.
Taadaaa!!!! Congratulations, you completed it! See the output with Shift + Enter.


References:

Resources:
