Anatoly Medvedev edited this page Sep 8, 2022 · 1 revision

1. Face Detection

Face detection on raw images using MTCNN, running on the GPU when one is available.

import cv2
import torch
from PIL import Image, ImageDraw

from neuroface import MTCNN

# Use the GPU if available, otherwise fall back to the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Initialize MTCNN on the selected device.
mtcnn = MTCNN(keep_all=True, device=device).eval()

# Load image and convert color space from BGR (OpenCV default) to RGB.
image = cv2.imread(<select image>)
image = Image.fromarray(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

# Detect face boxes on image.
boxes, _ = mtcnn.detect(image)

# Draw detected faces on image.
draw = ImageDraw.Draw(image)
for box in boxes:
    draw.rectangle(box.tolist(), outline=(255, 0, 0), width=6)
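
Before drawing or cropping, it can help to sanitize the detected boxes. The following is a minimal sketch, not part of NeuroFace: `sanitize_boxes` is a hypothetical helper, and it assumes (unverified here) that `detect` may return `None` when no face is found and that boxes are `[x1, y1, x2, y2]` float coordinates that can fall slightly outside the image.

```python
def sanitize_boxes(boxes, width, height):
    """Round boxes to integer pixels and clip them to the image bounds.

    Hypothetical helper. Assumes each box is [x1, y1, x2, y2] and that
    `boxes` may be None when nothing was detected.
    """
    if boxes is None:
        return []
    cleaned = []
    for x1, y1, x2, y2 in boxes:
        # Round to the nearest pixel, then clamp to [0, width] / [0, height].
        x1 = min(max(int(round(x1)), 0), width)
        y1 = min(max(int(round(y1)), 0), height)
        x2 = min(max(int(round(x2)), 0), width)
        y2 = min(max(int(round(y2)), 0), height)
        # Drop boxes that collapse to zero area after clipping.
        if x2 > x1 and y2 > y1:
            cleaned.append((x1, y1, x2, y2))
    return cleaned


print(sanitize_boxes(None, 100, 80))
print(sanitize_boxes([[-5.4, 10.2, 120.7, 60.4]], 100, 80))
```

Each returned tuple can be passed to `draw.rectangle` in place of `box.tolist()`, and an empty list simply skips the drawing loop.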

Face detection and extraction using the MediaPipe Face Detection implementation.

import cv2

from neuroface import FaceDetection

# Initialize FaceDetection.
model = FaceDetection()

# Load image.
image = cv2.imread(<select image>)

# Detect and extract faces.
face_batch = model.extract(image)

2. Face Comparison

Face recognition models represent a face as a multi-dimensional vector (an embedding). NeuroFace provides direct access to face vectorization through InceptionResnetV1 pretrained on the VGGFace2 dataset.

import torch
import torchvision.io as io

from neuroface import MTCNN, InceptionResnetV1
from neuroface.face.comparison.distance import distance

# Initialize GPU device if available.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Initialize MTCNN and InceptionResnetV1 on GPU device.
mtcnn = MTCNN(keep_all=True, device=device).eval()
resnet = InceptionResnetV1(pretrained='vggface2', device=device).eval()

# Load image and rearrange dimensions (C H W --> H W C).
image = io.read_image(<select image>).to(device).permute(1, 2, 0)

# Detect faces in the image.
face = mtcnn(image)

Calculate the distance between the obtained embeddings. The distance_metric argument selects the metric:

  • 0 to select Euclidean distance:

$$d(p, q)=\sqrt{\sum_{i=1}^{n} (p_i-q_i)^2}.$$

  • 1 to select Euclidean distance with L2 normalization:

$$|p|=\sqrt{\sum_{i=1}^{n} |p_i|^2}, \quad |q|=\sqrt{\sum_{i=1}^{n} |q_i|^2},$$

$$d(p, q)=\sqrt{\sum_{i=1}^{n} \left(\frac{p_i}{|p|}-\frac{q_i}{|q|}\right)^2}.$$

  • 2 to select cosine similarity:

$$d(p, q)=\frac{\sum_{i=1}^{n} p_{i} q_{i}}{\sqrt{\sum_{i=1}^{n} p_{i}^2} \sqrt{\sum_{i=1}^{n} q_{i}^2}}.$$

  • 3 to select Manhattan distance:

$$d(p, q)=\sum_{i=1}^{n} |p_i-q_i|.$$
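
The four formulas above can be sketched in plain Python. This is an illustrative reference only, not the NeuroFace `distance` implementation; the function names are hypothetical, and the inputs are assumed to be non-zero vectors of equal length.

```python
import math


def euclidean(p, q):
    """Metric 0: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def l2_normalize(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean_l2(p, q):
    """Metric 1: Euclidean distance between the L2-normalized vectors."""
    return euclidean(l2_normalize(p), l2_normalize(q))


def cosine_similarity(p, q):
    """Metric 2: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def manhattan(p, q):
    """Metric 3: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


p = [3.0, 4.0]
q = [4.0, 3.0]
print(euclidean(p, q))          # sqrt(2)
print(euclidean_l2(p, q))       # sqrt(0.08)
print(cosine_similarity(p, q))  # 24 / 25
print(manhattan(p, q))          # 2.0
```

Writing the metrics out this way makes the difference between metrics 0 and 1 explicit: metric 1 compares directions only, so rescaling either vector leaves it unchanged.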

# Rearrange dimensions (B H W C --> B C H W) and build face embeddings.
embedding = resnet(face.permute(0, 3, 1, 2))

# Calculate distance between embeddings.
print(distance(<select embedding>, <select embedding>, distance_metric=<select metric>))

Example distances for each metric:

Euclidean Distance                         0.0   1.9461    0.5072
Euclidean Distance with L2 Normalization   0.0   1.9461    0.5072
Cosine Similarity                          0.0   0.4914    0.2318
Manhattan Distance                         0.0  25.2447   12.7753
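
The first two rows of the output above are identical. One possible explanation, stated here as an assumption rather than a confirmed property of NeuroFace, is that the embeddings are already unit length, in which case L2 normalization changes nothing. The sketch below checks that property on hypothetical vectors using only the standard library.

```python
import math


def l2_normalize(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean(p, q):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical embeddings that are already unit length.
p = l2_normalize([1.0, 2.0, 2.0])
q = l2_normalize([2.0, 0.0, 1.0])

plain = euclidean(p, q)
normalized = euclidean(l2_normalize(p), l2_normalize(q))

# Normalizing a unit-length vector again leaves it unchanged,
# so both distances agree up to floating-point rounding.
print(math.isclose(plain, normalized))  # True
```

For vectors that are not unit length the two metrics generally differ, so matching rows are only expected under this assumption.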

3. Facial Landmark Detection

import cv2

from neuroface import FaceMesh

# Initialize FaceMesh.
model = FaceMesh(static_image_mode=True, max_num_faces=1)

# Load image.
image = cv2.imread(<select image>)

# Detect facial landmarks.
face_array = model.detect(image)

4. Pose Landmark Detection

In progress.

5. Facial Expression Recognition

In progress.