This project defines and trains a convolutional neural network to perform facial keypoint detection, and uses computer vision techniques to transform images of faces.
The model takes an input image and first transforms it: the image is converted to grayscale, rescaled to 224x224, and its values are normalized to the range [0, 1].
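The preprocessing steps above can be sketched as follows. This is a NumPy-only illustration, not the project's own code: the project uses OpenCV, where `cv2.cvtColor` and `cv2.resize` would be the natural choice. The function name `preprocess`, the nearest-neighbour resize, the luma weights, and the added channel dimension are all assumptions made for the example.

```python
import numpy as np


def preprocess(image_rgb, size=224):
    """Hypothetical helper: grayscale, resize to size x size, scale to [0, 1]."""
    # Weighted sum of the R, G, B channels (BT.601 luma weights, an assumption)
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = image_rgb[..., :3].astype(np.float32) @ weights

    # Nearest-neighbour resize by picking source rows and columns
    h, w = gray.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = gray[rows][:, cols]

    # Scale 8-bit intensities to [0, 1]; clip guards against float rounding
    normalized = np.clip(resized / 255.0, 0.0, 1.0)

    # Add a leading channel dimension: (1, size, size)
    return normalized[np.newaxis, ...]


# Random stand-in for a 300x400 RGB photo
demo = np.random.default_rng(0).integers(0, 256, size=(300, 400, 3), dtype=np.uint8)
out = preprocess(demo)
print(out.shape, float(out.min()), float(out.max()))
```

The leading channel dimension matches a network whose first convolution expects a single grayscale channel.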
The image is then fed to the network, which predicts the facial keypoints of the input image with an error of ~0.4% (previously ~2%).
The output has shape (136, 1), which we then reshape to (68, 2): a pair of values (x, y) for each of the 68 facial keypoints.
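As a small illustration of that reshape, the sketch below uses a dummy array in place of a real prediction. It assumes the 136 values are laid out as x0, y0, x1, y1, and so on, which the text does not state explicitly.

```python
import numpy as np

# Stand-in for the network output: 136 values in a column, shape (136, 1)
# Assumed layout: x0, y0, x1, y1, ..., x67, y67
output = np.arange(136, dtype=np.float32).reshape(136, 1)

# One row per keypoint, columns are (x, y)
keypoints = output.reshape(68, 2)

xs, ys = keypoints[:, 0], keypoints[:, 1]
print(keypoints.shape)
```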
Finally, we plot the output points (x, y) on the original input image to show the facial keypoints.
We should be able to get the error below 2%:
- Try splitting the given test set into a validation set and a test set in order to get better results.
- Try training the model for more than 10 epochs.
- Try EarlyStopping to protect the model from overfitting the data.
- Try different kinds of pretrained networks such as AlexNet, ResNet, etc.
- Try adding more convolutional layers to make the model more complex. (Not needed)
- Try getting a larger dataset. (Not needed)
With these methods we can get the error close to ~0.5%.
✅ The error has now decreased to ~0.004 on both the training and validation sets.
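To make the early stopping idea from the list above concrete, here is a minimal sketch. It is not the EarlyStopping class credited at the end of this README. The class name `EarlyStopper`, its `patience` and `min_delta` parameters, and the validation losses are all made up for the illustration.

```python
class EarlyStopper:
    """Hypothetical helper: stop when validation loss stops improving."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience    # epochs to wait without improvement
        self.min_delta = min_delta  # smallest decrease that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses: improvement stalls after the third epoch
val_losses = [0.9, 0.5, 0.4, 0.41, 0.42, 0.43, 0.2]

stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(loss):
        stopped_at = epoch
        break

print(stopped_at, stopper.best)
```

In a real training loop, `step` would be called once per epoch with the loss measured on the validation set, which is one reason to split a validation set off from the test data.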
This project uses OpenCV and PyTorch. To install these libraries:
pip install opencv-python
pip3 install torch torchvision
The network now uses a pretrained ResNet-18 instead of the previous architecture:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.init as I
import torchvision.models as models


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Start from a ResNet-18 with pretrained weights
        # (newer torchvision versions prefer the `weights=` argument)
        self.resnet18 = models.resnet18(pretrained=True)
        # Replace the first convolution so the network accepts
        # 1-channel grayscale images instead of 3-channel color images
        self.resnet18.conv1 = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2),
                                        padding=(3, 3), bias=False)
        # Replace the final layer with 136 outputs: 68 keypoints x (x, y)
        n_inputs = self.resnet18.fc.in_features
        self.resnet18.fc = nn.Linear(n_inputs, 136)

    def forward(self, x):
        x = self.resnet18(x)
        return x
import torch.optim as optim

# Move the loss to the GPU only when training on CUDA
# (`device` is assumed to be the string 'cuda' or 'cpu')
criterion = nn.SmoothL1Loss().cuda() if device == 'cuda' else nn.SmoothL1Loss()
# Optimize only the parameters that require gradients (frozen layers are skipped)
optimizer = optim.Adam(filter(lambda p: p.requires_grad, net.parameters()), lr=0.001)
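The sketch below shows how this loss and optimizer setup behaves in a single training step. It uses a tiny stand-in network with one frozen layer rather than the ResNet above. The layer sizes, random data, and the choice of which layer to freeze are assumptions made only to demonstrate the `requires_grad` filter.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Tiny stand-in network: 8 input features -> 136 outputs
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 136))

# Freeze the first layer so the optimizer filter has something to skip
for p in net[0].parameters():
    p.requires_grad = False

criterion = nn.SmoothL1Loss()
optimizer = optim.Adam(filter(lambda p: p.requires_grad, net.parameters()), lr=0.001)

# Random stand-ins for a batch of inputs and target keypoints
inputs = torch.randn(4, 8)
targets = torch.randn(4, 136)

frozen_before = net[0].weight.clone()
head_before = net[2].weight.clone()

# One training step
optimizer.zero_grad()
loss = criterion(net(inputs), targets)
loss.backward()
optimizer.step()

print(loss.item())
```

After the step, the frozen layer's weights are unchanged while the last layer's weights have moved, so only the trainable parameters were updated.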
- Ahmed Abd-Elbakey Ghonem - Github
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
- Hat tip to @stefanonardo whose EarlyStopping class code was used