A Project Report On Facial Expression Recognition Using Deep Neural Networks
ABSTRACT
Automated Facial Expression Recognition (FER) has remained a challenging and interesting problem. Despite the efforts made in developing various methods for FER, existing approaches traditionally lack generalizability when applied to unseen images or to images captured in the wild. Most existing approaches are based on engineered features (e.g. HOG, LBPH, and Gabor), where the classifier's hyperparameters are tuned to give the best recognition accuracy on a single database, or a small collection of similar databases.
As a result, they do not perform well when applied to novel data. This paper proposes a deep neural network architecture to address the FER problem across multiple well-known standard face datasets. Specifically, our network consists of two convolutional layers, each followed by max pooling, and then four Inception layers. The network is a single-component architecture that takes registered facial images as input and classifies them into one of the six basic expressions or the neutral expression.
We conducted comprehensive experiments on seven publicly available facial expression databases, viz. MultiPIE, MMI, CK+, DISFA, FERA, SFEW, and FER2013. The results of the proposed architecture are comparable to or better than state-of-the-art methods, and better than traditional convolutional neural networks, in both accuracy and training time. The paper also examines the network's ability to perform cross-database classification while training on databases that have limited scope and are often specialized for a few expressions (e.g. MultiPIE and FERA).
ACKNOWLEDGEMENT
We take immense pleasure in the successful completion of this project on Facial Expression Recognition Using Deep Neural Networks. We would like to take this opportunity to express our gratitude to Dr. C P S Prakash, Principal of DSCE, for permitting us to utilize all the necessary facilities of the institution.
We are also very grateful to our respected Vice Principal, HOD of Computer Science & Engineering, DSCE, Bangalore, Dr. Ramesh Babu D R, for his support and encouragement.
We are immensely grateful to our respected and learned guide, Dr. Shubha Bhat, Associate Professor, CSE, DSCE, for her valuable help and guidance. We are extremely thankful to her for all the encouragement and guidance she has given us during every stage of the project.
We would like to thank our project coordinator, Dr. Vindhya M, Associate Professor, CSE, DSCE, for her guidance and support.
We are also thankful to all the other faculty and staff members of our department for their kind co-operation and help.
Lastly, we would like to express our deep appreciation towards our classmates and our family for providing us with constant moral support and encouragement.
- SHIVANI DATT [1DS15CS099]
- MEENAKSHI BHAT [1DS15CS057]
- SHALINI [1DS15CS093]
- SHWETHA SANJAY SAVALGI [1DS13CS054]
Introduction
Current Human Machine Interaction (HMI) frameworks have yet to achieve the full emotional and social capabilities necessary for rich, effective interaction with people. The goal of FER is to classify the faces in a single image, or a sequence of images, as showing one of the six basic emotions. Traditional machine learning approaches, such as support vector machines and Bayesian classifiers, have been successful at classifying posed facial expressions in controlled environments, but recent studies have shown that these solutions do not have the flexibility to classify images captured in an unconstrained, uncontrolled manner ("in the wild") or when applied to databases for which they were not designed. The poor generalizability of these methods is primarily due to the fact that many approaches are subject- or database-dependent, and are only capable of recognizing exaggerated or limited expressions similar to those in the training database. In addition, obtaining accurate training data is particularly difficult, especially for emotions such as anger or sadness, which are very difficult to replicate accurately.
Recently, due to the ready availability of computational power and increasingly large training databases, the machine learning technique of neural networks has seen a resurgence in popularity. Recent state-of-the-art results have been obtained using neural networks in the fields of visual object recognition, human pose estimation, face verification, and many more. Even in the FER field, results so far have been promising. Unlike traditional machine learning approaches, where features are defined by hand, we often see improvement in visual processing tasks when using neural networks because of the network's ability to extract undefined features from the training database. It is often the case that neural networks trained on large amounts of data are able to extract features that generalize well to scenarios the network has not been trained on.
We explore this idea closely by training our proposed network architecture on a subset of the available training databases, and then performing cross-database experiments, which allow us to accurately judge the network's performance in novel scenarios.
In the FER problem, however, unlike visual object databases such as ImageNet, existing FER databases often have a limited number of subjects, few sample images or videos per expression, or small variation between sets, making neural networks significantly more difficult to train. For example, the FER2013 database (one of the largest recently released FER databases) contains 35,887 images of different subjects, yet only 547 of the images portray disgust. Similarly, the CMU MultiPIE face database contains around 750,000 images but comprises only 337 different subjects, where 348,000 images portray only a "neutral" emotion and the remaining images portray anger, fear, or sadness.
Problem Statement
Human facial expressions can be classified into 7 basic emotions: happiness, sadness, surprise, fear, anger, disgust, and neutral. Facial emotions are expressed through the activation of specific sets of facial muscles. These sometimes subtle, yet complex, signals in an expression often contain an abundant amount of information about our state of mind. Through facial emotion recognition, we are able to measure the effects that content and services have on users through an easy and low-cost procedure. For example, retailers may use these metrics to evaluate customer interest. Healthcare providers can provide better service by using additional information about a patient's emotional state during treatment. Humans are well-trained in reading the emotions of others; in fact, at just 14 months old, babies can already tell the difference between happy and sad. We designed a deep learning neural network that gives machines the ability to make inferences about our emotional states.
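As an illustration, the following is a minimal sketch, in Keras, of the kind of architecture described in the abstract: two convolution and max-pooling stages followed by four Inception-style modules and a 7-way softmax. All layer sizes here are illustrative assumptions, not the report's exact configuration.

```python
# Sketch of an Inception-based FER network (layer sizes are assumptions).
from tensorflow.keras import layers, Model

def inception_module(x, filters):
    """Parallel 1x1, 3x3, 5x5 convolutions plus pooling, concatenated."""
    b1 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b2 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b2 = layers.Conv2D(filters, 3, padding="same", activation="relu")(b2)
    b3 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(filters, 5, padding="same", activation="relu")(b3)
    b4 = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    b4 = layers.Conv2D(filters, 1, padding="same", activation="relu")(b4)
    return layers.concatenate([b1, b2, b3, b4])

inputs = layers.Input(shape=(48, 48, 1))           # registered grayscale face
x = layers.Conv2D(64, 7, padding="same", activation="relu")(inputs)
x = layers.MaxPooling2D(2)(x)                      # conv + pool stage 1
x = layers.Conv2D(96, 3, padding="same", activation="relu")(x)
x = layers.MaxPooling2D(2)(x)                      # conv + pool stage 2
for f in (32, 32, 64, 64):                         # four Inception modules
    x = inception_module(x, f)
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(7, activation="softmax")(x) # 6 basic emotions + neutral
model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```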
Facial expression recognition is a process performed by computers, which consists of:
- Detection of the face when the user comes into the webcam's frame.
- Extracting facial features from the detected face region by detecting the shape of facial components or describing the texture of the skin in a facial area. This is called facial feature extraction.
- After feature extraction, the computer categorizes the emotional state of the user using the datasets provided during training of the model. A minimal sketch of this pipeline is given after this list.
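The sketch below follows the three steps above, assuming OpenCV's bundled Haar cascade for detection and a hypothetical trained Keras model file ("fer_model.h5") for classification.

```python
# Minimal webcam-frame FER pipeline: detect -> crop/normalize -> classify.
import cv2
import numpy as np
from tensorflow.keras.models import load_model

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
model = load_model("fer_model.h5")  # hypothetical pre-trained FER model

def classify_frame(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Step 1: detect faces in the webcam frame
    for (x, y, w, h) in detector.detectMultiScale(gray, 1.3, 5):
        # Step 2: crop and normalize the face region; the CNN learns
        # its own features rather than using hand-engineered ones
        face = cv2.resize(gray[y:y+h, x:x+w], (48, 48)) / 255.0
        # Step 3: classify the expression with the trained model
        probs = model.predict(face.reshape(1, 48, 48, 1), verbose=0)[0]
        yield (x, y, w, h), EMOTIONS[int(np.argmax(probs))]
```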
Literature Survey
Human Facial Expression Recognition from Static Image using Shape and Appearance Feature
Authors: Naveen Kumar, H N Jagadeesha, S Amith Kjain.
Description:
This paper proposes facial expression recognition using Histogram of Oriented Gradients (HOG) features and a Support Vector Machine (SVM) classifier. The work shows how HOG features can be exploited for facial expression recognition, and how their use makes the performance of the FER system subject-independent.
The accuracy of the work is found to be 92.56% when implemented on the Cohn-Kanade dataset for the six basic expressions. The results indicate that shape features on the face carry more information for emotion modelling than texture and geometric features. Shape features are better than geometric features because a small pose variation degrades the performance of an FER system that relies on geometric features, whereas it does not affect an FER system that relies on HOG features. Detection rates for disgust, fear, and sadness are lower in the proposed work; they could be further improved by combining shape, texture, and geometric features. Optimized cell sizes may be considered for real-time implementation so as to address both detection rate and processing speed. The influence of non-frontal faces on the performance of the FER system could be addressed in future work.
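As a sketch of the approach this paper describes, the following combines scikit-image HOG features with a linear SVM; the cell sizes, 6-class label set, and synthetic stand-in data are assumptions, not the authors' exact configuration.

```python
# HOG feature extraction + linear SVM classification (illustrative sketch).
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def hog_features(gray_face_48x48):
    # Shape information is captured by gradient-orientation histograms
    # computed over small cells and normalized over blocks.
    return hog(gray_face_48x48, orientations=9,
               pixels_per_cell=(8, 8), cells_per_block=(2, 2))

# Synthetic placeholders standing in for registered 48x48 face crops
rng = np.random.default_rng(0)
X_train = rng.random((20, 48, 48))
y_train = rng.integers(0, 6, 20)        # 6 basic expression labels
test_face = rng.random((48, 48))

X = np.array([hog_features(img) for img in X_train])
clf = SVC(kernel="linear").fit(X, y_train)
pred = clf.predict(hog_features(test_face).reshape(1, -1))
```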
Face Detection and Recognition using Viola-Jones algorithm and Fusion of PCA and ANN
Authors: Narayan T. Deshpande, Dr. S. Ravishankar.
Description :
Keywords: face recognition, Principal Component Analysis (PCA), Artificial Neural Network (ANN), Viola-Jones algorithm. The paper presents an efficient approach for face detection and recognition using the Viola-Jones algorithm and a fusion of PCA and ANN techniques. The performance of the proposed method is compared with other existing face recognition methods, and it is observed that better recognition accuracy is achieved with the proposed method. Face detection and recognition play a vital role in a wide range of applications. Since most applications demand a high rate of accuracy in identifying a person, the proposed method can be considered in comparison with the existing methods.
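The following is a minimal sketch of the PCA + ANN fusion stage (the Viola-Jones detector would supply the cropped faces that feed it); the component count, network size, and synthetic stand-in data are assumptions.

```python
# PCA projects face crops onto an eigenface basis; a small MLP ("ANN")
# classifies the low-dimensional projections by subject identity.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
faces = rng.random((50, 48 * 48))   # stand-in for flattened face crops
ids = rng.integers(0, 5, 50)        # subject identity labels

pca = PCA(n_components=30).fit(faces)        # learn the eigenface basis
proj = pca.transform(faces)                  # low-dimensional features
ann = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500).fit(proj, ids)
who = ann.predict(pca.transform(faces[:1]))  # recognize a query face
```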
Facial Expression Recognition
Authors: Neeta Sarode, Prof. Shalini Bhatia.
Description:
Keywords: grayscale images; face; facial expression recognition; lip region extraction; human-computer interaction. Experiments are performed on grayscale image databases: images from the Yale facial image database and the JAFFE database. The JAFFE database consists of grayscale images of Japanese female facial expressions, with 7 expressions (including neutral) posed by 10 people. Each person has 3-4 images of the same expression, so the total number of images in the database comes to 213.
An efficient, local image-based approach for the extraction of intransient facial features and the recognition of four facial expressions was presented. In the face, the eyebrow and mouth corners are used as the main 'anchor' points. The method does not require any manual intervention (such as initial manual assignment of feature points), and, being based on a local approach, it is also able to handle partial occlusions.
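As an illustration of the anchor-point idea, the following sketch locates eyebrow and mouth corners with dlib's 68-point landmark model; this is a stand-in for the paper's own local method, and the landmark model file is assumed to be downloaded separately.

```python
# Locate the 'anchor' points (eyebrow ends and mouth corners) on a face.
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

img = dlib.load_rgb_image("face.jpg")   # hypothetical input image
for rect in detector(img):
    pts = predictor(img, rect)
    mouth_left, mouth_right = pts.part(48), pts.part(54)  # mouth corners
    brow_left, brow_right = pts.part(17), pts.part(26)    # eyebrow ends
```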
Comparison of PCA and LDA Techniques for Face Recognition Feature-Based Extraction with Accuracy Enhancement
Authors: Riddhi A. Vyas, Dr. S. M. Shah.
Description:
Keywords: face recognition, PCA, LDA, eigenvalue, covariance, Euclidean distance, eigenface, scatter matrix. Feature extraction is a tricky phase of the recognition process. To get a better face recognition rate, the correct choice of feature extraction algorithm from the many available is extremely significant and plays a major role in the face recognition process. Before selecting a feature extraction technique, one must understand it and know under which criteria it performs accurately. This comparative analysis identifies which feature extraction technique performs accurately under which criteria.
From the individual conclusions, it is clear that LDA is an efficient facial recognition method for the images of the Yale database: the comparative study mentions that LDA achieved a 74.47% recognition rate with a training set of 68 images, correctly recognizing 123 out of 165 total images. In future work, the face recognition rate could be improved for full frontal faces with facial expressions using PCA and LDA, and by applying hybrid preprocessing techniques before PCA and LDA. Neither feature extraction technique gives a satisfactory recognition rate under illumination problems, so this can be improved. Combining PCA and LDA with other techniques such as DWT, DCT, and LBP can also improve the face recognition rate.
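The following sketch compares PCA and LDA features on the same data with a nearest-neighbour (Euclidean distance) matcher, in the spirit of the comparison above; the synthetic arrays merely stand in for the 165 Yale images.

```python
# Cross-validated recognition rates for PCA vs. LDA feature extraction.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((165, 64 * 64))    # stand-in for 165 Yale face images
y = np.repeat(np.arange(15), 11)  # 15 subjects, 11 images each

for name, reducer in [("PCA", PCA(n_components=50)),
                      ("LDA", LinearDiscriminantAnalysis(n_components=14))]:
    # Reduce dimensionality, then match with 1-NN in Euclidean distance
    pipe = make_pipeline(reducer,
                         KNeighborsClassifier(n_neighbors=1,
                                              metric="euclidean"))
    score = cross_val_score(pipe, X, y, cv=5).mean()
    print(f"{name}: {score:.2%} recognition rate")
```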
Facial Expression Recognition Using Visual Saliency and Deep Learning
Authors: Viraj Mavani, Shanmuganathan Raman, Krishna Prasad Miyapuram.
Description:
This paper demonstrates a CNN for facial expression recognition with generalization abilities, using visual saliency. The authors tested the contribution of potential facial regions of interest in human vision using visual saliency maps of the images in their facial expression datasets. The confusion between different facial expressions was minimal, with high recognition accuracies for four emotions: disgust, happy, sad, and surprise [Table 1, 2]. The general human tendency of anger being confused with sadness was observed [Table 1], as reported in [22]. Fear was confused with neutral, whereas neutral was confused with sadness. When saliency maps were used, a change in the confusion matrix of emotion recognition accuracies was observed: angry, neutral, and sad were now more often confused with disgust, whereas surprised was more often confused with fearful [Table 2]. These results suggest that the generalization accuracy of the deep learning network with visual saliency (65.39%) was much higher than the chance level of 1/7, yet the structure of the confusion matrix differed considerably from that of the deep learning network that considered complete images. The key contributions of the paper are two-fold: (i) it presents generalization of a deep learning network for facial emotion recognition across two datasets; (ii) it introduces the concept of visual saliency maps as input and observes how the behaviour of the deep learning network varies. This opens up an exciting discussion on further integration of human emotion recognition (exemplified using visual saliency in this paper) with deep convolutional neural networks for facial expression recognition.
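As an illustration, the following sketch computes a saliency map as model input using the spectral-residual method from opencv-contrib-python; this is an assumed stand-in, since the summary does not specify the paper's saliency method.

```python
# Compute a saliency map and weight the image by it before classification.
import cv2

img = cv2.imread("face.jpg")                 # hypothetical input image
sal = cv2.saliency.StaticSaliencySpectralResidual_create()
ok, saliency_map = sal.computeSaliency(img)  # float map with values in [0, 1]
if ok:
    # Emphasize salient facial regions before feeding the CNN
    weighted = (img * saliency_map[..., None]).astype("uint8")
```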