Introduction
Current Human-Machine Interaction (HMI) frameworks have yet to achieve the full emotional and social capabilities necessary for rich and effective interaction with people, such as classifying the faces in a single image, or a sequence of images, as one of the six basic emotions. While conventional machine learning approaches such as support vector machines and Bayesian classifiers have been effective at classifying posed facial expressions in a controlled environment, recent studies have shown that these solutions do not have the flexibility to classify images captured in an unconstrained, uncontrolled manner ("in the wild") or when applied to databases for which they were not designed.
The poor generalizability of these methods is largely because many of them are subject- or database-dependent, and are only fit for recognizing exaggerated or posed expressions similar to those in the training database. In addition, obtaining accurate training data is particularly difficult, especially for emotions such as anger or sadness, which are very hard to replicate accurately.
Recently, due to the ready availability of computational power and increasingly large training databases, the machine learning technique of neural networks has seen a resurgence in popularity. Recent state-of-the-art results have been obtained using neural networks in the fields of visual object recognition, human pose estimation, face verification and many more. Even in the facial expression recognition (FER) field, results so far have been promising. Unlike traditional machine learning approaches, where features are defined by hand, we often see improvement in visual processing tasks when using neural networks because of the network's ability to extract undefined features from the training database.
It is often the case that neural networks trained on large amounts of data are able to extract features that generalize well to scenarios the network has not been trained on. We explore this idea closely by training our proposed network architecture on a subset of the available training databases, and then performing cross-database experiments which allow us to accurately judge the network's performance in novel scenarios.
In the FER problem, however, unlike visual object databases such as ImageNet, existing FER databases often have limited numbers of subjects, few sample images or videos per expression, or small variation between sets, making neural networks significantly more difficult to train. For example, the FER2013 database (one of the largest recently released FER databases) contains 35,887 images of different subjects, yet only 547 of the images portray disgust. Similarly, the CMU MultiPIE face database contains around 750,000 images but is comprised of only 337 different subjects, where 348,000 images portray only a "neutral" emotion and the remaining images portray anger, fear or sadness.
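One common way to compensate for imbalance like FER2013's is to weight each class inversely to its frequency in the training loss. The sketch below uses the disgust count quoted above; collapsing all non-disgust classes into a single "other" bucket is a simplification for illustration.

```python
# "Balanced" class weights: n_samples / (n_classes * class_count).
# Counts follow the FER2013 figures quoted above; the two-class split
# ("disgust" vs "other") is an illustrative simplification.
counts = {"disgust": 547, "other": 35887 - 547}
n_samples = sum(counts.values())
n_classes = len(counts)

weights = {c: n_samples / (n_classes * n) for c, n in counts.items()}
# The rare "disgust" class receives a far larger weight, so a weighted
# loss penalizes its misclassification proportionally more.
```

These weights can then be passed to most training frameworks' loss functions so that rare expressions contribute as much gradient signal as common ones.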
Problem Statement
Human facial expressions can be readily classified into 7 basic emotions: happy, sad, surprise, fear, anger, disgust and neutral. Facial emotions are expressed through the activation of specific sets of facial muscles. These sometimes subtle, yet complex, signals in an expression often contain an abundant amount of information about our state of mind. Through facial emotion recognition, we are able to measure the effects that content and services have on users through an easy and low-cost procedure. For example, retailers may use these metrics to evaluate customer interest. Healthcare providers can provide better service by using additional information about patients' emotional state during treatment. Humans are well trained in reading the emotions of others; in fact, at just 14 months old, babies can already tell the difference between happy and sad. We designed a deep learning neural network that gives machines the ability to make inferences about our emotional states.
Facial expression recognition is a process performed by computers, which consists of:

1. Detecting the face when the user comes into the webcam's frame.

2. Extracting facial features from the detected face region: detecting the shape of facial components or describing the texture of the skin in a facial area. This step is called facial feature extraction.

3. Categorizing the emotional state of the user, after feature extraction, using the datasets provided during training of the model.
Dept. of CSE, DSCE, Bangalore
Facial Expression Recognition using Neural Networks

Literature Survey
Human Facial Expression Recognition from Static Image using Shape and Appearance Feature
Authors: Naveen Kumar, H N Jagadeesha, S Amith Kjain

Description:
This paper proposes facial expression recognition using Histogram of Oriented Gradients (HOG) features and a Support Vector Machine (SVM). The proposed work shows how HOG features can be exploited for facial expression recognition, and their use makes the performance of the FER system subject-independent. The accuracy of the work is found to be 92.56% when implemented on the Cohn-Kanade dataset for the six basic expressions. Results indicate that shape features on the face carry more information for emotion modelling than texture and geometric features. Shape features are better than geometric features because a small pose variation degrades the performance of a FER system that relies on geometric features, whereas it does not reflect any change in a FER system that relies on HOG features. Detection rates for disgust, fear and sadness are lower in the proposed work; they can be further improved by combining shape, texture and geometric features. Optimized cell sizes may be considered for real-time implementation, so as to address both detection rate and processing speed. The influence of non-frontal faces on the performance of the FER system could be addressed in future work.
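A minimal sketch of this HOG + SVM pipeline can be written with scikit-image and scikit-learn. The cell size and classifier settings below are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def hog_features(gray_face, cell=(8, 8)):
    """Describe a grayscale face crop by histograms of oriented gradients."""
    return hog(gray_face, orientations=9, pixels_per_cell=cell,
               cells_per_block=(2, 2), block_norm="L2-Hys")

def train_fer_svm(faces, labels):
    """Fit a linear SVM on HOG descriptors of the training faces."""
    feats = np.stack([hog_features(f) for f in faces])
    return LinearSVC(C=1.0, max_iter=10000).fit(feats, labels)
```

Shrinking `cell` gives finer-grained shape information at the cost of longer feature vectors, which is the detection-rate versus processing-speed trade-off the paper mentions.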
Face Detection and Recognition using Viola-Jones Algorithm and Fusion of PCA and ANN
Authors: Narayan T. Deshpande, Dr. S. Ravishankar

Description:
This paper covers face recognition using Principal Component Analysis (PCA), Artificial Neural Networks (ANN) and the Viola-Jones algorithm. The paper presents an efficient approach for face detection and recognition using Viola-Jones and a fusion of PCA and ANN techniques. The performance of the proposed method is compared with other existing face recognition methods, and it is observed that better recognition accuracy is achieved with the proposed method. Face detection and recognition play a vital role in a wide range of applications. In most applications a high rate of accuracy in identifying a person is desired, hence the proposed method can be considered in comparison with the existing methods.
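The recognition stage of such a system (PCA fused with an ANN, applied to face crops a Viola-Jones detector has already located) might be sketched as below; the component count and hidden-layer size are illustrative assumptions.

```python
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

def build_pca_ann(n_components=50):
    """Project flattened face crops onto eigenfaces (PCA), then classify
    the low-dimensional codes with a small neural network (ANN)."""
    return make_pipeline(
        PCA(n_components=n_components, whiten=True),
        MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000))

# faces: (n_samples, n_pixels) flattened crops from the detector,
# identities: subject labels
# model = build_pca_ann().fit(faces, identities)
```

PCA here reduces each face to a compact eigenface representation, so the ANN trains on far fewer inputs than raw pixels would require.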
Facial Expression Recognition
Authors: Neeta Sarode, Prof. Shalini Bhatia

Description:
This paper deals with grayscale images, faces, facial expression recognition, lip region extraction and human-computer interaction. Experiments are performed on grayscale image databases: images from the Yale facial image database and the JAFFE database (Figure 7) are used. The JAFFE database consists of grayscale images of Japanese female facial expressions, covering 7 expressions (including neutral) of 10 people. Each person has 3-4 images of the same expression, so the total number of images in the database comes to 213. An efficient, local image-based approach for extraction of intransient facial features and recognition of four facial expressions was presented. In the face, the eyebrow and mouth corners are used as the main 'anchor' points. The method does not require any manual intervention (such as an initial manual assignment of feature points). The system, based on a local approach, is also able to detect partial occlusions.
Comparison of PCA and LDA Techniques for Face Recognition Feature Based Extraction With Accuracy Enhancement
Authors: Riddhi A. Vyas, Dr. S. M. Shah

Description:
This paper covers face recognition, PCA, LDA, eigenvalues, covariance, Euclidean distance, eigenfaces and scatter matrices. Feature extraction is quite a tricky phase in a recognition process. To get a better face recognition rate, the correct choice of feature extraction algorithm from the many available is extremely significant and plays a major role in the face recognition process. Before selecting a feature extraction technique, one must know which technique performs accurately under which criteria; this comparative analysis provides exactly that. From the individual conclusions it is clear that LDA is an efficient method for facial recognition on images from the Yale database: the comparative study mentions that LDA achieved a 74.47% recognition rate with a training set of 68 images, and out of 165 total images, 123 were recognized with higher accuracy. In future work, the face recognition rate could be improved for full frontal faces with facial expression using PCA and LDA, and through hybrid preprocessing techniques for PCA and LDA. Neither feature extraction technique gives a satisfactory recognition rate under illumination variation, so this can be improved. Combining PCA and LDA with other techniques such as DWT, DCT and LBP can improve the face recognition rate.
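The kind of PCA-versus-LDA comparison the paper performs could be sketched as below. The nearest-neighbour classifier is an assumption for illustration, and the Yale-database figures above are not reproduced here.

```python
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

def recognition_rate(reducer, X_train, y_train, X_test, y_test):
    """Reduce dimensionality with PCA or LDA, classify by nearest
    neighbour, and report accuracy on the held-out face images."""
    model = make_pipeline(reducer, KNeighborsClassifier(n_neighbors=1))
    return model.fit(X_train, y_train).score(X_test, y_test)

# Comparing the two techniques on the same split:
# rate_pca = recognition_rate(PCA(n_components=20), Xtr, ytr, Xte, yte)
# rate_lda = recognition_rate(LinearDiscriminantAnalysis(), Xtr, ytr, Xte, yte)
```

Because LDA uses the class labels to maximize between-class scatter, it often beats unsupervised PCA on identity recognition, which matches the paper's conclusion.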
Facial Expression Recognition Using Visual Saliency and Deep Learning
Authors: Viraj Mavani, Shanmuganathan Raman, Krishna Prasad Miyapuram

Description:
This paper proposes facial expression recognition using visual saliency and deep learning. The authors demonstrate a CNN for facial expression recognition with generalization abilities, and test the contribution of potential facial regions of interest in human vision using the visual saliency of images in their facial expression datasets. The confusion between different facial expressions was minimal, with high recognition accuracies for four emotions: disgust, happy, sad and surprise [Tables 1, 2]. The general human tendency of angry being confused as sad was observed [Table 1], as given in [22]. Fearful was confused with neutral, whereas neutral was confused with sad. When saliency maps were used, a change in the confusion matrix of emotion recognition accuracies was observed: angry, neutral and sad were now more often confused with disgust, whereas surprised was more often confused as fearful [Table 2]. These results suggested that the generalization of the deep learning network with visual saliency, at 65.39%, was much higher than the chance level of 1/7. Yet the structure of the confusion matrix was quite different from that of the deep learning network that considered complete images. The key contributions of the paper are two-fold: (i) generalization of a deep learning network for facial emotion recognition across two datasets is presented; (ii) the concept of visual saliency of images as input is introduced, and the behavior of the deep learning network is observed to vary. This opens up an exciting discussion on further integration of human emotion recognition (exemplified using visual saliency in this paper) with that of deep convolutional neural networks for facial expression recognition.
Architecture
Block Diagram
Applying the Inception layer to deep neural network applications has had remarkable results, and it seems only logical to extend state-of-the-art techniques used in object recognition to the FER problem. In addition to merely providing theoretical gains from the sparsity, and thus relative depth, of the network, the Inception layer also allows for improved recognition of local features, as smaller convolutions are applied locally while larger convolutions approximate global features. The increased local performance seems to align logically with the way that humans process emotions as well: by looking at local features such as the eyes and mouth, humans can distinguish the majority of emotions. Similarly, children with autism often cannot distinguish emotion properly without being told to remember to look at those same local features. By using the Inception layer structure and applying the network-in-network approach, we can expect significant gains in local feature performance, which seems to logically translate to improved FER results.
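A minimal Inception-style block for 48x48 grayscale FER inputs can be sketched in PyTorch as follows; the filter counts are illustrative assumptions, not the report's final architecture.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Run 1x1, 3x3 and 5x5 convolutions plus pooling in parallel and
    concatenate the results: small kernels capture local features such as
    the eyes and mouth, larger kernels approximate more global structure."""

    def __init__(self, in_ch, f1=16, f3=24, f5=8, fp=8):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, f1, kernel_size=1)
        self.b3 = nn.Conv2d(in_ch, f3, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(in_ch, f5, kernel_size=5, padding=2)
        self.bp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(in_ch, fp, kernel_size=1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x),
                          self.b5(x), self.bp(x)], dim=1)

block = InceptionBlock(in_ch=1)
out = block(torch.zeros(1, 1, 48, 48))  # output channels: 16 + 24 + 8 + 8 = 56
```

Because every branch preserves the 48x48 spatial size, blocks can be stacked to deepen the network while each scale of convolution keeps contributing its own features.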