Speeding up evolutionary approaches to face recognition by means of Hadoop

4:30 pm - 4:55 pm 24 Thursday


Evolutionary Computation Software

1 Introduction

Although face recognition is an easy task for humans, it becomes very complex when performed by a computer. Different approaches have been proposed [1][2], but current algorithms still incur substantial computational cost. Typically, an algorithm explores a large database containing hundreds of photographs, which are used both for training and for testing. For each image, researchers supply the algorithm with a fixed number of interest-point locations on the human face; this set of points must be extracted and used by the machine learning algorithm first during training and then during testing. We discuss here the possibility of letting the algorithm learn and decide which points should be used, which would require longer computing time but would probably improve recognition rates.

We propose a new approach in which the algorithm learns the best subset of points for the problem at hand. The subset is obtained by means of an Evolutionary Algorithm (EA). This learning stage requires both a training step and a test step, which are very costly in this particular problem. We employ Hadoop [3] as the basis for running the proposed algorithms.

2 Methodology

Face Recognition Algorithms (FRAs) require large image databases that must be processed intensively for the algorithm to learn. Each image includes meta-information about specific face locations, known as Points of Interest (POIs).

We introduce an EA whose aim is to select the subset of POIs that should be used. We use a simple EA: each individual is represented by a string of zeros and ones in which each bit refers to a specific location; a zero means that the POI is not used and a one means that it is. Although the overall goal is to obtain results similar to or better than those of previous approaches with a reduced number of POIs, this paper focuses on reducing the computing time required by the whole proposed algorithm. We therefore concentrate on the phase that requires the longest computing time: the fitness evaluation, which involves intensive image processing operations.

Two jobs are run on Hadoop for each individual, corresponding to the two phases into which the FRA is divided: training and querying. Once the evaluation phase starts, a first job, belonging to the FRA training process, is run for each individual. Hadoop jobs are executed in two phases known as Map and Reduce, each comprising several tasks that are distributed across the cluster and executed in parallel. The Map phase applies the same set of operations to every image and outputs one descriptor vector per image; this output is the input of the Reduce phase. The Reduce phase generates a final matrix that contains the knowledge needed to recognize faces. This matrix is used in the next job, the query job, in which the algorithm tries to recognize the faces. The results are checked against the actual classes, and the percentage of hits is used as the fitness. Once all jobs have finished, the evaluation phase of the EA concludes.
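The evaluation loop described above can be illustrated with a minimal single-process sketch. It is not the Hadoop implementation: the paper does not specify the descriptor, the recognition model, or the EA operators, so the sketch assumes a per-class mean descriptor with nearest-centroid matching as a stand-in for the knowledge matrix, and truncation selection with one-point crossover and bit-flip mutation as a stand-in for the EA. All function names and the image record layout (`label`, `pois`) are hypothetical.

```python
import random

def map_image(image, mask):
    """Map step (sketch): one descriptor per image, built only from the POIs whose bit is 1."""
    descriptor = [v for v, bit in zip(image["pois"], mask) if bit == 1]
    return image["label"], descriptor

def reduce_descriptors(pairs):
    """Reduce step (sketch): merge descriptors into a model, here the mean descriptor per class.
    This stands in for the 'final matrix' of the paper, whose exact form is not given."""
    grouped = {}
    for label, descriptor in pairs:
        grouped.setdefault(label, []).append(descriptor)
    return {label: [sum(col) / len(col) for col in zip(*descs)]
            for label, descs in grouped.items()}

def query(model, test_images, mask):
    """Query job (sketch): nearest-centroid recognition; returns the percentage of hits."""
    hits = 0
    for image in test_images:
        _, descriptor = map_image(image, mask)
        predicted = min(model, key=lambda label: sum(
            (a - b) ** 2 for a, b in zip(model[label], descriptor)))
        if predicted == image["label"]:
            hits += 1
    return 100.0 * hits / len(test_images)

def fitness(mask, train_images, test_images):
    """Fitness of one individual: training job followed by query job."""
    if not any(mask):  # no POI selected, so nothing can be recognized
        return 0.0
    model = reduce_descriptors(map_image(img, mask) for img in train_images)
    return query(model, test_images, mask)

def evolve(n_pois, train_images, test_images, pop_size=8, generations=5, seed=0):
    """Simple EA over bit strings (assumed operators; requires pop_size >= 4 and n_pois >= 2)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_pois)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda m: fitness(m, train_images, test_images),
                        reverse=True)
        parents = ranked[:pop_size // 2]          # truncation selection, parents survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_pois)        # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n_pois)] ^= 1     # flip one random bit
            children.append(child)
        population = parents + children
    return max(population, key=lambda m: fitness(m, train_images, test_images))
```

In the distributed setting, the calls to `map_image` and `reduce_descriptors` inside `fitness` are the part that would be replaced by Map and Reduce tasks; the EA itself only needs the resulting hit percentage for each individual.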

3 Results

We tested the implementation using different numbers of computing nodes; the time required in each case is shown in Figure 1. We observe a significant reduction in time and good scalability. These promising results will allow us to execute the EA with larger populations and for a higher number of generations, which should help to optimize the FRA using fewer POIs, likely with similar or better results.

4 Conclusions

This paper proposes a new EA-based approach to face recognition that would, in the future, allow the best POIs to be selected automatically. The long computing time required has led us to develop a Hadoop-based implementation, which distributes fitness evaluations by means of Map/Reduce jobs. The solution presented allows us to run a face recognition algorithm on an arbitrary number of commodity machines, notably reducing computing time.

5 Acknowledgments

Spanish Ministry of Economy, Project UEX:EPHEMEC (TIN2014-56494-C4-2-P); Gobex, FEDER GRU10029.


[1] Cesar Benavides, Juan Villegas-Cortes, Graciela Román-Alonso and Carlos Avilés-Cruz, Reconocimiento de rostros a partir de la propia imagen usando técnica CBIR, X Congreso Español sobre Metaheurísticas, Algoritmos Evolutivos y Bioinspirados (MAEB), 2015.

[2] A. K. Jain, Automatic face recognition: State of the art, Distinguished Lecture Series, pp. 0-44, September 2010.

[3] Hadoop, distributed scalable fault-tolerant framework for data processing. http://hadoop.apache.org/