## Semantic Genetic Programming for Sentiment Analysis

Published in 24 Thursday

### Genetic Programming

In recent years, the production of textual documents in social media has increased exponentially. This ever-growing amount of available information promotes research and business activities around opinion mining and sentiment analysis. In social media, people share comments about many disparate topics, e.g., events, persons, and organizations. As a result, social media has become a rich source of human opinion, and for this reason text mining has received a lot of attention from companies and governments.

The automatic sentiment analysis of a given text is one of the most important tasks in text mining. It is a classification task that consists of determining whether a particular document expresses a positive, negative, or neutral opinion. Variations also exist that consider intermediate sentiment levels. Determining whether a text document has a positive or negative opinion is becoming an essential tool for both public and private organizations [7]. Such a tool is useful for knowing "what people think"; thus, it can be a significant help in any decision-making process (at any level of government, in marketing, etc.) [6].

On the other hand, Genetic Programming (GP) is an evolutionary algorithm that has received a lot of attention due to its success in solving hard real-world problems [8]. Surprisingly, to the best of our knowledge, GP has not been used to tackle the problem of sentiment analysis. In fact, work applying GP to text analysis is scarce; exceptions include our previous research [3], in which GP was used to optimize the weights of a vector space model for text classification, and the work on automatic text summarization presented in [9, 10].

Sentiment analysis poses a number of challenges for which semantic GP might be a feasible option. Some of these challenges come from its high-dimensional representation and the considerable training set size. To fix ideas, a typical real-world text-mining dataset is represented using tens to hundreds of thousands of coordinates and tens of thousands of examples. These characteristics make it hard for traditional GP to find an optimal solution in reasonable time. Readers interested in how a document collection is processed to obtain a vector representation are referred to the specialized literature [1].
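As a concrete illustration of where those coordinates come from, a minimal bag-of-words sketch is shown below: each document becomes a term-frequency vector over a shared vocabulary. Real pipelines add tokenization rules, stop-word removal, and TF-IDF weighting (see [1]); the function name is ours, for illustration only.

```python
from collections import Counter

def bag_of_words(documents):
    """Map each document to a term-frequency vector over a shared vocabulary."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts.get(term, 0) for term in vocabulary])
    return vocabulary, vectors

vocab, vecs = bag_of_words(["good movie", "bad movie", "good good plot"])
# one coordinate per vocabulary term; a real corpus yields tens of
# thousands of terms, hence the high-dimensional representation
```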

The fast convergence rate of novel semantic GP operators indicates that they may provide a feasible solution for text-mining problems. The semantic genetic operators that seem to have the highest convergence rate are the ones proposed in [5, 2]. Both techniques were inspired by the geometric crossover, and both use a more aggressive approach than the original one. The key idea of these new approaches is to create the best offspring that can be produced by a linear combination of the parents. However, these methods suffer from overfitting. On the other hand, in [4], we applied the idea of a linear combination at the level of the individual, i.e., an individual is composed of a set of expressions that are linearly combined to produce the final output. This latter work presents a better trade-off between learning and generalization.
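The "best offspring by a linear combination of the parents" idea can be sketched as a least-squares fit in the semantic space: given the parents' outputs on the training cases, find coefficients $$a, b$$ so that $$a \cdot p + b \cdot q$$ best matches the target. This is a minimal sketch of the general idea, not the exact operator of [5, 2]; the function names are ours.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def best_linear_offspring(p, q, target):
    """Least-squares coefficients (a, b) so a*p + b*q best fits target.

    p and q are the parents' semantics, i.e., their outputs on the
    training cases; the offspring program computes a*parent1 + b*parent2.
    Solves the 2x2 normal equations directly.
    """
    pp, pq, qq = dot(p, p), dot(p, q), dot(q, q)
    pt, qt = dot(p, target), dot(q, target)
    det = pp * qq - pq * pq
    if abs(det) < 1e-12:  # parents semantically collinear: fall back to p alone
        return (pt / pp if pp else 0.0), 0.0
    a = (pt * qq - qt * pq) / det
    b = (qt * pp - pt * pq) / det
    return a, b

# two parents whose linear combination reproduces the target exactly
a, b = best_linear_offspring([1, 0, 1], [0, 1, 1], [2, 3, 5])
offspring = [a * x + b * y for x, y in zip([1, 0, 1], [0, 1, 1])]
```

Because the offspring is the optimal combination on the training cases, it can only improve the training error, which is exactly why these operators converge fast and why they tend to overfit.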

In this contribution, we propose to combine both strategies, i.e., to orchestrate the crossover and the local optimization of individuals, to tackle the problem of sentiment analysis.

#### References

[1] Ricardo A. Baeza-Yates and Berthier A. Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley, 2nd edition, 2011.

[2] Mauro Castelli, Leonardo Trujillo, Leonardo Vanneschi, Sara Silva, Emigdio Z-Flores, and Pierrick Legrand. Geometric Semantic Genetic Programming with Local Search. In Proceedings of the 2015 Genetic and Evolutionary Computation Conference, GECCO '15, pages 999–1006, New York, NY, USA, 2015. ACM.

[3] Hugo Jair Escalante, Mauricio A. Garcia-Limon, Alicia Morales-Reyes, Mario Graff, Manuel Montes-y-Gomez, Eduardo F. Morales, and Jose Martinez-Carranza. Term-weighting learning via genetic programming for text classification. Knowledge-Based Systems, 2015.

[4] Mario Graff, Eric S. Tellez, Hugo Jair Escalante, and Jose Ortiz-Bejar. Memetic genetic programming based on orthogonal projections in the phenotype space. IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC 2015) (under review).

[5] Mario Graff, Eric Sadit Tellez, Elio Villasenor, and Sabino Miranda-Jiménez. Semantic genetic programming operators based on projections in the phenotype space. Research in Computing Science, 94:73–85, 2015.

[6] Bo Pang and Lillian Lee. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1–135, 2008.

[7] Tao Peng, Wanli Zuo, and Fengling He. SVM based adaptive learning method for text classification from positive and unlabeled documents. Knowledge and Information Systems, 16(3):281–301, 2008.

[8] Riccardo Poli, William B. Langdon, and Nicholas Freitag McPhee. A Field Guide to Genetic Programming. Lulu Enterprises, UK Ltd, March 2008.

[9] Carlos Nascimento Silla Jr., Gisele L. Pappa, Alex Alves Freitas, and Celso A. A. Kaestner. Automatic text summarization with genetic algorithm-based attribute selection. In Christian Lemaître, Carlos A. Reyes, and Jesús A. González, editors, Advances in Artificial Intelligence - IBERAMIA 2004, Proceedings of the 9th Ibero-American Conference on AI, volume 3315 of Lecture Notes in Computer Science, pages 305–314, Puebla, Mexico, November 22-26, 2004. Springer.

[10] Nguyen Quang Uy, Pham Tuan Anh, Truong Cong Doan, and Nguyen Xuan Hoai. A study on the use of genetic programming for automatic text summarization. In Hung Dang-Van and Jeff Sanders, editors, The Fourth International Conference on Knowledge and Systems Engineering, KSE 2012, pages 93–98, Danang, Vietnam, 17-19 August 2012.

## Behavior based approach for Genetic Programming

### Genetic Programming

Evolutionary algorithms (EAs) are a broad family of search and optimization algorithms based on a simplified model of Neo-Darwinian evolution, with impressive results in many domains [1]. EAs are guided by an objective function and specially designed search operators; in this sense, they perform objective-based search (OS). This is a key difference with respect to natural evolution, an open-ended process that lacks a predefined purpose. Open-ended artificial evolution does not use an objective function to drive the search, at least not an explicit one. Only recently has open-ended search been proposed to solve mainstream problems; one promising algorithm is Novelty Search (NS), proposed by Lehman and Stanley [2].

The core idea behind NS is that using an objective function to determine fitness in challenging problems may mislead the search and prevent it from reaching the global optima. Therefore, NS proposes to abandon the objective function and instead determine selective pressure based on the novelty or "uniqueness" of each individual, by considering a description of the behavior each individual exhibits. From the NS perspective, a behavior is a description of the interaction between a candidate solution and its domain-specific context. Although NS has achieved promising results in different areas of evolutionary robotics, such as navigation [2, 4] and morphology design and gait control [3], it has not been tested on machine learning problems.
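The novelty measure at the heart of NS is usually the sparseness of an individual's behavior descriptor: its mean distance to the k nearest behaviors in the current population plus a novelty archive [2, 3]. A minimal sketch follows; the value of k, the Euclidean metric, and the toy behaviors are illustrative choices.

```python
def novelty(behavior, population_behaviors, archive, k=3):
    """Sparseness of a behavior descriptor: mean Euclidean distance to its
    k nearest neighbors among the rest of the population and the archive."""
    others = population_behaviors + archive
    dists = sorted(
        sum((a - b) ** 2 for a, b in zip(behavior, other)) ** 0.5
        for other in others
    )
    nearest = dists[:k]
    return sum(nearest) / len(nearest)

pop = [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
common = novelty([0.0, 0.0], pop, archive=[])  # behaves like everyone else
rare = novelty([5.0, 5.0], pop, archive=[])    # unique behavior, high score
```

Selection then favors individuals with high novelty scores, rewarding behavioral exploration instead of objective-function improvement.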

This work makes the following contributions. We apply NS to a common machine learning problem, supervised classification with GP, whereas previous works have mostly focused on evolutionary robotics. The NS approach is tested on real-world datasets, and two new versions of NS are proposed, Probabilistic NS (PNS) and a variant of Minimal Criteria NS (MCNS), together with a combination of both named MCPNS. The former models the behavior of each solution as a random vector and eliminates all the NS parameters while reducing the computational overhead of the NS algorithm; the latter uses a standard objective function to constrain and bias the search towards high-performance solutions. The paper also discusses the effects of NS on GP search dynamics and code growth.

Table 1 shows results for the Teaching Assistant Evaluation (TAE) dataset; we can note that, in general, all NS algorithms are very competitive relative to OS. To illustrate the performance differences among the methods on the TAE dataset, Figure 1 shows both the average size evolution plot and a box plot comparing the test error of the best solution found in each run.

When we consider the average program size, it is clear that all NS variants evolve much smaller populations. In particular, NS and MCNS show an intrinsic bloat-control property in this domain. In summary, this work helps to establish the behavior-based approach with NS as a viable alternative for GP-based systems.

#### References

[1] John Koza. Human-competitive results produced by genetic programming. Genetic Programming and Evolvable Machines, 11(3):251–284, 2010.

[2] Joel Lehman and Kenneth O. Stanley. Exploiting open-endedness to solve problems through the search for novelty. In Proceedings of the Eleventh International Conference on Artificial Life, ALIFE XI, Cambridge, MA. MIT Press, 2008.

[3] Joel Lehman and Kenneth O. Stanley. Abandoning objectives: Evolution through the search for novelty alone. Evol. Comput., 19(2):189–223, 2011.

[4] Paulo Urbano, Enrique Naredo, and Leonardo Trujillo. Generalization in maze navigation using grammatical evolution and novelty search. In Theory and Practice of Natural Computing, volume 8890 of Lecture Notes in Computer Science, pages 35–46. Springer International Publishing, 2014.

## Specialist predictors of expected performance for Genetic Programming classifier

### Genetic Programming

Determining problem difficulty has been an important issue in evolutionary computation for several years [2]. From an algorithmic perspective, problem difficulty can be related to the total runtime (or memory) required to find an optimal solution: difficult problems require exponential runtime, while easy problems can be solved in linear time. Limiting our overview to previous GP research, this work focuses on two groups of methods used to measure problem difficulty in GP: Evolvability Indicators (EIs) and Predictors of Expected Performance (PEPs). Broadly speaking, EIs attempt to provide a measure related to the underlying difficulty of the search process, whereas a PEP provides a prediction of the expected performance of a GP search, which may or may not coincide with the difficulty of the search. Currently, the development of PEPs represents a minority of the research devoted to problem difficulty in GP, with only a few recent proposals. In particular, Graff and Poli [1] have studied the development of such predictive models for symbolic regression, boolean, and time-series problems.

This work is an extension of our previous work [3], where PEPs were first proposed for a GP algorithm applied to supervised classification. The general process relies on posing a supervised learning task, where each problem instance is described by a domain-specific feature vector and the performance of the GP system on a set of synthetic problems is used as the ground truth to learn a PEP. The learned model is then used to predict the performance on unseen problem instances. Of particular interest is the fact that learning can be done on an arbitrarily generated set of synthetic problems and testing can then be performed on real-world datasets. Our goal is to improve performance prediction on real-world problems by using an ensemble of several PEP models, each one referred to as a Specialist PEP (SPEP).
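The PEP learning step above amounts to fitting a regression model on (feature vector, measured GP error) pairs collected from synthetic problems. The sketch below uses an ordinary least-squares linear model, which is an illustrative assumption rather than the exact model of [1, 3]; the function names and toy numbers are ours.

```python
def fit_pep(features, performance):
    """Fit a linear PEP by least squares: error ~ w0 + w1*f1 + ...

    `features` are domain-specific descriptors of synthetic training
    problems; `performance` is the measured GP classification error on
    each.  Solves the normal equations by Gaussian elimination, which is
    fine for the handful of features typically used.
    """
    X = [[1.0] + list(f) for f in features]   # prepend a bias column
    n = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * y for row, y in zip(X, performance)) for i in range(n)]
    for col in range(n):                      # elimination w/ partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            m = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= m * A[col][c]
            b[r] -= m * b[col]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))) / A[i][i]
    return w

def predict(w, f):
    """Predicted classification error for a new problem's feature vector."""
    return w[0] + sum(wi * fi for wi, fi in zip(w[1:], f))

# toy synthetic problems whose error grows linearly with one feature
w = fit_pep([[0.0], [0.2], [0.4], [0.8]], [0.10, 0.20, 0.30, 0.50])
```

After fitting on synthetic problems, `predict` is applied to the feature vectors of unseen (real-world) problems.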

The proposal is depicted in Figure 1, an extension of the basic PEP approach.

Given a classification problem, we do the following. First, apply a preprocessing step to simplify the feature extraction process and deal with multidimensional representations. Second, perform feature extraction to obtain an abstraction of the problem. Third, classify each problem into a specific group using its corresponding feature vector $$\beta$$. Fourth, use an ensemble of SPEP models that takes the extracted features as input and produces as output the predicted classification error (PCE) for the given problem; each group is associated with a particular SPEP in the ensemble, so if a problem is classified into the $$i$$-th group, the $$i$$-th SPEP in the ensemble is used to compute the performance prediction.
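The dispatch logic of the fourth step can be sketched as follows; the group classifier and the individual SPEPs shown here are hypothetical stand-ins for the fitted models.

```python
def spep_predict(beta, classify_group, speps):
    """Route a problem's feature vector to its specialist predictor.

    `classify_group` maps the feature vector to a group index and `speps`
    holds one fitted SPEP per group; both are illustrative placeholders.
    """
    i = classify_group(beta)
    return speps[i](beta)       # PCE from the i-th specialist

# toy ensemble: group 0 for low-dimensional problems, group 1 otherwise
speps = [lambda b: 0.10 + 0.5 * b[0],   # specialist for group 0
         lambda b: 0.30 + 0.2 * b[0]]   # specialist for group 1
group_of = lambda b: 0 if b[1] < 10 else 1   # b[1]: number of features
pce = spep_predict([0.2, 4], group_of, speps)
```

Each SPEP only has to model problems from its own group, which is the motivation for preferring the ensemble over a single global PEP.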

Focusing on the real-world problems, Figure 2 summarizes the performance of the Ensemble-2 predictors using 5 features. In particular, the left column shows a plot of the ground-truth CE$$\mu$$ of each problem (triangles) and the Ensemble-2 prediction PCE. These plots show three types of PCE: (1) correctly classified problems for which the appropriate SPEP was selected (CC-PCE); (2) misclassified problems for which an incorrect SPEP was selected (MC-PCE); and (3) for the misclassified problems, the oracle SPEP prediction (O-PCE), which is the PCE produced by the correct SPEP. Moreover, the second column of Figure 2 presents a scatter plot of the true CE$$\mu$$ and the PCE, using the same notation for the different predictions. These plots provide a graphical confirmation of the quality of the performance prediction achieved. It is important to highlight the impact of a misclassified problem (shown as a black circle) compared to the prediction on the same problem if the correct SPEP had been chosen (O-PCE). For all problems for which the correct SPEP was chosen, the PCE is highly correlated with the ground truth, with only marginal differences in most cases. Finally, we built the SPEP models with a synthetic benchmark and tested them on real-world classification problems; the good results suggest an interesting way to approach the problem of performance prediction.

#### References

[1] Mario Graff and Riccardo Poli. Practical model of genetic programming's performance on rational symbolic regression problems. In EuroGP, pages 122–133, 2008.

[2] Kent McClymont, David Walker, and Max Dupenois. The lay of the land: a brief survey of problem understanding. In Proceedings of the Fourteenth International Conference on Genetic and Evolutionary Computation Conference Companion, GECCO Companion '12, pages 425–432, New York, NY, USA, 2012. ACM.

[3] Leonardo Trujillo, Yuliana Martínez, Edgar Galván-López, and Pierrick Legrand. Predicting problem difficulty for genetic programming applied to data classification. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, GECCO '11, pages 1355–1362, New York, NY, USA, 2011. ACM.

## Automatic random tree generator on FPGA

### Genetic Programming

This work deals with the automatic generation of random syntax trees on FPGAs. These trees are used as the initial population in genetic programming (GP) algorithms and serve as the means by which genetic material is inherited through the generations. GP is an evolutionary computation technique that makes it possible to find the solution to a search problem without requiring the user to provide prior information about the structure or form of the solution. Trees are the data structure most commonly used in the GP literature for the representation of programs; these programs are normally mathematical functions, although they may also be computer code or, in robotic applications, movement instructions.
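For reference, the classic software counterpart of what the FPGA implements is the "grow" initialization method: below a depth limit, each node is randomly a function or a terminal; at the limit, a terminal is forced. This is a minimal sketch; the function set, terminal set, and terminal probability are illustrative choices, not those of the hardware design.

```python
import random

FUNCTIONS = {'+': 2, '-': 2, '*': 2}   # function set with arities
TERMINALS = ['x', 'y', '1.0']          # variables and a constant

def grow(max_depth, rng=random):
    """Generate one random syntax tree with the 'grow' method."""
    if max_depth == 0 or rng.random() < 0.3:   # 0.3: illustrative terminal prob.
        return rng.choice(TERMINALS)
    op = rng.choice(list(FUNCTIONS))
    children = [grow(max_depth - 1, rng) for _ in range(FUNCTIONS[op])]
    return (op, *children)                     # prefix form: (op, child, ...)

random.seed(1)
population = [grow(max_depth=3) for _ in range(10)]  # initial GP population
```

On the FPGA, the sequential recursion above is replaced by parallel hardware processes, one per tree, with a pseudo-random number generator supplying the random choices.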

Initially, GP was run as software on a desktop computer or workstation; however, as the paradigm matured, GP was applied to the resolution of more complex problems, leading to longer runtimes and larger memory consumption. One way to reduce runtime is to use a multicore microprocessor, which allows the program to run in concurrent threads that accelerate the calculation; however, the number of cores found in commercial microprocessors is limited to four or so. Alternatively, one can take advantage of the massive parallel processing power that a Graphics Processing Unit (GPU) exhibits. The GPU may seem the natural choice when high parallel processing power is needed, but the strength of GPUs is also their weakness: high parallelization works well when the very same operation must be performed on a vast amount of data, but it is not efficient when different operations on different data are required, which is the normal case when GP is used in real-world applications. Here another well-known device comes to the rescue: the Field Programmable Gate Array (FPGA). The FPGA is a very flexible device that allows the implementation of complex digital systems on a small silicon chip or integrated circuit. FPGAs are less expensive than GPUs or CPUs, not to mention other high-performance computing technologies. Another good characteristic of FPGAs is that parallel computing is possible because they can handle multiple parallel processes.

Finally, one should note that the FPGA is reconfigurable, which provides additional flexibility. Some works have addressed GP hardware implementation on GPUs, as in [3] and [5]; others have produced implementations on FPGAs, as in [1], [2], and [4]. Nevertheless, none of the previous works implemented a true tree data structure as the means of evolving programs.

In this work, the automatic generation of random trees is totally embedded on an FPGA and, in contrast to previous works, allows flexibility in tree instantiation. This means that, besides the tree-like structure implemented on the FPGA, a pseudo-random number generator is used, enabling random selection of functions and of variables/constants. As several processes may be parallelized on the FPGA, each tree is generated in parallel, so a significant reduction in runtime is expected compared to traditional runs on CPUs; more significant still is the fact that true tree-like data structures are embedded on the FPGA.

#### References

[1] Radek Hrbacek and Michaela Sikulova. Coevolutionary Cartesian Genetic Programming in FPGA. In 12th European Conference on Artificial Life Proceedings, pages 431–438, 2013.

[2] R. P. S. Sidhu, A. Mei, and V. K. Prasanna. Genetic Programming using Self-Reconfigurable FPGAs. In 9th International Workshop on Field Programmable Logic and Applications, pages 301–312, Glasgow, UK, 1998. Springer-Verlag.

[3] Douglas A. Augusto and Helio J. C. Barbosa. Accelerated parallel genetic programming tree evaluation with OpenCL. Journal of Parallel and Distributed Computing, 73(1):86–100, January 2013.

[4] M. I. Heywood and A. N. Zincir-Heywood. Register Based Genetic Programming on FPGA Computing Platforms. In Genetic Programming, Proceedings of EuroGP'2000, pages 44–59, 2000.

[5] S. Harding. Evolution of image filters on graphics processor units using Cartesian Genetic Programming. In IEEE World Congress on Computational Intelligence, pages 1921–1928, 2008.