Evolutionary computation for solving search-based data analytics problems

Original article was published on Latest Results for Artificial Intelligence Review

Abstract

Automatic extraction of knowledge from massive data samples, i.e., big data analytics (BDA), has emerged as a vital task in almost all scientific research fields. BDA problems are difficult to solve because of their large-scale, high-dimensional, and dynamic properties, while problems with small data are usually hard to handle because of insufficient data samples and incomplete information. Such difficulties lead to the search-based data analytics problem, in which a data analysis task is modeled as a complex, dynamic, and computationally expensive optimization problem and then solved with an iterative algorithm. In this paper, we present an extensive and in-depth discussion of the use of evolutionary computation (EC) based optimization methods [including evolutionary algorithms (EAs) and swarm intelligence (SI)] for solving search-based data analysis problems. Then, as an illustrative example, we provide a comprehensive review of the applications of state-of-the-art EC methods to different types of data mining problems in bioinformatics. The detailed analysis and discussion cover three types of data samples: sequence data, network data, and image data. Finally, we survey the challenges faced by EC methods and the trends for future directions. By applying EC methods to search-based data analysis problems involving inexact and uncertain information, the insights of data analytics can be better understood, and more efficient algorithms can be designed to solve real-world complex BDA problems.
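To make the idea of search-based data analytics concrete, the following minimal sketch treats a small clustering task as an optimization problem and solves it with a simple elitist evolutionary algorithm. It is an illustration only and is not taken from the article: the toy data, the function names (sse, evolve_centroids, make_toy_data), and all parameter values are assumptions chosen for brevity.

```python
import random

def sse(centroids, points):
    """Objective: sum of squared distances from each point to its nearest centroid."""
    return sum(min((x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids)
               for x, y in points)

def evolve_centroids(points, k=2, pop_size=20, generations=100, sigma=0.3, seed=0):
    """Hypothetical elitist EA: each individual is a list of k candidate centroids."""
    rng = random.Random(seed)
    # Initial population: every individual starts from k randomly chosen data points.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        # Variation: Gaussian mutation of every centroid of every parent.
        offspring = [[(cx + rng.gauss(0, sigma), cy + rng.gauss(0, sigma))
                      for cx, cy in parent] for parent in population]
        # Selection: keep the pop_size best among parents and offspring.
        population = sorted(population + offspring,
                            key=lambda c: sse(c, points))[:pop_size]
    best = min(population, key=lambda c: sse(c, points))
    return best, sse(best, points)

def make_toy_data(seed=1):
    """Assumed toy data: two Gaussian blobs around (0, 0) and (5, 5)."""
    rng = random.Random(seed)
    a = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
    b = [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)]
    return a + b

points = make_toy_data()
best, best_sse = evolve_centroids(points)
print("best centroids:", best)
print("objective value:", best_sse)
```

Because selection keeps the best individuals from parents and offspring together, the objective value of the best individual can never get worse from one generation to the next. Real search-based data analytics problems of the kind the abstract describes are far larger and often dynamic, so this sketch only shows the general pattern of encoding a data analysis task as a fitness function and iterating variation and selection.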