Phenotype prediction is one of the central issues in genetics and medical sciences research. With the advent of high-throughput screening technologies, microarray-based cancer classification has become a standard procedure for identifying cancer-related gene signatures. Because transcriptome-wide gene expression profiles are high-dimensional, discovering a biologically functional signature across different cell lines is a challenging task. In this article, we present an innovative framework that uses information theory to find a small subset of discriminative genes for classifying a specific disease phenotype. The framework is data-driven and considers feature relevance, redundancy, and interdependence in the context of feature pairs. Its effectiveness has been validated on a brain cancer benchmark whose gene expression profiling matrix is derived from the Affymetrix Human Genome U95Av2 GeneChip®. Three multivariate filters based on information theory have also been used for comparison. To show the strengths of the framework, three performance measures, two sets of enrichment analyses, and a stability index have been used in our experiments. The results show that the framework is robust and able to discover a gene signature that achieves a high level of classification performance and shows more statistically significant enrichment.
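The abstract does not give the framework's scoring functions, so the following is only a minimal sketch of how relevance, redundancy, and interdependence of a single gene pair could be quantified with mutual information. It assumes expression values have already been discretized; the function names (`entropy`, `mutual_information`, `pair_scores`) and the toy data are hypothetical illustrations, not the authors' method.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete symbols."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equal-length discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def pair_scores(f1, f2, labels):
    """Illustrative scores for one feature pair (assumed definitions).

    relevance       : I(f1;Y) + I(f2;Y), what each gene tells about the label
    redundancy      : I(f1;f2), information the two genes share
    interdependence : I((f1,f2);Y) - relevance, positive when the pair is
                      more informative jointly than the genes are separately
    """
    relevance = mutual_information(f1, labels) + mutual_information(f2, labels)
    redundancy = mutual_information(f1, f2)
    joint = mutual_information(list(zip(f1, f2)), labels)
    interdependence = joint - relevance
    return relevance, redundancy, interdependence


# Toy data (hypothetical): two discretized genes (0 = low, 1 = high)
# and a phenotype label that is the XOR of the two genes.
gene_a = [0, 0, 1, 1]
gene_b = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]

scores = pair_scores(gene_a, gene_b, phenotype)
print(scores)
```

In this toy case neither gene is informative on its own and the two genes share no information, yet together they determine the label, so only the interdependence term is non-zero (1 bit). This is the kind of pairwise effect that a purely univariate relevance ranking would miss.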
|Title of host publication|Lecture Notes in Artificial Intelligence|
|Subtitle of host publication|ICAART 2014 Revised Selected Papers|
|Editors|Béatrice Duval, Jaap van den Herik, Stephane Loiseau, Joaquim Filipe|
|Number of pages|17|
|Publication status|Published - Oct 2015|