Microarray gene expression data play a prominent role in feature selection that supports the diagnosis and treatment of a wide variety of diseases. Such data are high-dimensional, contain many redundant genes, and offer only a small number of training and testing samples. This paper proposes a fuzzy rough quick reduct algorithm with a customized similarity measure for attribute selection. In the first stage, entropy-based information gain is used to reduce the dimensionality. In the second stage, the proposed fuzzy rough quick reduct method, which defines a customized similarity measure, selects a minimal set of informative genes and removes redundant ones. The proposed method is evaluated on leukemia, lung and ovarian cancer gene expression datasets using a random forest classifier, achieving classification accuracies of 97.22%, 99.45% and 99.6%, respectively. The study is carried out using the R open-source software environment. Compared with methods available in the literature, the proposed method shows substantial improvement in classification accuracy, precision, recall, F-measure and receiver operating characteristic (ROC).
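The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the study itself was carried out in R, the customized similarity measure is not given in the abstract, and the sketch substitutes a commonly used range-normalized similarity with a Lukasiewicz-style lower approximation. The median split used for information gain, the toy expression values, and all function names are assumptions made for illustration only. The random forest classification step is omitted.

```python
import math
import statistics
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Information gain of one gene after a binary split at `threshold`."""
    low = [y for v, y in zip(values, labels) if v <= threshold]
    high = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = sum(len(p) / n * entropy(p) for p in (low, high) if p)
    return entropy(labels) - remainder


def similarity(a, b, value_range):
    """Stand-in fuzzy similarity (assumption, not the paper's measure)."""
    if value_range == 0:
        return 1.0
    return max(0.0, 1.0 - abs(a - b) / value_range)


def dependency(X, y, subset):
    """Fuzzy-rough dependency degree of the class labels on a gene subset."""
    n = len(X)
    ranges = {j: max(r[j] for r in X) - min(r[j] for r in X) for j in subset}
    total = 0.0
    for i in range(n):
        pos = 1.0  # membership of sample i in the fuzzy positive region
        for k in range(n):
            if y[i] != y[k]:
                sim = min(similarity(X[i][j], X[k][j], ranges[j]) for j in subset)
                pos = min(pos, 1.0 - sim)
        total += pos
    return total / n


def quick_reduct(X, y, candidates):
    """Greedy forward selection until the full-set dependency is reached."""
    if not candidates:
        return []
    reduct, best = [], 0.0
    full = dependency(X, y, candidates)
    while best < full:
        scores = {j: dependency(X, y, reduct + [j])
                  for j in candidates if j not in reduct}
        if not scores:
            break
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best:
            break  # no remaining gene improves the dependency
        reduct.append(j_best)
        best = scores[j_best]
    return reduct


# Toy data (hypothetical): 6 samples x 4 genes, two classes.
X = [
    [0.10, 5.0, 0.3, 1.00],
    [0.20, 4.0, 0.9, 1.10],
    [0.15, 6.0, 0.5, 0.90],
    [0.90, 5.5, 0.4, 1.00],
    [0.80, 4.5, 0.8, 1.05],
    [0.95, 5.2, 0.2, 0.95],
]
y = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]

# Stage 1: rank genes by information gain (median split) and keep the top 3.
columns = [[row[j] for row in X] for j in range(len(X[0]))]
gains = [information_gain(col, y, statistics.median(col)) for col in columns]
top = sorted(range(len(gains)), key=lambda j: gains[j], reverse=True)[:3]

# Stage 2: fuzzy rough quick reduct on the genes that survived stage 1.
reduct = quick_reduct(X, y, top)
print("stage 1 genes:", top)
print("stage 2 reduct:", reduct)
```

In this toy example the first gene separates the two classes cleanly, so it receives the highest information gain and is the first gene picked by the greedy reduct. On real microarray data the first stage would typically keep far more than three genes, and the choice of similarity measure in the second stage is exactly where the paper's customized measure would replace the stand-in used here.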
Arunkumar, C. and Ramakrishnan, S.
"Attribute selection using fuzzy roughset based customized similarity measure for lung cancer microarray gene expression data,"
Future Computing and Informatics Journal: Vol. 3, Article 10.
Available at: https://digitalcommons.aaru.edu.jo/fcij/vol3/iss1/10