Open Science Research Excellence

Open Science Index

Commenced in January 2007 Frequency: Monthly Edition: International Paper Count: 34

An Improved K-Means Algorithm for Gene Expression Data Clustering

Clustering is a data mining technique and a subject of active research that assists in biological pattern recognition and the extraction of new knowledge from raw data. Clustering is the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to one another and dissimilar to objects in other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters, yielding an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Because K-Means is extremely sensitive to the initial choice of centers, and a poor choice may lead to a local optimum that is far inferior to the global optimum, we propose a strategy for initializing the K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results show that efficiency is significantly improved.
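
The abstract does not say which initialization strategy the authors propose. As a rough illustration only, the sketch below pairs plain K-Means iterations with a farthest-point seeding heuristic, which is an assumption chosen for this example and not the paper's method. The function names and the toy data are hypothetical.

```python
import math
import random


def farthest_point_init(points, k, seed=0):
    """Pick k initial centers: a random first one, then repeatedly the point
    farthest from its nearest already-chosen center (an assumed heuristic)."""
    rng = random.Random(seed)
    centers = [points[rng.randrange(len(points))]]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(farthest)
    return centers


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-Means iterations starting from the centers chosen above."""
    centers = farthest_point_init(points, k, seed)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Toy "expression profiles": two well-separated groups of three genes each.
profiles = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, labels = kmeans(profiles, k=2)
```

Spreading the initial centers apart is one common way to reduce the chance of starting two centers in the same group, which is the sensitivity the abstract describes.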

Biomolecules Based Microarray for Screening Human Endothelial Cells Behavior

Endothelial Progenitor Cell (EPC) based therapies continue to be of interest to treat ischemic events based on their proven role to promote blood vessel formation and thus tissue re-vascularisation. Current strategies for the production of clinical-grade EPCs require the in vitro isolation of EPCs from peripheral blood followed by cell expansion to provide sufficient quantities of EPCs for cell therapy. This study aims to examine the use of different biomolecules to significantly improve the current strategy of EPC capture and expansion on collagen type I (Col I). In this study, four different biomolecules were immobilised on a surface and then investigated for their capacity to support EPC capture and proliferation. First, a cell microarray platform was fabricated by coating a glass surface with epoxy functional allyl glycidyl ether plasma polymer (AGEpp) to mediate biomolecule binding. The four candidate biomolecules tested were Col I, collagen type II (Col II), collagen type IV (Col IV) and vascular endothelial growth factor A (VEGF-A), which were arrayed on the epoxy-functionalised surface using a non-contact printer. The surrounding area between the printed biomolecules was passivated with polyethylene glycol-bisamine (A-PEG) to prevent non-specific cell attachment. EPCs were seeded onto the microarray platform and cell numbers quantified after 1 h (to determine capture) and 72 h (to determine proliferation). All of the extracellular matrix (ECM) biomolecules printed demonstrated an ability to capture EPCs within 1 h of cell seeding, with Col II exhibiting the highest level of attachment when compared to the other biomolecules. Interestingly, Col IV exhibited the highest increase in EPC expansion after 72 h when compared to Col I, Col II and VEGF-A. These results provide information for significant improvement in the capture and expansion of human EPCs for further application.

Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

In recent years, there has been an explosion in the use of technologies that help to discover diseases. For example, DNA microarrays allow us, for the first time, to obtain a "global" view of the cell. They have great potential to provide accurate medical diagnoses and to help find the right treatment and cure for many diseases. Various classification algorithms can be applied to such microarray datasets to devise methods that predict the occurrence of leukemia. In this study, we compared the classification accuracy and response time of eleven decision tree methods and six rule classifier methods using five performance criteria. The experimental results show that Random Tree produces better results and takes the least time to build a model among the tree classifiers. Among the classification rule algorithms, the nearest-neighbor-like algorithm (NNge) is the best, owing to its high accuracy and the least time taken to build a model.

A Cuckoo Search with Differential Evolution for Clustering Microarray Gene Expression Data

A DNA microarray is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. This challenge is addressed by clustering, which reveals natural structures and identifies interesting patterns in the underlying data. In this paper, gene-based clustering of gene expression data is proposed using Cuckoo Search with Differential Evolution (CS-DE). The experimental results are analyzed on gene expression benchmark datasets and show that CS-DE outperforms CS on these datasets. To validate the clustering results, the method is tested with one internal and one external cluster validation index.

Application of KL Divergence for Estimation of Each Metabolic Pathway Genes

Developing a method to estimate gene functions is an important task in bioinformatics. One approach to annotation is to identify the metabolic pathway in which genes are involved. Since gene expression data reflect various intracellular phenomena, those data are considered to be related to genes' functions. However, it has been difficult to estimate gene function with high accuracy. The low accuracy is thought to be caused by the difficulty of measuring gene expression accurately: even when measured under the same conditions, gene expression levels usually vary. In this study, we propose a feature extraction method that focuses on the variability of gene expression in order to estimate genes' metabolic pathways accurately. First, we estimated the distribution of each gene's expression from replicate data. Next, we calculated the similarity between all gene pairs using KL divergence, which quantifies how much one distribution differs from another. Finally, we used the similarity vectors as feature vectors and trained a multiclass SVM to identify the genes' metabolic pathways. To evaluate the method, we applied it to budding yeast and trained the multiclass SVM to identify seven metabolic pathways. The accuracy obtained with our method was higher than that obtained from the raw gene expression data. Thus, our method combined with KL divergence is useful for identifying genes' metabolic pathways.
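
The first two steps described above (estimating a distribution per gene from replicates, then comparing all gene pairs by KL divergence) might look roughly like the following. The abstract does not state which distribution family was used, so the univariate Gaussian fit here is an assumption, and the gene names and values are hypothetical. The SVM training step is not shown.

```python
import math
import statistics


def fit_gaussian(replicates):
    """Estimate mean and sample standard deviation from replicate measurements."""
    return statistics.mean(replicates), statistics.stdev(replicates)


def kl_gaussian(p, q):
    """KL(p || q) for two univariate Gaussians given as (mean, std) pairs."""
    (mu_p, sd_p), (mu_q, sd_q) = p, q
    return math.log(sd_q / sd_p) + (sd_p**2 + (mu_p - mu_q) ** 2) / (2 * sd_q**2) - 0.5


def kl_feature_vectors(expression):
    """For each gene, build the vector of KL divergences to every gene."""
    models = {gene: fit_gaussian(vals) for gene, vals in expression.items()}
    genes = list(models)
    return {g: [kl_gaussian(models[g], models[h]) for h in genes] for g in genes}


# Hypothetical replicate measurements for three genes under one condition.
expression = {
    "geneA": [1.0, 1.2, 0.9, 1.1],
    "geneB": [1.1, 1.0, 1.2, 0.9],
    "geneC": [4.0, 4.5, 3.8, 4.2],
}
features = kl_feature_vectors(expression)
```

Note that KL divergence is asymmetric and grows as distributions differ, so a small value indicates similar genes; each row of `features` could then serve as one gene's feature vector for a classifier.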

Imputation Technique for Feature Selection in Microarray Data Set

Analyzing DNA microarray data sets is a great challenge for bioinformaticians because of the complexity of applying statistical and machine learning techniques. The challenge is doubled if the microarray data sets contain missing data, which happens regularly, because these techniques cannot deal with missing data. One of the most important data analysis processes for microarray data sets is feature selection, which finds the most important genes affecting a certain disease. In this paper, we introduce a technique for imputing the missing data in microarray data sets while performing feature selection.

Apoptosis Pathway Targeted by Thymoquinone in MCF7 Breast Cancer Cell Line

Array-based gene expression analysis is a powerful tool to profile expression of genes and to generate information on therapeutic effects of new anti-cancer compounds. The anti-apoptotic effect of thymoquinone was studied in the MCF7 breast cancer cell line using gene expression profiling with cDNA microarray. The purity and yield of RNA samples were determined using the RNeasyPlus Mini kit. The Agilent RNA 6000 NanoLabChip kit was used to evaluate the quantity of the RNA samples. AffinityScript RT oligo-dT promoter primer was used to generate cDNA strands. T7 RNA polymerase was used to convert cDNA to cRNA. The cRNA samples and human universal reference RNA were labelled with Cy-3-CTP and Cy-5-CTP, respectively. Feature Extraction and GeneSpring software were used to analyse the data. The single experiment analysis revealed involvement of 64 pathways with up-regulated genes and 78 pathways with down-regulated genes. The MAPK and p38-MAPK pathways were inhibited due to the up-regulation of the PTPRR gene. The inhibition of p38-MAPK suggested up-regulation of the TGF-β pathway. Inhibition of p38-MAPK caused up-regulation of TP53 and down-regulation of Bcl2 genes, indicating involvement of the intrinsic apoptotic pathway. Down-regulation of the CARD16 gene as an adaptor molecule regulated CASP1 and suggested necrosis-like programmed cell death and involvement of caspase in apoptosis. Furthermore, down-regulation of GPCR and EGF-EGFR signalling pathways suggested reduction of ER. Involvement of the AhR pathway, which controls the cytochrome P450 and glucuronidation pathways, indicated metabolism of thymoquinone. The findings showed differential expression of several genes in apoptosis pathways with thymoquinone treatment in estrogen receptor-positive breast cancer cells.

Integration of Microarray Data into a Genome-Scale Metabolic Model to Study Flux Distribution after Gene Knockout

Prediction of perturbations after genetic manipulation (especially gene knockout) is one of the important challenges in systems biology. In this paper, a new algorithm is introduced that integrates microarray data into the metabolic model. The algorithm was used to study the change in the cell phenotype after knockout of the Gss gene in Escherichia coli BW25113. Implementation of the algorithm indicated that gene deletion resulted in greater activation of the metabolic network. Growth yield was higher, and fewer regulating genes were identified for the mutant in comparison with the wild-type strain.

Statistical Measures and Optimization Algorithms for Gene Selection in Lung and Ovarian Tumor

Microarray technology is universally used in the study of disease diagnosis using gene expression levels. The main shortcoming of gene expression data is that it includes thousands of genes and a small number of samples. Abundant methods and techniques have been proposed for tumor classification using microarray gene expression data. Feature or gene selection methods can be used to mine the genes that are directly involved in the classification and to eliminate irrelevant genes. In this paper, statistical measures such as T-Statistics, Signal-to-Noise Ratio (SNR) and F-Statistics are used to rank the genes. The ranked genes are used for further classification. The Particle Swarm Optimization (PSO) algorithm and the Shuffled Frog Leaping (SFL) algorithm are used to find the significant genes among the top-m ranked genes. The Naïve Bayes Classifier (NBC) is used to classify the samples based on the significant genes. The proposed work is applied to Lung and Ovarian datasets. The experimental results show that the proposed method achieves 100% accuracy in all the three datasets, and the results are compared with previous works.
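
As a small illustration of the ranking step, the sketch below scores each gene with a t-statistic and a signal-to-noise ratio and keeps the top-m genes. The exact formulas used in the paper are not given in the abstract; Welch's t-statistic and the common "difference of means over sum of standard deviations" form of SNR are assumed here, and the data are made up. The PSO, SFL and Naïve Bayes stages are not shown.

```python
import math
import statistics


def t_statistic(a, b):
    """Welch's t-statistic for two groups of expression values."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))


def signal_to_noise(a, b):
    """Difference of class means divided by the sum of class standard deviations."""
    return (statistics.mean(a) - statistics.mean(b)) / (statistics.stdev(a) + statistics.stdev(b))


def rank_genes(data, score, top_m):
    """Return the top_m gene names ordered by absolute score, largest first."""
    scored = {gene: abs(score(tumor, normal)) for gene, (tumor, normal) in data.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_m]


# Hypothetical expression values: gene -> (tumor samples, normal samples).
data = {
    "g1": ([5.1, 5.3, 4.9, 5.2], [1.0, 1.2, 0.9, 1.1]),  # strongly different
    "g2": ([2.0, 2.4, 1.8, 2.2], [1.9, 2.3, 2.1, 1.7]),  # nearly identical
    "g3": ([3.0, 3.6, 2.7, 3.3], [2.0, 2.5, 1.8, 2.3]),  # moderately different
}
top_by_t = rank_genes(data, t_statistic, top_m=2)
top_by_snr = rank_genes(data, signal_to_noise, top_m=2)
```

The top-m list produced this way would then be the search space handed to an optimizer, which is how the abstract describes the role of the ranking.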

Performance Analysis of Genetic Algorithm with kNN and SVM for Feature Selection in Tumor Classification

Tumor classification is a key area of research in the field of bioinformatics. Microarray technology is commonly used in the study of disease diagnosis using gene expression levels. The main drawback of gene expression data is that it contains thousands of genes and very few samples. Feature selection methods are used to select the informative genes from the microarray. These methods considerably improve the classification accuracy. In the proposed method, a Genetic Algorithm (GA) is used for effective feature selection. Informative genes are identified based on the T-Statistics, Signal-to-Noise Ratio (SNR) and F-Test values. The initial candidate solutions of the GA are obtained from the top-m informative genes. The classification accuracy of the k-Nearest Neighbor (kNN) method is used as the fitness function for the GA. In this work, kNN and Support Vector Machine (SVM) are used as the classifiers. The experimental results show that the proposed work is suitable for effective feature selection. With the help of the selected genes, the GA-kNN method achieves 100% accuracy in 4 datasets and the GA-SVM method achieves it in 5 out of 10 datasets. The GA with kNN and SVM is demonstrated to be an accurate method for microarray-based tumor classification.

Principal Component Analysis using Singular Value Decomposition of Microarray Data

A series of microarray experiments produces observations of differential expression for thousands of genes across multiple conditions. Principal component analysis (PCA) has been widely used in multivariate data analysis to reduce the dimensionality of the data in order to simplify subsequent analysis and allow for summarization of the data in a parsimonious manner. PCA, which can be implemented via a singular value decomposition (SVD), is useful for analysis of microarray data. To apply PCA using SVD, we use the DNA microarray data for the small round blue cell tumors (SRBCT) of childhood by Khan et al. (2001). To decide the number of components that account for a sufficient amount of information, we draw a scree plot. The biplot, a graphic display associated with PCA, reveals important features that exhibit the relationships between variables and also the relationships of variables with observations.
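
A minimal sketch of PCA computed through the SVD of the centered data matrix is shown below, using NumPy. The synthetic matrix stands in for real expression data (the SRBCT data are not reproduced here), and the function name `pca_svd` is hypothetical. The `explained` values are what a scree plot would display.

```python
import numpy as np


def pca_svd(X, n_components):
    """PCA of a samples-by-genes matrix X using the SVD of the centered data."""
    Xc = X - X.mean(axis=0)  # center each gene (column)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # fraction of variance per component
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates
    loadings = Vt[:n_components].T  # gene weights per component
    return scores, loadings, explained


rng = np.random.default_rng(0)
# Synthetic stand-in for expression data: 8 samples, 50 genes, one dominant direction.
X = np.outer(rng.normal(size=8), rng.normal(size=50)) + 0.01 * rng.normal(size=(8, 50))
scores, loadings, explained = pca_svd(X, n_components=2)
```

Plotting the rows of `scores` together with the rows of `loadings` on the same axes is the basic idea behind a biplot.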

Gene Selection Guided by Feature Interdependence

Cancers can normally be marked by a number of differentially expressed genes, which show enormous potential as biomarkers for a given disease. In recent years, cancer classification based on gene expression profiles derived from high-throughput microarrays has been widely used. The selection of discriminative genes is, therefore, an essential preprocessing step in carcinogenesis studies. In this paper, we propose a novel gene selector using information-theoretic measures for biological discovery. This multivariate filter is a four-stage framework that analyzes feature relevance, feature interdependence, feature redundancy-dependence and subset rankings, and it has been examined on the colon cancer data set. Our experimental results show that the proposed method outperforms other information-theory-based filters in all aspects of classification error and classification performance.

Categorization and Estimation of Relative Connectivity of Genes from Meta-OFTEN Network

The most common result of the analysis of high-throughput data in molecular biology is a global list of genes, ranked according to a certain score. The score can be a measure of differential expression. Recent work proposed a new method for selecting a number of genes in a ranked gene list from microarray gene expression data such that this set forms the Optimally Functionally Enriched Network (OFTEN), formed by known physical interactions between genes or their products. Here we present calculation results of the relative connectivity of genes from the META-OFTEN network and a tentative biological interpretation of the most reproducible signal. The relative connectivity and inbetweenness values of genes from the META-OFTEN network were estimated.

Probe Selection for Pathway-Specific Microarray Probe Design Minimizing Melting Temperature Variance

In molecular biology, microarray technology is widely and successfully utilized to efficiently measure gene activity. For less studied organisms, methods to design custom-made microarray probes are available. One design criterion is to select probes with minimal melting temperature variance, thus ensuring similar hybridization properties. If the microarray application focuses on the investigation of metabolic pathways, it is not necessary to cover the whole genome. It is more efficient to cover each metabolic pathway with a limited number of genes. Firstly, an approach is presented which minimizes the overall melting temperature variance of selected probes for all genes of interest. Secondly, the approach is extended to include the additional constraints of covering all pathways with a limited number of genes while minimizing the overall variance. The new optimization problem is solved by a bottom-up programming approach which reduces the complexity to make it computationally feasible. As an example, the new method is applied to the selection of microarray probes in order to cover all fungal secondary metabolite gene clusters for Aspergillus terreus.
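
To make the first design criterion concrete, the sketch below picks one probe per gene so that the variance of the chosen melting temperatures is as small as possible. It uses exhaustive search, which is only an illustration for tiny inputs and is not the bottom-up approach the abstract mentions. The probe identifiers and temperatures are hypothetical, and the pathway-coverage constraint is not modeled.

```python
import itertools
import statistics


def select_probes(candidates):
    """Choose one probe per gene so the variance of the chosen melting temperatures is minimal.

    candidates maps gene -> {probe_id: melting_temperature}. Exhaustive search,
    so only suitable for small illustrative inputs.
    """
    genes = list(candidates)
    best_choice, best_var = None, float("inf")
    for combo in itertools.product(*(candidates[g].items() for g in genes)):
        var = statistics.pvariance([tm for _, tm in combo])
        if var < best_var:
            best_choice, best_var = {g: pid for g, (pid, _) in zip(genes, combo)}, var
    return best_choice, best_var


# Hypothetical melting temperatures in degrees Celsius.
candidates = {
    "geneA": {"A1": 58.0, "A2": 62.5},
    "geneB": {"B1": 61.0, "B2": 66.0},
    "geneC": {"C1": 55.0, "C2": 62.0, "C3": 70.0},
}
choice, variance = select_probes(candidates)
```

The number of combinations grows as the product of the candidate counts per gene, which is why a realistic probe set needs a more efficient strategy than this brute-force version.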

A Comparison of SVM-based Criteria in Evolutionary Method for Gene Selection and Classification of Microarray Data

An evolutionary method whose selection and recombination operations are based on generalization error-bounds of the support vector machine (SVM) can select a subset of potentially informative genes for an SVM classifier very efficiently [7]. In this paper, we use the derivative of the error-bound (first-order criteria) to select and recombine gene features in the evolutionary process, and compare the performance of the derivative of the error-bound with that of the error-bound itself (zero-order) in the evolutionary process. We also investigate several error-bounds and their derivatives to compare their performance and find the best criteria for gene selection and classification. We use 7 cancer-related human gene expression datasets to evaluate the performance of the zero-order and first-order criteria of error-bounds. Though both criteria follow the same strategy in theory, the experimental results demonstrate the best criterion for microarray gene expression data.

A Novel Microarray Biclustering Algorithm

Biclustering aims at identifying several biclusters that reveal potential local patterns in a microarray matrix. A bicluster is a sub-matrix of the microarray consisting of only a subset of genes that are co-regulated in a subset of conditions. In this study, we extend the motif of subspace clustering to present a K-biclusters clustering (KBC) algorithm for the microarray biclustering problem. Besides minimizing the dissimilarities between genes and bicluster centers within all biclusters, the objective function of the KBC algorithm additionally takes into account how to minimize the residues within all biclusters based on the mean square residue model. In addition, the objective function also maximizes the entropy of conditions to encourage more conditions to contribute to the identification of biclusters. The KBC algorithm adopts a K-means type clustering process to efficiently optimize the partition into K biclusters. A set of experiments on a practical microarray dataset is presented to show the performance of the proposed KBC algorithm.

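
The mean square residue mentioned above is commonly defined, for a bicluster with rows I and columns J, as the average of (a_ij - a_iJ - a_Ij + a_IJ)^2 over the sub-matrix, where a_iJ, a_Ij and a_IJ are the row, column and overall means. The sketch below computes that standard quantity only; it is assumed to match the residue model the abstract refers to, and it does not reproduce the full KBC objective. The matrix values are hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster given by row and column index lists."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(r[j] for r in sub) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = sum(
        (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return total / (n_rows * n_cols)


# Hypothetical genes-by-conditions matrix; rows 0-2 follow an additive pattern on columns 0-2.
expression = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.5],
    [4.0, 5.0, 6.0, 7.0],
    [8.0, 1.0, 5.0, 2.0],
]
coherent = mean_squared_residue(expression, rows=[0, 1, 2], cols=[0, 1, 2])
whole = mean_squared_residue(expression, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
```

A residue near zero indicates that the selected genes rise and fall together across the selected conditions, which is why the coherent sub-matrix scores far lower than the whole matrix.
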
An SVM based Classification Method for Cancer Data using Minimum Microarray Gene Expressions

This paper presents a novel method for improving cancer classification performance with very few microarray gene expression data. The method employs classification with individual gene ranking and gene subset ranking. For selection and classification, the proposed method uses the same classifier. The method is applied to three publicly available cancer gene expression datasets: Lymphoma, Liver and Leukaemia. Three different classifiers, namely support vector machines one-against-all (SVM-OAA), K nearest neighbour (KNN) and linear discriminant analysis (LDA), were tested, and the results indicate an improvement in the performance of the SVM-OAA classifier, with satisfactory results on all three datasets when compared with the other two classifiers.

A Simple Affymetrix Ratio-transformation Method Yields Comparable Expression Level Quantifications with cDNA Data

Gene expression profiling is rapidly evolving into a powerful technique for investigating tumor malignancies. Researchers are overwhelmed with microarray-based platforms and methods that give them the freedom to conduct large-scale gene expression profiling measurements. Simultaneously, investigations into cross-platform integration methods have started gaining momentum due to their underlying potential to help comprehend a myriad of broad biological issues in tumor diagnosis, prognosis, and therapy. However, comparing results from different platforms remains a challenging task, as various inherent technical differences exist between the microarray platforms. In this paper, we explain a simple ratio-transformation method, which can provide some common ground for the cDNA and Affymetrix platforms towards cross-platform integration. The method is based on the characteristic data attributes of the Affymetrix and cDNA platforms. In this work, we considered seven childhood leukemia patients and their gene expression levels on both platforms. With a dataset of 822 differentially expressed genes from both platforms, we applied a specific ratio treatment to the Affymetrix data, which subsequently showed an improvement in the relationship with the cDNA data.

Improved Wavelet Neural Networks for Early Cancer Diagnosis Using Clustering Algorithms

Wavelet neural networks (WNNs) have emerged as a vital alternative to the vastly studied multilayer perceptrons (MLPs) since their first implementation. In this paper, we applied various clustering algorithms, namely K-means (KM), Fuzzy C-means (FCM), symmetry-based K-means (SBKM), symmetry-based Fuzzy C-means (SBFCM) and modified point symmetry-based K-means (MPKM), in choosing the translation parameter of a WNN. These modified WNNs were then applied to heterogeneous cancer classification using benchmark microarray data and compared against the conventional WNN with random initialization. Experimental results showed that a WNN classifier with the MPKM algorithm is more precise than the conventional WNN as well as the WNNs with other clustering algorithms.

Differentiation of Gene Expression Profiles Data for Liver and Kidney of Pigs

Using DNA microarrays, a comparative analysis of gene expression profiles is carried out in the liver and kidneys of pigs. The hypothesis of cross-hybridization of one probe with different cDNA sites of the same gene or of different genes is tested, and it is shown that cross-hybridization can be a source of substantial errors when identifying key genes in an organ-specific transcriptome. It is revealed that differences in gene expression profiles are consistent with the function, morphology, biochemistry and histology of these organs.

Multidimensional Visualization Tools for Analysis of Expression Data

Expression data analysis is based mostly on the statistical approaches that are indispensable for the study of biological systems. Large amounts of multidimensional data resulting from the high-throughput technologies are not completely served by biostatistical techniques and are usually complemented with visual, knowledge discovery and other computational tools. In many cases, in biological systems we only speculate on the processes that are causing the changes, and it is the visual explorative analysis of data during which a hypothesis is formed. We would like to show the usability of multidimensional visualization tools and promote their use in life sciences. We survey and show some of the multidimensional visualization tools in the process of data exploration, such as parallel coordinates and radviz and we extend them by combining them with the self-organizing map algorithm. We use a time course data set of transitional cell carcinoma of the bladder in our examples. Analysis of data with these tools has the potential to uncover additional relationships and non-trivial structures.

Systholic Boolean Orthonormalizer Network in Wavelet Domain for Microarray Denoising

We describe a novel method for removing noise (in the wavelet domain) of unknown variance from microarrays. The method is based on the following procedure: we apply 1) the Bidimensional Discrete Wavelet Transform (DWT-2D) to the noisy microarray, 2) scaling and rounding to the coefficients of the highest subbands (to obtain integer and positive coefficients), 3) bit-slicing to the new highest subbands (to obtain bit-planes), 4) the Systholic Boolean Orthonormalizer Network (SBON) to the input bit-plane set, obtaining two orthonormal output bit-plane sets (in a Boolean sense), and we project one set on the other by means of an AND operation, and then 5) re-assembling and 6) rescaling. Finally, 7) we apply the inverse DWT-2D and reconstruct a microarray from the modified wavelet coefficients. Denoising results compare favorably to most of the methods in use at the moment.
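
Steps 2, 3, 5 and 6 of the procedure above (scaling and rounding, bit-slicing, re-assembling, rescaling) can be sketched on a flat list of coefficients as follows. The wavelet transform and the SBON step are not specified in the abstract and are deliberately left out, so this round trip changes nothing beyond quantization. All function names and the sample coefficients are hypothetical.

```python
def scale_and_round(coeffs, n_bits):
    """Map real coefficients to non-negative integers that fit in n_bits."""
    lo, hi = min(coeffs), max(coeffs)
    step = (hi - lo) / (2**n_bits - 1) or 1.0  # guard against a constant input
    return [round((c - lo) / step) for c in coeffs], lo, step


def bit_slice(values, n_bits):
    """Split non-negative integers into bit-planes, least significant plane first."""
    return [[(v >> b) & 1 for v in values] for b in range(n_bits)]


def reassemble(planes):
    """Rebuild integers from bit-planes produced by bit_slice."""
    return [sum(bit << b for b, bit in enumerate(bits)) for bits in zip(*planes)]


def rescale(ints, lo, step):
    """Undo scale_and_round, up to quantization error."""
    return [lo + q * step for q in ints]


# Hypothetical subband coefficients.
coeffs = [-1.5, 0.0, 0.3, 2.0]
ints, lo, step = scale_and_round(coeffs, n_bits=8)
planes = bit_slice(ints, n_bits=8)
# A Boolean processing step on the bit-planes would go here; it is omitted in this sketch.
restored = rescale(reassemble(planes), lo, step)
```

Working on bit-planes turns each coefficient into a set of binary values, which is what allows a Boolean operation such as AND to be applied plane by plane.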

Analysis of DNA Microarray Data using Association Rules: A Selective Study

DNA microarrays allow the measurement of expression levels for a large number of genes, perhaps all genes of an organism, within a number of different experimental samples. Because most cellular processes are regulated by changes in gene expression, it is very important to extract biologically meaningful information from this huge amount of expression data in order to know the current state of the cell. Association rule mining techniques help to find association relationships between genes. Numerous association rule mining algorithms have been developed to analyze and associate this huge amount of gene expression data. This paper focuses on some of the popular association rule mining algorithms developed to analyze gene expression data.

Feature Subset Selection approach based on Maximizing Margin of Support Vector Classifier

Identifying cancer genes that might predict the clinical behavior of different types of cancer is challenging due to the huge number of genes and the small number of patient samples. A new method is proposed based on supervised classification learning with support vector machines (SVMs). A new solution is described that introduces the Maximized Margin (MM) into the subset criterion, which makes it possible to approach the lowest generalization error rate. In class prediction problems, gene selection is essential to improve accuracy and to identify genes for cancer disease. The performance of the new method was evaluated in experiments on real-world data, where it gave better classification accuracy.

An Automatic Gridding and Contour Based Segmentation Approach Applied to DNA Microarray Image Analysis

DNA microarray technology is widely used by geneticists to diagnose or treat diseases through gene expression. This technology is based on the hybridization of a tissue's DNA sequence onto a substrate and the further analysis of the image formed by the thousands of genes in the DNA as green, red or yellow spots. The process of DNA microarray image analysis involves finding the location of the spots and quantifying their expression levels. In this paper, a tool to perform DNA microarray image analysis is presented, including a spot addressing method based on the image projections, spot segmentation through contour-based segmentation, and the extraction of relevant gene expression information.

First Studies of the Influence of Single Gene Perturbations on the Inference of Genetic Networks

Inferring the network structure from time series data is a hard problem, especially if the time series is short and noisy. DNA microarrays are a technology for monitoring the mRNA concentration of thousands of genes simultaneously, and they produce data with these characteristics. In this study we investigate the influence of the experimental design on the quality of the result. More precisely, we investigate the influence of two different types of random single gene perturbations on the inference of genetic networks from time series data. To obtain an objective quality measure for this influence, we simulate gene expression values with a biologically plausible model of a known network structure. Within this framework we study the influence of single gene knock-outs, as opposed to linearly controlled expression of single genes, on the quality of the inferred network structure.

Influence of Noise on the Inference of Dynamic Bayesian Networks from Short Time Series

In this paper we investigate the influence of external noise on the inference of network structures. The purpose of our simulations is to gain insights into the experimental design of microarray experiments used to infer, e.g., transcription regulatory networks. Here, external noise means that the dynamics of the system under investigation, e.g., temporal changes of mRNA concentration, are affected by measurement errors. In addition to external noise, another problem arises in the context of microarray experiments. In practice, it is not possible to monitor the mRNA concentration over an arbitrarily long time period, as demanded by the statistical methods used to learn the underlying network structure. For this reason, we use only short time series to make our simulations more biologically plausible.

A Systems Approach to Gene Ranking from DNA Microarray Data of Cervical Cancer

In this paper we present a method for gene ranking from DNA microarray data. More precisely, we calculate correlation networks, which are unweighted and undirected graphs, from microarray data of cervical cancer, where each network represents a tissue of a certain tumor stage and each node in the network represents a gene. From these networks we extract one tree for each gene by a local decomposition of the correlation network. A tree is interpreted as representing the n-nearest neighbor genes on the n-th level of the tree, measured by the Dijkstra distance, and hence it gives the local embedding of a gene within the correlation network. For the obtained trees we measure the pairwise similarity between trees rooted at the same gene from normal to cancerous tissues. This evaluates the modification of the tree topology due to progression of the tumor. Finally, we rank the obtained similarity values from all tissue comparisons and select the top ranked genes. For these genes the local neighborhood in the correlation networks changes most between normal and cancerous tissues. As a result, we find that the top ranked genes are candidates suspected to be involved in tumor growth, which indicates that our method captures essential information from the underlying DNA microarray data of cervical cancer.

A Phenomic Algorithm for Reconstruction of Gene Networks

The goal of Gene Expression Analysis is to understand the processes that underlie the regulatory networks and pathways controlling inter-cellular and intra-cellular activities. In recent times microarray datasets are extensively used for this purpose. The scope of such analysis has broadened in recent times towards reconstruction of gene networks and other holistic approaches of Systems Biology. Evolutionary methods are proving to be successful in such problems and a number of such methods have been proposed. However all these methods are based on processing of genotypic information. Towards this end, there is a need to develop evolutionary methods that address phenotypic interactions together with genotypic interactions. We present a novel evolutionary approach, called Phenomic algorithm, wherein the focus is on phenotypic interaction. We use the expression profiles of genes to model the interactions between them at the phenotypic level. We apply this algorithm to the yeast sporulation dataset and show that the algorithm can identify gene networks with relative ease.

A Hybrid Approach for Selection of Relevant Features for Microarray Datasets

Developing an accurate classifier for high dimensional microarray datasets is a challenging task because only a small number of samples is available. Therefore, it is important to determine a set of relevant genes that classify the data well. Traditionally, gene selection methods select the top ranked genes according to their discriminatory power. Often these genes are correlated with each other, resulting in redundancy. In this paper, we propose a hybrid method using feature ranking and a wrapper method (Genetic Algorithm with multiclass SVM) to identify a set of relevant genes that classify the data more accurately. A new fitness function for the genetic algorithm is defined that focuses on selecting the smallest set of genes that provides maximum accuracy. Experiments have been carried out on four well-known datasets. The proposed method provides better results than those found in the literature in terms of both classification accuracy and number of genes selected.
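
The abstract does not give the fitness function itself. One common way to reward high accuracy with few genes is a weighted sum of the two, sketched below. The weight of 0.9, the names `fitness` and `toy_accuracy`, and the toy accuracy rule are assumptions made for illustration only; in a real wrapper the accuracy would come from cross-validating a multiclass SVM on the selected genes.

```python
from typing import Callable, List, Sequence


def fitness(mask: Sequence[int],
            accuracy_fn: Callable[[List[int]], float],
            weight: float = 0.9) -> float:
    """Score a chromosome: mask[i] == 1 means gene i is selected.

    Combines classification accuracy with a reward for small gene sets.
    The weighted-sum form and the default weight are assumptions.
    """
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty gene set cannot classify anything
    accuracy = accuracy_fn(selected)
    compactness = 1.0 - len(selected) / len(mask)
    return weight * accuracy + (1.0 - weight) * compactness


def toy_accuracy(selected: List[int]) -> float:
    """Hypothetical stand-in for cross-validated SVM accuracy.

    Pretends that genes 0 and 2 together separate the classes perfectly.
    """
    return 1.0 if {0, 2} <= set(selected) else 0.5
```

With this scoring, a chromosome that selects only genes 0 and 2 scores higher than one that selects all four genes, even though both reach the same toy accuracy.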
