Volume 19 / Issue 4

DOI: 10.3217/jucs-019-04-0483

 

Improving Accuracy of Decision Trees Using Clustering Techniques

Javier Torres-Niño (University Carlos III Madrid, Spain)

Alejandro Rodríguez-González (Centre for Plant Biotechnology and Genomics UPM-INIA, Spain)

Ricardo Colomo-Palacios (University Carlos III Madrid, Spain)

Enrique Jiménez-Domingo (University Carlos III Madrid, Spain)

Giner Alor-Hernandez (Instituto Tecnológico de Orizaba, Mexico)

Abstract: Data mining is an important part of information management technology. Simply put, it is a method for extracting and analyzing meaningful patterns and correlations in a large relational database. In data mining, decision trees are among the most widely used tools for decision support. In the emerging area of data mining applications, users of data mining tools face data sets comprising large numbers of features and instances. Such data sets are not easy to mine, because building an accurate classification model with a decision tree generally depends on several parameters, such as the dataset used and the configuration of the tree itself. This work presents a novel hybrid classifier system that improves the accuracy of decision trees using clustering techniques. The system consists of a clustering algorithm, a decision tree, and an optional module for identifying appropriate parameters for the clustering algorithm. Working together, these three modules are capable of increasing the accuracy of the solutions. The results have been validated using several well-known datasets and two decision tree algorithms. The accuracy percentages are compared to show the improvement achieved by our proposal, with good results. Finally, two clustering algorithms have been used to compare accuracy across the different proposals.

Keywords: accuracy improvement, clustering, decision tree

Categories: H.3.3, I.5.2, I.6.1
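
Illustrative sketch: the abstract does not say how the clustering algorithm and the decision tree are combined, so the Python code below shows only one plausible arrangement, under stated assumptions. It clusters the instances with a plain k-means and then fits a separate tree per cluster. The tree is reduced to a depth-1 decision stump to keep the example short, the number of clusters k is fixed by hand (standing in for the optional parameter-identification module), and the data set is a hypothetical toy example, not one of the data sets used in the paper. All function names are illustrative.

```python
from collections import Counter


def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; assumption: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        new = [[sum(col) / len(g) for col in zip(*g)] if g else c
               for g, c in zip(groups, centroids)]
        if new == centroids:
            break
        centroids = new
    return centroids


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y):
    """Depth-1 tree: the (feature, threshold) split with the fewest training errors."""
    m = majority(y)
    best = (sum(label != m for label in y), None, None, m, m)
    for f in range(len(X[0])):
        values = sorted({x[f] for x in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [label for x, label in zip(X, y) if x[f] <= t]
            right = [label for x, label in zip(X, y) if x[f] > t]
            ml, mr = majority(left), majority(right)
            errors = (sum(label != ml for label in left)
                      + sum(label != mr for label in right))
            if errors < best[0]:
                best = (errors, f, t, ml, mr)
    return best[1:]


def predict_stump(stump, x):
    f, t, left_label, right_label = stump
    if f is None:  # no useful split was found: always predict the majority class
        return left_label
    return left_label if x[f] <= t else right_label


def fit_hybrid(X, y, k):
    """Cluster first, then fit one stump on the instances of each cluster."""
    centroids = kmeans(X, k)
    default = (None, None, majority(y), majority(y))
    stumps = []
    for c in range(k):
        idx = [i for i, x in enumerate(X) if nearest(x, centroids) == c]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx])
                      if idx else default)
    return centroids, stumps


def predict_hybrid(model, x):
    centroids, stumps = model
    return predict_stump(stumps[nearest(x, centroids)], x)


def accuracy(predictions, y):
    return sum(p == label for p, label in zip(predictions, y)) / len(y)


# Hypothetical XOR-like toy data: no single axis-aligned split separates the classes.
X = [(0, 0), (10, 1), (0, 1), (10, 0), (1, 0), (11, 1), (1, 1), (11, 0)]
y = [0, 0, 1, 1, 0, 0, 1, 1]

single = fit_stump(X, y)
single_acc = accuracy([predict_stump(single, x) for x in X], y)

model = fit_hybrid(X, y, k=2)
hybrid_acc = accuracy([predict_hybrid(model, x) for x in X], y)

print(single_acc, hybrid_acc)
```

On this toy data, a single stump cannot do better than chance on the training set, while the per-cluster stumps separate the classes. These are training accuracies on a constructed example and say nothing about the results reported in the paper; a full-depth tree learner and held-out evaluation would be needed for a meaningful comparison.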