Towards a reverse engineering approach for guiding user in applying data mining
- Roberto Espinosa 1
- José-Norberto Mazón 2
- José Zubcoff 2
- 1 Universidad de Matanzas
- 2 Universitat d'Alacant
Publisher: Ángeles Saavedra Places ; Coral Calero Muñoz ; Universidad de La Coruña
ISBN: 978-84-9749-486-1
Year of publication: 2011
Pages: 23-30
Congress: JISBD´11 (16. 2011. A Coruña)
Type: Conference paper
Abstract
Data mining is at the core of the knowledge discovery process. However, an initial preprocessing step is crucial for ensuring reliable results within this process. Preprocessing of data is a time-consuming and non-trivial task, since data quality issues must be considered. This is even worse when dealing with complex data, not only because of the different kinds of complex data types (XML, multimedia, and so on), but also because of the high dimensionality of complex data. Therefore, to overcome this situation, in this position paper we propose using mechanisms based on data reverse engineering to automatically measure some data quality criteria on the data sources. These measures will guide the user in selecting the most adequate data mining algorithm in the early stages of the knowledge discovery process. Finally, it is worth noting that this work is a first step towards considering, in a systematic and structured manner, data quality criteria for supporting data miners in applying those algorithms that obtain the most reliable knowledge from the available data sources.
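As an illustration only, the idea of automatically measuring data quality criteria on a data source can be sketched as follows. The abstract does not name the criteria or how they map to algorithms, so the two measures used here (completeness and an attribute-to-record ratio as a rough indicator of dimensionality) and the function names are assumptions, not the authors' method.

```python
from typing import Any, Dict, List, Optional

# A record is a mapping from attribute name to value; None marks a missing value.
Row = Dict[str, Optional[Any]]


def completeness(rows: List[Row]) -> float:
    """Fraction of cells that are not missing (assumed quality criterion)."""
    cells = [value for row in rows for value in row.values()]
    if not cells:
        return 1.0
    return sum(value is not None for value in cells) / len(cells)


def dimensionality_ratio(rows: List[Row]) -> float:
    """Number of distinct attributes divided by number of records (assumed criterion)."""
    if not rows:
        return 0.0
    attributes = {name for row in rows for name in row}
    return len(attributes) / len(rows)


def quality_profile(rows: List[Row]) -> Dict[str, float]:
    """Collect the hypothetical measures into one profile of the data source."""
    return {
        "completeness": completeness(rows),
        "dimensionality_ratio": dimensionality_ratio(rows),
    }


# Toy data source with two missing cells out of eight.
sample = [
    {"a": 1, "b": None},
    {"a": 2, "b": 3},
    {"a": None, "b": 4},
    {"a": 5, "b": 6},
]
profile = quality_profile(sample)
print(profile)
```

A profile of this kind could then be inspected before choosing an algorithm, for example to prefer techniques that tolerate missing values when completeness is low. Any such mapping from measures to algorithms is left open here, since the abstract does not specify one.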