Instance reduction approach to machine learning and multi-database mining

Ireneusz Czarnowski; Piotr Jędrzejowicz

doi:10.17951/ai.2006.4.1.60-71

Abstract

The paper proposes a heuristic instance reduction algorithm as an approach to machine learning and knowledge discovery in centralized and distributed databases. The algorithm is based on an original method for selecting reference instances and produces a reduced training dataset, which can serve as input to the machine learning algorithms used for data mining tasks. For each instance in the dataset, the algorithm calculates the value of a similarity coefficient. These values are used to group instances into clusters. The number of clusters depends on the value of the so-called representation level, which is set by the user. From each cluster, only a limited number of instances is selected to form the reduced training set. Instances are selected using a population learning algorithm. The paper includes a description of the proposed approach and the results of a validating experiment.
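
The pipeline outlined in the abstract (compute a coefficient per instance, group instances by it, keep a few per group) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the coefficient formula (a column-weighted sum of min-max normalized attributes), the grouping rule (same class label and same rounded coefficient), the names `similarity_coefficients`, `reduce_instances`, `max_per_cluster`, and `decimals`, and above all the selection step, where a simple "closest to the cluster mean" rule stands in for the population learning algorithm.

```python
from collections import defaultdict


def similarity_coefficients(X):
    """Hypothetical coefficient, not the paper's formula: min-max normalize
    each attribute, weight it by its column sum, and add up per instance."""
    n_cols = len(X[0])
    lo = [min(row[j] for row in X) for j in range(n_cols)]
    hi = [max(row[j] for row in X) for j in range(n_cols)]
    norm = [
        [(row[j] - lo[j]) / (hi[j] - lo[j]) if hi[j] > lo[j] else 0.0
         for j in range(n_cols)]
        for row in X
    ]
    col_sums = [sum(r[j] for r in norm) for j in range(n_cols)]
    return [sum(r[j] * col_sums[j] for j in range(n_cols)) for r in norm]


def reduce_instances(X, y, max_per_cluster=1, decimals=1):
    """Return sorted indices of the instances kept in the reduced training set."""
    coeffs = similarity_coefficients(X)

    # Assumed grouping rule: same class label and same rounded coefficient.
    clusters = defaultdict(list)
    for i, (c, label) in enumerate(zip(coeffs, y)):
        clusters[(label, round(c, decimals))].append(i)

    selected = []
    for members in clusters.values():
        # Stand-in for the population learning algorithm: keep the members
        # whose coefficient is closest to the cluster mean (ties by index).
        mean_c = sum(coeffs[i] for i in members) / len(members)
        ranked = sorted(members, key=lambda i: (abs(coeffs[i] - mean_c), i))
        selected.extend(ranked[:max_per_cluster])
    return sorted(selected)


if __name__ == "__main__":
    X = [[0, 0], [0, 0], [1, 1], [1, 1]]
    y = [0, 0, 1, 1]
    print(reduce_instances(X, y))
```

In this sketch `max_per_cluster` caps how many instances survive from each cluster, and `decimals` controls how coarsely coefficients are grouped. Neither parameter should be read as the paper's representation level, whose exact definition the abstract does not give.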

DOI: http://dx.doi.org/10.17951/ai.2006.4.1.60-71
Date of publication: 2006-01-01 00:00:00
Date of submission: 2016-04-27 10:15:02

This work is licensed under a Creative Commons Attribution 4.0 International License.
