Undersampling Instance Selection for Hybrid and Incomplete Imbalanced Data
Oscar Camacho-Nieto (CIDETEC-IPN, Mexico)
Cornelio Yáñez-Marquez (Ciudad de México, Mexico)
Yenny Villuendas-Rey (CIDETEC-IPN, Mexico)
Abstract: This paper proposes a novel undersampling method for dealing with imbalanced datasets. The proposal is based on a novel instance importance measure (also introduced in this paper) and is able to balance hybrid and incomplete data. The numerical experiments carried out show that the proposed undersampling algorithm outperforms other state-of-the-art algorithms on well-known imbalanced datasets.
Keywords: hybrid and incomplete data, imbalanced data, undersampling
Categories: I.2.1, I.2.6, I.5.1