The Modified Principal Component Analysis Feature Extraction Method for the Task of Diagnosing Chronic Lymphocytic Leukemia Type B-CLL
Mariusz Topolski (Wrocław University of Science and Technology, Poland)
Abstract: The vast majority of medical problems are characterised by a relatively high-dimensional feature space, which is problematic for many classic pattern recognition algorithms because of the well-known curse of dimensionality. This creates the need for dimensionality reduction methods, which are divided into feature selection and feature extraction strategies. The most commonly used tool of the second group is Principal Component Analysis (PCA), which, unlike selection methods, does not pick a subset of the original features but transforms them mathematically into a lower-dimensional form. However, a natural downside of this algorithm is that it ignores class labels, even though they are available in supervised learning tasks. This work proposes a feature extraction algorithm based on the PCA approach that aims not only to reduce the feature space, but also to separate the class distributions in the available learning set. The research problem was to create a feature extraction method describing the prognosis for chronic lymphocytic leukemia type B-CLL that is at least as good as, or better than, other extraction methods. This goal was achieved for the binary and three-class cases, with five machine learning algorithms used to verify the quality of the extraction. The obtained results were compared using the paired-samples Wilcoxon test.
Keywords: Principal Component Analysis, data classification, lymphocytic leukemia type B-CLL, pattern recognition
Categories: H.2, H.3.7, H.5.4
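For orientation, the standard (unsupervised) PCA transformation mentioned in the abstract can be sketched as follows. This is a minimal illustration only and not the modified, class-aware method proposed in this work. The function names `pca_first_component` and `project` are hypothetical, the toy data is invented for the example, and power iteration is used here in place of a full eigendecomposition to keep the sketch dependency-free.

```python
import math


def pca_first_component(rows, iters=200):
    """Return (mean, direction) for the first principal component.

    Assumption: `rows` is a list of equal-length numeric feature vectors.
    Power iteration is a simplification; it can fail to converge to the
    top component if the start vector is orthogonal to it.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly apply cov and renormalise.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(rows, mean, v):
    """Map each sample to its coordinate along direction v (one feature)."""
    return [sum((r[j] - mean[j]) * v[j] for j in range(len(v))) for r in rows]


# Toy data (not from the study): two features lying on the line y = 2x.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
mean, direction = pca_first_component(data)
reduced = project(data, mean, direction)
```

Note that the class labels play no role anywhere in this computation: the direction is chosen purely to maximise variance, which is the limitation the abstract points out.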