Splice Site Prediction using Support Vector Machines with Context-Sensitive Kernel Functions
Yifei Chen (Vrije Universiteit Brussel, Belgium)
Feng Liu (Vrije Universiteit Brussel, Belgium)
Bram Vanschoenwinkel (Vrije Universiteit Brussel, Belgium)
Bernard Manderick (Vrije Universiteit Brussel, Belgium)
Abstract: This paper focuses on the use of support vector machines for a typical context-dependent classification task, splice site prediction. For this type of problem, it has been shown that a context-based approach should be preferred over a transformation approach, because the former can easily incorporate statistical measures or plug sensitivity information directly into distance functions. In this paper, we design three types of context-sensitive kernel functions: polynomial-based, radial basis function-based and negative distance-based kernels. The experimental results show that the radial basis function-based kernel with information gain weighting achieves the best accuracy and consistently outperforms the simple non-sensitive counterparts in both accuracy and model complexity. With well-designed features and carefully chosen context sizes, our system predicts splice sites with fairly high accuracy, reaching an FP95% rate of 3.94 for donor sites and 5.98 for acceptor sites, which approaches the current state of the art.
Keywords: kernel functions, splice site prediction, support vector machines
Categories: I.2.6, I.5.4, J.3