Abstract:
Complex biological data generated from various experiments are stored in diverse data types in multiple datasets. By appropriately representing each biological dataset as...Show MoreMetadata
Abstract:
Complex biological data generated from various experiments are stored in diverse data types in multiple datasets. By appropriately representing each biological dataset as a kernel matrix then combining them in solving problems, the kernel-based approach has become a spotlight in data integration and its application in bioinformatics and other fields as well. While linear combination of unweighed multiple kernels (UMK) is popular, there have been effort on multiple kernel learning (MKL) where optimal weights are learned by semi-definite programming or sequential minimal optimization (SMO-MKL). These methods provide high accuracy of biological prediction problems, but very complicated and hard to use, especially for non-experts in optimization. These methods are also usually of high computational cost and not suitable for large data sets. In this paper, we propose two simple but effective methods for determining weights for conic combination of multiple kernels. The former is to learn optimal weights formulated by our measure FSM for kernel matrix evaluation (feature space-based kernel matrix evaluation measure), denoted by FSM-MKL. The latter assigns a weight to each kernel that is proportional to the quality of the kernel, determining by direct cross validation, named proportionally weighted multiple kernels (PWMK). Experimental comparative evaluation of the four methods UMK, SMO-MKL, FSM-MKL and PWMK for the problem of protein-protein interactions shows that our proposed methods are simpler, more efficient but still effective. They achieved performances almost as high as that of MKL and higher than that of UMK.
Date of Conference: 13-17 July 2008
Date Added to IEEE Xplore: 08 August 2008
ISBN Information: