Abstract:
With the explosive growth of digitized data, knowledge discovery from data is attracting increasing attention across domains. Many data analysis and mining algorithms have been proposed in the machine learning field, but most of this work concentrates on outperforming previous methods on more datasets, so performance on a given application dataset remains unclear. Evaluating all available algorithms against a particular dataset can be computationally expensive and tedious. As a result, it is difficult for domain experts to choose a suitable data analysis algorithm, and the methods applied in practice are often limited to a few popular ones. This paper investigates the characteristics of a dataset using 23 proposed quantitative indicators. Based on these quantitative descriptions, a novel framework, PepAls, is proposed as an initial attempt to recommend a suitable analysis algorithm for a given application dataset. PepAls also predicts the best analysis result that up-to-date methods could reach on the dataset. Using PepAls, domain engineers can roughly preview the analysis result and quickly obtain a suitable algorithm, and newly proposed machine learning methods can be easily introduced into applications. PepAls is a framework that can serve different kinds of learning problems when different recommendation models are embedded. This paper uses the multi-label learning problem as an example to illustrate the core idea. Detailed experiments show that our approach is practical and helps narrow the gap between algorithm design and application.
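The general idea of describing a dataset with quantitative indicators and recommending an algorithm from them can be sketched as follows. This is only an illustrative sketch, not the PepAls implementation: the abstract does not list the 23 indicators or specify the recommendation model, so the two indicators used here (label cardinality and label density, both common multi-label dataset statistics), the nearest-neighbour lookup, and every entry in the `history` table are assumptions made for the example.

```python
import math

def label_cardinality(Y):
    """Mean number of active labels per instance in a binary label matrix."""
    return sum(sum(row) for row in Y) / len(Y)

def label_density(Y):
    """Label cardinality normalised by the total number of labels."""
    return label_cardinality(Y) / len(Y[0])

def describe(Y):
    """Hypothetical two-indicator description of a multi-label dataset."""
    return (label_cardinality(Y), label_density(Y))

def recommend(indicators, history):
    """Return (algorithm, score) of the most similar previously seen dataset.

    `history` holds (indicators, best_algorithm, best_score) records.
    Nearest-neighbour lookup is an assumed stand-in for a recommendation model.
    """
    nearest = min(history, key=lambda rec: math.dist(indicators, rec[0]))
    return nearest[1], nearest[2]

# Made-up records for illustration only; not results from the paper.
history = [
    ((1.0, 0.25), "binary_relevance", 0.80),
    ((3.0, 0.75), "classifier_chains", 0.65),
]

# Toy label matrix: 4 instances, 4 labels.
Y = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0]]
algo, score = recommend(describe(Y), history)
```

In this sketch the returned score plays the role of the previewed analysis result, and swapping `recommend` for another model mirrors the idea of embedding different recommendation models in one framework.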
Date of Conference: 10-13 December 2018
Date Added to IEEE Xplore: 24 January 2019