Elsevier

Neurocomputing

Volume 73, Issues 13–15, August 2010, Pages 2614-2623
VQSVM: A case study for incorporating prior domain knowledge into inductive machine learning

https://doi.org/10.1016/j.neucom.2010.05.007

Abstract

When dealing with real-world problems, there is a considerable amount of prior domain knowledge that can provide insights into various aspects of the problem. On the other hand, many machine learning methods rely solely on the data sets in their learning phase and do not take into account any explicitly expressed domain knowledge. This paper proposes a framework that investigates and enables the incorporation of prior domain knowledge with respect to three key characteristics of inductive machine learning algorithms: consistency, generalization and convergence. The framework is used to review, classify and analyse key existing approaches to incorporating domain knowledge into inductive machine learning, as well as to consider the risks of doing so. The paper also demonstrates the design of a novel hierarchical semi-parametric machine learning method capable of incorporating prior domain knowledge. The method, VQSVM, extends the support vector machine (SVM) family of methods with vector quantization (VQ) techniques to address the problem of learning from imbalanced data sets. The paper presents the results of testing the method on a collection of imbalanced data sets with various imbalance ratios and various numbers of subclasses. The learning process of the VQSVM method utilizes some domain knowledge to solve the problem of fitting imbalanced data. The experiments in the paper demonstrate that enabling the incorporation of prior domain knowledge into the SVM framework is an effective way to overcome the sensitivity of SVM towards the imbalance ratio in a data set.

Introduction

The Internet makes information access easier than ever before, creating strong demand for efficient and effective methods of processing large amounts of data, both in industry and in the scientific research community. Despite the introduction of numerous new machine learning algorithms, there has been limited consideration of mechanisms for utilizing domain knowledge to the benefit of machine learning. This paper explores a fundamental question in inductive machine learning: how can a machine learning system represent and incorporate prior domain knowledge to facilitate the process of machine learning?

The vast majority of standard inductive learning algorithms are data driven; hence, they rely heavily on the sample data and ignore most existing domain knowledge. There are several reasons why the majority of machine learning techniques ignore domain knowledge.

Traditionally, researchers in machine learning have sought general-purpose learning algorithms rather than domain-oriented ones. On the other hand, domain experts often do not deeply understand complex machine learning algorithms. Hence, they can hardly incorporate their knowledge into the learning algorithms, even if the architecture of the learning system allows manual tuning of a set of control parameters.

When looking for general-purpose algorithms, researchers in machine learning often have limited understanding of the prior knowledge in the application domains they deal with (i.e. the prior knowledge about the phenomena that generated the data sets used in their research). Even if they intend to consider such knowledge, representations of domain knowledge vary, and the lack of a single universal (or standardized) format makes it difficult to encode these representations in computer systems.

The ability of a machine learning algorithm to incorporate domain knowledge is particularly valuable when it is applied to real-world problems, where valuable prior domain knowledge is available. If successfully incorporated, such prior domain knowledge improves both the quality of the result produced by the machine learning process and the efficiency of the learning process itself. The rest of the paper is organized as follows. Section 2 introduces the basic concepts of inductive machine learning and prior domain knowledge necessary for presenting the proposed methods. Section 3 presents the principles of incorporating prior domain knowledge into inductive machine learning and applies these principles to the analysis of existing methods of incorporating domain knowledge into inductive machine learning. Section 4 presents the design of a novel hierarchical semi-parametric machine learning method, VQSVM, capable of incorporating prior domain knowledge, together with the results of its tests over a collection of imbalanced data sets with various imbalance ratios and various numbers of subclasses.

Section snippets

Inductive machine learning and prior domain knowledge

For the purpose of this paper, the inductive machine learning problem is defined as follows: given a training data set of input–output pairs (xi, yi), drawn independently according to an unknown distribution D from an unknown function c(x) (the so-called i.i.d. assumption), the task of the machine learning algorithm is to find a model (or hypothesis) f(x) within a given hypothesis space H that best approximates c(x), i.e. f(x) ≈ c(x) with respect to the given data set X × Y, where f(x) ∈ H.
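The setup above can be illustrated with a minimal sketch (assuming NumPy; the target function, noise level and hypothesis space, here polynomials of degree at most 3, are invented for illustration and are not from the paper):

```python
import numpy as np

# Illustrative sketch of the inductive learning setup: an unknown target
# c(x) generates i.i.d. (x_i, y_i) pairs, and the learner searches a
# hypothesis space H for the f that best approximates c on the sample.

rng = np.random.default_rng(0)

def c(x):
    """Unknown target function (known here only to simulate the data)."""
    return np.sin(2 * np.pi * x)

x = rng.uniform(0.0, 1.0, 50)              # inputs drawn i.i.d. from D
y = c(x) + rng.normal(0.0, 0.1, x.size)    # noisy observations of c(x)

# H = cubic polynomials; pick f in H minimizing squared error on the sample.
coeffs = np.polyfit(x, y, deg=3)
f = np.poly1d(coeffs)

train_mse = float(np.mean((f(x) - y) ** 2))  # empirical error of f
```

Consistency concerns how well f fits the sample, generalization how well it transfers to unseen draws from D, and convergence how quickly the search settles on a good f.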

Consistency, generalization and convergence with prior domain knowledge

Prior domain knowledge usefully enhances a machine learning algorithm in multiple aspects. The performance of a machine learning algorithm is generally determined by three key issues: consistency, generalization and convergence.

VQSVM as a case study of incorporating domain knowledge into inductive machine learning

Following the discussion in the previous sections, a novel method is presented to demonstrate the advantage of knowledge-based machine learning, especially under unusual circumstances such as imbalanced data sets.

Traditional statistical methods which make assumptions regarding underlying distributions are named parametric methods. The term “parametric” indicates that the structure of the resultant model is known, and the learning methods only estimate the parameters of the resultant models. In
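The general idea of combining vector quantization with an SVM for imbalanced data can be sketched roughly as follows. This is a minimal illustration of the principle only, assuming scikit-learn's KMeans and SVC; the class sizes, codebook size and kernel are arbitrary choices, not the paper's VQSVM implementation:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Imbalanced toy data: 500 majority samples vs 25 minority samples (20:1).
X_majority = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
X_minority = rng.normal(loc=4.0, scale=0.5, size=(25, 2))

# Vector quantization: compress the majority class to 25 codebook vectors,
# rebalancing the training set while keeping the class's overall structure.
vq = KMeans(n_clusters=25, n_init=10, random_state=0).fit(X_majority)
X_majority_vq = vq.cluster_centers_

X_train = np.vstack([X_majority_vq, X_minority])
y_train = np.array([0] * len(X_majority_vq) + [1] * len(X_minority))

# Train a standard SVM on the rebalanced (25 vs 25) training set.
clf = SVC(kernel="rbf").fit(X_train, y_train)
minority_recall = float(np.mean(clf.predict(X_minority) == 1))
```

Because the SVM now sees equally sized classes, its decision boundary is no longer dominated by the sheer volume of majority samples.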

Conclusion

In conclusion, this paper reviews the developments in incorporating prior domain knowledge into inductive machine learning, and proposes a framework that incorporates prior domain knowledge with respect to three key issues of inductive machine learning, i.e. consistency, generalization and convergence.

The VQSVM is a hierarchical semi-parametric machine learning algorithm that combines vector quantization and support vector machines. The existing prior domain knowledge indicates there are multiple


Ting Yu is a research fellow at the Integrated Sustainability Analysis (ISA) Group, University of Sydney, Australia. He was awarded a Ph.D. in Computing Science from the University of Technology, Sydney, in 2007. He received an M.Sc. in Distributed Multimedia Systems from the University of Leeds, UK, and a B.Eng. from Zhejiang University, PR China. His research interests include Machine Learning, Data Mining, Mathematical Optimization, Parallel Computing, Applied Economics and Sustainability Analysis.

    Simeon Simoff is a Professor of Information Technology and Head of the School of Computing and Mathematics, University of Western Sydney. He is also an adjunct professor at the University of Technology, Sydney (UTS). He is known for the unique blend of interdisciplinary scholarship, spanning across information mining and analytics, virtual worlds and institutions, and digital knowledge representations. His current research interests are focused on information-rich trading environments and technologies that support them, in particular technologies for information extraction and advice synthesis and the delivery of such advice to negotiation and mediation agents. He received his Ph.D. in Computer Science from Moscow Power Engineering Institute.

    Tony Jan is a Senior Lecturer at the University of Technology Sydney, Australia. He specializes in machine learning techniques for signal processing. He has published over 40 articles in international journals and conferences on machine learning and information security.
