A fast scalable classifier tightly integrated with RDBMS

  • Correspondence
Journal of Computer Science and Technology

Abstract

In this paper, we report our success in building efficient, scalable classifiers by exploring the capabilities of modern relational database management systems (RDBMS). In addition to high classification accuracy, the distinguishing features of the approach are its high training speed, linear scalability, and simplicity of implementation. More importantly, the major computation required by the approach can be implemented using standard functions provided by modern relational DBMSs. Moreover, with an effective rule pruning strategy, the proposed algorithm produces a compact set of classification rules. We present the results of experiments conducted for performance evaluation and analysis.
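
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea that the heavy counting work in rule-based classification can be pushed into standard SQL aggregation. It uses Python's built-in `sqlite3` module as a stand-in RDBMS. The table name `train`, its columns, the toy data, the function `candidate_rules`, and the `min_confidence` threshold (a crude substitute for rule pruning) are all assumptions made for illustration, not the authors' method.

```python
import sqlite3

# Hypothetical toy training set: (outlook, windy, label).
ROWS = [
    ("sunny", "no", "stay"),
    ("sunny", "yes", "stay"),
    ("rainy", "no", "play"),
    ("rainy", "yes", "stay"),
    ("overcast", "no", "play"),
    ("overcast", "yes", "play"),
]

def candidate_rules(rows, min_confidence=0.6):
    """Derive single-attribute rules (attr = value -> label) with GROUP BY.

    Assumption: a plain confidence threshold stands in for rule pruning.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE train (outlook TEXT, windy TEXT, label TEXT)")
    conn.executemany("INSERT INTO train VALUES (?, ?, ?)", rows)
    rules = []
    # Attribute names come from a fixed tuple, so formatting them into
    # the SQL text is safe here.
    for attr in ("outlook", "windy"):
        query = f"""
            SELECT c.val, c.label, c.n, 1.0 * c.n / t.total AS confidence
            FROM (SELECT {attr} AS val, label, COUNT(*) AS n
                  FROM train GROUP BY {attr}, label) AS c
            JOIN (SELECT {attr} AS val, COUNT(*) AS total
                  FROM train GROUP BY {attr}) AS t
              ON c.val = t.val
            WHERE 1.0 * c.n / t.total >= ?
            ORDER BY confidence DESC, c.n DESC, c.val
        """
        # The database does the counting; Python only collects the rows.
        for val, label, n, conf in conn.execute(query, (min_confidence,)):
            rules.append((attr, val, label, n, conf))
    conn.close()
    return rules

if __name__ == "__main__":
    for rule in candidate_rules(ROWS):
        print(rule)
```

Each returned tuple holds the attribute, its value, the predicted label, the number of supporting rows, and the rule's confidence. Because the counts come from `GROUP BY` queries, the same pattern could in principle run against any relational engine without moving the training data out of the database.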


Author information

Corresponding author

Correspondence to Liu Hongyan.

Additional information

This work is supported by the Tsinghua University 985 Basic Research Project (No.091101004).

LIU Hongyan received her Ph.D. degree in management science and engineering from Tsinghua University in 2000. She is now a lecturer in the School of Economics and Management, Tsinghua University. Her major research interests include databases, neural networks, data warehousing, and data mining.

LU Hongjun received his Ph.D. degree in computer science from the University of Wisconsin in 1985. Now he is a professor in the Computer Science Department, Hong Kong University of Science and Technology. His major research interests include data/knowledge base management systems, physical database design and database performance, data warehousing, and data mining.

CHEN Jian received his Ph.D. degree in system engineering from Tsinghua University in 1989. He is now a full professor and Chairman of the Department of Management Science and Engineering, Tsinghua University. His main research interests include supply chain management, e-commerce, decision support systems and information systems, and forecasting and optimization techniques.

About this article

Cite this article

Liu, H., Lu, H. & Chen, J. A fast scalable classifier tightly integrated with RDBMS. J. Comput. Sci. & Technol. 17, 152–159 (2002). https://doi.org/10.1007/BF02962207

  • DOI: https://doi.org/10.1007/BF02962207