An overview of data mining and knowledge discovery

Fan, Jianhua; Li, Deyi

doi:10.1007/BF02946624

Published: July 1998

Volume 13, pages 348–368, (1998)

Journal of Computer Science and Technology

Abstract

With massive amounts of data stored in databases, mining information and knowledge from databases has become an important research issue. Researchers in many different fields have shown great interest in data mining and knowledge discovery in databases. Several emerging applications in information services, such as data warehousing and on-line services over the Internet, also call for data mining and knowledge discovery techniques to better understand user behavior, improve the services provided, and increase business opportunities. In response to this demand, this article provides a comprehensive survey of recently developed data mining and knowledge discovery techniques and introduces some real application systems. It concludes by listing some problems and challenges for further research.


Author information

Authors and Affiliations

Department of Computer Science, Communication Engineering Institute, 210016, Nanjing, P.R. China
Fan Jianhua
System Engineering Institute, 100039, Beijing, P.R. China
Li Deyi

Additional information

Fan Jianhua is a Ph.D. candidate in the Department of Computer Science, Nanjing Communications Engineering Institute. His current research interests include data mining and C³I systems.

Li Deyi graduated from the Department of Electronic Engineering, Southeast University in 1967, and received his Ph.D. degree in computer science from Heriot-Watt University, Edinburgh in 1983. He is presently a Professor and the Chief Engineer in the Institute of Beijing Electronic System Engineering. His research interests include data mining, fuzzy control, system simulation and C³I systems.

About this article

Cite this article

Fan, J., Li, D. An overview of data mining and knowledge discovery. J. of Comput. Sci. & Technol. 13, 348–368 (1998). https://doi.org/10.1007/BF02946624

Received: 15 September 1997
Revised: 27 March 1998
Issue Date: July 1998
DOI: https://doi.org/10.1007/BF02946624
