Abstract
This paper presents a text mining approach for discovering knowledge in texts that can later be used to construct decision support systems. Text mining can take advantage of knowledge already stored in textual documents, reducing the effort required for knowledge acquisition. The approach mines the concepts present in texts rather than individual words. The assumption is that concepts represent real-world events and characteristics better than words do, which makes it possible to understand and explain the reasoning used in decision processes. The proposed approach extracts concepts expressed in natural-language phrases and then analyzes their distributions and associations. Concept distributions and associations are used to characterize classes or situations. After the discovery process, the knowledge obtained can be embedded in automated systems that classify elements or suggest actions or solutions to problems. The paper discusses experiments that apply the approach in a psychiatric domain. Concepts extracted from textual medical records represent patients' symptoms, signs, and social and behavioral characteristics. An automatic system was constructed with the approach: a classifier whose goal is to help physicians diagnose diseases. Results from this system show that the approach is feasible for constructing decision support systems with satisfactory performance.
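The pipeline summarized above (extract concepts, analyze their distribution per class, then classify new texts) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The concept names, cue terms, class labels, sample records, substring matching rule, and scoring function are all assumptions made up for illustration, since the abstract does not specify any of them.

```python
from collections import Counter, defaultdict

# Hypothetical concept definitions: each concept is signalled by a few cue terms.
CONCEPTS = {
    "insomnia": {"insomnia", "sleepless", "cannot sleep"},
    "anxiety": {"anxious", "anxiety", "restless"},
    "alcohol_use": {"alcohol", "drinking"},
}

# Invented toy records (text, class label); not data from the paper.
RECORDS = [
    ("Patient is anxious and restless, cannot sleep.", "class_a"),
    ("Reports insomnia and anxiety.", "class_a"),
    ("History of heavy drinking.", "class_b"),
    ("Alcohol use, sleepless nights.", "class_b"),
]


def extract_concepts(text):
    """Return the set of concepts whose cue terms occur in the text.

    Assumption: simple case-insensitive substring matching stands in for
    whatever concept identification the paper actually uses.
    """
    lowered = text.lower()
    return {
        name
        for name, cues in CONCEPTS.items()
        if any(cue in lowered for cue in cues)
    }


def concept_distribution(records):
    """For each class, compute the fraction of its records containing each concept."""
    counts = defaultdict(Counter)
    totals = Counter()
    for text, label in records:
        totals[label] += 1
        counts[label].update(extract_concepts(text))
    return {
        label: {concept: n / totals[label] for concept, n in counts[label].items()}
        for label in totals
    }


def classify(text, distribution):
    """Pick the class whose concept profile gives the highest summed score.

    Assumption: summing per-class concept frequencies is an illustrative
    scoring rule, not the one used in the paper.
    """
    found = extract_concepts(text)
    scores = {
        label: sum(profile.get(concept, 0.0) for concept in found)
        for label, profile in distribution.items()
    }
    return max(scores, key=scores.get)
```

With these toy records, `concept_distribution(RECORDS)` yields one concept profile per class, and `classify(text, profile)` assigns a new text to the class whose profile best matches the concepts found in it. Because the score is a sum over named concepts, the decision can be traced back to the concepts that produced it, which is the kind of explainability the abstract attributes to working with concepts instead of words.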
Cite this article
Loh, S., de Oliveira, J.P.M. & Gameiro, M.A. Knowledge Discovery in Texts for Constructing Decision Support Systems. Applied Intelligence 18, 357–366 (2003). https://doi.org/10.1023/A:1023258306854
DOI: https://doi.org/10.1023/A:1023258306854