Abstract
Many applications need to rapidly infer a story about a given subject from a set of potentially heterogeneous data sources. In this paper, we formally define a story as a set of facts about a given subject that satisfies a “story length” constraint. An optimal story is a story that maximizes the value of an objective function measuring the goodness of a story. We present algorithms to extract stories from text and other data sources. We also develop an algorithm to compute an optimal story, as well as three heuristic algorithms to rapidly compute a suboptimal story. Our experiments show that stories can be constructed efficiently and that the stories produced by the heuristic algorithms are of high quality. We have built a prototype STORY system based on our model; we briefly describe the prototype as well as one application in this paper.
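The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the `Fact` fields, the `toy_objective` function, the sample data, and both search routines are assumptions made for illustration, since the abstract does not specify the objective function or the three heuristics. The sketch only shows the general shape of the problem: pick a set of facts about one subject, bounded by a story length, that maximizes an objective, either exhaustively or with a greedy shortcut.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, NamedTuple


class Fact(NamedTuple):
    """Hypothetical fact record; the paper's actual representation may differ."""
    subject: str
    attribute: str
    value: str
    weight: float  # assumed per-fact importance score


Story = FrozenSet[Fact]


def toy_objective(story: Story) -> float:
    """Assumed objective: for each attribute, count only its heaviest fact,
    so repeating an attribute adds nothing to the story's value."""
    best = {}
    for fact in story:
        best[fact.attribute] = max(best.get(fact.attribute, 0.0), fact.weight)
    return sum(best.values())


def optimal_story(facts: Iterable[Fact], subject: str, max_len: int,
                  objective: Callable[[Story], float] = toy_objective) -> Story:
    """Exhaustive search over all fact subsets of size <= max_len (exponential)."""
    candidates = [f for f in facts if f.subject == subject]
    best: Story = frozenset()
    best_value = objective(best)
    for k in range(1, min(max_len, len(candidates)) + 1):
        for combo in combinations(candidates, k):
            story = frozenset(combo)
            value = objective(story)
            if value > best_value:
                best, best_value = story, value
    return best


def greedy_story(facts: Iterable[Fact], subject: str, max_len: int,
                 objective: Callable[[Story], float] = toy_objective) -> Story:
    """Greedy stand-in for a heuristic: repeatedly add the fact with the
    largest marginal gain until the length bound is hit or nothing helps."""
    remaining = [f for f in facts if f.subject == subject]
    story = set()
    while remaining and len(story) < max_len:
        current = objective(frozenset(story))
        gain, pick = max(
            ((objective(frozenset(story | {f})) - current, f) for f in remaining),
            key=lambda pair: pair[0],
        )
        if gain <= 0:
            break
        story.add(pick)
        remaining.remove(pick)
    return frozenset(story)


# Illustrative data only; not taken from the paper.
FACTS = [
    Fact("Verdi", "born", "1813", 3.0),
    Fact("Verdi", "born", "Le Roncole", 2.0),
    Fact("Verdi", "composed", "Aida", 2.5),
    Fact("Verdi", "died", "1901", 1.0),
    Fact("Puccini", "composed", "Tosca", 5.0),
]

opt = optimal_story(FACTS, "Verdi", max_len=2)
approx = greedy_story(FACTS, "Verdi", max_len=2)
```

Under these assumptions the exhaustive search is exact but exponential in the number of candidate facts, while the greedy routine evaluates only a polynomial number of candidate stories and may return a suboptimal one. That is the trade-off the abstract describes between the optimal algorithm and the faster heuristics.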
Cite this article
Fayzullin, M., Subrahmanian, V.S., Albanese, M. et al. Story creation from heterogeneous data sources. Multimed Tools Appl 33, 351–377 (2007). https://doi.org/10.1007/s11042-007-0100-4
DOI: https://doi.org/10.1007/s11042-007-0100-4