Efficient Algorithms for Constructing Time Decompositions of Time Stamped Documents

Chundi, Parvathi; Zhang, Rui; Rosenkrantz, Daniel J.

doi:10.1007/11546924_50

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3588)

Included in the following conference series:

International Conference on Database and Expert Systems Applications

Abstract

Identifying temporal information of topics from a document set typically involves constructing a time decomposition of the time period associated with the document set. In an earlier work, we formulated several metrics on a time decomposition, such as size, information loss, and variability, and gave dynamic programming based algorithms to construct time decompositions that are optimal with respect to these metrics. Computing information loss values for all subintervals of the time period is central to the computation of optimal time decompositions. This paper proposes several algorithms to assist in more efficiently constructing an optimal time decomposition. More efficient, parallelizable algorithms for computing loss values are described. An efficient top-down greedy heuristic to construct an optimal time decomposition is also presented. Experiments to study the performance of this greedy heuristic were conducted. Although lossy time decompositions constructed by the greedy heuristic are suboptimal, they seem to be better than the widely used uniform length decompositions.
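
The contrast the abstract draws between dynamic programming and a top-down greedy heuristic can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the loss function `sse_loss` is a hypothetical stand-in (squared deviation from the interval mean) for the paper's information loss metric, intervals are half-open index ranges over base intervals, and the names `optimal_decomposition` and `greedy_top_down` are invented here. The sketch assumes 1 <= k <= n.

```python
from typing import Callable, List, Tuple

Interval = Tuple[int, int]          # half-open [start, end) over base-interval indices
LossFn = Callable[[int, int], float]


def sse_loss(values: List[float]) -> LossFn:
    """Hypothetical stand-in loss (NOT the paper's metric): squared deviation from the mean."""
    def loss(i: int, j: int) -> float:
        seg = values[i:j]
        mean = sum(seg) / len(seg)
        return sum((v - mean) ** 2 for v in seg)
    return loss


def optimal_decomposition(n: int, k: int, loss: LossFn) -> Tuple[float, List[Interval]]:
    """Dynamic program: split [0, n) into exactly k intervals with minimum total loss."""
    inf = float("inf")
    # best[m][j] = minimum total loss of splitting [0, j) into m intervals
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):          # last interval is [i, j)
                cand = best[m - 1][i] + loss(i, j)
                if cand < best[m][j]:
                    best[m][j] = cand
                    back[m][j] = i
    # Walk the back-pointers to recover the intervals.
    intervals: List[Interval] = []
    j = n
    for m in range(k, 0, -1):
        i = back[m][j]
        intervals.append((i, j))
        j = i
    intervals.reverse()
    return best[k][n], intervals


def greedy_top_down(n: int, k: int, loss: LossFn) -> Tuple[float, List[Interval]]:
    """Top-down greedy: start with one interval, repeatedly apply the best single split."""
    intervals: List[Interval] = [(0, n)]
    while len(intervals) < k:
        best_gain, best_idx, best_cut = float("-inf"), None, None
        for idx, (a, b) in enumerate(intervals):
            for c in range(a + 1, b):
                gain = loss(a, b) - loss(a, c) - loss(c, b)
                if gain > best_gain:
                    best_gain, best_idx, best_cut = gain, idx, c
        if best_idx is None:                   # nothing left to split
            break
        a, b = intervals[best_idx]
        intervals[best_idx:best_idx + 1] = [(a, best_cut), (best_cut, b)]
    return sum(loss(a, b) for a, b in intervals), intervals


if __name__ == "__main__":
    counts = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 9.0, 9.0]   # made-up per-interval values
    loss = sse_loss(counts)
    print(optimal_decomposition(len(counts), 3, loss))
    print(greedy_top_down(len(counts), 3, loss))
```

In this sketch the dynamic program evaluates the loss of every subinterval, which is why computing loss values efficiently matters for the optimal construction. The greedy variant commits to one split at a time and never revisits it, so its total loss can exceed the optimum on some inputs.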

Author information

Authors and Affiliations

Computer Science Department, University of Nebraska at Omaha, Omaha, NE
Parvathi Chundi & Rui Zhang
Computer Science Department, SUNY at Albany, Albany, NY, 12222, USA
Daniel J. Rosenkrantz

Editor information

Editors and Affiliations

Copenhagen Business School, Centre for Applied ICT, 60 Howitzvej, 2000, Frederiksberg, DK
Kim Viborg Andersen
University of Technology Sydney, NSW 2007, Australia
John Debenham
University of Linz, Altenbergerstraße 69, 4040, Linz, Austria
Roland Wagner

About this paper

Cite this paper

Chundi, P., Zhang, R., Rosenkrantz, D.J. (2005). Efficient Algorithms for Constructing Time Decompositions of Time Stamped Documents. In: Andersen, K.V., Debenham, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2005. Lecture Notes in Computer Science, vol 3588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11546924_50

DOI: https://doi.org/10.1007/11546924_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28566-3
Online ISBN: 978-3-540-31729-6
eBook Packages: Computer Science, Computer Science (R0)
