Efficient Algorithms for Constructing Time Decompositions of Time Stamped Documents

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3588)

Abstract

Identifying temporal information about the topics in a document set typically involves constructing a time decomposition of the time period associated with that set. In an earlier work, we formulated several metrics on a time decomposition, such as size, information loss, and variability, and gave dynamic-programming-based algorithms that construct time decompositions optimal with respect to these metrics. Computing information loss values for all subintervals of the time period is central to the computation of optimal time decompositions. This paper proposes several algorithms that make constructing an optimal time decomposition more efficient. It describes more efficient, parallelizable algorithms for computing loss values, and it presents an efficient top-down greedy heuristic that approximates an optimal time decomposition. Experiments were conducted to study the performance of this greedy heuristic. Although the lossy time decompositions it constructs are suboptimal, they appear to be better than the widely used uniform-length decompositions.
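The abstract does not define the loss metric or give the algorithms themselves, so the following Python sketch is only an illustration of the general shape of the approach, not the authors' method. It assumes a stand-in loss (the sum of squared deviations from the mean of a per-interval value, such as a document count), and the names `make_loss`, `optimal_decomposition`, and `greedy_decomposition` are hypothetical. It shows loss values for all subintervals being answered from prefix sums, a dynamic program that finds a minimum-loss decomposition into k intervals, and a top-down greedy heuristic that repeatedly applies the single best split.

```python
from itertools import accumulate

def make_loss(values):
    """Return loss(i, j) for the inclusive subinterval i..j of base intervals.
    ASSUMPTION: loss is the sum of squared deviations from the mean,
    a stand-in for the paper's information loss metric."""
    s = [0.0] + list(accumulate(values))                 # prefix sums
    q = [0.0] + list(accumulate(v * v for v in values))  # prefix sums of squares
    def loss(i, j):
        total = s[j + 1] - s[i]
        return (q[j + 1] - q[i]) - total * total / (j - i + 1)
    return loss

def optimal_decomposition(values, k):
    """Dynamic program: minimum total loss using exactly k contiguous intervals."""
    n, loss, inf = len(values), make_loss(values), float("inf")
    # best[p][j]: minimum loss covering the first j base intervals with p intervals
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for p in range(1, k + 1):
        for j in range(p, n + 1):
            for i in range(p - 1, j):
                cost = best[p - 1][i] + loss(i, j - 1)
                if cost < best[p][j]:
                    best[p][j], cut[p][j] = cost, i
    # Walk the recorded cut points backwards to recover the intervals.
    segments, j = [], n
    for p in range(k, 0, -1):
        i = cut[p][j]
        segments.append((i, j - 1))
        j = i
    return best[k][n], segments[::-1]

def greedy_decomposition(values, k):
    """Top-down greedy: start with one interval, apply the best split until k."""
    loss = make_loss(values)
    segments = [(0, len(values) - 1)]
    while len(segments) < k:
        best_gain, best_idx, best_split = -1.0, None, None
        for idx, (i, j) in enumerate(segments):
            for m in range(i, j):  # split i..j into i..m and m+1..j
                gain = loss(i, j) - loss(i, m) - loss(m + 1, j)
                if gain > best_gain:
                    best_gain, best_idx, best_split = gain, idx, m
        if best_idx is None:       # every interval has length 1
            break
        i, j = segments[best_idx]
        segments[best_idx:best_idx + 1] = [(i, best_split), (best_split + 1, j)]
    return sum(loss(i, j) for i, j in segments), segments

counts = [1, 1, 1, 9, 9, 9, 1, 1]  # made-up per-interval values
print(optimal_decomposition(counts, 3))
print(greedy_decomposition(counts, 3))
```

Under this stand-in loss the greedy result can never have lower total loss than the dynamic program's result for the same k, which mirrors the suboptimality the abstract mentions. The loss queries for different subintervals are independent of one another, which is one reason computations of this kind lend themselves to parallel evaluation.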


Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chundi, P., Zhang, R., Rosenkrantz, D.J. (2005). Efficient Algorithms for Constructing Time Decompositions of Time Stamped Documents. In: Andersen, K.V., Debenham, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2005. Lecture Notes in Computer Science, vol 3588. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11546924_50

  • DOI: https://doi.org/10.1007/11546924_50

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28566-3

  • Online ISBN: 978-3-540-31729-6

  • eBook Packages: Computer Science, Computer Science (R0)
