DOI: 10.1145/1183614.1183760
Article

Multi-task text segmentation and alignment based on weighted mutual information

Published: 06 November 2006

Abstract

Text segmentation is important for text analysis, while text alignment determines the sub-topics shared among similar documents. Multi-task text segmentation and alignment extends single-task segmentation by exploiting information from multiple source documents. In this paper we introduce a novel domain-independent, unsupervised method for multi-task segmentation and alignment, based on the idea that the optimal segmentation and alignment maximizes weighted mutual information, that is, mutual information with term weights. Experimental results show that our approach works well.
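
The abstract does not give the objective in closed form, so the following is only a minimal sketch of the general idea for a single document. It assumes the weighted mutual information between terms t and segments s has the form sum over (t, s) of w_t * p(t, s) * log(p(t, s) / (p(t) * p(s))), with probabilities estimated from token counts. The function names, the default unit weights, and the exhaustive search over a single boundary are illustrative assumptions, not the authors' algorithm.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def weighted_mutual_information(
    segments: Sequence[Sequence[str]],
    term_weights: Optional[Dict[str, float]] = None,
) -> float:
    """Assumed form: sum of w_t * p(t,s) * log(p(t,s) / (p(t) * p(s)))."""
    total = sum(len(seg) for seg in segments)
    if total == 0:
        return 0.0

    # Count each term per segment and over the whole text.
    term_counts: Dict[str, int] = {}
    per_segment: List[Dict[str, int]] = []
    for seg in segments:
        counts: Dict[str, int] = {}
        for tok in seg:
            counts[tok] = counts.get(tok, 0) + 1
            term_counts[tok] = term_counts.get(tok, 0) + 1
        per_segment.append(counts)

    score = 0.0
    for seg, counts in zip(segments, per_segment):
        p_s = len(seg) / total
        for tok, n in counts.items():
            p_ts = n / total
            p_t = term_counts[tok] / total
            # Hypothetical weighting: unit weight unless one is supplied.
            w = 1.0 if term_weights is None else term_weights.get(tok, 1.0)
            score += w * p_ts * math.log(p_ts / (p_t * p_s))
    return score


def best_single_boundary(
    sentences: Sequence[Sequence[str]],
    term_weights: Optional[Dict[str, float]] = None,
) -> Tuple[Optional[int], float]:
    """Try every split into two segments and keep the highest-scoring one."""
    best_cut: Optional[int] = None
    best_score = float("-inf")
    for cut in range(1, len(sentences)):
        left = [tok for sent in sentences[:cut] for tok in sent]
        right = [tok for sent in sentences[cut:] for tok in sent]
        score = weighted_mutual_information([left, right], term_weights)
        if score > best_score:
            best_cut, best_score = cut, score
    return best_cut, best_score


if __name__ == "__main__":
    toy = [
        ["apple", "fruit"],
        ["apple", "juice"],
        ["engine", "car"],
        ["car", "road"],
    ]
    print(best_single_boundary(toy))
```

On the toy input, the boundary after the second sentence separates the two vocabularies completely, so it receives the highest score under unit weights. Extending the sketch to several documents would additionally require aligning segments across documents, which is not attempted here.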

Cited By

  • (2009) Learning to rank graphs for online similar graph search. Proceedings of the 18th ACM conference on Information and knowledge management, pages 1871-1874. DOI: 10.1145/1645953.1646252. Online publication date: 2-Nov-2009.
  • (2009) Independent informative subgraph mining for graph information retrieval. Proceedings of the 18th ACM conference on Information and knowledge management, pages 563-572. DOI: 10.1145/1645953.1646026. Online publication date: 2-Nov-2009.
  • (2007) Topic segmentation with shared topic detection and alignment of multiple documents. Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 199-206. DOI: 10.1145/1277741.1277778. Online publication date: 23-Jul-2007.

Published In

CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management
November 2006
916 pages
ISBN:1595934332
DOI:10.1145/1183614

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. multi-task
  2. text alignment
  3. text segmentation
  4. weighted mutual information

Qualifiers

  • Article

Conference

CIKM06: Conference on Information and Knowledge Management
November 6 - 11, 2006
Arlington, Virginia, USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
