DOI: 10.1145/2682571.2797095
short-paper

Document Layout Optimization with Automated Paraphrasing

Published: 08 September 2015

Abstract

We introduce a new concept in document layout optimization: paraphrase-based layout optimization. In this approach, layout issues (e.g., widows caused by poor page breaking) are fixed automatically by rewording the neighboring sentences. To this end we borrow paraphrasing techniques from natural language processing, which is the first such attempt in the field of document engineering. We implemented a prototype TeX pre/post-processing system that includes two simple paraphrase generators. Our experiment shows that the approach is promising and effective for improving document layout.
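The core idea, rewording text until a layout defect disappears, can be illustrated with a minimal sketch. This is not the authors' TeX system. It assumes a greedy line breaker (Python's `textwrap`) as a stand-in for TeX, a tiny hypothetical paraphrase table (`PARAPHRASES`), and a simplified defect definition (a last line shorter than `min_chars` characters) in place of true widow detection. All function names are illustrative.

```python
import textwrap

# Hypothetical paraphrase table; a real system would use a paraphrase
# generator or a resource such as a paraphrase database.
PARAPHRASES = {
    "in order to": ["to"],
    "a large number of": ["many"],
    "utilize": ["use"],
}


def last_line_is_short(text, width, min_chars):
    """Simplified layout defect: a multi-line paragraph whose final
    line has fewer than min_chars characters."""
    lines = textwrap.wrap(text, width=width)
    return len(lines) > 1 and len(lines[-1]) < min_chars


def candidates(text):
    """Yield variants of text with exactly one paraphrase applied."""
    for phrase, alternatives in PARAPHRASES.items():
        if phrase in text:
            for alt in alternatives:
                yield text.replace(phrase, alt, 1)


def fix_short_last_line(text, width, min_chars=10):
    """Return the first single-paraphrase variant that removes the
    defect, or the original text if none does."""
    if not last_line_is_short(text, width, min_chars):
        return text
    for cand in candidates(text):
        if not last_line_is_short(cand, width, min_chars):
            return cand
    return text


paragraph = "We utilize a simple method in order to fix every layout issue."
fixed = fix_short_last_line(paragraph, width=30)
print(textwrap.fill(fixed, width=30))
```

At a width of 30 characters the original paragraph wraps to three lines, with the single word "issue." stranded on the last one. Replacing "in order to" with "to" lets the paragraph fit on two full lines. A real system would also have to check that each paraphrase preserves meaning and would read the break positions from the typesetter rather than a greedy wrapper.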

Cited By

  • (2018) A general framework for globally optimized pagination. Computational Intelligence 35:2 (242-284). DOI: 10.1111/coin.12165. Online publication date: 22-Mar-2018
  • (2016) A General Framework for Globally Optimized Pagination. Proceedings of the 2016 ACM Symposium on Document Engineering (11-20). DOI: 10.1145/2960811.2960820. Online publication date: 13-Sep-2016

Published In

DocEng '15: Proceedings of the 2015 ACM Symposium on Document Engineering
September 2015
248 pages
ISBN: 9781450333078
DOI: 10.1145/2682571
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. document layout optimization
  2. natural language processing
  3. paraphrase
  4. TeX
  5. typesetting

Qualifiers

  • Short-paper

Conference

DocEng '15: ACM Symposium on Document Engineering 2015
September 8 - 11, 2015
Lausanne, Switzerland

Acceptance Rates

DocEng '15 Paper Acceptance Rate 11 of 31 submissions, 35%;
Overall Acceptance Rate 194 of 564 submissions, 34%
