Abstract
Computers support the management of large collections of text documents, but efficient reuse of document collections for producing new documents remains inherently difficult. We describe and discuss the design and implementation of a document assembly system based on a document assembly model, where the user produces new specialized documents by querying and browsing a collection of structured document fragments.
This work is supported by the Finnish Technology Development Centre (TEKES) and seven Finnish enterprises (Aamulehti Group, Oy Edita Ab, National Board of Education, WSOY, Helsinki Media, Lingsoft, Inc., and MTV3). Grants from the Academy of Finland (Helena Ahonen), the Finnish Cultural Foundation (Barbara Heikkinen), and the 350th Anniversary Foundation of the University of Helsinki (Oskari Heinonen) have provided additional support.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ahonen, H., Heikkinen, B., Heinonen, O., Jaakkola, J., Kilpeläinen, P., Lindén, G. (1998). Design and implementation of a document assembly workbench. In: Hersch, R.D., André, J., Brown, H. (eds) Electronic Publishing, Artistic Imaging, and Digital Typography. RIDT 1998. Lecture Notes in Computer Science, vol 1375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0053293
DOI: https://doi.org/10.1007/BFb0053293
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64298-5
Online ISBN: 978-3-540-69718-3
eBook Packages: Springer Book Archive