EDCMS: A content management system for engineering documents

  • Published in: International Journal of Automation and Computing

Abstract

Engineers often have to sift through long engineering documents to find the right pieces of information, which is tiring and time-consuming. To address this issue, researchers are increasingly devoting attention to new ways of helping information users, including engineers, to access and retrieve document content. The research reported in this paper explores how to use the key technologies of document decomposition (the study of document structure), document mark-up (with EXtensible Mark-up Language (XML), HyperText Mark-up Language (HTML), and Scalable Vector Graphics (SVG)), and a facetted classification mechanism. Document content extraction is implemented in Java. The Engineering Document Content Management System (EDCMS) developed in this research demonstrates that information providers can make document content more accessible to information users, including engineers.

The main features of the EDCMS are:

1) The EDCMS enables users, especially engineers, to access and retrieve information at the content level rather than the document level. In other words, it provides the right pieces of information to answer specific questions, so that engineers don’t need to waste time sifting through a whole document to obtain the required piece of information.

2) Users can access engineering document content through both the data and the metadata of a document.

3) Users can use the EDCMS to access and retrieve content objects, i.e. text, images and graphics (including engineering drawings) via multiple views and at different granularities based on decomposition schemes.
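
The features listed above can be illustrated with a minimal Java sketch. It is not the EDCMS implementation: the element names, the `facet` attribute, the sample text, and the `FragmentLookup` class are all hypothetical, chosen only to show how a marked-up document might be queried at different granularities and through its metadata using the standard Java DOM and XPath APIs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FragmentLookup {

    // Hypothetical mark-up: the element names and the "facet" attribute are
    // invented for this sketch and are not taken from the EDCMS schema.
    static final String SAMPLE =
        "<document title='Gearbox design report'>"
      + "<section facet='requirements'><para>Torque limit is 40 Nm.</para></section>"
      + "<section facet='analysis'><para>Shaft stress is below yield.</para>"
      + "<para>Bearing life exceeds 10000 h.</para></section>"
      + "</document>";

    /** Returns the text content of every node selected by the XPath expression. */
    static List<String> select(String xml, String xpath) throws Exception {
        // Parse the marked-up document into a DOM tree.
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        // Evaluate the query against the tree and collect the matching nodes.
        NodeList nodes = (NodeList) XPathFactory.newInstance()
            .newXPath()
            .evaluate(xpath, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(nodes.item(i).getTextContent());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Coarse granularity: whole sections classified under one facet.
        System.out.println(select(SAMPLE, "//section[@facet='analysis']"));
        // Fine granularity: individual paragraphs within that facet.
        System.out.println(select(SAMPLE, "//section[@facet='analysis']/para"));
        // Metadata access: an attribute of the document itself.
        System.out.println(select(SAMPLE, "/document/@title"));
    }
}
```

In this sketch the same document answers a section-level query, a paragraph-level query, and a metadata query, which mirrors the idea of retrieving content objects at different granularities rather than returning the whole document.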

Experiments with the EDCMS have been conducted on semi-structured documents, a CADCAM textbook, and a set of project posters in the Engineering Design domain. The experimental results show that the system gives information users a powerful way to access document content.

Author information

Corresponding author

Correspondence to Shaofeng Liu.

Additional information

This work was supported by the UK Engineering and Physical Sciences Research Council (EPSRC) (No. GR/R67507/01).

Shaofeng Liu received her B.Sc. and M.Sc. degrees from Hunan University, China, and her Ph.D. degree from Loughborough University, UK. She is currently a research officer in the Engineering Innovative Manufacturing Research Centre (IMRC) at the University of Bath, UK.

Her current research interests include the study of information and knowledge structures, Web technology, information and knowledge systems, and the application of mark-up technology in the engineering (design and manufacture) domain.

Chris McMahon is a professor and the director of the Engineering IMRC at the University of Bath, UK. He has carried out research in the information requirements of engineering designers, in information and knowledge management systems for design, in risk and uncertainty management in design, in component durability and reliability, especially in the presence of residual stresses, and in design for remanufacturing, largely in conjunction with industry.

He has published his work widely, including over 150 refereed papers, a number of edited volumes and a textbook on computer-aided design and manufacture. His research interests include engineering design, especially concerning the application of computers to the management of information and uncertainty in design, and to design automation.

Mansur Darlington is a research officer in the Design Information & Knowledge Group of Engineering IMRC at the University of Bath, UK. For the last decade he has been involved in research associated with the capture and codification of engineers’ design knowledge and the development of methods for supporting engineers’ information needs, first in the University of Bath’s Engineering Design Centre, latterly in the IMRC.

His research interests include the capture and representation of design information and knowledge for reasoning about conceptual design, the development of reasoning techniques associated with engineering document and document content retrieval, and the development of methods for controlled document origination to support standardization for reuse and for information minimization.

Steve Culley is a professor of Engineering Design and head of Design in the Department of Mechanical Engineering at the University of Bath, UK. He has researched in the engineering design field for many years. In particular this has included the provision of information and knowledge to support engineering designers. He pioneered work into the introduction and use of the electronic catalogue for standard engineering components and has extended this work to deal with systems and assemblies.

He has over 150 publications and has recently co-authored a book on design and changeover.

Prof. Culley is a member of the EPSRC funded IMRC at the University of Bath, UK and is part of the ‘Grand Challenge’—Immortal Information and Through-life Knowledge Management. He is a fellow of the Institution of Mechanical Engineers.

Peter Wild graduated from Bournemouth University in psychology and computing, and received his M.Sc. degree in human-computer interaction from Queen Mary, University of London. He is a research officer in the IMRC at the University of Bath, UK.

His interests include documents, empirical studies of engineers, design, task analysis, and user interface design.

About this article

Cite this article

Liu, S., McMahon, C., Darlington, M. et al. EDCMS: A content management system for engineering documents. Int J Automat Comput 4, 56–70 (2007). https://doi.org/10.1007/s11633-007-0056-x

  • DOI: https://doi.org/10.1007/s11633-007-0056-x
