Learning the “Whys”: Discovering design rationale using text mining — An algorithm perspective
Highlights
► An exploration of using text mining to discover design rationale in text is reported.
► Algorithms are designed for artifact extraction, issue summary and cause–solution.
► The proposal is based on an issue, solution and artifact layer based DR modeling.
Introduction
To assist engineering design, many computer-aided design and engineering (CAD/E) systems have been developed since the 1960s. Based on computer graphics techniques, traditional CAD systems have been helpful in modeling and simulating design objects in 2D or 3D contexts [1]. While these CAD systems help designers represent their ideas by means of formal geometrical models, they are also expected to provide means of designing new products. Since the 1980s, artificial intelligence techniques have been applied to CAD systems to suggest possible solutions from design knowledge bases. As more design information becomes available in digital form, there is a need to integrate it into design knowledge bases to better assist design analysis, innovation and decision-making. One of the major concepts for future CAD systems is therefore to build design knowledge bases containing a variety of useful engineering design knowledge [2].
Among such design information and knowledge, design rationale (DR) is regarded as an important kind of knowledge for next-generation product development systems [1]. DR generally refers to the explanation of why an artifact is designed the way it is [3]. It helps designers understand the design know-how and technology behind an artifact, and it facilitates the reuse of design knowledge in decision-making and product innovation. Without a careful record of useful design information, significant time and effort are spent searching for relevant answers [4].
Since the 1970s, many DR approaches have been developed with this goal in mind, such as SEURAT for software development [5] and DRed for industrial engineering [4]. However, such DR systems have not been widely adopted in industry [6]. One of the most critical reasons is that they require heavy human involvement to interpret DR information and load it into the systems. In addition, they mainly attempt to record DR along design processes, while DR stored in other archival design documents, such as design reports and patents, is often neglected. Although designers can interpret DR in documents into a predefined DR structure, inconsistencies will occur and consequently affect the storage and retrieval of rationales. To make the DR process more effective and tractable, one promising approach relies on computational algorithms that discover DR from a large number of archival design documents using text mining techniques. We have also observed that, although text mining and machine learning techniques have been applied to design document processing, only a limited number of studies focus on mining deep information from design documents, and none of them addresses DR.
In our previous study, we proposed a layered rationale representation model, the Issue, Solution and Artifact Layer (ISAL) model [7], and included a conceptual comparison framework between our ISAL model and the classical DR model, i.e. the IBIS (Issue based Information System) model. In this paper, we focus on algorithm design to automatically extract DR information from a large number of archival design documents according to our ISAL model. For each document, we extract a single set of issue, solution and artifact. The performance evaluation and scalability test of the algorithms are also detailed. The rest of this paper is organized as follows. Section 2 reviews the state of the art on several relevant topics, i.e. DR models and systems, design document processing and patent processing for design, and highlights the challenges and opportunities. In Section 3, we detail our algorithm design for the ISAL model. Next, an ISAL-based DR retrieval framework is described in Section 4. Using patent documents as our research data, Section 5 reports the performance of our DR discovery approach, and Section 6 presents an example of extracted DR and a case study on DR retrieval based on the ISAL model. Section 7 discusses some issues in our current DR strategy, including the use of patents as research data, structural issues in patent texts, and DR and process in presenting a holistic view of DR development. Section 8 concludes the paper.
Section snippets
DR representation models and systems
How to represent DR effectively is the most important issue in DR management. It affects the reuse of DR information and knowledge [3]. DR representation models vary greatly as they support different design activities. The first approach is argumentation-based representation. Issue based Information System (IBIS) [8] is the earliest argumentation-based method to represent DR and it is the original model for most DR approaches [9]. In IBIS, issues, positions, arguments and their relationships
ISAL model
In our previous study, we proposed a computational model, ISAL [7], which provides the foundation for our further research on DR discovery, retrieval and analysis. The ISAL model represents DR in three layers, i.e. the issue layer, the solution layer and the artifact layer, as shown in Fig. 1.
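As a rough illustration of the three-layer structure, the sketch below models the DR of one document as a plain Python data class. The class and field names (ISALRecord, SolutionReason, issues, solutions, artifacts) are assumptions chosen for illustration and are not the representation defined in [7]; pairing each solution with a reason follows the solution–reason pairs named in the Conclusions, and the sample values are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SolutionReason:
    """One solution-layer entry: a design solution and the reason given for it."""
    solution: str
    reason: str = ""  # left empty when the text states no reason


@dataclass
class ISALRecord:
    """Hypothetical container for the DR extracted from a single document."""
    document_id: str
    issues: List[str] = field(default_factory=list)                # issue layer
    solutions: List[SolutionReason] = field(default_factory=list)  # solution layer
    artifacts: List[str] = field(default_factory=list)             # artifact layer

    def layers(self) -> Dict[str, List[str]]:
        """Return the three layers as lists of plain strings, keyed by layer name."""
        return {
            "issue": list(self.issues),
            "solution": [
                f"{p.solution} (reason: {p.reason})" if p.reason else p.solution
                for p in self.solutions
            ],
            "artifact": list(self.artifacts),
        }


# Invented sample values, for illustration only.
record = ISALRecord(
    document_id="example-patent",
    issues=["higher quality printing is required"],
    solutions=[SolutionReason("smaller nozzle spacing", "increases print resolution")],
    artifacts=["printhead", "nozzle"],
)
print(record.layers())
```

Keeping one record per document mirrors the statement that a single set of issue, solution and artifact is extracted for each document.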
The issue layer describes the design motivational reasons and objectives of designing an artifact. It can be needs of the artifact, limitations of prior relevant artifacts, problems and
An ISAL-based DR discovery, retrieval and management framework
Based on our ISAL representation model and the aforementioned algorithm design, we propose a framework for DR discovery, retrieval and management, as shown in Fig. 6. This framework consists of two modules, i.e. a DR information organization module and a DR search and retrieval module.
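The split into an organization module and a retrieval module can be pictured with a small sketch. The code below is only a toy under stated assumptions: it stores DR text per ISAL layer in an inverted index and answers keyword queries restricted to one layer. The function names (organize, retrieve), the whitespace tokenization and the sample records are all hypothetical, and the matching rule is a simple stand-in, not the retrieval method of the framework in Fig. 6.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical index type: (layer, token) -> ids of documents whose layer text has the token.
Index = Dict[Tuple[str, str], Set[str]]


def organize(records: Dict[str, Dict[str, List[str]]]) -> Index:
    """Organization step: build one inverted index covering all ISAL layers."""
    index: Index = defaultdict(set)
    for doc_id, layers in records.items():
        for layer, texts in layers.items():
            for text in texts:
                for token in text.lower().split():
                    index[(layer, token)].add(doc_id)
    return index


def retrieve(index: Index, layer: str, query: str) -> List[str]:
    """Retrieval step: documents whose given layer contains every query token."""
    tokens = query.lower().split()
    if not tokens:
        return []
    # .get avoids inserting new keys into the defaultdict while searching.
    hits = set.intersection(*(index.get((layer, t), set()) for t in tokens))
    return sorted(hits)


# Invented sample records, for illustration only.
records = {
    "doc-1": {"issue": ["ink clogging in nozzle"],
              "solution": ["add wiper"],
              "artifact": ["printhead"]},
    "doc-2": {"issue": ["low print quality"],
              "solution": ["reduce nozzle spacing"],
              "artifact": ["printhead", "nozzle"]},
}
index = organize(records)
print(retrieve(index, "issue", "nozzle"))        # query limited to the issue layer
print(retrieve(index, "artifact", "printhead"))  # query limited to the artifact layer
```

Scoping a query to one layer is the point of the sketch: the same word can be a problem in one document and merely a component in another.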
The DR information organization module aims to capture and secure DR information from e-design documents through two basic processes: a DR discovery process and a manual DR annotation process. The DR discovery process
Experimental setup
In our study, we use patent documents as our research data. Unlike internal design documents, e.g. design reports, which are confidential, patent documents are a high-quality, openly accessible data source containing critical rationale information. We randomly collected 18 290 patent documents on the topic of inkjet printer design that were patented by Hewlett-Packard Company or Epson from the United States patent database. Among these 18 290 patents, we randomly selected 300 patents that
An example using inkjet printhead
To illustrate our approach to DR discovery and the ISAL-based retrieval framework, we present example DRs extracted by our algorithms and a DR retrieval case study.
Fig. 10 shows the DR extracted from a patent that focuses on a printhead for high print quality. The issue layer indicates the motivations for the new design. For example, it includes the general market requirement for higher quality printing. It also indicates some detailed design considerations. An example is that
Discussions
Our DR approach takes advantage of several timely research efforts in text mining, machine learning, information retrieval and text processing at large. It is technically quite different from traditional systems, which rely on manual effort in DR capture while design archives are often left untouched. A few issues deserve immediate attention, particularly those related to the technical strengths, merits and limitations of the current approach, and hopefully, it sheds light on some
Conclusions
In this paper, we have focused on algorithm design for DR discovery and management from a large number of digitized design documents with rich textual content. Our research efforts in algorithm design, i.e. artifact information extraction, issue summarization and solution–reason pair identification, are structured around the computational DR model ISAL, which was introduced in our previous study. Further experimental studies have been conducted to assess the performance of our proposed
Acknowledgment
The work described in this paper was supported by a research grant from the National University of Singapore (R-265-000-362-133) and was partially supported by an open project of the State Key Lab of CAD&CG, Zhejiang University, China (Grant No: A1013).
References (44)
- et al. Capturing design rationale. Computer-Aided Design (2009)
- et al. Software engineering using rationale. Journal of Systems and Software (2008)
- PHI: a conceptual foundation for design hypermedia. Design Studies (1991)
- et al. A rationale-based architecture model for design traceability and reasoning. Journal of Systems and Software (2007)
- et al. Imbalanced text classification: a term weighting approach. Expert Systems with Applications (2009)
- et al. A computational framework for retrieval of document fragments based on decomposition schemes in engineering information management. Advanced Engineering Informatics (2006)
- et al. Product portfolio identification based on association rule mining. Computer-Aided Design (2005)
- et al. Text mining techniques for patent analysis. Information Processing & Management (2007)
- et al. Towards content-oriented patent document processing. World Patent Information (2008)
- et al. Development of a patent document classification and search platform using a back-propagation network. Expert Systems with Applications (2006)
- Patent document categorization based on semantic structural information. Information Processing & Management
- Extracting the significant-rare keywords for patent analysis. Expert Systems with Applications
- On the development of a technology intelligence tool for identifying technology opportunity. Expert Systems with Applications
- Gather customer concerns from online product reviews - a text summarization approach. Expert Systems with Applications
- The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems
- Incremental cue phrase learning and bootstrapping method for causality extraction using cue phrase and word pair probabilities. Information Processing & Management
- The role of knowledge in next-generation product development systems. Journal of Computing and Information Science in Engineering
- Intelligent computer-aided design systems: past 20 years and future 20 years. Artificial Intelligence for Engineering Design, Analysis and Manufacturing
- A survey of design rationale systems: approaches, representation, capture and retrieval. Engineering with Computers
- Design rationale: researching under uncertainty. Artificial Intelligence for Engineering Design, Analysis and Manufacturing
- A new design rationale representation model for rationale mining. Journal of Computing and Information Science in Engineering