ABSTRACT
We present an approach to automating knowledge extraction in the aerospace engineering domain, which has had a fundamental impact on the way engineers manage the collective knowledge they have built up over years of experience. Although labelled data is hard to obtain in this domain because of the high cost of domain experts' time, the application of machine learning-based technology was successful, yielding results comparable to the state of the art. We also compare several machine learning approaches to extracting knowledge from reports about jet engines. We show that a semi-supervised approach does not increase accuracy enough to justify its much higher computational cost. A large-scale approach, by contrast, considerably reduces both training and testing time while keeping accuracy comparable to the standard supervised approach, making it a good choice for this class of application scenarios.
Index Terms
- Automating knowledge capture in the aerospace domain