Published by De Gruyter (O), April 6, 2018

Assessment of variance & distribution in data for effective use of statistical methods for product quality prediction

German title (translated): Assessment of the variance and distribution of multiple variables in datasets for the effective use of statistical methods for predicting product quality
  • Iris Weiß and Birgit Vogel-Heuser

Abstract

Data mining in automated production systems offers high potential to increase Overall Equipment Effectiveness. However, data from such machines and plants exhibit specific characteristics regarding variance and distribution. When modelling product quality prediction, these characteristics have to be analysed in order to interpret the results correctly. Therefore, an approach for analysing the variance and distribution of datasets is proposed. The evaluation validates the developed guidelines, which identify the reasons for inconsistent prediction results obtained from two different datasets of the same production system.

Summary (translated from German)

Data analysis approaches in automated production systems offer high potential for increasing overall equipment effectiveness. However, data from such plants show specific peculiarities in variance and distribution that must be taken into account when building statistical methods for predicting product quality, in order to avoid random, non-reproducible predictions. This contribution therefore presents an approach that enables an analysis of variance and distribution in multidimensional datasets. It is shown that the developed criteria explain the inconsistencies in the prediction models of two datasets from the same production system.

About the authors

Iris Weiß

Iris Weiß, M.Sc., graduated in Management and Technology with a focus on mechanical engineering from the Technical University of Munich (TUM) in 2016. She is a research assistant at the Institute of Automation and Information Systems at TUM. Her main research interest is data mining in automated production systems, with a focus on assessing data quality and evaluating the impact of data quality issues on data mining methods.

Birgit Vogel-Heuser

Prof. Dr.-Ing. Birgit Vogel-Heuser graduated in electrical engineering and received her Ph.D. in mechanical engineering from RWTH Aachen in 1991. She worked for nearly ten years in industrial automation in the machine and plant manufacturing industry. After holding different chairs of automation, she has been head of the Institute of Automation and Information Systems at the Technical University of Munich since 2009. Her research focuses on modeling and education in automation engineering for distributed and intelligent systems.

Received: 2017-11-15
Accepted: 2018-01-12
Published Online: 2018-04-06
Published in Print: 2018-04-25

© 2018 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 25.4.2024 from https://www.degruyter.com/document/doi/10.1515/auto-2017-0115/html