A testing data validity assessment method and testing data validation platform based on SOA

  • Special Issue Paper
  • Service Oriented Computing and Applications

Abstract

In modern manufacturing, both product manufacturers and component suppliers place a high value on the quality of component testing data. However, common component quality analysis processes assume that testing data are valid, which is not always true. Assessing the validity of component testing data is therefore important. Many existing data analysis platforms are separate from enterprises’ own systems, which leaves inspection data analysis disconnected from their business processes. In this paper, we propose a testing data quality assessment method and a testing data validation platform based on service-oriented architecture (SOA). The platform provides a reliable third-party testing data validation service via RESTful APIs, so that the service can be seamlessly integrated into enterprise systems. The testing data validity assessment method, which is the core of the platform, works by detecting illegal behavior in data recording. The detection combines behavior analysis with a positive and unlabeled learning process.
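The abstract does not spell out the positive and unlabeled (PU) learning step, so the following is only a minimal sketch of one well-known PU technique, the label-frequency correction of Elkan and Noto (2008), and not the authors' implementation. It assumes a single hypothetical numeric feature per record (for instance, a score produced by behavior analysis), synthetic Gaussian toy data, and a hand-written logistic regression. Which class counts as "positive" (here, records known to show the behavior of interest) is also an assumption, and all function names are illustrative.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ss, lr=0.1, epochs=2000):
    """Fit g(x) = sigmoid(w*x + b) to 0/1 labels ss by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, s in zip(xs, ss):
            err = sigmoid(w * x + b) - s
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def pu_fit(labeled, unlabeled, holdout_fraction=0.25):
    """Return (posterior, c), where posterior(x) estimates p(y=1 | x).

    labeled:   feature values of records known to be positive (s=1)
    unlabeled: feature values of records with unknown class (s=0)
    """
    labeled = list(labeled)
    k = max(1, int(len(labeled) * holdout_fraction))
    holdout, train_pos = labeled[:k], labeled[k:]

    # Step 1: train a "labeled vs. unlabeled" classifier g(x) ~ p(s=1 | x).
    xs = train_pos + list(unlabeled)
    ss = [1.0] * len(train_pos) + [0.0] * len(unlabeled)
    w, b = fit_logistic(xs, ss)

    # Step 2: estimate c = p(s=1 | y=1) as the mean score on held-out positives.
    c = sum(sigmoid(w * x + b) for x in holdout) / len(holdout)

    # Step 3: rescale, since p(y=1 | x) = p(s=1 | x) / c; clip to a valid probability.
    def posterior(x):
        return min(1.0, sigmoid(w * x + b) / c)

    return posterior, c


def make_toy_data(seed=0):
    """Synthetic data: positives centered at +2, negatives at -2 (assumed)."""
    rng = random.Random(seed)
    positives = [rng.gauss(2.0, 1.0) for _ in range(200)]
    negatives = [rng.gauss(-2.0, 1.0) for _ in range(200)]
    labeled = positives[:100]                 # only half the positives are labeled
    unlabeled = positives[100:] + negatives   # the rest are mixed with negatives
    return labeled, unlabeled


if __name__ == "__main__":
    labeled, unlabeled = make_toy_data()
    posterior, c = pu_fit(labeled, unlabeled)
    print("estimated label frequency c:", round(c, 3))
    for x in (-3.0, 0.0, 3.0):
        print(x, round(posterior(x), 3))
```

The rescaling in step 3 relies on the assumption that labeled positives are selected completely at random from all positives. A linear logistic model is used only to keep the sketch dependency-free; any classifier that outputs calibrated probabilities could take its place.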


Acknowledgements

This research is supported by the Shanghai Institute of Precision Measurement Project under Grant No. SAST2017-128 and the National Natural Science Foundation of China under Grant No. 61373030.

Author information

Corresponding author

Correspondence to Hongming Cai.

Cite this article

Zhang, B., Li, C., Shah, N. et al. A testing data validity assessment method and testing data validation platform based on SOA. SOCA 12, 201–209 (2018). https://doi.org/10.1007/s11761-018-0242-4
