An Iterative and Incremental Data Preprocessing Procedure for Improving the Risk of Big Data Project

  • Conference paper
Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 612)

Abstract

Big data applications can strengthen the competitive advantage of enterprises and organizations and can improve people’s quality of life. However, owing to many factors, the failure rate of big data projects is higher than that of IT projects. To reduce the risk of failure, big data projects must overcome a series of challenges. Ambiguous requirements, poor data quality, and a lack of changeability and extensibility directly affect the results of big data analytics and can even lead to wrong decisions, inaccurate predictions, and improper planning, exposing big data projects to the risk of failure. To address this, this paper applies iterative and incremental development (IID) to data preprocessing and draws up the iterative and incremental data quality improvement (IIDQI) procedure. The IIDQI procedure iteratively detects and identifies data quality defects, incrementally strengthens big data quality, and controls the factors behind failure risk. Iterative inspection activities can effectively enhance data quality, communication efficiency, and requirements precision, thereby reducing the risk of big data project failure.
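The general idea of iterative detection and incremental correction can be illustrated with a minimal Python sketch. It is not the authors' IIDQI procedure, whose steps are not given in the abstract. The check functions `missing_values` and `duplicates`, the driver `improve_quality`, and the choice to repair a defect by dropping the affected record are all assumptions made for illustration only.

```python
# Illustrative sketch only: names and the drop-the-record repair strategy
# are assumptions, not the procedure defined in the paper.

def missing_values(records):
    """Flag indices of records in which any field is None or an empty string."""
    return {i for i, r in enumerate(records)
            if any(v in (None, "") for v in r.values())}


def duplicates(records):
    """Flag indices of records that repeat an earlier record (first copy is kept)."""
    seen, flagged = set(), set()
    for i, r in enumerate(records):
        key = tuple(sorted(r.items()))  # assumes hashable field values
        if key in seen:
            flagged.add(i)
        else:
            seen.add(key)
    return flagged


def improve_quality(records, checks, max_iterations=5):
    """Run every check once per iteration, removing flagged records after each
    check (one increment), until an iteration finds no defects."""
    history = []  # number of defects found in each iteration
    for _ in range(max_iterations):
        defects_this_pass = 0
        for check in checks:
            flagged = check(records)
            defects_this_pass += len(flagged)
            records = [r for i, r in enumerate(records) if i not in flagged]
        history.append(defects_this_pass)
        if defects_this_pass == 0:
            break
    return records, history


raw = [
    {"id": 1, "city": "Taipei"},
    {"id": 2, "city": ""},        # missing value
    {"id": 1, "city": "Taipei"},  # duplicate of the first record
    {"id": 3, "city": None},      # missing value
]
cleaned, history = improve_quality(raw, [missing_values, duplicates])
print(cleaned, history)
```

Recording the defect count per iteration gives a simple trace of whether quality is converging; a real project would likely repair or impute values rather than discard records.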

Acknowledgments

This research was supported by Ministry of Science and Technology research project funds (Project No.: MOST 105-2221-E-158-002).

Author information

Corresponding author

Correspondence to Sen-Tarng Lai.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Lai, ST., Leu, FY. (2018). An Iterative and Incremental Data Preprocessing Procedure for Improving the Risk of Big Data Project. In: Barolli, L., Enokido, T. (eds) Innovative Mobile and Internet Services in Ubiquitous Computing. IMIS 2017. Advances in Intelligent Systems and Computing, vol 612. Springer, Cham. https://doi.org/10.1007/978-3-319-61542-4_46

  • DOI: https://doi.org/10.1007/978-3-319-61542-4_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61541-7

  • Online ISBN: 978-3-319-61542-4

  • eBook Packages: Engineering, Engineering (R0)
