Preliminary Cleaning and Transformation of Data in Data Mining Using PHP Pthreads Library

  • Conference paper
Computational Science and Its Applications – ICCSA 2017 (ICCSA 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10408)

Abstract

The article deals with a special case of preparing vehicle movement data, which arrives from the source in large volumes, for the fast application of data mining methods. Data preparation goes through several stages, from selecting the necessary fields and records to saving them with modified values in a new data structure. The source data, which consists of 18 fields, contains a share of incorrect information, and some of its numerical values are in formats that are not suitable for further processing. Because the source data is large, processing it in its original form takes a very long time. The article shows how to use the pthreads library to organize multi-threaded processing of this data. To confirm the applicability of this library, the article presents the results of numerical experiments.
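The stages named in the abstract (select fields and records, correct values, save into a new structure, process in several threads) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses Python's standard `concurrent.futures` thread pool as a stand-in for PHP pthreads, and the field names (`vehicle_id`, `timestamp`, `speed`), the decimal-comma format problem, and the rule that negative speeds are invalid are all assumptions made for illustration, since the abstract does not list the 18 fields or the specific errors.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical subset of fields to keep; the abstract does not name the 18 fields.
KEEP = ("vehicle_id", "timestamp", "speed")


def clean_record(rec):
    """Return a cleaned copy of rec, or None if the record is unusable."""
    try:
        # Assumed format problem: a decimal comma instead of a decimal point.
        speed = float(str(rec["speed"]).replace(",", "."))
    except (KeyError, ValueError):
        return None  # missing or non-numeric value: drop the record
    if speed < 0:
        return None  # assumed validity rule: speed cannot be negative
    out = {key: rec[key] for key in KEEP if key in rec}
    out["speed"] = speed
    return out


def clean_chunk(chunk):
    """Clean one chunk of records, discarding the unusable ones."""
    return [c for c in map(clean_record, chunk) if c is not None]


def clean_parallel(records, workers=4, chunk_size=1000):
    """Split records into chunks and clean the chunks in a thread pool."""
    chunks = [records[i:i + chunk_size]
              for i in range(0, len(records), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cleaned = list(pool.map(clean_chunk, chunks))  # keeps chunk order
    return [rec for chunk in cleaned for rec in chunk]
```

Calling `clean_parallel(rows, workers=2, chunk_size=2)` on a list of dictionaries returns a new list containing only the kept fields of the valid records. In CPython, threads do not run CPU-bound code in parallel, so this sketch shows the chunk-and-merge structure rather than the speedup reported for pthreads; a process pool could be substituted for CPU-heavy cleaning.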

References

  1. Piatetsky-Shapiro, G., Frawley, W.: Knowledge discovery in databases, 539 p. AAAI Press, December 1991. ISBN: 9780262660709

  2. Shichkina, Y., Degtyarev, A., Koblov, A.: Technology of cleaning and transforming data using the knowledge discovery in databases (KDD) technology for fast application of data mining methods. In: CEUR Workshop Proceedings. Selected Papers of the 7th International Conference Distributed Computing and Grid-Technologies in Science and Education, vol. 1787, pp. 428–434 (2017). urn:nbn:de:0074-1787-5

  3. The state of the Octoverse. GitHub Octoverse (2016). https://octoverse.github.com/. Last accessed 1 Mar 2017

  4. Programming languages ranking 2016, Tagline — fresh rankings and researches of Runet, 11 April 2016. http://tagline.ru/programming-languages-rating/. Last accessed 1 Mar 2017

  5. PHP: pthreads – Manual, PHP: PHP Manual. http://php.net/manual/en/book.pthreads.php. Last accessed 1 Mar 2017

Acknowledgments

The paper has been prepared within the scope of the state project “Initiative scientific project” of the main part of the state plan of the Ministry of Education and Science of the Russian Federation (task № 2.6553.2017/BCH Basic Part) and was also supported by a grant of the Russian Fund for Basic Research (16-07-00886).

Author information

Correspondence to Yulia Shichkina, Alexander Koblov or Kirill Lysov.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Shichkina, Y., Koblov, A., Lysov, K., Iakushkin, O. (2017). Preliminary Cleaning and Transformation of Data in Data Mining Using PHP Pthreads Library. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2017. ICCSA 2017. Lecture Notes in Computer Science, vol 10408. Springer, Cham. https://doi.org/10.1007/978-3-319-62404-4_34

  • DOI: https://doi.org/10.1007/978-3-319-62404-4_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-62403-7

  • Online ISBN: 978-3-319-62404-4

  • eBook Packages: Computer Science, Computer Science (R0)
