
T-SPPA: Trended Statistical PreProcessing Algorithm

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 188)

Abstract

Traditional machine learning systems learn from non-relational data, although most real-world data is relational. The learning task is normally performed on a single flat file, which prevents the discovery of effective relations among records. Inductive logic programming and statistical relational learning partially solve this problem. In this work, we resort to another method and propose T-SPPA (Trended Statistical PreProcessing Algorithm), a preprocessing method that translates related records into a single record before learning. Using different kinds of data, we compare the results obtained when learning from the transformed data with those obtained when learning from the original data, to demonstrate the efficacy of our method.
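
The abstract does not spell out the transformation itself, so the following is only an illustrative sketch of the general idea of collapsing several related records into one record before learning. It assumes, as the name "Trended Statistical" suggests but the abstract does not confirm, that each group of related records is summarised by simple statistics plus a trend value; the function names `slope` and `flatten_related`, the field names, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def slope(values):
    """Least-squares slope of values against their index 0..n-1.

    Used here as an assumed "trend" feature; returns 0.0 when there are
    fewer than two points.
    """
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def flatten_related(records, key, field):
    """Collapse all records sharing the same `key` into one summary record.

    Records are assumed to arrive in chronological order within each group.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec[field])
    return {
        entity: {
            f"{field}_min": min(vals),
            f"{field}_max": max(vals),
            f"{field}_mean": mean(vals),
            f"{field}_slope": slope(vals),
        }
        for entity, vals in groups.items()
    }


# Hypothetical data: several measurements per patient.
records = [
    {"patient": "a", "score": 1.0},
    {"patient": "a", "score": 2.0},
    {"patient": "a", "score": 3.0},
    {"patient": "b", "score": 5.0},
    {"patient": "b", "score": 5.0},
]
flat = flatten_related(records, "patient", "score")
```

Each key of `flat` then corresponds to one entity with a fixed set of attributes, which is the shape a conventional flat-file learner expects. Which statistics the published method actually computes is not stated in this abstract.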


Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Silva, T., Dutra, I. (2011). T-SPPA: Trended Statistical PreProcessing Algorithm. In: Snasel, V., Platos, J., El-Qawasmeh, E. (eds) Digital Information Processing and Communications. ICDIPC 2011. Communications in Computer and Information Science, vol 188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22389-1_11

  • DOI: https://doi.org/10.1007/978-3-642-22389-1_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22388-4

  • Online ISBN: 978-3-642-22389-1

  • eBook Packages: Computer Science, Computer Science (R0)
