T-SPPA: Trended Statistical PreProcessing Algorithm

Silva, Tiago; Dutra, Inês

doi:10.1007/978-3-642-22389-1_11

Tiago Silva & Inês Dutra

Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 188)

Abstract

Traditional machine learning systems learn from non-relational data, yet most real-world data is relational. The learning task is normally performed on a single flat file, which prevents the discovery of useful relations among records. Inductive logic programming and statistical relational learning partially solve this problem. In this work, we resort to a different approach and propose T-SPPA (Trended Statistical PreProcessing Algorithm), a preprocessing method that translates related records into a single record before learning. Using different kinds of data, we compare the results of learning from the transformed data with those of learning from the original data to demonstrate the efficacy of our method.
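
The abstract does not say which statistics or trend measure T-SPPA computes, so the following is only a generic illustration of the idea of collapsing related records into one record, not the authors' algorithm. It assumes, purely for illustration, that each group of related rows is summarised by its mean, minimum, maximum, and a least-squares slope over record order; the names `flatten_records`, `slope`, `patient`, and `score` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def slope(values):
    """Least-squares slope of values against their position 0..n-1.

    Returns 0.0 when there are fewer than two values (no trend can be fitted).
    """
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def flatten_records(rows, key, attribute):
    """Collapse all rows sharing the same `key` into one summary record.

    Assumption: the choice of aggregates (mean, min, max, slope) is
    illustrative only and is not taken from the paper.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[attribute])
    return {
        entity: {
            f"{attribute}_mean": mean(values),
            f"{attribute}_min": min(values),
            f"{attribute}_max": max(values),
            f"{attribute}_trend": slope(values),
        }
        for entity, values in groups.items()
    }


# Hypothetical toy data: several related records per entity, in time order.
rows = [
    {"patient": "a", "score": 1.0},
    {"patient": "a", "score": 2.0},
    {"patient": "a", "score": 3.0},
    {"patient": "b", "score": 5.0},
    {"patient": "b", "score": 3.0},
]
flat = flatten_records(rows, "patient", "score")
```

Each entity ends up as a single flat record, which a conventional propositional learner can consume directly; the trend feature keeps some of the ordering information that plain aggregates would discard.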

Author information

Authors and Affiliations

CRACS - INESC Porto LA, Dep. Ciência de Computadores, Faculdade de Ciências da Universidade do Porto, Porto, Portugal
Tiago Silva & Inês Dutra

Editor information

Editors and Affiliations

Faculty of Electrical Engineering and Computer Science, VSB-Technical University of Ostrava, VŠB-TUO, 17. listopadu 15, 708 33, Ostrava-Poruba, Czech Republic
Vaclav Snasel & Jan Platos
Information Systems Department, King Saud University, 11543, Riyadh, Saudi Arabia
Eyas El-Qawasmeh

About this paper

Cite this paper

Silva, T., Dutra, I. (2011). T-SPPA: Trended Statistical PreProcessing Algorithm. In: Snasel, V., Platos, J., El-Qawasmeh, E. (eds) Digital Information Processing and Communications. ICDIPC 2011. Communications in Computer and Information Science, vol 188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22389-1_11

DOI: https://doi.org/10.1007/978-3-642-22389-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22388-4
Online ISBN: 978-3-642-22389-1
eBook Packages: Computer Science, Computer Science (R0)
