Abstract
The DARA (Dynamic Aggregation of Relational Attributes) algorithm is designed to summarize the records stored in a non-target table that have many-to-one relationships with records stored in the target table. The summarized data is then appended to the target table, and a classifier is trained on the resulting table to perform the classification task. However, the predictive accuracy of the classification task is highly influenced by the representation of the summarized data. In our previous work, several feature construction methods were introduced specifically for the DARA algorithm in order to improve the descriptive accuracy of the summarized data and, indirectly, the predictive accuracy on the target data. This paper proposes a method that learns relational data based on multiple instances of summarized data obtained using different feature construction methods. We investigate the effect of selecting several sets of summarized data, each produced with a different feature construction method, and appending them to the target table before the classification task is performed. The predictive accuracy of the classification task is expected to improve when multiple instances of summarized data are appended to the target table. The experimental results show some improvement in predictive accuracy when multiple instances of summarized data are selected and appended to the target table.
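The data flow described in the abstract can be illustrated with a minimal sketch. It assumes a toy target table and non-target table (both hypothetical), and it uses two simple aggregations as stand-ins for "different feature construction methods". These stand-ins are not DARA's actual summarization or the feature construction methods from the authors' earlier work; they only show how several summaries of the same non-target records can be appended to the target table as separate columns before a classifier is trained.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: each non-target row points to one target row
# through "target_id" (a many-to-one relationship).
target = [
    {"id": 1, "label": "good"},
    {"id": 2, "label": "bad"},
]
non_target = [
    {"target_id": 1, "item": "a", "amount": 10},
    {"target_id": 1, "item": "a", "amount": 30},
    {"target_id": 1, "item": "b", "amount": 20},
    {"target_id": 2, "item": "c", "amount": 5},
]


def group_by_target(rows):
    """Collect the non-target rows that belong to each target id."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["target_id"]].append(row)
    return groups


def summarize_mode_item(rows):
    """Stand-in summary 1 (assumption): most frequent 'item' value."""
    return Counter(r["item"] for r in rows).most_common(1)[0][0]


def summarize_mean_amount(rows):
    """Stand-in summary 2 (assumption): mean of the 'amount' values."""
    return sum(r["amount"] for r in rows) / len(rows)


def append_summaries(target_rows, non_target_rows, summarizers):
    """Append one column per summarizer to a copy of each target row."""
    groups = group_by_target(non_target_rows)
    enriched = []
    for t in target_rows:
        row = dict(t)  # copy so the original target table is unchanged
        related = groups.get(t["id"], [])
        for name, fn in summarizers.items():
            # Target rows without related records get a missing value.
            row[name] = fn(related) if related else None
        enriched.append(row)
    return enriched


# Selecting several summaries corresponds to choosing which entries
# go into this dictionary.
summarizers = {
    "summary_mode_item": summarize_mode_item,
    "summary_mean_amount": summarize_mean_amount,
}
enriched = append_summaries(target, non_target, summarizers)
for row in enriched:
    print(row)
```

The enriched rows could then be passed to any classifier. Comparing accuracy with one summary column against accuracy with several would mirror, in a very reduced form, the comparison the abstract describes.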
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sia, F., Alfred, R., Chin, K.O. (2013). Learning Relational Data Based on Multiple Instances of Summarized Data Using DARA. In: Noah, S.A., et al. Soft Computing Applications and Intelligent Systems. M-CAIT 2013. Communications in Computer and Information Science, vol 378. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40567-9_25
DOI: https://doi.org/10.1007/978-3-642-40567-9_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40566-2
Online ISBN: 978-3-642-40567-9
eBook Packages: Computer Science, Computer Science (R0)