ABSTRACT
In the Internet of Things, software should enable its host objects (everyday objects) to monitor other objects, take actions, and notify humans using some form of reasoning. The ever-changing nature of real-life environments requires these objects to generalize inductively over varied inputs in order to play their roles effectively. Such objects must learn from stored training examples using a generalization algorithm. In this paper, we investigate the training-set requirements for object learning and propose Stratified Ordered Selection (SOS), a method for scaling down training sets. SOS uses a new instance-ranking scheme called LO ranking. Everyday objects use SOS to select training subsets according to their capacity (e.g. memory, CPU). LO ranking is designed to broaden class representation, to achieve significant reduction while yielding the same or nearly the same analytical results, and to facilitate faster on-demand subset selection and retrieval for resource-constrained objects. We show that SOS outperforms other methods on well-known machine learning datasets.
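The capacity-driven selection described above can be sketched as a stratified, rank-ordered pick over class strata. This is only an illustrative reconstruction from the abstract: the `rank_key` parameter is a hypothetical stand-in for the paper's LO ranking, whose actual definition is not given here, and the round-robin loop is one plausible way to "broaden class representation" under a capacity budget.

```python
from collections import defaultdict

def stratified_ordered_selection(examples, labels, capacity, rank_key=None):
    """Select a capacity-bounded training subset via stratified selection.

    `rank_key` is a placeholder for an instance-ranking scheme (the paper's
    LO ranking is not specified in this abstract); by default, instances
    keep their original order within each class stratum.
    """
    # Group instance indices by class so every class stays represented.
    strata = defaultdict(list)
    for i, y in enumerate(labels):
        strata[y].append(i)

    # Optionally order each stratum by the supplied ranking function.
    if rank_key is not None:
        for y in strata:
            strata[y].sort(key=lambda i: rank_key(examples[i]))

    # Round-robin over classes: take the next-ranked instance from each
    # class until the device's capacity budget is exhausted.
    selected = []
    cursors = {y: 0 for y in strata}
    while len(selected) < capacity:
        progressed = False
        for y, idx in strata.items():
            if cursors[y] < len(idx) and len(selected) < capacity:
                selected.append(idx[cursors[y]])
                cursors[y] += 1
                progressed = True
        if not progressed:  # all strata exhausted before capacity reached
            break
    return ([examples[i] for i in selected],
            [labels[i] for i in selected])
```

An object with room for, say, 4 examples would call this with `capacity=4` and receive a subset in which each class contributes its highest-ranked instances first.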
Index Terms
- On-demand numerosity reduction for object learning