Abstract
Feature selection has long been an active research area because of the growing use of high-dimensional data. Traditionally, its purpose is to select the smallest subset of features whose class distribution is as close as possible to the original class distribution. In this chapter, however, feature selection is used to obtain the unique individual significant features, which have proven very important in handwriting analysis for Writer Identification. Writer Identification is an area of pattern recognition that has attracted many researchers because of the extensive exchange of paper documents. Its principal applications are in forensics and biometrics, where writing style can serve as a biometric feature for authenticating the identity of a writer. Handwriting style is personal to each individual, and it is implicitly represented by unique individual significant features hidden in the individual’s handwriting. These unique features can be used to identify the author of a handwritten document. Feature selection, although an important machine learning task, is often disregarded in the Writer Identification domain, with only a handful of studies implementing a feature selection phase. The key concern in Writer Identification is acquiring features that reflect the author of the handwriting, so it remains an open question whether the extracted features are optimal or near-optimal for identifying the author. Extracting and selecting the unique individual significant features is therefore very important, both for identifying the writer and for improving classification accuracy. This relates to the invarianceness of authorship, where the variation between features within a class (same writer) is lower than the variation between classes (different writers).
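The intra-class versus inter-class comparison can be illustrated with a minimal sketch. The abstract does not state which measure is used, so the mean absolute difference between feature vectors, the function name `mean_pairwise_distance`, and the toy data for two writers are all assumptions made for illustration only.

```python
import numpy as np

def mean_pairwise_distance(a, b, same_set=False):
    """Mean absolute difference between every row of a and every row of b."""
    # Broadcast to shape (len(a), len(b), n_features), then average over features.
    d = np.abs(a[:, None, :] - b[None, :, :]).mean(axis=2)
    if same_set:
        # Exclude the zero-valued self-comparisons on the diagonal.
        mask = ~np.eye(len(a), dtype=bool)
        return d[mask].mean()
    return d.mean()

# Toy data (assumed): two writers, three word samples each, four features.
rng = np.random.default_rng(0)
writer_a = 0.2 + 0.01 * rng.standard_normal((3, 4))
writer_b = 0.8 + 0.01 * rng.standard_normal((3, 4))

# Intra-class: samples compared with other samples of the same writer.
intra = (mean_pairwise_distance(writer_a, writer_a, same_set=True)
         + mean_pairwise_distance(writer_b, writer_b, same_set=True)) / 2
# Inter-class: samples of one writer compared with samples of the other.
inter = mean_pairwise_distance(writer_a, writer_b)

print(f"intra-class: {intra:.4f}, inter-class: {inter:.4f}")
```

Features that individualise a writer well should give an intra-class value clearly below the inter-class value, as the toy data is constructed to do here.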
Much research has been done to develop algorithms for extracting good features that reflect authorship with good performance. This chapter instead focuses on identifying the unique individual significant features of word shape, that is, the features unique to an individual’s writing, by applying a feature selection method prior to the identification task. Specifically, it integrates the Swarm Optimized and Computationally Inexpensive Floating Selection (SOCIFS) feature selection technique into a proposed hybrid of a Writer Identification framework and a feature selection framework, namely Cheap Computational Cost Class-Specific Swarm Sequential Selection (C4S4). Experiments on data from the IAM Database were conducted to prove the validity and feasibility of the proposed framework by comparing it with the existing Writer Identification framework and with various feature selection techniques and frameworks, and they yielded satisfactory results. The proposed framework produces the best result, with 99.35% classification accuracy. These promising outcomes open the way to future exploration in the Writer Identification domain specifically and in other domains generally.
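The abstract does not describe the internals of SOCIFS or C4S4, so the following is not the authors' method. It is only a minimal sketch of plain sequential forward floating selection, the classical technique the "floating selection" name refers to, with no swarm component. The criterion `separability`, the function `sffs`, and the toy data are hypothetical choices made for illustration.

```python
import numpy as np

def separability(X, y, features):
    """Hypothetical criterion: inter-class spread divided by intra-class spread."""
    Xs = X[:, features]
    classes = np.unique(y)
    centroids = np.array([Xs[y == c].mean(axis=0) for c in classes])
    intra = np.mean([np.abs(Xs[y == c] - centroids[i]).mean()
                     for i, c in enumerate(classes)])
    inter = np.abs(centroids[:, None, :] - centroids[None, :, :]).mean()
    return inter / (intra + 1e-12)

def sffs(X, y, k, score):
    """Sequential forward floating selection of k features (generic sketch)."""
    selected = []
    best = {}  # subset size -> (best score seen, subset)
    while len(selected) < k:
        # Forward step: add the single feature that maximises the criterion.
        remaining = [f for f in range(X.shape[1]) if f not in selected]
        f_add = max(remaining, key=lambda f: score(X, y, selected + [f]))
        selected = selected + [f_add]
        s = score(X, y, selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, selected)
        # Floating step: drop a feature while that beats the best subset of that size.
        while len(selected) > 2:
            f_drop = max(selected, key=lambda f: score(
                X, y, [g for g in selected if g != f]))
            reduced = [g for g in selected if g != f_drop]
            s_red = score(X, y, reduced)
            if s_red > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (s_red, reduced)
            else:
                break
    return best[k][1]

# Toy data (assumed): 2 writers, 20 samples each, 6 features; only 1 and 4 differ.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.standard_normal((40, 6))
X[y == 1, 1] += 5.0
X[y == 1, 4] += 5.0

chosen = sffs(X, y, 2, separability)
print("selected features:", sorted(chosen))
```

In a wrapper setting, the criterion would typically be replaced by the accuracy of the identification classifier on held-out samples; the ratio used here is only a cheap stand-in so that the sketch stays self-contained.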
Copyright information
© 2014 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Pratama, S.F., Muda, A.K., Choo, Y.-H., Muda, N.A. (2014). A New Swarm-Based Framework for Handwritten Authorship Identification in Forensic Document Analysis. In: Muda, A., Choo, Y.-H., Abraham, A., Srihari, S.N. (eds) Computational Intelligence in Digital Forensics: Forensic Investigation and Applications. Studies in Computational Intelligence, vol 555. Springer, Cham. https://doi.org/10.1007/978-3-319-05885-6_16
DOI: https://doi.org/10.1007/978-3-319-05885-6_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05884-9
Online ISBN: 978-3-319-05885-6
eBook Packages: Engineering, Engineering (R0)