Abstract
In this paper, we investigate automatic authorship attribution on seven Arabic religious books, namely the Holy Quran, the Hadith and five other books, using two fusion techniques. All seven books are written in the same language variety (Standard Arabic), belong to the same genre and address the same topic (religion).
Authorship is characterized by four feature types: character trigrams, character tetragrams, word unigrams and word bigrams. Authorship identification is performed by four conventional classifiers: Manhattan distance, Multi-Layer Perceptron, Support Vector Machines and Linear Regression. Furthermore, we propose two fusion approaches to strengthen classification performance. Finally, a dedicated experiment addresses authorship discrimination between the Quran and the Hadith, in order to determine whether the two books could have the same author. The results show the importance of fusion techniques in authorship attribution and indicate that the two books (Quran and Hadith) should be attributed to two different authors, which implies that the Quran could not have been written by the Prophet.
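As an illustration of the pipeline outlined above, the following minimal Python sketch extracts the four feature types as relative-frequency profiles and attributes a text to the closest reference author under the Manhattan distance. The abstract does not specify the two fusion approaches, so the majority vote over the four feature types used here is only an assumed stand-in, and all function names (`char_ngrams`, `word_ngrams`, `manhattan`, `nearest_author`, `fused_decision`) are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n):
    """Relative frequencies of overlapping character n-grams."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()} if total else {}


def word_ngrams(text, n):
    """Relative frequencies of overlapping word n-grams (whitespace tokens)."""
    words = text.split()
    grams = Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()} if total else {}


def manhattan(p, q):
    """Manhattan (L1) distance between two sparse frequency profiles."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# The four feature types named in the abstract.
FEATURES = {
    "char_trigrams": lambda t: char_ngrams(t, 3),
    "char_tetragrams": lambda t: char_ngrams(t, 4),
    "word_unigrams": lambda t: word_ngrams(t, 1),
    "word_bigrams": lambda t: word_ngrams(t, 2),
}


def nearest_author(sample, references, extract):
    """Return the author whose reference text is closest to the sample."""
    profile = extract(sample)
    return min(references,
               key=lambda a: manhattan(profile, extract(references[a])))


def fused_decision(sample, references):
    """Assumed fusion rule: majority vote over the four feature types.

    This is NOT necessarily one of the paper's two fusion approaches.
    On a tie, the author that received a vote first is returned.
    """
    votes = Counter(nearest_author(sample, references, extract)
                    for extract in FEATURES.values())
    return votes.most_common(1)[0][0]
```

Here `references` is assumed to map each author label to a reference text, and a call such as `fused_decision(sample, references)` returns the winning label. Python strings are Unicode, so the same code applies unchanged to Arabic text; the trained classifiers mentioned in the abstract (Multi-Layer Perceptron, Support Vector Machines, Linear Regression) are outside the scope of this sketch.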
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Sayoud, H., Hadjadj, H. (2018). Fusion Based Authorship Attribution-Application of Comparison Between the Quran and Hadith. In: Lachkar, A., Bouzoubaa, K., Mazroui, A., Hamdani, A., Lekhouaja, A. (eds) Arabic Language Processing: From Theory to Practice. ICALP 2017. Communications in Computer and Information Science, vol 782. Springer, Cham. https://doi.org/10.1007/978-3-319-73500-9_14
DOI: https://doi.org/10.1007/978-3-319-73500-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73499-6
Online ISBN: 978-3-319-73500-9
eBook Packages: Computer Science, Computer Science (R0)