DOI: 10.1145/2980258.2980465
research-article

Integrated Framework for Speech Categorization based on Clustering in Dynamic Environment

Published: 25 August 2016

Editorial Notes

NOTICE OF CONCERN: ACM has received evidence that casts doubt on the integrity of the peer review process for the ICIA 2016 Conference. As a result, ACM is issuing a Notice of Concern for all papers published and strongly suggests that the papers from this Conference not be cited in the literature until ACM's investigation has concluded and final decisions have been made regarding the integrity of the peer review process for this Conference.

ABSTRACT

Gathering, organizing and archiving information has become challenging because of the sheer volume of information available, especially in TV broadcasting. Because many TV talk shows exist only as video and speech, documenting them is a major challenge. Categorizing TV programme content groups related themes together, which in turn improves the efficiency of navigation and information retrieval. The rapid growth in the number of talk shows across domains such as news groups, medical talk shows, cooking tips, politics and speeches has made it harder to convert the videos to text and categorize the programmes efficiently. This paper introduces an integrated framework for speech categorization based on clustering, which existing systems do not address. It proposes a heterogeneous integrated framework and a new clustering algorithm for dynamic environments, named Theme based Dynamic Document Clustering (TDDC). The main goal of the system is to help users, especially hearing-impaired people, locate and retrieve relevant information in the form of domain-specific documents for TV talk shows. A dataset of videos from various domains, obtained from DR.Oz.com, is used for the experimental analysis. The performance metrics are F-measure and inter- and intra-cluster similarity. The proposed TDDC for dynamic environments is compared with the existing FIS (Frequent Item Set) clustering algorithm, and the results are analyzed.
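
The abstract names its evaluation metrics without giving formulas. As a rough illustration only, the sketch below computes a clustering F-measure and average intra- and inter-cluster cosine similarity using definitions that are common in the document clustering literature. It is not the TDDC algorithm, and the paper may define these metrics differently. The function names (`f_measure`, `intra_inter_similarity`), the bag-of-words vectors, and the toy transcripts are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def f_measure(labels, clusters):
    """Class-size-weighted best F score per true class (assumed definition)."""
    n = len(labels)
    class_sizes = Counter(labels)
    cluster_sizes = Counter(clusters)
    joint = Counter(zip(labels, clusters))
    total = 0.0
    for c, n_c in class_sizes.items():
        best = 0.0
        for k, n_k in cluster_sizes.items():
            n_ck = joint[(c, k)]
            if n_ck == 0:
                continue
            precision = n_ck / n_k
            recall = n_ck / n_c
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_c / n) * best
    return total


def intra_inter_similarity(vectors, clusters):
    """Mean pairwise cosine similarity within clusters and across clusters."""
    intra, inter = [], []
    for i, j in combinations(range(len(vectors)), 2):
        sim = cosine(vectors[i], vectors[j])
        (intra if clusters[i] == clusters[j] else inter).append(sim)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(intra), mean(inter)


# Hypothetical toy transcripts; not data from the paper.
docs = [
    "heart diet exercise",
    "diet heart health",
    "recipe oven bake",
    "bake recipe sugar",
]
labels = ["medical", "medical", "cooking", "cooking"]  # assumed ground truth
clusters = [0, 0, 1, 1]                                # assumed clustering output
vectors = [Counter(d.split()) for d in docs]

score = f_measure(labels, clusters)
intra, inter = intra_inter_similarity(vectors, clusters)
```

Under these definitions, a good clustering has a high F-measure and high intra-cluster similarity together with low inter-cluster similarity.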


    • Published in

      ACM Other conferences
      ICIA-16: Proceedings of the International Conference on Informatics and Analytics
      August 2016
      868 pages
      ISBN: 9781450347563
      DOI: 10.1145/2980258

      Copyright © 2016 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited