DOI: 10.1145/3545947.3573271

A Framework to Develop Automatic Speech Recognition for Low Resource Languages

Published: 06 March 2023

Abstract

Current Automatic Speech Recognition (ASR) systems, such as Google Assistant, Apple's Siri, and Amazon's Alexa, still support only a small number of languages (English, Mandarin, Arabic, etc.), primarily those spoken in developed nations with abundant resources. While speakers of these languages have reaped the benefits of having such technology at their disposal, places like Ethiopia and Vietnam are still far behind. This work represents a global collaboration to create a framework for customizing ASR systems for low resource languages (LRLs), that is, languages with limited human and financial resources. This paper describes the methodology for using an existing toolkit (Kaldi) to implement an ASR system for two such languages, Amharic and Vietnamese, using as little annotated speech as possible. The languages were chosen to leverage available student expertise and create cross-cultural connections. The objective of the research is to create a procedure by which, given enough training records and annotation, any language can be added.

Cited By

  • (2025) Code-Switching ASR for Low-Resource Indic Languages: A Hindi-Marathi Case Study. IEEE Access 13, 9171-9198. DOI: 10.1109/ACCESS.2025.3527745. Online publication date: 2025

    Published In

    SIGCSE 2023: Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 2
    March 2023
    1481 pages
    ISBN: 9781450394338
    DOI: 10.1145/3545947
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. automatic speech recognition
    2. low resource languages
    3. student research

    Qualifiers

    • Abstract

    Conference

    SIGCSE 2023

    Acceptance Rates

    Overall Acceptance Rate 1,787 of 5,146 submissions, 35%
