DOI: 10.1145/3183440.3195084
poster

Efficiently finding minimal failing input in MapReduce programs

Published: 27 May 2018

ABSTRACT

Debugging programs written for distributed computing models such as MapReduce is difficult, which is why prior studies focus only on finding and fixing bugs in the early stages of program development. Delta debugging finds a minimal failing input for sequential programs by dividing the input into subsets and testing those subsets one by one, but no prior work attempts to find a minimal failing input for distributed programs such as MapReduce. In this paper, we present MapRedDD, a framework that efficiently finds the minimal failing input of MapReduce programs. MapRedDD employs a failing input selection technique that identifies the failing input subset in a single run of the MapReduce program over multiple input subsets, instead of testing each subset separately. This removes the need to execute the MapReduce program once per input subset and so avoids the repeated overhead of job submission, job scheduling, and final outcome retrieval. MapRedDD finds the minimal failing input in a number of executions equal to the number of inputs to the MapReduce program, N, whereas a binary search invariant algorithm needs, in the worst case, a number of executions equal to the number of input subsets, 2N - 1.
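To make the subset-by-subset baseline concrete, the following is a minimal Python sketch of a classic delta-debugging minimization loop (often called ddmin) for a sequential program. It is not MapRedDD's implementation. The names `split`, `ddmin`, and the `fails` predicate are hypothetical; `fails` stands in for one full execution of the program under test on a candidate input subset, so the returned run count is the quantity the abstract is concerned with.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split(items: Sequence[T], n: int) -> List[List[T]]:
    """Split items into n contiguous chunks of near-equal size."""
    size, rem = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < rem else 0)
        chunks.append(list(items[start:end]))
        start = end
    return chunks


def ddmin(inputs: Sequence[T],
          fails: Callable[[List[T]], bool]) -> Tuple[List[T], int]:
    """Return (minimal failing subset, number of executions of `fails`).

    Assumption: `fails(subset)` runs the program on `subset` and reports
    whether the failure is reproduced. Each call models one separate run.
    """
    current = list(inputs)
    n, runs = 2, 0
    while len(current) >= 2:
        chunks = split(current, n)
        reduced = False
        # Step 1: test each subset on its own, one execution per subset.
        for chunk in chunks:
            runs += 1
            if fails(chunk):
                current, n, reduced = chunk, 2, True
                break
        # Step 2: test each complement (for n == 2 these equal the subsets).
        if not reduced and n > 2:
            for i in range(len(chunks)):
                rest = [x for j, c in enumerate(chunks) if j != i for x in c]
                runs += 1
                if fails(rest):
                    current, n, reduced = rest, max(n - 1, 2), True
                    break
        # Step 3: nothing failed, so refine the granularity or stop.
        if not reduced:
            if n >= len(current):
                break
            n = min(len(current), n * 2)
    return current, runs


if __name__ == "__main__":
    records = list(range(8))
    minimal, runs = ddmin(records, lambda subset: 5 in subset)
    print(minimal, runs)
```

In this sketch every call to `fails` is a separate execution, so in a MapReduce setting each one would pay for job submission, scheduling, and result retrieval. That per-subset cost is what the abstract says MapRedDD avoids by evaluating multiple input subsets in a single run.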


    • Published in

      ICSE '18: Proceedings of the 40th International Conference on Software Engineering: Companion Proceeedings
      May 2018
      231 pages
      ISBN: 9781450356633
      DOI: 10.1145/3183440
      • Conference Chair: Michel Chaudron
      • General Chair: Ivica Crnkovic
      • Program Chairs: Marsha Chechik, Mark Harman

      Copyright © 2018 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 276 of 1,856 submissions, 15%
