
Scope playback: self-validation in the cloud

Published: 21 May 2012

Abstract

The last decade witnessed the emergence of various distributed storage and computation systems for cloud-scale data processing. Scope is a distributed computation platform targeting a variety of data analysis and data mining applications, powering Bing and other online services at Microsoft. Scope combines the benefits of traditional parallel databases and MapReduce execution engines to allow easy programmability. It features a SQL-like declarative scripting language with .NET extensions, and delivers massive scalability and high performance through advanced optimization. Scope currently operates over tens of thousands of machines and processes over a million jobs per month.
Such a massive data-computation platform presents new challenges and opportunities for efficient and effective testing and validation. Traditional approaches to testing database systems are not always sufficient, for several reasons. Model-based query generation typically fails to cover user-defined code, which is very common in Scope scripts. Additionally, rapid release cycles in the platform-as-a-service environment require tools that quickly identify potential regressions, predict the impact of breaking changes, and provide massive test coverage in a short amount of time. In this paper, we describe a test-automation tool, called Scope Playback, that addresses these new requirements. Scope Playback leverages the Scope system itself in two important ways. First, it exploits data about every job submitted to production clusters, which the Scope system stores automatically. Second, the testing process itself is implemented as a Scope script, so it automatically benefits from transparent, massive computation parallelism. Scope Playback currently serves as a crucial validation technique and ensures product quality during Scope release cycles.
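The playback idea described here can be sketched in miniature: recompile every archived production script with both the released compiler and the candidate compiler, then compare the outcomes to flag regressions before release. The following Python sketch is a hypothetical illustration, not the paper's implementation; the compiler functions, job store, and result categories are invented stand-ins (the real system runs this comparison itself as a massively parallel Scope script).

```python
# Hypothetical sketch of playback-style validation: recompile archived
# scripts with a baseline and a candidate compiler and diff the results.
from dataclasses import dataclass


@dataclass
class PlaybackResult:
    job_id: str
    status: str  # "unchanged", "plan_changed", or "compile_error"


def released_compiler(script: str) -> str:
    # Stand-in for the shipped compiler: produces a "plan" string.
    return f"PLAN[{script.strip().lower()}]"


def candidate_compiler(script: str) -> str:
    # Stand-in for the compiler under test; it mishandles scripts
    # containing REDUCE, simulating a regression in user-defined code.
    if "REDUCE" in script:
        raise ValueError("unsupported operator")
    return f"PLAN[{script.strip().lower()}]"


def playback(jobs: dict[str, str]) -> list[PlaybackResult]:
    """Recompile each archived job with both compilers and classify."""
    results = []
    for job_id, script in jobs.items():
        baseline = released_compiler(script)
        try:
            candidate = candidate_compiler(script)
        except Exception:
            results.append(PlaybackResult(job_id, "compile_error"))
            continue
        status = "unchanged" if candidate == baseline else "plan_changed"
        results.append(PlaybackResult(job_id, status))
    return results


jobs = {
    "job-001": "SELECT a FROM t",
    "job-002": "REDUCE t ON k USING MyReducer",
}
for r in playback(jobs):
    print(r.job_id, r.status)
```

Because the archived jobs are real production workloads, this style of check covers user-defined code paths that model-based query generation misses, which is the gap the abstract identifies.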


Cited By

  • (2022) Deploying a Steered Query Optimizer in Production at Microsoft. Proceedings of the 2022 International Conference on Management of Data, 2299-2311. DOI: 10.1145/3514221.3526052. Online publication date: 10-Jun-2022.
  • (2021) The Cosmos Big Data Platform at Microsoft. Proceedings of the VLDB Endowment, 14(12), 3148-3161. DOI: 10.14778/3476311.3476390. Online publication date: 28-Oct-2021.
  • (2018) Snowtrail. Proceedings of the Workshop on Testing Database Systems, 1-6. DOI: 10.1145/3209950.3209958. Online publication date: 15-Jun-2018.


Published In

DBTest '12: Proceedings of the Fifth International Workshop on Testing Database Systems
May 2012
75 pages
ISBN:9781450314299
DOI:10.1145/2304510
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. distributed computing
  2. playback
  3. scope
  4. testing
  5. validation

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '12

Acceptance Rates

DBTest '12 Paper Acceptance Rate: 12 of 26 submissions (46%)
Overall Acceptance Rate: 31 of 56 submissions (55%)

