DOI:
10.1145/2912152
DIDC '16: Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing
ACM 2016 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
HPDC'16: The 25th International Symposium on High-Performance Parallel and Distributed Computing, Kyoto, Japan, 1 June 2016
ISBN:
978-1-4503-4352-7
Published:
01 June 2016
Sponsors:
University of Arizona, SIGARCH

Abstract

It is our great pleasure to welcome you to the Sixth International Workshop on Data-intensive Distributed Computing (DIDC 2016), which is held in conjunction with the International ACM Symposium on High Performance Distributed Computing (HPDC 2016).

The data needs of scientific and commercial applications from a diverse range of fields have been increasing exponentially in recent years. Digital data generated from sources such as scientific instruments, sensors, internet transactions, email, video, and click streams can be large, diverse, longitudinal, and distributed. This poses new challenges and requirements for offline and real-time processing, where extracting meaningful information can open novel application areas and lead to new breakthroughs. This data deluge and the growing demand for large-scale data processing have necessitated collaboration and sharing of data collections among the world's leading education, research, and industrial institutions, as well as the use of distributed resources owned by collaborating parties. In a widely distributed environment, data is often not locally accessible and must therefore be retrieved and stored remotely. While traditional distributed systems work well for computation that requires limited data handling, they may fail in unexpected ways when the computation accesses, creates, and moves large amounts of data, especially over wide-area networks. Further, the data accessed and created is often poorly described, lacking both metadata and provenance. Scientists, researchers, and application developers are often forced to solve basic data-handling issues, such as physically locating data, accessing it, and moving it to visualization or compute resources for further analysis. Although many efforts have been made to develop new programming paradigms and models that can handle the data needs of the application automatically, the results are far from optimal.

DIDC focuses on the challenges imposed by data-intensive applications on distributed systems, and on the different state-of-the-art solutions proposed to overcome these challenges. It brings together the collaborative and distributed computing community and the data management community in an effort to generate productive conversations on the planning, management, and scheduling of data handling tasks and data storage resources.

This year's workshop continues the tradition of gathering distinguished speakers and providing a diverse program, with topics ranging from data staging and indexing models for data-intensive applications to high-performance genomics and cloud scheduling.

SESSION: Keynote Address
invited-talk
Towards Convergence of Extreme Computing and Big Data Centers

Rapid growth in the use cases and demands for extreme computing and huge data processing is leading to convergence of the two infrastructures. Tokyo Tech.'s TSUBAME3.0, a 2017 addition to the highly successful TSUBAME2.5, will aim to deploy a series of ...

SESSION: Session I
research-article
Minimising the Execution of Unknown Bag-of-Task Jobs with Deadlines on the Cloud

Scheduling jobs with deadlines, each of which defines the latest time that a job must be completed, can be challenging on the cloud due to the incurred costs and unpredictable performance. This problem is further complicated when there is not enough ...

research-article
Experiences with Performing MapReduce Analysis of Scientific Data on HPC Platforms

The growing interest in being able to apply Big Data techniques to scientific data generated using HPC simulations led to the question of whether this is achievable on the same HPC platform, and if so, what is the performance that can be obtained on ...

research-article
Rethinking High Performance Computing Platforms: Challenges, Opportunities and Recommendations

A growing number of "second generation" high-performance computing applications with heterogeneous, dynamic and data-intensive properties have an extended set of requirements, which cover application deployment, resource allocation, -control, and I/O ...

SESSION: Session II
research-article
Public Access
Efficient and Scalable Workflows for Genomic Analyses

Recent growth in the volume of DNA sequence data and associated computational costs of extracting meaningful information warrants the need for efficient computational systems at-scale. In this work, we propose the Illinois Genomics Execution Environment ...

research-article
Public Access
Persistent Data Staging Services for Data Intensive In-situ Scientific Workflows

Scientific simulation workflows executing on very large scale computing systems are essential modalities for scientific investigation. The increasing scales and resolution of these simulations provide new opportunities for accurately modeling complex ...

research-article
SIDI: A Scalable in-Memory Density-based Index for Spatial Databases

With wide-spread use of location-based services, spatial data is becoming popular. As the data is usually huge in volume and continuously arriving to the storage in real-time, designing systems for efficiently storing this type of data is challenging. ...

Contributors
  • University at Buffalo, The State University of New York
  • Fatih University

Acceptance Rates

Overall Acceptance Rate 7 of 12 submissions, 58%
Year       Submitted  Accepted  Rate
DIDC '14   12         7         58%
Overall    12         7         58%