DOI:
10.1145/1779599
MDAC '10: Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud
ACM 2010 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
MDAC '10: WWW 2010 Workshop on Massive Data Analytics over the Cloud, Raleigh, North Carolina, USA, 26 April 2010
ISBN:
978-1-60558-991-6
Published:
26 April 2010

Abstract

No abstract available.

research-article
Distributed indexing of web scale datasets for the cloud
Article No.: 1, Pages 1–6, https://doi.org/10.1145/1779599.1779600

In this paper, we present a distributed architecture for indexing and serving large and diverse datasets. It incorporates and extends the functionality of Hadoop, the open source MapReduce framework, and of HBase, a distributed, sparse, NoSQL database, ...

research-article
A novel approach to multiple sequence alignment using Hadoop data grids
Article No.: 2, Pages 1–7, https://doi.org/10.1145/1779599.1779601

Multiple alignment of protein sequences is an essential tool in molecular biology. It helps to determine evolutionary linkage and to predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of ...

research-article
Beyond online aggregation: parallel and incremental data mining with online Map-Reduce
Article No.: 3, Pages 1–6, https://doi.org/10.1145/1779599.1779602

There are only a few data mining algorithms that work in a massively parallel and yet online (i.e., incremental) fashion. A combination of both features is essential for mining large data streams and adds scalability to the concept of Online Aggregation ...

research-article
Extracting user profiles from large scale data
Article No.: 4, Pages 1–6, https://doi.org/10.1145/1779599.1779603

In this work, we present the details of a large-scale user profiling framework that we developed at IBM on top of Apache Hadoop. We address the problem of extracting and maintaining a very large number of user profiles from large-scale data. We ...

research-article
Towards scalable RDF graph analytics on MapReduce
Article No.: 5, Pages 1–6, https://doi.org/10.1145/1779599.1779604

In order to exploit the growing amount of RDF data in decision-making, there is an increasing demand for analytics-style processing of such data. RDF data is modeled as a labeled graph that represents a collection of binary relations (triples). In this ...

research-article
SPARQL basic graph pattern processing with iterative MapReduce
Article No.: 6, Pages 1–6, https://doi.org/10.1145/1779599.1779605

There have been a number of approaches to adopt the RDF data model and the MapReduce framework for a data warehouse, as the data model is suitable for data integration and the data processing framework is good for large-scale fault-tolerant data ...

research-article
Efficient updates for a shared nothing analytics platform
Article No.: 7, Pages 1–6, https://doi.org/10.1145/1779599.1779606

In this paper, we describe a cloud-based, data-warehouse-like system targeted especially at time series data. Apart from the benefits that a distributed storage built on top of a shared-nothing architecture offers, our system is designed to efficiently ...

research-article
Parallelizing Random Walk with Restart for large-scale query recommendation
Article No.: 8, Pages 1–6, https://doi.org/10.1145/1779599.1779607

Random Walk with Restart (abbreviated as RWR) has been widely employed in Web search and recommendation systems, and several performance enhancement approaches for RWR have been proposed to save storage costs and improve the online response time. In ...

Contributors
  • Dell EMC
  • IBM Research - Almaden
  • IBM Research
