Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud

April 2010

Conference Chairs:
Ullas Nambiar (IBM India Research Lab, New Delhi, India)
John McPherson (IBM Almaden Research Center)
David Konopnicki (IBM Haifa Research Lab, Israel)

Publisher:

Association for Computing Machinery, New York, NY, United States

Conference:

MDAC '10: WWW 2010 Workshop on Massive Data Analytics over the Cloud, Raleigh, North Carolina, USA, 26 April 2010

ISBN:

978-1-60558-991-6

Published:

26 April 2010

Proceeding Downloads

research-article

Distributed indexing of web scale datasets for the cloud

Article No.: 1, Pages 1–6, https://doi.org/10.1145/1779599.1779600

In this paper, we present a distributed architecture for indexing and serving large and diverse datasets. It incorporates and extends the functionality of Hadoop, the open source MapReduce framework, and of HBase, a distributed, sparse, NoSQL database, ...

research-article

A novel approach to multiple sequence alignment using hadoop data grids

Article No.: 2, Pages 1–7, https://doi.org/10.1145/1779599.1779601

Multiple alignment of protein sequences is an essential tool in molecular biology. It helps determine evolutionary linkage and predict molecular structures. The factors to be considered while aligning multiple sequences are speed and accuracy of ...

research-article

Beyond online aggregation: parallel and incremental data mining with online Map-Reduce

Article No.: 3, Pages 1–6, https://doi.org/10.1145/1779599.1779602

There are only a few data mining algorithms that work in a massively parallel and yet online (i.e., incremental) fashion. A combination of both features is essential for mining of large data streams and adds scalability to the concept of Online Aggregation ...

research-article

Extracting user profiles from large scale data

Article No.: 4, Pages 1–6, https://doi.org/10.1145/1779599.1779603

In this work we present the details of a large scale user profiling framework that we developed here in IBM on top of Apache Hadoop. We address the problem of extracting and maintaining a very large number of user profiles from large scale data. We ...

research-article

Towards scalable RDF graph analytics on MapReduce

Article No.: 5, Pages 1–6, https://doi.org/10.1145/1779599.1779604

In order to exploit the growing amount of RDF data in decision-making, there is an increasing demand for analytics-style processing of such data. RDF data is modeled as a labeled graph that represents a collection of binary relations (triples). In this ...

research-article

SPARQL basic graph pattern processing with iterative MapReduce

Article No.: 6, Pages 1–6, https://doi.org/10.1145/1779599.1779605

There have been a number of approaches to adopt the RDF data model and the MapReduce framework for a data warehouse, as the data model is suitable for data integration and the data processing framework is good for large-scale fault-tolerant data ...

research-article

Efficient updates for a shared nothing analytics platform

Article No.: 7, Pages 1–6, https://doi.org/10.1145/1779599.1779606

In this paper we describe a cloud-based data-warehouse-like system especially targeted to time series data. Apart from the benefits that a distributed storage built on top of a shared-nothing architecture offers, our system is designed to efficiently ...

research-article

Parallelizing Random Walk with Restart for large-scale query recommendation

Article No.: 8, Pages 1–6, https://doi.org/10.1145/1779599.1779607

Random Walk with Restart (abbreviated as RWR) has been widely employed in Web search and recommendation systems and several performance enhancement approaches for RWR have been proposed to save storage costs and improve the on-line response time. In ...
