Incremental Recomputations in MapReduce (Inkrementelle Neuberechnungen in MapReduce)

Schildgen, Johannes; Jörg, Thomas; Deßloch, Stefan

doi:10.1007/s13222-012-0109-3

Focus article
Published: 29 December 2012

Volume 13, pages 33–43, (2013)
Datenbank-Spektrum

Johannes Schildgen¹,
Thomas Jörg² &
Stefan Deßloch¹

Abstract

The MapReduce programming model enables scalable analysis and transformation of large data sets. We present Marimba, a MapReduce-based framework for the easy development of incremental, self-maintainable programs that avoid rerunning the entire MapReduce job when the source data change. Marimba is illustrated with several applications and evaluated through performance measurements.
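The idea of a self-maintainable job can be illustrated with a word count. The following is a minimal sketch in plain Python, not Marimba's API and not Hadoop code: all function names are hypothetical, and the approach (emitting +1 for inserted and -1 for deleted documents, then merging the summed deltas into the previous result) is an assumption chosen for illustration.

```python
from collections import Counter


def map_words(doc, sign):
    """Emit (word, sign): +1 for an inserted document, -1 for a deleted one."""
    for word in doc.split():
        yield word, sign


def reduce_counts(pairs):
    """Sum the values per word, as a reduce step would."""
    totals = Counter()
    for word, value in pairs:
        totals[word] += value
    return totals


def full_run(docs):
    """Recompute the word counts from scratch over all documents."""
    pairs = (pair for doc in docs for pair in map_words(doc, +1))
    return {w: c for w, c in reduce_counts(pairs).items() if c != 0}


def incremental_run(previous, inserted, deleted):
    """Update a previous result using only the changed documents."""
    pairs = [pair for doc in inserted for pair in map_words(doc, +1)]
    pairs += [pair for doc in deleted for pair in map_words(doc, -1)]
    delta = reduce_counts(pairs)
    merged = Counter(previous)
    for word, change in delta.items():
        merged[word] += change
    # Drop words whose count fell to zero so the result matches a full run.
    return {w: c for w, c in merged.items() if c != 0}
```

In this sketch, `incremental_run` touches only the inserted and deleted documents, yet yields the same counts as `full_run` over the updated collection. This works because summation can be maintained from the old result and the delta alone.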

Notes

HBase [4], the column-oriented database system that belongs to Hadoop, manages tables that store records, each consisting of a unique row ID and arbitrary columns.
Modified documents are treated as deleted and then re-inserted.
HDFS (Hadoop Distributed File System) is a distributed file system that stores file blocks redundantly on multiple machines. HDFS serves as the foundation for the column-oriented database system HBase.
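The first two notes can be made concrete with a small sketch. It assumes a table modelled as a plain Python dictionary from row ID to a dictionary of columns, and it derives changes by comparing two snapshots. Both the in-memory model and the snapshot comparison are illustrative assumptions, not the way HBase or Marimba extract change data, and `split_changes` is a hypothetical helper.

```python
def split_changes(old, new):
    """Compare two snapshots {row_id: columns}.

    Returns (inserted, deleted) as lists of (row_id, columns) pairs.
    A modified row appears in both lists: its old version as deleted,
    its new version as inserted.
    """
    inserted, deleted = [], []
    for row_id in sorted(old.keys() | new.keys()):
        before, after = old.get(row_id), new.get(row_id)
        if before == after:
            continue  # unchanged row, nothing to report
        if before is not None:
            deleted.append((row_id, before))
        if after is not None:
            inserted.append((row_id, after))
    return inserted, deleted


old = {"r1": {"text": "a b"}, "r2": {"text": "b c"}}
new = {"r1": {"text": "a x"}, "r3": {"text": "d"}}
inserted, deleted = split_changes(old, new)
```

Here row `r1` was modified, so it is reported once as deleted (with its old columns) and once as inserted (with its new columns), while `r2` is only deleted and `r3` only inserted.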

References

[1] Apache Hadoop project. http://hadoop.apache.org/
[2] Bhatotia P, Wieder A, Rodrigues R, Acar UA, Pasquin R (2011) Incoop: mapreduce for incremental computations. In: Proceedings of the 2nd ACM symposium on cloud computing (SOCC ’11). ACM, New York, NY, USA, pp 7:1–7:14
[3] Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In: OSDI, pp 137–150
[4] George L (2011) HBase: the definitive guide, 1st edn. O’Reilly Media, Sebastopol
[5] Brown University Data Management Group. A comparison of approaches to large-scale data analysis. http://database.cs.brown.edu/projects/mapreduce-vs-dbms/
[6] Ho R (2010) Map/reduce to recommend people connection. August 2010. http://horicky.blogspot.de/2010/08/mapreduce-to-recommend-people.html
[7] Hu Y (2012) Efficiently extracting change data from column oriented NoSQL database. In: Proc of workshop on parallel, peer-to-peer, distributed and cloud computing (ICS 2012), December 2012 (accepted)
[8] Isard M, Budiu M, Yu Y, Birrell A, Fetterly D (2007) Dryad: distributed data-parallel programs from sequential building blocks. In: EuroSys, pp 59–72
[9] Jörg T, Parvizi R, Yong H, Dessloch S (2011) Incremental recomputations in mapreduce. In: CloudDB 2011, October 2011
[10] Krenzel S (2010) MapReduce: Finding friends. http://stevekrenzel.com/finding-friends-with-mapreduce
[11] Logothetis D, Olston C, Reed B, Webb KC, Yocum K (2010) Stateful bulk processing for incremental analytics. In: SoCC, pp 51–62
[12] Marimba framework. http://code.google.com/marimba-framework
[13] Peng D, Dabek F (2010) Large-scale incremental processing using distributed transactions and notifications. In: OSDI
[14] Popa L et al. (2009) DryadInc: reusing work in large-scale computations. In: HotCloud
[15] Schildgen J (2012) Ein MapReduce-basiertes Programmiermodell für selbstwartbare Aggregatsichten. Master's thesis, TU Kaiserslautern

Acknowledgments

The work presented here was financially supported by Google through a Google Research Award.

Author information

Authors and Affiliations

AG Heterogene Informationssysteme, Technische Universität Kaiserslautern, P.O. Box 3049, 67663 Kaiserslautern, Germany
Johannes Schildgen & Stefan Deßloch
Google, Dienerstraße 12, 80331 Munich, Germany
Thomas Jörg

Corresponding author

Correspondence to Johannes Schildgen.

About this article

Cite this article

Schildgen, J., Jörg, T. & Deßloch, S. Inkrementelle Neuberechnungen in MapReduce. Datenbank Spektrum 13, 33–43 (2013). https://doi.org/10.1007/s13222-012-0109-3

Received: 08 November 2012
Accepted: 10 December 2012
Published: 29 December 2012
Issue Date: March 2013
DOI: https://doi.org/10.1007/s13222-012-0109-3
