References
Calisi L (2016) How to convert legacy Hadoop Map/Reduce ETL systems to Samza streaming. https://www.youtube.com/watch?v=KQ5OnL2hMBY
Carbone P, Katsifodimos A, Ewen S, Markl V, Haridi S, Tzoumas K (2015) Apache Flink: stream and batch processing in a single engine. IEEE Data Eng Bull 38(4):28–38. http://sites.computer.org/debull/A15dec/p28.pdf
Chen S (2016) Scalable complex event processing on Samza @Uber. https://www.slideshare.net/ShuyiChen2/scalable-complex-event-processing-on-samza-uber
Das S, Botev C, Surlaker K, Ghosh B, Varadarajan B, Nagaraj S, Zhang D, Gao L, Westerman J, Ganti P, Shkolnik B, Topiwala S, Pachev A, Somasundaram N, Subramaniam S (2012) All aboard the Databus! LinkedIn’s scalable consistent change data capture platform. In: 3rd ACM symposium on cloud computing (SoCC). https://doi.org/10.1145/2391229.2391247
Dean J, Ghemawat S (2004) MapReduce: simplified data processing on large clusters. In: 6th USENIX symposium on operating system design and implementation (OSDI)
Feng T (2015) Benchmarking Apache Samza: 1.2 million messages per second on a single node. http://engineering.linkedin.com/performance/benchmarking-apache-samza-12-million-messages-second-single-node
Goodhope K, Koshy J, Kreps J, Narkhede N, Park R, Rao J, Ye VY (2012) Building LinkedIn’s real-time activity data pipeline. IEEE Data Eng Bull 35(2):33–45. http://sites.computer.org/debull/A12june/A12JUN-CD.pdf
Hermann J, Balso MD (2017) Meet Michelangelo: Uber’s machine learning platform. https://eng.uber.com/michelangelo/
Hindman B, Konwinski A, Zaharia M, Ghodsi A, Joseph AD, Katz R, Shenker S, Stoica I (2011) Mesos: a platform for fine-grained resource sharing in the data center. In: 8th USENIX symposium on networked systems design and implementation (NSDI)
Junqueira FP, Reed BC, Serafini M (2011) Zab: high-performance broadcast for primary-backup systems. In: 41st IEEE/IFIP international conference on dependable systems and networks (DSN), pp 245–256. https://doi.org/10.1109/DSN.2011.5958223
Kleppmann M (2017) Designing data-intensive applications. O’Reilly Media. ISBN:978-1-4493-7332-0
Kleppmann M, Kreps J (2015) Kafka, Samza and the Unix philosophy of distributed data. IEEE Data Eng Bull 38(4):4–14. http://sites.computer.org/debull/A15dec/p4.pdf
Kreps J (2014) Why local state is a fundamental primitive in stream processing. https://www.oreilly.com/ideas/why-local-state-is-a-fundamental-primitive-in-stream-processing
Kreps J, Narkhede N, Rao J (2011) Kafka: a distributed messaging system for log processing. In: 6th international workshop on networking meets databases (NetDB)
Kulkarni S, Bhagat N, Fu M, Kedigehalli V, Kellogg C, Mittal S, Patel JM, Ramasamy K, Taneja S (2015) Twitter Heron: stream processing at scale. In: ACM international conference on management of data (SIGMOD), pp 239–250. https://doi.org/10.1145/2723372.2723374
Netflix Technology Blog (2016) Kafka inside Keystone pipeline. http://techblog.netflix.com/2016/04/kafka-inside-keystone-pipeline.html
Noghabi SA, Paramasivam K, Pan Y, Ramesh N, Bringhurst J, Gupta I, Campbell RH (2017) Samza: stateful scalable stream processing at LinkedIn. Proc VLDB Endow 10(12):1634–1645. https://doi.org/10.14778/3137765.3137770
Paramasivam K (2016) Stream processing with Apache Samza – current and future. https://engineering.linkedin.com/blog/2016/01/whats-new-samza
Pathirage M, Hyde J, Pan Y, Plale B (2016) SamzaSQL: scalable fast data management with streaming SQL. In: IEEE international workshop on high-performance big data computing (HPBDC), pp 1627–1636. https://doi.org/10.1109/IPDPSW.2016.141
Qiao L, Auradar A, Beaver C, Brandt G, Gandhi M, Gopalakrishna K, Ip W, Jgadish S, Lu S, Pachev A, Ramesh A, Surlaker K, Sebastian A, Shanbhag R, Subramaniam S, Sun Y, Topiwala S, Tran C, Westerman J, Zhang D, Das S, Quiggle T, Schulman B, Ghosh B, Curtis A, Seeliger O, Zhang Z (2013) On brewing fresh Espresso: LinkedIn’s distributed data serving platform. In: ACM international conference on management of data (SIGMOD), pp 1135–1146. https://doi.org/10.1145/2463676.2465298
Vavilapalli VK, Murthy AC, Douglas C, Agarwal S, Konar M, Evans R, Graves T, Lowe J, Shah H, Seth S, Saha B, Curino C, O’Malley O, Radia S, Reed B, Baldeschwieler E (2013) Apache Hadoop YARN: yet another resource negotiator. In: 4th ACM symposium on cloud computing (SoCC). https://doi.org/10.1145/2523616.2523633
Wang G, Koshy J, Subramanian S, Paramasivam K, Zadeh M, Narkhede N, Rao J, Kreps J, Stein J (2015) Building a replicated logging system with Apache Kafka. Proc VLDB Endow 8(12):1654–1655. https://doi.org/10.14778/2824032.2824063
© 2018 Springer International Publishing AG, part of Springer Nature
About this entry
Cite this entry
Kleppmann, M. (2018). Apache Samza. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_197-2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63962-8
Online ISBN: 978-3-319-63962-8
eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering
Chapter history
Latest: Apache Samza. Published: 20 March 2018. DOI: https://doi.org/10.1007/978-3-319-63962-8_197-2
Original: Samza. Published: 19 February 2018. DOI: https://doi.org/10.1007/978-3-319-63962-8_197-1