An Architecture and Methods for Big Data Analysis

Ionescu, Bogdan; Ionescu, Dan; Gadea, Cristian; Solomon, Bogdan; Trifan, Mircea

doi:10.1007/978-3-319-18296-4_39

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 356))

Included in the following conference series:

International Workshop Soft Computing Applications

Abstract

Data production has recently grown explosively, reaching an enormous volume (more than 4 ZB in 2013). Sources include sensors that gather climate information, reports on household parameters, social media posts containing digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. Big data still has no more than an intuitive, ad hoc definition, yet it is challenging the IT infrastructure of companies and organizations, forcing them to look for viable data processing solutions that let enterprises deploy a better business strategy. In essence, big data implies collecting, extracting, transforming, transporting, and loading (ETL), as well as classifying, analyzing, interpreting, and visualizing, among many other operations, large amounts of structured, semi-structured, and unstructured data, on the order of a few petabytes per day, with every operation completed within critical time limits. This paper introduces the architecture and the corresponding functions of a platform and tools that implement some of these challenging operations directly, while the others are obtained by composing elementary operations. The architecture is built around a distributed network of virtual servers called “agents,” which can migrate across a network of hardware servers whenever resources become available or are created. A control center decides when to move the agents, based on the availability of resources when they are needed. An example from the telecommunications industry illustrates how the platform is applied to this domain of big data.
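
The abstract does not specify how the control center chooses where to move an agent, so the following is only a minimal sketch of the general idea, under stated assumptions: each hardware server offers a single scalar capacity, each agent has a fixed demand, and an agent is migrated to the server with the most free capacity whenever its current host becomes overloaded. All names here (Server, Agent, ControlCenter, place, rebalance) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    demand: int  # resource units the agent needs (hypothetical unit)


@dataclass
class Server:
    name: str
    capacity: int  # resource units the hardware server offers
    agents: list = field(default_factory=list)

    def free(self) -> int:
        # Remaining capacity; negative means the server is overloaded.
        return self.capacity - sum(a.demand for a in self.agents)


class ControlCenter:
    """Assumed policy: move an agent to the server with the most free capacity."""

    def __init__(self, servers):
        self.servers = list(servers)

    def host_of(self, agent):
        for server in self.servers:
            if agent in server.agents:
                return server
        return None

    def place(self, agent):
        # Place or migrate the agent; return the chosen server, or None
        # if no other server currently has enough free capacity.
        current = self.host_of(agent)
        candidates = [s for s in self.servers
                      if s is not current and s.free() >= agent.demand]
        if not candidates:
            return None
        target = max(candidates, key=Server.free)
        if current is not None:
            current.agents.remove(agent)
        target.agents.append(agent)
        return target

    def rebalance(self):
        # Migrate the largest agent off each overloaded server until the
        # server fits again or that agent cannot be placed anywhere.
        moves = []
        for server in self.servers:
            while server.free() < 0:
                agent = max(server.agents, key=lambda a: a.demand)
                target = self.place(agent)
                if target is None:
                    break
                moves.append((agent.name, server.name, target.name))
        return moves


s1, s2 = Server("s1", 8), Server("s2", 8)
center = ControlCenter([s1, s2])
etl, analytics = Agent("etl", 4), Agent("analytics", 3)
center.place(etl)
center.place(analytics)
s1.capacity = 2  # resources are withdrawn from s1
print(center.rebalance())
```

In this toy run the "etl" agent first lands on s1 and "analytics" on s2; once the capacity of s1 drops, rebalance migrates "etl" to s2. A real platform would presumably weigh several resource types and the cost of migration itself, which this sketch ignores.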

Author information

Authors and Affiliations

NCCT Laboratory, University of Ottawa, 161 Louis Pasteur, Room B-306, Ottawa, ON, K1N 6N5, Canada
Bogdan Ionescu, Dan Ionescu, Cristian Gadea, Bogdan Solomon & Mircea Trifan

Corresponding author

Correspondence to Bogdan Ionescu.

Editor information

Editors and Affiliations

Department of Automation and Applied Informatics, Aurel Vlaicu University of Arad, Arad, Romania
Valentina Emilia Balas
Faculty of Science and Technology, Data Science Institute, Bournemouth University, Poole, United Kingdom
Lakhmi C. Jain
University of Belgrade, Belgrade, Serbia
Branko Kovačević

About this paper

Cite this paper

Ionescu, B., Ionescu, D., Gadea, C., Solomon, B., Trifan, M. (2016). An Architecture and Methods for Big Data Analysis. In: Balas, V.E., Jain, L.C., Kovačević, B. (eds) Soft Computing Applications. SOFA 2014. Advances in Intelligent Systems and Computing, vol 356. Springer, Cham. https://doi.org/10.1007/978-3-319-18296-4_39

DOI: https://doi.org/10.1007/978-3-319-18296-4_39
Published: 03 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18295-7
Online ISBN: 978-3-319-18296-4
eBook Packages: Engineering, Engineering (R0)