Modeling Adaptive Data Analysis Pipelines for Crowd-Enhanced Processes

Cappiello, Cinzia; Pernici, Barbara; Vitali, Monica

doi:10.1007/978-3-030-89022-3_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 13011)

Included in the following conference series:

International Conference on Conceptual Modeling

Abstract

Information from social media can be leveraged by social scientists to support effective decision making. However, such data sources are often characterised by high volumes and noisy information, so data analysis should always be preceded by a data preparation phase. Designing and testing data preparation pipelines requires considering requirements on the cost, time, and quality of data extraction. In this work, we propose a methodology for modeling crowd-enhanced data analysis pipelines using a goal-oriented approach. The pipelines include both automatic and human-related tasks, and the methodology suggests the kind of components to include, their order, and their parameters, while balancing the trade-off between cost, time, and quality of the results.

Notes

1. https://medium.com/ai2-blog/crowdsourcing-pricing-ethics-and-best-practices-8487fd5c9872

Acknowledgements

This work was funded by the European Commission H2020 Project Crowd4SDG, #872944.

Author information

Authors and Affiliations

Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Milan, Italy
Cinzia Cappiello, Barbara Pernici & Monica Vitali

Corresponding author

Correspondence to Monica Vitali.

Editor information

Editors and Affiliations

School of Computing and IT, University of Wollongong, Wollongong, NSW, Australia
Aditya Ghose
Department of Computer Science and Engineering, Chalmers | University of Gothenburg, Gothenburg, Sweden
Jennifer Horkoff
Universidade Federal do Espírito Santo, Vitória, Brazil
Vítor E. Silva Souza
Faculty of Business Administration, Memorial University of Newfoundland, St. John's, NL, Canada
Jeffrey Parsons
Faculty of Business Administration, Memorial University of Newfoundland, St. John's, NL, Canada
Joerg Evermann

Cite this paper

Cappiello, C., Pernici, B., Vitali, M. (2021). Modeling Adaptive Data Analysis Pipelines for Crowd-Enhanced Processes. In: Ghose, A., Horkoff, J., Silva Souza, V.E., Parsons, J., Evermann, J. (eds) Conceptual Modeling. ER 2021. Lecture Notes in Computer Science, vol 13011. Springer, Cham. https://doi.org/10.1007/978-3-030-89022-3_3

DOI: https://doi.org/10.1007/978-3-030-89022-3_3
Published: 16 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-89021-6
Online ISBN: 978-3-030-89022-3
eBook Packages: Computer Science (R0)
