Abstract
In the era of big data, people work with data constantly, and data collection is the first step and the foundation of many downstream applications. We observe that data collection is often entity-oriented, i.e., people usually collect data related to a specific entity. In most cases, they do this by manually issuing queries and filtering the results in search engines or news applications, which is not very efficient or effective. In this paper, we design process rules and integrate artificial intelligence algorithms to help people collect the target data for a specific entity efficiently and effectively. Concretely, we propose an active workflow method composed of four processes: task modeling for data collection, Internet data collection, crowdsourcing data collection, and multi-source data aggregation.
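The four processes named above can be read as stages of a pipeline. The following is only a minimal sketch under stated assumptions, not the method described in the paper: every name in it (`CollectionTask`, `model_task`, `collect_from_internet`, `collect_from_crowd`, `aggregate`, `run_workflow`) is hypothetical, the Internet and crowdsourcing sources are stood in for by in-memory lists of strings, and plain keyword matching takes the place of the artificial intelligence algorithms the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class CollectionTask:
    """Hypothetical task model: the target entity plus related keywords."""
    entity: str
    keywords: List[str] = field(default_factory=list)

    def matches(self, text: str) -> bool:
        # A record counts as relevant if it mentions the entity or a keyword.
        lowered = text.lower()
        terms = [self.entity] + self.keywords
        return any(term.lower() in lowered for term in terms)


def model_task(entity: str, keywords: Iterable[str]) -> CollectionTask:
    """Process 1 (assumed form): describe what should be collected."""
    return CollectionTask(entity=entity, keywords=list(keywords))


def collect_from_internet(task: CollectionTask, pages: Iterable[str]) -> List[str]:
    """Process 2 (stand-in): filter an in-memory corpus instead of crawling."""
    return [page for page in pages if task.matches(page)]


def collect_from_crowd(task: CollectionTask, submissions: Iterable[str]) -> List[str]:
    """Process 3 (stand-in): keep crowd submissions that match the task."""
    return [item for item in submissions if task.matches(item)]


def aggregate(*sources: Iterable[str]) -> List[str]:
    """Process 4 (assumed form): merge sources, dropping case-insensitive duplicates."""
    seen = set()
    merged = []
    for source in sources:
        for record in source:
            cleaned = record.strip()
            key = cleaned.lower()
            if key not in seen:
                seen.add(key)
                merged.append(cleaned)
    return merged


def run_workflow(entity, keywords, pages, submissions):
    """Chain the four stages in the order the abstract lists them."""
    task = model_task(entity, keywords)
    web_records = collect_from_internet(task, pages)
    crowd_records = collect_from_crowd(task, submissions)
    return aggregate(web_records, crowd_records)
```

Calling `run_workflow` with an entity name, a few keywords, and two lists of candidate strings returns the matching records from both sources with duplicates removed. Keeping each stage as a separate function means a real crawler, a crowdsourcing platform client, or a learned relevance model could replace any stand-in without touching the others.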
Acknowledgment
This work was supported in part by the National Key Research and Development Program of China (No. 2017YFC0820402), the Intelligent Manufacturing Comprehensive Standardization and New Pattern Application Project of Ministry of Industry and Information Technology (Experimental validation of key technical standards for trusted services in industrial Internet), and the National Natural Science Foundation of China (No. 61373023).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Guo, G. (2018). An Active Workflow Method for Entity-Oriented Data Collection. In: Woo, C., Lu, J., Li, Z., Ling, T., Li, G., Lee, M. (eds) Advances in Conceptual Modeling. ER 2018. Lecture Notes in Computer Science, vol 11158. Springer, Cham. https://doi.org/10.1007/978-3-030-01391-2_15
DOI: https://doi.org/10.1007/978-3-030-01391-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01390-5
Online ISBN: 978-3-030-01391-2
eBook Packages: Computer Science, Computer Science (R0)