ABSTRACT
Crowdsourcing obtains ideas and content by soliciting contributions from a large group of people, especially an online community. However, data generated by a crowd contains duplicates, so learning the crowd's intention from crowd data requires entity resolution. Previous work focuses on data matching and merging, but remains far from perfect in the crowdsourcing setting. In this study, we propose a generic way of measuring and representing the crowd intention based on crowd data. Our contribution is twofold: (1) we propose a graph structure that represents the crowd intention; (2) we propose an entropy-based measure that evaluates the diversity of the crowd intention.
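The abstract does not state the formula behind the entropy-based measure. As a rough illustration only, one common way to quantify diversity is the Shannon entropy of how crowd responses are distributed over resolved entities. The sketch below assumes that entity resolution has already mapped each response to an entity label; the function name `intention_entropy` and the sample labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def intention_entropy(entity_labels):
    """Shannon entropy (in bits) of the empirical distribution of labels.

    Assumption: each item in entity_labels is the resolved entity of one
    crowd response. This is a generic diversity measure, not necessarily
    the measure proposed in the paper.
    """
    counts = Counter(entity_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = -sum((c / total) * math.log2(c / total) for c in counts.values())
    # Clamp so that a single-entity crowd yields 0.0 rather than -0.0.
    return max(0.0, h)


# Hypothetical responses after duplicates have been resolved to entities.
agreed = ["login", "login", "login", "login"]
split = ["login", "login", "search", "search"]
print(intention_entropy(agreed), intention_entropy(split))
```

Under this sketch, a crowd whose responses all resolve to one entity has zero entropy, and entropy grows as responses spread more evenly over more entities.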
Index Terms
- An Entropy-based Approach to the Crowd Entity Resolution