Playing LEGO with JSON: Probabilistic joins over attribute-value fragments

Abstract:

Information about an entity can hardly be assumed to be given in a single document created at a single point in time. Rather, it is reasonable to assume that information is spread over multiple documents and created or enriched over time, for instance through crowdsourced facts or facts mined from social network streams, one after the other. In this work, we consider the problem of assembling entity-centric information from input consisting of small pieces of information, provided in the form of JSON document snippets. The final goal is to create a document that describes an entity, possibly in full, by putting related fragments together. What makes this task challenging is the lack of evidence indicating which fragments belong together and, hence, can be safely combined. We focus on deciding this question using statistics of the already seen fragments to judge whether a join is reasonable. We evaluate our approach using real-world datasets and show that it can achieve high precision and recall.
Date of Conference: 16-20 May 2016
Date Added to IEEE Xplore: 23 June 2016
Electronic ISBN: 978-1-5090-2109-3
Conference Location: Helsinki, Finland
