Abstract
We propose a novel framework, IncompFuse, that significantly improves the accuracy of existing methods for reconstructing aggregated historical data from inaccurate historical reports. IncompFuse supports efficient data reliability assessment based on the incompatibility probability of historical reports. We provide a systematic approach to defining this probability from properties of the data and the relationships between the reports. Our experimental study demonstrates the high utility of the proposed framework; in particular, it detected noisy historical reports with very high accuracy.

Acknowledgements
We wish to thank the reviewers for helpful comments and suggestions.
Cite this article
Xu, J., Zadorozhny, V. & Grant, J. IncompFuse: a logical framework for historical information fusion with inaccurate data sources. J Intell Inf Syst 54, 463–481 (2020). https://doi.org/10.1007/s10844-019-00569-6
DOI: https://doi.org/10.1007/s10844-019-00569-6