
IncompFuse: a logical framework for historical information fusion with inaccurate data sources

Journal of Intelligent Information Systems

Abstract

We propose a novel framework, called IncompFuse, that significantly improves the accuracy of existing methods for reconstructing aggregated historical data from inaccurate historical reports. IncompFuse supports efficient data reliability assessment using the incompatibility probability of historical reports. We provide a systematic approach to defining this probability based on properties of the data and relationships between the reports. Our experimental study demonstrates the high utility of the proposed framework. In particular, we were able to detect noisy historical reports with very high accuracy.





Acknowledgements

We wish to thank the reviewers for helpful comments and suggestions.

Author information

Corresponding author

Correspondence to Jiawei Xu.

About this article


Cite this article

Xu, J., Zadorozhny, V. & Grant, J. IncompFuse: a logical framework for historical information fusion with inaccurate data sources. J Intell Inf Syst 54, 463–481 (2020). https://doi.org/10.1007/s10844-019-00569-6

  • DOI: https://doi.org/10.1007/s10844-019-00569-6
