A framework for analysis of data freshness

Published: 18 June 2004
DOI: 10.1145/1012453.1012464

Abstract

Data freshness has been identified as one of the most important data quality attributes in information systems. This importance increases particularly in the context of distributed systems, composed of a large set of autonomous data sources, where integrating data having different freshness may lead to semantic problems. There are various definitions of data freshness in the literature, depending on the applications where they are used, as well as different metrics to measure them. This paper presents an analysis of these definitions and metrics and proposes a taxonomy based upon the nature of the data, the type of application and the synchronization policies underlying the multi-source information system. We analyze, in terms of the taxonomy, the way freshness is defined and used in several types of systems and we present some open research problems in the field of data freshness evaluation.



    Published In

    IQIS '04: Proceedings of the 2004 international workshop on Information quality in information systems
    June 2004
    81 pages
    ISBN:1581139020
    DOI:10.1145/1012453

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. freshness
    2. multi-source information systems
    3. quality evaluation


