Elsevier

Journal of Systems and Software

Volume 73, Issue 3, November–December 2004, Pages 467-480
A simulated annealing approach for multimedia data placement

https://doi.org/10.1016/j.jss.2003.09.020

Abstract

Multimedia applications are characterized by strong timing requirements and constraints, so multimedia data storage is critical to overall system performance and functionality. This paper describes multimedia data representation models that guide data placement towards improving the Quality of Presentation of the considered multimedia applications. The performance of both constructive placement and iterative improvement placement algorithms is evaluated and discussed. Emphasis is placed on placement schemes based on the simulated annealing optimization algorithm. A placement policy based on a self-improving version of the simulated annealing (SISA) algorithm is applied and evaluated. The placement policies are evaluated experimentally on a simulated tertiary storage subsystem. The experiments show that the proposed approach yields considerable gains in seek and service times. The improvements of the proposed SISA approach are in the range of 40% when compared to random placement and in the range of 15–35% when compared to the typical simulated annealing algorithm, depending strongly on the initial configuration and the neighborhood search.

Introduction

The increasing popularity of multimedia applications has been followed by corresponding increases in users' requirements. Multimedia data differ from conventional text data since they are characterized by: (a) large size and (b) timing requirements and constraints. Representation models for multimedia data with respect to their spatio-temporal requirements have been proposed and classified in Bertino and Ferrari (1998), Kwon et al. (1999), Chung (1979), Escobar-Molano et al. (1996), Ghandeharizadeh (1996) and Hirzalla et al. (1995). More specifically, in Bertino and Ferrari (1998) and Kwon et al. (1999) a classification of the representation models based on the notion of time is presented, and two categories are identified: the interval-based and the constraint-based models. A number of different representation approaches have been introduced in Bertino and Ferrari (1998), Kwon et al. (1999) and Chung (1979), and in this case the models are classified into three categories: Graph Models, Petri-Net Models and Object-Oriented Models. This classification mainly focuses on the conceptual data representation, and the model to be chosen depends on the application requirements, the developer's conceptual view and the existing system features. Moreover, in Escobar-Molano et al. (1996) and Ghandeharizadeh (1996) video object representation models are categorized into Stream-Based Models and Structured Models with respect to their physical requirements and from the perspective of the DataBase Management System. Furthermore, a new timeline model, which captures users' interactivity on a set of multimedia documents, is proposed in Hirzalla et al. (1995).

Current software development trends favor the involvement of hypermedia documents in most recent large-scale applications, i.e. a navigational type of access and searching is introduced. The allocation of such hypermedia documents and their corresponding multimedia objects over a communication network, from the perspective of end-user response time, has been discussed in Ahmad et al. (1999). In that context, the proposed model allocates multimedia objects for effective browsing and navigation in distributed environments. Indexing and declustering schemes for object-oriented or interactive navigational queries are proposed in Chen and Sinha (2000) and Han et al. (1999). Analysis and comparisons of the proposed declustering schemes and performance studies have indicated that navigation-based indexing and declustering can be beneficial for the responsiveness and interactivity of multimedia applications.

Due to the large storage requirements of multimedia data, tertiary storage subsystems are an appropriate choice for storing multimedia objects. Tertiary storage media are rather inexpensive and, despite their slow data transfer rates, are used for large-scale storage mainly due to their huge capacities. Recent research efforts in tertiary storage have focused on improving performance towards high Quality of Service (QoS) for multimedia applications. In Prabhakar et al. (1996) the current state of the art in tertiary storage systems is discussed, tertiary system types are classified and extensive performance results are provided. Research work in Chervenak (1994) and Christodoulakis et al. (1997) evaluates storage hierarchies and the usefulness of current tertiary storage systems. In Hillyer and Silberschatz (1996) a serpentine tape drive is studied under model-driven simulation, while in Johnson and Miller (1998) detailed measurements of several tape drives and robotic storage libraries are presented, in order to provide a better understanding of the issues related to integrating tertiary storage into a complete computer system. Issues related to data placement on the tertiary storage subsystem have also been studied. More specifically, in Christodoulakis et al. (1997) different data placement policies on various tape technologies and tape libraries have been implemented, while in Sesardi et al. (1994) optimal arrangements of cartridges and file-partitioning schemes are examined under a carousel type mass storage system. Furthermore, in Vakali and Manolopoulos (1998) data placement schemes are considered under three different models corresponding to three mid-range magnetic tape systems, while in Vakali and Terzi (2002) and Vakali et al. (2001) constructive and iterative improvement placement algorithms have been implemented for the placement of multimedia data on a tape-based storage subsystem.

The data placement problem is an optimization problem characterized by an objective cost function that has to be optimized (i.e. minimized or maximized depending on the application). Since optimal data placement is a particularly complicated problem which cannot be solved analytically, an algorithmic approach is necessary in search of the optimal solution. A wide class of heuristic algorithms for solving similar optimization problems is the class of random search algorithms. These are stochastic processes which perform biased random walks in the space of candidate solutions of a given problem. They use random choice as a tool, but in such a way that the search is guided towards a globally optimal solution. They are efficient methods that are easily applied to any optimization problem and impose no restrictions on the objective function (continuity, differentiability, etc.). Such methods for combinatorial optimization are described in Nahar et al. (1986). The simplest, yet powerful, random search technique is the simulated annealing (SA) algorithm. This method was initially presented in Metropolis et al. (1953) and received its name from the physical process called annealing, which brings a solid to a state of minimum energy. Moreover, SA was proposed by Kirkpatrick et al. (1983) as a general-purpose optimization technique, suitable for solving many complex combinatorial problems. Since then, there has been a vast amount of research on applications of SA to various optimization problems (for example Hua et al., 1994), as well as on theoretical approaches to probabilistic mechanisms (see Aarts and van Laarhoven, 1989; Azencott, 1992).
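The random search idea described above can be illustrated with a minimal sketch of a generic simulated annealing loop. This is not the paper's SA, ISA or SISA implementation: the function names, the geometric cooling schedule, the parameter values, the swap perturbation and the toy cost function are all assumptions chosen for illustration. Only the acceptance rule (always accept improvements, accept a worse candidate with probability exp(-delta/T)) is the standard Metropolis criterion.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor, t_start=10.0, t_end=1e-3,
                        alpha=0.95, moves_per_temp=50, seed=0):
    """Minimize `cost` with a basic simulated annealing loop.

    All parameter names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(moves_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Metropolis criterion: always accept improvements; accept a
            # worse candidate with probability exp(-delta / temperature).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        temperature *= alpha  # geometric cooling (an assumed schedule)
    return best, best_cost


def swap_two(order, rng):
    """Hypothetical perturbation: copy `order` with two positions swapped."""
    i, j = rng.sample(range(len(order)), 2)
    result = list(order)
    result[i], result[j] = result[j], result[i]
    return result


def adjacent_distance(order):
    """Toy cost: total distance between consecutive entries."""
    return sum(abs(a - b) for a, b in zip(order, order[1:]))


start = [3, 0, 4, 1, 5, 2]
best, best_cost = simulated_annealing(start, adjacent_distance, swap_two)
```

The cost and perturbation functions are passed in as arguments, so the same loop can be reused with a different objective or a different neighborhood rule, which is the kind of variation the paper studies when it examines the impact of the perturbation function.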

The SA-based algorithms in this paper are customized for our data placement problem by adapting the approach presented in Angelis et al. (2001) for the solution of an optimization problem in statistical planning. The present work extends the authors' previous research efforts, as presented in Vakali and Terzi (2002), Vakali et al. (2001) and Angelis et al. (2001), and the paper's key contributions are summarized in the following points:

  • The proposed multimedia data representation model captures both the users' navigation (within an application) and the timing constraints (within each multimedia object) towards efficient multimedia data placement under a tertiary storage topology.

  • The problem of data placement is treated as an optimization problem, where an extension of SA is employed. In addition to iterative improvement placement policies, constructive placement algorithms are also implemented for comparison purposes.

  • A self-improving process of the simulated annealing is introduced for determining multimedia data placement within the tertiary storage subsystem.

  • The impact of the initial placement and of the perturbation function (i.e. the rules governing the search) on the overall performance of the SA-based algorithms is studied and the corresponding experimental results are discussed.


The remainder of the paper is organized as follows: in Section 2 the proposed representation model is introduced. The considered storage system is presented in Section 3 and the criteria for multimedia objects allocation are emphasized. The placement policies are discussed extensively in Section 4, with particular emphasis on the simulated-annealing-based techniques. The simulation model and its basic components are analyzed in Section 5, while in Section 6 experimentation is described and results are presented. Finally, future work topics are discussed in Section 7. Notice that this work does not include an extensive description of the data model and the intuition of the algorithmic techniques behind it, since these are defined and discussed in the authors' earlier work (Vakali and Terzi, 2002; Vakali et al., 2001; Angelis et al., 2001).

Multimedia data representation

This part summarizes the Graph-Tree multimedia data representation structure that has been proposed and used by the authors in their earlier works (Vakali and Terzi, 2002; Vakali and Terzi, 2001a; Vakali et al., 2001), which deal with the problem of effective multimedia data placement on storage devices based on users' behavior when they navigate a multimedia application. The proposed representation model has two main goals: (1) to capture the users' access patterns and (2) to

Criteria for multimedia objects allocation

Based on the adopted multimedia data representation, it is important to notice that at the external level, we consider the so-called multimedia nodes, which are the actual multimedia objects. Then, the proposed browsing graph can be represented by a transition matrix P associated with the graph G, an (M×M) matrix of access or transition probabilities, where by P(i,j) = p_ij (i, j ∈ {1, 2, …, M}) we denote the probability of accessing node j from node i in a single step. In order to capture the frequency
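As a hedged illustration of the transition matrix P described above, the following sketch builds a small row-stochastic matrix for a hypothetical three-node browsing graph and estimates long-run visit frequencies by power iteration. The matrix values, the function names and the use of long-run frequencies as a popularity measure are assumptions for illustration only; this excerpt does not show how the paper derives its frequency measure.

```python
def is_row_stochastic(matrix, tol=1e-9):
    """Check that every row holds non-negative entries summing to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


def step(dist, matrix):
    """One step of the chain: new_dist[j] = sum_i dist[i] * P(i, j)."""
    m = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(m)) for j in range(m)]


def visit_frequencies(matrix, steps=200):
    """Approximate long-run visit frequencies by repeated multiplication."""
    m = len(matrix)
    dist = [1.0 / m] * m  # start from a uniform distribution
    for _ in range(steps):
        dist = step(dist, matrix)
    return dist


# Hypothetical transition probabilities p_ij for M = 3 multimedia nodes.
P = [
    [0.0, 0.7, 0.3],
    [0.5, 0.0, 0.5],
    [0.4, 0.6, 0.0],
]
freqs = visit_frequencies(P)
```

Each row of P sums to 1 because, from node i, the user must move to some node j in a single step.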

Multimedia data placement algorithms

The problem of proposing an effective placement of multimedia data on a tertiary storage subsystem is “translated” to the problem of proposing a placement of N physical objects onto T tapes (each tape has Z zones) so that the imposed requests are serviced in a way that minimizes the data seek and transfer time. Therefore, the data placement problem is an optimization problem. Various approaches have been proposed in this research direction. In general, the data placement algorithms fall in

The simulation model

A simulation model has been developed so that the proposed data placement policies can be evaluated under an appropriately supported tertiary storage topology. The request process is also simulated by supporting bursts of client requests arriving in the system. Our simulation model is depicted in Fig. 4 and is mainly based on four modules:

  • The Data Representation Module: This module mainly represents the multimedia application environment to which the users have

Experimentation: Results

Various experiments under different workloads and data placement policies have been carried out. The experimentation focuses on the performance evaluation of the proposed placement algorithms, with respect to certain performance objective functions (such as seek and service times). Organ-pipe has been chosen as the most indicative algorithm in the class of constructive placement algorithms since it has been proven to outperform other algorithms. All of the SA-based algorithms (i.e. SA, ISA and
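For readers unfamiliar with the organ-pipe arrangement mentioned above, the sketch below shows the classic idea: the most popular object goes in the middle and the remaining objects are placed alternately to its right and left in decreasing order of popularity. The function name, the input format and the popularity values are hypothetical, and the sketch ignores tapes and zones, so it should not be read as the paper's constructive placement implementation.

```python
def organ_pipe(popularity):
    """Return object ids ordered in an organ-pipe arrangement.

    `popularity` maps object id -> access probability (hypothetical input).
    """
    ranked = sorted(popularity, key=popularity.get, reverse=True)
    left, right = [], []
    for rank, obj in enumerate(ranked):
        # Alternate sides so popularity decreases away from the centre.
        (right if rank % 2 == 0 else left).append(obj)
    return left[::-1] + right


# Hypothetical access probabilities for five physical objects.
popularity = {"a": 0.40, "b": 0.25, "c": 0.20, "d": 0.10, "e": 0.05}
arrangement = organ_pipe(popularity)
# arrangement == ["d", "b", "a", "c", "e"]
```

Read left to right, the popularities are 0.10, 0.25, 0.40, 0.20, 0.05, which gives the rising and falling profile the arrangement is named after.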

Future perspectives

This paper proposes different multimedia data placement algorithms based on the typical SA algorithm. The multimedia data are represented by an effective two-level model and the data placement algorithms are tailored for a tertiary storage topology. A self-improving version of the SA algorithm is proposed in the context of the data placement problem. The placement algorithms are guided by a popularity-based criterion evaluated on the multimedia application's physical objects. Experimentation was

Acknowledgements

The authors thank the referees who have contributed to the improvement of the paper's quality, organization and readability. Their comments were valuable and have considerably improved the paper's overall presentation.

Evimaria D. Terzi graduated from the Department of Informatics of Aristotle University of Thessaloniki (Greece) in 2000, and then obtained her M.Sc. from the Dept. of Computer Science of Purdue University (USA) in May 2002. Since August 2002, she has been with the Dept. of Computer Science of the University of Helsinki (Finland), studying towards her Ph.D. Her current research interests include algorithms and computational complexity, and data mining and its applications, mainly to genomic data.

References (31)

  • L. Angelis et al. Optimal exact experimental designs with autocorrelated errors through a simulated annealing algorithm. Computational Statistics and Data Analysis (2001)
  • Y.-M. Kwon et al. Modeling spatio-temporal constraints for multimedia objects. Knowledge and Data Engineering (1999)
  • E.H.L. Aarts et al. Simulated annealing: an introduction. Statistica Neerlandica (1989)
  • I. Ahmad et al. Response time driven multimedia data objects allocation for browsing documents in distributed environments. IEEE Transactions on Knowledge and Data Engineering (1999)
  • R. Azencott. Simulated Annealing. Parallelization Techniques (1992)
  • E. Bertino et al. Temporal synchronization models for multimedia data. IEEE Transactions on Knowledge and Data Engineering (1998)
  • C.-M. Chen et al. Analysis and comparison of declustering schemes for interactive navigation queries. IEEE Transactions on Knowledge and Data Engineering (2000)
  • A.L. Chervenak. Tertiary Storage: An Evaluation of New Applications. PhD Dissertation, University of California... (1994)
  • S. Christodoulakis, P. Triantafillou, F. Zioga. Principles of optimally placing data in tertiary storage... (1997)
  • S.M. Chung. Computers and Intractability: A Guide to the Theory of NP-Completeness (1979)
  • M.L. Escobar-Molano et al. An optimal resource scheduler for continuous display of structured video objects. IEEE Transactions on Knowledge and Data Engineering (1996)
  • D.J. Gemmell et al. Multimedia storage servers: a tutorial. IEEE Computer (1995)
  • S. Ghandeharizadeh. Stream-based versus structured video objects: issues, solutions, and challenges
  • J. Han et al. Join index hierarchy: an indexing structure for efficient navigation in object-oriented databases. IEEE Transactions on Knowledge and Data Engineering (1999)
  • B.K. Hillyer, A. Silberschatz. On the modeling and performance characteristics of a serpentine tape. In:... (1996)

Athena I. Vakali received a B.Sc. degree in Mathematics from the Aristotle University of Thessaloniki, Greece, an M.Sc. degree in Computer Science from Purdue University, USA (with a Fulbright scholarship) and a Ph.D. degree in Computer Science from the Department of Informatics at the Aristotle University of Thessaloniki. Since 1997, she has been a faculty member of the Department of Informatics, Aristotle University of Thessaloniki, Greece (currently she is an Assistant Professor). Her research interests include design, performance and analysis of storage subsystems and data placement schemes for multimedia and Web-based information. She is working on Web data management and has focused on XML data storage issues. She has published several papers in international journals and conferences. Her research interests include storage subsystem performance, XML and multimedia data management, and data placement schemes.

Lefteris Angelis received his B.Sc. degree and Ph.D. diploma in Mathematics from Aristotle University of Thessaloniki (AUTh). He currently works as a Lecturer at the Department of Informatics of AUTh. His research interests include statistical methods with applications to information systems, simulation methods and algorithms for optimization problems.