Efficient multimedia query-by-content from mobile devices

https://doi.org/10.1016/j.compeleceng.2008.11.016

Abstract

The phenomenal growth in multimedia content has led to the development of a variety of multimedia description schemes, which can be used to facilitate querying of multimedia databases. In the increasingly mobile environment of today, multimedia query formats need to be applicable to mobile devices, which, compared to desktop PCs, have specific limitations such as small screen size, limited memory and processing power, and high bandwidth cost. As a potential solution to multimedia querying in mobile environments, this paper introduces two concepts: query streaming and its application as targeted browsing. Targeted browsing is a technique for multimedia query-by-content designed especially for mobile devices, while query streaming is a method for continually updating a query by sending additional terms to an existing query. This paper describes an implementation of query streaming that combines the Multimedia Query Format (MQF) (a standard communication language for querying multimedia databases) with Fragment Request Units (FRU) and Fragment Update Units (FUU) (which provide a standard way of randomly accessing fragments of XML documents). For efficient compression of the multimedia query XML files, binary compression using MPEG BiM is proposed, and a number of use case scenarios are examined. Results show that the proposed solution provides a significant reduction in the file size required to perform multimedia querying.

Introduction

The area of multimedia query has received considerable attention lately as a result of the phenomenal growth of user-created multimedia content. This explosion in user-generated content is visible in the popularity of multimedia sharing sites such as YouTube [1] and Flickr [2]. Searching those multimedia items is a non-trivial task that requires metadata to be attached to the multimedia items in question. Examples are the low-level descriptions of MPEG-7 [3] and higher-level, semantic metadata such as Dublin Core [4]. Attaching the correct description to the multimedia items remains a major unsolved problem, due to the volume of source data to be described. Requiring a user to sift through all the data he/she created is non-optimal due to the intensive nature of the task and the subjectivity involved. Some automatic and semi-automatic description generation systems have been developed (such as IBM’s MARVEL [5]), but these remain experimental.

While research into multimedia databases and querying has received considerable attention in the literature, the focus of this paper is the more specific problem of developing a standard communication language between clients and database solutions such as MARVEL, Google Image and others. This specific problem has received much less attention in the literature. Previous related work in this area includes the authors’ Multimedia Query Format (MQF) [6] as well as Multimedia Retrieval Markup Language (MRML) [7], Multimedia Object Query Language (MOQL) [8], MQL [9], Semantic and Cognition-based Image Retrieval (SEMCOG) [10], which implements Cognition and Semantics-based Query Language (CSQL) as its query format, SQL Multimedia (SQL/MM) [11] and others. Notably, many of these solutions rely on non-standard description schemes and are based on, or are extensions of, Structured Query Language (SQL), with the exception of MQF and MRML, which use XML. SQL is the standard querying language for text-based databases, and most of these multimedia query languages were designed before the standardization of MPEG-7, hence their derivation from SQL. Although SQL is sufficient for textual databases with a known structure, it is insufficient for general multimedia querying due to its textual nature; e.g. it is very difficult to implement query-by-example using SQL, since SQL cannot embed multimedia data as part of the query. Another limitation is that although MPEG-7 is the standard multimedia description scheme, it is not the only one (Dublin Core is another, for example). The different data structures used to describe similar features (e.g. author information) in MPEG-7 compared with Dublin Core would require any SQL-based solution (e.g. MOQL, MQL, SQL/MM) to account for two or more different description schemes. Hence, these existing approaches do not provide a solution for communicating multimedia queries that is universal across multiple description schemes.

Of the existing solutions described above, MRML and the authors’ MQF are the only ones that are not based on SQL and are specifically designed for multimedia querying. Neither of these approaches considers the restrictions imposed by mobile devices, which will be increasingly used for multimedia queries in today’s increasingly mobile environment. Compared to desktop PCs, mobile devices have specific limitations such as small screen size, limited memory and processing power, and high bandwidth cost. Hence, this paper proposes application-level techniques for querying based on MQF that address bandwidth and physical limitations.

The inherent problem of performing multimedia query-by-content on mobile devices with small screens is that there is no practical way to browse the result set in its entirety; this contrasts with PCs with their increasing display sizes. As a potential solution, this paper introduces two concepts: query streaming and its application as targeted browsing. The combination of these two approaches enables the implementation of multimedia query-by-content on mobile devices as shown in Fig. 1. In the scenario illustrated in Fig. 1, a user on a mobile device queries a multimedia database via a meta-search server. This server acts as an intelligent gateway to a myriad of potential multimedia query services which the server determines as being relevant to the query. Thus the server farm in Fig. 1 may be pre-determined or may be chosen on-the-fly on the basis of the query content. The advantage of the meta-search server is that the client can transmit minimal (and hence low bandwidth) messages to one server which can initiate a much broader set of searches and then compile responses into a single efficient reply to the client. From the user perspective, we propose targeted browsing.

Targeted browsing is a method to perform multimedia query-by-content created especially for mobile devices or any device with limited capabilities. Instead of returning the full result set and requiring the user to decide manually which result is desired, targeted browsing returns a set of metadata associated with the result set, and lets the user choose the desired property of the final result set based on the metadata. This has the effect of minimizing the final result set that is sent to the client. To enable targeted browsing, a query that is sent to the server must be continually updated to account for new information supplied by the user interactively. This requires query streaming:

Query streaming is a method for continually updating a query by sending additional terms to an existing query. This is similar to searching within a result set, with an important difference: no new query is created; the original query is simply updated to include new filtering terms. This is advantageous if the query is performed using a meta-search engine, since there is no need to re-query all of the databases involved when the query is updated, thus saving time and bandwidth.
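The interaction between targeted browsing and query streaming described above can be illustrated with a minimal Python sketch. It is not the MQF or FRU/FUU implementation from the paper: the `StreamedQuery` class, the `initial_query` helper, the dictionary-based items and the two sample databases are all hypothetical names and data chosen for illustration. The sketch assumes that the meta-search server caches the merged result set, replies with a metadata summary rather than the results themselves, and narrows the cached set whenever the client streams one more term.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class StreamedQuery:
    """Hypothetical server-side session: results are fetched once, then narrowed."""
    results: list                               # cached, merged result set
    terms: list = field(default_factory=list)   # filter terms streamed so far

    def add_term(self, key, value):
        """Query streaming: append one term to the existing query.

        Only the cached result set is filtered; no back-end database is re-queried.
        """
        self.terms.append((key, value))
        self.results = [r for r in self.results if r.get(key) == value]

    def summary(self, key):
        """Targeted browsing: describe the result set by metadata counts."""
        return Counter(r[key] for r in self.results if key in r)


def initial_query(databases, keyword):
    """Query every back-end database once and merge the matches (illustrative only)."""
    merged = [item for db in databases for item in db if keyword in item["tags"]]
    return StreamedQuery(results=merged)


# Two made-up back-end databases behind the meta-search server.
db_a = [
    {"id": 1, "tags": ["beach"], "type": "image", "year": 2007},
    {"id": 2, "tags": ["beach"], "type": "video", "year": 2008},
]
db_b = [
    {"id": 3, "tags": ["beach"], "type": "image", "year": 2008},
    {"id": 4, "tags": ["city"], "type": "image", "year": 2008},
]

session = initial_query([db_a, db_b], "beach")
type_counts = session.summary("type")   # small metadata reply sent to the client
session.add_term("type", "image")       # the user picks a property; one term is sent
session.add_term("year", 2008)          # a further refinement of the same query
final_ids = [r["id"] for r in session.results]
```

In this sketch the client receives only the counts per media type before deciding how to refine the query, and each refinement costs one short message; the full result set is transmitted only once it has been narrowed.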

Two key technologies to enable query streaming and targeted browsing are the Multimedia Query Format [6] and Fragment Request Unit/Fragment Update Unit (FRU/FUU) [12]. This paper will explore the application of both technologies for query streaming to enable targeted browsing.

Section 2 of this paper will describe the concept of targeted browsing, including the motivation, requirements and implementation. Section 3 will outline MQF, highlighting the key features and providing example implementations. In Section 4, a method for query streaming to enable targeted browsing using a combination of MQF and FRU/FUU will be described. Typical use case scenarios will be presented in Section 5 with conclusions provided in Section 6.

Motivation

Even with advances in querying methods such as query-by-example, the user is still required to search the result set for a representative image to serve as an example for the next batch of results. The traditional image search and query-by-example methods work if the hardware is capable of displaying all the results and the user is willing to take the time to page through the result set. In mobile devices, even if the user is willing to take the time, the hardware is incapable of

Design overview

The design and overall concepts of MQF were presented in detail in [6]. In general terms, MQF was designed to be a non-restricting format for multimedia query purposes. The key features of MQF are the use of Reverse Polish Notation (RPN), the concept of query levels, and meta-search capability.
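To make the role of Reverse Polish Notation concrete, the following Python sketch evaluates a postfix query with a stack. The token format is an assumption made for illustration and is not MQF syntax: operands are hypothetical `(field, value)` pairs, operators are the strings `"AND"`, `"OR"` and `"NOT"`, and `evaluate_rpn` and the sample items are invented names and data.

```python
def evaluate_rpn(tokens, items):
    """Evaluate a postfix (RPN) query over `items`; return the set of matching ids."""
    all_ids = {item["id"] for item in items}
    stack = []
    for token in tokens:
        if token == "AND":
            right, left = stack.pop(), stack.pop()
            stack.append(left & right)
        elif token == "OR":
            right, left = stack.pop(), stack.pop()
            stack.append(left | right)
        elif token == "NOT":
            stack.append(all_ids - stack.pop())
        else:
            # Operand: a (field, value) pair matched by simple equality.
            key, value = token
            stack.append({item["id"] for item in items if item.get(key) == value})
    if len(stack) != 1:
        raise ValueError("malformed RPN query")
    return stack[0]


items = [
    {"id": 1, "type": "image", "colour": "red"},
    {"id": 2, "type": "image", "colour": "blue"},
    {"id": 3, "type": "video", "colour": "red"},
]

# "type is image AND colour is red" in postfix order.
query = [("type", "image"), ("colour", "red"), "AND"]
first = evaluate_rpn(query, items)

# Appending an operand and an operator extends the query without rewriting it.
extended = query + [("type", "video"), "OR"]
widened = evaluate_rpn(extended, items)
```

Because a postfix expression stays well formed when an operand followed by a binary operator is appended, an existing query can be extended by sending only the new tokens, which is one plausible reason RPN suits incremental query updates.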

One of the key design goals when developing MQF was to ensure that only a minimum set of features is required for compliance with the format. That is, MQF should act as a container for queries, and nothing more; it

Modification to MQF to allow query streaming

To achieve “query streaming” from mobile devices, a new query format with built-in RPN capabilities is required. Existing solutions based on SQL [18], such as SQL/MM [11], cannot be used for this purpose because SQL was designed for textual databases with a rigid, known structure, and each SQL statement must be a complete query. Hence, using SQL, a new query would need to be formulated for each update, and the query cannot be streamed to the server. In contrast, to achieve effective query streaming, the

Targeted browsing use case scenario

In the scenarios presented in Sections 5.1 and 5.2, two sets of results are presented: one using a plain text-based XML format and one using binary-compressed BiM (Binary MPEG format for XML), which is part of the MPEG-B standard ISO/IEC 23001-1 [17].

Multimedia databases and MQF

Although MQF provides a protocol for communication between a client looking for a certain multimedia item and a server that contains the item in question, the MQF specification does not govern how the multimedia data should be stored and structured on the server. Due to the interest in multimedia data today, most modern database servers can now store binary data in addition to text. From an academic point of view, [21] and [22] provided descriptions on how multimedia databases can

Conclusions

A new querying method for content-based multimedia queries has been presented. The main limitation of currently available search technologies in general is the non-universal design of the output results: the output of a search engine is optimized for PC-based applications and does not translate well to a mobile environment with limited screen size. Another notable limitation is the lack of ability for search engines to describe the result set in a meaningful manner to

Kevin Adistambha Kevin is an Electrical Engineering PhD student in Whisper lab, School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, Australia. His research interests include multimedia query-by-example, motion capture and its databases. He received his Masters by Research in Electrical Engineering in 2006 from the University of Wollongong.

References (22)

  • YouTube: broadcast yourself. Available from:...
  • Flickr: photo sharing. Available from:...
  • ISO/IEC. MPEG-7 part 5: multimedia description schemes (MDS). ISO/IEC 15938-5:2001...
  • Dublin core metadata initiative. Available from:...
  • IBM research multimedia analysis and retrieval system. Available from:...
  • Adistambha K, Ritz CH, Burnett IS. MQF: an XML based multimedia query format. Presented at the international conference...
  • Müller W, Müller H, Marchand-Maillet S, Pun T, Squire DM, Pečenović Z, Giess C, Vries APD. MRML: a communication...
  • Li JZ, Ozsu MT, Szafron D, Oria V. MOQL: a multimedia object query language. In: Proceedings of the 3rd international...
  • Shu-Chen Kau, Tseng JCR. MQL-a query language for multimedia database. Multimedia communications, 1994. In: 5th IEEE...
  • Wen-Syan Li, Selcuk Candan K. SEMCOG: a hybrid object-based image database system and its modeling, language, and query...
  • ISO/IEC. SQL multimedia and application packages. ISO/IEC JTC1/SC32/WG4,...

    Stephen Davis Stephen is a research fellow at the University of Wollongong, funded by the Smart Services CRC. His current research interests are multimedia delivery, social networking & collaboration, multimedia semantics and web 2.0. He has a PhD in Computer Engineering from the University of Wollongong.

    Christian Ritz Christian is a lecturer within the School of Electrical, Computer and Telecommunications Engineering at the University of Wollongong. His research interests include single and multichannel speech signal processing, spatial audio coding and multimedia content analysis, annotation and delivery. He received his PhD in Electrical Engineering from the University of Wollongong in 2003.

    Ian Burnett Ian Burnett received the B.Sc., M.Eng., and Ph.D. degrees in electrical and electronic engineering from the University of Bath, Bath, U.K. He is a Professor and Head of the School of Electrical and Computer Engineering at RMIT University, VIC, Australia. His current research interests are in multimedia processing and delivery, speech and audio coding, 3-D spatial audio, and audio separation. Prof. Burnett has been an active participant in MPEG and MPEG-21 in recent years, notably as Chair of the Multimedia Description Schemes subgroup and Australian Head of Delegation.
