Volume 1 / Issue 1 / Abstract


MPEG-7 meets Multimedia Database Systems

Mario Döller
Department of Distributed Information Systems
University Passau, Germany
Mario.Doeller@uni-passau.de

Abstract: Currently used database management systems do not fulfill the requirements of multimedia in querying, indexing and content modeling. Thus, most database providers offer extenders for multimedia data. These extensions, however, provide only limited semantic modeling and rely on simple index structures which do not meet the whole nature of multimedia. In this context, this paper points out approaches for the integration of MPEG-7, a standard for describing multimedia content, into a database management system and its impact on core parts of a database such as the data model, access methods, query language and query optimization.

Key Words: Multimedia Databases, MPEG-7

Category: H.3.3, H.2

1 Introduction

Internet multimedia applications such as video-on-demand, video conferencing and multimedia retrieval services let us experience multimedia on everyone's desktop and on communication devices such as personal digital assistants (PDAs) or mobile phones.

Strongly related to this is the enhancement of multimedia communication with meta-data. Meta-data are descriptive data about multimedia content. These could be semantic descriptions (for instance, which persons appear in a video clip), information on color characteristics (e.g., the dominant color in an image), or information on how a video might be adapted if resources become scarce.

In this context, the Moving Picture Experts Group (MPEG) introduced in 2002 a new meta-data standard, called MPEG-7 [Martinez, 2003], for describing high- and low-level features of multimedia data.

Imagine that you are listening to a song on the radio and cannot remember its title. Using your mobile phone, you record 10 s of the song, submit the recording to an audio recognition service based on MPEG-7 Audio descriptors, extraction mechanisms and a multimedia database, and promptly receive a positive content identification via SMS. To enable this scenario, one needs methods for extracting low-level features (in this example, the audio signals) from the unknown audio file, a methodology for describing the multimedia meta-data (e.g., MPEG-7), and an audio retrieval system that provides means for recognizing similar audio signatures. In general, such retrieval systems are closely coupled with database systems. Therefore, Multimedia Database Management Systems (MMDBMSs) are the technology for content management, storage and streaming [Kosch, 2003] of multimedia.

This paper deals with the integration of MPEG-7 as a meta-data standard for multimedia into a database management system and shows how MPEG-7 and multimedia database systems can benefit from each other. The identified open issues, problems and solutions are based on experience the author gained through his participation in the CODAC project (see footnote 1), which, among other goals, targeted the creation of an MPEG-7 Multimedia Database (MPEG-7 MMDB).

The remainder of this paper is organized as follows: Section 2 covers related work in the area of multimedia databases and their core parts such as query languages. Then, Section 3 describes the requirements a modern database management system must meet in order to enable the integration of MPEG-7 as a data model. The integration of MPEG-7 and its impact on core parts of a database is discussed in Section 4 and its subsections. Finally, Section 5 concludes the paper.

2 Related Work

Research and development in the domain of multimedia databases can mainly be divided into two directions. First, because most existing Database Management Systems (DBMS) were not designed for multimedia, database vendors provide extenders that enable fundamental processing of multimedia data (e.g., Oracle interMedia [Oracle, 2003] and IBM Informix DataBlades [IBM, 2001]). For instance, Oracle interMedia provides basic image storage and content-based retrieval (CBR) functionality through its ORDImage data type. The underlying CBR functionality concentrates on low-level features (color, texture, shape) without the possibility of semantic retrieval. Furthermore, no means for video or audio CBR are available.

The second research direction concentrates on special-purpose MMDBMSs which are especially tuned for multimedia data (e.g., DISIMA [Oria et al., 2004] and MARS [Mehrotra et al., 1997]). In general, these systems provide individual multimedia data models [Wen et al., 2003], corresponding query languages (e.g., MOQL [Li et al., 1997]) and respective approaches for any kind of content-based retrieval [Belongie et al., 1998]. Nevertheless, their drawback is that they are not designed to query multimedia and traditional data at the same time, nor are efficient access structures available.

In contrast, there are efforts to combine the MPEG-7 standard with modern database management systems [Kosch, 2002]. Because MPEG-7 relies on XML Schema, XML solutions for databases [Murthy and Banerjee, 2003] and native XML solutions [Staken, 2002] have to be considered as well. The authors of [Westermann and Klas, 2003] presented an analysis of XML database solutions for the management of MPEG-7 descriptions.


1 http://www.fmi.uni-passau.de/lehrstuehle/kosch/research/codac.php

3 Database Requirements for Multimedia Support

In general, one can classify the requirements into the following three sub-areas: structural, semantical and syntactical.

As described above, common DBMS have several drawbacks in handling multimedia data [Santini and Jain, 1997, Grosky, 1997]. As database vendors cannot support all needs and individual conveniences of different domains (e.g., requirements for multimedia systems, geographical information systems, etc.), they build their databases according to a modular architecture and make their management systems extensible. Figure 1 shows an example architecture (taken from Oracle 9i and 10g, respectively) that most modern databases support. Such an architecture provides means for extending the basic database services such as the type system (e.g., for the integration of a new data model), query processing, optimization and indexing.

Figure 1: Necessary extensibility

These extensibility services cover the basic structural requirements by enabling the enhancement of core parts (indexing facility, query language, query optimization, etc.) of a database.

Semantical requirements concern data modeling and query facilities. A query language needs to support low-level (content-based) and high-level (semantic) query operations. In addition, spatial and temporal operations (or a combination of them) are required. An example of a complex query might be: "I am searching for images that show a red Ferrari beside a green house." Besides the integration of these operations, one has to consider the fact that the location of a descriptor in MPEG-7 documents can vary. The MPEG-7 schema allows many different ways of describing the same multimedia content. Given a free-text annotation describing a person in an image, one may use the FreeText DS, or raise the semantic level by using the who section in the StructuredText DS; the interpretation of the information is the same. In addition, the information can be assigned to different segments (e.g., various StillRegions). A search engine and its corresponding query language have to consider all possible information variations in order to optimize recall and precision. Furthermore, we can distinguish between context-unaware and context-aware retrieval.

In context-unaware retrieval, only a top-level search is performed, without taking the hierarchy into consideration. In contrast, context-aware retrieval takes into account that the description of multimedia data is organized in a tree-like hierarchy.

Syntactical requirements deal with the input and output format of multimedia queries, inserts and update operations. In the case of MPEG-7, the insertion of MPEG-7 documents has to ensure that only valid and well-formed documents are inserted. For a query operation, one might require that the result be delivered as valid MPEG-7 document(s). This is needed when the result is forwarded to applications which can only process MPEG-7 descriptions. In addition, update operations have to guarantee consistency, in the sense of MPEG-7 conformance, of the updated data.

Detailed information on the mentioned core parts, namely data model, query language, access methods and query optimization, is presented in Section 4.
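
The two retrieval modes, together with the need to look in more than one annotation variant, can be sketched as follows. This is a minimal illustration and not an MPEG-7 conformant implementation: the element names (`TextAnnotation`, `FreeTextAnnotation`, `StructuredAnnotation`, `Who`, `Name`, `StillRegion`) are simplified, namespace-free stand-ins, and the helper names `annotation_texts` and `find_person` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free stand-in for an MPEG-7 description (assumption).
DOC = """
<Mpeg7>
  <Image id="img1">
    <TextAnnotation>
      <FreeTextAnnotation>Alice in front of a house</FreeTextAnnotation>
    </TextAnnotation>
    <StillRegion id="r1">
      <TextAnnotation>
        <StructuredAnnotation><Who><Name>Bob</Name></Who></StructuredAnnotation>
      </TextAnnotation>
    </StillRegion>
  </Image>
</Mpeg7>
""".strip()


def annotation_texts(node, recursive):
    """Collect person-related text from both annotation variants."""
    # './/' descends the whole subtree; './' looks only at direct children.
    prefix = ".//" if recursive else "./"
    texts = []
    for ann in node.findall(prefix + "TextAnnotation"):
        for free in ann.findall("FreeTextAnnotation"):
            texts.append(free.text or "")
        for who in ann.findall("StructuredAnnotation/Who/Name"):
            texts.append(who.text or "")
    return texts


def find_person(root, name, context_aware):
    """Return ids of images whose annotations mention `name`."""
    hits = []
    for image in root.findall("Image"):
        texts = annotation_texts(image, recursive=context_aware)
        if any(name.lower() in text.lower() for text in texts):
            hits.append(image.get("id"))
    return hits


root = ET.fromstring(DOC)
```

In this sketch, a top-level search for "Bob" misses the image because the annotation sits inside a region, whereas the search that descends the hierarchy finds it. Both annotation variants are checked in either mode.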
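
As a small illustration of the insert-time check, the following sketch rejects documents that are not well-formed XML before storing them. It is an assumption-laden example: the in-memory dictionary `store` and the function names are hypothetical, and only well-formedness is tested. Validity against the MPEG-7 schema would need an XML Schema validator, which the Python standard library does not provide.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def insert_description(store, doc_id, xml_text):
    """Store a description only if it is well-formed (schema validity is NOT checked)."""
    if not is_well_formed(xml_text):
        raise ValueError("document %r is not well-formed XML" % doc_id)
    store[doc_id] = xml_text
```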

4 Integration of MPEG-7 into MMDBMS

This section addresses the integration of the MPEG-7 standard into a corresponding database data model for storing multimedia meta-data and its consequences for dependent parts such as access methods, query language and query optimization.

4.1 Data Model

A crucial factor for managing and retrieving multimedia data within a database is the underlying data model. If the data model is too coarse-grained (unstructured storage approach, e.g., the whole MPEG-7 document is stored in a database XMLType), storage operations are simple whereas retrieval is limited. If the data model is too fine-grained (structured storage approach, see Florescu and Kossmann [Florescu and Kossmann, 1999]; e.g., an equivalent database table is created for every MPEG-7 descriptor), storage operations lead to many sparsely filled tables, although retrieval can support semantically rich queries.

Because MPEG-7 relies on XML Schema, mapping strategies [Christophides et al., 1994, Amer-Yahia and Fernandez, 2001] from XML to an equivalent database data model have to be considered.

In order to circumvent the mapping problems mentioned previously (e.g., sparsely filled tables), the transformation strategy for MPEG-7 should strike a trade-off between both directions (structured and unstructured approach), as demonstrated in [Döller, 2004]. That work utilizes available object-relational features for mapping MPEG-7 descriptors to corresponding database types and tables, and the supported XMLType to reduce complexity. The combination of object-relational database features, relational keys and object references allows the mapping of the whole MPEG-7 standard into a corresponding database schema. Reducing the MPEG-7 inheritance hierarchy by skipping abstract types and merging types that contain only a few attributes and elements results in a compact database schema that allows the storage of any kind of MPEG-7 document and offers an efficient and rich model for querying it.
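
The trade-off between the two storage directions can be illustrated with a small sketch. It is not the schema of [Döller, 2004], which relies on object-relational features and XMLType. Here SQLite stands in for the database, the whole document is kept as text (the unstructured part), and one hypothetical descriptor is additionally shredded into its own table (the structured part). The table names, the function names and the simplified `DominantColor` element holding three integers are all assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Simplified stand-in document; real MPEG-7 DominantColor is more complex.
EXAMPLE_DOC = "<Mpeg7><Image><DominantColor>250 10 10</DominantColor></Image></Mpeg7>"


def create_schema(conn):
    # Unstructured part: the complete description, stored as-is.
    conn.execute(
        "CREATE TABLE description (doc_id TEXT PRIMARY KEY, raw_xml TEXT NOT NULL)"
    )
    # Structured part: only a frequently queried descriptor gets its own table.
    conn.execute(
        "CREATE TABLE dominant_color ("
        "doc_id TEXT REFERENCES description(doc_id), "
        "r INTEGER, g INTEGER, b INTEGER)"
    )


def store(conn, doc_id, xml_text):
    """Keep the raw document and shred its DominantColor elements into rows."""
    root = ET.fromstring(xml_text)
    conn.execute("INSERT INTO description VALUES (?, ?)", (doc_id, xml_text))
    for color in root.iter("DominantColor"):
        r, g, b = (int(value) for value in color.text.split())
        conn.execute(
            "INSERT INTO dominant_color VALUES (?, ?, ?, ?)", (doc_id, r, g, b)
        )


def docs_with_red_above(conn, threshold):
    """Query the structured part without parsing any XML."""
    rows = conn.execute(
        "SELECT DISTINCT doc_id FROM dominant_color WHERE r > ? ORDER BY doc_id",
        (threshold,),
    )
    return [row[0] for row in rows]


def load_xml(conn, doc_id):
    """Return the original document unchanged, or None if it is unknown."""
    row = conn.execute(
        "SELECT raw_xml FROM description WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    return row[0] if row else None
```

Queries on the shredded descriptor run as ordinary relational predicates, while the stored text still allows the original description to be returned unchanged.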

The data model itself is only a first step towards a high-level multimedia database system based on MPEG-7. In addition, one has to consider means for inserting, deleting and updating MPEG-7 documents. These facilities may be integrated into the database system or provided as tools.

Furthermore, an MPEG-7 multimedia database system has to deal with multimedia query languages, supporting access methods and means for the optimization of multimedia queries (see the following subsections).

4.2 Query Language

Traditional database management systems have been very effective and efficient in storing and managing alphanumeric data. Nowadays, owing to the ubiquity of digital cameras and MP3 players, the amount of multimedia data is overwhelming. Querying alphanumeric data relies on matching and filter operations which decide for every tuple whether it fits the requirements or not. In multimedia database systems, we are basically interested in similar data. Therefore, databases have to provide adequate query paradigms for similarity searches [Stricker and Orengo, 1995].

In this context, SQL/MM [Melton and Eisenberg, 2001] introduces a conceptual multimedia data model for use in multimedia database systems that extends the concept of the object-relational SQL-99. Compared with MPEG-7, the data model of SQL/MM covers the syntactical part of multimedia descriptions but offers no means for decomposing an image in order to describe its content in a semantically meaningful way.

Furthermore, the multimedia query language MOQL [Li et al., 1997] extends the ODMG's Object Query Language (OQL) [Jordan, 1998] by adding spatial, temporal and presentation properties for content-based image and video data retrieval.

When MPEG-7 is used as a data model in multimedia database systems, one has to think about enhancements of the query language for multimedia data, such as similarity search. In addition, the integration of operations that can produce XML output has to be considered as well. This is important because the import format and the output format (both MPEG-7 descriptions) should correspond to each other.

This means that, if the query result has to be retrieved as MPEG-7 documents, one has to combine the multimedia query language (e.g., SQL/MM) with SQL/XML [Eisenberg and Melton, 2002] elements (e.g., XMLAgg, XMLElement, etc.). In addition, it has to be ensured that the resulting XML document satisfies the XML Schema for MPEG-7. This necessitates the enhancement of query processing with type checking for MPEG-7 conformance.
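
The difference between the two query paradigms can be made concrete with a toy example. The catalog, the three-bin "histograms" and the function names below are invented for illustration, and Euclidean distance is merely one possible similarity measure.

```python
import math

# Hypothetical catalog: each item carries a tiny, made-up color histogram.
CATALOG = [
    {"title": "sunset.jpg", "histogram": (0.70, 0.20, 0.10)},
    {"title": "forest.jpg", "histogram": (0.10, 0.80, 0.10)},
    {"title": "dusk.jpg", "histogram": (0.65, 0.25, 0.10)},
]


def exact_match(catalog, histogram):
    """Classical filter: a tuple either satisfies the predicate or it does not."""
    return [item["title"] for item in catalog if item["histogram"] == histogram]


def most_similar(catalog, histogram):
    """Similarity paradigm: rank every item by its distance to the query."""
    ranked = sorted(
        catalog, key=lambda item: math.dist(item["histogram"], histogram)
    )
    return [item["title"] for item in ranked]
```

For a query histogram that differs slightly from every stored one, the exact filter returns nothing, while the ranking still yields the visually closest items first.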

4.3 Access Methods

Indexing is an important concept in modern database management systems to enhance processing efficiency and retrieval capacity (e.g., similarity searches).

By default, most database systems provide only a limited number of integrated access methods such as B-tree or hashing facilities. These techniques limit the use of database systems for multimedia data. This is all the more astonishing because, in the last decade, various access methods have been established for indexing multidimensional data. To mention only a few: SR-tree [Katayama and Satoh, 1997], M-tree [Ciaccia et al., 1997] and X-tree [Berchtold et al., 1996].

The integration of such access methods is crucial in order to support the retrieval of multimedia data by similarity searches or other query types. In MPEG-7, there exist several descriptors for extracted low-level features of audio, video and image data (e.g., ScalableColorType for images, or AudioSignatureType for audio files). Indexing these descriptors, in combination with an enhancement of the query language (see Subsection 4.2), allows the retrieval of multimedia data based on similarity searches across multiple MPEG-7 documents. In turn, such indexing can support content-based retrieval based on low-level features.

Although MPEG-7 provides excellent means for semantic indexing and querying through its semantic descriptors, research is still at an early stage, and approaches have only partly reached content-based retrieval systems [Bailer et al., 2004] that rely on MPEG-7. To the best of the author's knowledge, semantic indexing in the context of databases and MPEG-7 has not been considered so far.

4.4 Query Optimization

Besides the enhancement of query languages (e.g., SQL/MM) and the integration of new access methods (e.g., SR-tree), one must not neglect the performance of these operations.

In multimedia databases, queries often contain similarity operations such as range or nearest-neighbor operations for low-level features (e.g., a color histogram represented by the MPEG-7 ScalableColorType). In order to improve the performance of these operations, one has to extend the query optimizer. In general, a query optimizer can consider three approaches: selectivity, cost model and operator ordering. This paper concentrates on selectivity and cost models, as most modern databases provide means only for their enhancement.

In the literature, several cost models exist that concentrate on calculating the cost of index structures for range and nearest-neighbor searches [Böhm, 2000]. In [Lee et al., 1999], the authors present an efficient cost model for predicting the performance of the k-NN (k-nearest neighbor) query independently of the index tree used. The model is accurate for low- and mid-dimensional data with non-uniform distribution. The estimation of a range query's selectivity remains, apart from a few initial approaches [Kosch and Döller, 2005], an open research question. In [Kosch and Döller, 2005], the authors introduced an approach for approximating the selectivity of range searches within an n-dimensional data set with the help of a density-based clustering technique (DBSCAN [Ester et al., 1996]).
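
For reference, the two similarity operations can be written down as a plain linear scan, which is the baseline that an index structure and a tuned optimizer aim to beat. The feature dictionary, the function names and the use of Euclidean distance are assumptions made for this sketch.

```python
import heapq
import math


def range_query(features, query, radius):
    """Return the keys of all vectors within `radius` of `query` (sorted by key)."""
    return sorted(
        key for key, vector in features.items()
        if math.dist(vector, query) <= radius
    )


def knn_query(features, query, k):
    """Return the keys of the k vectors closest to `query`, nearest first."""
    nearest = heapq.nsmallest(
        k, features.items(), key=lambda pair: math.dist(pair[1], query)
    )
    return [key for key, _ in nearest]


# Hypothetical two-dimensional feature vectors keyed by document id.
FEATURES = {
    "a": (0.0, 0.0),
    "b": (1.0, 0.0),
    "c": (0.0, 3.0),
    "d": (5.0, 5.0),
}
```

Both functions touch every stored vector, so their cost grows linearly with the collection size. This is the cost that access methods for multidimensional data are meant to reduce.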
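
To make the notion of range-query selectivity concrete, the sketch below estimates it from a random sample. This is deliberately not the density-based clustering approach cited above; it is a simpler sampling estimator, swapped in only to show what quantity an optimizer would need. The function names and the fixed seed are assumptions, and the point list is assumed to be non-empty.

```python
import math
import random


def true_selectivity(points, query, radius):
    """Fraction of points that lie within `radius` of `query`."""
    inside = sum(1 for point in points if math.dist(point, query) <= radius)
    return inside / len(points)


def estimate_selectivity(points, query, radius, sample_size, seed=0):
    """Estimate the selectivity from a uniform random sample of the points."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    size = min(sample_size, len(points))
    sample = rng.sample(points, size)
    return true_selectivity(sample, query, radius)
```

A sample that covers the whole data set reproduces the exact selectivity, while smaller samples trade accuracy for speed. A clustering-based estimator aims to capture non-uniform data distributions more compactly than such a sample.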

5 Conclusion

This paper points out requirements for and impacts on core parts of a database management system arising from the integration of the MPEG-7 standard as a database data model. For this purpose, the requirements of an extensible database are outlined. Then, the integration itself is addressed. This integration covers in particular the database data model, enhancements of query languages, access methods and query optimization.

Currently, several solutions and proposals are available for storing and retrieving MPEG-7 documents. Nevertheless, these systems and proposals leave many open issues which have not been considered so far. For instance, every solution applies a different (often proprietary) combination of retrieval operators and query languages (e.g., SQL/XML in combination with proprietary operators). Therefore, there is clearly a need for a standardized query language that specifies the input and output format of MPEG-7 queries. This query language has to consider the full strength (spatial, temporal, spatio-temporal, etc.) of multimodal queries. Next, the availability of index structures for high-dimensional data is still limited. This is a well-known but still unresolved problem. In addition, research has to be done on index structures that are especially tuned for MPEG-7 descriptors and/or description schemes (e.g., indexing of StillRegions).

References

[Amer-Yahia and Fernandez, 2001] Amer-Yahia, S. and Fernandez, M. (2001). Overview of Existing XML Storage Techniques. AT&T Labs Research.

[Bailer et al., 2004] Bailer, W., Mayer, H., Neuschmied, H., Haas, W., Lux, M., and Klieber, W. (2004). Content-Based Video Retrieval and Summarization using MPEG-7. In Proceedings Internet Imaging V, pages 1-12, San Jose, CA, USA.

[Belongie et al., 1998] Belongie, S., Carson, C., Greenspan, H., and Malik, J. (1998). Color- and Texture-Based Image Segmentation Using EM and Its Application to Content-Based Image Retrieval. In Proceedings of the International Conference on Computer Vision (ICCV'98), pages 675-682, Bombay, India.

[Berchtold et al., 1996] Berchtold, S., Keim, D. A., and Kriegel, H. P. (1996). The X-tree: An Index Structure for High-Dimensional Data. In Proceedings of the 22nd Int. Conf. on Very Large Data Bases (VLDB), pages 28-39, Mumbai (Bombay), India. Morgan Kaufmann, ISBN 1-55860-382-4.

[Böhm, 2000] Böhm, C. (2000). A Cost Model for Query Processing in High Dimensional Data Spaces. ACM Transactions on Database Systems (TODS), 25(2):129-178.

[Christophides et al., 1994] Christophides, V., Abiteboul, S., Cluet, S., and Scholl, M. (1994). From Structured Documents to Novel Query Facilities. In Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data, pages 313-324, Minneapolis, Minnesota.

[Ciaccia et al., 1997] Ciaccia, P., Patella, M., and Zezula, P. (1997). M-tree: An Efficient Access Method for Similarity Search in Metric Spaces. In Proceedings of the 23rd Int. Conf. on Very Large Data Bases (VLDB), pages 426-435, Athens, Greece. Morgan Kaufmann, ISBN 1-55860-470-7.

[Döller, 2004] Döller, M. (2004). The MPEG-7 Multimedia DataBase System (MPEG-7 MMDB). Dissertation, University Klagenfurt, Austria.

[Eisenberg and Melton, 2002] Eisenberg, A. and Melton, J. (2002). SQL/XML is Making Good Progress. ACM SIGMOD Record, 31(2):101-108.

[Ester et al., 1996] Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. (1996). A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, pages 226-231, Portland, OR, USA.

[Florescu and Kossmann, 1999] Florescu, D. and Kossmann, D. (1999). Storing and Querying XML Data using RDBMS. IEEE Data Engineering Bulletin, 22(3):27-34.

[Grosky, 1997] Grosky, W. I. (1997). Managing Multimedia Information in Database Systems. Communications of the ACM, 40(12):73-80.

[IBM, 2001] IBM (2001). DataBlade Module Development Overview, Version 4.0. http://www-306.ibm.com/software/data/informix/blades/.

[Jordan, 1998] Jordan, D. (1998). C++ Object Databases, Programming with the ODMG Standard. Addison-Wesley. 456 pages, ISBN: 0-201-63488-0.

[Katayama and Satoh, 1997] Katayama, N. and Satoh, S. (1997). The SR-tree: An Index Structure for High-Dimensional Nearest Neighbor Queries. In ACM SIGMOD Int. Conf. on Management of Data, pages 369-380.

[Kosch, 2002] Kosch, H. (2002). MPEG-7 and Multimedia Database Systems. ACM SIGMOD Record, 31(2).

[Kosch, 2003] Kosch, H. (2003). Distributed Multimedia Database Technologies supported by MPEG-7 and MPEG-21. CRC Press. 248 pages, ISBN: 0-849-31854-8.

[Kosch and Döller, 2005] Kosch, H. and Döller, M. (2005). Approximating the selectivity of multimedia range queries. In Proceedings of the IEEE International Conference on Multimedia and Expo, Amsterdam, The Netherlands.

[Lee et al., 1999] Lee, J.-H., Cha, G.-H., and Chung, C.-W. (1999). A Model for k-Nearest Neighbor Query Processing Cost in Multidimensional Data Spaces. Information Processing Letters, 69(2):69-76.

[Li et al., 1997] Li, J. Z., Özsu, M. T., Szafron, D., and Oria, V. (1997). MOQL: A Multimedia Object Query Language. In Proceedings of the Third International Workshop on Multimedia Information Systems, pages 19-28, Como, Italy.

[Martinez, 2003] Martinez, J. M. (2003). MPEG-7 Overview. ISO/IEC JTC1/SC29/WG11 N5525, Pattaya.

[Mehrotra et al., 1997] Mehrotra, S., Rui, Y., Ortega-Binderberger, M., and Huang, T. S. (1997). Supporting Content-based Queries over Images in MARS. In Proceedings of the 1997 International Conference on Multimedia Computing and Systems (ICMCS '97), page 632.

[Melton and Eisenberg, 2001] Melton, J. and Eisenberg, A. (2001). SQL Multimedia Application packages (SQL/MM). ACM SIGMOD Record, 30(4):97-102.

[Murthy and Banerjee, 2003] Murthy, R. and Banerjee, S. (2003). XML Schemas in Oracle XML DB. In Proceedings of the 29th VLDB Conference, pages 1009-1018, Berlin, Germany. Morgan Kaufmann.

[Oracle, 2003] Oracle (2003). Oracle interMedia Reference, 10g Release 1. http://download-east.oracle.com/docs/ .

[Oria et al., 2004] Oria, V., Özsu, M. T., and Iglinski, P. J. (2004). Foundation of the DISIMA Image Query Languages. Multimedia Tools and Applications Journal, 23:185-201.

[Santini and Jain, 1997] Santini, S. and Jain, R. (1997). Image Databases are not Databases with Images. In Proceedings of the 9th International Conference on Image Analysis and Processing 2, pages 38-45, Florence, Italy.

[Staken, 2002] Staken, K. (2002). Xindice Developers Guide 0.7. The Apache Foundation, http://www.apache.org.

[Stricker and Orengo, 1995] Stricker, M. A. and Orengo, M. (1995). Similarity of color images. In Storage and Retrieval for Image and Video Databases, SPIE, pages 381-392, San Jose, CA.

[Wen et al., 2003] Wen, J.-R., Li, Q., Ma, W.-Y., and Zhang, H.-J. (2003). A Multi-paradigm Querying Approach for a Generic Multimedia Database Management System. ACM SIGMOD Record, 32(1):26-34.

[Westermann and Klas, 2003] Westermann, U. and Klas, W. (2003). An Analysis of XML Database Solutions for the Management of MPEG-7 Media Descriptions. ACM Computing Surveys, 35(4):331-373.