The explosive growth of multimedia information on the Web in recent years calls for an elegant means to model and manage multimedia content so as to facilitate semantic-level access and sharing across diverse applications. From the perspective of retrieval, the semantics of multimedia data exhibit context-dependency and media-independency, neither of which is adequately supported by state-of-the-art data modeling technology. In this paper, we address this problem by advocating MediaView, an extended object-oriented view mechanism that bridges the "semantic gap" between conventional databases and semantics-intensive multimedia applications. This mechanism captures the dynamic semantics of multimedia using a modeling construct named media view (MV), which formulates a customized context in which heterogeneous media objects with similar or related semantics are characterized by additional properties and user-defined semantic relationships. View operators are proposed for the manipulation and derivation of individual MVs, which can be fitted into the desired real-life scenarios automatically. The usefulness and elegance of MediaView are demonstrated by its applications in various (subjective) activities supporting multi-modal retrieval.
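As a rough illustration of the media view construct described in the abstract, the following Python sketch models a view as a named context that holds media objects of any type, attaches view-specific properties to them, and records user-defined relationships between them. It is a minimal sketch under stated assumptions, not the paper's implementation: the class names MediaObject and MediaView, their fields, and the intersect operator are all hypothetical, since the abstract does not specify the data model or name the view operators.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MediaObject:
    """A base media object in the underlying database (hypothetical model)."""
    oid: str
    media_type: str  # e.g. "image", "video", "text"


@dataclass
class MediaView:
    """A named context grouping media objects regardless of media type."""
    name: str
    members: dict = field(default_factory=dict)     # oid -> MediaObject
    properties: dict = field(default_factory=dict)  # oid -> view-specific properties
    relations: set = field(default_factory=set)     # (src oid, label, dst oid)

    def add(self, obj, **props):
        """Include an object and give it properties valid only in this view."""
        self.members[obj.oid] = obj
        self.properties[obj.oid] = dict(props)

    def relate(self, src, label, dst):
        """Record a user-defined semantic relationship between two members."""
        if src.oid not in self.members or dst.oid not in self.members:
            raise KeyError("both objects must belong to the view")
        self.relations.add((src.oid, label, dst.oid))


def intersect(a, b, name):
    """Hypothetical view operator: derive a view from the members a and b share."""
    out = MediaView(name)
    for oid in a.members.keys() & b.members.keys():
        # On a key clash, the property value from view b wins (an assumption).
        out.add(a.members[oid], **{**a.properties[oid], **b.properties[oid]})
    shared = out.members.keys()
    out.relations = {
        r for r in a.relations | b.relations if r[0] in shared and r[2] in shared
    }
    return out


# The same image carries different properties in two different contexts.
photo = MediaObject("o1", "image")
clip = MediaObject("o2", "video")
note = MediaObject("o3", "text")

art = MediaView("renaissance_art")
art.add(photo, caption="Mona Lisa")
art.add(clip, topic="Leonardo documentary")
art.relate(clip, "depicts", photo)

travel = MediaView("paris_trip")
travel.add(photo, taken_at="Louvre")
travel.add(note)

both = intersect(art, travel, "art_in_paris")
```

In this sketch, context-dependency shows up as per-view property dictionaries (the photo has a caption in one view and a location in the other), while media-independency shows up as a single view mixing image, video, and text objects.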
Index Terms:
media view, context-dependency, multi-modal retrieval
Citation:
Qing Li, Jun Yang, Yueting Zhuang, "Multi-Modal Information Retrieval with a Semantic View Mechanism," 19th International Conference on Advanced Information Networking and Applications (AINA'05), Volume 1 (AINA papers), vol. 1, pp. 133-138, 2005