Cross-Media Analysis and Mining 

Mark Zhang (contact person – click here)
Alberto del Bimbo
Selcuk Candan
Alexander Hauptmann
Ramesh Jain
Alexis Joly
Yueting Zhuang


Today there are lots of heterogeneous and homogeneous media data from multiple sources, such as news media websites, microblog, mobile phone, social networking websites, and photo/video sharing websites. Integrated together these media data represent different aspects of the real-world and help document the evolution of the world. Consequently, it is impossible to correctly conceive and to appropriately understand the world without exploiting the data available on these different sources of rich multimedia content simultaneously and synergistically.

Cross-media analysis and mining is a research area in the general field of multimedia content analysis which focuses on the exploitation of the data with different modalities from multiple sources simultaneously and synergistically to discover knowledge and understand the world. Specifically, we emphasize two essential elements in the study of cross-media analysis that help differentiate cross-media analysis from the rest of the research in multimedia content analysis or machine learning.

The first is the simultaneous co-existence of data from two or more different data sources. This element indicates the concept of “cross”, e.g., cross-modality, cross-source, and cross cyberspace to reality. Cross-modality means that heterogeneous features are obtained from the data in different modalities; cross-source means that the data may be obtained across multiple sources (domains or collections); cross-space means that the virtual world (i.e., cyberspace) and the real world (i.e., reality) complement each other.

The second is the leverage of different types of data across multiple sources for strengthening the knowledge discovery, for example, discovering the (latent) correlation or synergy between the data with different modalities across multiple sources, transferring the knowledge learned from one domain (e.g., a modality or a space) to generate knowledge in another related domain, and generating a summary with the data from multiple sources.

There two essential elements help promote cross-media analysis and mining as a new, emerging, and important research area in today’s multimedia research. With the emphasis on knowledge discovery, cross-media analysis is different from the traditional research areas such as cross-lingual translation. On the other hand, with the general scenarios of the leverage of different types of data across multiple sources for strengthening the knowledge discovery, cross-media analysis and mining addresses a broader series of problems than the traditional research areas such as transfer learning. Overall, cross-media analysis and mining is beneficial for many applications in data mining, causal inference, machine learning, multimedia, and public security.


Like other emerging hot topics in multimedia research, cross-media analysis and mining also has a number of fundamental and controversial issues that must be addressed in order to have a full and complete understanding of the research in this topic. These issues include but are not limited to whether or not there exists a unified representation or modeling for the same semantic concept from different media, and if there is what such unified representation or modeling is; whether or not there exists any “law” that governs the topic evolution and development over the time in different media and if there is what such “law” is and how it is formulated; whether or not there exists a mapping for a conceptual or semantic activity between the cyberspace and the real-world, and if there is what such a mapping is and how it is developed and formulated.


Mark Zhang
zhongfei -AT-

Comments are closed.