1. Introduction
The International Organization for Standardization (ISO) has introduced a new standard, MPEG-7, following MPEG-1, MPEG-2 and MPEG-4. Its official name is "Multimedia Content Description Interface". Its goal is to standardize the description of multimedia content so as to meet the needs of real-time and non-real-time applications, as well as "push" and "pull" applications. MPEG-7 differs from the waveform-based and compression-based representations of MPEG-1 and MPEG-2, and from the object-based representation of MPEG-4. Instead, it standardizes the description of various types of multimedia information and links each description to the content it describes, so that content can be searched quickly and efficiently.
2. Objectives of MPEG-7
1. Support multiple audio and visual descriptions
Descriptions include free text, n-dimensional space-time structure, statistical information, objective attributes, subjective attributes, production attributes, and combined information. For visual information, descriptions may include color, visual objects, textures, sketches, shapes, volumes, spatial relationships, motion, deformation, and so on. For audio information, descriptions may include pitch, mode, tempo, tempo changes, and so on.
2. Provide a way to describe multimedia material at different levels of abstraction, so that the information needs of users at different levels can be represented.
3. Support flexibility in data management, globalization and interoperability of data resources.
3. MPEG-7 components
The main elements of MPEG-7 include the following.
1. Description tools, comprising a set of descriptors (D) and description schemes (DS). A descriptor defines the syntax and semantics used to represent a particular feature of an entity. Such a representation pairs a feature identifier (such as color) with a data type (such as a string). Data types can be composite, that is, built from a combination of several data types, and several Ds can be used to describe a single feature. A description scheme is composed of one or more Ds and DSs, and specifies the structure and semantics of the relationships among them.
2. The Description Definition Language (DDL), a language used to specify description schemes. It is a schema-based language that represents the results of modeling audio and video data. The DDL specifies the MPEG-7 description tools, both descriptors and description schemes, and provides the rules for combining descriptors into description schemes. It also allows extended DSs to be defined for particular applications. Description tools are instantiated through the DDL, and the resulting descriptions are written in a text format (XML).
3. System tools, which support the multiplexing of descriptions, synchronization, transmission mechanisms, file formats, and so on.
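The relationship between descriptors, description schemes, and the XML text form can be sketched in a few lines of Python. This is a simplified illustration only: the class names, the element names `Descriptor` and `DescriptionScheme`, and the sample features are hypothetical and are not taken from the actual MPEG-7 schema.

```python
from dataclasses import dataclass, field
from typing import List, Union
import xml.etree.ElementTree as ET


@dataclass
class Descriptor:
    """A feature identifier paired with a typed value (hypothetical, simplified)."""
    feature: str     # feature identifier, e.g. "DominantColor"
    data_type: str   # data type name, e.g. "string"
    value: str

    def to_xml(self) -> ET.Element:
        elem = ET.Element("Descriptor", {"feature": self.feature, "type": self.data_type})
        elem.text = self.value
        return elem


@dataclass
class DescriptionScheme:
    """Groups descriptors and nested description schemes under one name."""
    name: str
    children: List[Union["Descriptor", "DescriptionScheme"]] = field(default_factory=list)

    def to_xml(self) -> ET.Element:
        elem = ET.Element("DescriptionScheme", {"name": self.name})
        for child in self.children:
            elem.append(child.to_xml())  # recursion handles nested schemes
        return elem


def build_example() -> DescriptionScheme:
    """Build a scene description containing a nested visual-object scheme."""
    color = Descriptor("DominantColor", "string", "red")
    shape = Descriptor("Shape", "string", "circle")
    visual = DescriptionScheme("VisualObject", [color, shape])
    return DescriptionScheme("Scene", [visual, Descriptor("Title", "string", "Sunset")])


# Serialize the whole description to XML text.
xml_text = ET.tostring(build_example().to_xml(), encoding="unicode")
print(xml_text)
```

In this sketch, nesting one scheme inside another mirrors the statement that a DS is composed of one or more Ds and DSs. A real MPEG-7 description would be validated against schemas written in the DDL rather than assembled from ad hoc element names.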
4. MPEG-7 Attribute Description Tool
The MPEG-7 standard provides a series of attribute description tools so that attributes can be managed in a unified manner. These tools are mostly used when more than one medium is described (such as audio and video). Built on a set of basic elements, they are divided into five categories according to function: content description, content management, content organization, navigation and access, and user interaction.
1. Basic Elements
Basic elements provide a set of extended data types and mathematical structures, such as vectors and matrices, that are helpful for describing AV (audio-visual) content. They can also be used to link media files, locate content, and describe time, places, people, and so on. These basic elements form the foundation on which the MPEG-7 description schemes are defined.
2. Content Description
The purpose of content description is to characterize the perceptible information in the content. It covers two aspects: structure and semantics. Structural tools describe the temporal and spatial structure of AV content by dividing it into clips, frames, and static and dynamic regions. Semantic tools describe the real world reflected in AV content through objects, events, abstract concepts and relationships. Structural and semantic tools are connected through links and together complete the description of the content.
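As a rough illustration of how structural and semantic descriptions can be tied together through links, the following Python sketch models temporal segments, semantic entities, and a link table. All names and sample values here are hypothetical; the sketch does not reproduce the MPEG-7 description schemes themselves.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """Structural unit: a time span (in seconds) of the AV content."""
    segment_id: str
    start: float
    end: float


@dataclass(frozen=True)
class SemanticEntity:
    """Semantic unit: an object, event, or concept reflected in the content."""
    entity_id: str
    kind: str    # "object", "event", or "concept"
    label: str


segments = [
    Segment("seg1", 0.0, 12.5),
    Segment("seg2", 12.5, 40.0),
    Segment("seg3", 40.0, 55.0),
]
entities = [
    SemanticEntity("e1", "event", "goal"),
    SemanticEntity("e2", "object", "referee"),
]
# Each link connects one structural segment to one semantic entity.
links = [("seg2", "e1"), ("seg2", "e2"), ("seg3", "e2")]


def segments_for_label(label: str) -> List[Segment]:
    """Return the segments linked to any entity carrying the given label."""
    entity_ids = {e.entity_id for e in entities if e.label == label}
    segment_ids = {seg_id for seg_id, ent_id in links if ent_id in entity_ids}
    return [s for s in segments if s.segment_id in segment_ids]


for seg in segments_for_label("referee"):
    print(seg.segment_id, seg.start, seg.end)
```

Keeping the links in a separate table lets one segment refer to several entities and one entity appear in several segments, which is the kind of many-to-many relationship the text describes.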
3. Content Management
Content management deals with information about the creation of multimedia documents, media ownership and encoding, that is, information that cannot be extracted from the content itself.
4. Content Organization
Content organization provides tools for describing the analysis and classification of multimedia data, and can be used to describe the properties of a group of objects.
5. Navigation and Access
Navigation and access tools define summaries of audio and video content and describe how the information is decomposed and transformed, making AV content easier to browse and retrieve. They comprise three parts: summaries, decompositions and transformations.
6. User Interaction
User interaction tools describe user preferences and usage information, making media access more personalized and convenient. For example, media can be prioritized according to a user's preferences, so that the user finds the most suitable information as quickly as possible.
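The idea of prioritizing media by user preference can be sketched as a simple scoring and sorting step. The following Python example is a minimal sketch under stated assumptions: the tag-weight representation of preferences, the function name `rank_by_preference`, and the sample items are all hypothetical and are not defined by MPEG-7.

```python
from typing import Dict, List


def rank_by_preference(items: List[Dict], preferences: Dict[str, float]) -> List[Dict]:
    """Order media items so that those matching the user's preferred tags come first.

    Each item is a dict with a "title" and a list of "tags".
    `preferences` maps a tag to a weight; unknown tags count as 0.
    """
    def score(item: Dict) -> float:
        return sum(preferences.get(tag, 0.0) for tag in item["tags"])

    # sorted() is stable, so items with equal scores keep their original order.
    return sorted(items, key=score, reverse=True)


catalog = [
    {"title": "Evening news", "tags": ["news", "politics"]},
    {"title": "Football match", "tags": ["sports", "football"]},
    {"title": "Film review", "tags": ["movies"]},
]
user_preferences = {"sports": 0.9, "football": 0.5, "movies": 0.4}

for item in rank_by_preference(catalog, user_preferences):
    print(item["title"])
```

A real system would likely combine such preference weights with usage history, but the ordering step would look broadly similar.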
5. Application fields of MPEG-7
In daily life, people need efficient access to multimedia information, as well as ways to interact with it and display it. MPEG-7 applications that meet these needs fall into two types, "Pull" and "Push". Both are closely tied to the politics and economy of society, and both are indispensable in professional fields such as education, film and television, as well as in consumer applications.
1. Pull type
The purpose of the MPEG-7 standard is to define a specification that makes querying AV material as convenient as today's text queries. Although the recognized applications of multimedia content description go far beyond "getting" content, retrieval remains at the core of many of the original MPEG-7 applications. These "get", or "pull", applications involve databases, multimedia archives, and the request-based Internet model, in which users request information from servers.
Here are some applications of the "Pull" type.
(1) Commercial music applications (karaoke and music sales)
When a user hears a song on TV, he or she can easily "search" for the complete song in a database just by singing a few lines; after paying an appropriate fee, the user can download the entire song to his or her computer.
(2) Sound Effects Library
Artists and sound designers can specify a type of sound effect and then choose among many variations of that sound source to suit their needs. For example, they can supply a prototype sound, specify detailed features, or use onomatopoeia (a variation on "searching" by humming a song) to describe the kind of abstract sound they want to find.
(3) Historical database
People can "search" for an audio or video recording or other related event by using specific keywords ("The People's Republic of China was founded!"), key events (WTO), speakers (Bill Gates), locations (capital), dates (September 11, 2001), or any combination of the above.
(4) Movie scene "search" through memorable auditory events
Many movie and TV moments stay in people's memories. The most obvious example is using a specific "description" of a scene, a line of dialogue, a sound, and so on, to refer to a movie or TV program, and then using that description to find the movie.
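The historical database example in item (3), where keywords, speakers, locations and dates can be combined freely, can be sketched as a search over description records. In the Python sketch below, the `Recording` fields, the `search` function and all sample data are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Recording:
    """Hypothetical description record for one archived recording."""
    title: str
    keywords: List[str]
    speaker: str
    location: str
    date: str  # ISO 8601 date string, e.g. "2001-09-11"


def search(archive: List[Recording],
           keyword: Optional[str] = None,
           speaker: Optional[str] = None,
           location: Optional[str] = None,
           date: Optional[str] = None) -> List[Recording]:
    """Return the recordings that satisfy every criterion that was supplied."""
    results = []
    for rec in archive:
        if keyword is not None and keyword.lower() not in [k.lower() for k in rec.keywords]:
            continue
        if speaker is not None and speaker.lower() != rec.speaker.lower():
            continue
        if location is not None and location.lower() != rec.location.lower():
            continue
        if date is not None and date != rec.date:
            continue
        results.append(rec)
    return results


# Placeholder archive; none of these records describe real recordings.
archive = [
    Recording("Trade briefing", ["WTO", "trade"], "Speaker A", "City X", "2001-01-10"),
    Recording("Technology keynote", ["software"], "Speaker B", "City Y", "2001-01-11"),
    Recording("Evening news", ["WTO", "news"], "Speaker C", "City Y", "2001-01-10"),
]

for rec in search(archive, keyword="wto", date="2001-01-10"):
    print(rec.title)
```

Criteria left as `None` are ignored, so the same function covers a single keyword, a single speaker, or any combination of the four fields.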
2. Push type
"Push" applications are the opposite of "Pull" applications. They resemble broadcasting and the newly emerging webcasting. The "Pull" model goes from indexing to "searching", whereas the "Push" model goes from selection to "filtering". The two have completely different requirements. Typically, "Pull" works on static "descriptions" stored in a database, while "Push" works on dynamic "descriptions" that keep changing. The requirement of "Push", that is, "filtering", is to deliver only the multimedia information that users want to watch or listen to.
For example, in digital broadcast systems (including data broadcasting), MPEG-7 descriptions can help users select programs and various kinds of data broadcast information for immediate or later viewing, or for recording and storage. In personalized broadcast systems, the data delivered to each user can be "filtered" from the broadcast according to that user's profile. Profiles can be generated automatically (for example, from location, age, gender, or previous selections) or semi-automatically (for example, from preset interests).
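The "filtering" side of the Push model can be sketched as a profile applied to a stream of incoming descriptions. The Python example below is a minimal sketch under stated assumptions: genre sets as profiles, a selection-count threshold for the automatic profile, and the names `profile_from_history` and `filter_broadcast` are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Set


def profile_from_history(history: List[Dict], min_count: int = 2) -> Set[str]:
    """Derive interests automatically: genres chosen at least `min_count` times."""
    counts = Counter(genre for item in history for genre in item["genres"])
    return {genre for genre, n in counts.items() if n >= min_count}


def filter_broadcast(stream: Iterable[Dict], interests: Set[str]) -> Iterator[Dict]:
    """Yield only the items whose genres overlap the user's interests."""
    for item in stream:
        if interests & set(item["genres"]):
            yield item


# Previous selections (automatic profile) and preset interests (semi-automatic).
history = [
    {"title": "Cup final", "genres": ["sports"]},
    {"title": "League roundup", "genres": ["sports", "news"]},
    {"title": "Cooking hour", "genres": ["food"]},
]
auto_profile = profile_from_history(history)
preset_profile = {"documentary"}

broadcast = [
    {"title": "Weather update", "genres": ["news"]},
    {"title": "Marathon live", "genres": ["sports"]},
    {"title": "Deep sea", "genres": ["documentary", "nature"]},
    {"title": "Quiz night", "genres": ["entertainment"]},
]

for item in filter_broadcast(broadcast, auto_profile | preset_profile):
    print(item["title"])
```

Because `filter_broadcast` is a generator, it handles items one at a time as they arrive, which suits the changing, dynamic descriptions of the Push model better than a query over a stored index.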
6. Conclusion
The emergence of MPEG-7 is the inevitable product of the transition from the text information age to the multimedia information age. In the future multimedia information retrieval service, MPEG-7 will play a leading role. At present, many research institutions have begun to study the key technologies and have achieved certain results, but there is still a considerable gap from practical application. With the rapid development of MPEG standards and network systems, the application of MPEG-7 will also flourish, providing more convenience for our study and life.