MindMap Gallery Alibaba Cloud Intelligent Media Management
Intelligent Media Management (IMM) is a scenario-based tool for intelligent data analysis and management. It provides one-stop processing, analysis, retrieval, and management of documents and images on the cloud. Complete processing capabilities are encapsulated and integrated for application scenarios such as image analysis and data storage, so that data can flow quickly.
Edited at 2024-01-13 15:37:11
Alibaba Cloud Intelligent Media Management
Product introduction
Intelligent media management encapsulates and integrates complete processing capabilities for business scenarios in different industries. It provides document format conversion and preview, image content recognition, face detection, QR code detection, face search, and other functions. It is intended for developers of media asset management systems, intelligent network disks, social applications, photo libraries, and similar products. Combined with Object Storage Service (OSS) and Tablestore, intelligent media management provides practical, scenario-based, one-stop solutions for document management, image social analysis, and other fields.
Features
Document conversion and preview
Integrate document-related format conversion and preview to quickly realize intelligent document management capabilities.
Format conversion
Convert 48 document formats, such as PPTX, PPT, XLS, DOC, PDF, HTML, and HTM, to the JPG, PNG, PDF, TXT, and VECTOR formats. For more information, see Document Format Conversion.
Document preview
Choose a document preview method based on actual needs.
Document Preview V1: The input document is converted to the VECTOR format and then rendered by the front-end rendering engine that intelligent media management provides, which gives an easier-to-use, more powerful, and customizable preview. For more information, see Document Preview V1.
Document Preview V2: After you obtain the preview address and AccessToken of the document, you do not need to specify an iframe element. The JS file automatically generates an iframe under the custom block element, and the AccessToken can be set through the JS file to preview the document quickly. For more information, see Document Preview V2.
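A conversion request can be checked against the formats named above before any work is submitted. The following is a minimal sketch, not part of the IMM API: the function name `check_conversion` is hypothetical, and the source-format set holds only the seven formats this page lists by name, not the full 48.

```python
from typing import Tuple

# Target formats named in the text above.
TARGET_FORMATS = {"JPG", "PNG", "PDF", "TXT", "VECTOR"}

# Assumption: only the source formats the text names explicitly.
# The complete list of 48 formats is not given on this page.
KNOWN_SOURCE_FORMATS = {"PPTX", "PPT", "XLS", "DOC", "PDF", "HTML", "HTM"}


def check_conversion(file_name: str, target_format: str) -> Tuple[str, str]:
    """Return (source_format, target_format) in upper case, or raise ValueError."""
    if "." not in file_name:
        raise ValueError(f"cannot infer source format from {file_name!r}")
    source = file_name.rsplit(".", 1)[1].upper()
    target = target_format.upper()
    if source not in KNOWN_SOURCE_FORMATS:
        raise ValueError(f"source format {source} is not in the known list")
    if target not in TARGET_FORMATS:
        raise ValueError(f"target format {target} is not supported")
    return source, target
```

For example, `check_conversion("report.pptx", "pdf")` passes, while a video file name or an unlisted target format raises `ValueError`.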
Intelligent picture detection
Integrate AI functions such as content recognition and face detection to quickly realize intelligent management of pictures.
Content recognition
Identify scenes, objects, events, and other information in pictures and obtain tag metadata. For more information, see Content Awareness.
Face detection
Detect faces in pictures, along with each person's age, gender, mood, and so on, and obtain face metadata. For more information, see Face Detection.
QR code detection
Detect QR codes in images and the content stored in them. For more information, see QR code recognition.
Human detection
Detect human body regions in images, each with a confidence level. For more information, see Human Detection.
Face search
Search for the top N pictures that are most similar to the specified picture. The results are sorted in descending order of similarity. For more information, see Face Search.
Face comparison
Compare the similarity of the largest face in each of two pictures. For more information, see Face Comparison.
Picture blind watermark
Add an image-type or text-type blind watermark to an image. A blind watermark is not directly visible in the picture, but it can be recovered with the blind watermark parsing function of intelligent media management. For more information, see Blind Image Watermarking.
Product advantages
Scenario pain point analysis
Picture application
Picture applications first upload pictures and videos to object storage. As the business grows and legal and regulatory requirements evolve, AI analysis functions such as pornography detection, label detection, face detection, and OCR are added, as shown in the figure below.
Usually, picture applications install AI analysis capabilities from different manufacturers on their business servers. These functions extract key metadata, which is then saved in a database to support metadata retrieval as well as business and regulatory needs. This solution has the following problems:
The interface is not unified
Because there are multiple manufacturers to choose from, the compatibility of interfaces from different manufacturers needs to be considered.
Waste of resources
The same picture will be read multiple times or even transmitted to the external network, which wastes network bandwidth.
No low-cost batch processing solution for existing data
The manufacturers' synchronous processing is expensive. Existing data needs a low-cost batch processing solution that returns detection results through asynchronous interfaces, for example to tag all images in an existing OSS bucket.
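The batch, asynchronous pattern described in the last item can be illustrated with a short sketch. It is an illustration under stated assumptions rather than the IMM interface: `label_image` is a hypothetical stand-in for a real detection call and returns fake tags, and the object keys are made up.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, List


def label_image(object_key: str) -> List[str]:
    """Hypothetical stand-in for a detection call; returns fake tags."""
    return ["image", object_key.rsplit(".", 1)[-1].lower()]


def batch_label(
    keys: Iterable[str],
    detect: Callable[[str], List[str]] = label_image,
    workers: int = 4,
) -> Dict[str, List[str]]:
    """Submit every key to a worker pool and collect results as they finish."""
    results: Dict[str, List[str]] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to the object key it was submitted for.
        futures = {pool.submit(detect, key): key for key in keys}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```

The caller submits all keys at once and receives results in completion order, which is the property that makes batch processing of existing data cheaper than one blocking call per image.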
Network disk
Network disk applications usually need functions such as user login, directory services, direct data transmission to OSS, and AI processing. A typical architecture, built on back-end servers and a database, is shown in the figure below.
To support data management for a network disk, various types of metadata usually have to be managed. In AI processing scenarios in particular, the application must define the relevant storage formats and handle database exceptions, which makes development harder. This solution has the following problems:
Metadata table design is difficult
Different kinds of metadata require separately classified and designed table structures, which raises the technical barrier.
Multi-dimensional metadata management is challenging
It is necessary to combine multiple metadata for associated query processing, which presents design challenges.
Maintaining metadata consistency is challenging
Recovering metadata after failures is a system-level problem.
Intelligent media management product advantages
Intelligent media management is designed around six key points: massive data, end-to-cloud connectivity, unified standards, intelligent analysis, scene combination, and one-click processing, and provides scenario-based AI intelligent processing solutions, as shown in the figure below.
Through targeted architecture design, intelligent media management has the following advantages:
Store data seamlessly
Associates directly with Object Storage Service (OSS) to process data on the cloud automatically.
Rich data processing
Combined with the industry's advanced recognition and processing capabilities, it provides rich functional support for application processing.
Simplify operation and maintenance
Delivered as a serverless service, so you do not have to worry about operation and maintenance.
Scenario-based one-stop solution
Builds metadata management quickly for each scenario so that applications can be implemented rapidly.
Product architecture
Intelligent media management uses a layered architecture with three layers: the processing engine, metadata management, and scenario-based encapsulation. The layers and the contexts they depend on are shown in the figure below.
It relies on Alibaba Cloud storage services such as object storage and file storage to access unstructured data (such as pictures and videos) in Alibaba Cloud storage through a secure mechanism and extract valuable information.
It is encapsulated based on scene understanding to support image and video application scenarios such as network disks, cloud photo albums, social galleries, and home monitoring, providing new value for applications.
Processing engine layer
Alibaba Cloud Storage provides a framework for computing close to the data that supports batch asynchronous processing and quasi-real-time synchronous processing. After a one-click association with Alibaba Cloud Storage (for example, by specifying a directory prefix or an object in an OSS bucket), data is processed quickly and automatically. By integrating the industry's advanced data processing algorithms, the processing engine currently provides the following functions:
Document format conversion
Convert documents in 48 formats, including Office formats, into 5 formats: JPG, PNG, PDF, TXT, and VECTOR. This can be used for network disk document browsing and other scenarios.
Content recognition
Identify scenes, objects, events, and other information in pictures to tag pictures automatically. This can be used in picture content review, picture retrieval, and other scenarios.
Face detection
Detect faces in pictures, along with each person's age, gender, mood, and so on. This can be used in scenarios such as photo album classification.
QR code detection
Detect QR codes in pictures and the content stored in them. The function determines whether a picture contains a QR code and outputs the information the code contains. This can be used in scenarios such as image content review.
Human detection
Detect human body regions in images, each with a confidence level. This can be used in scenarios such as abnormal behavior detection.
Face search
Search for the top N pictures that are most similar to the specified picture. The results are arranged in descending order of similarity. It can be used in scenarios such as member management, album classification, and target person search.
Face comparison
Compare the similarity of the largest face in each of two pictures. This can be used in scenarios such as identity verification.
Picture blind watermark
Add a blind watermark of image or text type to the image. After the blind watermark is added, the watermark cannot be directly seen in the picture, but the watermark hidden in the picture can be restored by using the blind watermark parsing function of intelligent media management, which can be used in scenarios such as picture copyright tracing.
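The face search behavior in the list above (top N results in descending order of similarity) can be sketched as a ranking over feature vectors. This is a minimal sketch under assumptions: the page does not describe how IMM represents faces or measures similarity, so the feature vectors, the cosine measure, and the function names here are all hypothetical.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n_similar(
    query: Sequence[float], gallery: Dict[str, Sequence[float]], n: int
) -> List[Tuple[str, float]]:
    """Return the n gallery entries most similar to query, best first."""
    scored = ((name, cosine_similarity(query, vec)) for name, vec in gallery.items())
    # heapq.nlargest returns its results sorted from largest to smallest.
    return heapq.nlargest(n, scored, key=lambda item: item[1])
```

Using `heapq.nlargest` avoids sorting the whole gallery when only a few results are needed.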
Metadata management
Building on the functions of the processing engine and a thorough understanding of each scenario, intelligent media management encapsulates the scenario's metadata design and exposes a metadata access interface. This simplifies the design of scenario applications and removes the need to operate and maintain a metadata index database. The currently supported metadata indexes are as follows:
Face cluster index
Construct a metadata collection, then call the index interface of face grouping to analyze the image, and add the obtained metadata to the metadata collection, so that similar faces in the collection can be obtained. Through this index, you can quickly support scenarios such as face photo albums on network disks, stranger detection for home monitoring, and customer management for new retail.
Tag group index
Construct a metadata collection, then call the tag grouping index interface to analyze pictures, and add the resulting metadata to the collection so that pictures can be searched by tag. This index quickly supports tag-based scenarios such as scene albums on network disks, pet tracking in home monitoring, and searching for vulgar pictures.
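The idea behind a tag index, a collection that maps tags to the pictures carrying them, can be shown with a small in-memory sketch. It assumes nothing about how IMM stores metadata: the `TagIndex` class and its methods are hypothetical, and the tags would come from a content recognition step that is not shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class TagIndex:
    """Hypothetical in-memory metadata collection keyed by tag."""

    def __init__(self) -> None:
        self._images_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def add(self, image_key: str, tags: Iterable[str]) -> None:
        """Record that image_key carries each of the given tags."""
        for tag in tags:
            self._images_by_tag[tag.lower()].add(image_key)

    def search(self, *tags: str) -> List[str]:
        """Return images that carry every given tag, sorted by key."""
        if not tags:
            return []
        # .get avoids creating empty entries for unknown tags.
        sets = [self._images_by_tag.get(tag.lower(), set()) for tag in tags]
        return sorted(set.intersection(*sets))
```

Searching with several tags intersects the per-tag sets, so a query such as `search("dog", "beach")` returns only pictures tagged with both.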
Scene encapsulation layer
With Alibaba Cloud's support for scenarios, the functions of the processing engine layer and the metadata management layer are packaged and provided as resource packages. This simplifies use, speeds up application integration, and closely combines AI with scenarios. Examples of currently supported scenarios are as follows:
Document standard type
Integrate document-related format conversion and preview to quickly realize intelligent document management capabilities.
Picture standard type
Integrate AI functions such as content recognition and face detection to quickly realize intelligent management of pictures.
Application scenarios
Document management scenario
In applications such as network disks, mailboxes, and document management, using the document standard projects provided by intelligent media management can quickly realize the following scenarios:
Document preview
Use the format conversion function to convert 48 common document formats into 5 target formats: JPG, PNG, PDF, TXT, and VECTOR. Then, using the conversion results together with the front-end rendering engine, you can preview the document on PC and mobile devices.
Full-text search
Extract the text from a DOC document and output it page by page. From the extracted text, build a full-text index keyed by document page to enable page-level full-text retrieval.
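Page-level full-text retrieval can be sketched as an inverted index from words to page numbers. This is a minimal illustration, not the indexing that IMM or any search service performs: the function names are hypothetical, tokenization is a simple word split, and the page texts are assumed to come from the text extraction step.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def build_page_index(pages: List[str]) -> Dict[str, Set[int]]:
    """Map each lower-cased word to the 1-based page numbers containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(page_number)
    return index


def search_pages(index: Dict[str, Set[int]], query: str) -> List[int]:
    """Return the pages that contain every word of the query, in order."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = [index.get(word, set()) for word in words]
    return sorted(set.intersection(*hits))
```

Because the index stores page numbers rather than whole documents, a hit can point the reader straight to the matching page of the preview.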
Image social analysis scenario
In applications such as picture social networking, e-commerce websites, and photo galleries, using the picture standard projects provided by intelligent media management can quickly realize the following scenarios:
Image content review
Use the content recognition function to identify vulgar and prohibited content in pictures, such as pornography, violence, terrorism, and other illegal content, in order to meet increasingly strict regulatory requirements and avoid operational risks.
Use the QR code detection function to determine whether the image contains a QR code and output the content of the QR code.
Image classification and retrieval
Use the content recognition function to tag and group pictures, and then find matching pictures by keyword in photo galleries, material websites, network disks, smart photo albums, and other applications.
Image copyright traceability
Use the picture blind watermark function to add an image-type or text-type blind watermark to a picture. A blind watermark is not directly visible in the picture, but it can be recovered with the blind watermark parsing function of intelligent media management.
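The general idea of a watermark that is invisible but recoverable can be illustrated with least-significant-bit (LSB) embedding. This is a teaching sketch only: the page does not describe the algorithm IMM uses, LSB embedding is a deliberately simple substitute that does not survive compression or resizing, and the pixel list and function names are hypothetical.

```python
from typing import List


def embed_text(pixels: List[int], text: str) -> List[int]:
    """Hide the UTF-8 bytes of text in the lowest bit of each pixel value."""
    data = text.encode("utf-8")
    # Most significant bit first for every byte.
    bits = [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels to hold the watermark")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit  # clear the low bit, then set it
    return marked


def extract_text(pixels: List[int], length: int) -> str:
    """Read back length bytes of UTF-8 text from the lowest pixel bits."""
    bits = [value & 1 for value in pixels[: length * 8]]
    data = bytes(
        sum(bit << (7 - j) for j, bit in enumerate(bits[i : i + 8]))
        for i in range(0, len(bits), 8)
    )
    return data.decode("utf-8")
```

Each pixel value changes by at most 1, which is why the mark is not visible, yet the hidden text can be read back exactly from the marked values.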
Home device data storage scenario
After home devices (such as cameras) upload family pictures and surveillance videos to OSS, intelligent media management analyzes the faces in the pictures and videos that multiple home devices have saved in OSS and obtains metadata such as face detection results and face groups. This metadata makes it easy to implement functions such as face albums and stranger detection.
In this way, the association of multiple devices and the interaction between the device and the cloud are realized, making the AI capabilities of intelligent media management more inclusive, thus bringing the following advantages:
Intelligent management of multiple devices
Through multi-device picture and video data processing, combined analysis and intelligent management can be achieved on the cloud.
Intelligent collaboration between devices and cloud
Obtain the AI metadata of pictures and videos with very little bandwidth, allowing the device to quickly enjoy the AI intelligence of the cloud.
In applications such as home monitoring and smart photo albums, using the picture standard projects provided by intelligent media management can quickly realize the following scenarios:
Face classification and retrieval
Use the face search function to search the top N pictures in the gallery that are most similar to the specified face. The results are sorted in descending order of similarity.
Identity verification
Use the face comparison function to compare the similarity of the two largest faces in two pictures to detect strangers or verify whether two people are the same person.
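The comparison step described above, picking the largest face in each picture and scoring the pair, can be sketched as follows. This is a sketch under assumptions rather than IMM's method: the `Face` record, its bounding-box fields, the feature vectors, and the cosine score are all hypothetical, and face detection itself is not shown.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Face:
    """Hypothetical detected face: bounding-box size plus a feature vector."""
    width: int
    height: int
    features: Sequence[float]


def largest_face(faces: List[Face]) -> Face:
    """Pick the face whose bounding box covers the largest area."""
    return max(faces, key=lambda face: face.width * face.height)


def compare_largest_faces(faces_a: List[Face], faces_b: List[Face]) -> float:
    """Cosine similarity between the largest face of each picture."""
    a = largest_face(faces_a).features
    b = largest_face(faces_b).features
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

An application would compare the returned score with a threshold of its own choosing to decide whether the two faces belong to the same person or whether a stranger has appeared.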