Design and implementation of digital content metadata system


Jinho Seo, Taewon Seo, Junghoon Shin, Sanggeun Song, Cheng Dong, Sangjun Lee

(Department of Computer Science, Soongsil University, Seoul 156-743, Korea)


Today's multimedia services go far beyond voice and data: they have diversified tremendously, fueled by advances in network infrastructure and by the sudden surge of multimedia data itself. Research on metadata insertion, management and transfer is actively under way in order to provide a variety of services to users. In this paper, we propose the design and implementation of a digital content metadata system for the insertion, storage and retrieval of metadata. The performance evaluation shows that the proposed method performs better than the existing method.

Internet protocol television (IPTV); video metadata; binary extensible markup language (XML)

CLC number: TP37 Document code: A

Multimedia services are expanding beyond voice and data services, since the quantity of multimedia data has increased explosively with the rapid development of network infrastructure, for example in the mobile Internet environment[1,2]. In line with this trend, many studies on the creation, management and transmission of metadata for motion pictures have been carried out to provide various services to users[3,4]. A motion picture can be regarded as three-dimensional data created by arranging two-dimensional images along the time axis. Its size generally ranges from several megabytes to several gigabytes, depending on the picture quality and the encoding method[5]. In particular, with the recent spread of digital television (DTV), a high definition (HD) movie can reach about 10 gigabytes for its full running time. Handling such enormous data efficiently requires complex data processing technology. Metadata is inserted to provide various services, and efficient transmission technology also deserves emphasis as the amount of metadata keeps increasing.

In this study, we propose a system in which metadata can be inserted into a motion picture and then stored and retrieved efficiently. The rest of the paper is organized as follows. Section 1 introduces current studies on metadata insertion into motion pictures and binary extensible markup language (XML) as a representation of metadata. Section 2 presents the system design and implementation, including metadata insertion, storage and retrieval. Finally, section 3 concludes the paper.

1 Related research

Content retrieval and browsing are not easy because a motion picture is an enormous body of three-dimensional data created by arranging two-dimensional images along the time axis. To solve this problem, past studies inserted metadata into the motion picture and performed content retrieval and browsing based on that metadata[5,6]. In the current metadata insertion method, a key frame is first selected from the motion picture to prepare for metadata insertion; the concrete process is shown in Fig.1.

Metadata can be inserted once a key frame has been extracted. Fig.2 shows how metadata is inserted into the extracted key frame.

Fig.2 Metadata insertion

The editor marks the object area in the extracted key frame where the metadata will be inserted and inputs the object information. This process is called a link; a link includes the time at which the key frame appears in the motion picture, object-related tags, keywords and so on. The inserted metadata is stored in XML file format.

2 Digital content metadata system

2.1 System architecture

The overall system architecture is shown in Fig.3.

Fig.3 Architecture diagram of digital content metadata system

After metadata is inserted into the motion picture with the motion picture metadata editing tool, the motion picture and the inserted metadata are sent to the server. The server stores the received motion picture and metadata so that it can provide them when it receives a user request. The motion picture metadata editing tool and the motion picture player are built with Microsoft Silverlight[7,8]. Silverlight is a web browser plug-in that supports rich Internet applications (RIA) and runs in various web browsers on the Microsoft Windows and Mac OS X operating systems. The server provides web services through Internet information services (IIS) 7.0 installed in a Microsoft Windows Server 2008 environment.

2.2 Motion picture metadata editing tool

In this study, we propose a metadata insertion method for moving objects in the motion picture, which differs from the current key-frame-based insertion method. The editor inputs information about each object in the motion picture and masks the object. Because objects appear, move or disappear as time passes, the editor needs to play the motion picture and track the object's movement while masking. Figs.4 and 5 show the metadata insertion display and the object masking display, respectively.

Fig.4 Metadata insertion to object

Fig.5 Object masking

The editor inputs the metadata for an object in the motion picture. As shown in the right window of Fig.4, metadata is inserted by entering the name and related keywords of the object. After the metadata is inserted, the object's movement in the motion picture is recorded by masking the portion in which the object appears. The circle in Fig.5 is the mask on a motion picture. The motion picture plays when masking starts, and the editor drags the circle so that it follows the object's movement. The editing tool captures the object's movement by storing the coordinates and size of the circle over time.
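The recording step described above can be pictured with a minimal Python sketch. It is not the authors' Silverlight implementation: the class name `MaskRecorder`, the tuple layout and the use of seconds as the time unit are all assumptions made for illustration. It only shows the idea of storing the circle's coordinates and size as playback time advances.

```python
from dataclasses import dataclass, field


@dataclass
class MaskRecorder:
    """Hypothetical recorder for one masked object (names are illustrative)."""

    object_name: str
    # Each sample is (time_in_seconds, x, y, size) for the masking circle.
    samples: list = field(default_factory=list)

    def record(self, time, x, y, size):
        """Store one circle sample, e.g. each time the editor drags the circle."""
        if self.samples and time < self.samples[-1][0]:
            raise ValueError("samples must be recorded in playback order")
        self.samples.append((time, x, y, size))

    @property
    def begin_time(self):
        # Time of the first sample, i.e. when the object appears.
        return self.samples[0][0] if self.samples else None

    @property
    def end_time(self):
        # Time of the last sample, i.e. when the object disappears.
        return self.samples[-1][0] if self.samples else None
```

In this sketch the appearance and disappearance times are simply derived from the first and last recorded samples; a real tool might let the editor set them explicitly.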

2.3 Metadata storage structure

The metadata inserted with the motion picture metadata editing tool is stored in XML file format[9]. The inserted metadata is divided into object information and masking information, which are stored in separate files. Figs.6 and 7 show the XML file structures of the object information and the masking information, respectively.

Fig.6 XML file structure of object information

Fig.7 XML file structure of masking information

The object information entered by the editor is stored as an XML file. The name and related keywords of each object in the motion picture are stored in the form shown in Fig.6. The masking information related to the object is also stored as an XML file, whose structure is shown in Fig.7. One mask holds the related object's name (ObjectName), the object's appearance time (BeginTime) and its disappearance time (EndTime). The mask also has a list of single masks that records the object's movement. A single mask gives the object position (mask position) at a certain time. For this purpose, it stores the time (Time), the size of the mask (Size) and the X and Y coordinates of the displayed position.
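Since the figures are not reproduced here, the following Python sketch shows one way the two files could be laid out. The field names ObjectName, BeginTime, EndTime, Time, Size, X and Y come from the description above; the wrapper elements (`Object`, `Name`, `Keywords`, `Keyword`, `Mask`, `SingleMasks`, `SingleMask`) and the function names are assumptions, and the actual schema in Figs.6 and 7 may differ.

```python
import xml.etree.ElementTree as ET


def build_object_xml(name, keywords):
    """Object information: a name plus related keywords (wrapper names assumed)."""
    obj = ET.Element("Object")
    ET.SubElement(obj, "Name").text = name
    keyword_list = ET.SubElement(obj, "Keywords")
    for keyword in keywords:
        ET.SubElement(keyword_list, "Keyword").text = keyword
    return obj


def build_mask_xml(object_name, begin_time, end_time, single_masks):
    """Masking information: one mask holding a list of single masks.

    single_masks is an iterable of (time, size, x, y) tuples.
    """
    mask = ET.Element("Mask")
    ET.SubElement(mask, "ObjectName").text = object_name
    ET.SubElement(mask, "BeginTime").text = begin_time
    ET.SubElement(mask, "EndTime").text = end_time
    container = ET.SubElement(mask, "SingleMasks")
    for time, size, x, y in single_masks:
        single = ET.SubElement(container, "SingleMask")
        ET.SubElement(single, "Time").text = time
        ET.SubElement(single, "Size").text = str(size)
        ET.SubElement(single, "X").text = str(x)
        ET.SubElement(single, "Y").text = str(y)
    return mask
```

Calling `ET.tostring(build_mask_xml(...), encoding="unicode")` yields the text that would be written to the masking file. Keeping object information and masking information in separate trees mirrors the split into two files described above.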

2.4 Binary XML metadata server

The architecture of the binary XML[10] metadata server in this study is shown in Fig.8. The XML converting module receives XML metadata from the metadata editing tool, converts it to binary XML based on the Fast Infoset standard, and stores it in the metadata storage (Repository). When a user requests object information, the metadata retrieval module of the metadata server retrieves the mask and object from the Repository and returns the reply.

Fig.8 Architecture of binary XML metadata server

2.5 Metadata transmission

The user clicks an object while the motion picture is playing, and the player transmits a request to the server. The request has the following form:

id=“1”; time=“00:03:29:2460000”; x=“343”; y=“245”

The request includes the motion picture id, the click time and the coordinates of the clicked position. After receiving the request, the server searches the masking information in the XML file stored on the server and replies to the player according to whether a mask exists at the user's click time and position. The response has the following form.

If there is no related mask, the string “none” is returned to the player to indicate that there is no mask. In this case, the player takes no action and continues to play the motion picture. If there is a matching mask, the player pauses the motion picture and displays the selected object's metadata on the screen. Fig.9 shows the screen displaying the metadata when there is a related mask.

Fig.9 Example of displayed metadata
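The request handling described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the server code of the paper: the last field of the time value is assumed to be a count of 100 ns ticks, X and Y are assumed to be the centre of the masking circle, Size is assumed to be its diameter, and the position at the click time is taken from the latest single mask recorded at or before that time. All class and function names are hypothetical.

```python
import re
from bisect import bisect_right
from dataclasses import dataclass

TICKS_PER_SECOND = 10_000_000  # assumption: final time field is in 100 ns ticks


@dataclass
class SingleMask:
    time: float  # seconds from the start of the motion picture
    x: float     # assumed centre of the circle
    y: float
    size: float  # assumed diameter of the circle


@dataclass
class Mask:
    object_name: str
    begin_time: float
    end_time: float
    samples: list  # SingleMask entries sorted by time


def parse_time(text):
    """Convert 'hh:mm:ss:ticks' to seconds (format interpretation is assumed)."""
    hours, minutes, seconds, ticks = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds + ticks / TICKS_PER_SECOND


def parse_request(request):
    """Parse 'id="1"; time="..."; x="..."; y="..."' (straight or curly quotes)."""
    fields = dict(re.findall(r'(\w+)\s*=\s*[“"]([^”"]*)[”"]', request))
    return {
        "id": fields["id"],
        "time": parse_time(fields["time"]),
        "x": float(fields["x"]),
        "y": float(fields["y"]),
    }


def find_object(masks, time, x, y):
    """Return the clicked object's name, or the string 'none' if no mask matches."""
    for mask in masks:
        if not mask.samples or not (mask.begin_time <= time <= mask.end_time):
            continue
        times = [sample.time for sample in mask.samples]
        # Latest sample at or before the click time (first sample as fallback).
        index = max(bisect_right(times, time) - 1, 0)
        sample = mask.samples[index]
        radius = sample.size / 2
        if (x - sample.x) ** 2 + (y - sample.y) ** 2 <= radius ** 2:
            return mask.object_name
    return "none"
```

A player-side caller would pause playback and show the metadata when the result is an object name, and keep playing when the result is "none". Interpolating between neighbouring single masks would give a smoother hit test, but that is a design choice beyond what the text specifies.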


3 Conclusion

In this study, we propose a system for metadata insertion, storage and retrieval for digital content. Unlike existing studies, the proposed system allows metadata to be inserted into a moving object of a motion picture, and it minimizes unnecessary duplication by improving the storage structure. To minimize storage space and increase retrieval speed, a metadata server based on binary XML is built, and the size of the data transmitted for metadata retrieval is reduced. Furthermore, a 28% reduction in storage space and a 34% faster retrieval speed compared with the existing system have been verified.

[1] Cisco visual networking index forecast. Informa Media and Telecoms, 2010.

[2] Cisco visual networking index: global mobile data traffic forecast update, 2010-2015. [2013-05-12]. http://newsroom.cisco.com/ekits/Cisco_VNI_Global_Mobile_Data_Traffic_Forecast_2010_2015.pdf.

[3] Kim S H, Lee S Y. Web technology and standardization for web 2.0 based IPTV service. In: Proceedings of the 10th International Conference on Advanced Communication Technology (ICACT'08), 2007, 6: 74-83.

[4] Lyu J H, Pyo S J, Lim J Y, et al. A personalized TV service under open network environment. Journal of Korean Society of Broadcast Engineers, 2006: 279-282.

[5] Chun S D, Joo S W, Lee S J. Development of digital contents authoring tool using metadata. Journal of Korean Institute of Information Scientists and Engineers, 2007, 2(C): 34.

[6] Chun S D, Shin J H, Lee S J. Implementation of an efficient browsing using metadata of digital contents. Journal of Korean Institute of Information Scientists and Engineers, 2008, 1(C): 7-10.

[7] Microsoft. Silverlight. [2013-05-11]. http://www.microsoft.com/silverlight/.

[8] Little J A, Beres J, Hinkson G, et al. Silverlight 3 programmer's reference. John Wiley & Sons, USA, 2009.

[9] Cowan C. XML in technical communication. 2nd ed. International Standard Text Code, 2010.

[10] W3C. Binary XML characterization. [2013-05-11]. http://www.w3.org/TR/xbc-characterization/.

date: 2013-06-21

The MSIP (Ministry of Science, ICT & Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (NIPA-2013-H0301-13-2006) supervised by the NIPA (National IT Industry Promotion Agency)

Sangjun Lee (sangjun@ssu.ac.kr)

1674-8042(2013)04-0361-04

10.3969/j.issn.1674-8042.2013.04.013