Multi-action-based approach for constructing knowledge map
YAN Yan(阎艳)1, HAO Jia(郝佳), WANG Guo-xin(王国新)1, GONG Lin(宫林)1, ZHAO Bo(赵博)2
(1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China; 2. Beijing Institute of Astronautic System Engineering, Beijing 310027, China)
To alleviate information overload in the product design process, this work proposes a multi-action-based method for constructing a knowledge map. Since the relationships between knowledge items are implicit in the collected user activities, the method calculates similarity from those activities. Three concepts, knowledge, action and user, are explained first. Based on these, the similarity calculation method is illustrated in detail. The dependencies between actions and the relations between users are considered in the calculation. Further, an approach for applying the constructed knowledge map to alleviate information overload is proposed. Finally, the proposed method is validated by a knowledge search and result comparison experiment.
design knowledge; information overload; user action; knowledge map
In the modern age, products are becoming more and more complex and the corresponding design processes are knowledge intensive. To manage design knowledge, many information systems, such as knowledge management systems (KMS), document management systems (DMS) and enterprise content management (ECM) systems, have been developed and deployed. These systems give designers access to a large amount of information, with the result that much more information is presented than the designers can process. This phenomenon exists widely and is commonly called “information overload”[1-3]. To reduce information overload, researchers are applying new artificial intelligence and visualization methods to improve the accuracy of search engines and to make information easier to access.
The knowledge map is one of the widely used methods in the field of knowledge management. It contributes to many aspects of knowledge management, such as knowledge visualization[4], information retrieval[3,5], knowledge guidance[6-7] and decision making[1,8-11]. In this paper, it is used to alleviate information overload, and the goal is to propose a new method for constructing a knowledge map. The basic principle of the method is that the semantic relations between design knowledge items are implicit in the user activity.
1.1 Definitions
Similarities between information items are the key elements of a knowledge map. Therefore, the main task is the calculation of similarity. In this paper, a multi-action-based approach is developed to compute the similarity. Three concepts, design knowledge, action and user, are addressed in this section since they are essential for the explanation of the proposed method.
1.1.1 Design knowledge
There is no widely accepted definition of design knowledge, and its meaning is complex. Even in the context of product design, the term has various meanings. In this paper, design knowledge is defined as the resource which can be used to support engineering design. The resource can be divided into Pictorial, Symbolic, Linguistic, Virtual (2D/3D) and Algorithm, etc.
1.1.2 Action
The users’ actions can be recorded when they use information systems, and each record is called an activity. Many actions can be defined according to the requirements of the information systems, and each action indicates the user’s interest to some extent. The relationships between actions differ: some actions are dependent while others are not. The actions indicate the user’s interest to varying degrees, which means some actions are more important than others in terms of indicating the user’s interest. In this paper, each action is assigned a weight to express its ability to indicate the user’s interest.
1.1.3 User
The user is the person who uses the information system and can perform actions on information. In this paper, we assume that the user uses the information system rationally, which means he/she only performs actions on information of his/her interest. Users can be represented by a list or a tree structure. A list of users means that the relationship between users is ignored; a tree structure means that the relationship between users is considered.
1.2 Data structure
In our method, each activity is represented as a triple. The triple indicates that a specific user performs an action on a specific information item. For example, the triple {u1,a2,k5} means the user (u1) performs the action (a2) on the information (k5). Based on that, all the activities can be organized into a three-dimensional matrix (activity matrix). Fig.1 shows an example of an activity matrix including three users, two information items and two actions.
Fig.1 Example of activity matrix
In this matrix, the value 1 represents that the action is performed on the corresponding information item by the corresponding user; otherwise, the value is 0. During the analysis, the activity matrix is decomposed into Na two-dimensional matrices as
(1)
where Na is the number of actions.
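The data structure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the user, action and item identifiers and the helper name `build_action_matrices` are hypothetical, and the orientation (rows for knowledge items, columns for users) follows the convention used for the 4U2K1A model.

```python
# Activities as (user, action, knowledge item) triples; all identifiers are hypothetical.
activities = [
    ("u1", "a1", "k1"),
    ("u1", "a2", "k2"),
    ("u2", "a1", "k1"),
    ("u3", "a2", "k1"),
    ("u3", "a2", "k2"),
]
users = ["u1", "u2", "u3"]
actions = ["a1", "a2"]
items = ["k1", "k2"]


def build_action_matrices(activities, users, actions, items):
    """Return one 0/1 matrix per action; rows are knowledge items, columns are users."""
    matrices = {a: [[0] * len(users) for _ in items] for a in actions}
    for user, action, item in activities:
        matrices[action][items.index(item)][users.index(user)] = 1
    return matrices


# The three-dimensional activity matrix, stored as Na two-dimensional slices.
action_matrices = build_action_matrices(activities, users, actions, items)
```

Storing the activity matrix as one two-dimensional slice per action makes the later per-action similarity calculation a matter of iterating over the dictionary.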
In this section, we will analyze the approach based on a simple model including four users and two knowledge items. The model is denoted by “4U2K”.
2.1 User independent models
In this section, the user independent model, in which the users are organized into a list, is discussed first. Three models will be discussed: “4U2K1A”, “4U2K2A” and “4U2K2A(d)”.
2.1.1 4U2K1A model
As shown in Fig.2, the 4U2K1A model means that only one action exists in the 4U2K model. The corresponding activity matrix is shown in the right graph of the figure. The row vectors represent the information items and the column vectors represent the users.
Fig.2 4U2K1A model
Based on the activity matrix, we can figure out the similarity by the following equation as
(2)
where k1, k2 are the row vectors of the activity matrix. The action similarity ranges from 0 to 1. In this case the action similarity between the two knowledge items is 0.5.
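As an illustration of a similarity between two row vectors, the sketch below assumes cosine similarity, a common measure that stays within [0, 1] for 0/1 vectors. This is an assumption: Eq. (2) is not reproduced in this copy, so the measure may differ from the one the authors use. The two rows are hypothetical and are not taken from Fig.2; they were chosen so that the assumed measure gives 0.5.

```python
import math


def action_similarity(k1, k2):
    """Cosine similarity of two 0/1 row vectors (an assumed form of the measure)."""
    dot = sum(x * y for x, y in zip(k1, k2))
    # For 0/1 entries the squared norm of a vector equals the sum of its entries.
    norm = math.sqrt(sum(k1) * sum(k2))
    return dot / norm if norm else 0.0


# Hypothetical rows for two knowledge items across four users.
k1 = [1, 1, 0, 0]
k2 = [0, 1, 1, 0]
similarity = action_similarity(k1, k2)
```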
2.1.2 4U2K2A model
If two independent actions are included, the 4U2K model becomes a 4U2K2A model. As shown in Fig.3, the model can be divided into two 4U2K1A models because of the independence of the two actions. Therefore, the action similarity of the decomposed models can also be calculated by Eq. (2) respectively.
Fig.3 4U2K2A model
After the similarity calculation of the two separated 4U2K1A models, the similarity of the 4U2K2A model can be obtained by
(3)
where α and β are the weights of the corresponding actions, and ka1, ka2, kb1, kb2 represent the row vectors of the similarity matrices. According to Eq. (3), the action similarity between the two knowledge items is 0.75 when α=β=1.
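One plausible way to combine per-action similarities with the weights α and β is a weighted sum, sketched below. The combination rule is an assumption, since Eq. (3) is not reproduced in this copy, and the per-action values are hypothetical rather than taken from Fig.3.

```python
def combined_similarity(per_action_similarity, weights):
    """Weighted sum of per-action similarities (an assumed combination rule)."""
    return sum(weights[action] * s for action, s in per_action_similarity.items())


per_action = {"a": 0.5, "b": 0.25}  # hypothetical similarities for actions a and b
weights = {"a": 1.0, "b": 1.0}      # alpha and beta
total = combined_similarity(per_action, weights)
```

Under this assumed rule the combined value is not bounded by 1, so larger weights raise the score of knowledge pairs linked by the more indicative action.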
2.1.3 4U2K2A(d) model
When the two actions in the 4U2K2A model are not independent, the model becomes a 4U2K2A(d) model. In this model we define that the occurrence of action b depends on action a. Dependent actions cause repeated counting during the action similarity calculation. Therefore, the action similarity cannot be directly computed by the previous method. The following equation is used to compute the action similarity of the dependent actions.
Aa=Aa∧(Aa∧Ab)
(4)
where Aa and Ab denote the two similarity matrices. Based on this, the final action similarity can be obtained from Eq. (3). In this model, the final action similarity is 0.75 when α=0.5 and β=1.5.
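The following sketch applies Eq. (4) exactly as it is printed, assuming that ∧ denotes the element-wise logical AND of two 0/1 matrices. The matrices are hypothetical, built so that every occurrence of action b is accompanied by action a, and the function names are illustrative only.

```python
def elementwise_and(a, b):
    """Element-wise logical AND of two 0/1 matrices of equal shape."""
    return [[x & y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def apply_eq4(A_a, A_b):
    """Evaluate A_a AND (A_a AND A_b), the right-hand side of Eq. (4) as printed."""
    return elementwise_and(A_a, elementwise_and(A_a, A_b))


# Hypothetical matrices: rows are knowledge items, columns are users.
A_a = [[1, 1, 0, 1], [1, 0, 1, 1]]
A_b = [[1, 0, 0, 1], [0, 0, 1, 0]]  # b occurs only where a occurred
A_a_processed = apply_eq4(A_a, A_b)
```

Because logical AND is associative and idempotent, the printed expression evaluates to the same matrix as Aa ∧ Ab.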
2.2 User dependent model
In this section, the user dependence is added into the 4U2K model. An example of a user tree is shown in Fig.4, where u2 and u3 are the child users of u1 and u4 is a sibling user of u1. To express the user dependence, we convert the user tree structure into a user dependence matrix U as shown in the right graph of the figure. The value uij of the matrix is 1 when Ui is the child or sibling of Uj and 0 otherwise.
Fig.4 Example of user tree structure
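The conversion from a user tree to the dependence matrix U can be sketched as follows. This is a minimal illustration of the stated rule (uij is 1 when Ui is the child or sibling of Uj); the parent map, the shared parent named "root" and the function name are assumptions made for the example, and the resulting matrix is not claimed to match the one drawn in Fig.4.

```python
def user_dependence_matrix(users, parent):
    """U[i][j] = 1 when users[i] is a child or sibling of users[j], else 0."""
    n = len(users)
    U = [[0] * n for _ in range(n)]
    for i, ui in enumerate(users):
        for j, uj in enumerate(users):
            if i == j:
                continue
            is_child = parent.get(ui) == uj
            is_sibling = parent.get(ui) is not None and parent.get(ui) == parent.get(uj)
            U[i][j] = int(is_child or is_sibling)
    return U


users = ["u1", "u2", "u3", "u4"]
# u2 and u3 are children of u1; u1 and u4 share a hypothetical parent "root".
parent = {"u1": "root", "u2": "u1", "u3": "u1", "u4": "root"}
U = user_dependence_matrix(users, parent)
```

Applied literally, the rule also marks u2 and u3 as siblings of each other, because they share the parent u1.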
2.2.1 4U(d)2K1A model
Based on the 4U2K1A model, a 4U(d)2K1A model can be obtained if the user dependences are considered. The activities express the users’ interests, and we believe the interests of siblings and children imply the user’s interests to some extent. If a user does not perform the action on an information item, but most of the children and siblings do, we can deem that the user performs the action to some extent. The activity matrix is first preprocessed to add the user dependence information. The rule for processing the activity matrix is that whether a user performs an action depends not only on the user’s actual activities but also on those of the user’s children and siblings. If the number of children and siblings who have performed the action exceeds the threshold value δ, the user can be deemed to have performed the action. The following equation is used to process the original activity matrix.
(5)
where aij is the element of the similarity matrix and auij is the element of AU.
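The thresholding rule described in the text can be sketched as below. Eq. (5) is not reproduced in this copy, so the sketch follows the prose only: it assumes "exceeds" means strictly greater than δ, counts relatives from the original (unprocessed) matrix, and uses a hypothetical dependence matrix U and activity matrix A.

```python
def apply_user_dependence(A, U, delta):
    """Return AU: a user counts as having acted if they did, or if more than
    delta of their children and siblings did.

    A: 0/1 matrix, rows = knowledge items, columns = users.
    U: U[i][j] = 1 when user i is a child or sibling of user j.
    """
    n_users = len(U)
    processed = []
    for row in A:
        new_row = []
        for j in range(n_users):
            supporters = sum(row[i] for i in range(n_users) if U[i][j])
            new_row.append(1 if row[j] or supporters > delta else 0)
        processed.append(new_row)
    return processed


# Hypothetical inputs for four users and two knowledge items.
U = [[0, 0, 0, 1], [1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
A = [[0, 1, 1, 1], [1, 0, 0, 0]]
AU = apply_user_dependence(A, U, delta=2)
```

In this example the first user has not acted on the first item, but three relatives have, so the entry is set to 1 for δ=2; raising δ to 3 would leave it at 0.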
2.2.2 4U(d)2K2A model
This model can be established according to Eq. (3), and can be divided into two 4U(d)2K1A models. In this case, the similarity is 1 under the condition of α=β=1 and δ=2.
2.2.3 4U(d)2K2A(d) model
The 4U(d)2K2A(d) model considers both the user dependence and the action dependence. The initial activity matrix is processed by the steps as follows.
① Process the action matrix by Eq. (4) to eliminate the action dependence.
② Process the action matrix by Eq. (3) to eliminate the user dependence.
③ Divide the model into two models and compute the similarity from Eq. (1) respectively.
④ Compute the final similarity from Eq. (2).
In this case, the final similarity is 1.625 under the condition of α=0.5, β=1.5, and δ=1. Based on this model, the generic model can be established since a generic model can be divided into multiple simple models.
In this paper, we apply the knowledge map to alleviate the information overload in design knowledge management. The basic idea is to find a small amount of information which is most likely to be accessed. Here the search result is filtered and reordered. Each search result is assigned a weight that expresses the probability that the user may access it. The weight can be calculated by
(6)
where wj denotes the weight of the knowledge.
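The filtering and reordering step can be sketched once the weights are available. The sketch below takes the weights wj as given inputs, because Eq. (6) is not reproduced in this copy; the item identifiers, the weight values, the cut-off `top_n` and the function name are all hypothetical.

```python
def filter_and_reorder(results, weights, top_n):
    """Keep the top_n results by weight, highest first; ties keep the search order."""
    ranked = sorted(results, key=lambda item: weights.get(item, 0.0), reverse=True)
    return ranked[:top_n]


results = ["k3", "k7", "k1", "k9"]           # hypothetical full-text search order
weights = {"k3": 0.2, "k7": 0.9, "k1": 0.5}  # hypothetical w_j values; k9 has none
shortlist = filter_and_reorder(results, weights, top_n=2)
```

Items without a weight default to 0 here, which pushes them to the end of the list rather than discarding them before the cut-off is applied.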
In this section, we first demonstrate the method of collecting user activities and then build a system to compare the full-text search result and the filtered result.
4.1 Activity collection
In practice, the user activities can be recorded by the deployed information management system. In this paper, the user activities were obtained according to the steps as follows.
① We built a design knowledge database which included 854 design knowledge items of computer numeric control (CNC) field.
② Experts in the CNC field provided us with 155 keywords to describe the field.
③ We invited twelve participants and each of them selected three to five keywords.
④ Each participant got a list of design knowledge items by keyword search (only the top 100).
⑤ The participants performed actions including view, favorite and comment on the list of design knowledge items.
After this process, we got the activity matrix.
4.2 Validate the result
The activity matrix is further processed according to the approach introduced in section 3 to get the action similarity matrix. The weight vector of the actions is {0.7, 1.1, 1.2}. Three knowledge maps are constructed under the condition of δ=1, δ=2, and δ=3. To validate the result, a system is developed as shown in Fig.5.
Fig.5 Interface used to test the result
In the system, the user can query the design knowledge and two kinds of search results are listed in the user interface. One of the results is provided by the full-text search function. The other result is reordered according to our method. It is noteworthy that the positions (left or right) of the results are random. Then the user can select the design knowledge of interest from any of the lists. The system can record the list from which the user selects the knowledge of interest.
The twelve participants can input a keyword and then press the search button. After this, the participants browse the two result lists. If the design knowledge of interest is found, the participants click the interest button and start the next search. For each keyword, three tests are conducted under different values of δ. During that process, the system will record the events of search and interest button click. A total of 429 query events are recorded based on the keywords query process. Fig.6 shows the result of these events.
In Fig.6, the filtered result value means the number of search events in which the design knowledge of interest is found in the filtered result. The full-text value means the number of search events in which the design knowledge of interest is found in the full-text search result. The failure value means the number of search events in which no design knowledge of interest is found. The symbol T is used to express the value.
Fig.6 Validation result
We can conclude from Fig.6 that the filtered result is better than the full-text search result under different δ. On average, the participants find the design knowledge in the filtered result in 67.83% of the search events while they find the design knowledge in the full-text search result in only 25.64% of the search events. They did not find the design knowledge in 6.53% of the search events. That is to say, the filtered result is in accordance with the interest of the users.
We can find that when the value of δ is 2, the result is slightly better since the value of T21=101 is slightly bigger than the value of T11=96 and T31=94. However, different δ values have little effect on the result in this case. The reason may be that the number of participants is small, which means the difference of the action similarity is not obvious under different δ values.
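The reported figures can be cross-checked with a short calculation. The three filtered-result counts and the total of 429 events come from the text; the full-text and failure counts are not stated there and are inferred here from the reported percentages, so they should be read as derived values.

```python
total_events = 429
filtered_hits = {1: 96, 2: 101, 3: 94}  # T11, T21, T31 for delta = 1, 2, 3

filtered_total = sum(filtered_hits.values())
filtered_share = 100 * filtered_total / total_events

# Counts inferred from the reported percentages (not stated in the text).
full_text_total = round(25.64 / 100 * total_events)
failure_total = round(6.53 / 100 * total_events)

full_text_share = 100 * full_text_total / total_events
failure_share = 100 * failure_total / total_events
```

The three T values sum to 291 events, which is 67.83% of 429, and the inferred counts of 110 and 28 events account for the remaining 25.64% and 6.53%, so the three categories together cover all 429 events.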
In this paper, a multi-action-based knowledge map construction method is proposed. The concepts of design knowledge, user and action are first addressed and explained. Based on this, the measurement of the action similarity of design knowledge is illustrated in detail. A method is presented to alleviate the information overload existing in product design by using knowledge maps. In this method, the full-text search results are reordered according to the action similarity of the design knowledge. An experiment is conducted to validate the proposed method. The result shows that the proposed method can provide the design knowledge of interest to the user. The proposed method uses a Boolean value to indicate the relationship of actions and users without considering the number of times the activities occur. Future work will consider the number of times the activities occur as well as the time when each activity happened.
[1] Ong T H, Chen H, Sung W K, et al. Newsmap: A knowledge map for online news [J]. Decision Support Systems, 2005, 39(4): 583-597.
[2] Chen Y J, Chen Y M. Knowledge evolution course discovery in a professional virtual community [J]. Knowledge-Based Systems, 2012, 33: 1-28.
[3] Chung W Y, Chen H C, Jay F N. A visual framework for knowledge discovery on the web: an empirical study of business intelligence exploration [J]. Journal of Management Information Systems, 2005, 21(4): 57-84.
[4] Martin J E, Remo A B. Visual representations in knowledge management: framework and cases [J]. Journal of Knowledge Management, 2007, 11(4): 112-122.
[5] Fu R L, C MH. Knowledge map creation and maintenance for virtual communities of practice [J]. Information Processing and Management, 2006, 42(2): 551-568.
[6] J L G. Creating knowledge maps by exploiting dependent relationships [J]. Knowledge-Based Systems, 2000, 13(2): 71-79.
[7] Liu Jun, Wang Jincheng, Zheng Qinghua, et al. Topological analysis of knowledge maps [J]. Knowledge-Based Systems, 2012, 36:260-267.
[8] Yoon B, Lee S, Lee G. Development and application of a keyword-based knowledge map for effective R&D planning [J]. Scientometrics, 2010, 85(3): 803-820.
[9] Woo J H, Clayton M J, Johnson R E, et al. Dynamic knowledge map: reusing experts’ tacit knowledge in the AEC industry [J]. Automation in Construction, 2004, 13(2): 203-207.
[10] Berg C V, Popescu I. An experience in knowledge mapping [J]. Journal of Knowledge Management, 2005, 9(2): 123-128.
[11] Zhuge Hai, Luo Xiangfeng. Knowledge map: Mathematical model and dynamic behaviors [J]. Journal of Computer Science and Technology, 2005, 20(3): 289-295.
(Edited by Cai Jianying)
DOI: 10.15918/j.jbit1004-0579.201524.0308
TP 319.7 Document code: A Article ID: 1004-0579(2015)03-0335-06
Received 2013-11-01
Supported by the National Natural Science Foundation of China (51375049); National Defense Basic Scientific Research (A222011; A222013)
E-mail: haojia632@gmail.com
Journal of Beijing Institute of Technology, 2015, Issue 3