王子岩 司亮 刘滨 刘宇 孙中贤 刘增杰 张红斌 刘青
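Abstract: The spread of mobile smart terminals such as mobile phones throughout society has pushed the capacity to produce digital content down to every level of society, forming a multi-source, autonomous, and native pattern of Internet media content production. Meanwhile, the rapid growth of emerging media, represented by social media and we-media, has greatly strengthened the capacity to disseminate digital content, and a large amount of derivative content is generated when sensitive, trending, and important events are reported and spread. Internet news is massive in volume, uneven in quality, and multipolar in viewpoint. How to accurately recommend news that has a correct value orientation and discloses information accurately, so as to maintain and promote social fairness and justice, has become a new problem and a new challenge in the judicial field.
Recommender systems effectively address the difficulty users face in efficiently obtaining information from massive volumes of it. Content-based recommendation analyzes the items a user was previously interested in, computes similar items, and pushes the most similar ones to the user. The most widely used approach in recommender systems is collaborative filtering (CF), a concept first proposed in 1992 by GOLDBERG et al. while developing the Tapestry mail filtering system. Its core idea is to analyze users' historical behavior data with algorithms, mine users' interest preferences, group users by those preferences, and recommend items favored by users with similar preferences. Today, personalized recommendation is fairly widely used in e-commerce, film and television, food and dining, news, and other fields. Recommendation at Jingdong began in 2012, when product recommendation was based on rule matching; the recommendation product lines resembled loosely connected primitive tribes with no shared engineering or algorithms between them. In 2013 Taobao launched its "personalized recommendation" engine, also known as "thousands of people, thousands of faces", which uses users' behavior to infer algorithmically what they may like. Meituan has built the world's largest dish knowledge base, drawn knowledge graphs for more than 2 million merchants and more than 300 million products, profiled 250 million users, and built the O2O intelligent recommendation platform with the world's largest user base. Douban uses social behavior analysis to address recommendation problems, for example collaborative filtering based on users' shared behavior and friend/neighbor recommendation, which also supplements personalized recommendation. Introducing social recommendation can counter the progressive narrowing of the recommendation range that results from recommending purely on product content. The personalized recommendation algorithm of Toutiao is based on voting: each user has one vote and casts it for the article he or she likes; after counting, the top result is likely to be the best article for that group of people, and it is recommended to users in the same group. The method looks simple, but in practice it depends on mining and analyzing the behavior data of a massive number of users.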
Study on information personalized recommendation based on system dynamics
WANG Ziyan1, SI Liang2,3, LIU Bin2,3, LIU Yu4, SUN Zhongxian2,3, LIU Zengjie2,3, ZHANG Hongbin5, LIU Qing2,3
(1. School of Law, Hebei University of Economics and Business, Shijiazhuang, Hebei 050061, China; 2. School of Economics and Management, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China; 3. Big Data and Social Computing Research Center, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China; 4. Library, Hebei Professional College of Political Science and Law, Shijiazhuang, Hebei 050061, China; 5. School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China)
With the spread of mobile smart terminals such as mobile phones throughout society, the capacity to produce digital content has reached all levels of society, forming a multi-source, independent, and native pattern of Internet media content production. With the vigorous development of emerging media represented by social media and we-media, the capacity to disseminate digital content has been greatly enhanced, and a large amount of derivative content is produced when sensitive, trending, and important events are reported and spread. The improvement of these two capacities makes Internet information massive in volume, uneven in quality, and diverse in viewpoint. How to accurately recommend news related to judicial work that has a correct value orientation and discloses information accurately, so as to maintain and promote social fairness and justice, has become a new problem and challenge in the judicial field.
Recommender systems effectively address the difficulty users have in efficiently finding the information they need within massive volumes of information. Content-based recommendation analyzes the items that a user was interested in before, computes similar items, and pushes the items with the highest similarity to the user. Collaborative filtering (CF) is the most widely used recommendation algorithm; it was first proposed by Goldberg et al. in 1992 during the development of the Tapestry e-mail filtering system. Its core idea is to analyze users' historical behavior data, mine their interest preferences, classify users according to those preferences, and recommend items favored by users with similar preferences. At present, personalized recommendation is widely used in e-commerce, film and television, food and beverage, news, and other fields. For example, recommendation at Jingdong started in 2012, when product recommendation was based on rule matching; the recommendation product lines resembled loosely connected primitive tribes, with no shared engineering or algorithms between them. In 2013 Taobao launched its "personalized recommendation" engine, also known as "thousands of people, thousands of faces", which uses users' behavior to infer algorithmically what they may like. Meituan has built the world's largest dish knowledge base, created knowledge graphs for more than 2 million merchants and more than 300 million products, built user profiles for 250 million users, and built the world's largest O2O intelligent recommendation platform by number of users. Douban uses social behavior analysis to address recommendation problems, for example collaborative filtering based on users' shared behavior and friend/neighbor recommendation, which also supplements personalized recommendation. Introducing social recommendation can counter the narrowing of the recommendation range caused by recommending purely on product content. The personalized recommendation algorithm of Toutiao is based on voting: each user has one vote and casts it for the article he or she likes; after counting, the top result is likely to be the best article for that group, and it is recommended to users in the same group. The method looks simple, but in practice it depends on mining and analyzing massive amounts of user behavior data.
System dynamics is an interdisciplinary subject based on system theory, cybernetics, and information theory, supported by computer simulation technology. Taking a systems perspective, it carries out structured, dynamic analysis and model simulation; it is well suited to high-order, nonlinear, time-varying complex systems and, by combining qualitative and quantitative analysis, to the dynamic and complex process of personalized information recommendation. Based on system dynamics theory, this paper models and simulates in Vensim the important factors that affect information recommendation, and constructs a causal feedback model and a stock-flow model covering the numbers of users, articles, and tags and the influence among subsystems. A system dynamics equation model is established, and a sensitivity analysis of the related factors is carried out. The results show that the number of articles, the feature tags, and the interest factors of articles all have an important impact on the recommendation effect. They are key factors to consider in designing a recommender system and important routes to addressing its key problems, such as cold start, real-time performance, and the "information cocoon". Research on personalized information recommendation based on system dynamics can actively and effectively meet the challenges of information disclosure in the judicial field and improve the accuracy of recommendation.
data processing; personalized recommendation; judicial work; system dynamics; information platform; simulation
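1 Identifying the key factors of the personalized information recommendation system with system dynamics
1.1 Building the causal feedback model of the personalized information recommendation system
1.2 Analysis of the causal feedback model
2 Stock-flow model of the factors influencing the personalized information recommendation system
2.1 Drawing the stock-flow diagram
2.2 Constructing the system dynamics equations
2.2.1 Establishing the Vensim equations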
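Interest-point factor = (share of articles of interest * article count + share of tags of interest * tag count) / article word count;
Total articles = INTEG(article collection - article expiry, 1000);
Total tags = INTEG(tag collection - tag extinction, 18);
Users = INTEG(user increase - user decrease, 500);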
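To make the INTEG notation concrete, the following is a minimal Python sketch, not the authors' Vensim model, of how the three stocks above could be stepped forward with Euler integration. The initial values (1000 articles, 18 tags, 500 users) are taken from the equations above. Every rate constant, the function names `interest_factor` and `simulate`, and the proportional form of the flows are assumptions made only for illustration, since the flow equations are not given here.

```python
# Euler-integration sketch of three INTEG stocks (articles, tags, users).
# ASSUMPTION: each flow is a fixed fraction of its own stock per day; the
# real model defines the flows through feedback links and table functions.


def interest_factor(article_share, articles, tag_share, tags, article_words):
    """Interest-point factor: (share*articles + share*tags) / article word count."""
    return (article_share * articles + tag_share * tags) / article_words


def simulate(days=10, dt=1.0,
             article_in=0.05, article_out=0.02,  # hypothetical rates per day
             tag_in=0.03, tag_out=0.01,          # hypothetical rates per day
             user_in=0.04, user_out=0.02):       # hypothetical rates per day
    """Return a list of (time, articles, tags, users) tuples."""
    articles, tags, users = 1000.0, 18.0, 500.0  # initial stock values
    history = [(0.0, articles, tags, users)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        # Net flow = inflow - outflow, evaluated at the start of the step.
        d_articles = (article_in - article_out) * articles
        d_tags = (tag_in - tag_out) * tags
        d_users = (user_in - user_out) * users
        # INTEG: the stock accumulates its net flow over the step dt.
        articles += d_articles * dt
        tags += d_tags * dt
        users += d_users * dt
        history.append((step * dt, articles, tags, users))
    return history


if __name__ == "__main__":
    for t, a, g, u in simulate(days=3):
        print(f"day {t:.0f}: articles={a:.1f} tags={g:.2f} users={u:.1f}")
```

With these assumed constant rates each stock simply grows geometrically, so the sketch does not reproduce the rise and fall that the full model shows; that behavior depends on the feedback structure and table functions, which are not modeled here.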
2.2.2 Establishing the table functions
2.2.3 Determining the initial values of the influencing factors
2.3 Model testing
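Step 1 checks the system boundary: by observing whether adding or removing a variable in the model still yields a closed loop, one verifies whether that variable affects the problem under study. Once this check passes, step 2 is a run check using the "Check Model" function under the Model menu in Vensim, which checks all variables in the model. The check shows that every variable has been assigned a value and that system dynamics equations have been established among all variables.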
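The two checks described above can be imitated outside Vensim. The sketch below is an illustrative Python analogue, not Vensim's actual "Check Model" implementation: `undefined_references` reports variables that are used in an equation but never defined, and `on_feedback_loop` tests whether a variable lies on a closed loop. The variable names and links in `toy_model` are hypothetical and do not reproduce the paper's causal feedback model.

```python
# Illustrative analogue of a boundary check and a completeness check.
# ASSUMPTION: a model is a dict mapping each variable to the variables
# that appear on the right-hand side of its equation.


def undefined_references(equations):
    """Return the sorted names that are referenced but have no equation."""
    defined = set(equations)
    used = {dep for deps in equations.values() for dep in deps}
    return sorted(used - defined)


def on_feedback_loop(equations, start):
    """Return True if `start` can reach itself by following dependency links."""
    stack = list(equations.get(start, []))
    seen = set()
    while stack:
        node = stack.pop()
        if node == start:
            return True          # we came back to the start: closed loop
        if node in seen:
            continue
        seen.add(node)
        stack.extend(equations.get(node, []))
    return False


# Hypothetical toy structure, for demonstration only.
toy_model = {
    "users": ["attraction"],
    "attraction": ["articles", "tags"],
    "articles": ["users"],
    "tags": ["articles"],
    "word_count": [],            # a constant: depends on nothing
}

if __name__ == "__main__":
    print("undefined:", undefined_references(toy_model))
    for name in toy_model:
        print(name, "on a loop:", on_feedback_loop(toy_model, name))
```

In this toy structure `word_count` is not itself on a loop because it is an exogenous constant; whether such a variable belongs inside the model boundary remains a modeling judgment that a graph check alone cannot settle.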
3 System dynamics simulation analysis
3.1 Factor simulation analysis
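In Fig. 4, the numbers of users, articles, and tags are clearly positively correlated. All three are stocks, that is, accumulated values. In terms of trend, each first increases and then decreases over time, with a marked acceleration early in the run and a peak at about 30 d of simulated time. After that, although the recommendation system keeps working, the number of users of the information platform begins to decline in the absence of any external stimulus. The trends of the other factors can likewise be examined through simulation; for reasons of space they are not discussed here.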
3.2 Analysis of simulation results with varied parameters
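Sparsity is a common problem in recommender systems: early in the system's operation, users and articles have few features, so recommendation quality is poor. When the context of article content is analyzed, article length affects the reading experience, and the system's word segmentation and semantic processing of articles also bear on data sparsity. Taking an article length of 3000 words as the baseline, the effect of word count on the attraction factor and the number of users was tested, as shown in Fig. 6. Fig. 6 shows that, within a reasonable range, changes in article word count have little effect on the attraction factor or the number of users. Sparsity can be mitigated to some extent by clustering users or articles, by improving inner pages, and by encouraging sharing, likes, and ratings, which increase user interaction with the platform and help collect feature information about users and articles.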
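A one-at-a-time parameter sweep of the kind described above can be sketched as follows. The helper `one_at_a_time` scales a single parameter around its baseline and reports the relative change in the model output. The function `toy_attraction` is a hypothetical placeholder chosen only so that the code runs; it is not the paper's attraction-factor equation, and the numbers it produces are not results.

```python
# One-at-a-time sensitivity sweep around a baseline parameter set.
# ASSUMPTION: `model` is any function of keyword parameters returning a number.


def one_at_a_time(model, baseline, name, multipliers):
    """Return (parameter value, output, relative change) for each multiplier."""
    base_out = model(**baseline)
    rows = []
    for m in multipliers:
        params = dict(baseline)            # copy so the baseline is untouched
        params[name] = baseline[name] * m  # vary only the chosen parameter
        out = model(**params)
        rows.append((params[name], out, (out - base_out) / base_out))
    return rows


def toy_attraction(words, articles):
    # HYPOTHETICAL placeholder: saturates in article count and is only
    # weakly penalized as word count departs from 3000.
    saturation = articles / (articles + 1000.0)
    length_penalty = 1.0 - 0.02 * abs(words - 3000.0) / 3000.0
    return saturation * length_penalty


if __name__ == "__main__":
    baseline = {"words": 3000.0, "articles": 1000.0}
    for value, out, rel in one_at_a_time(toy_attraction, baseline,
                                         "words", [0.5, 0.75, 1.0, 1.25, 1.5]):
        print(f"words={value:.0f} output={out:.4f} relative change={rel:+.2%}")
```

Reporting the relative change rather than the raw output makes it easier to compare parameters with different units, such as word count and article count, on a common scale.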
4 Conclusion
