
2017-09-18 18:41 Wang Zhaoguo, Xie Feng, Guan Yi, Xue Yibo
Intelligent Computer and Applications, 2017, No. 4


[…] School of […] Technology, Harbin 150001; 2 National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084)

Document code: A    Article ID: 2095-2163(2017)04-0001-05

2 National Lab for Information Sci. & Tech, Tsinghua University, Beijing 100084, China)

Abstract: With the development of social media, the Internet has become not only a tool for obtaining information but also a channel for sharing it. User-generated content exposes people to information overload, so that much genuinely valuable information is difficult to find. Because it requires little user involvement, the personalized recommendation system is currently regarded as one of the most promising ways to address information overload. However, collaborative filtering, currently the most mature and widely used recommendation method, faces problems such as data sparseness and diversity, and its recommendation quality is not ideal. This paper proposes a recommendation method based on linear regression. A linear regression model is built from the rating frequency information of users or items to predict users' ratings on unrated items. The method offers low complexity, incremental updating, and high accuracy.




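To compare how well the different methods tolerate data sparsity, this section splits the MovieLens 1M dataset into training and test sets at different ratios. The ratio x% increases from 10% to 90% in steps of 10%. The methods described in Section 2.3 are compared on rating prediction accuracy, on classification accuracy, and on model building and prediction time, as detailed below.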
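The split-and-evaluate protocol described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' experimental code: it assumes that x% denotes the share of ratings placed in the training set, it uses a simple item-average predictor as a stand-in for the compared methods, and it runs on a small synthetic rating list instead of MovieLens 1M. All function names (`split_ratings`, `item_average_model`, `rmse`, `sweep`) are hypothetical.

```python
import math
import random
from collections import defaultdict


def split_ratings(ratings, train_fraction, seed=0):
    """Shuffle (user, item, rating) triples and split them by train_fraction."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def item_average_model(train):
    """Return (per-item mean rating, global mean rating) from the training set."""
    sums, counts = defaultdict(float), defaultdict(int)
    for _, item, rating in train:
        sums[item] += rating
        counts[item] += 1
    global_mean = sum(r for _, _, r in train) / len(train)
    return {i: sums[i] / counts[i] for i in sums}, global_mean


def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs."""
    pairs = list(pairs)
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


def sweep(ratings, fractions=tuple(x / 10 for x in range(1, 10))):
    """Evaluate the stand-in predictor at training shares of 10%, 20%, ..., 90%."""
    results = {}
    for fraction in fractions:
        train, test = split_ratings(ratings, fraction)
        item_mean, global_mean = item_average_model(train)
        # Items unseen in training fall back to the global mean (an assumption).
        results[fraction] = rmse(
            (item_mean.get(item, global_mean), rating) for _, item, rating in test
        )
    return results


# Synthetic toy ratings on a 1-5 scale, for illustration only (not MovieLens data).
toy = [(u, i, 1 + (u * 7 + i * 3) % 5) for u in range(20) for i in range(10)]
results = sweep(toy)
```

Small training shares correspond to sparser training data, so reading `results` from the lowest to the highest share gives one RMSE curve of the kind a sparsity-tolerance comparison would plot per method.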




Fig. 1 RMSE comparison experiment





Fig. 2 F-Measure comparison experiment
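For readers unfamiliar with the classification-accuracy metric named in this figure, the following is a minimal sketch of how an F-Measure can be computed from predicted and actual ratings. The surviving text does not state how ratings are turned into classes, so the relevance threshold of 4.0 and the function name `f_measure` are assumptions made purely for illustration.

```python
def f_measure(predicted, actual, threshold=4.0):
    """F1 score, treating a rating >= threshold as 'relevant' (assumed rule)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p >= threshold and a >= threshold)
    fp = sum(1 for p, a in pairs if p >= threshold and a < threshold)
    fn = sum(1 for p, a in pairs if p < threshold and a >= threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Illustrative values only: one true positive, one false positive, one false negative.
score = f_measure([4.5, 3.0, 4.2, 2.0], [5, 4, 3, 1])
```

A different threshold, or a top-N definition of relevance, would change the counts; the harmonic-mean step stays the same.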



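Production environments always impose requirements on the response time of recommendation results, and large real-world systems with more than a hundred million users and items are especially sensitive to an algorithm's running time. This section compares the modeling time and prediction time of the linear-regression-based recommendation method with those of the methods introduced in Section 2.3. The results are shown in Table 2, where IA denotes Item Average, IC denotes Item Correlation, RF denotes Rating Frequency, and LR denotes Linear Regression.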
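One way to measure modeling time and prediction time separately is sketched below. It is an assumption-laden illustration, not the timing harness used in the paper: the helper `timed`, the stand-in functions `fit_item_average` and `predict_all`, the fallback rating of 3.0, and the tiny data are all hypothetical.

```python
import time
from collections import defaultdict


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def fit_item_average(train):
    """Modeling step of a stand-in Item Average predictor."""
    sums, counts = defaultdict(float), defaultdict(int)
    for _, item, rating in train:
        sums[item] += rating
        counts[item] += 1
    return {item: sums[item] / counts[item] for item in sums}


def predict_all(model, test, default=3.0):
    """Prediction step: look up each item's mean, falling back to a default."""
    return [model.get(item, default) for _, item in test]


# Tiny illustrative data: (user, item, rating) for training, (user, item) for testing.
train = [(0, 0, 4), (1, 0, 2), (0, 1, 5)]
test = [(1, 1), (2, 0), (2, 9)]

model, modeling_time = timed(fit_item_average, train)
predictions, prediction_time = timed(predict_all, model, test)
```

Timing the two phases independently matters because a method with an expensive offline modeling step may still answer online prediction requests quickly, and the reverse can also hold.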


Tab. 2 Comparison of modeling time and prediction time





