Li Lingfang (李灵芳), Huang Wenpei (黄文培), Hu Weijian (胡伟健)
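Abstract: Differential privacy, proposed by Dwork, is a privacy-preservation model based on data distortion. Because it overcomes two shortcomings of traditional privacy protection, namely the need for assumptions about background knowledge and the inability to quantify the level of protection, it has rapidly become a research focus in the privacy field in recent years. PINQ is the earliest interactive prototype system to implement differential privacy. This paper introduces the theoretical foundations of differential privacy and analyzes the implementation mechanism of the PINQ framework. Taking the differentially private K-means clustering implementation in PINQ as an example, it studies the application of differential privacy to clustering. Simulation experiments show that different privacy budgets yield different levels of privacy protection.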
Keywords: K-means; data distortion; differential privacy; PINQ
DOI: 10.11907/rjdk.161175
CLC number: TP309; Document code: A; Article ID: 1672-7800(2016)006-0204-05