We showed earlier that an $(\varepsilon, \delta)$-differentially private mechanism implies an upper bound on the power function of the hypothesis test between any two neighboring datasets (datasets differing in a single individual). A later paper, The Composition Theorem for Differential Privacy, proved that these two conditions are in fact equivalent. This suggests a very intuitive idea: why not define differential privacy directly in terms of the difficulty of the hypothesis test between any two neighboring datasets? To do that, we first need a way to measure the difficulty of a hypothesis test.
Trade-off functions
A trade-off means that two quantities are in tension: as one improves, the other degrades. The trade-off function is defined as follows.

Definition 2.1 (trade-off function). For any two probability distributions $P$ and $Q$ on the same space, define the trade-off function $T(P, Q)\colon [0,1] \to [0,1]$ as
$$T(P, Q)(\alpha) = \inf\{\beta_\phi : \alpha_\phi \le \alpha\},$$
where $\alpha_\phi = \mathbb{E}_P[\phi]$ and $\beta_\phi = 1 - \mathbb{E}_Q[\phi]$ are the type I and type II errors of a rejection rule $0 \le \phi \le 1$, and the infimum is taken over all (measurable) rejection rules.
In plain terms: with the type I error capped at $\alpha$, the function returns the infimum of the achievable type II error. The larger this value, the harder it is to distinguish the two distributions.
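As a concrete illustration (my own example, not from the text above): for two unit-variance Gaussians, the Neyman–Pearson lemma says the optimal test thresholds the sample itself, which gives the closed form $T(\mathcal{N}(0,1), \mathcal{N}(\mu,1))(\alpha) = \Phi(\Phi^{-1}(1-\alpha) - \mu)$. A minimal sketch using only the Python standard library:

```python
from statistics import NormalDist

std = NormalDist()  # standard normal N(0, 1)

def gaussian_tradeoff(mu, alpha):
    """Trade-off function T(N(0,1), N(mu,1))(alpha).

    By Neyman-Pearson, the optimal rejection rule for N(0,1) vs N(mu,1)
    thresholds the observation, yielding Phi(Phi^{-1}(1 - alpha) - mu).
    """
    return std.cdf(std.inv_cdf(1 - alpha) - mu)

# Identical distributions (mu = 0): testing is pure guessing, beta = 1 - alpha.
assert abs(gaussian_tradeoff(0.0, 0.3) - 0.7) < 1e-6
# Larger mu = distributions farther apart = smaller achievable type II error.
assert gaussian_tradeoff(2.0, 0.05) < gaussian_tradeoff(0.5, 0.05)
```

Note that $\mu = 0$ recovers the line $1 - \alpha$: when the two distributions coincide, no test does better than random guessing.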
One can show that a function mapping $[0,1]$ to $[0,1]$ is a trade-off function exactly when it is convex, continuous, non-increasing, and bounded above by $1 - x$. Proposition 2.2. A function $f\colon [0,1] \to [0,1]$ is a trade-off function if and only if $f$ is convex, continuous, non-increasing, and $f(x) \le 1 - x$ for $x \in [0,1]$.
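These properties can be spot-checked numerically. The sketch below (an illustration I added, using the Gaussian trade-off function $G_\mu(\alpha) = \Phi(\Phi^{-1}(1-\alpha) - \mu)$ as the test case) verifies the non-increasing property, the $1 - x$ upper bound, and midpoint convexity on a grid:

```python
from statistics import NormalDist

std = NormalDist()

def G(alpha, mu=1.0):
    # Trade-off function between N(0,1) and N(mu,1), used as a test case.
    return std.cdf(std.inv_cdf(1 - alpha) - mu)

xs = [i / 100 for i in range(1, 100)]  # interior grid, avoids inv_cdf(0/1)
ys = [G(x) for x in xs]

# Non-increasing in the type I error level.
assert all(y1 >= y2 - 1e-12 for y1, y2 in zip(ys, ys[1:]))
# Bounded above by 1 - x (equality only for identical distributions).
assert all(y <= 1 - x + 1e-12 for x, y in zip(xs, ys))
# Convexity, checked at grid midpoints.
assert all(G((x1 + x2) / 2) <= (G(x1) + G(x2)) / 2 + 1e-12
           for x1, x2 in zip(xs, xs[2:]))
```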
The definition of f-differential privacy
Definition 2.3 ($f$-differential privacy). Let $f$ be a trade-off function. A mechanism $M$ is said to be $f$-differentially private if $T(M(S), M(S')) \ge f$ for all neighboring datasets $S$ and $S'$. We say an algorithm satisfying this condition is $f$-DP. In other words, $f$-DP gives a lower bound on the trade-off function between any two neighboring datasets; in essence, it measures differential privacy by the difficulty of the hypothesis test.
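To make the definition concrete, here is a sketch I added (names are my own): Warner-style randomized response on a single bit, which reports the true bit with probability $e^{\varepsilon}/(1+e^{\varepsilon})$, attains the piecewise-linear trade-off function $f_{\varepsilon,0}(\alpha) = \max\{0,\; 1 - e^{\varepsilon}\alpha,\; e^{-\varepsilon}(1-\alpha)\}$, so it is exactly $f_{\varepsilon,0}$-DP. The infimum over rejection rules is brute-forced on a grid and compared against the closed form:

```python
import math

eps = 1.0
p = math.exp(eps) / (1 + math.exp(eps))  # probability of reporting the true bit

# Output distributions of randomized response on the neighboring inputs 0 and 1.
P = {0: p, 1: 1 - p}   # true bit is 0
Q = {0: 1 - p, 1: p}   # true bit is 1

# Enumerate randomized rejection rules phi = (phi(0), phi(1)) on a grid
# and record their (type I, type II) error pairs.
grid = [i / 200 for i in range(201)]
pairs = [(P[0] * a0 + P[1] * a1, 1 - (Q[0] * a0 + Q[1] * a1))
         for a0 in grid for a1 in grid]

def T(alpha):
    """Brute-forced trade-off: inf type II error over rules with type I <= alpha."""
    return min(beta for a, beta in pairs if a <= alpha + 1e-12)

def f_eps(alpha, eps=eps):
    """Closed-form trade-off function f_{eps,0} of eps-DP (delta = 0)."""
    return max(0.0, 1 - math.exp(eps) * alpha, math.exp(-eps) * (1 - alpha))

for alpha in [0.0, 0.1, 0.25, 0.5, 0.9]:
    assert abs(T(alpha) - f_eps(alpha)) < 1e-2  # grid resolution ~0.005
```

The brute force only works because the output space has two points; for continuous mechanisms one uses the likelihood-ratio characterization instead, as in the Gaussian example above.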