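Formally, suppose there are $K$ clients holding the data (a client can be a mobile phone, a wearable device, a clinical institution's data warehouse, and so on). Let $D_k$ denote the data distribution associated with client $k$, and let $n_k$ denote the number of samples available from that client. $N = \sum_{k=1}^{K} n_k$ is the total sample size, $F(w)$ is the loss function, and $w$ is the matrix of model parameters to be optimized. Federated learning then reduces to the problem of finding an algorithm that minimizes the loss function shown in the figure below: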
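To make the notation concrete, here is a minimal sketch of how such an objective could be evaluated. It assumes the common sample-weighted form $F(w) = \sum_{k=1}^{K} \frac{n_k}{N} F_k(w)$, where $F_k$ is the average loss on client $k$'s data. That form is an assumption here, as is the choice of a linear model with squared error, a parameter vector in place of a matrix, and the names `local_loss` and `federated_objective`, all of which are hypothetical and chosen only for illustration.

```python
import numpy as np


def local_loss(w, X, y):
    """Average squared error of a linear model on one client's data.

    Stands in for F_k(w); the squared-error choice is an assumption.
    """
    residual = X @ w - y
    return float(np.mean(residual ** 2))


def federated_objective(w, clients):
    """Sample-weighted global loss: sum over k of (n_k / N) * F_k(w)."""
    N = sum(len(y) for _, y in clients)  # total sample size
    return sum(len(y) / N * local_loss(w, X, y) for X, y in clients)


# Two toy clients with different sample counts n_k (synthetic data).
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(n, 3)), rng.normal(size=n)) for n in (5, 20)]
w = np.zeros(3)

print(federated_objective(w, clients))
```

With the weights $n_k / N$, the global value equals the loss that would be obtained by pooling all client data in one place, which is why this weighting is a natural baseline. In practice the server would only see each client's $F_k(w)$ (or its gradient), never the raw data.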