【强基固本】MMD: Maximum Mean Discrepancy
"Strengthen the foundations to go steadily and far." Scientific research cannot do without a solid theoretical base, and AI as a discipline especially needs the support of mathematics, physics, neuroscience, and other fundamental subjects. To keep pace with the times, we present the "强基固本" (Fundamentals) column, which explains foundational knowledge in AI to support your research and study, consolidate your theoretical grounding, and strengthen your capacity for original innovation. Stay tuned.
import numpy as np
import torch


def guassian_kernel(source, target, kernel_mul=2.0, kernel_num=5, fix_sigma=None):
    """Compute the multi-kernel Gaussian Gram matrix.

    source:     data of shape sample_size_1 * feature_size
    target:     data of shape sample_size_2 * feature_size
    kernel_mul: multiplicative step between the bandwidths of adjacent kernels
    kernel_num: number of Gaussian kernels in the mixture
    fix_sigma:  if given, use this fixed bandwidth instead of the data-driven one
    return:     a (sample_size_1 + sample_size_2) * (sample_size_1 + sample_size_2)
                matrix, laid out as
                [ K_ss  K_st
                  K_ts  K_tt ]
    """
    n_samples = int(source.size()[0]) + int(target.size()[0])
    total = torch.cat([source, target], dim=0)  # stack source and target samples
    total0 = total.unsqueeze(0).expand(int(total.size(0)),
                                       int(total.size(0)),
                                       int(total.size(1)))
    total1 = total.unsqueeze(1).expand(int(total.size(0)),
                                       int(total.size(0)),
                                       int(total.size(1)))
    L2_distance = ((total0 - total1) ** 2).sum(2)  # pairwise squared distances ||x - y||^2
    # Bandwidth of each kernel in the mixture
    if fix_sigma:
        bandwidth = fix_sigma
    else:
        # mean pairwise squared distance (the diagonal is zero, hence n^2 - n)
        bandwidth = torch.sum(L2_distance.data) / (n_samples ** 2 - n_samples)
    bandwidth /= kernel_mul ** (kernel_num // 2)
    bandwidth_list = [bandwidth * (kernel_mul ** i) for i in range(kernel_num)]
    # Gaussian kernel: exp(-||x - y||^2 / bandwidth)
    kernel_val = [torch.exp(-L2_distance / bandwidth_temp)
                  for bandwidth_temp in bandwidth_list]
    return sum(kernel_val)  # sum the kernels into one multi-kernel matrix
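Taken as a whole, and using notation introduced here only for illustration (it is not in the original code), guassian_kernel returns the Gram matrix of a sum of Gaussian kernels with geometrically spaced bandwidths:

$$
k(x, y) = \sum_{i=0}^{K-1} \exp\!\left(-\frac{\lVert x - y \rVert^2}{\sigma_i}\right),
\qquad
\sigma_i = \frac{\bar{d}}{\texttt{kernel\_mul}^{\lfloor K/2 \rfloor}} \cdot \texttt{kernel\_mul}^{\,i},
$$

where $K$ is kernel_num and $\bar{d}$ is the mean pairwise squared distance over the pooled samples (or fix_sigma, if supplied). Combining several bandwidths rather than a single one makes the resulting MMD less sensitive to the choice of kernel width.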
def mmd(source, target, kernel_mul=2.0, kernel_num=5, fix_sigma=None):
    n = int(source.size()[0])
    m = int(target.size()[0])
    kernels = guassian_kernel(source, target, kernel_mul=kernel_mul,
                              kernel_num=kernel_num, fix_sigma=fix_sigma)
    XX = kernels[:n, :n]
    YY = kernels[n:, n:]
    XY = kernels[:n, n:]
    YX = kernels[n:, :n]
    XX = torch.div(XX, n * n).sum(dim=1).view(1, -1)   # K_ss block, Source<->Source
    XY = torch.div(XY, -n * m).sum(dim=1).view(1, -1)  # K_st block, Source<->Target
    YX = torch.div(YX, -m * n).sum(dim=1).view(1, -1)  # K_ts block, Target<->Source
    YY = torch.div(YY, m * m).sum(dim=1).view(1, -1)   # K_tt block, Target<->Target
    loss = (XX + XY).sum() + (YX + YY).sum()
    return loss
if __name__ == "__main__":
    # The two sample sizes may differ, but the feature dimension must match:
    # 100 and 90 are sample sizes, 50 is the feature dimension.
    data_1 = torch.tensor(np.random.normal(loc=0, scale=10, size=(100, 50)))
    data_2 = torch.tensor(np.random.normal(loc=10, scale=10, size=(90, 50)))
    print("MMD Loss:", mmd(data_1, data_2))

    data_1 = torch.tensor(np.random.normal(loc=0, scale=10, size=(100, 50)))
    data_2 = torch.tensor(np.random.normal(loc=0, scale=9, size=(80, 50)))
    print("MMD Loss:", mmd(data_1, data_2))

    # Example output:
    # MMD Loss: tensor(1.0866, dtype=torch.float64)
    # MMD Loss: tensor(0.0852, dtype=torch.float64)
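The quantity returned by mmd is the biased (V-statistic) empirical estimate of the squared MMD under the kernel above; the grouping into the XX, XY, YX, YY blocks mirrors the code, and because the Gram matrix is symmetric the two cross terms together account for the factor of 2:

$$
\widehat{\mathrm{MMD}}^2(X, Y)
= \frac{1}{n^2}\sum_{i,j=1}^{n} k(x_i, x_j)
- \frac{2}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m} k(x_i, y_j)
+ \frac{1}{m^2}\sum_{i,j=1}^{m} k(y_i, y_j).
$$

Consistent with this, the example output is much larger for the first pair of samples (means 0 vs. 10, clearly different distributions) than for the second pair (both centered at 0 with similar scales).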