This article will familiarize you with the Pytorch -> Caffe -> om model conversion process

Huawei cloud developer community 2021-02-22 19:48:20


Abstract: This article shares the Pytorch -> Caffe -> om model conversion process.

Standard network

Baseline: PytorchToCaffe

The main functional code is organized as follows:

PytorchToCaffe
+-- Caffe
| +-- caffe.proto
| +-- layer_param.py
+-- example
| +-- resnet_pytorch_2_caffe.py
+-- pytorch_to_caffe.py

For direct use, refer to resnet_pytorch_2_caffe.py. If all of the operations in the network have already been implemented in the Baseline, it can be converted to a Caffe model directly.
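A minimal usage sketch in the spirit of example/resnet_pytorch_2_caffe.py (the trans_net / save_prototxt / save_caffemodel entry points are assumptions based on the example script; check it for the exact API):

import torch
from torchvision.models import resnet18
import pytorch_to_caffe

# Trace the network with a fixed-shape dummy input, then export both
# the structure (.prototxt) and the weights (.caffemodel).
name = 'resnet18'
net = resnet18(pretrained=True).eval()
dummy_input = torch.ones([1, 3, 224, 224])

pytorch_to_caffe.trans_net(net, dummy_input, name)
pytorch_to_caffe.save_prototxt('{}.prototxt'.format(name))
pytorch_to_caffe.save_caffemodel('{}.caffemodel'.format(name))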

Adding custom operations

If you encounter an operation that has not been implemented, there are two situations to consider.

Caffe has a corresponding operation

Taking arg_max as an example, here is how to add an operation.

The first thing to look at is the parameters of the corresponding layer in Caffe: caffe.proto holds the layer and parameter definitions for the Caffe version in use. You can see that ArgMax defines three parameters, out_max_val, top_k, and axis:

message ArgMaxParameter {
  // If true produce pairs (argmax, maxval)
  optional bool out_max_val = 1 [default = false];
  optional uint32 top_k = 2 [default = 1];
  // The axis along which to maximise -- may be negative to index from the
  // end (e.g., -1 for the last axis).
  // By default ArgMaxLayer maximizes over the flattened trailing dimensions
  // for each index of the first / num dimension.
  optional int32 axis = 3;
}

These are consistent with the parameters in the Caffe operator boundary specification.

layer_param.py builds the parameter class instance used in the concrete conversion and implements the transfer of operation parameters from Pytorch to Caffe:

def argmax_param(self, out_max_val=None, top_k=None, dim=1):
    argmax_param = pb.ArgMaxParameter()
    if out_max_val is not None:
        argmax_param.out_max_val = out_max_val
    if top_k is not None:
        argmax_param.top_k = top_k
    if dim is not None:
        argmax_param.axis = dim
    self.param.argmax_param.CopyFrom(argmax_param)

pytorch_to_caffe.py defines the Rp class, which is used to convert Pytorch operations into Caffe operations:

class Rp(object):
    def __init__(self, raw, replace, **kwargs):
        self.obj = replace
        self.raw = raw

    def __call__(self, *args, **kwargs):
        if not NET_INITTED:
            return self.raw(*args, **kwargs)
        # Walk up the call stack to find the module that triggered this op,
        # so the generated Caffe layer can reuse its name.
        for stack in traceback.walk_stack(None):
            if 'self' in stack[0].f_locals:
                layer = stack[0].f_locals['self']
                if layer in layer_names:
                    log.pytorch_layer_name = layer_names[layer]
                    print('984', layer_names[layer])
                    break
        out = self.obj(self.raw, *args, **kwargs)
        return out

When adding an operation, use the Rp class to replace the original operation:

torch.argmax = Rp(torch.argmax, torch_argmax)

Next, implement this operation:

def torch_argmax(raw, input, dim=1):
    x = raw(input, dim=dim)                # run the original torch.argmax
    layer_name = log.add_layer(name='argmax')
    top_blobs = log.add_blobs([x], name='argmax_blob')
    layer = caffe_net.Layer_param(name=layer_name, type='ArgMax',
                                  bottom=[log.blobs(input)], top=top_blobs)
    layer.argmax_param(dim=dim)            # pass the parameters defined above
    log.cnet.add_layer(layer)
    return x

This completes the conversion of the argmax operation from Pytorch to Caffe.
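A quick way to sanity-check the new hook (a sketch, assuming the same trans_net / save_prototxt entry points as in the usage example above): trace a tiny network that calls torch.argmax and confirm that an ArgMax layer appears in the exported prototxt.

import torch
import torch.nn as nn
import pytorch_to_caffe

# Hypothetical minimal module whose only operation is the newly added argmax.
class TinyNet(nn.Module):
    def forward(self, x):
        return torch.argmax(x, dim=1)

pytorch_to_caffe.trans_net(TinyNet(), torch.randn(1, 3, 8, 8), 'tiny')
pytorch_to_caffe.save_prototxt('tiny.prototxt')
print(open('tiny.prototxt').read())  # should contain a layer with type: "ArgMax"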

Caffe has no directly corresponding operation

If the operation to be converted has no directly corresponding layer implementation in Caffe, there are two main solutions:

1) Decompose the unsupported operation into supported operations in Pytorch:

Take nn.InstanceNorm2d: in the conversion, instance normalization is implemented with BatchNorm, which supports neither affine=True nor track_running_stats=True and defaults to use_global_stats: false, while the om conversion requires use_global_stats to be true. So the model can be converted to Caffe, but it is unfriendly for the further conversion to om.

InstanceNorm normalizes each channel of the feature map separately, so nn.InstanceNorm2d can be implemented as:

class InstanceNormalization(nn.Module):
    def __init__(self, dim, eps=1e-5):
        super(InstanceNormalization, self).__init__()
        self.gamma = nn.Parameter(torch.FloatTensor(dim))
        self.beta = nn.Parameter(torch.FloatTensor(dim))
        self.eps = eps
        self._reset_parameters()

    def _reset_parameters(self):
        self.gamma.data.uniform_()
        self.beta.data.zero_()

    def __call__(self, x):
        n = x.size(2) * x.size(3)
        t = x.view(x.size(0), x.size(1), n)
        mean = torch.mean(t, 2).unsqueeze(2).unsqueeze(3).expand_as(x)
        var = torch.var(t, 2).unsqueeze(2).unsqueeze(3).expand_as(x)
        gamma_broadcast = self.gamma.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
        beta_broadcast = self.beta.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
        out = (x - mean) / torch.sqrt(var + self.eps)
        out = out * gamma_broadcast + beta_broadcast
        return out

However, when checking against the HiLens Caffe operator boundary, it turned out that the om model conversion does not support sum or mean reductions over dimensions other than the Channel dimension. To circumvent this, we can reimplement nn.InstanceNorm2d with supported operators:

class InstanceNormalization(nn.Module):
    def __init__(self, dim, eps=1e-5):
        super(InstanceNormalization, self).__init__()
        self.gamma = torch.FloatTensor(dim)
        self.beta = torch.FloatTensor(dim)
        self.eps = eps
        self.adavg = nn.AdaptiveAvgPool2d(1)   # per-channel mean over H x W

    def forward(self, x):
        n, c, h, w = x.shape
        # Broadcast the 1x1 statistics back to H x W with Upsample
        # (note: scale_factor=h assumes square feature maps, h == w).
        mean = nn.Upsample(scale_factor=h)(self.adavg(x))
        var = nn.Upsample(scale_factor=h)(self.adavg((x - mean).pow(2)))
        gamma_broadcast = self.gamma.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
        beta_broadcast = self.beta.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
        out = (x - mean) / torch.sqrt(var + self.eps)
        out = out * gamma_broadcast + beta_broadcast
        return out

After verification, this is equivalent to the original operation, and the model can be converted to Caffe.
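As a quick numerical check of this equivalence (a sketch: gamma and beta are fixed to 1 and 0 so both sides compute the bare normalization, and the input is square because of the scale_factor=h trick above):

import torch
import torch.nn as nn

x = torch.randn(2, 8, 32, 32)                        # square H == W
ref = nn.InstanceNorm2d(8, eps=1e-5, affine=False)   # reference implementation

inorm = InstanceNormalization(8)
inorm.gamma = torch.ones(8)    # neutralize the scale
inorm.beta = torch.zeros(8)    # neutralize the shift

print(torch.allclose(inorm(x), ref(x), atol=1e-4))   # expect True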

2) Implement the operation in Caffe by combining existing operations:

During the Pytorch-to-Caffe conversion, it was found that operations involving constants, such as featuremap + 6, cause a problem: the constant operand has no blob. Let's first look at the concrete conversion of the add operation in pytorch_to_caffe.py:

def _add(input, *args):
    x = raw__add__(input, *args)
    if not NET_INITTED:
        return x
    layer_name = log.add_layer(name='add')
    top_blobs = log.add_blobs([x], name='add_blob')
    if log.blobs(args[0]) == None:
        # constant operand: only an extra blob is registered, no layer is emitted
        log.add_blobs([args[0]], name='extra_blob')
    else:
        layer = caffe_net.Layer_param(name=layer_name, type='Eltwise',
                                      bottom=[log.blobs(input), log.blobs(args[0])], top=top_blobs)
        layer.param.eltwise_param.operation = 1  # sum is 1
        log.cnet.add_layer(layer)
    return x

You can see that the case where the operand has no blob is detected but not properly handled; all we need to do is modify the log.blobs(args[0]) == None branch. A natural idea is to implement the constant add with a Scale layer:

def _add(input, *args):
    x = raw__add__(input, *args)
    if not NET_INITTED:
        return x
    layer_name = log.add_layer(name='add')
    top_blobs = log.add_blobs([x], name='add_blob')
    if log.blobs(args[0]) == None:
        # constant operand: Scale layer with weight 1 and the constant as bias
        layer = caffe_net.Layer_param(name=layer_name, type='Scale',
                                      bottom=[log.blobs(input)], top=top_blobs)
        layer.param.scale_param.bias_term = True
        weight = torch.ones((input.shape[1]))
        bias = torch.tensor(args[0]).squeeze().expand_as(weight)
        layer.add_data(weight.cpu().data.numpy(), bias.cpu().data.numpy())
        log.cnet.add_layer(layer)
    else:
        layer = caffe_net.Layer_param(name=layer_name, type='Eltwise',
                                      bottom=[log.blobs(input), log.blobs(args[0])], top=top_blobs)
        layer.param.eltwise_param.operation = 1  # sum is 1
        log.cnet.add_layer(layer)
    return x

Similarly, a simple constant multiplication such as featuremap * 6 can be handled in the same way.
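For reference, a sketch of the corresponding _mul hook under the same assumptions (the log / caffe_net / NET_INITTED helpers and a saved raw__mul__ handle from pytorch_to_caffe.py): a constant multiply maps to a Scale layer whose per-channel weight is the constant and which has no bias, while the two-blob case maps to Eltwise PROD:

def _mul(input, *args):
    x = raw__mul__(input, *args)
    if not NET_INITTED:
        return x
    layer_name = log.add_layer(name='mul')
    top_blobs = log.add_blobs([x], name='mul_blob')
    if log.blobs(args[0]) == None:
        # constant operand: Scale layer, weight = constant, no bias
        layer = caffe_net.Layer_param(name=layer_name, type='Scale',
                                      bottom=[log.blobs(input)], top=top_blobs)
        layer.param.scale_param.bias_term = False
        weight = torch.ones(input.shape[1]) * torch.tensor(args[0]).squeeze()
        layer.add_data(weight.cpu().data.numpy())
        log.cnet.add_layer(layer)
    else:
        layer = caffe_net.Layer_param(name=layer_name, type='Eltwise',
                                      bottom=[log.blobs(input), log.blobs(args[0])], top=top_blobs)
        layer.param.eltwise_param.operation = 0  # product is 0
        log.cnet.add_layer(layer)
    return x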

Pitfalls

  • Pooling: Pytorch defaults to ceil_mode=False while Caffe defaults to ceil_mode=true, which can change output dimensions. If there is a size mismatch, check whether the Pooling parameters are correct. In addition, although it is not mentioned in the documentation, a model with kernel_size > 32 can still be converted, but inference will report an error; this can be worked around by splitting the operation into two Pooling layers (see the sketch after this list).
  • Upsample: in the om operator boundary, the scale_factor parameter of the Upsample layer must be an int and cannot be a size. If the existing model uses size, the Pytorch-to-Caffe conversion will run normally, but the resulting Upsample layer's parameter will be empty. In that case, consider converting size to an equivalent scale_factor, or implementing the upsampling with Deconvolution.
  • ConvTranspose2d: in Pytorch, the output_padding parameter is added to the output size, but Caffe has no such parameter, so the output feature map would come out smaller. Instead, let the deconvolution in Caffe produce a slightly larger feature map and cut it with a Crop layer to match the size of the corresponding Pytorch layer. In addition, deconvolution inference in om is slow; it is better to avoid it and replace it with Upsample + Convolution.
  • Pad: Pytorch supports many kinds of pad operations, but Caffe can only pad symmetrically in the H and W dimensions. If Pytorch contains an asymmetric pad such as h = F.pad(x, (1, 2, 1, 2), "constant", 0), the solutions are as follows:
  1. If the asymmetric pad causes no subsequent dimension mismatch, first evaluate the effect of the pad on the result; some tasks are barely affected by the pad, in which case no modification is needed.
  2. If there is a dimension mismatch, consider padding with the larger parameter and then cutting with Crop, or merging adjacent pads such as (0, 0, 1, 1) and (1, 1, 0, 0) into a single (1, 1, 1, 1) pad, depending on the specific network structure.
  3. If the pad is in the Channel dimension, such as F.pad(x, (0, 0, 0, 0, 0, channel_pad), "constant", 0), consider concatenating the output of a zero-weight convolution onto the feature map:
self.zero = nn.Conv2d(in_channels, self.channel_pad, kernel_size=3, padding=1, bias=False)
nn.init.constant_(self.zero.weight, 0)   # all-zero weights, so the output is a zero pad
pad_tensor = self.zero(x)
x = torch.cat([x, pad_tensor], dim=1)
  • Some operations can be converted to Caffe, but om does not support all of standard Caffe's operations. If the goal is to convert to om, check the operators against the boundary operator documentation.
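A minimal sketch of the kernel_size > 32 workaround mentioned in the Pooling note (the 64x64 window is a hypothetical example): averaging twice over 8x8 windows equals averaging once over 64x64, so one oversized pooling layer can be split into two supported ones.

import torch.nn as nn

big_pool = nn.AvgPool2d(kernel_size=64)    # converts, but om inference reports an error
split_pool = nn.Sequential(
    nn.AvgPool2d(kernel_size=8),           # 64x64 -> 8x8
    nn.AvgPool2d(kernel_size=8),           # 8x8 -> 1x1
)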
This article is shared from the Huawei Cloud community post "Pytorch->Caffe Model Conversion", original author: Du Fu built a house.

 

