This article walks you through the Pytorch -> Caffe -> om model conversion process

Huawei cloud developer community 2021-02-22 15:22:03


Abstract: This article shares the process of converting a Pytorch model to Caffe and then to an om model.

Standard network

Baseline: PytorchToCaffe

The main functional code is organized as follows:

PytorchToCaffe
+-- Caffe
| +-- caffe.proto
| +-- layer_param.py
+-- example
| +-- resnet_pytorch_2_caffe.py
+-- pytorch_to_caffe.py

For direct use, refer to resnet_pytorch_2_caffe.py. If all operations in the network have already been implemented in the Baseline, the network can be converted to a Caffe model directly.
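As a point of reference, a typical conversion script looks roughly like the minimal sketch below, modeled on resnet_pytorch_2_caffe.py (the trans_net / save_prototxt / save_caffemodel entry points come from the PytorchToCaffe repository; the output file names are illustrative):

import torch
from torchvision.models import resnet
import pytorch_to_caffe

name = 'resnet18'
net = resnet.resnet18(pretrained=True)
net.eval()
# Trace the network once with a dummy input so each op is recorded
dummy = torch.ones([1, 3, 224, 224])
pytorch_to_caffe.trans_net(net, dummy, name)
# Write out the Caffe network definition and the weights
pytorch_to_caffe.save_prototxt('{}.prototxt'.format(name))
pytorch_to_caffe.save_caffemodel('{}.caffemodel'.format(name))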

Adding custom operations

If you encounter an operation that has not been implemented yet, there are two cases to consider.

Caffe has a corresponding operation

Taking arg_max as an example, let's look at how to add an operation.

The first thing to look at is the parameters of the corresponding layer in Caffe: caffe.proto holds the layer and parameter definitions for the Caffe version in use, where you can see that ArgMax defines three parameters, out_max_val, top_k, and axis:

message ArgMaxParameter {
// If true produce pairs (argmax, maxval)
optional bool out_max_val = 1 [default = false];
optional uint32 top_k = 2 [default = 1];
// The axis along which to maximise -- may be negative to index from the
// end (e.g., -1 for the last axis).
// By default ArgMaxLayer maximizes over the flattened trailing dimensions
// for each index of the first / num dimension.
optional int32 axis = 3;
}

These are consistent with the parameters in the Caffe operator boundary specification.

layer_param.py builds the concrete parameter-class instances during conversion, transferring the operation parameters from Pytorch to Caffe:

def argmax_param(self, out_max_val=None, top_k=None, dim=1):
    argmax_param = pb.ArgMaxParameter()
    if out_max_val is not None:
        argmax_param.out_max_val = out_max_val
    if top_k is not None:
        argmax_param.top_k = top_k
    if dim is not None:
        argmax_param.axis = dim
    self.param.argmax_param.CopyFrom(argmax_param)

pytorch_to_caffe.py defines the Rp class, which is used to convert a Pytorch operation into a Caffe operation:

class Rp(object):
    def __init__(self, raw, replace, **kwargs):
        self.obj = replace
        self.raw = raw

    def __call__(self, *args, **kwargs):
        # Before tracing starts, just run the original op
        if not NET_INITTED:
            return self.raw(*args, **kwargs)
        # Walk up the call stack to find the module that issued the call,
        # so the generated Caffe layer can reuse its registered name
        for stack in traceback.walk_stack(None):
            if 'self' in stack[0].f_locals:
                layer = stack[0].f_locals['self']
                if layer in layer_names:
                    log.pytorch_layer_name = layer_names[layer]
                    print(layer_names[layer])
                    break
        # Call the replacement, which logs the layer and runs the raw op
        out = self.obj(self.raw, *args, **kwargs)
        return out

When adding an operation, use the Rp class to replace the original operation:

torch.argmax = Rp(torch.argmax, torch_argmax)

Next, implement the replacement operation:

def torch_argmax(raw, input, dim=1):
    x = raw(input, dim=dim)
    layer_name = log.add_layer(name='argmax')
    top_blobs = log.add_blobs([x], name='argmax_blob')
    layer = caffe_net.Layer_param(name=layer_name, type='ArgMax',
                                  bottom=[log.blobs(input)], top=top_blobs)
    layer.argmax_param(dim=dim)
    log.cnet.add_layer(layer)
    return x

This completes the conversion of the argmax operation from Pytorch to Caffe.

No directly corresponding operation in Caffe

If the operation to be converted has no directly corresponding layer implementation in Caffe, there are two main solutions:

  1. Decompose the unsupported operation into supported operations on the Pytorch side:

    Take nn.InstanceNorm2d as an example: instance normalization is implemented with BatchNorm during conversion, which supports neither affine=True nor track_running_stats=True, and the generated layer defaults to use_global_stats: false. However, the om conversion requires use_global_stats to be true, so the model converts to Caffe fine but is unfriendly to the subsequent om conversion.

    InstanceNorm normalizes each Channel of the featuremap separately, so nn.InstanceNorm2d can be implemented as:

    class InstanceNormalization(nn.Module):
        def __init__(self, dim, eps=1e-5):
            super(InstanceNormalization, self).__init__()
            self.gamma = nn.Parameter(torch.FloatTensor(dim))
            self.beta = nn.Parameter(torch.FloatTensor(dim))
            self.eps = eps
            self._reset_parameters()

        def _reset_parameters(self):
            self.gamma.data.uniform_()
            self.beta.data.zero_()

        def __call__(self, x):
            n = x.size(2) * x.size(3)
            t = x.view(x.size(0), x.size(1), n)
            mean = torch.mean(t, 2).unsqueeze(2).unsqueeze(3).expand_as(x)
            var = torch.var(t, 2).unsqueeze(2).unsqueeze(3).expand_as(x)
            gamma_broadcast = self.gamma.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
            beta_broadcast = self.beta.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
            out = (x - mean) / torch.sqrt(var + self.eps)
            out = out * gamma_broadcast + beta_broadcast
            return out

    However, checking against the HiLens Caffe operator boundary revealed that the om model conversion does not support sum or mean operations over dimensions other than Channel. To circumvent this, we can reimplement nn.InstanceNorm2d with supported operators:

    class InstanceNormalization(nn.Module):
        def __init__(self, dim, eps=1e-5):
            super(InstanceNormalization, self).__init__()
            self.gamma = torch.FloatTensor(dim)
            self.beta = torch.FloatTensor(dim)
            self.eps = eps
            self.adavg = nn.AdaptiveAvgPool2d(1)

        def forward(self, x):
            n, c, h, w = x.shape
            # Global average pooling plus Upsample recovers the per-channel mean
            # (this assumes square featuremaps, i.e. h == w)
            mean = nn.Upsample(scale_factor=h)(self.adavg(x))
            var = nn.Upsample(scale_factor=h)(self.adavg((x - mean).pow(2)))
            gamma_broadcast = self.gamma.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
            beta_broadcast = self.beta.unsqueeze(1).unsqueeze(1).unsqueeze(0).expand_as(x)
            out = (x - mean) / torch.sqrt(var + self.eps)
            out = out * gamma_broadcast + beta_broadcast
            return out

    After verification, this is equivalent to the original operation, and the model can be converted to a Caffe model (see the quick check below).
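    As a sanity check, a minimal comparison against nn.InstanceNorm2d could look like the following sketch (hypothetical test code, assuming the class above is in scope; gamma/beta are set to 1/0 to match a non-affine InstanceNorm, and the featuremap is square as the implementation assumes):

    import torch
    import torch.nn as nn

    x = torch.randn(2, 8, 16, 16)
    ref = nn.InstanceNorm2d(8, eps=1e-5, affine=False)(x)
    inorm = InstanceNormalization(8)
    inorm.gamma = torch.ones(8)   # neutral scale
    inorm.beta = torch.zeros(8)   # neutral shift
    out = inorm(x)
    print((out - ref).abs().max())  # should be close to zero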

  2. Implement the operation in Caffe by combining existing operations:

    During the Pytorch-to-Caffe conversion it was found that an operation involving a constant, such as featuremap + 6, runs into a missing-blob problem, because the constant operand has no recorded blob. Let's first look at how the add operation is converted in pytorch_to_caffe.py:

    def _add(input, *args):
        x = raw__add__(input, *args)
        if not NET_INITTED:
            return x
        layer_name = log.add_layer(name='add')
        top_blobs = log.add_blobs([x], name='add_blob')
        if log.blobs(args[0]) == None:
            log.add_blobs([args[0]], name='extra_blob')
        else:
            layer = caffe_net.Layer_param(name=layer_name, type='Eltwise',
                                          bottom=[log.blobs(input), log.blobs(args[0])], top=top_blobs)
            layer.param.eltwise_param.operation = 1  # sum is 1
            log.cnet.add_layer(layer)
        return x

    You can see that the case where the operand has no blob is detected but no layer is generated for it. All we need to do is modify the log.blobs(args[0]) == None branch; a natural idea is to implement the constant add with a Scale layer:

    def _add(input, *args):
        x = raw__add__(input, *args)
        if not NET_INITTED:
            return x
        layer_name = log.add_layer(name='add')
        top_blobs = log.add_blobs([x], name='add_blob')
        if log.blobs(args[0]) == None:
            # Constant operand: y = 1 * x + const, expressed as a Scale layer
            layer = caffe_net.Layer_param(name=layer_name, type='Scale',
                                          bottom=[log.blobs(input)], top=top_blobs)
            layer.param.scale_param.bias_term = True
            weight = torch.ones((input.shape[1]))
            bias = torch.tensor(args[0]).squeeze().expand_as(weight)
            layer.add_data(weight.cpu().data.numpy(), bias.cpu().data.numpy())
            log.cnet.add_layer(layer)
        else:
            layer = caffe_net.Layer_param(name=layer_name, type='Eltwise',
                                          bottom=[log.blobs(input), log.blobs(args[0])], top=top_blobs)
            layer.param.eltwise_param.operation = 1  # sum is 1
            log.cnet.add_layer(layer)
        return x

    Similarly, a simple constant multiplication such as featuremap * 6 can be handled the same way, as sketched below.
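    For illustration, a hypothetical _mul conversion in the same style might look like this (a sketch under the same assumptions as _add above: raw__mul__ is the saved original operator, add_data accepts the blob arrays, and a Scale layer without bias multiplies each channel by the constant):

    def _mul(input, *args):
        x = raw__mul__(input, *args)
        if not NET_INITTED:
            return x
        layer_name = log.add_layer(name='mul')
        top_blobs = log.add_blobs([x], name='mul_blob')
        if log.blobs(args[0]) == None:
            # Constant operand: y = const * x, expressed as a Scale layer without bias
            layer = caffe_net.Layer_param(name=layer_name, type='Scale',
                                          bottom=[log.blobs(input)], top=top_blobs)
            layer.param.scale_param.bias_term = False
            weight = torch.ones((input.shape[1])) * args[0]
            layer.add_data(weight.cpu().data.numpy())
            log.cnet.add_layer(layer)
        else:
            layer = caffe_net.Layer_param(name=layer_name, type='Eltwise',
                                          bottom=[log.blobs(input), log.blobs(args[0])], top=top_blobs)
            layer.param.eltwise_param.operation = 0  # product is 0
            log.cnet.add_layer(layer)
        return x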

Pitfalls

  • Pooling: Pytorch defaults to ceil_mode=False while Caffe defaults to ceil_mode=true, which can lead to dimension changes. If sizes don't match, check whether the Pooling parameters are correct. In addition, although it is not mentioned in the documentation, a model with kernel_size > 32 can still be converted, but inference will report an error; this can be worked around by splitting the pooling into two Pooling layers (see the sketch after this list).
  • Upsample: in the om operator boundary, the Upsample layer's scale_factor parameter must be an int; it cannot be a size. If an existing model uses the size parameter, the Pytorch-to-Caffe conversion will appear to run normally, but the resulting Upsample parameters will be empty. When size is given, consider converting it to an equivalent scale_factor, or use Deconvolution instead.
  • ConvTranspose2d: in Pytorch, output_padding is added to the output size, but Caffe cannot do this, so the output feature map becomes smaller. Since deconvolution enlarges the featuremap, a Crop layer can cut it back to the size of the corresponding Pytorch layer. In addition, deconvolution inference in om is slow; it is better to avoid it and use Upsample + Convolution instead (a sketch also follows after this list).
  • Pad: Pytorch offers many Pad variants, but Caffe can only pad symmetrically in the H and W dimensions. If Pytorch contains an asymmetric pad such as h = F.pad(x, (1, 2, 1, 2), "constant", 0), the solutions are as follows:

    1. If the asymmetric pad causes no subsequent dimension mismatch, first assess the pad's effect on the result; some tasks are barely affected by the pad, in which case no modification is needed.
    2. If there is a dimension mismatch, consider padding with the larger parameter and then cutting with a Crop layer, or merging adjacent (0, 0, 1, 1) and (1, 1, 0, 0) pads into a single (1, 1, 1, 1) pad, depending on the specific network structure.
    3. For a pad in the Channel dimension, such as F.pad(x, (0, 0, 0, 0, 0, channel_pad), "constant", 0), consider concatenating the output of a zero-weight convolution onto the featuremap:

      # A convolution whose weights are fixed to zero produces the padding channels
      zero = nn.Conv2d(in_channels, channel_pad, kernel_size=3, padding=1, bias=False)
      nn.init.constant_(zero.weight, 0)
      pad_tensor = zero(x)
      x = torch.cat([x, pad_tensor], dim=1)
  • Some operations can be converted to Caffe, but om does not support everything in standard Caffe. If you intend to convert to om, check your operators against the boundary operator documentation.
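As referenced in the Pooling item above, here is a minimal sketch of splitting one oversized pooling into two (hypothetical sizes; with the default stride equal to the kernel size and a divisible input, two stacked 8x8 average poolings cover the same 64x64 window as one pooling):

import torch.nn as nn

# Original: a single pooling with kernel_size 64, which converts but fails at om inference
# pool = nn.AvgPool2d(kernel_size=64)

# Workaround: two stacked poolings with kernel_size <= 32 covering the same window
pool = nn.Sequential(
    nn.AvgPool2d(kernel_size=8),
    nn.AvgPool2d(kernel_size=8),
)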
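And for the ConvTranspose2d item, one common Upsample + Convolution substitute looks like the sketch below. Note this is a hypothetical example, not a drop-in equivalent: nearest-neighbour upsampling plus convolution is a different operator from deconvolution, so the weights must be retrained; the channel counts are illustrative:

import torch.nn as nn

in_ch, out_ch = 64, 32  # hypothetical channel counts

# Instead of: nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)
up_conv = nn.Sequential(
    nn.Upsample(scale_factor=2),                         # integer scale_factor, om-friendly
    nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),  # keeps the upsampled spatial size
)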

This article is shared from the Huawei Cloud community article "Pytorch->Caffe Model Conversion", original author: Du Fu built a house.


Copyright notice
This article was created by the Huawei cloud developer community. Please include the original link when reposting:
https://pythonmana.com/2021/02/20210222095921099C.html
