From PyTorch to ONNX to TensorRT


Posted on 12.2, 2020


0. Preface

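    Core topic: how to write ONNX plugins for special layers.
    The aim is to be able to implement any operation or operator you need without getting stuck.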

1. A First Look

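A simple example
  • The torch.onnx module exports a model to the ONNX IR format. The exported model can be loaded back with the ONNX library and then converted into a model that runs on other deep learning frameworks.
  • Official documentation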
  •             import torch
                import torchvision

                dummy_input = torch.randn(10, 3, 224, 224, device='cuda')
                model = torchvision.models.alexnet(pretrained=True).cuda()

                # Providing input and output names sets the display names for values
                # within the model's graph. Setting these does not change the semantics
                # of the graph; it is only for readability.
                input_names = [ "actual_input_1" ] + [ "learned_%d" % i for i in range(16) ]
                output_names = [ "output1" ]

                torch.onnx.export(model, dummy_input, "alexnet.onnx", verbose=True, input_names=input_names, output_names=output_names)
            
  • With verbose=True, the exporter also prints a human-readable description of the exported graph:
    # These are the inputs and parameters to the network, which have taken on
    # the names we specified earlier.
    graph(%actual_input_1 : Float(10, 3, 224, 224)
          %learned_0 : Float(64, 3, 11, 11)
          %learned_1 : Float(64)
          %learned_2 : Float(192, 64, 5, 5)
          %learned_3 : Float(192)
          # ---- omitted for brevity ----
          %learned_14 : Float(1000, 4096)
          %learned_15 : Float(1000)) {
          # Every statement consists of some output tensors (and their types),
          # the operator to be run (with its attributes, e.g., kernels, strides,
          # etc.), its input tensors (%actual_input_1, %learned_0, %learned_1)
          %17 : Float(10, 64, 55, 55) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[11, 11], pads=[2, 2, 2, 2], strides=[4, 4]](%actual_input_1, %learned_0, %learned_1), scope: AlexNet/Sequential[features]/Conv2d[0]
          %18 : Float(10, 64, 55, 55) = onnx::Relu(%17), scope: AlexNet/Sequential[features]/ReLU[1] # output tensor of the first layer (after ReLU); its shape in BCHW order is (10, 64, 55, 55)
          %19 : Float(10, 64, 27, 27) = onnx::MaxPool[kernel_shape=[3, 3], pads=[0, 0, 0, 0], strides=[2, 2]](%18), scope: AlexNet/Sequential[features]/MaxPool2d[2]
          # ---- omitted for brevity ----
          %29 : Float(10, 256, 6, 6) = onnx::MaxPool[kernel_shape=[3, 3], pads=[0, 0, 0, 0], strides=[2, 2]](%28), scope: AlexNet/Sequential[features]/MaxPool2d[12]
          # Dynamic means that the shape is not known. This may be because of a
          # limitation of our implementation (which we would like to fix in a
          # future release) or shapes which are truly dynamic.
          %30 : Dynamic = onnx::Shape(%29), scope: AlexNet
          %31 : Dynamic = onnx::Slice[axes=[0], ends=[1], starts=[0]](%30), scope: AlexNet
          %32 : Long() = onnx::Squeeze[axes=[0]](%31), scope: AlexNet
          %33 : Long() = onnx::Constant[value={9216}](), scope: AlexNet
          # ---- omitted for brevity ----
          %output1 : Float(10, 1000) = onnx::Gemm[alpha=1, beta=1, broadcast=1, transB=1](%45, %learned_14, %learned_15), scope: AlexNet/Sequential[classifier]/Linear[6]
          return (%output1);
        }
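
The shapes in the printed graph can be checked by hand. The sketch below applies the usual output-size formula for convolution and pooling windows (floor rounding, dilation 1) to the attributes shown in the dump. The helper name `window_output_size` is introduced here for illustration only; it is not part of torch.onnx.

```python
def window_output_size(size, kernel, stride, pad):
    """Output length along one spatial axis for a conv or pooling window.

    Assumes symmetric padding, dilation 1, and floor rounding.
    """
    return (size + 2 * pad - kernel) // stride + 1


# %17: Conv with kernel_shape=[11, 11], pads=[2, 2, 2, 2], strides=[4, 4] on a 224x224 input
conv0 = window_output_size(224, kernel=11, stride=4, pad=2)

# %19: MaxPool with kernel_shape=[3, 3], pads=[0, 0, 0, 0], strides=[2, 2]
pool2 = window_output_size(conv0, kernel=3, stride=2, pad=0)

# %29 has shape (10, 256, 6, 6); flattening everything but the batch axis
# gives the constant stored in %33.
flattened = 256 * 6 * 6

print(conv0, pool2, flattened)  # 55 27 9216
```

The three values match `Float(10, 64, 55, 55)`, `Float(10, 64, 27, 27)`, and `onnx::Constant[value={9216}]` in the dump, which is a quick way to confirm that the trace recorded the layers you expected.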
            
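  • Limitations
  •         1. Trace-based
            The ONNX exporter is trace-based: it runs the model once and exports the operators that were actually executed during that run.
            As a result, if your model is dynamic, for example if its behavior changes depending on the input data, the exported result will not be accurate.
            Likewise, a trace may only be valid for one specific input size (one of the reasons explicit inputs are required for tracing). The official documentation recommends inspecting the model's trace to make sure the traced operators look reasonable.
            2. Training/inference mismatch
            Operators in PyTorch and Caffe2 often differ numerically. Depending on the model structure, these differences may be negligible, but they can also cause large differences in behavior (especially for untrained models).
            To help avoid the impact of these differences when accuracy requirements are strict, the documentation notes a plan to let Caffe2 call Torch's operators directly.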
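
To make the trace-based limitation concrete, here is a toy tracer in plain Python. It is only a sketch of the idea, not how torch.onnx is implemented: `TracedValue`, `trace`, `replay`, and `model` are hypothetical names made up for this example. The tracer records the arithmetic that actually ran for one example input, so the `if` branch that was not taken never makes it into the recording.

```python
class TracedValue:
    """Wraps a number and appends every arithmetic op applied to it to a shared tape."""

    def __init__(self, value, tape):
        self.value = value
        self.tape = tape

    def _record(self, op, operand, result):
        self.tape.append((op, operand))
        return TracedValue(result, self.tape)

    def __mul__(self, operand):
        return self._record("mul", operand, self.value * operand)

    def __sub__(self, operand):
        return self._record("sub", operand, self.value - operand)

    def __gt__(self, other):
        # Returns a plain bool: the branch decision itself is not recorded.
        return self.value > other


def model(x):
    # Data-dependent control flow: the branch taken depends on the input value.
    if x > 0:
        return x * 2
    return x - 10


def trace(fn, example_input):
    """Run fn once on example_input and return the list of recorded ops."""
    tape = []
    fn(TracedValue(example_input, tape))
    return tape


def replay(tape, x):
    """Apply the recorded ops to a new input, ignoring the original control flow."""
    ops = {"mul": lambda a, b: a * b, "sub": lambda a, b: a - b}
    for op, operand in tape:
        x = ops[op](x, operand)
    return x


tape = trace(model, 3.0)     # traced with a positive example input
print(tape)                  # [('mul', 2)]
print(model(-1.0))           # -11.0 (the real model takes the other branch)
print(replay(tape, -1.0))    # -2.0  (the recording only knows about "mul")
```

The recording reproduces the model for inputs that follow the same path as the example input and silently diverges for the rest. That is the reason to inspect the exported graph, and to test the exported model on inputs that exercise every branch you care about.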