From PyTorch to ONNX to TensorRT
Posted on 12.2, 2020
0. Preface
Course > Core: how to write ONNX plugins for special layers, so that you can implement any operation or operator you need without running into thorny problems.
1. First Look
A simple example
- The torch.onnx module can export a model to the ONNX IR format. The exported model can be re-imported with the ONNX library and then converted into a model that runs on other deep learning frameworks.
- Official documentation
```python
import torch
import torchvision

dummy_input = torch.randn(10, 3, 224, 224, device='cuda')
model = torchvision.models.alexnet(pretrained=True).cuda()

# Providing input and output names sets the display names for values
# within the model's graph. Setting these does not change the semantics
# of the graph; it is only for readability.
input_names = [ "actual_input_1" ] + [ "learned_%d" % i for i in range(16) ]
output_names = [ "output1" ]

torch.onnx.export(model, dummy_input, "alexnet.onnx", verbose=True,
                  input_names=input_names, output_names=output_names)
```
Because verbose=True was passed, the exporter prints a human-readable representation of the exported graph:

```
# These are the inputs and parameters to the network, which have taken on
# the names we specified earlier.
graph(%actual_input_1 : Float(10, 3, 224, 224)
      %learned_0 : Float(64, 3, 11, 11)
      %learned_1 : Float(64)
      %learned_2 : Float(192, 64, 5, 5)
      %learned_3 : Float(192)
      # ---- omitted for brevity ----
      %learned_14 : Float(1000, 4096)
      %learned_15 : Float(1000)) {
  # Every statement consists of some output tensors (and their types),
  # the operator to be run (with its attributes, e.g., kernels, strides,
  # etc.), its input tensors (%actual_input_1, %learned_0, %learned_1)
  %17 : Float(10, 64, 55, 55) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[11, 11], pads=[2, 2, 2, 2], strides=[4, 4]](%actual_input_1, %learned_0, %learned_1), scope: AlexNet/Sequential[features]/Conv2d[0]
  %18 : Float(10, 64, 55, 55) = onnx::Relu(%17), scope: AlexNet/Sequential[features]/ReLU[1]
  # ^ the output tensor of the first layer, with BCHW shape (10, 64, 55, 55)
  %19 : Float(10, 64, 27, 27) = onnx::MaxPool[kernel_shape=[3, 3], pads=[0, 0, 0, 0], strides=[2, 2]](%18), scope: AlexNet/Sequential[features]/MaxPool2d[2]
  # ---- omitted for brevity ----
  %29 : Float(10, 256, 6, 6) = onnx::MaxPool[kernel_shape=[3, 3], pads=[0, 0, 0, 0], strides=[2, 2]](%28), scope: AlexNet/Sequential[features]/MaxPool2d[12]
  # Dynamic means that the shape is not known. This may be because of a
  # limitation of our implementation (which we would like to fix in a
  # future release) or shapes which are truly dynamic.
  %30 : Dynamic = onnx::Shape(%29), scope: AlexNet
  %31 : Dynamic = onnx::Slice[axes=[0], ends=[1], starts=[0]](%30), scope: AlexNet
  %32 : Long() = onnx::Squeeze[axes=[0]](%31), scope: AlexNet
  %33 : Long() = onnx::Constant[value={9216}](), scope: AlexNet
  # ---- omitted for brevity ----
  %output1 : Float(10, 1000) = onnx::Gemm[alpha=1, beta=1, broadcast=1, transB=1](%45, %learned_14, %learned_15), scope: AlexNet/Sequential[classifier]/Linear[6]
  return (%output1);
}
```
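The exported file can then be re-imported and executed outside PyTorch. Below is a minimal sketch, assuming the onnx and onnxruntime packages are installed; the file name "alexnet.onnx" and the input name "actual_input_1" come from the export above.

```python
import numpy as np
import onnx
import onnxruntime

# Re-import the exported model and verify that it is a well-formed ONNX graph.
model = onnx.load("alexnet.onnx")
onnx.checker.check_model(model)

# Print a human-readable summary, similar to the verbose graph above.
print(onnx.helper.printable_graph(model.graph))

# Run the model with ONNX Runtime; the input name matches the one chosen
# during export ("actual_input_1").
session = onnxruntime.InferenceSession("alexnet.onnx",
                                       providers=["CPUExecutionProvider"])
dummy = np.random.randn(10, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {"actual_input_1": dummy})
print(outputs[0].shape)  # (10, 1000)
```

Running the checker first is cheap insurance: it catches structurally invalid graphs before they reach a downstream runtime such as TensorRT.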
- Limitations
1. Trace-based. The ONNX exporter is trace-based: it runs the model once and exports the operators that actually took part in that run. This means that if your model is dynamic, e.g. its behavior changes depending on the input data, the exported graph will not be accurate (see the sketch after this list). Likewise, a trace may only be valid for one specific input size (this is one reason the trace requires an explicit example input). We recommend inspecting the model trace and making sure the traced operators look reasonable.
2. Training/inference mismatch. Some operators in PyTorch and Caffe2 often differ numerically. Depending on the model structure these differences may be small, but they can lead to large differences in behavior (especially for untrained models). To help you easily avoid their impact in accuracy-critical situations, the plan is to let Caffe2 call Torch operators directly.
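To make the first limitation concrete, here is a minimal sketch (the module and file name are invented for illustration) in which a Python-level branch depends on the input data; tracing records only the branch taken for the example input:

```python
import torch

class DataDependent(torch.nn.Module):
    def forward(self, x):
        # This Python `if` is evaluated once while tracing; only the branch
        # taken for the example input is baked into the exported graph.
        if x.sum() > 0:
            return x * 2
        return x - 1

model = DataDependent()
example = torch.ones(2, 3)  # sum > 0, so only the `x * 2` branch is traced

# The exporter emits a TracerWarning: the data-dependent condition is
# converted to a constant in the exported graph.
torch.onnx.export(model, example, "data_dependent.onnx")

# The exported graph always computes `x * 2`, so for inputs whose sum is <= 0
# its output diverges from eager PyTorch.
```

Exporting with verbose=True and reading the printed graph is an easy way to spot branches that were silently dropped during tracing.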