TensorRT API Pitfalls

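This post covers how to use the TensorRT API, in particular converting models from TensorFlow, and the pitfalls encountered with nvinfer1::INetworkDefinition, including addInput, addReduce, and addShuffle. It also discusses dynamic reshape, ways to implement LSTM, and memory management and optimization strategies such as optimizing matmul.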


TensorRT Links

Official API reference: https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/


TensorRT Tools

trtexec bundles the parsers TensorRT uses to read third-party model formats.

Working with TensorFlow

  1. Convert pb to uff
  • Environment
    cd python
    pip install tensorrt-xxxxx.whl
    cd ../uff
    pip install uff-xxxxx.whl
    cd ../graphsurgeon
    pip install graphsurgeon-xxxxx.whl

  • Command
    convert-to-uff xxxx.pb

  2. Convert pb to onnx (an alternative to the uff route)
    python -m tf2onnx.convert --graphdef xxxxx.pb --output xxxxx.onnx --inputs input1:0,input2:0 --outputs output1:0,output2:0

  3. Run trtexec
    trtexec --uff=xxxx.uff --output=xxxx,xxxx --uffInput=input1,C,H,W --uffInput=input2,C,H,W --batch=N
    trtexec --onnx=xxxx.onnx --explicitBatch


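API Pitfalls

nvinfer1::INetworkDefinition

The documentation for the various add* layer methods is, frankly, hard to follow.

There are two kinds of network:

  1. Networks with an implicit batch dimension (for example, HWC)
  2. Networks with explicit dimensions, i.e. full dims (for example, NHWC)

The difference shows up clearly when calling addInput.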

addInput

Official documentation comment:

For networks with an implicit batch dimension, this volume includes the batch dimension with its length set to the maximum batch size. For networks with all explicit dimensions and with wildcard dimensions, the volume is based on the maxima specified by an IOptimizationProfile. Dimensions are normally non-negative integers. The exception is that in networks with all explicit dimensions, -1 can be used as a wildcard for a dimension to be specified at runtime. Input tensors with such a wildcard must have a corresponding entry in the IOptimizationProfiles indicating the permitted extrema, and the input dimensions must be set by IExecutionContext::setBindingDimensions. Different IExecutionContext instances can have different dimensions. Wildcard dimensions are only supported for EngineCapability::kSTANDARD. They are not supported in safety contexts. DLA does not support Wildcard dimensions.
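To make the "volume" in that comment concrete, here is a minimal sketch in plain C++ with no TensorRT headers. The helper names (tensorVolume, implicitBatchVolume, explicitVolume) are hypothetical and only illustrate the arithmetic the comment describes: with an implicit batch dimension the maximum batch size is multiplied in, and with explicit dimensions each -1 wildcard is replaced by the profile maximum first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not TensorRT API: number of elements in a tensor
// whose dimensions are all known and non-negative.
std::int64_t tensorVolume(const std::vector<std::int64_t>& dims) {
    std::int64_t volume = 1;
    for (std::int64_t d : dims) {
        assert(d >= 0 && "replace -1 wildcards with profile maxima first");
        volume *= d;
    }
    return volume;
}

// Implicit batch: the network input is declared as CHW only, and the batch
// dimension is counted with its length set to the maximum batch size.
std::int64_t implicitBatchVolume(const std::vector<std::int64_t>& chw,
                                 std::int64_t maxBatchSize) {
    return maxBatchSize * tensorVolume(chw);
}

// Explicit dimensions: every -1 wildcard is replaced by the corresponding
// maximum from the optimization profile before taking the product.
std::int64_t explicitVolume(const std::vector<std::int64_t>& dims,
                            const std::vector<std::int64_t>& profileMax) {
    assert(dims.size() == profileMax.size());
    std::vector<std::int64_t> resolved(dims.size());
    for (std::size_t i = 0; i < dims.size(); ++i) {
        resolved[i] = (dims[i] == -1) ? profileMax[i] : dims[i];
    }
    return tensorVolume(resolved);
}
```

For example, a CHW input of 3x224x224 with a maximum batch size of 8 counts 8 * 3 * 224 * 224 elements, and an explicit input of {1, 1, -1, -1} with profile maxima {1, 1, 56, 56} counts 56 * 56 elements.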

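Taking an NCHW input as an example:

  1. Network with an implicit batch dimension
    Normally you pass only CHW to addInput and set the batch size at execute time.
    If NCHW is passed instead, N serves as the maximum batch size. Presumably the batch size set at execute time then takes precedence? And what about the memory allocated for the tensors connecting the layers inside the network: is it sized for the maximum?

  2. Network with explicit dimensions
    Normally you set non-negative NCHW values, but an input dimension may be left unknown (written as -1). If the input dimensions contain -1, the build needs an IOptimizationProfile that gives the permitted range of each -1 dimension, and the actual values are fixed before execute through IExecutionContext::setBindingDimensions.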

// HW is -1 wildcard
auto input = preprocessorNetwork->addInput("input", nvinfer1::DataType::kFLOAT, Dims4{1, 1, -1, -1});

// Create an optimization profile so that we can specify a range of input dimensions.
nvinfer1::IOptimizationProfile* profile = builder->createOptimizationProfile();
// This profile will be valid for all images whose size falls in the range of [(1, 1, 1, 1), (1, 1, 56, 56)]
// but TensorRT will optimize for (1, 1, 28, 28)
// We do not need to check the return of setDimension and addOptimizationProfile here as all dims are explicitly set
profile->setDimensions(input->getName(), OptProfileSelector::kMIN, Dims4{1, 1, 1, 1});
profile->setDimensions(input->getName(), OptProfileSelector::kOPT, Dims4{1, 1, 28, 28});
profile->setDimensions(input->getName(), OptProfileSelector::kMAX, Dims4{1, 1, 56, 56});
preprocessorConfig->addOptimizationProfile(profile);

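// Set the input size for the preprocessor. setBindingDimensions returns false
// when the dimensions are invalid (this fragment assumes the enclosing function returns bool).
if (!mPreprocessorContext->setBindingDimensions(0, inputDims))
{
    return false; // Invalid binding dimensions.
}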

// We can only run inference once all dynamic input shapes have been specified.
bool ret = mPreprocessorContext->allInputDimensionsSpecified();
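
The relationship between the profile and the dimensions set at runtime can be sketched without TensorRT. The snippet below is only an illustration of the idea that runtime dimensions must fall inside the [kMIN, kMAX] range of the profile; ProfileRange and dimsWithinProfile are hypothetical names, and this is not a description of how TensorRT validates dimensions internally.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one input's optimization profile
// (the kMIN / kOPT / kMAX dimensions from the snippet above).
struct ProfileRange {
    std::vector<int> min;
    std::vector<int> opt;
    std::vector<int> max;
};

// Returns true when the rank matches and every runtime dimension lies
// inside the closed range [min, max] of the profile.
bool dimsWithinProfile(const ProfileRange& profile, const std::vector<int>& dims) {
    if (dims.size() != profile.min.size() || dims.size() != profile.max.size()) {
        return false;
    }
    for (std::size_t i = 0; i < dims.size(); ++i) {
        if (dims[i] < profile.min[i] || dims[i] > profile.max[i]) {
            return false;
        }
    }
    return true;
}
```

With the profile used above, {1, 1, 28, 28} and {1, 1, 56, 56} are inside the range, while {1, 1, 64, 64} or {2, 1, 28, 28} are not, so a call that sets such dimensions should be expected to fail.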

addReduce

Header and documentation comment:

//! \param input The input tensor to the layer.
//! \param operation The reduction operation to perform.
//! \param reduceAxes The reduction dimensions.
//!        The bit in position i of bitmask reduceAxes corresponds to explicit dimension i if result.
//!        E.g., the least significant bit corresponds to the first explicit dimension and the next to least significant bit corresponds to the second explicit dimension.
//!
//! \param keepDimensions The boolean that specifies whether or not to keep the reduced dimensions in the output of the layer.

IReduceLayer* addReduce(ITensor& input, ReduceOperation operation, uint32_t reduceAxes, bool keepDimensions);

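A reduction operator: it reduces along the axes selected by reduceAxes. The reduction can be any ReduceOperation: kSUM, kPROD, kMAX, kMIN, or kAVG.

reduceAxes, translated literally: bit i of the bitmask corresponds to dimension i (the "if result" in the header comment is presumably a typo for "of the result"). The least significant bit corresponds to the first dimension, and the next least significant bit to the second dimension.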

reduceAxis |= 1u << axis_data;

axis index   binary   decimal
3            1000     8
2            0100     4
1            0010     2
0            0001     1
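
The table generalizes to several axes at once by OR-ing the bits together. A minimal sketch, assuming a hypothetical helper named makeReduceAxes (not part of TensorRT) that builds the value to pass as reduceAxes:

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Hypothetical helper: builds a reduceAxes-style bitmask from axis indices.
// Bit i is set if and only if dimension i should be reduced.
std::uint32_t makeReduceAxes(std::initializer_list<int> axes) {
    std::uint32_t mask = 0;
    for (int axis : axes) {
        assert(axis >= 0 && axis < 32);  // the mask has only 32 bits
        mask |= 1u << axis;
    }
    return mask;
}
```

For an explicit-dimension NCHW tensor, reducing over H and W would be makeReduceAxes({2, 3}), which is binary 1100, decimal 12. The comment speaks of "explicit" dimensions, so in an implicit-batch network the batch dimension is presumably not counted and bit 0 would refer to C.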

addShuffle

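This layer covers many operators that change dimensions, such as reshape, flatten, squeeze, unsqueeze, and transpose.
When the target dimensions are fixed constants, set them directly with setReshapeDimensions.
A constant perm for transpose is set with setFirstTranspose.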
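
Since squeeze, unsqueeze, and flatten all reduce to "reshape to some constant dimensions", the only real work is computing those dimensions. The sketch below does this in plain C++; Shape, squeezeDims, unsqueezeDims, and flattenDims are hypothetical helpers, and the result is what one would then copy into the dimensions passed to setReshapeDimensions. Note that flattenDims here keeps the leading axes unchanged and merges the rest, which is an assumption of this sketch and differs from flatten definitions that always produce a 2-D result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<int>;

// squeeze: drop one axis whose extent is 1.
Shape squeezeDims(const Shape& in, std::size_t axis) {
    assert(axis < in.size() && in[axis] == 1);
    Shape out(in);
    out.erase(out.begin() + static_cast<std::ptrdiff_t>(axis));
    return out;
}

// unsqueeze: insert an axis of extent 1 at the given position.
Shape unsqueezeDims(const Shape& in, std::size_t axis) {
    assert(axis <= in.size());
    Shape out(in);
    out.insert(out.begin() + static_cast<std::ptrdiff_t>(axis), 1);
    return out;
}

// flatten (as defined in this sketch): keep axes [0, axis) unchanged and
// merge axes [axis, rank) into a single trailing axis.
Shape flattenDims(const Shape& in, std::size_t axis) {
    assert(axis <= in.size());
    Shape out(in.begin(), in.begin() + static_cast<std::ptrdiff_t>(axis));
    int rest = 1;
    for (std::size_t i = axis; i < in.size(); ++i) {
        rest *= in[i];
    }
    out.push_back(rest);
    return out;
}
```

For example, squeezing axis 2 of {1, 3, 1, 5} gives {1, 3, 5}, and flattening {2, 3, 4, 5} from axis 1 gives {2, 60}.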

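Dynamic reshape operator

nvinfer1::ITensor comes in two kinds: shape tensors and execution tensors. A shape tensor carries shape information; the output of a shape operator is a shape tensor. An execution tensor is one that takes part in the actual computation. In general, the input and output tensors of a network should be execution tensors.

If the shape of a reshape operator is constant, set it directly with setReshapeDimensions.
If the shape is a variable, the tensor holding it is a shape tensor.
nvinfer1::IShuffleLayer is static by default; setInput(0, xxxx) updates the tensor to be reshaped.
When the tensor passed to setInput(1, xxxx) is a shape tensor, nvinfer1::IShuffleLayer becomes dynamic and can compute the reshape at runtime.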
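
What "computing the reshape at runtime" amounts to can be sketched in plain C++. The helper below, resolveReshape, is hypothetical and not TensorRT API: it takes the runtime input dimensions and the values a shape tensor might carry, and resolves a single -1 entry so that the element count is preserved. It assumes the usual convention that at most one -1 is allowed, and it returns an empty vector to signal an invalid request.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: resolves a reshape target known only at runtime.
// `target` plays the role of the shape tensor's values; at most one entry
// may be -1, which is inferred from the input volume.
// Returns an empty vector when the request is invalid (sketch convention).
std::vector<std::int64_t> resolveReshape(const std::vector<std::int64_t>& inputDims,
                                         std::vector<std::int64_t> target) {
    std::int64_t inputVolume = 1;
    for (std::int64_t d : inputDims) {
        inputVolume *= d;
    }

    std::int64_t known = 1;   // product of the non-wildcard entries
    int wildcard = -1;        // index of the -1 entry, if any
    for (std::size_t i = 0; i < target.size(); ++i) {
        if (target[i] == -1) {
            if (wildcard != -1) return {};  // more than one -1
            wildcard = static_cast<int>(i);
        } else if (target[i] < 0) {
            return {};                      // negative extent other than -1
        } else {
            known *= target[i];
        }
    }

    if (wildcard != -1) {
        if (known == 0 || inputVolume % known != 0) return {};
        target[static_cast<std::size_t>(wildcard)] = inputVolume / known;
        return target;
    }
    if (known != inputVolume) return {};    // element count must be preserved
    return target;
}
```

For an input of {1, 1, 28, 28} and a runtime target of {1, -1}, the resolved shape is {1, 784}; a target such as {5, -1} for an input of {2, 3, 4} is rejected because 24 is not divisible by 5.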

addPluginV2

Operators that TensorRT does not support can be implemented yourself as a plugin.
Header template:

class EqualPluginCreater : public nvinfer1::IPluginCreator {
 public:
  EqualPluginCreater();

  const char *getPluginName() const noexcept override;

  const char *getPluginVersion() const noexcept override;

  const nvinfer1::PluginFieldCollection *getFieldNames() noexcept override;

  nvinfer1::IPluginV2 *createPlugin(const char *name, const nvinfer1::PluginFieldCollection *fc) noexcept override;

  nvinfer1::IPluginV2 *deserializePlugin(const char *name, const void *serialData,
                                         size_t serialLength) noexcept override;

  void setPluginNamespace(const char *pluginNamespace) noexcept override;

  const char *getPluginNamespace() const noexcept override;

 private:
  static nvinfer1::PluginFieldCollection field_collection_;
  static std::vector<nvinfer1::PluginField> fields_;
  std::string name_space_;
};

class EqualPlugin : public nvinfer1::IPluginV2DynamicExt {
  // Derive from IPluginV2DynamicExt to support dynamic input shapes
 public:
  explicit EqualPlugin(const std::string name) : layer_name_(name) {}

  // It doesn't make sense to make EqualPlugin without arguments, so we delete
  // default constructor.
  EqualPlugin() = delete;

  // IPluginV2DynamicExt Methods
  nvinfer1::IPluginV2DynamicExt *clone() const noexcept override;
  // Called while building the network; returns the dimensions of the output tensor
  nvinfer1::DimsExprs getOutputDimensions(int outputIndex, const nvinfer1::DimsExprs *inputs, int nbInputs,
                                          nvinf