References
1. Deploying CUDA-PointPillars on NVIDIA AGX Xavier
2. Setting up the OpenPCDet environment on NVIDIA Jetson AGX Orin to deploy PointPillars
Xavier setup

Setting up the OpenPCDet environment
Installing Arm Anaconda and creating a virtual environment
Installing Arm Anaconda (Archiconda)
wget https://github.com/Archiconda/build-tools/releases/download/0.2.3/Archiconda3-0.2.3-Linux-aarch64.sh
sudo bash Archiconda3-0.2.3-Linux-aarch64.sh
Creating the virtual environment
conda create -n openpcdet python=3.6
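On a Xavier running JetPack 4.5, the prebuilt PyTorch wheels only target Python 3.6, so the environment has to use Python 3.6. On an Orin running JetPack 5.0 a newer Python can be used, but I do not recommend installing a very recent PyTorch version.
Downloading and verifying PyTorch
Download reference: PyTorch for Jetson
First download the PyTorch v1.10.0 wheel (.whl) file. I started with v1.8.0, but exporting the ONNX model at the very end failed with an error from torch's ONNX module; switching to v1.10.0 fixed it.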
conda activate openpcdet
python -m pip install torch-1.10.0-cp36-cp36m-linux_aarch64.whl
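Because the wheel is tied to one Python version, it can help to confirm that the wheel's Python tag matches the interpreter of the active environment before installing. The following is a minimal sketch, not part of the original setup; the helper names are hypothetical, and it only relies on the standard wheel naming scheme (name-version-pythontag-abitag-platform).

```python
import sys


def wheel_python_tag(wheel_filename):
    """Return the Python tag (for example 'cp36') from a wheel file name."""
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[: -len(".whl")]
    # Wheel names end with: pythontag-abitag-platform
    return stem.split("-")[-3]


def interpreter_tag(version_info=sys.version_info):
    """Return the CPython tag of an interpreter, e.g. 'cp36' for Python 3.6."""
    return "cp{}{}".format(version_info[0], version_info[1])


def wheel_matches_interpreter(wheel_filename, version_info=sys.version_info):
    """True if the wheel was built for the given interpreter version."""
    return wheel_python_tag(wheel_filename) == interpreter_tag(version_info)


if __name__ == "__main__":
    wheel = "torch-1.10.0-cp36-cp36m-linux_aarch64.whl"
    print(wheel, "matches this interpreter:", wheel_matches_interpreter(wheel))
```

Run inside the activated openpcdet environment, this should report a match for the cp36 wheel used here.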
Verifying torch
$ python
Python 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 19:12:04)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
1.10.0
>>> print('CUDA available: ' + str(torch.cuda.is_available()))
CUDA available: True
>>> print('cuDNN version: ' + str(torch.backends.cudnn.version()))
cuDNN version: 8000
>>> a = torch.cuda.FloatTensor(2).zero_()
>>> print('Tensor a = ' + str(a))
Tensor a = tensor([0., 0.], device='cuda:0')
>>> b = torch.randn(2).cuda()
>>> print('Tensor b = ' + str(b))
Tensor b = tensor([ 1.4377, -0.4534], device='cuda:0')
>>> c = a + b
>>> print('Tensor c = ' + str(c))
Tensor c = tensor([ 1.4377, -0.4534], device='cuda:0')
If verifying torch fails with "Illegal instruction (core dumped)", change the numpy version:
python -m pip install numpy==1.19.3
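The interactive checks above can also be collected into a small script so they are easy to rerun after changing a package. This is only a sketch: the function name and the dictionary keys are made up for illustration, and it uses `torch.zeros(..., device="cuda")` rather than the older `torch.cuda.FloatTensor` constructor.

```python
def torch_report():
    """Collect the facts checked in the interactive session into a dict."""
    try:
        import torch
    except ImportError as exc:
        # torch is not installed in this environment
        return {"installed": False, "error": str(exc)}

    report = {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "cudnn_version": torch.backends.cudnn.version(),
    }
    if report["cuda_available"]:
        # Same arithmetic as the session above: zeros + random == random
        a = torch.zeros(2, device="cuda")
        b = torch.randn(2, device="cuda")
        report["gpu_add_ok"] = bool(torch.equal(a + b, b))
    return report


if __name__ == "__main__":
    for key, value in torch_report().items():
        print("{}: {}".format(key, value))
```

Note that an "Illegal instruction" crash kills the interpreter outright, so it cannot be caught by the `try` block; in that case the script simply dies in the same way as the interactive session.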
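Downloading and verifying torchvision
Download: [torchvision download site](https://gitcode.net/mirrors/pytorch/vision/-/tree/main)
torch 1.10.0 pairs with torchvision 0.11.1. Download the zip archive and run the commands below. The transcript was recorded with the 0.9.0 archive; for 0.11.1, substitute that version in the archive name, the directory name and BUILD_VERSION.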
$ unzip vision-v0.9.0.zip
$ cd vision-v0.9.0/
$ ls
android CODE_OF_CONDUCT.md examples MANIFEST.in README.rst setup.py tox.ini
cmake CONTRIBUTING.md hubconf.py mypy.ini references test version.txt
CMakeLists.txt docs LICENSE packaging setup.cfg torchvision
$ export BUILD_VERSION=0.9.0
$ python setup.py install --user
...
Using /home/nvidia/archiconda3/envs/OpenPCDet_torch18/lib/python3.6/site-packages
Finished processing dependencies for torchvision==0.9.0
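Since the transcript uses 0.9.0 while this setup needs 0.11.1, the names to substitute can be derived from the torch version. The sketch below is illustrative only: the table contains just the two pairings mentioned in this article, the `vision-v<version>.zip` pattern is copied from the transcript, and the function name is hypothetical.

```python
# torch -> torchvision pairings mentioned in this article (Jetson builds)
TORCHVISION_FOR_TORCH = {
    "1.8.0": "0.9.0",
    "1.10.0": "0.11.1",
}


def torchvision_build_plan(torch_version):
    """Return the archive name, source directory and BUILD_VERSION to use."""
    tv = TORCHVISION_FOR_TORCH[torch_version]  # KeyError for unknown versions
    return {
        "archive": "vision-v{}.zip".format(tv),
        "source_dir": "vision-v{}/".format(tv),
        "BUILD_VERSION": tv,
    }


if __name__ == "__main__":
    plan = torchvision_build_plan("1.10.0")
    print("unzip " + plan["archive"])
    print("cd " + plan["source_dir"])
    print("export BUILD_VERSION=" + plan["BUILD_VERSION"])
```

For other torch versions the pairing should be looked up in the PyTorch for Jetson thread rather than guessed.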
Verifying torchvision
$ python
Python 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 19:12:04)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torchvision
>>> print(torchvision.__version__)
0.11.1
If verification reports an error, run the following command:
python -m pip install 'pillow<7'
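I saw no error when installing 0.11.1, so I did not need to change the pillow version.
Installing cumm and spconv
cumm download site
spconv download site
On the NVIDIA Jetson series, spconv has to be compiled from source, and spconv depends on cumm, so cumm must be installed first.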
$ export CUMM_CUDA_VERSION="10.2"
$ export CUMM_DISABLE_JIT="1"
$ export CUMM_CUDA_ARCH_LIST="7.2"
$ unzip cumm-0.2.9.zip
$ cd cumm-0.2.9/
$ ls
$ python setup.py bdist_wheel
$ pip install dist/cumm_cu102-0.2.9-cp36-cp36m-linux_aarch64.whl
$ cd ..
$ ls
$ cd spconv-2.1.25/
$ ls
$ python setup.py bdist_wheel
$ pip install dist/spconv_cu102-2.1.25-cp36-cp36m-linux_aarch64.whl
The build may fail along the way, usually because a build dependency is missing. Search for the error message and install the missing library.
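The three `CUMM_*` variables exported before the build depend on the board's CUDA version and GPU compute capability. As a rough sketch (the helper names are hypothetical, and the values for any board other than the Xavier used here are yours to verify), they can be generated like this:

```python
def cumm_build_env(cuda_version, compute_capability):
    """Build the environment variables used for the cumm/spconv source build."""
    return {
        "CUMM_CUDA_VERSION": cuda_version,
        "CUMM_DISABLE_JIT": "1",
        "CUMM_CUDA_ARCH_LIST": compute_capability,
    }


def wheel_suffix(cuda_version):
    """Suffix that appears in the wheel names, e.g. 'cu102' for CUDA 10.2."""
    return "cu" + cuda_version.replace(".", "")


def as_export_lines(env):
    """Render the variables as shell export statements."""
    return ['export {}="{}"'.format(key, value) for key, value in env.items()]


if __name__ == "__main__":
    # Xavier with JetPack 4.5: CUDA 10.2, compute capability 7.2 (from this article).
    # Assumption: an Orin would use compute capability 8.7 and the CUDA
    # version reported by `nvcc --version` on that board.
    env = cumm_build_env("10.2", "7.2")
    print("\n".join(as_export_lines(env)))
    print("expected wheel suffix:", wheel_suffix("10.2"))
```

The suffix explains why the wheels built above are named `cumm_cu102` and `spconv_cu102`.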
Verifying cumm and spconv
$ python
Python 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 19:12:04)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cumm
>>> print(cumm.__version__)
0.2.9
>>> import spconv
>>> print(spconv.__version__)
2.1.25
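Installing LLVM and llvmlite
Compatibility between llvmlite and LLVM versions
OpenPCDet/requirements.txt requires llvmlite, and llvmlite depends on LLVM.
Download the prebuilt aarch64 LLVM release from llvm/llvm-project, extract it, and add it to the environment variables.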
$ wget https://github.com/llvm/llvm-project/releases/download/llvmorg-10.0.1/clang+llvm-10.0.1-aarch64-linux-gnu.tar.xz
$ tar -xvJf clang+llvm-10.0.1-aarch64-linux-gnu.tar.xz
$ gedit ~/.bashrc
Change the paths below to your own:
export PATH=$PATH:/home/nvidia/Downloads/clang+llvm-10.0.1-aarch64-linux-gnu/bin # your path to llvm
export LLVM_CONFIG=/home/nvidia/Downloads/clang+llvm-10.0.1-aarch64-linux-gnu/bin/llvm-config # your path to llvm-config
$ source ~/.bashrc
$ sudo apt-get install libedit-dev
$ sudo ldconfig
python -m pip install llvmlite==0.36.0 -i https://mirror.baidu.com/pypi/simple
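If the llvmlite install fails, one thing worth checking is whether `LLVM_CONFIG` really points at an existing `llvm-config` of the right major version; a typo in the path in `~/.bashrc` is easy to make. The sketch below is a hypothetical helper, not part of llvmlite. It assumes llvmlite 0.36.x expects LLVM 10.0.x, which is consistent with the 10.0.1 release downloaded above.

```python
import os
import re
import subprocess

REQUIRED_LLVM_MAJOR = 10  # assumption: llvmlite 0.36.x builds against LLVM 10.0.x


def llvm_major(version_text):
    """Extract the major version from text such as '10.0.1'."""
    match = re.match(r"\s*(\d+)\.", version_text)
    if match is None:
        raise ValueError("unrecognised LLVM version: {!r}".format(version_text))
    return int(match.group(1))


def check_llvm_config(env=os.environ):
    """Return a short status message about the LLVM_CONFIG setting."""
    path = env.get("LLVM_CONFIG")
    if not path:
        return "LLVM_CONFIG is not set"
    if not os.path.isfile(path):
        return "LLVM_CONFIG points to a missing file: {}".format(path)
    # llvm-config prints its version with the --version flag
    version = subprocess.check_output([path, "--version"]).decode().strip()
    if llvm_major(version) != REQUIRED_LLVM_MAJOR:
        return "LLVM {} found, but LLVM {}.x is expected".format(
            version, REQUIRED_LLVM_MAJOR
        )
    return "ok: LLVM {}".format(version)


if __name__ == "__main__":
    print(check_llvm_config())
```

Run it in a fresh shell after `source ~/.bashrc`, so it sees the same variables pip will see.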
Verification
$ python
Python 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 19:12:04)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import llvmlite
>>> print(llvmlite.__version__)
0.36.0
Or
$ pip show llvmlite
Name: llvmlite
Version: 0.36.0
Summary: lightweight wrapper around basic LLVM functionality
Home-page: http://llvmlite.pydata.org
Author: Continuum Analytics, Inc.
Author-email: numba-users@continuum.io
License: BSD
Location: /home/nvidia/archiconda3/envs/OpenPCDet_torch18/lib/python3.6/site-packages
Requires:
Required-by:
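Installing OpenPCDet
Installing OpenPCDet cost me a lot of time. Do not simply clone OpenPCDet from git, because that fetches the latest version, which adds an argo2 dataset under OpenPCDet/pcdet/datasets/. Reading that dataset requires the av2 library, and av2 does not support Python 3.6 (Python 3.8 works). So use OpenPCDet v0.5.0 instead.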
$ git clone --branch v0.5.0 https://github.com/open-mmlab/OpenPCDet.git
$ cd OpenPCDet
$ gedit requirements.txt
Install every package listed in requirements.txt. Once all dependencies are installed, run:
python setup.py develop
Verifying pcdet:
$ pip show pcdet
Name: pcdet
Version: 0.5.0+0
Summary: OpenPCDet is a general codebase for 3D object detection from point cloud
Home-page: UNKNOWN
Author: Shaoshuai Shi
Author-email: shaoshuaics@gmail.com
License: Apache License 2.0
Location: /home/nvidia/torch_xavier/OpenPCDet
Requires: easydict, llvmlite, numba, numpy, pyyaml, scikit-image, SharedArray, tensorboardX, tqdm
Required-by:
At this point the environment for deploying CUDA-PointPillars is ready.
Deploying CUDA-PointPillars
Setting up ONNX
NVIDIA-AI-IOT/CUDA-PointPillars
$ git clone https://github.com/NVIDIA-AI-IOT/CUDA-PointPillars.git && cd CUDA-PointPillars
Exporting the PointPillars ONNX model: converting the .pth checkpoint to .onnx requires the onnx Python packages.
$ python -m pip install pyyaml scikit-image onnx onnx-simplifier
pyyaml and scikit-image were already installed above.
Install onnx version 1.8.1:
$ python -m pip install onnx==1.8.1 -i https://mirror.baidu.com/pypi/simple
$ python -m pip install onnx-simplifier==0.3
Error: the CMake version is too old
CMake Error at CMakeLists.txt:1 (cmake_minimum_required):
CMake 3.22 or higher is required. You are running version 3.20.1
$ wget https://github.com/Kitware/CMake/releases/download/v3.22.6/cmake-3.22.6.zip
$ unzip cmake-3.22.6.zip
$ cd CMake-3.22.6/
$ ls
$ ./configure
$ make -j6
$ sudo make install
$ cmake --version
cmake version 3.22.6
CMake suite maintained and supported by Kitware (kitware.com/cmake).
Reinstalling onnx-simplifier
$ python -m pip install onnx-simplifier==0.2.16
$ pip install onnx_graphsurgeon --index-url https://pypi.ngc.nvidia.com
See also: fixing the "WARNING: Ignoring invalid distribution -xpython" error
$ cd /home/nvidia/archiconda3/envs/OpenPCDet_torch18/lib/python3.6/site-packages
$ rm -rf '~nnx-1.11.0.dist-info'
$ rm -rf '~nnx'
Installing onnxruntime
python -m pip install onnxruntime==1.10.0 -i https://mirror.baidu.com/pypi/simple
Finally
$ pip list | grep onnx
onnx 1.8.1
onnx-graphsurgeon 0.3.26
onnx-simplifier 0.2.16
onnxruntime 1.10.0
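The same check as `pip list | grep onnx` can be scripted so that a wrong or missing version is flagged automatically. This is a minimal sketch with a hypothetical helper name; it uses `importlib.metadata`, falling back to the `importlib_metadata` backport on Python 3.6.

```python
try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    from importlib_metadata import version, PackageNotFoundError  # backport

# Versions this article ended up with on the Xavier
EXPECTED = {
    "onnx": "1.8.1",
    "onnx-graphsurgeon": "0.3.26",
    "onnx-simplifier": "0.2.16",
    "onnxruntime": "1.10.0",
}


def find_mismatches(expected):
    """Return {package: installed_version_or_None} for every mismatch."""
    problems = {}
    for name, wanted in expected.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[name] = found
    return problems


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED)
    if not problems:
        print("all onnx packages match")
    for name, found in problems.items():
        print("{}: expected {}, found {}".format(name, EXPECTED[name], found))
```

An empty result means the environment matches the list above.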
Conversion
python exporter.py --ckpt ./pointpillar_7728.pth
mv pointpillar.onnx ../model/ && mv params.h ../include/
Deployment
mkdir build && cd build
cmake .. && make -j$(nproc)
./demo
make fails with:
/home/nvidia/Downloads/CUDA-PointPillars-main/src/pointpillar.cpp: In destructor ‘TRT::~TRT()’:
/home/nvidia/Downloads/CUDA-PointPillars-main/src/pointpillar.cpp:30:18: error: ‘virtual nvinfer1::IExecutionContext::~IExecutionContext()’ is protected within this context
delete(context_);
^
In file included from /usr/include/aarch64-linux-gnu/NvInfer.h:53:0,
from /home/nvidia/Downloads/CUDA-PointPillars-main/include/pointpillar.h:20,
from /home/nvidia/Downloads/CUDA-PointPillars-main/src/pointpillar.cpp:18:
/usr/include/aarch64-linux-gnu/NvInferRuntime.h:1626:13: note: declared protected here
virtual ~IExecutionContext() noexcept {}
^
/home/nvidia/Downloads/CUDA-PointPillars-main/src/pointpillar.cpp:31:17: error: ‘virtual nvinfer1::ICudaEngine::~ICudaEngine()’ is protected within this context
delete(engine_);
^
In file included from /usr/include/aarch64-linux-gnu/NvInfer.h:53:0,
from /home/nvidia/Downloads/CUDA-PointPillars-main/include/pointpillar.h:20,
from /home/nvidia/Downloads/CUDA-PointPillars-main/src/pointpillar.cpp:18:
/usr/include/aarch64-linux-gnu/NvInferRuntime.h:1297:13: note: declared protected here
virtual ~ICudaEngine() {}
^
CMakeFiles/demo.dir/build.make:124: recipe for target 'CMakeFiles/demo.dir/src/pointpillar.cpp.o' failed
make[2]: *** [CMakeFiles/demo.dir/src/pointpillar.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
CMakeFiles/Makefile2:82: recipe for target 'CMakeFiles/demo.dir/all' failed
make[1]: *** [CMakeFiles/demo.dir/all] Error 2
Makefile:90: recipe for target 'all' failed
make: *** [all] Error 2
Change lines 30 and 31 of src/pointpillar.cpp to:
context_->destroy();
engine_->destroy();
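If the edit has to be repeated on several checkouts, the same two-line change can be applied with a small script. This is just a sketch with a hypothetical function name; it rewrites only `delete(context_);` and `delete(engine_);` and leaves every other statement alone.

```python
import re

# Matches delete(context_); and delete(engine_); with optional whitespace
_DELETE_CALL = re.compile(r"delete\s*\(\s*(context_|engine_)\s*\)\s*;")


def patch_trt_deletes(source):
    """Replace the two delete(...) calls with ->destroy() calls."""
    return _DELETE_CALL.sub(
        lambda match: "{}->destroy();".format(match.group(1)), source
    )


if __name__ == "__main__":
    path = "src/pointpillar.cpp"  # run from the CUDA-PointPillars root
    with open(path) as handle:
        original = handle.read()
    patched = patch_trt_deletes(original)
    if patched != original:
        with open(path, "w") as handle:
            handle.write(patched)
        print("patched", path)
    else:
        print("nothing to patch in", path)
```

Keeping the original file under git makes it easy to revert the change when moving to a newer TensorRT.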
The build then succeeded, but running ./demo at the end failed with:
trt_infer: ../builder/cudnnBuilder2.cpp (1064) - Assertion Error in makeEngineTensor: 0 (0 <= vectorDim)
: engine init null!
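This is not solved yet. I am leaving it for now and going away for the May Day holiday; I will come back to it afterwards.
For now, here are the environment dependencies: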
ccimport 0.3.7
certifi 2022.12.7
charset-normalizer 2.0.12
colored 1.4.4
contextvars 2.4
cumm-cu102 0.2.9
cycler 0.11.0
dataclasses 0.8
decorator 4.4.2
easydict 1.10
fire 0.5.0
flatbuffers 23.3.3
idna 3.4
imageio 2.15.0
immutables 0.19
importlib-metadata 4.8.3
importlib-resources 5.4.0
jsonpath 0.82
kiwisolver 1.3.1
lark 1.1.5
llvmlite 0.36.0
matplotlib 3.3.4
networkx 2.5.1
ninja 1.11.1
numba 0.53.1
numpy 1.19.5
onnx 1.8.1
onnx-graphsurgeon 0.3.26
onnx-simplifier 0.2.16
onnxruntime 1.10.0
open3d 0.15.1
opencv-python 3.4.18.65
packaging 21.3
pccm 0.3.4
pcdet 0.5.0+0 /home/nvidia/Downloads/OpenPCDet
Pillow 8.4.0
pip 21.3.1
portalocker 2.7.0
protobuf 3.19.6
pybind11 2.10.4
pyparsing 3.0.9
python-dateutil 2.8.2
PyWavelets 1.1.1
PyYAML 6.0
requests 2.27.1
scikit-image 0.17.2
scipy 1.5.4
setuptools 58.0.4
SharedArray 3.2.1
six 1.16.0
spconv-cu102 2.1.25
tensorboardX 2.6
tensorrt 7.1.3.0
termcolor 1.1.0
tifffile 2020.9.3
torch 1.10.0
torchvision 0.11.1
tqdm 4.64.1
typing_extensions 4.1.1
urllib3 1.26.15
wheel 0.37.1
zipp 3.6.0
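After the May Day holiday
It turned out that the earlier error was caused by the TensorRT version being too old: it cannot merge PointPillars into a single network, mainly because of the scatter layer, whereas TensorRT 8 can. So I set up an OpenPCDet environment on my own computer, following the steps above (any OpenPCDet setup guide, online or official, also works). With that in place I exported the ONNX model on my own computer, copied the generated model into the model directory of CUDA-PointPillars on the Orin, created a build directory, compiled inside it, and finally ran the generated demo executable. The OpenPCDet environment on my own computer is listed below.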
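Given this finding, a quick check of the installed TensorRT version before building can save a failed run. The sketch below encodes only the rule of thumb from this article (major version 8 or later); the function names are hypothetical, and the threshold is an assumption drawn from my experience here, not an official compatibility statement.

```python
def tensorrt_major(version_text):
    """Return the major version from a string such as '8.2.3.0'."""
    return int(version_text.split(".")[0])


def can_build_single_engine(version_text):
    """Rule of thumb from this article: TensorRT 8+ handles the whole network."""
    return tensorrt_major(version_text) >= 8


if __name__ == "__main__":
    # The two versions that appear in the environment lists of this article
    for trt_version in ("7.1.3.0", "8.2.3.0"):
        print(trt_version, "->", can_build_single_engine(trt_version))
```

On a real board the version string can be read from `tensorrt.__version__` after `import tensorrt`.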
actionlib 1.14.0
angles 1.9.13
bondpy 1.8.6
camera-calibration 1.17.0
camera-calibration-parsers 1.12.0
catkin 0.8.10
ccimport 0.3.7
certifi 2022.12.7
charset-normalizer 3.1.0
coloredlogs 15.0.1
contourpy 1.0.7
controller-manager 0.19.6
controller-manager-msgs 0.19.6
cumm-cu111 0.2.9
cv-bridge 1.16.2
cycler 0.11.0
diagnostic-analysis 1.11.0
diagnostic-common-diagnostics 1.11.0
diagnostic-updater 1.11.0
dill 0.3.6
dynamic-reconfigure 1.7.3
easydict 1.10
et-xmlfile 1.1.0
fire 0.5.0
flatbuffers 23.3.3
fonttools 4.39.3
gazebo_plugins 2.9.2
gazebo_ros 2.9.2
gencpp 0.7.0
geneus 3.0.0
genlisp 0.4.18
genmsg 0.6.0
gennodejs 2.0.2
genpy 0.6.15
graphsurgeon 0.4.5
grpcio 1.54.0
humanfriendly 10.0
idna 3.4
image-geometry 1.16.2
imageio 2.28.1
importlib-metadata 6.6.0
importlib-resources 5.12.0
interactive-markers 1.12.0
joint-state-publisher 1.15.1
joint-state-publisher-gui 1.15.1
kiwisolver 1.4.4
lark 1.1.5
laser_geometry 1.6.7
lazy_loader 0.2
llvmlite 0.35.0
markdown-it-py 2.2.0
matplotlib 3.7.1
mdurl 0.1.2
message-filters 1.16.0
mpmath 1.3.0
multiprocess 0.70.14
ncnn 1.0.20230426 /home/qsz/my_projects/mmdeploy_pointpillars/mmdeploy-dep/ncnn/python
networkx 3.1
ninja 1.11.1
numba 0.52.0
numpy 1.23.0
onnx 1.13.1
onnx-graphsurgeon 0.3.26
onnx-simplifier 0.4.28
onnxruntime 1.10.0
opencv-python 4.7.0.72
openpyxl 3.1.2
packaging 23.1
pccm 0.3.4
pcdet 0.5.0+0 /home/qsz/my_projects/cuda-pointpillars/OpenPCDet-0.5.0
Pillow 9.5.0
pip 23.1.2
portalocker 2.7.0
protobuf 3.20.2
pybind11 2.10.4
Pygments 2.15.1
pyparsing 3.0.9
python-dateutil 2.8.2
python-qt-binding 0.4.4
PyWavelets 1.4.1
PyYAML 6.0
qt-dotgraph 0.4.2
qt-gui 0.4.2
qt-gui-cpp 0.4.2
qt-gui-py-common 0.4.2
requests 2.29.0
resource_retriever 1.12.7
rich 13.3.5
rosbag 1.16.0
rosboost-cfg 1.15.8
rosclean 1.15.8
roscreate 1.15.8
rosgraph 1.16.0
roslaunch 1.16.0
roslib 1.15.8
roslint 0.12.0
roslz4 1.16.0
rosmake 1.15.8
rosmaster 1.16.0
rosmsg 1.16.0
rosnode 1.16.0
rosparam 1.16.0
rospy 1.16.0
rosservice 1.16.0
rostest 1.16.0
rostopic 1.16.0
rosunit 1.15.8
roswtf 1.16.0
rqt_action 0.4.9
rqt_bag 0.5.1
rqt_bag_plugins 0.5.1
rqt_console 0.4.11
rqt_dep 0.4.12
rqt_graph 0.4.14
rqt_gui 0.5.3
rqt_gui_py 0.5.3
rqt-image-view 0.4.17
rqt_launch 0.4.9
rqt_logger_level 0.4.11
rqt-moveit 0.5.10
rqt_msg 0.4.10
rqt_nav_view 0.5.7
rqt_plot 0.4.13
rqt_pose_view 0.5.11
rqt_publisher 0.4.10
rqt_py_common 0.5.3
rqt_py_console 0.4.10
rqt-reconfigure 0.5.5
rqt-robot-dashboard 0.5.8
rqt-robot-monitor 0.5.14
rqt_robot_steering 0.5.12
rqt_runtime_monitor 0.5.9
rqt-rviz 0.7.0
rqt_service_caller 0.4.10
rqt_shell 0.4.11
rqt_srv 0.4.9
rqt_tf_tree 0.6.3
rqt_top 0.4.10
rqt_topic 0.4.13
rqt_web 0.4.10
rviz 1.14.20
scikit-image 0.19.3
scipy 1.9.1
sensor-msgs 1.13.1
setuptools 67.7.2
SharedArray 3.2.2
six 1.16.0
smach 2.5.1
smach-ros 2.5.1
smclib 1.8.6
spconv-cu111 2.1.25
sympy 1.11.1
tensorboardX 2.6
tensorrt 8.2.3.0
termcolor 2.3.0
terminaltables 3.1.10
tf 1.13.2
tf-conversions 1.13.2
tf2-geometry-msgs 0.7.6
tf2-kdl 0.7.6
tf2-py 0.7.6
tf2-ros 0.7.6
tifffile 2023.4.12
topic-tools 1.16.0
torch 1.10.0+cu111
torchaudio 0.10.0+rocm4.1
torchvision 0.11.0+cu111
tqdm 4.65.0
typing_extensions 4.5.0
uff 0.6.9
urllib3 1.26.15
wheel 0.40.0
xacro 1.14.15
zipp 3.15.0
