CANN/GE: Fusion Pass Example for Moving ReLU Before Concat

Example Usage Guide

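GE (Graph Engine) is a graph compiler and executor for Ascend. It provides computation graph optimization, multi-stream parallelism, memory reuse, and model sinking to speed up model execution and reduce model memory usage. GE offers friendly integration with the PyTorch and TensorFlow front ends, and supports parsing and compiling mainstream model formats such as onnx and pb. Project address: https://gitcode.com/cann/ge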

Function Description

This example implements a custom pass that moves ReLU before Concat. Refer to it when a fusion pass involves operators with a dynamic number of inputs or outputs. The example covers both online inference and offline model compilation with the atc tool to show how the framework invokes a custom pass to optimize the graph. It uses the eager style api and the fusion interfaces.
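The rewrite is safe because ReLU is elementwise: applying it to each Concat input and then concatenating gives the same values as concatenating first. The following is a minimal sketch in plain Python, where flat lists stand in for tensors; the relu and concat helpers are illustrative only and are not GE or torch APIs.

```python
from typing import List


def relu(xs: List[float]) -> List[float]:
    """Elementwise ReLU on a flat list standing in for a tensor."""
    return [max(0.0, x) for x in xs]


def concat(parts: List[List[float]]) -> List[float]:
    """Concatenate any number of inputs along a single axis."""
    out: List[float] = []
    for part in parts:
        out.extend(part)
    return out


# Concat accepts a dynamic number of inputs; three are used here.
inputs = [[-1.0, 2.0], [3.0, -4.0, 0.5], [-0.25]]

relu_after_concat = relu(concat(inputs))                 # original pattern
relu_before_concat = concat([relu(x) for x in inputs])   # pattern after the pass

assert relu_after_concat == relu_before_concat
print(relu_before_concat)
```

Note that the rewritten graph contains one ReLU per Concat input, which is why the number of nodes created by the pass depends on the input count.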

Directory Structure

├── src
│   └── move_relu_before_concat_pass.cpp            // pass implementation file
├── CMakeLists.txt                                  // build script
├── data
│   ├── es_gen_air.py                               // export air
│   └── torch_forward.py                            // torch script for online inference
└── gen_es_api
    └── CMakeLists.txt                              // build script for generating eager style api

Environment Requirements

Implementation Steps

  1. Define the MoveReluBeforeConcatPass class, which inherits from FusionBasePass.
  2. Override the Run method of the FusionBasePass base class to implement the custom pass logic.
  3. Define FindConcatNodesMeetRequirements to traverse the nodes in the graph and collect the Concat nodes that meet the conditions.
  4. Define MoveReluBeforeConcat to implement the graph modification:
    • Replacement builds the replacement structure based on the concat node.
    • GetSubgraphBoundary builds the boundary of the subgraph to be replaced.
    • Finally, call the Replace method of SubgraphRewriter to perform the replacement.
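The find-then-rewrite flow of these steps can be sketched on a toy graph. This is a Python illustration under stated assumptions, not the C++ implementation: Node, consumers, find_concat_nodes, and move_relu_before_concat are hypothetical names, the toy graph does not use FusionBasePass or SubgraphRewriter, and the matching condition (a Concat whose single consumer is a Relu) is assumed because the conditions checked by the real pass are not listed here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    op: str
    inputs: List[str] = field(default_factory=list)  # names of producer nodes


Graph = Dict[str, Node]


def consumers(graph: Graph, name: str) -> List[Node]:
    """Return every node that reads the output of `name`."""
    return [n for n in graph.values() if name in n.inputs]


def find_concat_nodes(graph: Graph) -> List[Node]:
    """Collect Concat nodes whose single consumer is a Relu (assumed condition)."""
    found = []
    for node in graph.values():
        if node.op != "Concat":
            continue
        users = consumers(graph, node.name)
        if len(users) == 1 and users[0].op == "Relu":
            found.append(node)
    return found


def move_relu_before_concat(graph: Graph, concat: Node) -> None:
    """Replace Concat -> Relu with one Relu per Concat input -> Concat."""
    old_relu = consumers(graph, concat.name)[0]
    new_inputs = []
    # One new Relu per Concat input, however many inputs there are.
    for i, src in enumerate(concat.inputs):
        relu = Node(f"{old_relu.name}_{i}", "Relu", [src])
        graph[relu.name] = relu
        new_inputs.append(relu.name)
    concat.inputs = new_inputs
    # Consumers of the old Relu now read from Concat directly.
    for user in consumers(graph, old_relu.name):
        user.inputs = [concat.name if x == old_relu.name else x for x in user.inputs]
    del graph[old_relu.name]


graph: Graph = {n.name: n for n in [
    Node("a", "Data"), Node("b", "Data"), Node("c", "Data"),
    Node("concat", "Concat", ["a", "b", "c"]),
    Node("relu", "Relu", ["concat"]),
    Node("out", "NetOutput", ["relu"]),
]}

for concat_node in find_concat_nodes(graph):
    move_relu_before_concat(graph, concat_node)

print(graph["concat"].inputs, graph["out"].inputs)
```

In this sketch the subgraph boundary is implicit: the Concat inputs are the incoming edges and the consumers of the old Relu are the outgoing edges. The real pass states that boundary explicitly through GetSubgraphBoundary before calling Replace.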

Program Compilation

Assume the CANN package is installed under INSTALL_PATH, e.g., /home/HwHiAiUser/Ascend/.

  1. Configure environment variables.

    Run the environment variable script shipped with the package:

    source ${ASCEND_PATH}/set_env.sh
    

    ${ASCEND_PATH} is the cann path under the CANN package installation directory. Replace it with the actual installation path, e.g., ${INSTALL_PATH}/cann.

  2. Modify the following settings in CMakeLists.txt as needed.

    • ASCEND_PATH: the default package path. If $ASCEND_HOME_PATH has already been set via set_env.sh, no modification is needed.

    • PASS_SO_DIR: the name of the installation directory for the custom fusion pass dynamic library. The default is pass_so_dir.

    • target_include_directories: the header files to include. This example needs no modification. For user-developed code that requires more headers, add lines below the existing entries and do not delete any of them. If the network has custom operators, add the header files that define the custom operator prototypes.

    • target_link_libraries: the libraries to link. This example needs no modification. For user-developed code that requires more libraries, add lines below the existing entries and do not delete any of them.

      Do not link other so files from the package; doing so may cause compatibility issues during future upgrades.

  3. Run the following commands in order:

    mkdir build && cd build
    cmake ..
    
  4. Run make to compile the custom pass so. After a successful build, run make install to install the dynamic library libmove_relu_before_concat_pass.so into the custom fusion pass directory. Optionally append -j$(nproc) to make for a parallel build; $(nproc) expands to the number of CPU cores.

    make -j$(nproc) move_relu_before_concat_pass
    make install
    

    After the example has been validated, run the following command to remove the custom pass so installed under the CANN package, so that it does not affect subsequent UT/ST:

    make clean_custom_pass
    

Program Execution

  1. Configure environment variables (skip this step if already done).

    • Run the environment variable script shipped with the package:

      source ${ASCEND_PATH}/set_env.sh
      

      Replace ${ASCEND_PATH} with the actual package installation path.

  2. Offline model compilation with the ATC tool.

    • Set the environment variable to dump the model graphs during compilation:

      export DUMP_GE_GRAPH=1
      
    • Install es_all.whl:

      pip install --force-reinstall --upgrade --target ${ASCEND_PATH}/python/site-packages/ \
          ${BUILD_PATH}/es_output/whl/es_all-*****.whl
      

      Replace ${BUILD_PATH} with the actual build directory path.

    • Set the environment variable to add the es_all.so path:

      export LD_LIBRARY_PATH="${BUILD_PATH}/es_output/lib64:${LD_LIBRARY_PATH}"
      
    • Enter the data directory and run the .py file to export the air model:

      python es_gen_air.py

    • After the script finishes, an .air model file named graph.air is generated in the data directory.

    • Run the ATC tool command (for detailed ATC instructions, visit the Ascend documentation and search for "ATC Offline Model Compilation Tool"), and set soc_version to match your environment:

      atc --model=./graph.air --framework=1 --soc_version=xxx --output=./model
      
    • The following log appears:

      MoveReluBeforeConcatPass
      Define Replacement for MoveReluBeforeConcatPass
      Replacement of MoveReluBeforeConcatPass succeeded
      
  3. Online inference

    • Set the environment variable to dump the model graphs during compilation:

      export DUMP_GE_GRAPH=1
      
    • Enter the data directory and run the .py file for online inference (make sure the torch_npu plugin is installed):

      python torch_forward.py
      
    • The following log appears:

      MoveReluBeforeConcatPass
      Define Replacement for MoveReluBeforeConcatPass
      Replacement of MoveReluBeforeConcatPass succeeded
      
  4. View execution results

    • After the ATC tool command completes, a series of .pbtxt files is generated in the directory. Compare the following dump graphs:

      • ge_onnx_xxxxx_PreRunBegin.pbtxt: dump graph before execution
      • ge_onnx_xxxxx_RunCustomPassBeforeInferShape.pbtxt: dump graph of the custom pass, taken before InferShape execution

      The model should be optimized as expected, i.e., ReLU has been moved before Concat.

    • If the expected result is not obtained, set the following environment variables (when using the atc command, also add the --log=debug parameter) to print logs to the screen for troubleshooting:

        export ASCEND_SLOG_PRINT_TO_STDOUT=1 # print logs to screen
        export ASCEND_GLOBAL_LOG_LEVEL=0 # log level debug
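The dump graph comparison in step 4 can also be done by counting operator types in the two .pbtxt files. The sketch below is a rough aid under one assumption: that each node in the dump records its type on a field of the form op_type: "..." as in protobuf text format. The exact layout of GE dump graphs is not described here, so adjust the pattern if your files differ. The two sample strings are made-up stand-ins, not real dump output, and count_op_types is a hypothetical helper.

```python
import re
from collections import Counter

# Assumed field layout: op_type: "<type>"
OP_TYPE_RE = re.compile(r'op_type:\s*"([^"]+)"')


def count_op_types(pbtxt_text: str) -> Counter:
    """Count how many nodes of each operator type appear in the text."""
    return Counter(OP_TYPE_RE.findall(pbtxt_text))


# Hypothetical stand-ins for the contents of the two dump files.
before_text = '''
node { name: "concat" op_type: "Concat" }
node { name: "relu" op_type: "Relu" }
'''
after_text = '''
node { name: "relu_0" op_type: "Relu" }
node { name: "relu_1" op_type: "Relu" }
node { name: "concat" op_type: "Concat" }
'''

before = count_op_types(before_text)
after = count_op_types(after_text)
print("Relu nodes before:", before["Relu"], "after:", after["Relu"])
```

To apply this to real dumps, read each file with pathlib.Path(...).read_text() and pass the text to count_op_types. If the pass ran, the Relu count after the pass would be expected to match the number of Concat inputs, while the Concat count stays the same.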
      


