Preparing Deep Learning Models for Isaac ROS

Obtaining a Pre-trained Model from NGC

The NVIDIA GPU Cloud hosts a catalog of Deep Learning pre-trained models that are available for your development.

  1. Use the Search Bar to find a pre-trained model that you are interested in working with.

  2. Click on the model’s card to view an expanded description, and then click on the File Browser tab along the navigation bar.

  3. Using the File Browser, find a deployable .etlt file for the model you are interested in.

    Note: The .etlt file extension indicates that this model has pre-trained but encrypted weights, which means one needs to use the tao-converter utility to decrypt the model, as described below.

  4. Under the Actions heading, click on the icon for the file you selected in the previous step, and then click Copy wget command.

  5. Paste the copied command into a terminal to download the model in the current working directory.

Using tao-converter to decrypt the Encrypted TLT Model (.etlt) Format

As discussed above, models distributed with the .etlt file extension are encrypted and must be decrypted before use via NVIDIA’s tao-converter.

tao-converter is already included in the Docker images available as part of the standard Isaac ROS Development Environment.

The per-platform installation paths are described below:

Platform

Installation Path

Symlink Path

x86_64

/opt/nvidia/tao/tao-converter-x86-tensorrt8.0/tao-converter

/opt/nvidia/tao/tao-converter

Jetson (aarch64)

/opt/nvidia/tao/jp5

/opt/nvidia/tao/tao-converter

Converting .etlt to a TensorRT Engine Plan

Here are some examples for generating the TensorRT engine file using tao-converter. In this example, we will use the PeopleSemSegnet Shuffleseg model:

Generate an engine file for the fp16 data type

mkdir -p /workspaces/isaac_ros-dev/models && \
   /opt/nvidia/tao/tao-converter -k tlt_encode -d 3,544,960 -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 -t fp16 -e /workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine -o argmax_1 peoplesemsegnet_shuffleseg_etlt.etlt

Note

The specific values used in the command above are retrieved from the PeopleSemSegnet page under the Overview tab.The model input node name and output node name can be found in peoplesemsegnet_shuffleseg_cache.txt from File Browser. The output file is specified using the -e option. The tool needs write permission to the output directory.

A detailed explanation of the input parameters is available here.

Generate an engine file for the data type int8

Create the models directory:

mkdir -p /workspaces/isaac_ros-dev/models
Download the calibration cache file:

Note

Check the model’s page on NGC for the latest wget command.

wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_cache.txt
/opt/nvidia/tao/tao-converter -k tlt_encode -d 3,544,960 -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 -t int8 -c peoplesemsegnet_shuffleseg_cache.txt -e /workspaces/isaac_ros-dev/models/peoplesemsegnet_shuffleseg.engine -o argmax_1 peoplesemsegnet_shuffleseg_etlt.etlt

Note

The calibration cache file (specified using the -c option) is required to generate the int8 engine file. This file is provided in the File Browser tab of the model’s page on NGC.

Using trtexec to convert an ONNX model to a TensorRT Plan File

Assuming that a model called model.onnx is available, the conversion is performed using:

/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.plan

Warning

Reading the documentation of trtexec is highly recommended to obtain best performance. In particular, we recommend pay attention to the quantization of the model (e.g. fp32 vs fp16 vs int8).

Inspecting The Input and Output Binding Names of a Model

Deep learning models have input_binding_names and output_binding_names. These correspond to the model’s inputs and outputs respectively. These are determined by the model itself during export. There are two methods one can perform to determine this, but the recommended way is using a TensorRT Plan File.

Note

In addition, the TensorRTNode and TritonNode have parameters called input_tensor_names and output_tensor_names, these correspond to the expected tensor names within the ROS 2 TensorList.

Using an ONNX Model File

If an ONNX Model file is used, one can use netron to visualize the ONNX model, and note down the input and output names and dimensions.

Using a TensorRT Plan File

If a TensorRT Plan file is used, one can use NVIDIA’s polygraph tool to determine it.

  1. Install TensorRT’s Python binding and the polygraph tool:

    pip install tensorrt tensorrt_bindings
    pip install colored polygraphy --extra-index-url https://pypi.ngc.nvidia.com
    
  2. Add /home/admin/.local/bin to your PATH to use polygraph more conveniently:

    export PATH="/home/admin/.local/bin:$PATH"
    
  3. Obtain the desired model. In this case, we’ll show how to get the PeopleSemSegnet ShuffleSeg network:

    mkdir -p /tmp/models/peoplesemsegnet_shuffleseg/1 && \
      cd /tmp/models/peoplesemsegnet_shuffleseg && \
      wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_etlt.etlt && \
      wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplesemsegnet/versions/deployable_shuffleseg_unet_v1.0/files/peoplesemsegnet_shuffleseg_cache.txt
    
  4. Convert the obtained model from an etlt file to a plan file (called model.plan):

    /opt/nvidia/tao/tao-converter -k tlt_encode -d 3,544,960 -p input_2:0,1x3x544x960,1x3x544x960,1x3x544x960 -t int8 -c peoplesemsegnet_shuffleseg_cache.txt -e /tmp/models/peoplesemsegnet_shuffleseg/1/model.plan -o argmax_1 peoplesemsegnet_shuffleseg_etlt.etlt
    
  5. Now go to the directory where we obtained the PeopleSemSegNet ShuffleSeg model:

    cd /tmp/models/peoplesemsegnet_shuffleseg/1
    
  6. Now use polygraph to inspect the names of the inputs and outputs of the model. In this case, the model we obtained is called model.plan:

    polygraphy inspect model model.plan
    

    The expected output should look like this:

    [I] Loading bytes from /tmp/models/peoplesemsegnet_shuffleseg/1/model.plan
    [I] ==== TensorRT Engine ====
      Name: Unnamed Network 0 | Explicit Batch Engine
    
      ---- 1 Engine Input(s) ----
      {input_2:0 [dtype=float32, shape=(1, 3, 544, 960)]}
    
      ---- 1 Engine Output(s) ----
      {argmax_1 [dtype=int32, shape=(1, 544, 960, 1)]}
    
      ---- Memory ----
      Device Memory: 21269504 bytes
    
      ---- 1 Profile(s) (2 Tensor(s) Each) ----
      - Profile: 0
          Tensor: input_2:0          (Input), Index: 0 | Shapes: min=(1, 3, 544, 960), opt=(1, 3, 544, 960), max=(1, 3, 544, 960)
          Tensor: argmax_1          (Output), Index: 1 | Shape: (1, 544, 960, 1)
    
      ---- 73 Layer(s) ----
    

In this case, the input_binding_names for this network is ['input_2:0'], whereas the output_binding_names is ['argmax_1']. The shape of each dimension can also be observed from this command.

These values can be taken and used as the input_binding_names and output_binding_names for the TensorRTNode or TritonNode. If a model has multiple inputs or outputs, these must be passed in as a string list of all the values. Once again, ensure that the TensorRTNode or TritonNode’s input_tensor_names and output_tensor_names parameters are correctly set according to the names of the ROS 2 TensorList message obtained from any upstream nodes or expected by any downstream nodes.