
Step-by-step example

The step-by-step example goes through the whole process of setting up and using an NVIDIA Jetson Xavier NX with our custom YOLOv5 implementation.

1. Setting up the NVIDIA Jetson Xavier NX

To set up the NVIDIA Jetson Xavier NX, follow NVIDIA's official getting-started guide:

Getting Started with Jetson Xavier NX Developer Kit

2. Plugging in I/O devices

After the NVIDIA Jetson Xavier NX has been set up properly, you should plug in the required I/O devices.

You will need the USB camera (an ELP-USB8MP02G-SFV in this example) and the configured Arduino; both are connected via USB and are checked in step 4 below.

3. Preparing the NVIDIA Jetson Xavier NX

First, you may want to head over to the system settings and turn off the automatic screen saver and screen lock. Otherwise the screen may lock while you build the Docker image, which takes quite some time.
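
If you prefer the terminal, the same settings can usually be changed with gsettings. This is only a sketch and assumes the stock GNOME desktop shipped with the Jetson's Ubuntu image:

# Never blank the screen automatically
gsettings set org.gnome.desktop.session idle-delay 0

# Disable the screen lock
gsettings set org.gnome.desktop.screensaver lock-enabled false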

After that, it is recommended to update all packages on the system with:

sudo apt update; sudo apt upgrade -y

You can quickly open a new terminal on Ubuntu with the hotkey Ctrl+Alt+T.

4. Checking I/O devices

4.1 Checking camera

First, list all current devices on the system with:

ls /dev

You should be able to see a video0 device in the terminal output. This represents the connected camera.
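
If the full device listing is hard to scan, you can list only the video devices:

ls /dev/video*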

The NVIDIA Jetson Xavier NX provides a tool called nvgstcapture-1.0 that can test a camera's output by using pre-constructed GStreamer pipelines.

In this example we assume that an ELP-USB8MP02G-SFV camera is connected, which is a USB camera.

To test a USB camera with the nvgstcapture-1.0 tool, run the following:

nvgstcapture-1.0 --camsrc=0 --cap-dev-node=0

You should now be able to see the camera's output in an additional window.

Consult the repository's README to get more information about the nvgstcapture-1.0 tool.
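
You can also print the tool's available options directly on the device:

nvgstcapture-1.0 --help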

Furthermore, you should check the camera's supported formats, as those will be relevant later when configuring YOLOv5's source.

The v4l-utils package makes this process very simple. First, install it via apt, like so:

sudo apt install -y v4l-utils

If the installation was successful, you can go ahead and use it:

v4l2-ctl -d /dev/video0 --list-formats-ext

For the camera used in this example (ELP-USB8MP02G-SFV) it will produce the following output:

ioctl: VIDIOC_ENUM_FMT
        Index       : 0
        Type        : Video Capture
        Pixel Format: 'MJPG' (compressed)
        Name        : Motion-JPEG
                Size: Discrete 1600x1200
                        Interval: Discrete 0.067s (15.000 fps)
                Size: Discrete 3264x2448
                        Interval: Discrete 0.067s (15.000 fps)
                Size: Discrete 2592x1944
                        Interval: Discrete 0.067s (15.000 fps)
                Size: Discrete 2048x1536
                        Interval: Discrete 0.067s (15.000 fps)
                Size: Discrete 1280x960
                        Interval: Discrete 0.067s (15.000 fps)
                Size: Discrete 1024x768
                        Interval: Discrete 0.033s (30.000 fps)
                Size: Discrete 800x600
                        Interval: Discrete 0.033s (30.000 fps)
                Size: Discrete 640x480
                        Interval: Discrete 0.033s (30.000 fps)
                Size: Discrete 320x240
                        Interval: Discrete 0.033s (30.000 fps)
                Size: Discrete 1600x1200
                        Interval: Discrete 0.067s (15.000 fps)
 
        Index       : 1
        Type        : Video Capture
        Pixel Format: 'YUYV'
        Name        : YUYV 4:2:2
                Size: Discrete 1600x1200
                        Interval: Discrete 0.100s (10.000 fps)
                Size: Discrete 3264x2448
                        Interval: Discrete 0.500s (2.000 fps)
                Size: Discrete 2592x1944
                        Interval: Discrete 0.333s (3.000 fps)
                Size: Discrete 2048x1536
                        Interval: Discrete 0.333s (3.000 fps)
                Size: Discrete 1280x960
                        Interval: Discrete 0.100s (10.000 fps)
                Size: Discrete 1024x768
                        Interval: Discrete 0.100s (10.000 fps)
                Size: Discrete 800x600
                        Interval: Discrete 0.033s (30.000 fps)
                Size: Discrete 640x480
                        Interval: Discrete 0.033s (30.000 fps)
                Size: Discrete 320x240
                        Interval: Discrete 0.033s (30.000 fps)
                Size: Discrete 1600x1200
                        Interval: Discrete 0.100s (10.000 fps)

4.2 Checking configured Arduino

List all current devices on the system with:

ls /dev

You should be able to see a ttyUSB0 device in the terminal output. This represents the connected configured Arduino.
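
If ttyUSB0 does not show up, the kernel log usually shows whether the USB-to-serial adapter was recognized at all, and a quick permission check on the device node cannot hurt either:

sudo dmesg | grep -i ttyusb

ls -l /dev/ttyUSB0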

5. Setting up our custom implementation of YOLOv5

5.1 Cloning the repositories

First, clone the repository into the home (~) directory. Cloning it into a directory named farm-robot-yolov5 keeps the path consistent with the commands below:

git clone https://gitlab.com/fablabkamplintfort1/farmrobot.git farm-robot-yolov5

Then, change into the repository's directory and clone the official YOLOv5 into it:

cd farm-robot-yolov5
 
git clone https://github.com/ultralytics/yolov5.git

Once this is done, you are theoretically ready to build the Docker image. However, you will usually want to configure the implementation via the .toml configuration files first.

5.2 Configuration

The repository comes with a default_config.toml which contains the default configuration. However, you should not edit this file directly, since it is version controlled by the repository! Instead, copy the file with a new name (config.toml), like so:

cp default_config.toml config.toml

The config.toml file is ignored by git, so you can edit it however you like.

In the newly created config.toml you will have to edit at least one value: source. The source value should consist of a GStreamer pipeline. With the ELP-USB8MP02G-SFV camera, you can use the following GStreamer pipeline:

v4l2src device=/dev/video0 ! image/jpeg, width=2592, height=1944, framerate=15/1 ! nvv4l2decoder mjpeg=1 ! nvvidconv flip-method=0 ! video/x-raw, format=BGRx ! videoconvert ! video/x-raw, format=BGR ! appsink

It is important to replace the width=, height= and framerate= values with a format that the used camera actually supports (see the v4l2-ctl output above)!

The configured config.toml should look something like this now:

[yolov5]
# Input source.
source = "v4l2src device=/dev/video0 ! image/jpeg, width=2592, height=1944, framerate=15/1 ! nvv4l2decoder mjpeg=1 ! nvvidconv flip-method=0 ! video/x-raw, format=BGRx ! videoconvert ! video/x-raw, format=BGR ! appsink"
 
...

Consult the repository's README to get more information about GStreamer pipelines.
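
If you want to sanity-check the pipeline outside of our implementation first, you can run a close variant of it with gst-launch-1.0, swapping the appsink for a display sink. Note that this sketch uses the generic software JPEG decoder instead of the NVIDIA-specific elements from the pipeline above:

gst-launch-1.0 v4l2src device=/dev/video0 ! image/jpeg,width=2592,height=1944,framerate=15/1 ! jpegdec ! videoconvert ! autovideosink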

If you intend to run the NVIDIA Jetson Xavier NX without an internet connection, you have to download the configured weights manually before you build and instantiate the Docker container. The default YOLOv5 weights can be found on its releases page.
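
For example, assuming the default yolov5s weights from the v7.0 release, you could fetch them ahead of time and place the file wherever your configuration expects it (consult the repository's README for the exact location):

wget https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt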

5.3 Building the Docker image

Building the Docker image is easy and straightforward. Simply run:

sudo docker build -t farm-robot-yolov5 .

The build process of the Docker image could take up to an hour!
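
Once the build has finished, you can confirm that the image is available:

sudo docker images farm-robot-yolov5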

5.4 Instantiating the Docker container

When instantiating the Docker container, make sure to mount all required devices (the camera and the configured Arduino)! With the devices used in this example, you can instantiate the container like so:

sudo docker run -it --rm --runtime nvidia --device /dev/video0 --device /dev/ttyUSB0 farm-robot-yolov5

Consult the repository's README to get more information about mounting devices to a Docker container.
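
If you are unsure whether the devices made it into the container, you can list them from the container's shell before starting the inference:

ls -l /dev/video0 /dev/ttyUSB0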

5.5 Starting the inference

Once the interactive Docker container has booted, you can run the main application:

python3 .

If everything is configured correctly, the delta arm should start to move to its home position and then to its configured initial position. A few seconds after that, YOLOv5's inference should start. When YOLOv5 detects an object that has been declared as a “target” (via the targets list in the config.toml), the delta arm should move to the object's position.
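
As a rough illustration, a targets entry in config.toml could look like the snippet below. The class names are placeholders and the exact section and key names are defined by default_config.toml, so treat this purely as a sketch:

[yolov5]
# Detected classes the delta arm should move to (placeholder values).
targets = ["example-class-1", "example-class-2"]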