The NVIDIA Holoscan Sensor Bridge has become an alternative for low-latency capture, connecting a MIPI camera to a Lattice CrossLink NX FPGA. The data is then forwarded to a Lattice CertusPro NX FPGA, which acts as an intermediate stage for data packaging and transmission over 10-Gigabit Ethernet. The purpose of this intermediate FPGA is to integrate image signal processing algorithms before the image leaves the Holoscan Sensor Bridge, so it arrives ready to be consumed by platforms such as the NVIDIA AGX Orin, shortening the capture latency. This will be explored in a future post. For more information, please visit our blog post Leveraging Low Latency to the Next Level with the Holoscan Sensor Bridge.
The Holoscan Sensor Bridge can also be connected to DPDK-compatible (Data Plane Development Kit) platforms, such as the NVIDIA IGX Orin, simplifying the network stack and reducing kernel overhead during packet exchange.
In this blog, we measure the glass-to-glass latency of the Holoscan Sensor Bridge with an IMX274 camera connected to an NVIDIA AGX Orin. No optimisations on the FPGA or the network stack were performed for these measurements. We also include a rough comparison against the latency of the Argus MIPI capture path integrated into the NVIDIA AGX Orin.
What is Glass-to-Glass Latency?
Glass-to-glass latency is a common metric for live video display that quantifies the delay between camera capture and video display. Low latencies are crucial in applications such as medical endoscopy, where excessive delay can lead to misinterpretation.
The test consists of a camera connected to a system that displays the captured image on a monitor. The setup has two monitors: one showing a chronometer and another displaying the live video capture. The camera must point at both monitors. The setup is then photographed multiple times until a picture clearly shows the clock on both monitors; the latency is the difference between the two readings. The following picture illustrates this scenario:
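Once a photograph clearly shows both clocks, converting the two readings into a latency figure is a simple subtraction. The values below are illustrative only, not measured data:

```python
# Illustrative readings taken from the photograph (replace with your own values).
reference_clock_s = 125.382  # chronometer shown on the first monitor (seconds)
captured_clock_s = 125.286   # same chronometer as seen in the live video on the second monitor

# Glass-to-glass latency is the difference between the two readings.
glass_to_glass_ms = (reference_clock_s - captured_clock_s) * 1000.0
print(f"Glass-to-glass latency: {glass_to_glass_ms:.1f} ms")  # 96.0 ms for these sample values
```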
Setup of the Holoscan Sensor Bridge and Jetson AGX Orin
For the glass-to-glass measurement, the following hardware components are needed:
An NVIDIA Jetson AGX Orin developer kit
A USB-C power supply for the NVIDIA Jetson AGX Orin
A USB-C power supply for the Holoscan Sensor Bridge (2A works)
An Ethernet cable (Category 6 or better)
A micro-USB data cable
A DisplayPort cable (optionally with an HDMI adapter, depending on the monitor)
A monitor or display
A USB keyboard and mouse
A smartphone with camera (or digital camera) to take pictures
For the setup and instructions on getting started, please visit our blog post Getting Started with the NVIDIA Jetson and the Holoscan Sensor Bridge and our developer wiki.
Measurements
This section shows the results and the tools used to get the measurements.
Tools
For the live capture, we followed the instructions from our blog Getting Started with the NVIDIA Jetson and the Holoscan Sensor Bridge. It covers the steps from installing JetPack to running the Holoscan demo.
For the chronometer, any tool that displays a millisecond-precision clock on screen can be used; one possible option is sketched below.
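As a reference, the following is a minimal sketch of such a chronometer, assuming Python 3 with Tkinter is available; any stopwatch application with millisecond resolution works equally well:

```python
# Minimal millisecond chronometer to show on the reference monitor (illustrative sketch).
import time
import tkinter as tk

root = tk.Tk()
root.title("Chronometer")
label = tk.Label(root, font=("DejaVu Sans Mono", 96))  # any monospace font works
label.pack(padx=40, pady=40)

start = time.monotonic()

def tick():
    elapsed = time.monotonic() - start
    label.config(text=f"{elapsed:8.3f} s")  # millisecond resolution
    root.after(5, tick)                     # refresh roughly every 5 ms

tick()
root.mainloop()
```

Note that the resolution you can actually read off a photograph is bounded by the refresh rate of the monitor showing the chronometer, so a high-refresh-rate display gives cleaner readings.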
Image Signal Processing Pipeline
To understand the origin of the latency and how to decompose it, it is necessary to have a look at the ISP pipeline, which is illustrated below:
The process begins with the capture on the Holoscan Sensor Bridge, which comprises the camera PHY and packetisation using UDP over Ethernet. The NVIDIA AGX Orin then receives the packets through UDP sockets; at this point the image is still a raw Bayer pattern. An image signal processing stage applies black level and white balancing, and demosaicing then converts the corrected Bayer image into a 64-bit RGBA image, which is gamma-corrected and visualised through Vulkan.
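To make the order of these stages concrete, here is a minimal NumPy sketch of the same steps (black level and white balance, naive demosaicing, gamma correction, RGBA packing). It is illustrative only; the actual pipeline runs these stages as GPU operators:

```python
# Illustrative ISP sketch: black level / white balance, naive demosaic, gamma, RGBA packing.
import numpy as np

def isp_sketch(bayer_rggb: np.ndarray,
               black_level: float = 64.0,
               wb_gains: tuple = (1.8, 1.0, 1.5),
               gamma: float = 2.2) -> np.ndarray:
    # Black level subtraction and normalisation to [0, 1].
    raw = np.clip(bayer_rggb.astype(np.float32) - black_level, 0.0, None)
    raw /= raw.max() + 1e-6

    # Naive 2x2 demosaic of an RGGB pattern (real pipelines interpolate at full resolution).
    h, w = raw.shape
    rgb = np.zeros((h // 2, w // 2, 3), dtype=np.float32)
    rgb[..., 0] = raw[0::2, 0::2] * wb_gains[0]                            # R
    rgb[..., 1] = 0.5 * (raw[0::2, 1::2] + raw[1::2, 0::2]) * wb_gains[1]  # G
    rgb[..., 2] = raw[1::2, 1::2] * wb_gains[2]                            # B

    # Gamma correction and packing as 16-bit-per-channel RGBA (64 bits per pixel).
    rgb = np.clip(rgb, 0.0, 1.0) ** (1.0 / gamma)
    rgba = np.dstack([rgb, np.ones_like(rgb[..., :1])])
    return (rgba * 65535.0).astype(np.uint16)
```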
Results
First, configure the Jetson clocks and set the power mode to the maximum power profile (for example, using jetson-stats).
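jetson-stats (jtop) exposes these controls interactively; the equivalent calls, wrapped here in a short Python helper and assuming the standard nvpmodel and jetson_clocks tools shipped with JetPack, are:

```python
# Lock the AGX Orin to its maximum power profile and maximum clock frequencies
# before measuring (requires sudo; the same commands can be run directly in a shell).
import subprocess

subprocess.run(["sudo", "nvpmodel", "-m", "0"], check=True)  # select the MAXN power profile
subprocess.run(["sudo", "jetson_clocks"], check=True)        # pin clocks to their maximum
```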
The results obtained from the capture are the following:
| Sensor | Image Dimensions (px) | CPU Usage (%) | RAM Usage (MiB) | GPU Usage (%) | Glass-to-Glass Latency (ms) |
|--------|-----------------------|---------------|-----------------|---------------|-----------------------------|
| IMX274 | 3840x2160 (4K) | 2.56 (total), 30.6 (core) | 347 | 14 | 96 |
The (total) CPU usage is the percentage of the entire CPU, whereas the (core) value is the usage relative to a single CPU core.
An illustration of the setup is the following:
Comparison Against Other Capture Methods on NVIDIA Jetson
The camera can also be connected directly to the MIPI CSI PHY of the NVIDIA Jetson AGX Orin. In this case, the integrated ISP and the LibArgus capture stack deliver the image. For this purpose, we use GStreamer with the nvarguscamerasrc element to create a glass-to-glass display video capture.
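A representative pipeline is sketched below. The sensor-id, resolution, and frame rate are assumptions and depend on the modes exposed by the camera driver; the same pipeline string can also be launched directly with gst-launch-1.0 from a shell:

```python
# Glass-to-glass display pipeline through LibArgus and the integrated ISP (illustrative sketch).
import time

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)
# nvarguscamerasrc captures through LibArgus; nv3dsink renders the frames on the display.
pipeline = Gst.parse_launch(
    "nvarguscamerasrc sensor-id=0 ! "
    "video/x-raw(memory:NVMM),width=3840,height=2160,framerate=30/1 ! "
    "nv3dsink sync=false"
)
pipeline.set_state(Gst.State.PLAYING)
try:
    while True:          # keep the pipeline rendering; stop with Ctrl+C
        time.sleep(1)
except KeyboardInterrupt:
    pipeline.set_state(Gst.State.NULL)
```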
The results are:
| Sensor | Image Dimensions (px) | CPU Usage (%) | RAM Usage (MiB) | GPU Usage (%) | Glass-to-Glass Latency (ms) |
|--------|-----------------------|---------------|-----------------|---------------|-----------------------------|
| IMX477 - Xavier AGX, JetPack 5 | 3840x2160 (4K, 30 fps) | - | - | 0 | 115 |
| IMX477 - Xavier NX, JetPack 4 | 1920x1080 (1080p, 30 fps) | - | - | - | 77.8 |
| IMX477 - Xavier NX, JetPack 4 | 1920x1080 (1080p, 60 fps) | - | - | - | 110.6 |
Although these results are close to the 96 ms mark achieved by the Holoscan Sensor Bridge, the Holoscan pipeline has not yet been tuned to lower its latency. Our next blog will show that it is possible to modify the ISP to reach latencies below 60 ms. Moreover, it is also possible to lower the latency by optimising the network communication with DPDK.
Important Remarks
For stereo capture, the processing platform (i.e. the NVIDIA Jetson) must have two separate network interfaces, given that each camera stream is delivered on a separate port.
The NVIDIA Jetson AGX Orin developer kit does not include a DPDK-compatible network card, so the capture falls back to the Linux socket stack, which increases the glass-to-glass latency. To use DPDK, it is necessary to connect a DPDK-compatible card or use a custom carrier board with a compatible NIC.
Expect more information from us
The next blog post will show how to lower the latency by using our CUDA ISP.
If you want to know more about how to leverage this technology in your project: Contact Us.