Can Your Vision AI Solution Keep Up with Cortex-M85?

Publish Time: 2023-12-14 Article Source: RENESAS Blogs

In this blog, we discuss a vision AI application built on the new RA8D1 graphics-enabled MCUs, which feature the Cortex-M85 core, and the use of Helium to accelerate the neural network. RA8D1 MCUs provide a unique combination of advanced graphics capabilities, sensor interfaces, large memory, and the powerful Cortex-M85 core with Helium for accelerating vision AI neural networks, making them well suited to these vision AI applications.

Graphics and Vision AI Applications with RA8D1 MCUs

Renesas has demonstrated the performance uplift with Helium in various AI/ML use cases, showing significant improvement over a Cortex-M7 MCU, more than 3.6x in some cases.

One such use case is a people detection application developed in collaboration with Plumerai, a leading provider of vision AI solutions. This camera-based AI solution has been ported to and optimized for the Helium-enabled Arm Cortex-M85 core, demonstrating both the performance and the graphics capabilities of the RA8D1 devices.

Accelerated with Helium, the application achieves a 3.6x performance uplift over a Cortex-M7 core and a frame rate of 13.6 fps, strong performance for an MCU without a dedicated hardware accelerator. The demo platform captures live images from an OV7740 image-sensor-based camera at 640x480 resolution and presents detection results on an attached 800x480 LCD. The software detects and tracks each person within the camera frame, even if partially occluded, and draws a bounding box around each detected person, overlaid on the live camera display.

Figure 1: Renesas People Detection AI Demo Platform, showcased at Embedded World 2023

Plumerai's people detection software uses a convolutional neural network with multiple layers, trained on over 32 million labeled images. The layers that account for the majority of the total latency are Helium accelerated, including the Conv2D and fully connected layers as well as the depthwise convolution and transpose convolution layers.
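
To make concrete what kind of work Helium speeds up in these layers, here is a minimal scalar sketch of a fully connected layer. It assumes int8 weights and activations with 32-bit accumulation, which the source does not state. The function name fc_int8_reference is hypothetical; this is neither Plumerai's code nor the CMSIS-NN API. Helium (MVE) works on 128-bit vectors, so a vectorized kernel can handle up to 16 int8 elements per vector operation where this loop handles one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for a fully connected layer (hypothetical helper):
 *   output[o] = bias[o] + sum over i of weights[o][i] * input[i]
 * Weights are stored row-major, one row of in_len values per output.
 * Requantization of the 32-bit result back to int8 is omitted for brevity. */
void fc_int8_reference(const int8_t *input, const int8_t *weights,
                       const int32_t *bias, int32_t *output,
                       size_t in_len, size_t out_len)
{
    assert(input && weights && bias && output);

    for (size_t o = 0; o < out_len; ++o) {
        const int8_t *row = &weights[o * in_len];
        int32_t acc = bias[o];

        /* Multiply-accumulate inner loop. This is the part a vector unit
         * can process several elements at a time instead of one per step. */
        for (size_t i = 0; i < in_len; ++i) {
            acc += (int32_t)row[i] * (int32_t)input[i];
        }
        output[o] = acc;
    }
}
```

Convolution layers reduce to the same multiply-accumulate pattern applied over sliding windows, which is why these layer types tend to dominate inference latency.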

The camera module provides images in YUV422 format, which is converted to RGB565 format for display on the LCD screen. The 2D graphics engine integrated on the RA8D1 resizes the RGB565 image and converts it to ABGR8888 at a resolution of 256x192 for input to the neural network. The software then converts the ABGR8888 image to the neural network model's input format and runs the people detection inference function. The graphics LCD controller and 2D drawing engine on the RA8D1 are used to render the camera input to the LCD screen, draw bounding boxes around detected people, and present the frame rate. The people detection software uses roughly 1.2MB of flash and 320KB of SRAM, including the memory for the 256x192 ABGR8888 input image.
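
The format conversions in this pipeline can be sketched in software as follows. On the RA8D1 the resize and the RGB565 to ABGR8888 step are done by the 2D graphics engine, so this is only an illustration of the arithmetic, not the demo's code. Several details are assumptions not given in the source: full-range BT.601 coefficients, a Y0 U Y1 V byte order for YUV422, and an ABGR8888 word with alpha in the top byte and red in the bottom byte. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NN_INPUT_WIDTH = 256, NN_INPUT_HEIGHT = 192, ABGR8888_BYTES_PER_PIXEL = 4 };

static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Pack 8-bit R, G, B into RGB565 by keeping the top 5/6/5 bits. */
uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r & 0xF8u) << 8) | ((g & 0xFCu) << 3) | (b >> 3));
}

/* One YUV sample to RGB565, using fixed-point full-range BT.601
 * coefficients scaled by 256 (assumed; the source names no matrix). */
uint16_t yuv_to_rgb565(uint8_t y, uint8_t u, uint8_t v)
{
    int d = (int)u - 128;
    int e = (int)v - 128;
    int r = y + (359 * e) / 256;
    int g = y - (88 * d + 183 * e) / 256;
    int b = y + (454 * d) / 256;
    return pack_rgb565(clamp_u8(r), clamp_u8(g), clamp_u8(b));
}

/* One YUV422 macropixel (assumed byte order Y0 U Y1 V) to two RGB565
 * pixels. Both pixels share the same U and V samples. */
void yuyv_to_rgb565_pair(const uint8_t yuyv[4], uint16_t out[2])
{
    assert(yuyv && out);
    out[0] = yuv_to_rgb565(yuyv[0], yuyv[1], yuyv[3]);
    out[1] = yuv_to_rgb565(yuyv[2], yuyv[1], yuyv[3]);
}

/* Expand RGB565 to a 32-bit ABGR8888 word (assumed layout: A in bits
 * 31..24, B in 23..16, G in 15..8, R in 7..0) with opaque alpha.
 * The high bits are replicated into the low bits so that full-scale
 * 5-bit or 6-bit values map to 255. */
uint32_t rgb565_to_abgr8888(uint16_t px)
{
    uint32_t r5 = (px >> 11) & 0x1Fu;
    uint32_t g6 = (px >> 5) & 0x3Fu;
    uint32_t b5 = px & 0x1Fu;
    uint32_t r8 = (r5 << 3) | (r5 >> 2);
    uint32_t g8 = (g6 << 2) | (g6 >> 4);
    uint32_t b8 = (b5 << 3) | (b5 >> 2);
    return ((uint32_t)0xFFu << 24) | (b8 << 16) | (g8 << 8) | r8;
}

/* Size in bytes of the 256x192 ABGR8888 network input buffer. */
size_t nn_input_bytes(void)
{
    return (size_t)NN_INPUT_WIDTH * NN_INPUT_HEIGHT * ABGR8888_BYTES_PER_PIXEL;
}
```

The buffer size works out to 256 x 192 x 4 = 196,608 bytes, or 192KB, so the input image alone accounts for a large share of the roughly 320KB of SRAM quoted above.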

Figure 2: People Detection AI application on the RA8D1 MCU
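
Because the network runs on a 256x192 image while the camera frame is 640x480, detected boxes have to be mapped back to camera coordinates before they are overlaid. The sketch below shows one way to do that. It assumes the detector reports boxes in pixel coordinates of its 256x192 input, which the source does not specify. The det_box_t type and the scale_box function are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Axis-aligned box: top-left corner plus width and height, in pixels. */
typedef struct {
    int32_t x;
    int32_t y;
    int32_t w;
    int32_t h;
} det_box_t;

/* Map a box from a src_w x src_h image to a dst_w x dst_h image.
 * Both corners are scaled and the size is derived from them, so adjacent
 * boxes stay adjacent after integer rounding. */
det_box_t scale_box(det_box_t b, int32_t src_w, int32_t src_h,
                    int32_t dst_w, int32_t dst_h)
{
    assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);

    int32_t x0 = b.x * dst_w / src_w;
    int32_t y0 = b.y * dst_h / src_h;
    int32_t x1 = (b.x + b.w) * dst_w / src_w;
    int32_t y1 = (b.y + b.h) * dst_h / src_h;

    det_box_t out = { x0, y0, x1 - x0, y1 - y0 };
    return out;
}
```

Since 640/256 and 480/192 both equal 2.5, the aspect ratio is preserved. For example, a box at (64, 48) of size 32x96 in network coordinates maps to (160, 120) with size 80x240 in the camera frame.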

Benchmarking compared the latency of Plumerai’s people detection solution with that of the same neural network running on TFMicro using Arm’s CMSIS-NN kernels. For the Cortex-M85, both solutions were also benchmarked with Helium (MVE) disabled. This benchmark data shows pure inference performance and does not include latency for the graphics functions, such as image format conversions.
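
Frame rate, per-inference latency, and uplift are related by simple arithmetic, sketched below with hypothetical helper names. The only figure taken from the article is the 13.6 fps inference rate. The latency derived from it is a back-of-the-envelope value, not a published measurement.

```c
#include <assert.h>

/* Per-frame latency in milliseconds for a given frame rate. */
double latency_ms_from_fps(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

/* Uplift expressed as a latency ratio: how many times faster the
 * accelerated run is than the baseline run. */
double speedup(double baseline_ms, double accelerated_ms)
{
    assert(baseline_ms > 0.0 && accelerated_ms > 0.0);
    return baseline_ms / accelerated_ms;
}
```

With these helpers, latency_ms_from_fps(13.6) gives roughly 73.5 ms per inference. A quoted uplift such as 3.6x is the value speedup() returns when the baseline run takes 3.6 times as long as the accelerated one.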

Figure 3: The Renesas people detection demo based on the RA8D1 demonstrates a performance uplift of 3.6x over the Cortex-M7 core

Figure 4: Inference performance of 13.6fps @ 480MHz using RA8D1 with Helium enabled

This application makes optimal use of all the resources available on the RA8D1:

High-performance 480MHz processor
Helium for neural network acceleration
Large flash and SRAM for storage of model weights and input activations
Camera interface for capture of input images/video
Display interface to show the people detection results

Renesas has also demonstrated multi-modal voice and vision AI solutions based on the RA8D1 devices that integrate visual wake words and face detection and recognition with speaker identification. RA8D1 MCUs with Helium can significantly improve neural network performance without the need for any additional hardware acceleration, thus providing a low-cost, low-power option for implementing AI and machine learning use cases.

This document is provided by Sekorm Platform for VIP exclusive service. The copyright is owned by Sekorm. Without authorization, any medias, websites or individual are not allowed to reprint. When authorizing the reprint, the link of www.sekorm.com must be indicated.
