Table of Contents
1 Executive Summary
2 AI Processing of Camera Image Data
3 NVIDIA NGC on Supermicro Validated NGC Ready Systems
4 AI/ML Deployment in TensorRT
5 Converting ML Model to TensorRT
6 Inference Benchmark Results
7 Sizing for AI Inference
8 Supermicro GPU Servers Specifications
9 How to Run NGC
10 Guidelines for Model Development
11 Additional Training Results
12 Support & Services
13 Conclusion
14 References
Super Micro Computer, Inc.
980 Rock Avenue
San Jose, CA 95131 USA
www.supermicro.com
White Paper
Supermicro® Systems Powered by NVIDIA GPUs for Best AI Inference Performance Using NVIDIA TensorRT
NVIDIA AI Software from the NGC Catalog for Training and Inference
Executive Summary
Deep learning inference on camera image data is becoming
mainstream. With the availability of high-resolution network cameras,
accurate deep learning image processing software, and robust, cost-
effective GPU systems, businesses and governments are increasingly
adopting these technologies. The use cases include retail inventory
tracking, on-premises security, insurance claim damage assessment,
medical image diagnosis, and many other applications.
This document demonstrates the benefits of using NVIDIA NGC and
NVIDIA TensorRT to get the best inference performance using
Supermicro systems powered by NVIDIA GPUs. It also shows how to set
up NGC on a Supermicro server and how to use TensorRT for
inference. The primary focus of this paper is the key inference
capabilities of Supermicro systems powered by NVIDIA GPUs.
Benchmark data that Supermicro engineers collected, sizing
recommendations, and server selection for inference deployment are
also included in this paper.
AI Processing of Camera Image Data
In the past ten years, there have been significant advances in the recognition
and classification of camera image data with neural-network-based AI. This
progress began with relatively simple Convolutional Neural Networks (CNNs),
followed by more advanced AI models. On some recognition tasks, accuracy now
exceeds what a human can achieve. By combining classification, segmentation,
labeling, and other image feature extraction techniques, these AI models are
now being applied to business applications, including the following:
Retail inventory tracking
Retail self-checkout
Self-driving cars
On-premises security and monitoring
Automatic insurance claim damage assessment
Medical diagnosis
And many other applications
The general approach is to train an AI model with labeled data using a single
NVIDIA GPU system or a cluster of GPU systems. Training can take hours to
days, depending on the amount of data, the number of passes (epochs) over the
data, and the types of systems used. Once an AI model is trained to the
required accuracy, it is deployed in one or more applications to run AI
inference on new incoming data.
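As a rough illustration of how these factors interact, the sketch below estimates training time from dataset size, epoch count, and system throughput. The function name, the throughput figures, and the scaling-efficiency factor are hypothetical values chosen for illustration only; they are not Supermicro benchmark results.

```python
def estimate_training_hours(num_images, epochs, images_per_sec_per_gpu,
                            num_gpus=1, scaling_efficiency=1.0):
    """Back-of-the-envelope training time in hours.

    Assumes throughput is constant over the run and that multi-GPU
    scaling can be summarized by a single efficiency factor in (0, 1].
    All inputs are illustrative assumptions, not measured values.
    """
    if min(num_images, epochs, images_per_sec_per_gpu, num_gpus) <= 0:
        raise ValueError("all quantities must be positive")
    if not 0 < scaling_efficiency <= 1:
        raise ValueError("scaling_efficiency must be in (0, 1]")

    # Total number of images processed across all epochs.
    total_images = num_images * epochs
    # Aggregate throughput of the system or cluster, in images per second.
    throughput = images_per_sec_per_gpu * num_gpus * scaling_efficiency
    return total_images / throughput / 3600.0


# Hypothetical workload: 1,000,000 labeled images, 90 epochs,
# 1,000 images/s per GPU (assumed, not measured).
single_gpu_hours = estimate_training_hours(1_000_000, 90, 1_000)
eight_gpu_hours = estimate_training_hours(1_000_000, 90, 1_000,
                                          num_gpus=8, scaling_efficiency=0.9)
print(f"1 GPU: {single_gpu_hours:.1f} h, 8 GPUs: {eight_gpu_hours:.1f} h")
```

Under these assumed numbers, the same job moves from roughly a day on one GPU to a few hours on eight, which is the hours-to-days range described above.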
KEY CUSTOMER BENEFITS
Optimized IT deployment ROI
Faster AI results
Effective scaling
Faster deployment
Cost-effective
Energy savings