TABLE OF CONTENTS
EXECUTIVE SUMMARY
SUPERMICRO AI / ML SOLUTION
AI / ML REFERENCE ARCHITECTURE
HOW SUPERMICRO SOLUTION IS DEPLOYED
DEPLOYMENT FOR IT ADMIN
DEPLOYMENT FOR DATA SCIENTISTS AND DEVOPS
HOW SUPERMICRO SOLUTION FLOW WORKS
SYSTEM DETAILS
CONFIGURATION
BENCHMARK RESULTS
SUPPORT AND SERVICES
CONCLUSION
WHITE PAPER
SUPERMICRO® ARTIFICIAL INTELLIGENCE / MACHINE LEARNING READY SOLUTION
Meet all your AI/ML application needs with Supermicro optimized GPU server solutions
March 2020
Super Micro Computer, Inc.
980 Rock Avenue
San Jose, CA 95131 USA
www.supermicro.com
EXECUTIVE SUMMARY
The rapid expansion of Artificial Intelligence (AI) and Machine Learning (ML) applications into all aspects of business and everyday life is generating an explosion in Big Data. This advancement comes with a price, however: the need for frequent training, retraining, and hyperparameter tuning makes for longer processing times than were previously the norm. In addition, AI/ML requires enormous amounts of processing power for model training.
Compute-intensive Machine Learning algorithms take extended times to complete when run on hardware without acceleration features, resulting in poor overall application performance and reduced ROI. To meet this growing demand for AI/ML applications, enterprise data centers must work within their budget, space, and IT resource constraints while also shortening this training-time bottleneck.
With no end in sight to expanding datasets, nor to compute- and memory-intensive applications, data center managers must rapidly secure the necessary processing horsepower and matching AI/ML platforms to satisfy their business needs. With the proper selection of vendors, these hardware-plus-application solutions will help users identify trends and patterns, improving throughput and training times and leading to a positive cycle of advancement. This paper describes one such AI/ML solution from Supermicro.
SUPERMICRO AI / ML SOLUTION
GENERAL DESCRIPTION
As Articial Intelligence and Machine Learning solutions become more accessible and more mature,
global organizations will come to realize the value that these solutions can deliver to solve the
advanced business challenges.
The Supermicro AI/ML solution features a best-in-class hardware platform with the enterprise-ready
Canonical Distribution of Kubernetes (CDK) and software-dened storage capabilities from Ceph.
The solution through its reference architecture integrates network, compute, and storage. The
recommended starting implementation includes a single rack with capabilities to scale to many racks
as required.
AI / ML REFERENCE ARCHITECTURE
The reference architecture is a ready-to-deploy, end-to-end AI / ML solution that includes the AI software stack, orchestration, and containers. The optimized reference design fits both machine learning training and inference applications. At a high level, the architecture comprises software, network switches, control, compute, storage, and support services.
The reference design shown in Figure 1 contains two data switches, two management switches, three infrastructure nodes that act as foundation nodes for MAAS / JUJU, and six cloud nodes. It is built on the Kubernetes platform and provides Canonical-hardened packages for Kubernetes containers and Ceph. Kubeflow provides a machine learning toolkit for Kubernetes.
KEY CUSTOMER BENEFITS
• Pre-validated reference architectures
• Certified components
• Scale-out to multiple racks
• TCO optimization for best performance / watt / $ / ft²
• Start like a professional by leveraging Supermicro expertise, support, and services
• Supports TensorFlow, Kubeflow, and Kubernetes (see the sketch after this list)
• Sharable resources with higher utilization
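As a simple illustration of the TensorFlow support noted above, the sketch below shows the kind of check a data scientist might run inside a GPU-enabled container scheduled on this cluster (for example, from a Kubeflow notebook). It assumes only a standard TensorFlow 2.x GPU image and is not a Supermicro-specific API.

# Minimal sketch: confirm TensorFlow sees the GPUs exposed to the container.
# Assumes a TensorFlow 2.x GPU image running on a cloud node of the cluster.
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print(f"Visible GPUs: {len(gpus)}")

# Tiny matrix multiply placed explicitly on the first GPU, if one is available.
if gpus:
    with tf.device("/GPU:0"):
        a = tf.random.normal((1024, 1024))
        b = tf.random.normal((1024, 1024))
        c = tf.matmul(a, b)
    print("Sample GPU op completed:", c.shape)
else:
    print("No GPU visible; check the pod's nvidia.com/gpu resource request.")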
SOLUTION CONFIGURATION
• Up to 216 compute cores
• Up to 3072 GB system memory
• Up to 36 TB storage
• Up to 40GbE data networking
• 19U height
• High-performance caching utilizing NVMe flash storage
Figure 1. Supermicro AI / ML Reference Architecture
RACK1
Availibility Zone-1
Data Switch(es)
MGMT Switch(es)
Foundation Node
(MAAS/JUJU)
Cloud Nodes
Kubernetes