WHITE PAPER
AI WORKLOADS AT SCALE
Kubernetes Cluster with Supermicro's Systems with
AMD EPYC™ 7002 Series processors
Executive Summary
The Deep Learning (DL) benchmark results in the previous white paper1 clearly
show that a DL workload in Docker containers performs the same as on bare
metal. Building an on-prem Kubernetes cluster with GPU workers and AI
framework-specific Docker containers can help an organization run projects or
production workloads on a highly reliable and scalable platform. In this white
paper, Supermicro's AMD-based WIO systems, AS-1114S-WTRT, are introduced as the
Kubernetes admin and master nodes. Along with AS-2023US-TR4 systems, we build
an NVIDIA GPU-capable Kubernetes cluster that uses cloud-native Ceph storage
for persistent volumes, and we demonstrate how a DL workload can scale on the
Kubernetes cluster.
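To illustrate how a DL workload could consume Ceph-backed storage in such a cluster, the following is a minimal sketch of a persistent volume claim. The storage class name `rook-ceph-block` and the claim name are illustrative assumptions based on a typical Rook-Ceph deployment, not details taken from this white paper.

```yaml
# Hypothetical PVC for a DL training job. The storageClassName is an
# assumption (a common Rook-Ceph default); actual names depend on how
# the Ceph operator is deployed in the cluster.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dl-training-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 100Gi
```

A training pod can then mount this claim as a volume, so the dataset persists even when the pod is rescheduled across the GPU worker nodes.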
1 White paper:
SUPERMICRO® SYSTEM COMBINES AMD EPYC™ PROCESSORS AND NVIDIA GPUS TO ACHIEVE
CONSISTENT DEEP LEARNING PERFORMANCE WITH LINEAR SCALING
Supermicro AMD-based WIO System
SUPERMICRO
Supermicro (Nasdaq: SMCI), the leading innovator in high-performance,
high-efficiency server and storage technology, is a premier provider of
advanced server Building Block Solutions® for Enterprise Data Center, Cloud
Computing, Artificial Intelligence, and Edge Computing systems worldwide.
Supermicro is committed to protecting the environment through its
"We Keep IT Green®" initiative and provides customers with the most
energy-efficient, environmentally friendly solutions available on the market.
TABLE OF CONTENTS
Executive Summary ................................ 1
System Configuration ............................. 2
Introduction to Kubernetes ....................... 4
Kubernetes Cluster Deployment .................... 4
Scale-up With Kubernetes Cluster ................. 8
Conclusion ....................................... 9