WHITE PAPER

AI WORKLOADS AT SCALE

Kubernetes Cluster with Supermicro's Systems with AMD EPYC™ 7002 Series Processors

Executive Summary

The Deep Learning (DL) benchmark results in the previous white paper clearly show that a DL workload in Docker containers performs the same as on bare metal. Building an on-prem Kubernetes cluster with GPU workers and AI framework-specific Docker containers can help an organization run projects or production workloads on a highly reliable and scalable platform. In this white paper, Supermicro AMD-based WIO systems (AS-1114S-WTRT) are introduced as the Kubernetes admin and master nodes. Together with AS-2023US-TR4 systems, we build an NVIDIA GPU-capable Kubernetes cluster that uses cloud-native Ceph storage as persistent volumes, and we demonstrate how a DL workload can scale on the Kubernetes cluster.
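To make the architecture concrete, the following is a minimal sketch (not the configuration used in this white paper) of how a DL workload could request NVIDIA GPUs and Ceph-backed persistent storage from such a cluster. It assumes the NVIDIA device plugin exposes GPUs under the `nvidia.com/gpu` resource name; the storage class name `ceph-rbd`, the image name, and the helper functions `ceph_pvc` and `gpu_training_pod` are hypothetical and chosen only for illustration.

```python
import json


def ceph_pvc(claim_name, storage_class, size):
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            # Assumption: a StorageClass with this name is backed by Ceph.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def gpu_training_pod(name, image, gpus, claim_name, mount_path="/data"):
    """Build a Pod manifest that requests GPUs and mounts the claim."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "trainer",
                    "image": image,
                    # GPUs are requested through resource limits.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                    "volumeMounts": [
                        {"name": "dataset", "mountPath": mount_path}
                    ],
                }
            ],
            "volumes": [
                {
                    "name": "dataset",
                    "persistentVolumeClaim": {"claimName": claim_name},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names throughout; adjust to the actual cluster.
    items = [
        ceph_pvc("dl-data", "ceph-rbd", "100Gi"),
        gpu_training_pod("dl-train", "example.invalid/dl-framework:latest", 2, "dl-data"),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f`. Building the manifests as plain dictionaries keeps the sketch free of third-party dependencies.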

White paper: SUPERMICRO® SYSTEM COMBINES AMD EPYC™ PROCESSORS AND NVIDIA GPUS TO ACHIEVE CONSISTENT DEEP LEARNING PERFORMANCE WITH LINEAR SCALING

Supermicro AMD based WIO System

SUPERMICRO

Supermicro (Nasdaq: SMCI), the leading innovator in high-performance, high-efficiency server and storage technology, is a premier provider of advanced server Building Block Solutions® for Enterprise Data Center, Cloud Computing, Artificial Intelligence, and Edge Computing Systems worldwide. Supermicro is committed to protecting the environment through its “We Keep IT Green®” initiative and provides customers with the most energy-efficient, environmentally friendly solutions available on the market.

TABLE OF CONTENTS

Executive Summary
System Configuration
Introduction to Kubernetes
Kubernetes Cluster Deployment
Scale-up With Kubernetes Cluster
Conclusion

AI Workloads At Scale March 2021

System Configuration

AS-1114S-WTRT is one of the Supermicro WIO series servers based on AMD EPYC™ 7002 series processors. The WIO series offers a wide range of I/O options to deliver systems optimized for specific requirements. For more detailed system information, see the AS-1114S-WTRT product page. Customers can choose among the storage and networking alternatives to accelerate performance and find the best fit for their applications. In our case, the system is used as the VMware host and for the Kubernetes master nodes. Figure 1 and Figure 2 provide an overview of the system:

Table 1 provides a sample configuration for AS-1114S-WTRT

| System | Part Number | QTY | Part Description |
|---|---|---|---|
| Infrastructure | AS-1114S-WTRT | | H12 SSW-NT,CSV116TS-R504WBP |
| CPU | PSE-ROM7552 | | AMD EPYC™ 7552 DP/UP 48C/96T 2.2G 192M 200W 4094 |
| Memory | MEM-DR416L-HL01-ER32 | | 16GB DDR4-3200 2Rx8 ECC REG DIMM |
| HDD/SSD (Storage) | HDS-X2A-XS7680TE70004 | | [NR]Seagate Lange 7.68TB SAS 12Gb/s, 15mm, 2.5", 0.8DWPD SSD, HF |
| HDD/SSD (OS) | HDS-SMN1-MZ1LB3T8HMLA07 | | Samsung PM983 3.84TB NVMe PCIe3x4 V4 M.2 22x110mm (1.3 DWPD) |
| AOC | MCX4121A-ACAT | | Standard Low-profile Mellanox 25GbE card with 2x SFP28 ports |
| Storage Controller | AOC-S3008L-L8i | | AOC-S3008L-L8i Retail Pack |
| Cable | CBL-SAST-0593 | | Internal Mini-SAS HD to Mini-SAS HD 60cm,30AWG,12Gb/s |
| Software License | SFT-DCMS-SINGLE | | Supermicro System Management Software Suite node license, HF, RoHS/REACH, PBF |

Table 1

Figure 1

Figure 2