Wednesday, April 24, 2019

How to test NGC-Ready Server

Hardware:
B7102, BIOS v1.03.B10, BMC v4.0
CPU: Xeon 6128x2
RAM: DDR4 128GB
OS: Ubuntu 18.04.2 LTS Server
GPU: Nvidia Tesla T4 x4

AP: Nvidia


Test Process

Install the Ubuntu Operating System and Nvidia Tesla Driver

1. Install Ubuntu 18.04.2 LTS server

2. Log in as a user



3. Download Nvidia 418.40 driver




4. Install Nvidia 418.40 driver local repository















5. Add the local repo key

6. Prioritize the local repo over the network repos
7. Run cat on the local-cuda file to verify its contents

8. Update the apt metadata


9. Install Nvidia 418.40 driver
10. Confirm that you can see your Nvidia Tesla T4 cards in the nvidia-smi output
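The commands for steps 4 to 10 were shown as screenshots, so here is a rough sketch of what they could look like. DRIVER_DEB and REPO_KEY are hypothetical placeholders (use the file you actually downloaded and the key path that dpkg prints), and the cuda-drivers package name, the /etc/apt/preferences.d location and the 1001 pin priority are assumptions, not values taken from the original steps. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of steps 4-10. DRIVER_DEB and REPO_KEY are hypothetical placeholders.
DRIVER_DEB="${DRIVER_DEB:-nvidia-driver-local-repo.deb}"
REPO_KEY="${REPO_KEY:-/var/nvidia-driver-local-repo/key.pub}"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Contents of the apt preferences file that pins the local repo (step 6).
# Pin: origin "" matches local repositories; 1001 is an assumed priority.
local_cuda_pin() {
  printf 'Package: *\nPin: origin ""\nPin-Priority: 1001\n'
}

# Count Tesla T4 entries in `nvidia-smi -L` output read from stdin.
count_t4() {
  grep -c 'Tesla T4'
}

install_driver() {
  run sudo dpkg -i "$DRIVER_DEB"                 # step 4: install the local repository
  run sudo apt-key add "$REPO_KEY"               # step 5: add the local repo key
  if [ "${DRY_RUN:-1}" = "1" ]; then             # step 6: prioritize the local repo
    echo "+ write /etc/apt/preferences.d/local-cuda:"
    local_cuda_pin
  else
    local_cuda_pin | sudo tee /etc/apt/preferences.d/local-cuda >/dev/null
  fi
  run cat /etc/apt/preferences.d/local-cuda      # step 7: verify the pin file
  run sudo apt-get update                        # step 8: update the apt metadata
  run sudo apt-get install -y cuda-drivers       # step 9: package name is an assumption
  run nvidia-smi                                 # step 10: look for the T4 cards
}

install_driver
```

Once the driver is loaded, piping nvidia-smi -L into count_t4 should report 4 on this server, one per Tesla T4.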

Installing Docker and the Docker Utility Engine for Nvidia GPUs

Enabling the Docker Repository

1. Install the Docker prerequisites












2. Add the Docker official GPG key
3. Add the official stable Docker repository
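As a sketch of the three steps above, the following mirrors the Ubuntu instructions Docker published around that time. The package list and the bionic codename (Ubuntu 18.04) are assumptions, since the original commands were only in screenshots. Nothing is changed on the system unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of steps 1-3. CODENAME defaults to bionic (Ubuntu 18.04).
CODENAME="${CODENAME:-bionic}"

# The apt source line for the stable Docker repository (step 3).
docker_repo_line() {
  echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu $1 stable"
}

enable_docker_repo() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Preview mode: describe the steps without touching the system.
    echo "would install: apt-transport-https ca-certificates curl software-properties-common"
    echo "would add key: https://download.docker.com/linux/ubuntu/gpg"
    echo "would add repo: $(docker_repo_line "$CODENAME")"
    return 0
  fi
  # Step 1: packages apt needs to use a repository over HTTPS.
  sudo apt-get install -y apt-transport-https ca-certificates curl software-properties-common
  # Step 2: Docker's official GPG key.
  curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
  # Step 3: the stable repository for this Ubuntu release.
  sudo add-apt-repository "$(docker_repo_line "$CODENAME")"
}

enable_docker_repo
```

On the server itself, $(lsb_release -cs) can be passed as CODENAME instead of hard-coding the release name.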




Installing the Nvidia Container Runtime for Nvidia GPUs
















































4. Reload Docker
5. Test the Docker install and Nvidia container runtime by pulling the latest official CUDA image and running nvidia-smi
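Steps 4 and 5 might look roughly like this, assuming the nvidia runtime was registered with Docker in the earlier steps. The image name in IMAGE is an assumption; pick whichever CUDA image and tag you want to test with. The script prints the commands by default and runs them only with DRY_RUN=0.

```shell
#!/bin/sh
# Sketch of steps 4-5. IMAGE is an assumed default, not taken from the post.
IMAGE="${IMAGE:-nvidia/cuda}"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reload_and_test() {
  # Step 4: dockerd re-reads its runtime configuration on SIGHUP.
  run sudo pkill -SIGHUP dockerd
  # Step 5: run nvidia-smi inside a CUDA container using the nvidia runtime.
  run sudo docker run --runtime=nvidia --rm "$IMAGE" nvidia-smi
}

reload_and_test
```

If the runtime is working, the nvidia-smi table printed from inside the container should list the same Tesla T4 cards as on the host.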


Setting Up Docker Options

1. Edit the contents of /etc/systemd/system/docker.service.d/override.conf.
2. Reload the systemd manager configuration.
3. Reload Docker.
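The contents of override.conf were not captured in the text, so the values below are only an assumption modeled on options Nvidia commonly recommends for GPU containers (overlay2 storage, a larger default shared memory size, and raised memlock and stack limits). Treat this as a sketch and substitute your own settings. Running the script only previews the file; apply_override is what would write it and restart Docker.

```shell
#!/bin/sh
# Sketch of steps 1-3. The option values are assumptions, not from the post.
CONF="${CONF:-/etc/systemd/system/docker.service.d/override.conf}"

# Print the drop-in file. The empty ExecStart= line clears the packaged
# command so the following ExecStart= replaces it instead of adding to it.
render_override() {
  cat <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H fd:// -s overlay2 --default-shm-size=1G
LimitMEMLOCK=infinity
LimitSTACK=67108864
EOF
}

apply_override() {
  sudo mkdir -p "$(dirname "$CONF")"
  render_override | sudo tee "$CONF" >/dev/null   # step 1: write override.conf
  sudo systemctl daemon-reload                    # step 2: reload systemd configuration
  sudo systemctl restart docker                   # step 3: restart Docker with new options
}

# Preview only; call apply_override on the server to make the change.
render_override
```

A restart is used for step 3 here because a changed ExecStart line only takes effect when the daemon is started again.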



Test NGC Sanity

1. Extract NGC-Ready-master on the Linux server
2. Type cd /tf to change to the tf folder
3. Run vi launch and add the parameters marked in red to these commands:

docker build --network=host --no-cache -t $TESTNAME

docker run --network=host -it --security-opt

docker run --network=host -it --runtime=nvidia

















4. Type ./launch to execute the script
Note: this script may take over 8 hours.
