P4d support on Amazon Linux 1

ddeidda edited this page Jan 22, 2021 · 1 revision

The NVIDIA Fabric Manager package is required on P4d instances to take full advantage of the A100 GPUs; however, this package is not available on Amazon Linux 1. Therefore, although the combination of base_os = alinux with the p4d.24xlarge instance type is allowed, CUDA initialization is likely to fail on compute nodes with this configuration, because NVLink connections are enabled only after the NVIDIA kernel driver is loaded and Fabric Manager has configured them (see: https://docs.nvidia.com/datacenter/tesla/pdf/fabric-manager-user-guide.pdf). Additionally, Amazon Linux 1 reached end-of-life and entered maintenance mode on 12/31/2020, so it is no longer recommended for use with ParallelCluster.

We recommend using one of the other supported operating systems as the value of the base_os configuration parameter, such as Amazon Linux 2, which requires minimal user-space changes relative to Amazon Linux 1. The supported operating systems compatible with the NVIDIA Fabric Manager are alinux2, centos7, centos8, ubuntu1604, and ubuntu1804.
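
As a rough illustration of the recommendation above, the following sketch checks a ParallelCluster configuration for a P4d instance type paired with an operating system outside the list given on this page. It is not part of ParallelCluster: the helper name `p4d_os_problems` and the sample configuration are hypothetical, the sample omits the other settings a real configuration file needs, and the check assumes the instance type is set with `compute_instance_type` in a `[cluster ...]` section (it does not inspect queue-based settings).

```python
import configparser

# Operating systems this page lists as compatible with Fabric Manager.
FABRIC_MANAGER_OS = {"alinux2", "centos7", "centos8", "ubuntu1604", "ubuntu1804"}

# Illustrative fragment only; a real config file needs further settings.
SAMPLE_CONFIG = """
[cluster default]
base_os = alinux2
compute_instance_type = p4d.24xlarge
"""


def p4d_os_problems(config_text):
    """Return one message per [cluster ...] section that pairs a P4d
    instance type with an OS outside FABRIC_MANAGER_OS."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    problems = []
    for name in parser.sections():
        if not name.startswith("cluster"):
            continue  # only cluster sections are checked in this sketch
        section = parser[name]
        base_os = section.get("base_os", "").strip()
        instance_type = section.get("compute_instance_type", "").strip()
        if instance_type.startswith("p4d.") and base_os not in FABRIC_MANAGER_OS:
            problems.append(
                f"[{name}]: base_os = {base_os or '(unset)'} with {instance_type}"
            )
    return problems


print(p4d_os_problems(SAMPLE_CONFIG))
```

With the sample fragment, which already uses base_os = alinux2, the returned list is empty; changing base_os to alinux would produce one entry for the `cluster default` section.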