Getting started with MIG partitioning
Warning
Multi-instance GPU (MIG) mode is supported only by NVIDIA GPUs based on Ampere, Hopper and newer architectures.
Prerequisites
To enable Dynamic MIG Partitioning on a node, the following prerequisites must be met:
- if a node has multiple GPUs, all the GPUs must be of the same model
- all the GPUs of the nodes for which you want to enable MIG partitioning must have MIG mode enabled
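The first prerequisite can be checked before going further. The following is a minimal sketch, not part of nos: `same_gpu_model` is a hypothetical helper name, and it assumes the model names come from `nvidia-smi --query-gpu=name --format=csv,noheader`, one per line.

```shell
# Hypothetical helper: reads one GPU model name per line on stdin and
# succeeds only if at least one GPU is listed and all lines are identical.
same_gpu_model() {
  awk 'NR == 1 { first = $0 }
       $0 != first { mixed = 1 }
       END { exit (NR == 0 || mixed) }'
}

# On a GPU node (requires the NVIDIA driver):
#   nvidia-smi --query-gpu=name --format=csv,noheader | same_gpu_model \
#     && echo "all GPUs share one model" \
#     || echo "mixed GPU models, or no GPU found"
```

Reading from stdin keeps the check independent of the driver, so it can also be run against saved output from another node.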
Enable MIG mode
By default, MIG is not enabled on GPUs. To enable it, SSH into the node and run the following command for each GPU on which you want to enable MIG, where <index> corresponds to the index of the GPU:
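A minimal sketch of this step, assuming the standard `nvidia-smi -i <index> -mig 1` invocation described in the NVIDIA MIG user guide. The `enable_mig` helper and the `DRY_RUN` variable are hypothetical names introduced here for illustration only.

```shell
# Hypothetical helper: enables MIG mode on every GPU index passed as an argument.
# With DRY_RUN=1 it only prints the commands instead of running them.
enable_mig() {
  for index in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "sudo nvidia-smi -i ${index} -mig 1"
    else
      sudo nvidia-smi -i "${index}" -mig 1
    fi
  done
}

# Preview the commands for GPUs 0 and 1 without touching the hardware.
DRY_RUN=1 enable_mig 0 1
```

Dropping `DRY_RUN=1` runs the commands for real, which requires root privileges and the NVIDIA driver on the node.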
Depending on the kind of machine you are using, it may be necessary to reboot the node after enabling MIG mode for one of its GPUs.
You can check whether MIG mode has been successfully enabled by running the following command and checking if you get a similar output:
$ nvidia-smi -i <index> --query-gpu=pci.bus_id,mig.mode.current --format=csv
pci.bus_id, mig.mode.current
00000000:36:00.0, Enabled
For more information and troubleshooting, refer to the NVIDIA documentation.
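On nodes with several GPUs, the check above can be automated. The sketch below assumes the CSV layout shown in the sample output (a header row, then one `bus id, mode` row per GPU). `all_mig_enabled` is a hypothetical helper name, not an NVIDIA or nos tool.

```shell
# Hypothetical helper: reads the CSV output on stdin and succeeds only if
# there is at least one GPU row and every row reports "Enabled".
all_mig_enabled() {
  awk -F', *' 'NR > 1 { rows++; if ($2 != "Enabled") bad = 1 }
               END { exit (rows == 0 || bad) }'
}

# On a GPU node (omitting -i queries every GPU):
#   nvidia-smi --query-gpu=pci.bus_id,mig.mode.current --format=csv | all_mig_enabled \
#     && echo "MIG enabled on all GPUs" \
#     || echo "MIG not enabled on at least one GPU"
```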
Enable automatic partitioning
You can enable automatic MIG partitioning on a node by adding to it the following label:
The label delegates to nos the management of the MIG resources of all the GPUs of that node, so you no longer have to configure the MIG geometry of the GPUs manually: nos dynamically creates and deletes MIG profiles according to the resources requested by the pods submitted to the cluster, within the limits of the MIG geometries supported by each GPU model.
The MIG geometries supported by each GPU model are defined in a ConfigMap, which by default contains the supported geometries of the most popular GPU models. You can override or extend the values of this ConfigMap by editing the field gpuPartitioner.knownMigGeometries of the installation chart.
Create pods requesting MIG resources
Tip
There is no need to manually create and manage MIG configurations. You can simply submit your Pods to the cluster and the requested MIG devices are automatically provisioned.
You can make your pods request slices of a GPU by specifying MIG devices in the resource limits of their containers:
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: mig-partitioning-example
spec:
  containers:
    - name: sleepy
      image: "busybox:latest"
      command: ["sleep", "120"]
      resources:
        limits:
          nvidia.com/mig-1g.10gb: 1
EOF
In the example above, the pod requests a GPU slice with 10 GB of memory, which is the smallest unit available on NVIDIA-A100-80GB-PCIe GPUs. If your cluster has different GPU models, nos might not be able to create the requested MIG resource. You can find the MIG profiles supported by each GPU model in the NVIDIA documentation.
Note
Each container is supposed to request at most one MIG device. If a container needs more resources, then it should ask for a larger, single device as opposed to multiple smaller devices.
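As an illustration of this rule, a container that needs more than one `1g.10gb` slice would request a single larger profile instead. The sketch below assumes an A100 80GB GPU, where NVIDIA's profile tables list `2g.20gb`. The helper name `mig_pod_manifest` and the pod name are hypothetical.

```shell
# Hypothetical helper: prints a Pod manifest whose single container requests
# exactly one MIG device of the given profile (for example 2g.20gb).
mig_pod_manifest() {
  profile="$1"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: mig-single-device-example
spec:
  containers:
    - name: sleepy
      image: "busybox:latest"
      command: ["sleep", "120"]
      resources:
        limits:
          nvidia.com/mig-${profile}: 1
EOF
}

# Print the manifest; pipe it to `kubectl apply -f -` to submit it.
mig_pod_manifest 2g.20gb
```

The manifest asks for one `2g.20gb` device rather than two `1g.10gb` devices, so the container stays within the one-device limit.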