API reference
All types belong to API group resource.nvidia.com/v1beta1.
GPU opaque configuration types
These types are set as opaque configuration in ResourceClaim and ResourceClaimTemplate specs to configure GPU resources. They are optional, omit them to use device defaults.
GpuConfig
Configures a full GPU device. Target DeviceClass: gpu.nvidia.com.
With time-slicing:
apiVersion: resource.nvidia.com/v1beta1
kind: GpuConfig
sharing:
strategy: TimeSlicing
timeSlicingConfig:
interval: Default # Default | Short | Medium | Long
With MPS (Multi-Process Service):
apiVersion: resource.nvidia.com/v1beta1
kind: GpuConfig
sharing:
strategy: MPS
mpsConfig:
defaultActiveThreadPercentage: 50 # optional, integer
defaultPinnedDeviceMemoryLimit: "4Gi" # optional, quantity
defaultPerDevicePinnedMemoryLimit: # optional, map
"0": "2Gi"
Fields
| Field | Type | Description |
|---|---|---|
sharing |
object | Optional. Sharing strategy and its configuration. Omit to use the device exclusively. |
sharing.strategy |
string | TimeSlicing or MPS (Multi-Process Service). |
sharing.timeSlicingConfig.interval |
string | Time-slice duration: Default, Short, Medium, or Long. Requires the TimeSlicingSettings feature gate. |
sharing.mpsConfig.defaultActiveThreadPercentage |
integer | Thread percentage limit applied to all processes sharing the GPU. Requires the MPSSupport feature gate. |
sharing.mpsConfig.defaultPinnedDeviceMemoryLimit |
quantity | Pinned memory limit applied to all devices. |
sharing.mpsConfig.defaultPerDevicePinnedMemoryLimit |
map | Per-device override of defaultPinnedDeviceMemoryLimit. Keys are device index (integer) or UUID string. |
MigDeviceConfig
Configures a MIG device slice. Target DeviceClass: mig.nvidia.com.
With time-slicing:
apiVersion: resource.nvidia.com/v1beta1
kind: MigDeviceConfig
sharing:
strategy: TimeSlicing
With MPS:
apiVersion: resource.nvidia.com/v1beta1
kind: MigDeviceConfig
sharing:
strategy: MPS
mpsConfig:
defaultActiveThreadPercentage: 50
Fields
| Field | Type | Description |
|---|---|---|
sharing |
object | Optional. Supports TimeSlicing and MPS strategies. |
sharing.strategy |
string | TimeSlicing or MPS. |
sharing.mpsConfig |
object | MPS configuration. Same fields as GpuConfig.sharing.mpsConfig. Requires MPSSupport. |
Note:
timeSlicingConfigis not available for MIG devices. Time-slicing on MIG slices does not support interval configuration.
VfioDeviceConfig
Configures a VFIO passthrough device. Target DeviceClass: vfio.gpu.nvidia.com. No additional fields beyond the type metadata.
apiVersion: resource.nvidia.com/v1beta1
kind: VfioDeviceConfig
Requires
PassthroughSupport(Alpha, default: false).
ComputeDomain CRDs
ComputeDomains use two Custom Resource Definitions. Unlike the GPU opaque types above, these are concrete Kubernetes resources that exist independently of ResourceClaim specs.
ComputeDomain
A user-created resource that provisions an ephemeral multi-node NVLink fabric. Creating one triggers the controller to deploy a per-domain IMEX daemon fleet and generate ResourceClaimTemplate objects that workload pods reference to claim IMEX channels. The spec is immutable after creation.
apiVersion: resource.nvidia.com/v1beta1
kind: ComputeDomain
metadata:
name: my-compute-domain
spec:
numNodes: 0
channel:
resourceClaimTemplate:
name: my-channel
allocationMode: Single # Single (default) | All
Spec fields
| Field | Type | Default | Description |
|---|---|---|---|
spec.numNodes |
integer | — | Deprecated. Set to 0 when using the default IMEXDaemonsWithDNSNames feature gate. Formerly used to gate workload startup until a specific number of IMEX daemons joined. |
spec.channel.resourceClaimTemplate.name |
string | — | Name of the ResourceClaimTemplate the controller creates for workload pods to claim a channel from. |
spec.channel.allocationMode |
string | Single |
Single allocates one IMEX channel per claim. All allocates the maximum number of channels available in the IMEX domain. |
Status fields
| Field | Type | Description |
|---|---|---|
status.status |
string | Ready when all expected IMEX daemons have joined; NotReady otherwise. |
status.nodes[].name |
string | Node name. |
status.nodes[].ipAddress |
string | Node IP used by the IMEX daemon. |
status.nodes[].cliqueID |
string | NVLink clique identifier for the node. |
status.nodes[].index |
integer | Deterministic index for the node within its clique. Used to map IPs to DNS names. |
status.nodes[].status |
string | Per-node daemon readiness: Ready or NotReady. |
Note: Do not gate workload startup on
status.status. The status field is informational. IMEX daemons start as soon as their local node joins without waiting for all peers.
ComputeDomainClique
A ComputeDomainClique is a driver-managed resource created automatically by the IMEX daemon on each node. You do not create or modify these directly.
Each ComputeDomainClique is namespaced to the driver namespace and named as <computeDomainUID>.<cliqueID>. It tracks which nodes have joined a specific NVLink clique within a ComputeDomain and their daemon readiness.
The ComputeDomain plugin reads these objects to verify that the local IMEX daemon is ready before allowing a workload container to start.
Requires
ComputeDomainCliques(Beta, default: true), which depends onIMEXDaemonsWithDNSNames.
Fields
| Field | Type | Description |
|---|---|---|
daemons[].nodeName |
string | Node name. |
daemons[].ipAddress |
string | Node IP used by the IMEX daemon. |
daemons[].cliqueID |
string | NVLink clique identifier. |
daemons[].index |
integer | Deterministic index for the node within its clique. Maps to a DNS name. |
daemons[].status |
string | Daemon readiness: Ready or NotReady. Defaults to NotReady. |
ComputeDomain opaque configuration types
The following opaque types are set in the ResourceClaim specs generated by the ComputeDomain controller. They are managed automatically, you do not set these directly. They are documented here for reference and debugging.
ComputeDomainChannelConfig
Opaque configuration for IMEX channel ResourceClaim objects. Set by the controller in the ResourceClaimTemplate it generates from spec.channel.resourceClaimTemplate.
| Field | Type | Description |
|---|---|---|
domainID |
string | UID of the parent ComputeDomain resource. |
allocationMode |
string | Inherited from spec.channel.allocationMode. Single or All. May be absent when Single (the default). |
ComputeDomainDaemonConfig
Opaque configuration for IMEX daemon ResourceClaim objects. Set by the controller in the daemon DaemonSet.
| Field | Type | Description |
|---|---|---|
domainID |
string | UID of the parent ComputeDomain resource. |