API reference

All types belong to API group resource.nvidia.com/v1beta1.

GPU opaque configuration types

These types are set as opaque configuration in ResourceClaim and ResourceClaimTemplate specs to configure GPU resources. They are optional; omit them to use device defaults.

GpuConfig

Configures a full GPU device. Target DeviceClass: gpu.nvidia.com.

With time-slicing:

apiVersion: resource.nvidia.com/v1beta1
kind: GpuConfig
sharing:
  strategy: TimeSlicing
  timeSlicingConfig:
    interval: Default       # Default | Short | Medium | Long

With MPS (Multi-Process Service):

apiVersion: resource.nvidia.com/v1beta1
kind: GpuConfig
sharing:
  strategy: MPS
  mpsConfig:
    defaultActiveThreadPercentage: 50             # optional, integer
    defaultPinnedDeviceMemoryLimit: "4Gi"         # optional, quantity
    defaultPerDevicePinnedMemoryLimit:            # optional, map
      "0": "2Gi"

Fields

| Field | Type | Description |
| --- | --- | --- |
| sharing | object | Optional. Sharing strategy and its configuration. Omit to use the device exclusively. |
| sharing.strategy | string | TimeSlicing or MPS (Multi-Process Service). |
| sharing.timeSlicingConfig.interval | string | Time-slice duration: Default, Short, Medium, or Long. Requires the TimeSlicingSettings feature gate. |
| sharing.mpsConfig.defaultActiveThreadPercentage | integer | Thread percentage limit applied to all processes sharing the GPU. Requires the MPSSupport feature gate. |
| sharing.mpsConfig.defaultPinnedDeviceMemoryLimit | quantity | Pinned memory limit applied to all devices. |
| sharing.mpsConfig.defaultPerDevicePinnedMemoryLimit | map | Per-device override of defaultPinnedDeviceMemoryLimit. Each key is either a device index (integer) or a device UUID string. |
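
The GpuConfig samples in this section show only the opaque parameters. As a rough sketch of where those parameters sit, the ResourceClaimTemplate below embeds a GpuConfig in the claim's device configuration. It assumes the resource.k8s.io/v1beta1 claim schema (other versions of that API nest the request fields differently) and assumes the driver name is gpu.nvidia.com. The names shared-gpu and gpu are hypothetical.

```yaml
# Sketch only: the claim API version, driver name, and object names are assumptions.
apiVersion: resource.k8s.io/v1beta1      # assumed; use the version your cluster serves
kind: ResourceClaimTemplate
metadata:
  name: shared-gpu                       # hypothetical name
spec:
  spec:
    devices:
      requests:
      - name: gpu                        # hypothetical request name
        deviceClassName: gpu.nvidia.com  # DeviceClass targeted by GpuConfig
      config:
      - requests: ["gpu"]                # apply this config to the request above
        opaque:
          driver: gpu.nvidia.com         # assumed driver name
          parameters:
            apiVersion: resource.nvidia.com/v1beta1
            kind: GpuConfig
            sharing:
              strategy: TimeSlicing
              timeSlicingConfig:
                interval: Long           # requires the TimeSlicingSettings feature gate
```

Because the opaque types are optional, removing the whole config entry leaves the GPU with its device defaults.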

MigDeviceConfig

Configures a MIG device slice. Target DeviceClass: mig.nvidia.com.

With time-slicing:

apiVersion: resource.nvidia.com/v1beta1
kind: MigDeviceConfig
sharing:
  strategy: TimeSlicing

With MPS:

apiVersion: resource.nvidia.com/v1beta1
kind: MigDeviceConfig
sharing:
  strategy: MPS
  mpsConfig:
    defaultActiveThreadPercentage: 50

Fields

| Field | Type | Description |
| --- | --- | --- |
| sharing | object | Optional. Supports TimeSlicing and MPS strategies. |
| sharing.strategy | string | TimeSlicing or MPS. |
| sharing.mpsConfig | object | MPS configuration. Same fields as GpuConfig.sharing.mpsConfig. Requires the MPSSupport feature gate. |

Note: timeSlicingConfig is not available for MIG devices. Time-slicing on MIG slices does not support interval configuration.
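
To illustrate the difference from GpuConfig, here is a hedged sketch of a claim template for a MIG slice shared through MPS. As before, the resource.k8s.io/v1beta1 schema and the driver name gpu.nvidia.com are assumptions, and the names shared-mig and mig are hypothetical. Note that the sharing block carries no timeSlicingConfig.

```yaml
# Sketch only: the claim API version, driver name, and object names are assumptions.
apiVersion: resource.k8s.io/v1beta1      # assumed; use the version your cluster serves
kind: ResourceClaimTemplate
metadata:
  name: shared-mig                       # hypothetical name
spec:
  spec:
    devices:
      requests:
      - name: mig                        # hypothetical request name
        deviceClassName: mig.nvidia.com  # DeviceClass targeted by MigDeviceConfig
      config:
      - requests: ["mig"]
        opaque:
          driver: gpu.nvidia.com         # assumed driver name
          parameters:
            apiVersion: resource.nvidia.com/v1beta1
            kind: MigDeviceConfig
            sharing:
              strategy: MPS              # requires the MPSSupport feature gate
              mpsConfig:
                defaultActiveThreadPercentage: 50
```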

VfioDeviceConfig

Configures a VFIO passthrough device. Target DeviceClass: vfio.gpu.nvidia.com. No additional fields beyond the type metadata.

apiVersion: resource.nvidia.com/v1beta1
kind: VfioDeviceConfig

Requires the PassthroughSupport feature gate (Alpha, default: false).

ComputeDomain CRDs

ComputeDomains use two Custom Resource Definitions. Unlike the GPU opaque types above, these are concrete Kubernetes resources that exist independently of ResourceClaim specs.

ComputeDomain

A user-created resource that provisions an ephemeral multi-node NVLink fabric. Creating one triggers the controller to deploy a per-domain IMEX daemon fleet and generate ResourceClaimTemplate objects that workload pods reference to claim IMEX channels. The spec is immutable after creation.

apiVersion: resource.nvidia.com/v1beta1
kind: ComputeDomain
metadata:
  name: my-compute-domain
spec:
  numNodes: 0
  channel:
    resourceClaimTemplate:
      name: my-channel
    allocationMode: Single        # Single (default) | All

Spec fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| spec.numNodes | integer | | Deprecated. Set to 0 when the IMEXDaemonsWithDNSNames feature gate is enabled (the default). Formerly used to gate workload startup until a specific number of IMEX daemons joined. |
| spec.channel.resourceClaimTemplate.name | string | | Name of the ResourceClaimTemplate the controller creates for workload pods to claim a channel from. |
| spec.channel.allocationMode | string | Single | Single allocates one IMEX channel per claim. All allocates the maximum number of channels available in the IMEX domain. |
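
As a minimal sketch of the consuming side, the Pod below claims a channel through the ResourceClaimTemplate named my-channel from the example above. The Pod name, container name, image, and command are hypothetical, and the Pod is assumed to run in the same namespace as the ComputeDomain.

```yaml
# Sketch only: Pod name, container name, image, and command are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: imex-workload
spec:
  containers:
  - name: ctr
    image: ubuntu:22.04
    command: ["sleep", "infinity"]
    resources:
      claims:
      - name: imex-channel                    # must match an entry in spec.resourceClaims
  resourceClaims:
  - name: imex-channel
    resourceClaimTemplateName: my-channel     # spec.channel.resourceClaimTemplate.name
```

Each Pod that references the template gets its own ResourceClaim generated from it, so several workload Pods can point at the same template name.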

Status fields

| Field | Type | Description |
| --- | --- | --- |
| status.status | string | Ready when all expected IMEX daemons have joined; NotReady otherwise. |
| status.nodes[].name | string | Node name. |
| status.nodes[].ipAddress | string | Node IP used by the IMEX daemon. |
| status.nodes[].cliqueID | string | NVLink clique identifier for the node. |
| status.nodes[].index | integer | Deterministic index for the node within its clique. Used to map IPs to DNS names. |
| status.nodes[].status | string | Per-node daemon readiness: Ready or NotReady. |

Note: Do not gate workload startup on status.status. The status field is informational. IMEX daemons start as soon as their local node joins without waiting for all peers.
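
For orientation when debugging, the fragment below sketches what a status block might look like while one daemon is still starting. Every value is made up for illustration (node names, IP addresses, and the clique identifier are hypothetical); only the field names come from the table above.

```yaml
# Illustrative shape only: all values below are hypothetical.
status:
  status: NotReady                 # informational; not every expected daemon has joined
  nodes:
  - name: node-a
    ipAddress: 10.0.0.11
    cliqueID: "example-clique"     # placeholder, not a real clique identifier
    index: 0
    status: Ready
  - name: node-b
    ipAddress: 10.0.0.12
    cliqueID: "example-clique"
    index: 1
    status: NotReady               # this node's daemon is not ready yet
```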

ComputeDomainClique

A ComputeDomainClique is a driver-managed resource created automatically by the IMEX daemon on each node. You do not create or modify these directly.

Each ComputeDomainClique lives in the driver namespace and is named <computeDomainUID>.<cliqueID>. It tracks which nodes have joined a specific NVLink clique within a ComputeDomain, along with the readiness of each node's daemon.

The ComputeDomain plugin reads these objects to verify that the local IMEX daemon is ready before allowing a workload container to start.

Requires the ComputeDomainCliques feature gate (Beta, default: true), which in turn depends on IMEXDaemonsWithDNSNames.

Fields

| Field | Type | Description |
| --- | --- | --- |
| daemons[].nodeName | string | Node name. |
| daemons[].ipAddress | string | Node IP used by the IMEX daemon. |
| daemons[].cliqueID | string | NVLink clique identifier. |
| daemons[].index | integer | Deterministic index for the node within its clique. Maps to a DNS name. |
| daemons[].status | string | Daemon readiness: Ready or NotReady. Defaults to NotReady. |

ComputeDomain opaque configuration types

The following opaque types are set in the ResourceClaim specs generated by the ComputeDomain controller. They are managed automatically; you do not set them directly. They are documented here for reference and debugging.

ComputeDomainChannelConfig

Opaque configuration for IMEX channel ResourceClaim objects. Set by the controller in the ResourceClaimTemplate it generates from spec.channel.resourceClaimTemplate.

| Field | Type | Description |
| --- | --- | --- |
| domainID | string | UID of the parent ComputeDomain resource. |
| allocationMode | string | Inherited from spec.channel.allocationMode. Single or All. May be absent when Single (the default). |
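
When inspecting a generated ResourceClaimTemplate, the opaque parameters would be expected to look roughly like the fragment below. This is a sketch assembled from the field table above and the type metadata used by the other types in this API group, not output captured from a cluster. The UID is a placeholder.

```yaml
# Sketch only: assembled from the documented fields; the UID is a placeholder.
apiVersion: resource.nvidia.com/v1beta1
kind: ComputeDomainChannelConfig
domainID: "00000000-0000-0000-0000-000000000000"   # UID of the parent ComputeDomain
allocationMode: All                                 # may be absent when Single
```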

ComputeDomainDaemonConfig

Opaque configuration for IMEX daemon ResourceClaim objects. Set by the controller in the daemon DaemonSet.

Field Type Description
domainID string UID of the parent ComputeDomain resource.