# GPU Infrastructure
Phoeniqs Cloud provides NVIDIA H100 GPU acceleration for AI and high-performance workloads, delivered on HGX nodes (8×H100 per node) in our Swiss data centers. This page explains how GPUs are exposed to tenants today, how to schedule GPU workloads, and what is on the roadmap for GPU passthrough to virtual machines.
Current deployment model
GPUs are delivered to tenants as container resources (nvidia.com/gpu). GPU passthrough to VMs (vfio-pci) is a separate capability that is currently in pilot and being rolled out; see GPU passthrough for VMs below.
# Scheduling GPU workloads (containers)
You do not need a nodeSelector or node label to place GPU workloads. Tenants cannot list nodes at tenant scope, and you don't have to. Instead, request the GPU resource in your pod spec and the Kubernetes scheduler will place the workload on a node that has the requested GPUs available.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
  namespace: my-project
spec:
  containers:
    - name: cuda-container
      image: nvidia/cuda:12.4.1-base-ubi9
      resources:
        limits:
          nvidia.com/gpu: 1 # number of H100 GPUs requested
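To confirm that a pod requesting the GPU resource actually sees its GPU, a short-lived pod can print the visible devices. This is a minimal sketch, not a Phoeniqs-provided manifest: the pod name gpu-smoke-test is hypothetical, and it assumes nvidia-smi is made available inside the container by the NVIDIA container runtime, which is the usual behaviour on nodes managed by the GPU Operator.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test          # hypothetical name
  namespace: my-project
spec:
  restartPolicy: Never          # run once and stop
  containers:
    - name: nvidia-smi
      image: nvidia/cuda:12.4.1-base-ubi9
      command: ["nvidia-smi", "-L"]   # list the GPUs visible to this container
      resources:
        limits:
          nvidia.com/gpu: 1     # same resource request as a real workload
```

If the pod completes, its log should list the allocated GPU. A pod that stays Pending typically means that no node currently has the requested number of GPUs free, or that the namespace quota does not permit the request.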
GPUs for VMs use a different resource
The nvidia.com/gpu resource applies to container workloads. To use GPUs inside a VM, your namespace must instead be granted passthrough GPUs, which is a different resource type — see below.
# GPU deployment models
# Shared / container GPUs (available now)
Today, GPUs are exposed to tenants as the standard nvidia.com/gpu container resource. The NVIDIA GPU Operator runs on the bare-metal nodes (installed and managed by Phoeniqs) and advertises GPUs to the scheduler. This is the supported and tested path for AI training and inference workloads.
# GPU passthrough for VMs
Dedicating GPU nodes to VM (vfio-pci) passthrough — not shared with container workloads — is a new feature that requires a different resource type and a different way of loading the NVIDIA operator on the affected nodes.
Passthrough requires manual enablement
Passthrough GPUs for VMs are not enabled by default and will not work without action from our side. Enabling them requires the NVIDIA operator to be loaded differently and the relevant VMs to be placed on dedicated GPU nodes, separate from container GPU workloads. If you need passthrough before the official July release, open a service ticket and explain why container GPUs do not meet your requirements.
# permittedHostDevices and deviceNames (KubeVirt)
KubeVirt VM passthrough requires the H100 devices to be registered in the HyperConverged CR's permittedHostDevices with a deviceName that you then reference in the VM spec. The HyperConverged CR is not readable at tenant scope.
At the moment, GPU passthrough to VMs does not work by default, because it requires the NVIDIA GPU Operator to be loaded differently from the current container-focused configuration. As a result, no tenant-facing deviceName is published yet.
Once VM passthrough is enabled for your namespace, the supported deviceNames and the corresponding VM spec snippet will be provided as part of the enablement process. To request enablement, open a service ticket.
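For orientation only, the fragment below shows roughly where a passthrough GPU is referenced in a KubeVirt VirtualMachine. It is a sketch under assumptions, not the snippet supplied during enablement: the VM name gpu-vm, the memory size, and the placeholder deviceName are hypothetical, and disks, networks, and other required fields are omitted. The real deviceName is the one communicated to you when passthrough is enabled for your namespace.

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: gpu-vm                      # hypothetical name
  namespace: my-project
spec:
  runStrategy: Halted               # create the VM in a stopped state
  template:
    spec:
      domain:
        memory:
          guest: 64Gi               # illustrative size only
        devices:
          gpus:
            - name: gpu0            # identifier for this device within the spec
              # Placeholder: replace with the deviceName provided at enablement.
              deviceName: REPLACE_WITH_DEVICE_NAME_FROM_ENABLEMENT
          # disks and interfaces omitted for brevity
```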
# NVLink and NVSwitch topology
GPU interconnect today is validated for container workloads. Intra-node GPUs on an HGX node are connected via NVSwitch/NVLink; an NVLink-Network fabric spanning nodes (e.g. connecting ~100 GPUs to a single VM) has not been tested or made available for VM workloads.
VM GPU sizing
Because passthrough for VMs is still being rolled out, VM GPU sizing (single VM per 8-GPU node vs. multi-node fabric) is determined case by case. If your architecture depends on a specific NVLink scope, raise it with us so we can validate it together before you design around it.
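For container workloads, the intra-node interconnect can be inspected from inside a pod that holds all eight GPUs of one node. The following is a minimal sketch, assuming your quota allows eight GPUs in a single pod and that nvidia-smi is available inside the container; the pod name is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-topology-check      # hypothetical name
  namespace: my-project
spec:
  restartPolicy: Never          # run once and stop
  containers:
    - name: topo
      image: nvidia/cuda:12.4.1-base-ubi9
      command: ["nvidia-smi", "topo", "-m"]   # print the GPU-to-GPU connection matrix
      resources:
        limits:
          nvidia.com/gpu: 8     # all GPUs of one HGX node
```

In the matrix written to the pod log, entries beginning with NV indicate an NVLink connection between that pair of GPUs.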
# In-VM NVIDIA driver and CUDA responsibility
Why VMs differ
The GPU Operator is installed at the bare-metal level by Phoeniqs for container workloads. Enabling passthrough requires changing that operator on the affected nodes and placing passthrough VMs on separate GPU nodes. With passthrough, the GPU is handed to the guest as a PCI device (vfio-pci), so the host-side driver stack does not extend into the VM. For VMs, the in-guest NVIDIA driver and CUDA stack are therefore the tenant's responsibility.
# Networking and inter-VM / inter-pod traffic
OpenShift ships without a cluster-level default-deny NetworkPolicy. East-west traffic between your VMs and pods is therefore allowed by default, including service ports such as MariaDB on TCP 3306 and DNS.
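Because there is no default deny, add your own NetworkPolicy objects to block services that must not be reachable from outside your namespace. The policy below denies all ingress traffic to every pod in the namespace; any traffic you still need must then be allowed explicitly by additional policies.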
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-project
spec:
  podSelector: {}
  policyTypes:
    - Ingress
Protect exposed services yourself
With no default deny in place, unprotected services are reachable across namespaces. Define explicit NetworkPolicy rules for anything sensitive. A platform-wide default-deny is planned for the future, but because of the potential impact on existing tenants it will be introduced gradually — do not rely on it today.
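Once a namespace denies all ingress by default, traffic that should remain possible has to be allowed again. The sketch below re-allows ingress between pods of the same namespace, while traffic from other namespaces stays blocked. The policy name allow-same-namespace is hypothetical, and the example assumes that nothing outside my-project needs to reach your services; adapt the selectors if that is not the case.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace    # hypothetical name
  namespace: my-project
spec:
  podSelector: {}               # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}       # any pod in this same namespace
```

NetworkPolicy rules are additive: a connection is permitted when at least one policy selecting the target pod allows it, so this policy can sit alongside a default-deny policy without conflict.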
# Monitoring GPU workloads
OpenShift's built-in monitoring stack (Prometheus/Grafana) is not currently exposed to tenant namespaces, and you cannot list ServiceMonitor objects or the gpu-operator namespace at tenant scope.
Want early access to shared monitoring?
We plan to make a shared monitoring instance available to tenants. If observability for GPU workloads is important to you now, contact us and we can check whether the team can onboard you as an early (sponsor) user.
# Roadmap summary
# Related Pages
- IaaS Capacity and Node Provisioning
- OpenShift Architecture Overview
- Namespaces, Quotas and RBAC
- Access Your OpenShift AI
- Confidential Computing