VMware Compute

Scope

This document covers VMware vSphere compute configuration including cluster sizing, DRS, HA, resource management, VM hardware versions, encryption, Fault Tolerance, content libraries, NUMA awareness, and lifecycle management.

Checklist

Why This Matters

Compute configuration in vSphere directly determines application availability, performance, and cost efficiency. Over-provisioning vCPUs increases CPU ready time and co-stop (where a multi-vCPU VM waits for its virtual processors to be scheduled together), which can degrade performance more than under-provisioning does. HA admission control misconfiguration can either waste capacity (over-reserving) or leave insufficient room for failover (under-reserving), so that VMs fail to restart after a host outage. DRS affinity rules are frequently audit-critical for software licensing compliance: Oracle, for example, is commonly understood to require licensing all cores on any host where its VMs could run, which makes unconstrained DRS a significant financial liability. EVC mode misconfiguration blocks vMotion during hardware refreshes, forcing maintenance windows. Resource pool misuse (particularly nested pools with default shares) is one of the most common vSphere misconfigurations and can silently starve workloads during contention.
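To make CPU ready time concrete, the sketch below converts a CPU ready summation value (milliseconds, as shown in vCenter performance charts) into a percentage. It assumes the commonly documented conversion of summation divided by the sampling interval, with 20 seconds as the real-time chart interval. The function name and the per-vCPU averaging are illustrative choices, not part of any VMware API.

```python
def cpu_ready_percent(ready_ms: float, interval_s: float = 20.0, vcpus: int = 1) -> float:
    """Convert a CPU ready summation (milliseconds) into a percentage.

    interval_s: chart sampling interval in seconds (20 s for real-time charts).
    vcpus: spreads a VM-level summation evenly across its vCPUs
           (an averaging assumption made for this sketch).
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100.0 / (interval_s * 1000.0 * vcpus)


if __name__ == "__main__":
    # 1000 ms of ready time in a 20 s sample on a single vCPU
    print(cpu_ready_percent(1000))           # 5.0
    # 2000 ms of ready time summed over a 4-vCPU VM
    print(cpu_ready_percent(2000, vcpus=4))  # 2.5
```

What counts as an acceptable percentage depends on the workload, so treat any threshold as a local policy decision rather than a fixed rule.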

Common Decisions (ADR Triggers)

HA admission control policy -- percentage-based (flexible, works well for uniform clusters) vs slot-based (predictable but wasteful with heterogeneous VM sizes) vs dedicated failover hosts (deterministic but idle capacity); the reserved percentage should correspond to the intended N+X host failure tolerance (for example, 25% to tolerate one host failure in a cluster of four uniform hosts)
DRS automation level -- fully automated (best for dynamic workloads, reduces manual intervention) vs partially automated (initial placement only, recommended for workloads sensitive to vMotion) vs manual (recommendations only, for regulated environments requiring change control)
Fault tolerance vs HA -- FT provides zero-downtime failover with a secondary shadow VM but is limited to 8 vCPUs, doubles resource consumption, and blocks snapshots/Storage vMotion; HA restarts VMs after failure with brief downtime but works universally
Resource pools vs VM-level reservations -- pool-level controls for multi-tenant isolation vs per-VM reservations for guaranteed resources; pools are frequently misused as organizational folders, causing unintended resource allocation
VM hardware version strategy -- upgrade immediately for new features (virtual NVMe, Precision Clock, vPMem) vs hold at a specific version for cross-cluster compatibility and rollback flexibility
NUMA-aware sizing -- fit within NUMA boundaries for performance-critical VMs vs allow wide VMs for workloads that need large vCPU counts regardless of NUMA topology
Content library architecture -- single publisher with subscribers for multi-site template consistency vs local libraries per site for independence; published libraries require reliable WAN connectivity
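Two of the decisions above reduce to simple arithmetic, sketched here under stated assumptions. The first function derives a percentage-based admission control reservation from an N+X target, assuming uniform hosts. The second checks whether a VM fits inside a single NUMA node, counting physical cores only. Both function names and parameters are hypothetical and exist only for illustration.

```python
import math


def failover_capacity_percent(total_hosts: int, host_failures_to_tolerate: int) -> int:
    """Percentage of cluster capacity to reserve for N+X, assuming uniform hosts."""
    if not 0 < host_failures_to_tolerate < total_hosts:
        raise ValueError("failures to tolerate must be between 1 and total_hosts - 1")
    # Round up so the reservation never falls below X hosts' worth of capacity.
    return math.ceil(host_failures_to_tolerate * 100 / total_hosts)


def fits_in_numa_node(vcpus: int, memory_gb: float,
                      cores_per_node: int, memory_gb_per_node: float) -> bool:
    """True if the VM's vCPUs and memory both fit within one NUMA node.

    Counts physical cores only (hyperthreads are ignored in this sketch).
    """
    return vcpus <= cores_per_node and memory_gb <= memory_gb_per_node


if __name__ == "__main__":
    print(failover_capacity_percent(8, 1))      # 13 (12.5 rounded up)
    print(failover_capacity_percent(10, 2))     # 20
    print(fits_in_numa_node(16, 256, 24, 384))  # True
    print(fits_in_numa_node(32, 256, 24, 384))  # False: a "wide" VM spanning nodes
```

In clusters with mixed host sizes the percentage would need to be derived from the largest hosts' capacity rather than from a simple host count.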

Version Notes

Feature	vSphere 7 (7.0 U3)	vSphere 8 (8.0 U2+)	vSphere 9 / VCF 9.0
Maximum VM hardware version	vmx-19	vmx-21	vmx-22
DPU (Data Processing Unit) support	Not available	GA (offload networking/security to SmartNIC)	GA (enhanced)
DRS	GA (load balancing)	GA (improved workload placement, DRS scores)	GA
vSphere Lifecycle Manager (vLCM)	GA (image-based)	GA (enhanced firmware/driver management)	GA (sole lifecycle tool)
Update Manager (VUM)	Supported (baseline-based)	Deprecated (replaced by vLCM)	Removed
Host Profiles	GA	GA	Deprecated (use vSphere Configuration Profiles)
Auto Deploy	GA	GA	Deprecated
vSphere Configuration Profiles	Not available	GA (host config drift remediation)	GA (replaces Host Profiles)
DevOps Center	Not available	GA (VM Service for developer self-service)	GA
VM vGPU profiles	GA (NVIDIA GRID)	GA (improved MIG support, vGPU 16+)	GA (enhanced AI/ML)
vSphere Fault Tolerance	GA (up to 8 vCPU)	GA (up to 8 vCPU, unchanged)	GA
Content Library	GA	GA (improved OVF deployment, check-in/check-out)	GA
vSphere+ (SaaS management)	Not available	GA (cloud-connected vCenter management)	GA
AI/ML workload support	Basic GPU passthrough	Enhanced (vSphere AI integration, DPU offload)	Enhanced (VCF AI focus)
Assignable Hardware	Not available	GA (framework for DPU, GPU, other devices)	GA

Key changes in vSphere 9 / VCF 9.0 compute:

- VUM removed: VMware Update Manager (VUM) is fully removed in vSphere 9. All host lifecycle management must use vSphere Lifecycle Manager (vLCM) with image-based desired state. Organizations still using VUM baselines must migrate to vLCM before upgrading.
- Host Profiles deprecated: Host Profiles are deprecated in vSphere 9 in favor of vSphere Configuration Profiles. Plan migration from Host Profiles to Configuration Profiles for host configuration management.
- Auto Deploy deprecated: vSphere Auto Deploy is deprecated in vSphere 9. Evaluate alternative stateless host provisioning approaches using vLCM image-based management.
- Version number alignment: VCF jumped from 5.x directly to 9.0 to align all component versions (vSphere 9, ESXi 9, vSAN 9, NSX 9). There is no VCF 6.x, 7.x, or 8.x.
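The vSphere 9 changes above can be turned into a simple pre-upgrade review. The sketch below is a minimal illustration that works on hand-collected inventory data. The `ClusterState` fields and the `vsphere9_readiness` function are hypothetical names, not part of any VMware SDK, and the severity labels simply mirror the "removed" versus "deprecated" distinction in the list.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hand-collected facts about one cluster (hypothetical structure)."""
    name: str
    uses_vum_baselines: bool
    uses_host_profiles: bool
    uses_auto_deploy: bool


def vsphere9_readiness(cluster: ClusterState) -> list[tuple[str, str]]:
    """Return (severity, action) findings for a planned vSphere 9 upgrade."""
    findings = []
    if cluster.uses_vum_baselines:
        # VUM is removed in vSphere 9, so this must be resolved before upgrading.
        findings.append(("blocker", "migrate VUM baselines to a vLCM image"))
    if cluster.uses_host_profiles:
        # Deprecated, not removed: plan the move to Configuration Profiles.
        findings.append(("warning", "plan migration to vSphere Configuration Profiles"))
    if cluster.uses_auto_deploy:
        findings.append(("warning", "evaluate alternatives to Auto Deploy"))
    return findings


if __name__ == "__main__":
    legacy = ClusterState("prod-a", True, True, False)
    for severity, action in vsphere9_readiness(legacy):
        print(f"{legacy.name}: [{severity}] {action}")
```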

Key differences between vSphere 7 and 8 compute:

- DPU support: vSphere 8 introduced support for Data Processing Units (SmartNICs such as NVIDIA BlueField, AMD Pensando). DPUs offload networking, security, and storage I/O processing from the host CPU, freeing compute resources for workloads. The ESXi control plane runs on the DPU in a "stateless" host model.
- VM hardware versions: vmx-21 (vSphere 8) adds support for new virtual device types and performance optimizations. Upgrading VM hardware version is one-way -- ensure all hosts in the cluster are at the target ESXi version before upgrading VMs.
- DRS improvements: vSphere 8 DRS uses a workload placement scoring model that provides better initial placement and more predictable migrations. DRS scores are visible in the UI, making it easier to understand why migrations occur.
- Lifecycle Manager vs Update Manager: vSphere 7 supported both vLCM (image-based, desired-state) and VUM (baseline-based, legacy). vSphere 8 deprecates VUM in favor of vLCM exclusively. vLCM manages ESXi images, firmware, and drivers as a single desired-state image per cluster. Organizations using VUM-based workflows must migrate to vLCM.
- vSphere Configuration Profiles: New in vSphere 8, these enable host configuration drift detection and remediation at the cluster level, ensuring all hosts maintain consistent settings for networking, storage, and security.
- vSphere+: Cloud-connected SaaS management layer that provides centralized vCenter visibility, subscription-based licensing, and lifecycle management across distributed vSphere environments.
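Because a VM hardware upgrade is one-way, it can help to check host versions before upgrading. The sketch below uses the maximum hardware versions from the table in this section as a lookup. The release labels and the function name are assumptions made for illustration, and a real check would read host versions from vCenter rather than from a hard-coded list.

```python
# Maximum VM hardware version per release, taken from the version table above.
MAX_HW_VERSION = {
    "7.0 U3": 19,
    "8.0 U2": 21,
    "9.0": 22,
}


def can_upgrade_vm_hardware(target_hw: int, host_releases: list[str]) -> bool:
    """True only if every host in the cluster supports the target hardware version."""
    for release in host_releases:
        if release not in MAX_HW_VERSION:
            raise ValueError(f"unknown release: {release!r}")
    return all(MAX_HW_VERSION[release] >= target_hw for release in host_releases)


if __name__ == "__main__":
    # Mixed cluster mid-upgrade: one host still on 7.0 U3 blocks vmx-21.
    print(can_upgrade_vm_hardware(21, ["8.0 U2", "8.0 U2", "7.0 U3"]))  # False
    # All hosts on 8.0 U2: vmx-21 is safe to roll out.
    print(can_upgrade_vm_hardware(21, ["8.0 U2", "8.0 U2"]))            # True
```

Checking every host, not just the one currently running the VM, matters because DRS or HA may later place the VM on any host in the cluster.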

Reference Links

vSphere 8 documentation -- ESXi and vCenter configuration, DRS, HA, and resource management
vSphere resource management guide -- resource pools, shares, reservations, and NUMA scheduling
VMware Configuration Maximums -- supported limits for vSphere clusters, VMs, and hosts