Nutanix Data Protection and Disaster Recovery

Scope

Data protection, replication, backup, and disaster recovery for Nutanix environments: protection domains, Leap orchestration, NearSync and metro availability replication tiers, third-party backup integration, immutable backup targets (Objects with WORM), retention policies, and DR testing.

Checklist

Why This Matters

Nutanix provides multiple data protection tiers, each with a different RPO, complexity, and resource cost. Protection domains with async replication are the simplest approach, but their RPO is one hour or longer and their snapshots are crash-consistent by default. NearSync reduces RPO to about 1 minute, but it relies on lightweight snapshots and consumes additional CVM resources. Metro availability provides synchronous replication with zero RPO, but it requires a low-latency link (under 5 ms round-trip) and should be paired with a witness VM -- without the witness, a network partition forces manual intervention to avoid split-brain. Leap adds orchestration on top of replication, automating the sequence of powering on VMs in order, remapping networks, and re-addressing IPs -- without Leap, DR failover is a manual, error-prone process carried out under pressure. Backup is distinct from replication: replication protects against site failure, while backup to immutable targets (Objects with WORM) protects against ransomware and accidental deletion. Most organizations need both.
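The tier trade-off described above can be sketched as a small decision helper. This is an illustrative sketch only, not a Nutanix API: the names `SiteLink`, `Recommendation`, and `choose_replication_tier` are hypothetical, and the thresholds (hourly for async, 1 minute for NearSync, under 5 ms for metro) are taken directly from the paragraph above rather than from a sizing guide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SiteLink:
    """Hypothetical description of the inter-site link (not a Nutanix object)."""
    rtt_ms: float        # measured round-trip latency between the two sites
    has_witness: bool    # True if a witness VM is available for metro availability


@dataclass
class Recommendation:
    tier: str
    warnings: list = field(default_factory=list)


def choose_replication_tier(rpo_seconds: int, link: SiteLink) -> Recommendation:
    """Pick the lowest-overhead tier that still meets the RPO target."""
    if rpo_seconds < 0:
        raise ValueError("RPO cannot be negative")
    if rpo_seconds >= 3600:
        # Hourly or longer RPO: async replication is sufficient.
        return Recommendation("async")
    if rpo_seconds >= 60:
        # 1-minute RPO: NearSync, at the cost of extra CVM resources.
        return Recommendation("nearsync")
    # Anything tighter than 1 minute needs synchronous (metro) replication.
    if link.rtt_ms >= 5.0:
        raise ValueError("metro availability needs a link under 5 ms")
    rec = Recommendation("metro")
    if not link.has_witness:
        rec.warnings.append(
            "no witness: a network partition needs manual intervention"
        )
    return rec


if __name__ == "__main__":
    link = SiteLink(rtt_ms=40.0, has_witness=False)
    print(choose_replication_tier(300, link).tier)  # prints "nearsync"
```

The helper deliberately raises instead of silently downgrading when the link cannot support metro availability, since an unmet RPO target is a design decision that should be made explicitly.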

Common Decisions (ADR Triggers)

Replication tier -- Async (hourly+ RPO, lowest resource cost) vs NearSync (1-minute RPO, moderate overhead) vs metro availability (zero RPO, requires low-latency link and witness)
DR orchestration -- Nutanix Leap (built-in, recovery plans, test failover) vs third-party DR orchestration (Zerto, VMware SRM with ESXi) vs manual runbooks
Backup solution -- HYCU (Nutanix-native, API-integrated) vs Veeam (broad ecosystem, mature) vs Commvault (enterprise scale, complex) vs Rubrik (SaaS management)
Backup target -- Nutanix Objects with WORM (on-cluster, immutable) vs external NAS (simple, familiar) vs cloud tier to AWS S3/Azure Blob (offsite, pay-per-use)
Snapshot consistency -- Crash-consistent (default, no guest dependency) vs application-consistent (requires NGT/VSS on Windows, pre/post scripts on Linux)
Retention strategy -- Short local + long remote (optimize local storage) vs long local + archive offsite (fastest restore) vs tiered with lifecycle policies
DR site architecture -- Symmetric (identical cluster at DR site, bidirectional replication) vs asymmetric (smaller DR cluster, one-way replication, accept degraded performance during failover)
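To make the "short local + long remote" retention strategy concrete, the sketch below classifies snapshots by age into those still held locally, those held only at the remote site, and those past all retention. It is a planning aid under stated assumptions, not how Nutanix enforces schedules: the function name `split_by_retention` and the 7-day and 30-day windows in the usage example are hypothetical.

```python
from datetime import datetime, timedelta


def split_by_retention(snapshot_times, now, local_keep, remote_keep):
    """Classify snapshot timestamps under a short-local, long-remote policy.

    Returns three lists (oldest first): snapshots still kept locally,
    snapshots kept only on the remote site, and snapshots that have expired.
    """
    if local_keep > remote_keep:
        raise ValueError("this policy assumes local retention <= remote retention")

    local, remote_only, expired = [], [], []
    for ts in sorted(snapshot_times):
        age = now - ts
        if age <= local_keep:
            local.append(ts)          # fast local restore still possible
        elif age <= remote_keep:
            remote_only.append(ts)    # restore requires pulling from remote
        else:
            expired.append(ts)        # past all retention windows
    return local, remote_only, expired


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    snaps = [now - timedelta(days=d) for d in (1, 5, 20, 60)]
    local, remote_only, expired = split_by_retention(
        snaps, now, local_keep=timedelta(days=7), remote_keep=timedelta(days=30)
    )
    print(len(local), len(remote_only), len(expired))  # prints "2 1 1"
```

Running the same snapshot list through different window pairs gives a quick view of how much restore capability each retention option keeps on the local cluster versus the remote site.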

Reference Links

Nutanix Leap disaster recovery -- Leap DR orchestration, protection policies, and recovery plans
Nutanix data protection guide -- snapshots, protection domains, and replication configuration
Nutanix Mine backup integration -- integrated backup solution on Nutanix HCI