IBM Power On-Premises Operations (AIX, IBM i, Linux on Power)

Scope

This file covers on-premises IBM Power estates -- the AIX, IBM i, and Linux on Power workloads that sit behind every Power migration to Skytap, IBM Power Virtual Server, or a continued on-prem refresh. It addresses processor-generation strategy and LPAR sizing, operating system support windows and PTF/TL discipline, backup and recovery tooling (mksysb, BRMS, GoSave, NIM, IBM Storage Protect), high-availability and DR (PowerHA SystemMirror, IBM i journal-based replication, MIMIX/Assure MIMIX, Maxava, Db2 Mirror, Live Partition Mobility), IBM software licensing (Passport Advantage, sub-capacity vs full-capacity, ILMT compliance), and day-to-day operations (HMC, PowerVC, NMON, topas, IBM i Performance Data Investigator). For the post-migration Azure-adjacent platform see providers/skytap/cloud.md. For the IBM Cloud Power platform see providers/ibm/powervs.md. For cross-platform migration methodology see general/workload-migration.md.

Checklist

Processor Generation and LPAR Sizing

  • [Critical] Is the processor generation of every Power frame inventoried (Power8, Power9, Power10, Power11) and aligned with the system's supported-firmware roadmap? (Power8 systems hit IBM hardware end-of-service in 2024; Power9 systems entered the EoS announcement window. Power10 (GA September 2021) and Power11 (GA July 2025) are the current and forward-looking generations. Mixing generations across an HA pair or LPM source/target restricts processor-compatibility-mode settings and limits live-mobility options.)
  • [Critical] Are entitled capacity, vCPU count, and capped vs uncapped sharing mode set deliberately per LPAR -- with the understanding that a capped partition can never exceed its entitlement, while an uncapped partition can burst into spare shared-pool capacity up to its vCPU count? (Under-entitling a capped LPAR causes immediate performance degradation under load; this matters most for IBM i partitions, which are often kept capped to bound per-core licence exposure. Over-entitling uncapped LPARs wastes shared-pool headroom that other partitions could burst into. Document the entitled / vCPU / mode tuple per LPAR, not just the totals.)
  • [Critical] Are LPARs sized using the correct unit per OS -- CPW ratings for IBM i, rPerf for AIX -- and not raw GHz or core counts? (CPW (Commercial Processing Workload) and rPerf are IBM's relative-performance benchmarks normalised across processor generations. Treating a Power8 core and a Power10 core as equivalent during sizing routinely produces 30-50% over-provisioning or, worse, under-sized replacements during a frame refresh.)
  • [Recommended] Is memory sized accounting for DDIMM topology, Active Memory Expansion (AME) compression where in use, and Active Memory Sharing (AMS) pool participation? (Power10/11 use OMI-attached DDIMMs with different bandwidth characteristics than Power9 RDIMMs. AME trades CPU for effective memory but is invisible to capacity planners who size by lparstat. AMS pools are easy to oversubscribe.)
  • [Recommended] Is the I/O virtualization model documented -- VIOS pair count, NPIV vs vSCSI for SAN, SR-IOV vs SEA for networking -- and are VIOS partitions sized for the aggregate I/O of their client LPARs? (An overloaded VIOS is the most common cause of otherwise unexplained LPAR latency. Dual VIOS with NPIV and SEA failover is the production baseline; single VIOS is acceptable only for dev.)
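
The sizing arithmetic behind the rPerf/CPW item can be sketched as follows. This is a minimal illustration, not an IBM sizing tool: the `LparLoad` structure, the 20% headroom, the 0.05-core rounding step, and the per-core ratings used in the example are all assumptions. Real per-core rPerf or CPW figures should come from IBM's published performance reports for the specific models.

```python
import math
from dataclasses import dataclass


@dataclass
class LparLoad:
    name: str
    peak_cores_used: float         # measured peak physical cores consumed on the source frame
    source_rating_per_core: float  # rPerf (AIX) or CPW (IBM i) per core on the source model
    target_rating_per_core: float  # same unit, per core on the target model


def target_entitlement(load: LparLoad, headroom: float = 0.2, step: float = 0.05) -> float:
    """Entitled capacity (cores) needed on the target frame, rounded up to `step`."""
    # Express the measured peak in benchmark units rather than in cores.
    work = load.peak_cores_used * load.source_rating_per_core
    cores = work * (1 + headroom) / load.target_rating_per_core
    # Round up to the next step; the small epsilon absorbs floating-point noise.
    steps = math.ceil(cores / step - 1e-9)
    return round(steps * step, 2)


# Hypothetical ratings: the target core is assumed to do twice the work of the source core.
example = LparLoad("app1", peak_cores_used=4.0,
                   source_rating_per_core=10.0, target_rating_per_core=20.0)
print(example.name, target_entitlement(example))
```

Sizing core-for-core in this example would have carried 4.0 cores across, whereas the normalised figure is well under that; with the ratings reversed the same shortcut would under-size the replacement.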

Operating Systems and Patch Management

  • [Critical] Is every AIX LPAR running a supported version with a current Technology Level -- AIX 7.3 (current, TL4 released December 2025, supported through 2028) -- and are migrations off AIX 7.1 (ended April 2023) and 7.2 (ended November 2023) planned or completed? (Running unsupported AIX is a compliance and security exposure. Many estates carry one or two AIX 7.1 LPARs that "nobody touches"; those are the LPARs that block migration and fail audits.)
  • [Critical] Is every IBM i LPAR on a supported release with current cumulative PTF and group PTF levels -- IBM i 7.6 (released April 2025, current), 7.5 (released May 2022), or 7.4 (service-pack support ending September 2026) -- and are migrations off 7.3 and earlier (service-pack support ended September 2023, extended support to 2028) scheduled? (IBM i release upgrades are infrequent but heavyweight, and ISV applications often pin to specific releases. Discovering an ISV release-lock during a migration cutover, not before, is the typical failure mode.)
  • [Recommended] Is the AIX patch cadence (Service Pack on a Technology Level, with TL refresh roughly annually) and IBM i patch cadence (cumulative PTF + group PTFs for security, HIPER, Db2, Java) operationalised with documented test and apply windows? (Most estates patch reactively when audit findings appear. Service Update Management Assistant (SUMA) for AIX and the IBM i PTF download tooling automate the fetch; the test/apply discipline is the gap.)
  • [Recommended] Are Linux on Power workloads on supported distributions (RHEL for Power, SUSE for Power, Ubuntu Server for Power), and is the distribution's support window aligned with the underlying firmware support window? (Linux on Power is operationally distinct from AIX/IBM i but shares the LPAR, VIOS, and HMC fabric; ignoring it during a Power refresh leaves orphaned LPARs.)
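
A support-window check along the lines of the items above could be sketched like this. The end dates are a subset of those stated in this checklist, with the day-of-month assumed to be the last day of the month; they should be confirmed against IBM's lifecycle pages before anyone relies on them. The function name, status strings, and one-year warning period are hypothetical choices.

```python
from datetime import date

# (OS, release) -> end of standard support. Month and year are taken from the
# checklist; the day of the month is an assumption.
END_OF_SUPPORT = {
    ("AIX", "7.1"): date(2023, 4, 30),
    ("IBM i", "7.3"): date(2023, 9, 30),
    ("IBM i", "7.4"): date(2026, 9, 30),
}


def support_status(os_name: str, release: str, today: date, warn_days: int = 365) -> str:
    """Classify a release as unsupported, ending soon, supported, or unknown."""
    eos = END_OF_SUPPORT.get((os_name, release))
    if eos is None:
        return "no end date recorded"
    if today > eos:
        return "unsupported"
    if (eos - today).days <= warn_days:
        return "ending soon"
    return "supported"


# Hypothetical inventory rows: (LPAR name, OS, release).
inventory = [("lpar01", "AIX", "7.1"), ("lpar02", "IBM i", "7.4"), ("lpar03", "AIX", "7.3")]
for name, os_name, release in inventory:
    print(name, os_name, release, support_status(os_name, release, date.today()))
```

Releases with no recorded end date are reported as such rather than assumed supported, so a gap in the table shows up as a finding instead of a silent pass.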

Backup and Recovery

  • [Critical] Is the AIX backup strategy defined -- mksysb for rootvg bare-metal recovery, savevg for non-root volume groups, alt_disk_copy for in-place upgrade rollback -- and is NIM (Network Installation Manager) configured to restore mksysb images network-side without physical media? (mksysb alone is not a backup strategy; without NIM or equivalent, restore requires media, console access, and hours of manual work per LPAR. A NIM master with current resources is the operational backbone of AIX recovery.)
  • [Critical] Is the IBM i backup strategy defined -- GoSave option 21 (full system, restricted state), option 22 (system data), option 23 (user data and IFS) -- and is BRMS (Backup, Recovery, and Media Services) deployed to schedule, catalogue, and drive restores? (Options 21 and 22 include SAVSYS and require restricted state; option 23 also ends subsystems by default, so all three need a maintenance window unless save-while-active techniques are used for user data. BRMS is the difference between "we have backups" and "we can restore a single library in 30 minutes" -- the latter is what audits and outages demand.)
  • [Recommended] Is the backup target architecture documented -- physical tape (LTO generation, library, off-site rotation), virtual tape library, deduplicated disk-to-disk (IBM Storage Protect / Spectrum Protect / TSM, Cohesity, Rubrik, Veeam where IBM i agents exist) -- and are restore-time RTOs measured, not assumed? (Tape restore is hours-to-days; dedup disk-to-disk is minutes-to-hours but requires capacity planning for the dedup pool. Many estates have not actually tested a full-system restore in over a year.)
  • [Recommended] Are IBM i journal receivers being managed -- regular detach, save, and delete -- so they do not silently fill ASP storage? (Unmanaged journal receivers eating ASP is a top-five IBM i outage cause. Continuous journaling is required for HA replication and most application-level recovery, but the receiver chain must be rotated.)
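
The "measured, not assumed" point about restore times can be turned into a simple audit over restore-test records. The sketch below assumes a hypothetical `RestoreRecord` kept per LPAR by the operations team; it does not read from BRMS, NIM, or IBM Storage Protect, and the one-year staleness threshold simply mirrors the observation above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class RestoreRecord:
    lpar: str
    last_full_restore_test: Optional[date]   # None if never exercised
    measured_restore_hours: Optional[float]  # None if the RTO is only an estimate
    rto_hours: float                         # recovery-time objective for this LPAR


def restore_findings(rec: RestoreRecord, today: date, max_age_days: int = 365) -> List[str]:
    """Return audit findings for one LPAR's restore evidence."""
    findings = []
    if rec.last_full_restore_test is None:
        findings.append("full restore never tested")
    elif (today - rec.last_full_restore_test).days > max_age_days:
        findings.append("full restore test is stale")
    if rec.measured_restore_hours is None:
        findings.append("restore time assumed, not measured")
    elif rec.measured_restore_hours > rec.rto_hours:
        findings.append("measured restore time exceeds RTO")
    return findings


rec = RestoreRecord("ibmi-prod", date(2023, 1, 10), 9.5, rto_hours=8.0)
print(rec.lpar, restore_findings(rec, today=date(2025, 1, 10)))
```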

High Availability and Disaster Recovery

  • [Critical] Is the AIX HA topology defined using PowerHA SystemMirror -- Standard Edition for single-site clusters or Enterprise Edition for cross-site clusters with storage replication -- with documented heartbeat networks (primary + at least one alternate, including disk heartbeat or repository disk), shared storage layout, and resource-group dependency policies? (A PowerHA cluster with a single heartbeat network is a split-brain incident waiting to happen. Repository disk on shared storage is required for current PowerHA releases; legacy IP-only heartbeats are not sufficient.)
  • [Critical] Is the IBM i HA approach chosen and operationalised -- IBM PowerHA for i with switched LUNs or geographic mirroring, Precisely Assure MIMIX or Maxava for journal-based logical replication, or Db2 Mirror for synchronous active-active Db2 access -- and is the chosen RPO/RTO matched to the tooling's actual capability? (PowerHA for i provides storage-level role swap; MIMIX/Maxava provide object-level replication that survives logical corruption better; Db2 Mirror provides near-zero-RPO for Db2 but does not protect non-database objects. Mixing tools without a topology owner produces unrecoverable split-brain.)
  • [Recommended] Are Live Partition Mobility (LPM) prerequisites met for the LPARs that will rely on it -- PowerVM Enterprise Edition licensed, dual VIOS pairs at source and target, fully virtualised I/O (no dedicated adapters), shared SAN with NPIV, compatible processor-compatibility modes between source and target frames? (LPM is the maintenance-window-elimination tool for planned outages; it is not a DR tool. Estates often discover an LPAR is not LPM-eligible only when a frame needs servicing.)
  • [Recommended] Is DR replication chosen and tested -- storage-array replication (Global Mirror, Metro Mirror) below the OS, OS/application-level replication (PowerHA cross-site, MIMIX, AIX Geographic Logical Volume Manager), or backup-and-restore-to-DR-site for cold DR -- with at least one annual full failover exercise? (The most common DR failure is not a replication failure but a runbook failure: nobody has done the failover in two years, the DNS/IP plan is stale, and the dependency order is unknown.)
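
The LPM prerequisites listed above lend themselves to a pre-flight check run long before a frame needs servicing. This is a sketch over a hand-maintained inventory, not a call into the HMC; the `LpmCandidate` fields and the compatibility-mode labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class LpmCandidate:
    name: str
    powervm_enterprise: bool      # PowerVM Enterprise Edition licensed
    dual_vios_both_ends: bool     # dual VIOS at source and target
    dedicated_adapters: int       # physical adapters assigned directly to the LPAR
    npiv_shared_san: bool         # storage reachable from both frames via NPIV
    compat_mode: str              # processor-compatibility mode the LPAR runs in
    target_modes: FrozenSet[str]  # modes the target frame can offer


def lpm_blockers(c: LpmCandidate) -> List[str]:
    """List the prerequisites from the checklist that this LPAR fails."""
    blockers = []
    if not c.powervm_enterprise:
        blockers.append("PowerVM Enterprise Edition not licensed")
    if not c.dual_vios_both_ends:
        blockers.append("dual VIOS missing at source or target")
    if c.dedicated_adapters > 0:
        blockers.append("dedicated adapters present")
    if not c.npiv_shared_san:
        blockers.append("storage not on shared SAN via NPIV")
    if c.compat_mode not in c.target_modes:
        blockers.append("compatibility mode not offered by target")
    return blockers


lpar = LpmCandidate("aix-db1", True, True, 1, True, "POWER9", frozenset({"POWER9", "POWER10"}))
print(lpar.name, lpm_blockers(lpar) or "eligible")
```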

IBM Software Licensing

  • [Critical] Is ILMT (IBM License Metric Tool) deployed on every PVU-licensed LPAR, collecting capacity data continuously, and producing the quarterly audit reports that IBM requires for sub-capacity licensing eligibility? (IBM's sub-capacity rules require the tool to be deployed within 90 days of the first sub-capacity-eligible product install, with audit reports generated at least once per quarter and retained for at least two years. Missing or stale ILMT reports flip the licensee to full-capacity charging on the entire processor group at audit time -- routinely a multi-million-dollar finding.)
  • [Critical] Is the IBM software inventory mapped to license entitlements -- Passport Advantage agreement, sub-capacity vs full-capacity per product, edition tiering (e.g., IBM i Standard vs Enterprise, AIX Standard vs Enterprise, Db2 editions), and bundled vs standalone licensing? (A surprising fraction of IBM software in production estates is mis-licensed: WebSphere ND charged against the entire frame when only one LPAR uses it, MQ licensed at full capacity when sub-capacity was available, Db2 Advanced when Standard would suffice.)
  • [Recommended] Are upcoming Power frame refreshes modelled for licensing impact before the order is placed -- a move from Power9 to Power10/11 changes the per-core PVU rating and may change the processor group, which can either increase or decrease the PVUs required for the same software footprint? (PVU tables and processor-group assignments are the lever that determines whether a refresh is licence-neutral or requires a true-up. IBM software costs typically dwarf hardware costs over the frame's life, so this is the dominant economic question of any refresh.)
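
The gap between sub-capacity and full-capacity charging is easiest to see as arithmetic. The sketch below is a deliberate simplification: it counts the virtual processors of the LPARs running a product, limited by the shared-pool size, and multiplies by a per-core PVU rating. The rating of 100 and the frame layout are illustrative assumptions; actual ratings come from IBM's PVU table, and actual counting rules from the sub-capacity licensing terms.

```python
from typing import Sequence


def pvus_full_capacity(activated_cores: int, pvu_per_core: int) -> int:
    """PVUs owed when the product is charged against every activated core in the frame."""
    return activated_cores * pvu_per_core


def pvus_sub_capacity(lpar_vcpus: Sequence[int], pool_cores: int, pvu_per_core: int) -> int:
    """PVUs owed when only the cores available to the product's LPARs are counted.

    Simplified rule: sum the virtual processors of the LPARs running the product,
    but never count more cores than the shared pool contains.
    """
    return min(sum(lpar_vcpus), pool_cores) * pvu_per_core


# Illustrative frame: 32 activated cores, product installed in one 4-vCPU LPAR.
full = pvus_full_capacity(32, pvu_per_core=100)
sub = pvus_sub_capacity([4], pool_cores=32, pvu_per_core=100)
print(f"full capacity: {full} PVUs, sub-capacity: {sub} PVUs")
```

Re-running the same calculation with the target frame's per-core rating and pool size is the refresh modelling the last item asks for.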

Operations and Tooling

  • [Recommended] Is HMC (Hardware Management Console) topology configured for redundancy -- dual HMCs per managed system, current firmware, audited access -- and is the HMC patch and backup schedule aligned with the managed-system maintenance schedule? (A failed HMC during a planned outage extends the window dramatically because LPAR operations, DLPAR, and LPM all flow through it. HMC firmware compatibility matrices with managed-system firmware are tight; mismatches block updates.)
  • [Optional] Is PowerVC deployed for OpenStack-style LPAR provisioning, image management, and automated placement across a Power estate, or is provisioning handled manually through HMC? (PowerVC is the right answer for estates with frequent LPAR churn or self-service requirements; manual HMC suffices for small static estates. Choosing PowerVC late is a project; choosing it day-one is an architectural decision.)
  • [Recommended] Are performance baselines captured per OS using the right tooling -- NMON and topas for AIX, WRKSYSSTS / WRKDSKSTS / IBM i Performance Data Investigator for IBM i, sysstat/perf for Linux on Power -- and retained long enough to compare against post-migration or post-refresh performance? (Performance regressions after a Power refresh or migration are usually real but unprovable without baselines. NMON in particular has near-zero overhead and should be running continuously on every AIX LPAR.)
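
Comparing a retained baseline against post-refresh samples can be as simple as a percentile check. The sketch below assumes the samples are already exported as plain numbers (for example response times or CPU-busy percentages); parsing NMON or Performance Data Investigator output is out of scope, and the 95th percentile and 10% tolerance are arbitrary illustrative choices.

```python
import math
from typing import Sequence, Tuple


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty sample set."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def regressed(baseline: Sequence[float], current: Sequence[float],
              pct: int = 95, tolerance: float = 0.10) -> Tuple[bool, float]:
    """Return (is_regression, relative_change) for the chosen percentile.

    Assumes a metric where higher is worse and a non-zero baseline percentile.
    """
    before = percentile(baseline, pct)
    after = percentile(current, pct)
    change = (after - before) / before
    return change > tolerance, change


# Hypothetical response-time samples in milliseconds.
baseline_ms = [100.0] * 19 + [200.0]
current_ms = [100.0] * 18 + [150.0, 200.0]
print(regressed(baseline_ms, current_ms))
```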

Why This Matters

For most Power migration engagements the on-premises environment is decades old and operationally idiosyncratic. The estate carries specific BRMS schedules cycled over generations of operators, PowerHA or MIMIX HA pairs whose runbooks live in one person's head, hard-coded entitled capacity that no longer matches the workload, ILMT either not deployed or deployed and never reviewed, journal receivers quietly filling ASP, third-party PASE applications nobody dares touch, and ISVs that pinned to an AIX TL or IBM i release a decade ago. None of this surfaces from a hyperscaler discovery scan or a vCenter export. Migrations that skip the on-prem foundation default to whatever the migration tool happens to surface, and the gaps appear during cutover when they are most expensive to fix.

The HA and DR layer is where most on-prem Power estates have the largest latent risk. PowerHA, MIMIX, Maxava, and Db2 Mirror are not interchangeable; each has a different failure mode and a different operational discipline, and most estates have layered them over years without a current topology owner. Replication is often configured but not exercised, runbooks reference IPs and hostnames that no longer exist, and the dependency order between application tiers is undocumented. A migration that lifts these workloads without first cleaning up the HA topology will lift the latent risk along with the workload.

IBM software licensing is the single largest economic variable in any Power estate -- often larger than the hardware cost over the frame's life -- and is also the area most prone to silent over-charging. ILMT is the gating mechanism for sub-capacity licensing on most IBM software. An ILMT agent missing on a single LPAR, or quarterly audit reports that were never generated or not retained for the required two years, can convert an audit from a clean finding to a seven-figure true-up. The remedy is not glamorous: deploy the agent on every LPAR, verify the central server is collecting, run the snapshot report on schedule, and review it quarterly.
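
The remedy described above reduces to two set-and-date checks that can run against exported inventory data. The sketch below does not call any ILMT API; the LPAR names are hypothetical, and the 92-day maximum gap is an assumption standing in for a roughly quarterly reporting cadence.

```python
from datetime import date
from typing import Iterable, List, Tuple


def missing_agents(all_lpars: Iterable[str], lpars_reporting: Iterable[str]) -> List[str]:
    """LPARs in the estate inventory that the ILMT server has no agent data for."""
    return sorted(set(all_lpars) - set(lpars_reporting))


def report_gaps(report_dates: Iterable[date], max_gap_days: int = 92) -> List[Tuple[date, date]]:
    """Consecutive report pairs that are further apart than the allowed cadence."""
    ordered = sorted(report_dates)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if (b - a).days > max_gap_days]


estate = ["aix-app1", "aix-db1", "ibmi-prod"]
reporting = ["aix-app1", "ibmi-prod"]
reports = [date(2024, 1, 15), date(2024, 4, 10), date(2024, 10, 1)]

print("no agent data:", missing_agents(estate, reporting))
print("report gaps:", report_gaps(reports))
```

Both lists being empty is the condition worth reviewing each quarter; anything else is a finding to close before an audit does it for you.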

Patch and OS support discipline is the fourth area where Power estates routinely fail audit. AIX 7.1 ended support in 2023; AIX 7.2 ended in late 2023; IBM i 7.3 service-pack support ended in 2023 with extended support running through 2028. Estates that have not moved to AIX 7.3 or IBM i 7.5/7.6 are accumulating compliance and security debt that compounds with every audit cycle, and most ISVs have moved their certification matrix forward, so deferring the OS upgrade also defers application upgrades that depend on it.

Common Decisions (ADR Triggers)

  • Capped vs uncapped LPAR sharing -- IBM i partitions are often run capped, commonly to bound per-core licence exposure; for those partitions the entitled-capacity decision is the workload sizing decision and there is no burst headroom. Uncapped partitions (typically AIX and Linux on Power) share frame-level burst capacity, at the cost of less predictable performance for any single LPAR. Document the choice per LPAR with the rationale, not just the totals.
  • PowerHA vs MIMIX/Maxava vs Db2 Mirror for IBM i HA -- PowerHA for i provides storage-level role swap and is closest to a traditional cluster; MIMIX or Maxava provide object-level logical replication that survives logical corruption and supports active-active read paths; Db2 Mirror provides near-zero-RPO synchronous replication for Db2 specifically but does not protect non-database objects. The right choice depends on the RPO/RTO target, the application architecture, and the operations team's existing skill set.
  • ILMT-deployed sub-capacity vs full-capacity licensing -- deploying and operating ILMT requires ongoing licensing-operations work (agents, central server, quarterly audit reports, two-year retention) but unlocks sub-capacity charging that typically reduces IBM software cost by a factor of 2-5 for partitioned estates. Full-capacity licensing has no operational overhead but charges against the entire processor group regardless of LPAR sizing. For any non-trivial Power footprint, ILMT pays for itself within months -- the ADR is whether the licensing-operations capability exists.
  • On-prem retention vs Skytap (Kyndryl Cloud Uplift) vs IBM Power Virtual Server -- continued on-prem retention preserves operations but defers the cloud-adjacency question and requires a hardware refresh; Skytap places Power workloads inside Azure data centres with low-latency access to Azure-native services and Kyndryl-managed engagement; IBM PowerVS places Power workloads inside IBM Cloud with the largest LPAR ceilings and the closest tie to IBM Cloud Pak services. The decision is driven by where the modern application estate sits, not by the Power workload itself.
  • Tape backup retention vs deduplicated disk-to-disk -- physical tape (LTO with off-site rotation) is cheapest per TB at scale and provides air-gap protection against ransomware, with hours-to-days restore times. Deduplicated disk-to-disk (IBM Storage Protect, Cohesity, Rubrik, Veeam where IBM i agents exist) provides minutes-to-hours restore and operational ease but requires capacity planning for the dedup pool and a separate immutability or air-gap design for ransomware resilience. Most production estates run both for different recovery tiers; the ADR is the split.

See Also

  • providers/skytap/cloud.md -- Skytap on Azure (Kyndryl Cloud Uplift) for IBM Power migration to an Azure-adjacent platform
  • providers/ibm/powervs.md -- IBM Power Virtual Server for IBM Power inside IBM Cloud
  • providers/kyndryl/ -- Kyndryl as managed-service provider for Power estates
  • general/workload-migration.md -- migration wave methodology and cutover planning
  • general/enterprise-backup.md -- enterprise backup architecture patterns
  • general/disaster-recovery.md -- DR architecture patterns including replication-based DR
  • patterns/hybrid-cloud.md -- hybrid cloud architecture patterns for legacy-plus-modern estates
  • patterns/application-modernization.md -- modernisation paths around an IBM i / AIX core