IBM Cloud Networking (VPC, Direct Link, Transit Gateway, CIS, VPN)
Scope
This file covers IBM Cloud networking beyond the platform basics -- VPC architecture (address prefixes, subnets, security groups, network ACLs, public gateways, floating IPs, VPN gateways), the classic-vs-VPC networking split that still shapes large IBM Cloud estates, hybrid connectivity via Direct Link Connect and Direct Link Dedicated, multi-VPC and cross-account routing via Transit Gateway (local vs global), edge services via Cloud Internet Services (CIS, Cloudflare-powered), and the cross-cloud connectivity patterns that pair IBM Cloud with PowerVS, AWS, Azure, and on-premises sites via Megaport or partner exchanges. For account hierarchy, IAM, and key management see providers/ibm/cloud-platform.md. For PowerVS-specific networking (Power Edge Router, Cloud Connections, the post-VPNaaS-EOL guidance) see providers/ibm/powervs.md. For ROKS / IKS networking specifics see providers/ibm/roks-iks.md.
Checklist
VPC Architecture
- [Critical] Is the VPC address-prefix and subnet plan documented per region with non-overlapping CIDRs against on-premises networks, peered VPCs, classic infrastructure CIDRs, and PowerVS workspaces? (IBM Cloud VPCs use customer-defined address prefixes per zone with subnets carved from those prefixes. Default prefixes are auto-assigned but rarely match the desired plan -- bring your own ranges and disable automatic prefix creation for any non-trivial design. CIDR overlaps with on-premises ranges are the dominant cause of "we cannot reach the database from Direct Link" tickets.)
- [Critical] Are Security Groups (stateful, applied at the virtual network interface) and Network ACLs (stateless, applied at the subnet) used as complementary layers, with Security Groups carrying the application-level allow rules and Network ACLs carrying broad subnet-level deny / emergency-block rules? (The IBM Cloud VPC model is similar to AWS: Security Groups are the primary stateful firewall, NACLs are defense-in-depth at the subnet boundary. Trying to do everything in NACLs forces stateless return-traffic rules and is operationally noisy.)
- [Critical] Is outbound internet access designed deliberately per subnet -- Public Gateway (NAT-style egress for an entire subnet, one per zone, free) for general outbound traffic, Floating IPs attached to individual VSIs for inbound-internet workloads, or no internet path for private-only workloads relying on Direct Link / Transit Gateway / Service Endpoints? (Public Gateway is per-subnet-per-zone and shared. A workload that needs predictable egress IPs cannot use Public Gateway; it needs a Floating IP per VSI or an outbound NAT appliance pattern.)
- [Critical] Are Virtual Private Endpoints (VPE for VPC) configured for IBM Cloud platform services (COS, Key Protect, Db2 on Cloud, Container Registry, etc.) so that traffic stays on IBM Cloud's private backbone rather than traversing public endpoints? (VPE is the IBM Cloud counterpart to AWS Interface Endpoints / Azure Private Endpoints. Without VPEs, "private" workloads call services over the public internet by DNS default, which is both a security finding and an unnecessary egress cost.)
- [Recommended] Is Cloud Service Endpoints (CSE) -- the private-network address path for service access from classic infrastructure and certain VPE-incompatible contexts -- documented where it applies, and is the workload's path to platform services (public, CSE, or VPE) explicit per service consumer? (CSE is older than VPE and applies in different contexts; mixing the two without a documented per-service plan produces unpredictable traffic paths.)
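The non-overlap requirement in the address-plan item above can be checked mechanically before anything is provisioned. The following is a minimal sketch using only Python's standard `ipaddress` module; the network names and CIDR ranges are hypothetical placeholders, not values from any real estate, and the helper `find_overlaps` is an illustrative name rather than part of any IBM Cloud tooling.

```python
from ipaddress import ip_network
from itertools import combinations


def find_overlaps(plan):
    """Return (name_a, name_b) pairs whose CIDR ranges overlap."""
    # Sort by name so the result order is deterministic.
    nets = sorted((name, ip_network(cidr)) for name, cidr in plan.items())
    return [(a, b) for (a, na), (b, nb) in combinations(nets, 2) if na.overlaps(nb)]


# Hypothetical address plan: every range here is an assumption for illustration.
plan = {
    "on-prem-dc1": "10.0.0.0/16",
    "vpc-prod-zone1": "10.10.0.0/18",
    "vpc-prod-zone2": "10.10.64.0/18",
    "classic-private": "10.0.128.0/20",
    "powervs-workspace": "192.168.50.0/24",
}

print(find_overlaps(plan))  # [('classic-private', 'on-prem-dc1')]
```

Running a check like this against the full plan (on-premises, every VPC prefix, classic, and PowerVS) whenever a range is added catches the overlap before it becomes a routing ticket.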
VPC VPN
- [Recommended] For VPC-side site-to-site connectivity, is VPC VPN chosen between route-based (routing-table-driven, supports BGP-like dynamic paths, the modern choice) and policy-based (ACL-driven, simpler but less flexible) modes, with HA configured by deploying two VPN gateways across zones and a peer-side configuration that supports IKEv2 / strongSwan-compatible parameters? (VPC VPN is the right answer for low-bandwidth, public-internet-encrypted hybrid connectivity; it is not a substitute for Direct Link on production-throughput workloads.)
- [Optional] Is Client VPN (Client-to-Site) evaluated for ad-hoc engineer / admin access into VPC, with mTLS certificates issued from a private CA and access governed by IAM and CBR rather than by a static credential? (Client VPN replaces the legacy classic-infrastructure SSL VPN portal for VPC-native engineering access. Anyone planning bastion-host SSH-over-public-IP is doing it wrong.)
Direct Link
- [Critical] Is the right Direct Link offering chosen against bandwidth, residency, and cost requirements -- Direct Link Connect (provider-side shared connection through a partner such as Megaport, Equinix, Cologix, Digital Realty, NetBond; 50 Mbps to 5 Gbps; faster to provision; partner billing) or Direct Link Dedicated (single-tenant cross-connect at an IBM Cloud direct-link location; 1 / 10 Gbps with newer locations supporting higher; MACsec encryption available; IBM port billing)? (Direct Link Connect is the right answer for most enterprise hybrid landings; Dedicated is for high-throughput, deterministic-latency, or MACsec-required patterns. Both are L3 routed services -- not L2 transparent links.)
- [Critical] Is Direct Link redundancy designed against the published resiliency model -- at minimum two connections at one direct-link location for production, two connections across two direct-link locations for facility-level fault tolerance, and BFD / BGP timer tuning for sub-second failover? (A single Direct Link is dev-only. A facility-level outage with only one Direct Link location is a multi-hour hybrid-connectivity outage that no enterprise tolerates.)
- [Recommended] Is Direct Link metered billing factored into the cost model? (As of 2025 IBM Cloud Direct Link applies metering charges on top of port-hour fees. Designs that assumed free or bundled Direct Link from older PowerVS or VPC engagements need a refresh; the bandwidth profile drives the metered cost component.)
- [Recommended] Is the Direct Link Gateway (the routing/peering construct on the IBM Cloud side) attached to a Transit Gateway rather than directly to a VPC, so that the same Direct Link can reach multiple VPCs, classic infrastructure, and PowerVS workspaces without per-VPC duplicate connections? (Attaching Direct Link directly to a VPC is the SoftLayer-era pattern; the strategic shape is Direct Link -> Transit Gateway -> any number of VPCs / classic / Power workspaces.)
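The redundancy levels described in this section (single connection, two connections at one location, connections across two locations, and two or more connections at each of two locations) can be expressed as a small classifier for design reviews. This is a sketch under stated assumptions: the function name, the tier labels, and the location keys are hypothetical, and the tiers paraphrase this document rather than quote IBM's published resiliency model.

```python
def resiliency_tier(connections_per_location):
    """Classify a Direct Link design from a {location: connection_count} mapping."""
    # Ignore locations that carry no connections.
    active = {loc: n for loc, n in connections_per_location.items() if n > 0}
    total = sum(active.values())
    if total == 0:
        return "no connectivity"
    if total == 1:
        return "dev/test only"
    if len(active) == 1:
        return "device-failure tolerance"
    if all(n >= 2 for n in active.values()):
        return "maximum resiliency"
    return "facility-failure tolerance"


# Hypothetical designs keyed by placeholder location names.
print(resiliency_tier({"location-a": 2}))
print(resiliency_tier({"location-a": 1, "location-b": 1}))
```

A design that classifies below "facility-failure tolerance" would not meet the production baseline this checklist describes.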
Transit Gateway
- [Critical] Is Transit Gateway used as the central hub for multi-VPC, multi-account, and hybrid connectivity, with the local vs global decision made explicitly -- Local Transit Gateway connects resources within a single region (lower cost, lower latency) and Global Transit Gateway connects resources across regions (higher cost, larger blast radius)? (Mixing local-only patterns with cross-region requirements after the fact requires standing up a second Transit Gateway and migrating attachments; pick the right shape up front.)
- [Critical] Are Transit Gateway connection types documented per attachment -- VPC, classic infrastructure (one connection per account, bridges the VPC-classic divide), Power Virtual Server (via Power Edge Router workspace), Direct Link Gateway (for on-premises reach), GRE tunnel (for overlay or NVA insertion), and cross-account (with the receiving account explicitly approving the request)? (Transit Gateway is the only IBM Cloud service that bridges VPC and classic infrastructure routing; without it the two stay as separate L3 islands.)
- [Recommended] Is the Transit Gateway capacity model (per-connection bandwidth, per-gateway connection limit, per-account quota) sized against forecast attachments, with cross-region pairs limited where bandwidth or cost demands it? (Transit Gateway pricing has per-connection-hour and data-processing components; cross-region traffic also incurs inter-region transfer cost. Hub-and-spoke architectures with chatty east-west traffic need a cost model, not a default deployment.)
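The cost components named above (per-connection-hour, data processing, and inter-region transfer) combine into a simple monthly estimate. The sketch below shows only the arithmetic; every rate in it is a made-up placeholder, not an IBM Cloud price, and the 730-hour month is an assumption. Substitute figures from the current price list before using it for a real forecast.

```python
def monthly_tgw_cost(
    connections,
    gb_processed,
    rate_per_connection_hour,
    rate_per_gb,
    inter_region_gb=0.0,
    inter_region_rate_per_gb=0.0,
    hours_per_month=730,
):
    """Estimate a monthly Transit Gateway bill from its three cost components."""
    connection_cost = connections * hours_per_month * rate_per_connection_hour
    processing_cost = gb_processed * rate_per_gb
    transfer_cost = inter_region_gb * inter_region_rate_per_gb
    return connection_cost + processing_cost + transfer_cost


# Placeholder rates chosen only to make the arithmetic easy to follow.
estimate = monthly_tgw_cost(
    connections=4,
    gb_processed=1000,
    rate_per_connection_hour=0.5,
    rate_per_gb=0.25,
)
print(f"{estimate:.2f}")
```

Running the same function with forecast east-west volumes per spoke makes it visible when a chatty hub-and-spoke design is dominated by the data-processing term rather than the connection-hour term.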
Edge Services
- [Recommended] For internet-facing workloads, is Cloud Internet Services (CIS) used for DNS, DDoS mitigation, WAF, CDN, and global load balancing -- with the appropriate plan tier (Lite free for proof-of-concept, Standard for production with WAF and DDoS, Enterprise for advanced rate-limiting, custom rule sets, and bot management) -- and integrated with the VPC origin via Global Load Balancer or DNS pointer? (CIS is a Cloudflare-powered service sold and billed by IBM. It is the only IBM-native edge security service; the alternative is fronting IBM Cloud workloads directly with third-party Cloudflare / Akamai / Fastly.)
- [Recommended] For internal DNS, is DNS Services for VPC used as the authoritative resolver for private zones, with forwarding rules to on-premises DNS over Direct Link or Transit Gateway, and is the relationship to CIS public DNS explicit (CIS for external, DNS Services for VPC internal)? (Mixing the two is a frequent source of split-horizon DNS confusion.)
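The external/internal DNS split can be documented as an explicit ownership rule: names under a private zone belong to DNS Services, everything else to CIS. The sketch below is a planning aid only, not a resolver; the zone names are hypothetical and the function does not call any IBM Cloud API.

```python
def dns_owner(name, private_zones):
    """Return which service is expected to answer for a fully qualified name."""
    fqdn = name.rstrip(".").lower()
    for zone in private_zones:
        zone = zone.rstrip(".").lower()
        # Match the zone apex itself or any name beneath it.
        if fqdn == zone or fqdn.endswith("." + zone):
            return "DNS Services (private)"
    return "CIS (public)"


# Hypothetical private zones for illustration.
private_zones = ["internal.example.com", "corp.example.net"]

print(dns_owner("db1.internal.example.com", private_zones))  # DNS Services (private)
print(dns_owner("www.example.com", private_zones))  # CIS (public)
```

Writing the rule down this way makes split-horizon cases visible: any name that both services are configured to answer needs a deliberate decision about which view each client sees.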
Cross-Cloud and Hybrid Patterns
- [Critical] Is the cross-cloud connectivity path defined for IBM Cloud reaching AWS, Azure, GCP, or other clouds -- typically Megaport private connectivity to the partner cloud's ExpressRoute / Direct Connect / Cloud Interconnect, Direct Link to a partner cloud where IBM and the partner share a presence, or public-internet IPsec VPN (acceptable for low-throughput, non-production)? (Cross-cloud routing must be designed up front, not retrofitted. Public-internet routes for production database / application traffic are not acceptable. Latency, throughput, BGP routing, and DNS resolution paths all need explicit ownership.)
- [Recommended] For PowerVS-adjacent connectivity, is the path from PowerVS workspaces to VPC and to on-premises designed via Power Edge Router (PER) + Transit Gateway (the modern shape) rather than legacy Cloud Connections, and is the post-VPNaaS-EOL guidance (Power Virtual Server VPNaaS ended 14 July 2025; new VPN routes through VPC VPN via PER + Transit Gateway) reflected in the design? (Designs that still reference Power Virtual Server VPNaaS or non-PER Cloud Connections are out of date and will not provision in new workspaces. See providers/ibm/powervs.md for the PowerVS side of the same pattern.)
Why This Matters
IBM Cloud networking is two networks that share a brand -- the modern VPC infrastructure (software-defined, IAM-governed, the strategic direction) and the classic infrastructure (the SoftLayer heritage, with its own routing, IP allocation, and security model). The two coexist for legitimate reasons (some bare-metal SKUs are classic-only, some Power-adjacent and regulated patterns require classic networking, some long-lived workloads have not been migrated) but they do not share a routing plane. The bridge between them is the Transit Gateway with a VPC connection on one side and a classic connection on the other; without that bridge, a workload in VPC cannot reach a database in classic, and vice versa. The first design conversation on any IBM Cloud engagement should be: are we VPC-only, classic-only, or hybrid -- and if hybrid, where is the Transit Gateway and which attachments does it carry?
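The "which attachments does it carry" question can be captured as data and queried. The following sketch treats each Transit Gateway as a set of connections and reports two networks as routable only when one gateway carries both. It is a deliberate simplification: it ignores prefix filters, security groups, network ACLs, and route advertisement, and all gateway and network names are hypothetical.

```python
def can_route(gateways, net_a, net_b):
    """Return True if some Transit Gateway carries both networks as connections."""
    return any(net_a in conns and net_b in conns for conns in gateways.values())


# Hypothetical estate: one gateway bridging a VPC and the classic account.
gateways = {
    "tgw-local-region1": {"vpc-app", "classic-account", "direct-link-gw"},
}

print(can_route(gateways, "vpc-app", "classic-account"))  # True
print(can_route(gateways, "vpc-isolated", "classic-account"))  # False
```

In this model a VPC that is not attached to the gateway has no path to classic, which mirrors the "separate L3 islands" behavior described above.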
Direct Link is the production hybrid-connectivity option, and it has changed materially. Direct Link Connect (the partner-shared offering through Megaport / Equinix / others) is faster to provision and the right answer for most enterprise landings; Direct Link Dedicated (the single-tenant cross-connect) is for high-throughput, MACsec-required, or deterministic-latency patterns. Metering charges now apply on top of port fees as of 2025, which changes the cost model from "fixed port cost" to "port cost plus throughput cost." Redundancy at the resiliency-recommendation level (two connections, two locations) is mandatory for production. The shape that scales is Direct Link -> Direct Link Gateway -> Transit Gateway -> N VPCs / classic / Power, not Direct Link wired directly to one VPC. Designs that wire Direct Link to a single VPC duplicate connections every time another VPC needs hybrid reach.
Transit Gateway is the central network construct of any non-trivial IBM Cloud estate. The local-versus-global decision is irreversible without re-attaching every spoke, and it drives both the cost model (global traffic crosses region boundaries) and the failure-domain model (a global Transit Gateway is one fate-shared construct across regions). The pricing model has per-connection-hour and data-processing components; chatty east-west traffic between VPCs that share a Transit Gateway is real cost, not zero. The same Transit Gateway also carries the bridge to classic infrastructure (one classic-connection per account) and to PowerVS via Power Edge Router workspaces, which is what makes the Transit Gateway-centric shape strategic.
Cross-cloud connectivity is where IBM Cloud architectures most often surprise teams used to single-cloud thinking. PowerVS is the canonical example: AIX / IBM i databases on PowerVS pair with application tiers on AWS or Azure, and the data path crosses a cloud boundary. The production answer is Megaport private connectivity to the partner's ExpressRoute / Direct Connect / Cloud Interconnect; the answer to avoid for production is public-internet VPN, which carries variable latency and no SLA and is acceptable only for low-throughput, non-production traffic. The same pattern applies to IBM Cloud + Azure or IBM Cloud + AWS landings without PowerVS. The cross-cloud network is one of the load-bearing pieces of the design; treating it as an afterthought is the failure mode.
Edge services are an under-considered area in IBM Cloud designs. Cloud Internet Services (CIS) is a Cloudflare-powered offering sold and billed by IBM that provides DNS, DDoS, WAF, CDN, and global load balancing. It is the default IBM-native answer to "what protects the internet-facing endpoints?" -- with the caveat that anyone with an existing Cloudflare / Akamai / Fastly relationship may prefer to keep their existing edge provider and front IBM Cloud origins directly. The two patterns are not mutually exclusive but they need a documented owner and a documented integration path.
Common Decisions (ADR Triggers)
- VPC-only vs classic-only vs hybrid VPC + classic -- VPC (modern, strategic, required for most new services) vs classic (legacy, required for certain bare-metal and Power-adjacent patterns) vs hybrid via Transit Gateway. Greenfield should be VPC unless a specific dependency forces classic or hybrid.
- Direct Link Connect vs Direct Link Dedicated -- Connect (partner-shared, 50 Mbps-5 Gbps, faster provisioning, partner billing, no MACsec) vs Dedicated (single-tenant cross-connect, 1-10+ Gbps, IBM port billing, MACsec available). Most enterprise landings choose Connect; Dedicated is for high-throughput or MACsec-required patterns.
- Direct Link resiliency model -- single connection (dev/test only) vs two connections at one location (device-failure tolerance) vs two connections at two locations (facility-failure tolerance, production baseline) vs four connections across two locations (maximum resiliency). The wrong choice surfaces as a multi-hour outage.
- Local Transit Gateway vs Global Transit Gateway -- Local (single region, lower cost, smaller blast radius) vs Global (cross-region, higher cost, single fate-shared construct). Pick by whether cross-region routing is in scope at design time.
- Direct Link attached to VPC vs Direct Link attached to Transit Gateway -- direct-to-VPC (SoftLayer-era pattern, duplicates connections per VPC) vs Direct Link Gateway -> Transit Gateway (modern shape, one connection reaches many VPCs / classic / Power workspaces). Always Transit Gateway except in trivial single-VPC topologies.
- CIS Standard vs Enterprise vs third-party edge -- CIS Standard (DDoS, WAF, CDN at moderate scale) vs Enterprise (advanced rate-limiting, bot management, custom rule sets) vs third-party (Cloudflare, Akamai, Fastly) where an existing relationship and rule set already exist. CIS is the IBM-native default; third-party is acceptable with a documented integration.
- Cross-cloud path: Megaport vs Direct Link to partner cloud vs public-internet VPN -- Megaport private connectivity (production-grade, the IBM published reference for PowerVS-to-Azure), Direct Link to partner cloud where the IBM-and-partner location exists, or public-internet IPsec VPN (non-production only). Public-internet VPN for production database traffic is an audit-finding pattern.
- VPC VPN gateway: route-based vs policy-based -- route-based (routing-table-driven, dynamic, modern choice) vs policy-based (ACL-driven, simpler, suitable for fixed peer-side configurations). Default to route-based unless the peer-side device dictates otherwise.
- Per-subnet Public Gateway vs per-VSI Floating IP vs NAT appliance -- Public Gateway (shared egress, free, no IP control) vs Floating IPs (per-VSI public IP, predictable source IP) vs third-party NAT appliance in a transit VPC (centralized egress IPs, full control, larger ops surface). Pick by whether egress IP predictability matters to downstream partners.
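The egress decision in the last item reduces to three questions, which can be written as a small helper for ADR drafts. This is a sketch that encodes this document's guidance as stated; the function name and parameter names are hypothetical, and real designs will weigh cost and operational ownership as well.

```python
def egress_option(needs_internet, needs_predictable_ip, centralized_egress):
    """Pick an outbound-internet pattern for a subnet or workload."""
    if not needs_internet:
        # Private-only workloads rely on Direct Link, Transit Gateway, or endpoints.
        return "no internet path"
    if not needs_predictable_ip:
        return "Public Gateway"
    if centralized_egress:
        return "NAT appliance in a transit VPC"
    return "Floating IP per VSI"


print(egress_option(needs_internet=True, needs_predictable_ip=False, centralized_egress=False))
print(egress_option(needs_internet=True, needs_predictable_ip=True, centralized_egress=True))
```

Recording the three answers alongside the chosen option gives the ADR a traceable rationale when a downstream partner later asks for an allow-listed source address.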
Reference Links
- IBM Cloud VPC documentation -- VPCs, subnets, address prefixes, security groups, network ACLs, public gateways, floating IPs
- IBM Cloud Direct Link documentation -- Connect and Dedicated offerings, speeds, MACsec, BFD, BGP routing
- IBM Cloud Transit Gateway documentation -- local vs global, connection types, cross-account, GRE tunnels
- IBM Cloud Internet Services (CIS) documentation -- DNS, DDoS, WAF, CDN, plan tiers, Cloudflare-powered
- IBM Cloud VPN for VPC documentation -- route-based vs policy-based, HA, client-to-site
- IBM Cloud DNS Services for VPC -- private DNS zones for VPC workloads
- Direct Link Connect via Megaport (cross-cloud) -- Megaport as a Direct Link provider for cross-cloud landings
- IBM Cloud network architecture reference -- multi-VPC, hub-spoke, and hybrid reference architectures
- PowerVS + VPC + classic via Transit Gateway -- canonical reference tutorial
- IBM Cloud VPC service endpoints (CSE and VPE) -- private-network paths to platform services
See Also
- providers/ibm/cloud-platform.md -- account hierarchy, IAM, regions, billing, key management
- providers/ibm/powervs.md -- PowerVS networking via Power Edge Router and Cloud Connections; post-VPNaaS-EOL guidance
- providers/ibm/roks-iks.md -- ROKS / IKS networking on VPC vs classic
- providers/aws/networking.md -- AWS Transit Gateway, Direct Connect, PrivateLink (the AWS side of cross-cloud landings)
- providers/azure/networking.md -- Azure VNet, ExpressRoute, Private Link (the Azure side of cross-cloud landings)
- providers/cloudflare/cdn-dns.md -- direct-Cloudflare alternative for customers not using CIS
- general/networking.md -- cloud-agnostic networking patterns including segmentation and resilience
- patterns/hybrid-cloud.md -- hybrid cloud architecture patterns including PowerVS-anchored hybrid
- patterns/multi-cloud.md -- multi-cloud patterns where IBM Cloud hosts part of the workload and another hyperscaler hosts the rest