AWS Migration Services¶

Scope¶

AWS services for workload and data migration to the cloud. Covers Application Migration Service (MGN) for server rehost, DataSync for high-performance data transfer, Transfer Family for managed file transfer protocols, Migration Hub for centralized tracking, and Application Discovery Service.

AWS provides a suite of purpose-built migration services that address different phases and patterns of workload and data movement to the cloud. This file covers Application Migration Service (MGN) for server rehost (lift-and-shift) to EC2, DataSync for high-performance data transfer between on-premises storage and AWS, Transfer Family for managed file transfer protocols (SFTP/FTPS/FTP/AS2) fronting S3 and EFS, and Migration Hub for centralized discovery and tracking across all migration tools. Cross-reference with general/workload-migration.md for migration strategy patterns, general/data-migration-tools.md for database-layer migration (DMS), providers/aws/ec2-asg.md for target compute design, providers/aws/s3.md for storage destinations, and patterns/migration-cutover.md for cutover planning.

Checklist¶

Why This Matters¶

Migration failures rarely stem from the replication technology itself. They stem from insufficient testing, missing post-launch automation, and poor coordination across tools. MGN replicates blocks reliably, but if the launch template puts the server in the wrong subnet, or the application configuration still points to on-premises database endpoints, the migrated server starts up and immediately fails. Every hour of cutover-window troubleshooting erodes stakeholder confidence and risks a rollback decision that wastes weeks of preparation.

Data migration has its own failure modes. DataSync can move petabytes efficiently, but without bandwidth throttling it can saturate a shared WAN link and disrupt production operations. Without data verification enabled, corrupted files may not be detected until weeks after the migration when an application reads them. Transfer Family simplifies file exchange but choosing the wrong endpoint type (public vs. VPC) can either expose the service unnecessarily or make it unreachable by external partners.

Migration Hub exists because large migrations involve dozens of servers, multiple databases, and terabytes of file data — all moving through different tools on different timelines. Without a single pane of glass, teams lose track of which servers have been tested, which are mid-replication, and which are ready for cutover. Discovery data from Application Discovery Service prevents the common failure of migrating a server without realizing it depends on another server that is scheduled for a later wave.

Common Decisions (ADR Triggers)¶

Rehost (MGN) vs. replatform vs. refactor [Critical]¶

MGN performs a lift-and-shift: the server arrives in EC2 with the same OS, applications, and configuration. This is the fastest path but carries technical debt (on-premises assumptions baked into the workload). Replatforming (e.g., moving to RDS instead of a self-managed database on EC2) adds migration effort but reduces operational burden. Refactoring (rewriting for cloud-native services) provides the most benefit but takes the most time. Document the migration strategy per workload tier and the criteria used to assign each workload to a strategy (time pressure, application complexity, vendor support, licensing).

MGN agent-based replication vs. agentless (vCenter) [Recommended]¶

Agent-based replication installs a lightweight agent on each source server that performs continuous block-level replication directly to AWS. This works across all supported platforms (VMware, Hyper-V, physical, Azure VMs, GCP VMs). Agentless replication (via vCenter connector) avoids installing agents on source servers but requires VMware infrastructure and has limitations on supported OS versions and disk configurations. Choose agent-based when source servers span multiple hypervisors or include physical machines. Choose agentless when the environment is purely VMware and the organization prohibits installing third-party agents on production servers. Document the source platform mix and agent installation policy.

DataSync vs. S3 CLI/SDK vs. Snow Family for bulk data transfer [Critical]¶

DataSync provides managed, high-performance transfer with scheduling, throttling, verification, and incremental sync — ideal for ongoing or large one-time transfers over network. S3 CLI (aws s3 sync) is simpler but lacks built-in throttling, scheduling, and verification — suitable for smaller, ad-hoc transfers. Snow Family bypasses the network entirely for multi-terabyte transfers where bandwidth is insufficient (see providers/aws/snow-family.md). Document the data volume, available bandwidth, transfer frequency (one-time vs. ongoing), and whether the data must be verified post-transfer.

Transfer Family endpoint type: public vs. VPC [Recommended]¶

Public endpoints are accessible from the internet with an AWS-provided or custom hostname — appropriate when external partners (customers, vendors) need to upload or download files. VPC endpoints restrict access to traffic arriving over the VPC, Direct Connect, or VPN — appropriate for internal transfers or when partners connect over private circuits. A public endpoint with an allowlisted IP range via security groups is a middle ground but adds management overhead. Document the connectivity model of each file transfer partner and the organization's security policy on internet-facing services.

Transfer Family identity provider: service-managed vs. AD vs. custom [Optional]¶

Service-managed users are the simplest (SSH keys or passwords stored in Transfer Family) but create a separate identity silo. AWS Directory Service integration authenticates users against an existing Active Directory — appropriate when file transfer users already have AD accounts. A custom Lambda-backed identity provider enables arbitrary authentication logic (e.g., querying an external user database, enforcing MFA) but requires developing and maintaining the Lambda function. Document the number of file transfer users, whether they already have AD accounts, and whether custom authentication logic is required.

Discovery approach: agentless (vCenter connector) vs. agent-based [Recommended]¶

Agentless discovery via the Application Discovery Service vCenter connector collects VM inventory, CPU/memory/disk configuration, and utilization metrics without installing anything on guest operating systems. Agent-based discovery installs the AWS Discovery Agent on each server and additionally captures running processes, network connections, and inter-server dependencies. Agentless is faster to deploy and less intrusive but cannot map application-level dependencies. Agent-based provides the network dependency data needed to group servers into migration waves. Document the scope of the migration, whether dependency mapping is required, and whether the organization permits agent installation on production servers.