6.5 AWS DataSync and Transfer Family

Key Takeaways

  • AWS DataSync automates and accelerates online data transfer between on-premises storage and AWS (S3, EFS, FSx) with built-in encryption, scheduling, filtering, and post-transfer integrity validation.
  • DataSync uses a purpose-built transfer protocol that AWS cites as up to 10x faster than open-source tools such as rsync. On-premises sources require a DataSync agent; transfers between AWS storage services do not.
  • AWS Transfer Family is a fully managed SFTP, FTPS, FTP, and AS2 service backed by S3 or EFS — it preserves existing partner file-exchange workflows without changing their clients.
  • Use DataSync for migration and recurring synchronization; use Transfer Family when the requirement is a specific file-transfer protocol (especially SFTP) for external partners.
  • DataSync pricing is per-gigabyte transferred, and it supports bandwidth throttling and include/exclude filters to control cost and scope.
Last updated: June 2026

AWS DataSync — Fast, Automated Online Transfer

Quick Answer: DataSync = fast, automated, online data transfer between on-premises and AWS storage (S3/EFS/FSx) with scheduling, filtering, and integrity checks. Transfer Family = managed SFTP/FTPS/FTP/AS2 backed by S3 or EFS so partners keep their existing clients. Exam shorthand: "bulk migrate or sync files over the network" → DataSync; "partners upload via SFTP" → Transfer Family.

DataSync moves data over the network (not by shipping disks) and is built for both one-time migrations and recurring synchronization. For on-premises sources, you deploy a DataSync agent as a VM (or on an EC2 instance) close to the NFS, SMB, HDFS, or object storage it reads from; AWS-to-AWS transfers need no agent.

Feature | Detail
Throughput | Up to ~10 Gbps; roughly 10x faster than rsync/open-source tools
Sources | NFS, SMB, HDFS, self-managed object storage, S3, EFS, FSx
Destinations | S3, EFS, FSx for Windows, FSx for Lustre, FSx for OpenZFS, FSx for NetApp ONTAP
Encryption | TLS in transit; destination-side encryption at rest
Validation | Automatic data-integrity verification after each task
Scheduling | Hourly/daily/weekly tasks for ongoing sync
Bandwidth | Configurable throttling to protect production links
Filtering | Include/exclude patterns for selective transfer
Pricing | Per-GB transferred (no separate license)

DataSync vs. scripted copies

Capability | DataSync | S3 CLI / rsync
Speed | Purpose-built, ~10x faster | Standard network speed
Scheduling | Built-in task scheduler | Needs cron/external tooling
Integrity check | Automatic | Manual
Encryption | TLS by default | Must configure
Operations | AWS-managed | Self-managed scripts

Worked example: Migrate 50 TB from on-prem NFS to Amazon EFS, then keep the two in sync nightly. Deploy a DataSync agent on the NFS network, create an NFS→EFS task, and run the initial full transfer. Then schedule the same task nightly: each run copies only data that changed since the previous run, and include filters can narrow the task to specific datasets. DataSync verifies integrity at the end of each run.
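The nightly task in this example can be sketched as a request payload. The block below is a minimal illustration, not a definitive configuration: it only builds a dictionary and calls no AWS API. The field names follow the DataSync CreateTask request shape as commonly documented and should be checked against the current API reference. The task name, ARNs, paths, schedule time, and the helper names `mbps_to_bytes_per_second` and `build_nightly_task_request` are hypothetical.

```python
import json


def mbps_to_bytes_per_second(mbps: float) -> int:
    """Convert a bandwidth cap in megabits per second to bytes per second."""
    return int(mbps * 1_000_000 / 8)


def build_nightly_task_request(source_arn, dest_arn, include_paths, cap_mbps):
    """Build a dict shaped like a DataSync CreateTask request (assumed shape).

    Nothing is sent to AWS; the caller decides what to do with the dict.
    """
    return {
        "Name": "nfs-to-efs-nightly",  # hypothetical task name
        "SourceLocationArn": source_arn,
        "DestinationLocationArn": dest_arn,
        # Run at 02:00 every day (schedule expressions are evaluated in UTC).
        "Schedule": {"ScheduleExpression": "cron(0 2 * * ? *)"},
        # Include filters are passed as one pipe-delimited pattern string.
        "Includes": [
            {"FilterType": "SIMPLE_PATTERN", "Value": "|".join(include_paths)}
        ],
        "Options": {
            "TransferMode": "CHANGED",  # copy only data that differs
            "VerifyMode": "ONLY_FILES_TRANSFERRED",  # verify what was copied
            "BytesPerSecond": mbps_to_bytes_per_second(cap_mbps),  # throttle
        },
    }


if __name__ == "__main__":
    request = build_nightly_task_request(
        "arn:aws:datasync:REGION:ACCOUNT:location/loc-SOURCE",  # placeholder
        "arn:aws:datasync:REGION:ACCOUNT:location/loc-DEST",  # placeholder
        ["/projects", "/home"],
        cap_mbps=400,
    )
    print(json.dumps(request, indent=2))
```

With a 400 Mbps cap, the throttle value works out to 50,000,000 bytes per second. Keeping the payload as plain data makes the three exam-relevant knobs (schedule, filters, throttle) easy to see side by side.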

AWS Transfer Family — Managed File-Transfer Protocols

Many organizations have partners and vendors hard-wired to SFTP/FTPS/FTP scripts or B2B AS2 flows. Transfer Family gives those workflows a managed AWS endpoint while the files land directly in S3 or EFS — no protocol change for the partner.

Feature | Detail
Protocols | SFTP, FTPS, FTP, AS2 (B2B EDI)
Backend storage | Amazon S3 or Amazon EFS
Authentication | Service-managed users, Active Directory, or a custom identity provider (Lambda or API Gateway)
Endpoint types | Public, VPC, or VPC with internet-facing Elastic IPs
Scaling/HA | Fully managed, auto-scaling, multi-AZ
Auditing | Integrates with CloudWatch and CloudTrail

Choosing between the two

Requirement | Service
Bulk migrate or nightly-sync file shares to AWS | DataSync
Partners must keep using their SFTP client | Transfer Family
Move data between two AWS file systems | DataSync (no agent)
Inbound B2B EDI document exchange (AS2) | Transfer Family

Worked example: A retailer's 200 suppliers drop nightly inventory files via SFTP to an aging on-prem server. Stand up a Transfer Family SFTP server with an S3 backend, give each supplier a service-managed user mapped to its own S3 prefix, and update DNS so existing SFTP scripts connect unchanged. Files now arrive in S3 and trigger downstream Lambda processing.

Common trap: Choosing Transfer Family for a one-time 50 TB migration. Transfer Family is about protocol compatibility for ongoing exchange, not high-throughput bulk migration — that is DataSync (online) or Snow Family (offline). Conversely, do not pick DataSync when the explicit requirement is "partners must use SFTP/FTPS," because DataSync exposes no such protocol endpoint.

On the Exam: "Migrate terabytes from on-prem NFS to EFS over the network, with scheduling and validation" → DataSync. "External partners need SFTP access landing in S3 without changing their tooling" → AWS Transfer Family.

Where each fits among the transfer services

The SAA-C03 exam clusters several transfer options, and the differentiator is almost always the requirement's keyword. DataSync wins on "fast online migration" and "scheduled sync with validation." Transfer Family wins on a named protocol ("SFTP," "FTPS," "AS2") for external parties. Snow Family wins on "limited bandwidth" plus large volume. Storage Gateway wins on "hybrid access" where on-premises apps need ongoing low-latency reads of data cached locally while it is durably stored in AWS. S3 Replication wins only for bucket-to-bucket copies inside AWS.

Reading the requirement keyword first, before the answer choices, prevents the classic mistake of choosing DataSync for an SFTP scenario or Transfer Family for a bulk migration.
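As a study aid, the keyword-first habit can be written as a small lookup. This is a toy sketch, not an official decision procedure: the keyword lists are assumptions drawn from the phrases in this section, and `RULES` and `pick_service` are hypothetical names. Rule order matters, because a named protocol should win even when the question also mentions migration.

```python
# Ordered rules: the first rule with a matching keyword decides the answer.
RULES = [
    (("sftp", "ftps", "ftp", "as2"), "AWS Transfer Family"),
    (("limited bandwidth", "offline"), "AWS Snow Family"),
    (("hybrid access", "low-latency", "cached locally"), "AWS Storage Gateway"),
    (("bucket-to-bucket",), "S3 Replication"),
    (("migrate", "sync", "scheduling", "validation"), "AWS DataSync"),
]


def pick_service(requirement: str) -> str:
    """Return the service suggested by the first matching keyword rule."""
    text = requirement.lower()
    for keywords, service in RULES:
        if any(keyword in text for keyword in keywords):
            return service
    return "unclear: reread the requirement"


if __name__ == "__main__":
    print(pick_service("Partners need SFTP access landing in S3"))
    print(pick_service("Migrate terabytes from NFS to EFS with validation"))
```

Real exam questions need judgment beyond substring matching; the point of the sketch is only the precedence order.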

Cost and operational considerations

DataSync charges per gigabyte transferred, so for very large one-time moves over a constrained link you should compare its cost and elapsed time against shipping a Snowball Edge device — past a certain volume-to-bandwidth ratio, offline shipping is both cheaper and faster. DataSync's bandwidth throttling lets you cap throughput during business hours so migration traffic does not starve production, and include/exclude filters restrict a task to specific paths.
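The volume-to-bandwidth comparison is simple arithmetic. The sketch below estimates online transfer time and compares it with an offline turnaround that you supply. The 80% link utilization default and the helper names are assumptions for illustration, and the offline turnaround is an input you must estimate yourself; no AWS pricing or shipping figures are built in.

```python
SECONDS_PER_DAY = 86_400


def online_transfer_days(size_tb: float, link_mbps: float,
                         utilization: float = 0.8) -> float:
    """Estimate days needed to push size_tb (decimal TB) over a link.

    utilization is the assumed share of the link the transfer can use.
    """
    bits_to_move = size_tb * 1e12 * 8
    usable_bits_per_second = link_mbps * 1e6 * utilization
    return bits_to_move / usable_bits_per_second / SECONDS_PER_DAY


def prefer_offline(size_tb: float, link_mbps: float, offline_days: float,
                   utilization: float = 0.8) -> bool:
    """True when the online estimate exceeds the assumed offline turnaround."""
    return online_transfer_days(size_tb, link_mbps, utilization) > offline_days


if __name__ == "__main__":
    for mbps in (100, 1000):
        days = online_transfer_days(50, mbps)
        print(f"50 TB over {mbps} Mbps: about {days:.1f} days")
```

Under these assumptions, 50 TB takes roughly 5.8 days on a 1 Gbps link but about 57.9 days on a 100 Mbps link, which is where an offline device starts to look attractive.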

Transfer Family is billed per enabled protocol-hour plus data transferred, and because it is fully managed and multi-AZ, it removes the need to patch and scale your own SFTP servers. For partner onboarding, map each Transfer Family user to a distinct S3 prefix and a least-privilege session policy so vendors can only see their own drop folder — the same per-identity isolation principle used with Cognito Identity Pools earlier in this chapter.
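A per-supplier session policy of the kind described above might look like the following. This is a minimal sketch using standard IAM policy JSON; the bucket name, prefix layout, the chosen S3 actions, and the helper name `supplier_session_policy` are assumptions for illustration. A session policy can only narrow what the user's IAM role already allows, so the role itself must still grant access to the bucket.

```python
import json


def supplier_session_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy document limiting a user to one S3 prefix."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is allowed only inside the supplier's own folder.
                "Sid": "ListOwnFolderOnly",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {
                    "StringLike": {"s3:prefix": [prefix, f"{prefix}/*"]}
                },
            },
            {
                # Uploads and downloads are limited to objects under the prefix.
                "Sid": "ReadWriteOwnObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and supplier folder.
    policy = supplier_session_policy("example-inbound-bucket", "suppliers/acme")
    print(json.dumps(policy, indent=2))
```

Generating the policy from the supplier's prefix keeps all 200 users on one template, so an isolation mistake is less likely than with hand-edited documents.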

Test Your Knowledge

A company must migrate 50 TB from on-premises NFS storage to Amazon EFS over the network as quickly as possible, with scheduling and automatic integrity validation. Which service is best?


Business partners currently upload files via SFTP to an aging on-premises server. The company wants the workflow on AWS without partners changing their SFTP clients, storing files in S3. Which service should they use?


An architect needs recurring nightly synchronization between an on-premises NFS file system and Amazon S3, with automatic integrity validation and bandwidth throttling during business hours. Which service fits?
