6.5 AWS DataSync and Transfer Family
Key Takeaways
- AWS DataSync automates and accelerates data transfer between on-premises storage and AWS (S3, EFS, FSx) with built-in encryption and data validation.
- DataSync can transfer data up to 10x faster than open-source tools by using a purpose-built network protocol.
- AWS Transfer Family provides managed SFTP, FTPS, FTP, and AS2 services backed by S3 or EFS for existing file transfer workflows.
- DataSync is for migration and ongoing synchronization; Transfer Family is for file exchange workflows that require SFTP/FTP protocols.
- DataSync supports scheduling, bandwidth throttling, and filtering to control what data is transferred and when.
Quick Answer: DataSync = fast, automated data transfer between on-premises storage and AWS (S3/EFS/FSx) with scheduling and validation. Transfer Family = managed SFTP/FTPS/FTP/AS2 endpoints backed by S3 or EFS for file exchange workflows. Use DataSync for migration and sync; use Transfer Family when clients need SFTP-style access to S3 or EFS.
AWS DataSync
| Feature | Detail |
|---|---|
| Speed | Up to 10 Gbps per task; up to 10x faster than open-source tools |
| Sources | NFS, SMB, HDFS, self-managed object storage, S3, EFS, FSx |
| Destinations | S3, EFS, FSx for Windows, FSx for Lustre, FSx for OpenZFS, FSx for NetApp ONTAP |
| Agent | Deploy agent on-premises (not needed for AWS-to-AWS transfers) |
| Encryption | In-transit (TLS) and at-rest (destination encryption) |
| Validation | Data integrity verification after transfer |
| Scheduling | Hourly, daily, or weekly schedules, or a custom cron expression |
| Bandwidth | Configurable bandwidth throttling |
| Filtering | Include/exclude patterns for selective transfer |
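
The scheduling, bandwidth, filtering, and validation rows above map onto parameters of a single DataSync task. The sketch below only assembles the keyword arguments for boto3's `create_task` call; the task name, cron expression, throttle value, and exclude patterns are illustrative assumptions, and the location ARNs must come from locations you have already created.

```python
from __future__ import annotations

from typing import Any


def build_datasync_task_request(
    source_location_arn: str,
    destination_location_arn: str,
    *,
    name: str = "nightly-nfs-to-s3",                 # hypothetical task name
    schedule_expression: str = "cron(0 2 * * ? *)",  # assumed: daily at 02:00 UTC
    bytes_per_second: int = 50 * 1024 * 1024,        # assumed throttle, about 50 MiB/s
    exclude_patterns: tuple[str, ...] = ("*.tmp", "/scratch"),  # assumed patterns
) -> dict[str, Any]:
    """Assemble keyword arguments for the DataSync create_task API call."""
    request: dict[str, Any] = {
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Name": name,
        "Options": {
            # Verify the integrity of the files this task run transferred.
            "VerifyMode": "ONLY_FILES_TRANSFERRED",
            # Bandwidth throttle in bytes per second (-1 means unlimited).
            "BytesPerSecond": bytes_per_second,
        },
        "Schedule": {"ScheduleExpression": schedule_expression},
    }
    if exclude_patterns:
        # DataSync expects one filter rule whose patterns are joined with "|".
        request["Excludes"] = [
            {"FilterType": "SIMPLE_PATTERN", "Value": "|".join(exclude_patterns)}
        ]
    return request


# With credentials and real location ARNs, the request could be sent like this:
#   import boto3
#   boto3.client("datasync").create_task(**build_datasync_task_request(src_arn, dst_arn))
```

Keeping the request construction separate from the API call makes the settings easy to review or unit test without touching an AWS account.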
DataSync vs. S3 CLI / rsync
| Feature | DataSync | S3 CLI/rsync |
|---|---|---|
| Speed | Up to 10x faster (purpose-built protocol) | Depends on tool parallelism and manual tuning |
| Scheduling | Built-in | Requires cron/scheduling tool |
| Validation | Automatic integrity checks | Manual |
| Encryption | Automatic TLS | AWS CLI uses HTTPS by default; rsync needs SSH or a tunnel |
| Management | AWS managed | Self-managed |
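
To put the speed row in perspective, a back-of-the-envelope estimate shows how link throughput translates into migration time. This is a rough sketch under idealized assumptions: decimal terabytes, a fully dedicated link, and no overhead from small files or source storage limits. The `utilization` parameter is a hypothetical knob for discounting the link.

```python
def transfer_hours(data_terabytes: float, link_gbps: float, utilization: float = 1.0) -> float:
    """Idealized transfer time in hours for a data set over a network link."""
    if data_terabytes < 0 or link_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("size must be >= 0, link speed > 0, utilization in (0, 1]")
    bits = data_terabytes * 1e12 * 8                # decimal TB to bits
    seconds = bits / (link_gbps * 1e9 * utilization)
    return seconds / 3600


# 50 TB over a fully used 10 Gbps link: about 11.1 hours.
print(f"{transfer_hours(50, 10):.1f} h")
# The same data over a 1 Gbps link: about 111.1 hours.
print(f"{transfer_hours(50, 1):.1f} h")
```

Real transfers are usually slower than this lower bound, so treat the result as a planning floor rather than a prediction.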
AWS Transfer Family
| Feature | Detail |
|---|---|
| Protocols | SFTP, FTPS, FTP, AS2 |
| Backend | S3 or EFS |
| Authentication | Service-managed users, Active Directory, custom identity provider (Lambda or API Gateway) |
| Endpoint | Public, VPC, or VPC with Elastic IP |
| Use case | Partners/vendors uploading files via SFTP without changing their workflows |
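
As a minimal sketch of the partner-upload use case, the helpers below assemble requests for boto3's Transfer Family `create_server` and `create_user` calls: a public SFTP endpoint backed by S3 with service-managed users. The server ID, user name, role ARN, bucket name, and key shown in the usage comment are placeholders, and the per-user home directory layout is an assumption.

```python
from __future__ import annotations

from typing import Any


def build_sftp_server_request() -> dict[str, Any]:
    """Request for a public SFTP endpoint backed by S3 with service-managed users."""
    return {
        "Protocols": ["SFTP"],
        "Domain": "S3",                             # storage backend: "S3" or "EFS"
        "IdentityProviderType": "SERVICE_MANAGED",  # users and keys stored by the service
        "EndpointType": "PUBLIC",                   # a VPC endpoint also needs EndpointDetails
    }


def build_sftp_user_request(
    server_id: str, user_name: str, role_arn: str, bucket: str, ssh_public_key: str
) -> dict[str, Any]:
    """Request that gives one partner a key-authenticated user with its own prefix."""
    return {
        "ServerId": server_id,
        "UserName": user_name,
        "Role": role_arn,                           # IAM role granting access to the bucket
        "HomeDirectory": f"/{bucket}/{user_name}",  # assumed layout: one prefix per partner
        "SshPublicKeyBody": ssh_public_key,
    }


# With credentials, the requests could be sent like this (all values are placeholders):
#   import boto3
#   transfer = boto3.client("transfer")
#   server = transfer.create_server(**build_sftp_server_request())
#   transfer.create_user(**build_sftp_user_request(
#       server["ServerId"], "partner-a", role_arn, "example-bucket", public_key))
```

Partners keep using their existing SFTP clients; only the hostname they connect to changes.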
On the Exam: "Migrate terabytes of data from on-premises NFS to EFS" → DataSync. "Partners need to upload files via SFTP to S3" → AWS Transfer Family.
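
The exam rule of thumb can be written as a tiny decision helper. This is only a study aid that encodes the two cues above; the function and parameter names are made up for illustration.

```python
def recommend_service(clients_need_file_protocol: bool, bulk_migration_or_sync: bool) -> str:
    """Map the two exam cues to a service name."""
    if clients_need_file_protocol:
        # Partners or applications must keep using SFTP, FTPS, FTP, or AS2.
        return "AWS Transfer Family"
    if bulk_migration_or_sync:
        # Moving or regularly synchronizing data between storage systems.
        return "AWS DataSync"
    return "Neither cue applies"


print(recommend_service(False, True))   # NFS to EFS migration
print(recommend_service(True, False))   # partners uploading via SFTP
```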
A company needs to migrate 50 TB of data from their on-premises NFS storage to Amazon EFS as quickly as possible. Which service provides the fastest transfer?
A company's business partners currently upload files via SFTP to an on-premises server. The company wants to move this workflow to AWS without requiring partners to change their SFTP clients. Which service should they use?
Which service should be used to schedule recurring data synchronization between an on-premises NFS file system and Amazon S3, with automatic integrity validation?