Amazon S3 Glacier is a very low-cost storage service intended primarily for data archiving and backup.
Key points for Amazon S3 Glacier
- Storage is designed for infrequently accessed data, with the expectation that requested data is returned within several hours (rather than milliseconds or even minutes)
- You can store up to 40 TB per Archive, and can have unlimited Archives
- Each single upload request can be up to 4 GB
- For archives larger than 100 MB, it’s recommended to use Multipart upload capability
- You can upload data directly to Glacier, or move it there from S3 via a Lifecycle management policy
- 99.999999999% durability (eleven 9s)
- 3+ AZ replication
- Supports SSL for data in transit and encryption for data at rest
- Minimum charges of 90 days of storage and 40 KB per object; a fee applies to object retrieval
- Retrieval: three configurable options, ranging from minutes to hours
- You can use an S3 Inventory report to get a CSV, ORC, or Parquet listing of your Objects and their Metadata, on a daily or weekly basis, for an S3 Bucket or Prefix.
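The upload limits in the list above can be turned into a small planning helper. The following is a minimal sketch, not an AWS API: the function name `choose_upload_plan` is hypothetical, sizes are treated as binary units (an assumption), and the limits of 10,000 parts per upload and power-of-two part sizes between 1 MB and 4 GB are taken from the Glacier multipart upload documentation rather than from this article.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

MULTIPART_THRESHOLD = 100 * MIB  # multipart is recommended above this size
MAX_PART_SIZE = 4 * GIB          # assumption: largest allowed part size
MAX_PARTS = 10_000               # assumption: part limit per multipart upload


def choose_upload_plan(archive_size: int) -> dict:
    """Suggest single vs. multipart upload for an archive of the given size in bytes."""
    if archive_size <= 0:
        raise ValueError("archive_size must be positive")

    # Small archives fit comfortably in one request.
    if archive_size <= MULTIPART_THRESHOLD:
        return {"mode": "single", "part_size": archive_size, "parts": 1}

    # Pick the smallest power-of-two part size (in MiB) that keeps the
    # part count within the limit. Smallest valid, not necessarily fastest.
    part_size = MIB
    while part_size * MAX_PARTS < archive_size:
        part_size *= 2
    if part_size > MAX_PART_SIZE:
        raise ValueError("archive is too large for a single multipart upload")

    parts = -(-archive_size // part_size)  # ceiling division
    return {"mode": "multipart", "part_size": part_size, "parts": parts}


if __name__ == "__main__":
    print(choose_upload_plan(50 * MIB))
    print(choose_upload_plan(10 * 1024 * GIB))
```

The helper picks the smallest part size that is valid under the stated limits; in practice a larger part size may be preferable to reduce the number of requests.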
Key Components of Amazon S3 Glacier
Archive
- Data is stored in Glacier as an “archive” – each archive is a single file, or a group of files bundled (zip, tar, etc.) into a single file
- An archive can be up to 40 TB, and you can have unlimited number of archives
- Each archive is immutable: you cannot change an existing archive, but you can delete it and upload a new one.
Vault
- A Vault is like a storage unit for your Archives, which you can put a lock on
- Each Vault can hold multiple Archives – this is how you group logically related Archives in a single Vault
- Each AWS Account can have up to 1,000 Vaults per AWS Region
- You can attach notifications to the Vault
- A Vault can be deleted only once it is empty; Archives deleted within 90 days of upload incur an early deletion fee
Vault Lock
- Vault Lock enables controls over the Vault via Vault Lock Policy
- A Lock keeps your Vault safe by applying policies such as "no deletes allowed" or "Multi-factor Authentication (MFA) required"
- Lock enables you to enforce compliance controls
- Once locked, the Vault Lock Policy becomes immutable. Vault Lock ensures that a Lock Policy cannot be deleted or altered until there are no more Archives to protect in the Vault.
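As an illustration of the "no deletes allowed" style of control described above, here is a minimal sketch that builds a Vault Lock Policy document as JSON. The helper name `build_vault_lock_policy`, the example ARN, and the 365-day threshold are all hypothetical; the `glacier:DeleteArchive` action and the `glacier:ArchiveAgeInDays` condition key follow the pattern shown in AWS's Vault Lock examples, but verify them against the current documentation before use.

```python
import json


def build_vault_lock_policy(vault_arn: str, min_age_days: int) -> str:
    """Return a JSON policy that denies deleting archives younger than min_age_days."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-early-archive-deletes",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "glacier:DeleteArchive",
                "Resource": vault_arn,
                # Deny applies while the archive is younger than the threshold.
                "Condition": {
                    "NumericLessThan": {
                        "glacier:ArchiveAgeInDays": str(min_age_days)
                    }
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Hypothetical ARN, for illustration only.
    arn = "arn:aws:glacier:us-west-2:123456789012:vaults/examplevault"
    print(build_vault_lock_policy(arn, 365))
```

Because a locked policy is immutable, it is worth reviewing the generated document carefully before the lock is completed.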
Access Policies
- Vault Access Policy – you can define resource-based policies for your Vault, to specify who has access and what actions can be performed
- User-based Policy – you can define user-based policies for IAM users / groups / roles to specify what permissions (e.g., read / write / delete) they have on your Vaults
Retrieval from Amazon S3 Glacier
There are three retrieval options, which trade retrieval speed against cost:
- Bulk – 5-12 hours
- Standard – 3-5 hours
- Expedited – 1-5 minutes
- Provisioned Capacity Units (PCUs) – you can purchase PCUs, which guarantee that capacity for Expedited retrievals is available when you need it. Each PCU ensures that at least 3 Expedited Retrievals can be performed every 5 minutes, at up to 150 MB/s of retrieval throughput.
- Range Retrieval – enables you to retrieve part of an archive. You specify a byte range that can start at zero (the start of your archive) or at any 1 MB interval thereafter. The end of the range can be either the end of your archive or any 1 MB interval greater than the start of your range.
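The 1 MB alignment rule for Range Retrieval described above can be sketched as a small helper that widens an arbitrary byte range to the nearest valid boundaries. This is an illustrative sketch only: the function names are hypothetical, 1 MB is assumed to mean 1,048,576 bytes, and the `start-end` string with an inclusive end byte is an assumption about how the range is passed to the retrieval request.

```python
MIB = 1024 * 1024  # assumption: "1 MB" means 1,048,576 bytes


def aligned_range(start: int, end: int, archive_size: int) -> tuple:
    """Widen the inclusive byte range [start, end] to megabyte-aligned boundaries.

    The start is rounded down to a 1 MiB boundary. The end is rounded up so the
    range stops just before a 1 MiB boundary, or at the last byte of the archive.
    """
    if not (0 <= start <= end < archive_size):
        raise ValueError("range must lie inside the archive")

    aligned_start = (start // MIB) * MIB
    aligned_end = min((end // MIB + 1) * MIB, archive_size) - 1
    return aligned_start, aligned_end


def range_string(start: int, end: int, archive_size: int) -> str:
    """Format the aligned range as 'start-end' (inclusive end byte)."""
    s, e = aligned_range(start, end, archive_size)
    return f"{s}-{e}"


if __name__ == "__main__":
    # Bytes 1,500,000..3,000,000 of a 10 MiB archive.
    print(range_string(1_500_000, 3_000_000, 10 * MIB))
```

The caller then discards the extra leading and trailing bytes after the retrieval completes, since the aligned range is always a superset of the requested one.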
Pricing
Amazon S3 Glacier is billed for the following components:
- Storage – per GB per month
- Retrieval – per GB retrieved and per 1,000 requests; the price varies by the associated Retrieval Time (Expedited / Standard / Bulk)
- Provisioned Expedited Retrieval – per Provisioned Capacity Unit
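To show how the billing components above combine, here is a minimal cost-estimate sketch. All rates in the example are made-up placeholders, not actual AWS prices, and the `Rates` class and `estimate_monthly_cost` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-unit prices. Fill these in from the current AWS price list."""
    storage_per_gb_month: float
    retrieval_per_gb: float       # depends on the retrieval tier chosen
    per_1000_requests: float      # depends on the retrieval tier chosen
    per_pcu_month: float = 0.0    # Provisioned Capacity Unit price


def estimate_monthly_cost(rates: Rates, stored_gb: float, retrieved_gb: float,
                          requests: int, pcus: int = 0) -> dict:
    """Break a month's bill into the components listed above."""
    breakdown = {
        "storage": stored_gb * rates.storage_per_gb_month,
        "retrieval_size": retrieved_gb * rates.retrieval_per_gb,
        "retrieval_requests": (requests / 1000) * rates.per_1000_requests,
        "provisioned_capacity": pcus * rates.per_pcu_month,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


if __name__ == "__main__":
    # Placeholder rates for illustration only, NOT real AWS pricing.
    example = Rates(storage_per_gb_month=0.004, retrieval_per_gb=0.01,
                    per_1000_requests=0.05)
    print(estimate_monthly_cost(example, stored_gb=1000, retrieved_gb=100,
                                requests=2000))
```

Minimum-duration and minimum-size charges are deliberately left out of this sketch to keep it short.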
External Resources