Amazon Redshift is a fully managed, petabyte-scale Data Warehouse that provides a simple and cost-effective solution for analytics.

 

Key Points for Amazon Redshift

  • Redshift is a Data Warehouse designed for Online Analytical Processing (OLAP)
  • Redshift provides fast query performance through Columnar Storage, Data Compression, Parallel Processing, and Zone Maps.
  • Federated Query – allows you to query live data in your operational, relational databases directly from Redshift.
  • You can join data from Redshift Data Warehouse, Data Lake, and Operational stores
  • Redshift RA3 with managed storage uses high-performance SSDs for hot data and S3 for cold data, giving you high performance where it is needed and cost-effective storage for the rest.
  • Caching – Redshift leverages result caching to deliver sub-second response times for repeat queries.
  • Redshift supports most of the popular Business Intelligence tools in the market.
  • Backups – Data in Redshift is automatically backed up to S3, and Snapshots can be set up to replicate asynchronously to S3 in another Region for disaster recovery (DR).
  • Security – Redshift supports both at-rest and in-transit encryption of the data.
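
To illustrate how Zone Maps contribute to query speed, here is a toy sketch in Python. It assumes each storage block records the minimum and maximum value of a column, so a range query can skip any block whose range cannot match. The `Block` class and `scan_with_zone_map` function are hypothetical names for illustration only, not Redshift internals.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A toy storage block holding values from a single column."""
    values: List[int]

    @property
    def min_value(self) -> int:
        return min(self.values)

    @property
    def max_value(self) -> int:
        return max(self.values)


def scan_with_zone_map(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return matching values and the number of blocks actually read.

    A block is skipped when its [min, max] range cannot overlap [low, high].
    """
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        if block.max_value < low or block.min_value > high:
            continue  # the min/max metadata proves no row here can match
        blocks_read += 1
        matches.extend(v for v in block.values if low <= v <= high)
    return matches, blocks_read


# Sorted data gives tight, non-overlapping min/max ranges per block.
blocks = [Block([1, 2, 3]), Block([4, 5, 6]), Block([7, 8, 9])]
matches, blocks_read = scan_with_zone_map(blocks, low=5, high=6)
print(matches, blocks_read)
```

In this toy data set only one of the three blocks overlaps the requested range, so the other two are never read. The benefit depends on how well the data is sorted on the filtered column.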

 


Amazon Redshift Integration with Data Lake

  • Redshift supports deep integration with the Data Lake and its associated components.
  • Easily export data to, and load data from, your Data Lake.
  • Supports queries on open file formats such as Parquet, ORC, JSON, Avro, and CSV stored in S3.

 

The following diagram shows how Redshift integrates with other data components to make data analytics easy:

[Figure: Redshift Data Lake Integration. Image courtesy of AWS]
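
As a rough illustration of exporting data to the Data Lake, the Python sketch below builds a Redshift UNLOAD statement that writes query results to S3 in Parquet format. The helper name `build_unload_sql`, the table, the bucket path, and the IAM role ARN are all placeholders assumed for this example. The statement would still need to be run through whatever SQL client you use.

```python
def build_unload_sql(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Build an UNLOAD statement that exports query results to S3 as Parquet."""
    # UNLOAD takes the query as a quoted string, so embedded single quotes
    # are doubled to keep the statement valid.
    escaped_query = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


# Placeholder values; replace with your own table, bucket, and role.
unload_sql = build_unload_sql(
    query="SELECT * FROM sales WHERE region = 'EU'",
    s3_prefix="s3://example-data-lake/sales/eu_",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(unload_sql)
```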

 


Redshift Spectrum

Redshift Spectrum is a feature of Amazon Redshift that extends the range of data you can query (and thus analyze) from Redshift. With Redshift Spectrum, you can analyze data stored in S3 as if it were part of Amazon Redshift, without loading it first.

  • Redshift Spectrum can scan exabytes of data
  • You are charged only for the data scanned by Spectrum
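
The following Python sketch shows what querying S3 data through Spectrum might look like. It assumes the S3 files are already registered as tables in a data catalog database. The helper names, schema, database, table names, and IAM role ARN are hypothetical. The first statement maps the catalog database to an external schema, and the second joins an external (S3) table with a local Redshift table.

```python
def build_external_schema_sql(schema: str, catalog_db: str, iam_role_arn: str) -> str:
    """Build a statement mapping a data catalog database to an external schema."""
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        "FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}';"
    )


def build_join_sql(external_table: str, local_table: str) -> str:
    """Build a query joining an external (S3) table with a local table.

    The column names customer_id and name are assumed for illustration.
    """
    return (
        "SELECT c.customer_id, c.name, COUNT(*) AS click_count\n"
        f"FROM {external_table} AS e\n"
        f"JOIN {local_table} AS c ON c.customer_id = e.customer_id\n"
        "GROUP BY c.customer_id, c.name;"
    )


# Placeholder names; replace with your own schema, database, and role.
schema_sql = build_external_schema_sql(
    schema="spectrum_demo",
    catalog_db="example_lake_db",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleSpectrumRole",
)
join_sql = build_join_sql("spectrum_demo.clicks", "public.customers")
print(schema_sql)
print(join_sql)
```

Because Spectrum charges by data scanned, selecting only the columns you need from a columnar format such as Parquet generally keeps the scanned volume lower than reading whole rows.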

 

 

Pricing

Amazon Redshift is billed for the following components:

  • Nodes (Instances) – per hour that the Instance is running
    • Price varies by Instance type
    • Reduced price if Reserved Instances are used
  • Redshift Spectrum – per terabyte of data scanned
  • Managed Storage – per GB per month
  • Backup Storage – per GB per month
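
To make the billing components concrete, here is a small Python sketch that adds them up for one month. All rates in the example are made-up placeholders, not actual AWS prices, and the `RedshiftRates` and `estimate_monthly_cost` names are hypothetical. Check the current AWS pricing page for real numbers.

```python
from dataclasses import dataclass


@dataclass
class RedshiftRates:
    """Hypothetical unit prices; real rates vary by Region and Instance type."""
    node_hour: float                 # price per node per hour
    spectrum_per_tb: float           # price per terabyte scanned by Spectrum
    managed_storage_gb_month: float  # price per GB per month
    backup_storage_gb_month: float   # price per GB per month


def estimate_monthly_cost(
    rates: RedshiftRates,
    nodes: int,
    hours_running: float,
    spectrum_tb_scanned: float,
    managed_storage_gb: float,
    backup_storage_gb: float,
) -> float:
    """Sum the four billing components listed above for one month."""
    compute = nodes * hours_running * rates.node_hour
    spectrum = spectrum_tb_scanned * rates.spectrum_per_tb
    storage = managed_storage_gb * rates.managed_storage_gb_month
    backup = backup_storage_gb * rates.backup_storage_gb_month
    return compute + spectrum + storage + backup


# Placeholder rates chosen only to keep the arithmetic easy to follow.
example_rates = RedshiftRates(
    node_hour=1.0,
    spectrum_per_tb=5.0,
    managed_storage_gb_month=0.02,
    backup_storage_gb_month=0.02,
)
monthly_cost = estimate_monthly_cost(
    example_rates,
    nodes=2,
    hours_running=100,
    spectrum_tb_scanned=3,
    managed_storage_gb=1000,
    backup_storage_gb=500,
)
print(f"Estimated monthly cost: {monthly_cost:.2f}")
```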

 


External Resources