03/02/2022 10:51 | Category: redshift

Tags: postgresqueriesawsredshift

federated queries from postgres to redshift without enhanced vpc routing

RedShift boasts the abililty to create an external schema that serves as a connection to RDS instances. I'm briefly going to cover resources that exist for creating Postgres -> RedShift federated queries without using enhanced VPC routing.

Resources

General overview

Our architecture will require either:

  • RDS and RedShift to exist within the same VPC
  • There to be a NAT that exists between the separate VPCs

We'll be discussing same VPC instances. Our architecture looks like the following:

RedShift and Postgres sharing a VPC

Implementation of the federated query

The implementation of this service is broken out below.

Prerequisites:

Implementation

  1. Adding our external schema to the RedShift cluster will provide us access to the instance.

This query is directly pulled from the AWS Docs

-- Runs in RedShift
CREATE EXTERNAL SCHEMA my_redshift_schema_name
FROM POSTGRES
DATABASE 'my_postgres_database_name' SCHEMA 'my_postgres_schema'
URI 'endpoint to aurora postgres hostname'
IAM_ROLE 'arn:aws:iam::123456789012:role/my/redshift/user/role/'
SECRET_ARN 'arn:aws:secretsmanager:us-west-2:123456789012:secret:federation/test/dataplane-apg-creds-YbVKQw';
  1. Verify that the schema now exists in RedShift
  2. Likely we'll run into an issue with there being timeouts if we attempt to connect to a table like below

Check the table returns values from Postgres:

select *
from my_redshift_schema_name.some_table

Producing something like:

SQL Error: Invalid Operation: Failed to incorporate external table
into local catalog. Error= Unable to connect to endpoint with address xxxxxxx"
  1. Add RedShift to the security group of our instance, this will now change our architecture to the following

RedShift and Postgres sharing security groups in VPC

  1. The query we ran to check the RedShift returns from Postgres should now work

Check the table returns values from Postgres:

select *
from my_redshift_schema_name.some_table