AWS RDS Migrations (Part 2)
Part 2: Configure RDS Source Account Infrastructure
Resource Setup in the Source Account (Database Account)
In Part 1, we discussed our options for RDS migration strategies. A critical part of a cross-account migration is establishing secure network connectivity and access between the two accounts. Since our accounts don’t share a VPC, we have a few options:
VPC Peering: This is the most common method. We establish a VPC peering connection between the VPCs in the source and target accounts. This allows the DMS replication instance in the target account to communicate privately with the source RDS instance. We also need to configure route tables and security groups to allow traffic between them.
Public Access: If our RDS instance were publicly accessible, we could use its public IP address or DNS name and configure the security group of the source RDS instance to allow inbound traffic from the DMS replication instance’s public IP address. This is less secure and generally not recommended for production databases.
VPC PrivateLink: AWS PrivateLink allows us to privately access services hosted in a different VPC or a different AWS account without using public IPs, VPNs, or VPC peering. This is a great solution when VPC CIDR blocks overlap or other restrictions prevent peering.
In our case, neither VPC Peering nor Public Access is an option due to organizational security restrictions. If we want to use the Database Migration Service, we’ll need to set up VPC PrivateLink between the source account and the destination account.
Steps to enable VPC PrivateLink for RDS Access:
Create a Network Load Balancer (NLB) in the source account’s VPC. The NLB will be the entry point for traffic from the target account. It should target the private IP address of your RDS instance. Requirements:
This NLB must be an internal NLB.
Create a target group that uses the IP address of your RDS instance as the target. NLB target groups register IP addresses rather than DNS names, and while the DNS name of the RDS endpoint stays the same, the IP address behind it can change during failovers.
The NLB should listen on the same port as your RDS database (e.g., 3306 for MySQL, 5432 for PostgreSQL).
# Security Group for the Network Load Balancer (NLB)
NlbSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupName: !Sub "${AWS::StackName}-NLB-SG"
    GroupDescription: Security Group for the DMS PrivateLink NLB
    VpcId: !Ref VpcId
    SecurityGroupIngress:
      - CidrIp: !Ref VpcCidrBlock
        Description: Allow internal VPC traffic to RDS via NLB
        FromPort: !Ref RdsInstancePort
        ToPort: !Ref RdsInstancePort
        IpProtocol: tcp
    Tags:
      - Key: Name
        Value: !Sub "${AWS::StackName}-NLB-SG"
# The Network Load Balancer (NLB) to privately expose the RDS database
RdsNLB:
  Type: AWS::ElasticLoadBalancingV2::LoadBalancer
  Properties:
    Name: !Sub "${AWS::StackName}-RDS-NLB"
    Type: network
    Subnets: !Ref SubnetIds
    Scheme: internal
    SecurityGroups:
      - !Ref NlbSecurityGroup
    Tags:
      - Key: Name
        Value: !Sub "${AWS::StackName}-RDS-NLB"
# The Target Group for the NLB, with the RDS instance's private IP as the target
RdsTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    Name: !Sub "${AWS::StackName}-RDS-TG"
    VpcId: !Ref VpcId
    Port: !Ref RdsInstancePort
    Protocol: TCP
    TargetType: ip
    Targets:
      - Id: !Ref RdsInstanceIp
        Port: !Ref RdsInstancePort
    HealthCheckEnabled: true
    HealthCheckPort: !Ref RdsInstancePort
    HealthCheckProtocol: TCP
    HealthCheckIntervalSeconds: 30
    HealthCheckTimeoutSeconds: 10
    HealthyThresholdCount: 3
# The NLB listener to forward traffic to the target group
NlbListener:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    LoadBalancerArn: !Ref RdsNLB
    Port: !Ref RdsInstancePort
    Protocol: TCP
    DefaultActions:
      - Type: forward
        TargetGroupArn: !Ref RdsTargetGroup
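The template above takes the instance's private IP as the `RdsInstanceIp` parameter. One way to obtain that value is to resolve the RDS endpoint from a host inside the source VPC and confirm that the result falls within `VpcCidrBlock`. The following is a minimal sketch using only the Python standard library; the helper names `resolve_ipv4` and `ips_in_cidr` are hypothetical and not part of the original stack.

```python
import ipaddress
import socket


def resolve_ipv4(hostname: str, port: int) -> list[str]:
    """Return the sorted, de-duplicated IPv4 addresses for a hostname."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({info[4][0] for info in infos})


def ips_in_cidr(ips: list[str], cidr: str) -> bool:
    """True if there is at least one address and all of them lie in the CIDR block."""
    network = ipaddress.ip_network(cidr)
    return bool(ips) and all(ipaddress.ip_address(ip) in network for ip in ips)
```

Calling `resolve_ipv4` with your RDS endpoint and port, then passing the result to `ips_in_cidr` along with the VPC CIDR, gives a quick sanity check before deploying the stack.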
Create a Lambda function to register the IP addresses from the RDS endpoint in the NLB target group. This is required because the NLB must use the RDS IP address. Since IP addresses can change, we need to resolve the DNS endpoint for the RDS instance and update the NLB target group if the address has changed.
"""This is the Python script that runs inside a Lambda function container.
It performs the logic to update the NLB target group with the current
RDS instance IP address."""
import os
import time
import logging

import boto3
import dns.resolver  # from the dnspython package; bundle it with the function or use a layer

# Initialize the boto3 ELB client
elb_client = boto3.client(
    "elbv2", region_name=os.environ.get("AWS_REGION", "us-east-1")
)


def configure_logger() -> logging.Logger:
    """Configure the logger for the Lambda function"""
    some_logger = logging.getLogger()
    if not some_logger.hasHandlers():
        stream_handler = logging.StreamHandler()
        formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        stream_handler.setFormatter(formatter)
        some_logger.addHandler(stream_handler)
    logging_level = os.environ.get("LOGGING_LEVEL", "INFO").upper()
    some_logger.setLevel(logging_level)
    return some_logger


# Configure logging
logger = configure_logger()
def get_current_target_group_arn(listener_arn):
    """
    Finds the ARN of the target group currently associated with a listener.
    """
    try:
        response = elb_client.describe_listeners(ListenerArns=[listener_arn])
        default_actions = response["Listeners"][0]["DefaultActions"]
        # We assume the default action is 'forward' to a single target group.
        if default_actions and default_actions[0]["Type"] == "forward":
            return default_actions[0]["TargetGroupArn"]
    except Exception as e:
        logger.exception("Error getting current target group: %s", e)
    return None
def get_current_target_ips(target_group_arn):
    """
    Retrieves the list of IP addresses currently registered in a target group.
    """
    try:
        response = elb_client.describe_target_health(TargetGroupArn=target_group_arn)
        current_ips = [
            target["Target"]["Id"]
            for target in response["TargetHealthDescriptions"]
            if target["Target"]["Id"] is not None
        ]
        return set(current_ips)
    except Exception as e:
        logger.exception("Error getting current target IPs: %s", e)
        return set()
def create_new_target_group(nlb_arn, dns_name, rds_port):
    """
    Creates a new target group and registers IP addresses from a DNS lookup.
    """
    # Use dns.resolver to perform the lookup
    try:
        answers = dns.resolver.resolve(dns_name, "A")
        new_ips = [str(a) for a in answers]
    except dns.resolver.NoAnswer:
        logger.error("DNS lookup for %s returned no answer.", dns_name)
        return None
    except Exception as e:
        logger.exception("Error during DNS lookup for %s: %s", dns_name, e)
        return None

    # Get NLB VPC ID from its ARN
    nlb_description = elb_client.describe_load_balancers(LoadBalancerArns=[nlb_arn])
    nlb_vpc_id = nlb_description["LoadBalancers"][0]["VpcId"]

    # Create the new target group
    new_tg_name = f"rds-dns-tg-{int(time.time())}"
    try:
        response = elb_client.create_target_group(
            Name=new_tg_name,
            Protocol="TCP",
            Port=int(rds_port),
            VpcId=nlb_vpc_id,
            TargetType="ip",
        )
        new_tg_arn = response["TargetGroups"][0]["TargetGroupArn"]
    except Exception as e:
        logger.exception("Error creating new target group: %s", e)
        return None
    logger.debug("Created new target group: %s", new_tg_arn)

    # Register the new IPs as targets
    try:
        targets = [{"Id": ip, "Port": int(rds_port)} for ip in new_ips]
        elb_client.register_targets(TargetGroupArn=new_tg_arn, Targets=targets)
    except Exception as e:
        logger.exception("Error registering targets: %s", e)
        elb_client.delete_target_group(TargetGroupArn=new_tg_arn)  # Clean up
        return None
    return new_tg_arn
def update_nlb_listener(listener_arn, new_tg_arn):
    """
    Updates the NLB listener to point to the new target group.
    """
    try:
        elb_client.modify_listener(
            ListenerArn=listener_arn,
            DefaultActions=[{"Type": "forward", "TargetGroupArn": new_tg_arn}],
        )
        logger.info("Updated listener %s to point to %s", listener_arn, new_tg_arn)
        return True
    except Exception as e:
        logger.exception("Error updating NLB listener: %s", e)
        return False
def handler(event, context):
    """
    The main handler function for the Lambda.
    This function will be triggered by an EventBridge schedule.
    """
    logger.debug("Received event: %s", event)
    logger.debug("Received context: %s", context)

    # Get configuration from environment variables
    nlb_listener_arn = os.environ.get("NLB_LISTENER_ARN")
    rds_dns_endpoint = os.environ.get("RDS_DNS_ENDPOINT")
    rds_port = os.environ.get("RDS_PORT")
    nlb_arn = os.environ.get("NLB_ARN")
    if not all([nlb_listener_arn, rds_dns_endpoint, rds_port, nlb_arn]):
        logger.error("Required environment variables are not set. Exiting.")
        return {
            "statusCode": 400,
            "body": "Required environment variables are not set.",
        }

    logger.info("Starting NLB target group update process...")

    # Get the ARN of the current target group
    old_tg_arn = get_current_target_group_arn(nlb_listener_arn)
    if not old_tg_arn:
        logger.error("Could not find the current target group. Aborting.")
        return {"statusCode": 500, "body": "Could not find the current target group."}
    logger.info("Found existing target group: %s", old_tg_arn)

    # Get IPs from DNS lookup
    try:
        answers = dns.resolver.resolve(rds_dns_endpoint, "A")
        resolved_ips = {str(a) for a in answers}
    except Exception as e:  # also covers dns.resolver.NoAnswer and NXDOMAIN
        logger.error("Error during DNS lookup for %s: %s", rds_dns_endpoint, e)
        return {"statusCode": 500, "body": "Failed to resolve RDS DNS endpoint."}

    # Get IPs from the current target group
    current_tg_ips = get_current_target_ips(old_tg_arn)
    logger.info("Current IPs: %s", current_tg_ips)
    logger.info("Resolved IPs: %s", resolved_ips)

    # Compare IP sets to determine if an update is needed
    if current_tg_ips == resolved_ips:
        logger.info("IP addresses have not changed. No update needed.")
        return {"statusCode": 200, "body": "NLB target group is already up to date."}
    logger.info("IP addresses have changed. Starting update process.")

    # Create the new target group with updated IPs
    new_tg_arn = create_new_target_group(nlb_arn, rds_dns_endpoint, rds_port)
    if not new_tg_arn:
        logger.error("Failed to create new target group. Aborting.")
        return {"statusCode": 500, "body": "Failed to create new target group."}

    # Give some time for health checks to pass on the new target group
    logger.info("Waiting for new target group to pass health checks...")
    # Wait 3 health check intervals (30s each); the function timeout must exceed this
    time.sleep(90)

    # Update the listener to point to the new target group
    if update_nlb_listener(nlb_listener_arn, new_tg_arn):
        # Delete the old target group
        try:
            elb_client.delete_target_group(TargetGroupArn=old_tg_arn)
            logger.info("Successfully deleted old target group: %s", old_tg_arn)
            result = {
                "statusCode": 200,
                "body": "NLB target group updated successfully.",
            }
        except Exception as e:
            logger.exception("Failed to delete old target group: %s", e)
            result = {
                "statusCode": 500,
                "body": (
                    "NLB target group updated,"
                    f" but failed to delete old target group: {e}"
                ),
            }
    else:
        # If listener update failed, delete the new target group and revert
        logger.error("Listener update failed. Attempting to delete new target group.")
        try:
            elb_client.delete_target_group(TargetGroupArn=new_tg_arn)
            logger.info("Successfully deleted new target group.")
        except Exception as e:
            # Pass the format string and argument separately, not as a tuple
            logger.exception(
                "Failed to delete new target group "
                "after a failed listener update: %s",
                e,
            )
        result = {"statusCode": 500, "body": "Failed to update NLB listener."}
    return result
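The handler above replaces the whole target group whenever the resolved addresses differ from the registered ones. An alternative, which the script above does not use, is to update the existing target group in place by registering the new addresses and deregistering the stale ones. The sketch below isolates that comparison as a pure function; `plan_target_changes` and `apply_target_changes` are hypothetical names, and `elb_client` is assumed to be a boto3 `elbv2` client.

```python
def plan_target_changes(current_ips, resolved_ips):
    """Return (to_register, to_deregister) as sorted lists of IP addresses."""
    current, resolved = set(current_ips), set(resolved_ips)
    return sorted(resolved - current), sorted(current - resolved)


def apply_target_changes(elb_client, target_group_arn, port, current_ips, resolved_ips):
    """Register new IPs first, then deregister stale ones.

    Returns True if any change was made, False if the group was already current.
    """
    to_register, to_deregister = plan_target_changes(current_ips, resolved_ips)
    if to_register:
        elb_client.register_targets(
            TargetGroupArn=target_group_arn,
            Targets=[{"Id": ip, "Port": int(port)} for ip in to_register],
        )
    if to_deregister:
        elb_client.deregister_targets(
            TargetGroupArn=target_group_arn,
            Targets=[{"Id": ip, "Port": int(port)} for ip in to_deregister],
        )
    return bool(to_register or to_deregister)
```

Whether this is preferable to swapping target groups depends on how you want in-flight connections and health checks to behave during the change, so treat it as an option to evaluate rather than a drop-in replacement.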
Create a VPC Endpoint Service that uses the NLB as its entry point.
The NLB is the actual resource that receives the traffic from the target VPC and forwards it to the specified RDS source.
Enable the “Require acceptance for endpoint” option so you can manually approve connection requests from the Target Account.
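With acceptance required, each connection request from the target account stays pending until someone in the source account approves it. The sketch below shows one way that approval could be scripted, assuming `ec2_client` is a boto3 `ec2` client. The helper name `accept_pending_connections` is hypothetical, and pagination is not handled.

```python
def accept_pending_connections(ec2_client, service_id, allowed_owner):
    """Accept pending endpoint connections to service_id from allowed_owner.

    Returns the list of VPC endpoint IDs that were accepted.
    """
    response = ec2_client.describe_vpc_endpoint_connections(
        Filters=[{"Name": "service-id", "Values": [service_id]}]
    )
    pending = [
        conn["VpcEndpointId"]
        for conn in response.get("VpcEndpointConnections", [])
        # Compare case-insensitively to avoid depending on the exact casing.
        if conn.get("VpcEndpointState", "").lower() == "pendingacceptance"
        and conn.get("VpcEndpointOwner") == allowed_owner
    ]
    if pending:
        ec2_client.accept_vpc_endpoint_connections(
            ServiceId=service_id, VpcEndpointIds=pending
        )
    return pending
```

Filtering on the owner account ID keeps the script from approving requests that come from anywhere other than the expected target account.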
# The VPC Endpoint Service to publish the NLB
RdsEndpointService:
  Type: AWS::EC2::VPCEndpointService
  Properties:
    AcceptanceRequired: true # Requires manual acceptance
    ContributorInsightsEnabled: true
    NetworkLoadBalancerArns:
      - !Ref RdsNLB
    Tags:
      - Key: Name
        Value: !Sub "${AWS::StackName}-EndpointService"
RdsEndpointServicePermissions:
  Type: AWS::EC2::VPCEndpointServicePermissions
  Properties:
    ServiceId: !Ref RdsEndpointService
    AllowedPrincipals:
      - !Sub "arn:aws:iam::${ControlTowerAccount}:root"
In summary, this setup allows us to securely expose the RDS instance in the source account to the DMS replication instance in the target account over VPC PrivateLink.
Key features include:
PrivateLink Creation: The NLB and VPC Endpoint Service are created, exposing the RDS instance (via its private IP) as a PrivateLink service.
Connection: The target account will be able to connect to the RDS database using the private IP address of the VPC Endpoint. This avoids any VPC CIDR overlap issue and keeps the traffic on the private network.
Resilience: The Lambda function periodically checks the RDS IP address and corrects the NLB target group if the IP changes. This keeps the PrivateLink connection operational for the DMS replication. I have scheduled the Lambda to run every 15 minutes using an EventBridge schedule. For example:
Events:
  ScheduledEvent:
    Type: Schedule
    Properties:
      Schedule: cron(0/15 * * * ? *)
      Name: RdsIpChangeHandler-Scheduled-Event
In Part 3, we will configure the infrastructure necessary for DMS in our RDS target account.
