Zappa, Django, and Lambda VPC Support

Use VPC and RDS to Back a Serverless Django App

Posted by Ryan S. Brown on Sat, Mar 12, 2016
In Mini-Project
Tags: python, lambda, vpc, django, api gateway

In my last post on Zappa, I covered adding it to your existing Django app, but left it using SQLite as a database (which fails). In this post, we’ll modify the setup to use RDS secured in a VPC to persist data.

Creating a VPC

First, we’ll need a VPC that fits the requirements for RDS (spanning two availability zones) and Lambda (internet access). Here’s the shortlist:

  • Subnets in two availability zones
  • Routes and ACLs to allow traffic out of the VPC
  • Internet access through either a NAT instance or a NAT Gateway
  • Security group for Lambda functions to run in
  • Security group that allows the Lambda group to reach the database
  • An RDS instance (ideally a cheap one)

Andreas and Michael Wittig from cloudonaut.io have provided an example template that covers the first two requirements, available on their site. It’s not a brief template, so I won’t post it in full here. Their template is released under Apache 2.0, the same license as all other code on this blog.

To start, we’ll add the parameters that will be needed later when creating our database: DBUser and DBPassword. They won’t be used until we add the RDS instance to the template, which has to come last because it depends on the security groups we’re going to add.

"Parameters": {
    "DBUser": {
        "Description": "User for RDS",
        "Type": "String",
        "Default": "django"
    },
    "DBPassword": {
        "Description": "Password for RDS database",
        "Type": "String",
        "MinLength": "12",
        "Default": "1qaz2wsx3edc"
    },
    "ClassB": {
        "Description": "Class B of VPC (10.XXX.0.0/16)",
        "Type": "String",
        "MinLength": "1",
        "MaxLength": "3",
        "Default": "10"
    }
},
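Because DBPassword is a template parameter, it can be supplied at deploy time rather than left at its default. As a minimal sketch (assuming Python 3.6+ for the standard secrets module; the function name is made up for illustration), a random password that satisfies the MinLength of 12 could be generated like this:

```python
import secrets
import string

# Letters and digits only: this sidesteps characters that RDS rejects in
# master passwords (such as "/", "@", and double quotes).
ALPHABET = string.ascii_letters + string.digits


def generate_db_password(length=16):
    """Return a random alphanumeric password of the given length."""
    # The template's DBPassword parameter declares MinLength 12.
    if length < 12:
        raise ValueError("DBPassword must be at least 12 characters")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_db_password())
```

The printed value can then be passed as the DBPassword parameter when creating or updating the stack.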

1qaz2wsx3edc is a terrible password (type it on a QWERTY keyboard and you’ll see why), so make sure you override it when deploying RDS.

Lambda always requires internet access to run inside a VPC, even if you don’t directly make external service calls. To allow this, we’ll add a NAT Gateway for each availability zone.

Note: Technically we could use a single NAT Gateway for both availability zones, but then a failure in the AZ hosting that gateway would cut off internet access for both private subnets and make our app unavailable.

Each gateway needs its own Elastic IP address, so we’ll have to define those as well. This change is in commit c8043 in the project repo. Once you’ve added it, you can run a stack update to start creating the NAT Gateways - for me it took a few minutes for them to be ready.

"NatEIPA": {
    "DependsOn": ["VPCGatewayAttachment"],
    "Type": "AWS::EC2::EIP",
    "Properties": {
        "Domain": "vpc"
    }
},
"NatEIPB": {
    "DependsOn": ["VPCGatewayAttachment"],
    "Type": "AWS::EC2::EIP",
    "Properties": {
        "Domain": "vpc"
    }
},
"NatGatewayA": {
    "DependsOn": ["VPCGatewayAttachment"],
    "Type": "AWS::EC2::NatGateway",
    "Properties": {
        "AllocationId": {"Fn::GetAtt": ["NatEIPA", "AllocationId"]},
        "SubnetId": {"Ref": "SubnetAPublic"}
    }
},
"NatGatewayB": {
    "DependsOn": ["VPCGatewayAttachment"],
    "Type": "AWS::EC2::NatGateway",
    "Properties": {
        "AllocationId": {"Fn::GetAtt": ["NatEIPB", "AllocationId"]},
        "SubnetId": {"Ref": "SubnetBPublic"}
    }
},
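The A and B resources above differ only in their suffix, so if you generate templates from a script, the pattern can be expressed as a loop. This is only a sketch of that idea (the nat_resources helper is hypothetical and not part of the project repo); it reproduces the same four resources as the JSON above:

```python
import json


def nat_resources(az_suffixes=("A", "B")):
    """Build the EIP and NAT Gateway resources for each AZ suffix."""
    resources = {}
    for suffix in az_suffixes:
        eip = "NatEIP" + suffix
        resources[eip] = {
            "DependsOn": ["VPCGatewayAttachment"],
            "Type": "AWS::EC2::EIP",
            "Properties": {"Domain": "vpc"},
        }
        resources["NatGateway" + suffix] = {
            "DependsOn": ["VPCGatewayAttachment"],
            "Type": "AWS::EC2::NatGateway",
            "Properties": {
                # Each gateway takes the Elastic IP allocated for its own AZ.
                "AllocationId": {"Fn::GetAtt": [eip, "AllocationId"]},
                # NAT Gateways live in the public subnet of their AZ.
                "SubnetId": {"Ref": "Subnet" + suffix + "Public"},
            },
        }
    return resources


if __name__ == "__main__":
    print(json.dumps(nat_resources(), indent=4))
```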

With the NAT Gateways configured, now we need to change the route tables in our private subnets to use them by default. Without these routes, when the Lambda functions start up in the VPC they’ll just fail immediately.

The AWS::EC2::Route resource recently added support for NAT Gateways, so we can use NatGatewayId with a Ref from the Gateway resources we defined earlier. See commit 2ffb5 for the exact diff.


"RouteTablePrivateA": {
    "Type": "AWS::EC2::RouteTable",
    "Properties": {
        "VpcId": {"Ref": "VPC"},
        "Tags": [{
            "Key": "Name",
            "Value": "Private"
        }]
    }
},
"RouteTablePrivateB": {
    "Type": "AWS::EC2::RouteTable",
    "Properties": {
        "VpcId": {"Ref": "VPC"},
        "Tags": [{
            "Key": "Name",
            "Value": "Private"
        }]
    }
},
"RouteTableAssociationAPrivate": {
    "Type": "AWS::EC2::SubnetRouteTableAssociation",
    "Properties": {
        "SubnetId": {"Ref": "SubnetAPrivate"},
        "RouteTableId": {"Ref": "RouteTablePrivateA"}
    }
},
"RouteTableAssociationBPrivate": {
    "Type": "AWS::EC2::SubnetRouteTableAssociation",
    "Properties": {
        "SubnetId": {"Ref": "SubnetBPrivate"},
        "RouteTableId": {"Ref": "RouteTablePrivateB"}
    }
},
"RouteTablePrivateNatGatewayA": {
    "Type": "AWS::EC2::Route",
    "DependsOn": "VPCGatewayAttachment",
    "Properties": {
        "RouteTableId": {"Ref": "RouteTablePrivateA"},
        "DestinationCidrBlock": "0.0.0.0/0",
        "NatGatewayId": {"Ref": "NatGatewayA"}
    }
},
"RouteTablePrivateNatGatewayB": {
    "Type": "AWS::EC2::Route",
    "DependsOn": "VPCGatewayAttachment",
    "Properties": {
        "RouteTableId": {"Ref": "RouteTablePrivateB"},
        "DestinationCidrBlock": "0.0.0.0/0",
        "NatGatewayId": {"Ref": "NatGatewayB"}
    }
},

With that, we’re actually done setting up the network - internet access will work in the public subnets via internet gateway, and private subnets via the NAT Gateways we configured. We’re ready to configure the database.
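Since a missing default route is easy to overlook, it can help to check the template before deploying. The sketch below assumes the template’s Resources section has been loaded into a Python dict (for example with json.load); the helper name is made up for illustration. It lists the route tables whose 0.0.0.0/0 route targets a NAT Gateway:

```python
def tables_with_nat_default_route(resources):
    """Return names of route tables whose default route targets a NAT Gateway."""
    tables = set()
    for resource in resources.values():
        if resource.get("Type") != "AWS::EC2::Route":
            continue
        props = resource.get("Properties", {})
        is_default = props.get("DestinationCidrBlock") == "0.0.0.0/0"
        if is_default and "NatGatewayId" in props:
            # RouteTableId is a {"Ref": "<logical name>"} in this template.
            tables.add(props["RouteTableId"]["Ref"])
    return tables


if __name__ == "__main__":
    sample = {
        "RouteTablePrivateNatGatewayA": {
            "Type": "AWS::EC2::Route",
            "Properties": {
                "RouteTableId": {"Ref": "RouteTablePrivateA"},
                "DestinationCidrBlock": "0.0.0.0/0",
                "NatGatewayId": {"Ref": "NatGatewayA"},
            },
        }
    }
    print(tables_with_nat_default_route(sample))
```

For the template in this post, the result should contain both RouteTablePrivateA and RouteTablePrivateB.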

Adding RDS

Running RDS in a VPC requires that you provide a group of at least two subnets, because multi-AZ replication and failover need at least two availability zones for RDS to use. In CloudFormation, the group is a separate resource defined as a list of subnet IDs. This change is in 32ad445.

"DbSubnetGroup": {
    "Type" : "AWS::RDS::DBSubnetGroup",
    "Properties": {
        "DBSubnetGroupDescription": "Subnets for RDS db instance",
        "SubnetIds": [{"Ref": "SubnetAPrivate"}, {"Ref": "SubnetBPrivate"}]
    }
},

We’ll also need two security groups - one for Lambda to execute in, and one to allow ingress from the Lambda group to the database. This change is in cb0f79.

"DBSecurityGroup": {
    "Type": "AWS::EC2::SecurityGroup",
    "Properties" : {
        "GroupDescription": "Open database for access",
        "VpcId": {"Ref": "VPC"},
        "SecurityGroupIngress" : [{
            "IpProtocol" : "tcp",
            "FromPort" : "3306",
            "ToPort" : "3306",
            "SourceSecurityGroupId" : { "Ref" : "LambdaExecSecurityGroup" }
        }]
    }
},
"LambdaExecSecurityGroup": {
    "Type": "AWS::EC2::SecurityGroup",
    "Properties" : {
        "GroupDescription": "Lambda functions execute with this group",
        "VpcId": {"Ref": "VPC"}
    }
},

Now we’re set to make the database. I turned off multi-AZ redundancy because it costs more and I didn’t need it for this demo, but if you were serving production traffic you’d want to turn it on (and probably use a larger instance class than db.t2.micro). This change is in 7f643f1.

"DBInstance" : {
    "Type": "AWS::RDS::DBInstance",
    "Properties": {
        "DBName"            : "djangohellodb",
        "Engine"            : "MySQL",
        "MultiAZ"           : false,
        "MasterUsername"    : { "Ref" : "DBUser" },
        "DBInstanceClass"   : "db.t2.micro",
        "AllocatedStorage"  : 5,
        "MasterUserPassword": { "Ref" : "DBPassword" },
        "VPCSecurityGroups" : [{"Fn::GetAtt": ["DBSecurityGroup", "GroupId"]}],
        "DBSubnetGroupName" : {"Ref": "DbSubnetGroup"}
    }
},

Now that we have the full template, launch it as a stack in your own account (named TestStack here) to spin up a copy. You can view the full template on Github as well. Creation takes 15 minutes or so, and you can continue the tutorial while it’s working.
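If you would rather script the launch, boto3’s CloudFormation client can create the stack and pass DBUser and DBPassword as parameters. This is a sketch under a few assumptions: boto3 is installed and has credentials, the template is saved locally (the path is up to you), and the file is small enough to pass inline as TemplateBody (CloudFormation caps that at 51,200 bytes; larger templates have to be uploaded to S3 and referenced by TemplateURL).

```python
def stack_kwargs(template_body, db_user, db_password, stack_name="TestStack"):
    """Build the keyword arguments for CloudFormation's create_stack call."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": "DBUser", "ParameterValue": db_user},
            {"ParameterKey": "DBPassword", "ParameterValue": db_password},
        ],
    }


def launch(template_path, db_user, db_password):
    """Create the stack and block until CloudFormation reports completion."""
    import boto3  # imported lazily so stack_kwargs works without boto3

    with open(template_path) as handle:
        body = handle.read()
    cloudformation = boto3.client("cloudformation")
    kwargs = stack_kwargs(body, db_user, db_password)
    cloudformation.create_stack(**kwargs)
    # The waiter polls until the stack reaches CREATE_COMPLETE (or fails).
    waiter = cloudformation.get_waiter("stack_create_complete")
    waiter.wait(StackName=kwargs["StackName"])
```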

Adding IAM Permissions

On the IAM role for the Lambda function, you’ll have to add AWSLambdaVPCAccessExecutionRole to allow the function to run inside your VPC. Go to the IAM Roles console, and find the ZappaLambdaExecution role. Click on “Attach Policy” and search for AWSLambda.

[Screenshot: attaching the AWSLambdaVPCAccessExecutionRole policy]

This policy provides the Lambda service some permissions to manage network interfaces, which it needs to make a connection between the servers (managed by AWS) where Lambda runs and the VPC we’re putting the database in.
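The same attachment can be done without the console. A minimal sketch, assuming boto3 is configured and that your role really is named ZappaLambdaExecution (adjust if yours differs):

```python
# ARN of the AWS-managed policy named in the text above.
VPC_ACCESS_POLICY_ARN = (
    "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
)


def attach_vpc_access_policy(iam_client, role_name="ZappaLambdaExecution"):
    """Attach the managed VPC access policy to the Lambda execution role."""
    iam_client.attach_role_policy(
        RoleName=role_name,
        PolicyArn=VPC_ACCESS_POLICY_ARN,
    )


if __name__ == "__main__":
    import boto3

    attach_vpc_access_policy(boto3.client("iam"))
```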

Moving a Lambda Function Inside a VPC

If you’ve been burned by creating an RDS instance in the wrong VPC before, then needing to snapshot and rehydrate it in the right subnet group, you’ll appreciate that the Lambda execution VPC can be changed at any time. (!!!) Since there isn’t a persistent instance, network or tenancy changes aren’t even a big deal.

In the Lambda console, find the function that Django-Zappa deployed (probably named django-zappa-example-testing) and go to the configuration tab. Under “Advanced” you’ll find VPC settings. CloudFormation will have named the new VPC “10.10.0.0/16” unless you specified a different ClassB parameter in the template.

[Screenshot: enabling VPC execution in the AWS Lambda console]

Select both private subnets, and the security group with “LambdaExecSecurityGroup” in its name. Then hit “Save” at the top of the page and go back to your API Gateway URL. If you don’t have it handy, the API Gateway dashboard has the URL at the top.

Since we haven’t made any code changes or updated our deployment, the page should look exactly as it did before - except that the underlying function is now running in the VPC, one step closer to a working database.

Under the hood, when executing in a VPC the Lambda service creates a network interface and sends all traffic from your function through it. That’s why you need to have a NAT Gateway or another internet access path available.
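The console steps in this section can also be scripted with boto3’s update_function_configuration. The sketch below takes the Lambda client as an argument; the subnet and security group IDs in the usage example are placeholders, so substitute the IDs of the two private subnets and the LambdaExecSecurityGroup from your own stack.

```python
def move_function_into_vpc(lambda_client, function_name, subnet_ids, security_group_ids):
    """Point an existing Lambda function at the given subnets and security groups."""
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        VpcConfig={
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    )


if __name__ == "__main__":
    import boto3

    move_function_into_vpc(
        boto3.client("lambda"),
        "django-zappa-example-testing",
        subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],  # placeholder IDs
        security_group_ids=["sg-cccc3333"],  # placeholder ID
    )
```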

Database Setup

To use MySQL, you’ll need to add MySQL-python to the requirements.txt file and run pip install -r requirements.txt to download it. Then we’ll need to create the database on the RDS instance. You have a few options for doing this - the easiest is to take the SQL dump from the repository and apply it to RDS from an EC2 instance that’s inside your VPC.

First, you’ll need to get the connection information for the MySQL instance we provisioned in CloudFormation. Check the stack outputs (you may need to scroll a bit).

[Screenshot: CloudFormation output showing database host and port]
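If scrolling through the console gets tedious, the stack outputs can also be read with boto3’s describe_stacks. This sketch assumes the stack is named TestStack; the output key names depend on the template, so it prints whatever keys it finds rather than guessing them.

```python
def stack_outputs(describe_stacks_response):
    """Flatten a describe_stacks response into an {OutputKey: OutputValue} dict."""
    stack = describe_stacks_response["Stacks"][0]
    # "Outputs" is absent when a stack defines none, so default to an empty list.
    return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}


if __name__ == "__main__":
    import boto3

    response = boto3.client("cloudformation").describe_stacks(StackName="TestStack")
    for key, value in sorted(stack_outputs(response).items()):
        print(key, "=", value)
```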

One way to get connected without doing much work is to create an EC2 instance to run the SQL commands to create tables from. You could also use a VPN to get inside the VPC. I chose the EC2 instance route, and put an SSH security group in the template for convenience.

On an Amazon Linux instance, run these commands:

$ sudo yum install -y mysql
$ curl -L -v -s https://raw.githubusercontent.com/ryansb/django-zappa-example/master/dbdump.sql |
   mysql -u django -p1qaz2wsx3edc -D djangohellodb --port 3306 \
   --host zde95ilidabtaj.cpduqtjksm3w.us-east-1.rds.amazonaws.com
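If the mysql command hangs instead of connecting, the cause is usually network-level (security groups or routing) rather than credentials. One way to tell the two apart is a plain TCP check against the database port, run from the same instance. A small sketch using only the standard library (Python 3; the function name is made up for illustration):

```python
import socket


def can_connect(host, port=3306, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


if __name__ == "__main__":
    host = "zde95ilidabtaj.cpduqtjksm3w.us-east-1.rds.amazonaws.com"
    print("reachable" if can_connect(host) else "not reachable")
```

A False result from an instance that should have access points at the security group rules or subnet routing; a True result with a failing mysql login points at the username, password, or database name.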

Now open hello_django/settings.py and replace the DATABASES dictionary to stop using SQLite and connect to RDS.

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'djangohellodb',
        'USER': 'django',
        'PASSWORD': '1qaz2wsx3edc',
        'HOST': 'zde95ilidabtaj.cpduqtjksm3w.us-east-1.rds.amazonaws.com',
        'PORT': '3306',
    }
}
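Hard-coding the password in settings.py is fine for a demo, but it also ends up in version control. A common alternative is to read the connection details from the environment. The sketch below is one way to do that, assuming your deployment has some means of setting environment variables for the function; the DB_* variable names are made up for this example.

```python
import os


def database_settings(env=os.environ):
    """Build Django's DATABASES dict from environment variables."""
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            "NAME": env.get("DB_NAME", "djangohellodb"),
            "USER": env.get("DB_USER", "django"),
            # No defaults for these two: fail loudly with KeyError if unset.
            "PASSWORD": env["DB_PASSWORD"],
            "HOST": env["DB_HOST"],
            "PORT": env.get("DB_PORT", "3306"),
        }
    }


# In hello_django/settings.py this would be used as:
# DATABASES = database_settings()
```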

Redeploying

Now it’s time to redeploy our changes to the test environment.

$ ./manage.py update testing
Packaging project as zip....
Uploading zip (15.3MiB)...
Your updated Zappa deployment is live!

Back in the browser, refreshing the page should work and you’ll see the sample data from SQLite is gone.

[Screenshot: a blank slate - whatever lunch you like.]

Now go ahead and enter a few lunches - you’ll see that they persist correctly (great) and that the function is still fast after warmup (super great).

Note: Lambda creates network interfaces in your VPC. Before you can delete the stack you must detach (and delete) them because they depend on your VPC.
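To find those interfaces before tearing the stack down, you can list the network interfaces in the VPC and pick out the ones Lambda created. This sketch assumes Lambda’s interfaces carry a description starting with “AWS Lambda VPC ENI”, which is worth confirming in the EC2 console for your account; the VPC ID in the usage example is a placeholder.

```python
def lambda_interface_ids(describe_response, prefix="AWS Lambda VPC ENI"):
    """Return IDs of network interfaces whose description starts with prefix."""
    return [
        eni["NetworkInterfaceId"]
        for eni in describe_response["NetworkInterfaces"]
        if eni.get("Description", "").startswith(prefix)
    ]


if __name__ == "__main__":
    import boto3

    ec2 = boto3.client("ec2")
    response = ec2.describe_network_interfaces(
        Filters=[{"Name": "vpc-id", "Values": ["vpc-0123example"]}]  # placeholder
    )
    print(lambda_interface_ids(response))
```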

Wrapping Up

To recap, we’ve covered:

  • Moving Lambda functions to a VPC
  • Using CloudFormation to build a VPC and associated security groups
  • Deploying Django on Lambda
  • How Lambda attaches to your VPC

Get the updated project on the rds-db branch of the repo and check out Django-Zappa on Github. Keep up with future posts via RSS. If you have suggestions, questions, or comments, feel free to email me at ryan@serverlesscode.com.

