How to Use LXML in Lambda

Use Ansible and Boto to Build C Extensions for Python

Posted by Ryan S. Brown on Tue, Mar 15, 2016
In Reader Question
Tags: lambda, python, xml

This post follows on from a Twitter conversation with @ST215 about his side project, JailJawn, which uses Python and needs lxml. lxml in turn depends on a bunch of C libraries.

To make this run on Lambda, you have to provide C libraries compiled to run in the environment Lambda provides. To do that, we’ll need to take a few steps.

  1. Install C libraries on an EC2 instance
  2. Install Python libraries in a virtualenv
  3. Zip them all up
  4. Add your own code to the zip
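As a rough illustration of steps 3 and 4, here is a minimal Python sketch using only the standard library. The function names `zip_tree` and `add_handler` and the idea of a single build directory are assumptions made for this example; the playbook itself may package things differently.

```python
import zipfile
from pathlib import Path


def zip_tree(src_dir, zip_path):
    """Zip every file under src_dir, storing paths relative to src_dir.

    Hypothetical helper: src_dir is assumed to hold the virtualenv's
    site-packages contents plus the copied C libraries.
    """
    src = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Relative names keep packages at the root of the archive.
                zf.write(path, path.relative_to(src).as_posix())
    return zip_path


def add_handler(zip_path, handler_file):
    """Append your own module to the root of an existing zip (step 4)."""
    with zipfile.ZipFile(zip_path, "a", zipfile.ZIP_DEFLATED) as zf:
        zf.write(handler_file, Path(handler_file).name)
```

Keep the output zip outside the directory being zipped, otherwise the archive would try to include itself.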

Just like in my sklearn post, I built an Ansible playbook that installs the dependencies on an EC2 instance and saves the result to S3. From the OS, it needs to copy in these shared libraries:

/usr/lib64/libxslt.so.1
/usr/lib64/libexslt.so.0
/usr/lib64/libxml2.so.2
/usr/lib64/libgcrypt.so.11
/lib64/libgpg-error.so.0
/usr/lib64/liblzma.so.5
/lib64/libz.so.1
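The playbook does this copy with Ansible. As a stand-in, the same step could look like this in plain Python. The `stage_shared_libs` helper and the destination directory are hypothetical, chosen for the example rather than taken from the playbook.

```python
import shutil
from pathlib import Path

# The shared libraries listed above, as found on the EC2 build instance.
SHARED_LIBS = [
    "/usr/lib64/libxslt.so.1",
    "/usr/lib64/libexslt.so.0",
    "/usr/lib64/libxml2.so.2",
    "/usr/lib64/libgcrypt.so.11",
    "/lib64/libgpg-error.so.0",
    "/usr/lib64/liblzma.so.5",
    "/lib64/libz.so.1",
]


def stage_shared_libs(dest_dir, libs=SHARED_LIBS):
    """Copy each library into dest_dir and return the copied paths.

    Hypothetical helper: dest_dir is an assumed staging directory that
    later gets zipped together with the Python packages.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for lib in libs:
        src = Path(lib)
        if not src.exists():
            raise FileNotFoundError(f"missing shared library: {lib}")
        target = dest / src.name
        # shutil.copy follows symlinks, so the real file's contents are
        # stored under the name the dynamic loader looks for.
        shutil.copy(src, target)
        copied.append(target)
    return copied
```

Failing loudly on a missing file is deliberate: a library that silently goes missing at build time only shows up later as an import error inside Lambda.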

The code to do it is available in a branch on the sklearn repo.

Once you download the zipfile from S3, you just need to add your own code that imports lxml or lxml.etree, and it will work in Lambda.
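For example, the custom code could be as small as the sketch below. The function name `handler` and the `"xml"` key on the event are assumptions made for illustration, and the sketch assumes the bundled shared libraries sit somewhere the dynamic loader can find them inside the Lambda environment.

```python
def handler(event, context):
    """Parse an XML string from the event and describe its root element.

    Hypothetical Lambda entry point: the event is assumed to look like
    {"xml": "<a><b/><b/></a>"}.
    """
    # Imported inside the function so this module still loads on a machine
    # where lxml and its compiled libraries are not installed.
    from lxml import etree

    # Encode to bytes so documents with an encoding declaration also parse.
    root = etree.fromstring(event["xml"].encode("utf-8"))
    return {"root": root.tag, "children": len(root)}
```

If the import fails inside Lambda, the usual suspect is a shared library that did not make it into the zip.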

