To make this run on Lambda, you have to provide C libraries compiled for the environment Lambda provides. That takes a few steps:
- Install C libraries on an EC2 instance
- Install Python libraries in a virtualenv
- Zip them all up
- Add your library
Just like in my sklearn post, I built an Ansible playbook to install the dependencies on an EC2 instance and save them to S3. From the OS, it needs to copy in:
- /usr/lib64/libxslt.so.1
- /usr/lib64/libexslt.so.0
- /usr/lib64/libxml2.so.2
- /usr/lib64/libgcrypt.so.11
- /lib64/libgpg-error.so.0
- /usr/lib64/liblzma.so.5
- /lib64/libz.so.1
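This isn't the playbook itself, just a minimal Python sketch of what the copy step amounts to. The function name `copy_shared_libs` and the destination directory are hypothetical. The paths are the ones listed above, so they are only expected to exist on the EC2 build host.

```python
import os
import shutil

# Library paths named in the post. They live on the EC2 build host,
# so on any other machine most of them will be reported as missing.
SHARED_LIBS = [
    "/usr/lib64/libxslt.so.1",
    "/usr/lib64/libexslt.so.0",
    "/usr/lib64/libxml2.so.2",
    "/usr/lib64/libgcrypt.so.11",
    "/lib64/libgpg-error.so.0",
    "/usr/lib64/liblzma.so.5",
    "/lib64/libz.so.1",
]


def copy_shared_libs(paths, dest_dir):
    """Copy each library that exists into dest_dir.

    Returns the list of paths that could not be found, so the caller
    can decide whether a missing library is fatal.
    """
    os.makedirs(dest_dir, exist_ok=True)
    missing = []
    for path in paths:
        if os.path.isfile(path):
            # shutil.copy follows symlinks by default, so the real file
            # contents land under the versioned name given in the list.
            shutil.copy(path, os.path.join(dest_dir, os.path.basename(path)))
        else:
            missing.append(path)
    return missing
```

On the build host you could call `copy_shared_libs(SHARED_LIBS, "build/lib")` (the `build/lib` directory is an assumption) and stop if the returned list is not empty.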
The code to do it is available in a branch on the sklearn repo.
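To give a feel for the "zip them all up" step, here is a rough Python sketch. It is not the code from the repo: `zip_tree` and the directory names in the usage note are made up for illustration, and putting the shared libraries under `lib/` assumes Lambda's library search path includes that directory of the unpacked bundle.

```python
import os
import zipfile


def zip_tree(src_dir, zip_path, prefix=""):
    """Add every file under src_dir to the archive at zip_path.

    Files are stored relative to src_dir, optionally under prefix.
    Mode "a" creates the archive if needed and appends otherwise,
    so several directories can be added to the same zip in turn.
    """
    with zipfile.ZipFile(zip_path, "a", zipfile.ZIP_DEFLATED) as bundle:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full_path = os.path.join(root, name)
                rel_path = os.path.relpath(full_path, src_dir)
                bundle.write(full_path, os.path.join(prefix, rel_path))
```

A bundle could then be assembled with two calls, for example `zip_tree("venv/lib/python2.7/site-packages", "bundle.zip")` followed by `zip_tree("build/lib", "bundle.zip", prefix="lib")`. Both paths are placeholders, so adjust them to your virtualenv and build directory.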
Once you get the zipfile from S3, you just need to add your own code that imports
lxml.etree, and it’ll work in Lambda.
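A quick way to check the bundle is a smoke-test handler along these lines. This is only a sketch: the name `handler` and the shape of the returned dict are assumptions, not anything Lambda or lxml requires. The import sits inside the function so that a missing or incompatible shared library, which normally shows up as an `ImportError`, comes back in the response instead of crashing the module load.

```python
def handler(event, context):
    """Report whether lxml.etree loads and can parse a tiny document."""
    try:
        from lxml import etree
    except ImportError as exc:
        # A missing .so (libxslt, libxml2, ...) typically surfaces here.
        return {"ok": False, "error": str(exc)}

    # Parse a fixed document so the check does not depend on the event.
    root = etree.fromstring(b"<root><child/></root>")
    return {"ok": True, "root_tag": root.tag, "children": len(root)}
```

If the response has `"ok": False`, the error message usually names the library that could not be loaded, which tells you what to add to the copy list.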