
Use AWS Lambda Functions for Your Video Processing Needs

Are you dealing with video processing in your application?

Is it hurting your user experience?

Do you think it is causing your users to choose an alternative?

Are you using FFmpeg for your video processing?

If the answer to these questions is yes, then you need to take this piece of code out of your core application and move it to AWS Lambda. This is a very common pain point for legacy or tightly coupled applications.

The solution might sound simple: move the FFmpeg code to Lambda and let it execute there. In practice, it is a bit more challenging. You cannot simply add FFmpeg through your package manager, build, and deploy. Node.js code written that way might work locally, but it will fail on Lambda. For applications like this, where executable files are involved, we use Lambda layers.

 

Lambda Layers: the AWS documentation defines a Lambda layer as follows:

“A Lambda layer is an archive containing additional code, such as libraries, dependencies, or even custom runtimes. When you include a layer in a function, the contents are extracted to the /opt directory in the execution environment.”

 

Now you must be wondering about two things (if not, then you should!):

  • Video processing is an IO-intensive operation. It needs not only a lot of memory but also actual (ephemeral) disk space to hold the original and the new videos. Can Lambda provide enough of both?
  • A Lambda layer sounds great, but how do we fit FFmpeg into one?

 

Let’s talk about the memory part first. By default, a Lambda function gets 128 MB of memory and 512 MB of ephemeral storage (/tmp). Memory can be set anywhere between 128 MB and 10,240 MB, and ephemeral storage anywhere between 512 MB and 10,240 MB.

 

Pricing details can be found here:

https://aws.amazon.com/lambda/pricing/

 

With such infrastructure capabilities, you should be able to process relatively big videos.

Along with this, you can and should increase the timeout of your Lambda function. Keeping it around 5 minutes usually works (the maximum is 15 minutes).
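The memory, storage, and timeout settings discussed above correspond to three fields of Lambda's UpdateFunctionConfiguration API. Below is a minimal sketch that only builds and validates that parameter object. The function name `video-processor` and the 2048 MB / 300 second defaults are illustrative assumptions, not values from our setup or recommendations from AWS.

```javascript
// Limits quoted in the text; 900 seconds is Lambda's maximum timeout.
const LIMITS = {
  memoryMb: { min: 128, max: 10240 },
  ephemeralMb: { min: 512, max: 10240 },
  timeoutSec: { min: 1, max: 900 },
};

// Throws if value is not an integer inside [min, max].
function assertInRange(name, value, { min, max }) {
  if (!Number.isInteger(value) || value < min || value > max) {
    throw new RangeError(`${name} must be an integer between ${min} and ${max}, got ${value}`);
  }
}

// Builds a parameter object using the field names of UpdateFunctionConfiguration.
// The default sizes are placeholders; tune them to your own videos.
function buildVideoLambdaConfig(functionName, options = {}) {
  const { memoryMb = 2048, ephemeralMb = 2048, timeoutSec = 300 } = options;
  assertInRange('memoryMb', memoryMb, LIMITS.memoryMb);
  assertInRange('ephemeralMb', ephemeralMb, LIMITS.ephemeralMb);
  assertInRange('timeoutSec', timeoutSec, LIMITS.timeoutSec);
  return {
    FunctionName: functionName,
    MemorySize: memoryMb,                    // in MB
    Timeout: timeoutSec,                     // in seconds
    EphemeralStorage: { Size: ephemeralMb }, // /tmp size in MB
  };
}

console.log(buildVideoLambdaConfig('video-processor'));
```

The resulting object can be passed to whichever tool you use to update the function; validating the ranges up front gives a clearer error than a rejected API call.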

Now for the second part, which is the tricky one: how do we fit FFmpeg into something called a layer?

There are several approaches described online, but I will share the one that worked in our case.

 

Download an FFmpeg build from https://ffmpeg.org/download.html (it will eventually lead you to https://johnvansickle.com/ffmpeg/). Take the latest build from git master.

 

Once you have downloaded it, untar it.

tar -xvf filename

 

You will find the ffmpeg binary inside the extracted folder. Take this file out (you can copy it to some other location if you want) and add it to a zip file.

zip -r ffmpeg.zip ffmpeg

 

Now copy this file to an S3 bucket.

aws s3 cp ffmpeg.zip s3://nameofbucket/

 

Once you are done with this, go to Lambda in the AWS console and look for Layers in the left-hand menu. Create a layer: name it, add the S3 path of the zip file, choose the compatible architectures and runtimes (optional, but recommended), and select Create.

Now go to your Lambda function. In the overview section you will see Layers below your function; click that, add the layer you just created, and save.
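If you would rather script these console steps, the same two actions exist in the Lambda API as PublishLayerVersion and UpdateFunctionConfiguration. The following is a rough sketch using the AWS SDK for JavaScript v3; we did this through the console, so treat it as an untested alternative. The layer name, bucket, key, runtime, and architecture values are placeholders.

```javascript
// Builds the input for PublishLayerVersion from the zip file uploaded to S3.
// Runtime and architecture defaults are assumptions; match them to your function.
function buildLayerParams({ layerName, bucket, key, runtimes = ['nodejs20.x'], architectures = ['x86_64'] }) {
  return {
    LayerName: layerName,
    Description: 'Static FFmpeg binary',
    Content: { S3Bucket: bucket, S3Key: key },
    CompatibleRuntimes: runtimes,
    CompatibleArchitectures: architectures,
  };
}

// Publishes the layer and attaches it to the given function.
// Needs @aws-sdk/client-lambda and AWS credentials; the SDK is loaded lazily
// so that buildLayerParams can be used without it installed.
async function publishAndAttach(functionName, layerParams) {
  const {
    LambdaClient,
    PublishLayerVersionCommand,
    UpdateFunctionConfigurationCommand,
  } = require('@aws-sdk/client-lambda');

  const client = new LambdaClient({});
  const { LayerVersionArn } = await client.send(new PublishLayerVersionCommand(layerParams));

  // Layers replaces the function's whole layer list, so include any
  // layers that are already attached if you want to keep them.
  await client.send(new UpdateFunctionConfigurationCommand({
    FunctionName: functionName,
    Layers: [LayerVersionArn],
  }));
  return LayerVersionArn;
}

console.log(buildLayerParams({ layerName: 'ffmpeg', bucket: 'nameofbucket', key: 'ffmpeg.zip' }));
```

A call such as `publishAndAttach('video-processor', buildLayerParams({ ... }))` would then perform both steps; `video-processor` is a hypothetical function name.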

FFmpeg should now be part of your Lambda function. There are multiple articles that describe different approaches; we tried all of them, and this one worked best for us.

 

To call FFmpeg from your function code, you can take hints from the repository below:

https://github.com/serverlesspub/ffmpeg-aws-lambda-layer

 

It gives a very good example of how to use child processes and how to download files from and upload files to S3.
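To illustrate the child process part on its own, here is a minimal sketch built on Node's standard `child_process` module. It is not taken from the linked repository; `runBinary` is a hypothetical helper name, and the S3 download and upload steps are left out.

```javascript
const { spawnSync } = require('child_process');

// Runs an executable with an argument array and waits for it to finish.
// Throws if the process cannot be started or exits with a non-zero code.
function runBinary(binaryPath, args) {
  const result = spawnSync(binaryPath, args, {
    encoding: 'utf8',
    maxBuffer: 10 * 1024 * 1024, // leave room for FFmpeg's verbose stderr log
  });
  if (result.error) {
    throw result.error; // for example ENOENT when binaryPath does not exist
  }
  if (result.status !== 0) {
    throw new Error(`${binaryPath} exited with code ${result.status}: ${result.stderr}`);
  }
  return { stdout: result.stdout, stderr: result.stderr };
}

// Demonstration with the Node binary itself, since ffmpeg may not be installed here.
console.log(runBinary(process.execPath, ['--version']).stdout.trim());

// Inside a Lambda function the call would look roughly like this
// (input and output must live under /tmp, the only writable directory):
// runBinary('/opt/ffmpeg', ['-i', '/tmp/input.mp4', '/tmp/output.mp4']);
```

A blocking call is acceptable here because a Lambda execution environment handles one invocation at a time.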

One thing to add here is that the path might be different from the one used in the code above. In our case, FFmpeg was at ‘/opt/ffmpeg’ rather than ‘/opt/bin/ffmpeg’, because the binary sat at the root of the zip file and layer contents are extracted directly into /opt.

Getting the path wrong can cause a lot of frustration, as the errors thrown can be a bit misleading.
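One way to take the guesswork out of the path is to check the candidate locations at startup and fail with an explicit message. The sketch below assumes only the two paths mentioned above; `findExecutable` is a hypothetical helper, not part of any library.

```javascript
const fs = require('fs');

// '/opt/ffmpeg' is where the binary ended up in our layer;
// '/opt/bin/ffmpeg' is the location used by the linked example.
const DEFAULT_CANDIDATES = ['/opt/ffmpeg', '/opt/bin/ffmpeg'];

// Returns the first candidate that exists and is executable,
// or throws an error listing every path that was checked.
function findExecutable(candidates = DEFAULT_CANDIDATES) {
  for (const candidate of candidates) {
    try {
      fs.accessSync(candidate, fs.constants.X_OK);
      return candidate;
    } catch {
      // Missing or not executable; try the next candidate.
    }
  }
  throw new Error(`ffmpeg not found, looked in: ${candidates.join(', ')}`);
}

// Typical use inside the function code:
// const ffmpegPath = findExecutable();
```

If neither path exists, the thrown message names the locations that were tried, which is easier to act on than a bare spawn failure.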

Also, when you pass parameters, make sure that anything you would separate with a space on the command line is passed as a separate, comma-separated element of the argument array; otherwise you will run into another set of weird errors. For example, ‘-crf 17’ is incorrect; it should be ‘-crf’, ‘17’.
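The difference is easiest to see side by side. In this small sketch the file paths and the helper names `buildFfmpegArgs` and `findFusedOptions` are illustrative assumptions; only the `-i` and `-crf` options come from FFmpeg itself.

```javascript
// Wrong: option and value are fused, so the program receives the single
// argument "-crf 17" instead of the option "-crf" followed by "17".
const wrongArgs = ['-i', '/tmp/input.mp4', '-crf 17', '/tmp/output.mp4'];

// Right: every token that a shell would split on spaces is its own element.
function buildFfmpegArgs(inputPath, outputPath, crf = 17) {
  return ['-i', inputPath, '-crf', String(crf), outputPath];
}

// Guard that reports elements which look like "-option value" in one string.
function findFusedOptions(args) {
  return args.filter((arg) => /^-\S+\s/.test(arg));
}

console.log(findFusedOptions(wrongArgs));
console.log(buildFfmpegArgs('/tmp/input.mp4', '/tmp/output.mp4'));
```

Running such a guard before spawning the process turns a confusing FFmpeg failure into an obvious one in your own code.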

As long as we are only dealing with Lambda functions, we are good, but another issue has to be taken care of when they sit behind API Gateway: the default integration timeout is 29 seconds, which at the time of writing could not be overridden. How to tackle that is shared in this blog.

If you are planning to revisit your old architecture and want to modernize your application without rebuilding it from scratch, then do give us a shout at hello@dignitas.digital. We can help you do it without burning a hole in your pocket.