
How We Moved a Legacy Software System to AWS and Slashed the TCO By More Than 50% for the Client

Client Overview:

A biomedical and genomic research center located in Cambridge, Massachusetts, United States.

 

Problem:

The client came to us with a legacy system written in C++, the core library that handled their complex gene processing. This processing was done via a batch process. Around this library, they had a Ruby on Rails website that was not being maintained at all; it consisted of deprecated packages, making rework extremely challenging. On top of that, there were servers that were working round the clock, waiting for the jobs (created by the C++ library) to be sent their way. There were three servers (possibly with a backup server as well), which were called based on the number of genes being passed for analysis: a small server for fewer than 10 genes, a medium server for 10-50 genes, and a large server for more than 100 genes (up to 200).

The details of the data/reports generated on each run were stored on the file system, which was not very efficient. The client wanted these files hosted so that a URL sent via email would point the user to the relevant report. To add to that, there was no code repository; we were given huge .tar files to figure things out. The client knew that the system worked, but not how, and as developers/DevOps engineers it was our responsibility to figure it out. As the results were not time critical, a synchronous response was not expected.

 

Solution:

The first step was to decouple the application to understand what was already in place, what could be salvaged or reused, and which pieces of the logic were essential and could be leveraged in our case.

There was RoR code that had not been maintained for almost a decade, the C++ files, and a MySQL connection behind a firewall that we never actually got access to. As there was no repository, we had data dumps of 150+ GB, which we had to unzip/untar on a separate instance altogether. The instructions available on the website were followed, and in many cases reverse engineered, to get a better understanding of the functionality and code.

After multiple tries, we were able to build the C++ code and run it successfully on our EC2 Linux machine.

Now that we knew we could run this piece of code independently, it was time to containerize it with Docker.

An account was created on Docker Hub and an image for this part was built.

 

The configuration of the same is below:

# Base image with the C++ build toolchain and the AWS CLI
FROM ubuntu:bionic

RUN apt-get update && DEBIAN_FRONTEND=noninteractive \
    apt-get install -y build-essential git cmake autoconf libtool pkg-config awscli

# Copy the C++ sources into the image and put them on the PATH
WORKDIR /src
COPY /src ./
ENV PATH=$PATH:/src

# Convert Windows line endings to Unix (ideally part of the apt-get step above)
RUN apt-get install -y dos2unix

 

Let me briefly explain what we were trying to accomplish with this configuration:

We set up our container with the Ubuntu environment, installing git, the C++ build tools, and the AWS command-line interface. We then copied our source directory from the local environment onto the image, set up the PATH environment variable, and installed dos2unix (noticed while writing this piece that this should have been added in the second step) to help with the formatting of text when moving from Windows to Linux.

 

Now, where does the Windows piece come from?

Well, we were hosting our small front-end application as a custom .NET Core application. This basically meant removing all the fat from the old RoR solution and rebuilding it as a lean application. Could this have been done on any other platform? Yes. A completely serverless play would have been to put it up on Amazon S3.

 

For running our Docker container, we used AWS Batch:

This is the secret sauce of the whole architecture. AWS Batch was introduced by AWS around the end of 2016 but became mainstream only later. The reason for using Batch is, as the name suggests, that it is built for batch processing, i.e., we do not need to keep our servers on all the time; instead, using containers, we run our batch on demand, thereby drastically saving on our infrastructure costs.

The parameters were chosen via the website and passed via the REST API to API Gateway, which in turn calls the Lambda function(s).

 

To get a better understanding, look at the architecture of the infrastructure we decided to follow:

[Architecture diagram: AWS infrastructure]

EC2 – We chose a Windows Server 2019 machine, the base version. This is a micro instance. As this is a new account set up for the client, it will be free for one year, unless we upgrade the instance to a more powerful one based on the traffic and load we get.

The front end, consisting of our custom solution built in .NET Core, is hosted on this instance. It is a thin client, where we have taken a lot of the functionality away from the front-end piece that existed in the Ruby on Rails solution. As the RoR solution was outdated, a lot of the logic had to be reverse engineered to bring it all together. Once the user has input all the information on this website, an HTTP request is created.

Lambda + API Gateway – A POST request is created which carries all the information from the client and passes it on to the Lambda function. As Lambda functions cannot be called directly from the website over HTTP, they must be tied to API Gateway.
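To make this concrete, here is a rough sketch of the kind of event our Lambda receives from API Gateway. Only the field names are taken from the Lambda code shown later (jobDefinition, jobName, jobQueue, containerOverrides, parameters.genes, parameters.genelistName); all the values below are placeholders, not the client's actual names:

// Hypothetical test event for the submit-job Lambda.
// Field names mirror what the Lambda reads; every value is a placeholder.
const sampleEvent = {
    jobDefinition: 'gene-analysis-job-def',   // placeholder job definition name
    jobName: 'gene-analysis-run-001',         // placeholder job name
    jobQueue: 'gene-analysis-queue',          // placeholder job queue name
    containerOverrides: null,                 // optional; the Lambda defaults this to null
    parameters: {
        genes: 'BRCA1,BRCA2,TP53',            // placeholder gene list (format assumed)
        genelistName: 'sample-gene-list'      // placeholder list name
    }
};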

AWS Batch – Though we have given a brief of this service above, the steps we followed to create a batch are listed below (as we assume a lot of young DevOps engineers would like to know how AWS Batch works).

 

Step 1

Create a Compute Environment:

We chose a “Managed Environment” as we did not want to worry about scaling at our end but let AWS do it for us.

As a provisioning model, we chose Fargate (Allows AWS Batch to run containers without having to manage servers or clusters of Amazon EC2 instances) as it fit our Docker-based batch perfectly.

Select the subnet/VPC and click “Create Compute Environment”.
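For readers who prefer scripting this step rather than clicking through the console, here is a minimal sketch using the AWS SDK for JavaScript (the same SDK our Lambda uses). The environment name, region, subnet, and security group IDs below are placeholders, not the values we actually used:

const AWS = require('aws-sdk');
const batch = new AWS.Batch({ region: 'us-east-1' }); // region is an assumption

// Managed, Fargate-backed compute environment (placeholder names/IDs)
const params = {
    computeEnvironmentName: 'gene-batch-fargate-ce',
    type: 'MANAGED',
    state: 'ENABLED',
    computeResources: {
        type: 'FARGATE',
        maxvCpus: 16,
        subnets: ['subnet-xxxxxxxx'],
        securityGroupIds: ['sg-xxxxxxxx'],
    },
};

batch.createComputeEnvironment(params)
    .promise()
    .then((res) => console.log('Compute environment ARN:', res.computeEnvironmentArn))
    .catch(console.error);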

 

Step 2

Create a Job Queue:

Enter the Job Queue Name

Attach it to the Compute Environment created in Step 1
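Again, purely for reference, a minimal SDK sketch of the same step; the queue and compute environment names are placeholders carried over from the Step 1 sketch:

const AWS = require('aws-sdk');
const batch = new AWS.Batch({ region: 'us-east-1' }); // region is an assumption

// Job queue attached to the compute environment created in Step 1 (placeholder names)
const params = {
    jobQueueName: 'gene-analysis-queue',
    state: 'ENABLED',
    priority: 1,
    computeEnvironmentOrder: [
        { order: 1, computeEnvironment: 'gene-batch-fargate-ce' },
    ],
};

batch.createJobQueue(params)
    .promise()
    .then((res) => console.log('Job queue ARN:', res.jobQueueArn))
    .catch(console.error);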

 

Step 3

Create a Job Definition:

Enter the job definition name.

Select the platform type – Fargate (refer to Step 1)

This is the part where you add your image to start your container. In our case, we hosted it as a public image on Docker Hub.

Add any commands that need to be passed via bash or JSON – we had to customize this part, as our REST API, called via API Gateway/Lambda, carried the parameters that the client had passed from our front end.

 

Sample code of the Lambda function that submits the Batch job:

const AWS = require('aws-sdk');

console.log('Loading function');

exports.handler = async (event, context) => {
    // Log the received event
    console.log('Received event: ', event);

    // Get parameters for the SubmitJob call from the event
    // http://docs.aws.amazon.com/batch/latest/APIReference/API_SubmitJob.html
    const params = {
        jobDefinition: event.jobDefinition,
        jobName: event.jobName,
        jobQueue: event.jobQueue,
        containerOverrides: event.containerOverrides || null,
        parameters: event.parameters || null,
    };

    // Submit the Batch job
    try {
        const data = await new AWS.Batch().submitJob(params).promise();
        const jobId = data.jobId;
        console.log('jobId:', jobId);
        return jobId + '  ' + event.jobDefinition + '  ' + event.jobName + '  ' + event.jobQueue
            + '  ' + event.containerOverrides + '  ' + event.parameters.genes
            + '  ' + event.parameters.genelistName;
    } catch (err) {
        console.error(err);
        const message = `Error calling SubmitJob for: ${event.jobName}`;
        console.error(message);
        throw new Error(message);
    }
};

 

Select the vCPU and memory details and click Create.

Once this is done, you can start submitting jobs via the Jobs section by selecting the Job Definition and Job Queue, to test that everything is connected correctly.
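For completeness, the job definition from Step 3 can also be registered programmatically. The sketch below is an illustration under stated assumptions: the image name, execution role ARN, and the Ref:: placeholders in the command are made up to mirror the genes and genelistName parameters that our Lambda passes, not our exact definition:

const AWS = require('aws-sdk');
const batch = new AWS.Batch({ region: 'us-east-1' }); // region is an assumption

// Fargate job definition pointing at the public Docker Hub image (placeholder values)
const params = {
    jobDefinitionName: 'gene-analysis-job-def',
    type: 'container',
    platformCapabilities: ['FARGATE'],
    parameters: { genes: 'none', genelistName: 'none' }, // defaults, overridden per job
    containerProperties: {
        image: 'example-user/gene-analysis:latest',       // placeholder Docker Hub image
        command: ['./run.sh', 'Ref::genes', 'Ref::genelistName'], // Ref:: pulls from parameters
        executionRoleArn: 'arn:aws:iam::123456789012:role/ecsTaskExecutionRole', // placeholder
        resourceRequirements: [
            { type: 'VCPU', value: '1' },
            { type: 'MEMORY', value: '2048' },
        ],
        networkConfiguration: { assignPublicIp: 'ENABLED' },
    },
};

batch.registerJobDefinition(params)
    .promise()
    .then((res) => console.log('Registered:', res.jobDefinitionName, 'revision', res.revision))
    .catch(console.error);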

S3 + Lambda + SES – Though S3 is used by the Docker image for static files, it is also used as the storage for the results. On successful completion of the batch, the files are stored in the S3 bucket. A Lambda function is triggered on the creation of the files, which calls SES to send an email to the user who submitted the request.
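A minimal sketch of that notification Lambda is below. It assumes, purely for illustration, that the recipient's email address is known to the function; the sender, recipient, and result URL format here are placeholders:

const AWS = require('aws-sdk');
const ses = new AWS.SES({ region: 'us-east-1' }); // region is an assumption

// Triggered by s3:ObjectCreated:* on the results bucket
exports.handler = async (event) => {
    const record = event.Records[0];
    const bucket = record.s3.bucket.name;
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));

    // Placeholder: in practice the recipient is looked up for this particular job/report
    const recipient = 'user@example.com';
    const reportUrl = `https://${bucket}.s3.amazonaws.com/${key}`;

    const params = {
        Source: 'no-reply@example.com', // placeholder verified SES sender
        Destination: { ToAddresses: [recipient] },
        Message: {
            Subject: { Data: 'Your gene analysis report is ready' },
            Body: { Text: { Data: `Your report is available here: ${reportUrl}` } },
        },
    };

    await ses.sendEmail(params).promise();
    return `Notified ${recipient} about ${key}`;
};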

CloudWatch + Lambda – If CloudWatch records any unsuccessful batches, it sends an email with the log of the failed request to the admins of the system via SNS. A topic is created, and the admins are subscribed to it.
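A rough sketch of that failure handler is below, assuming it is wired to the “Batch Job State Change” event from CloudWatch Events and that the SNS topic ARN is supplied via an environment variable; both of those wiring details are assumptions rather than a record of our exact setup:

const AWS = require('aws-sdk');
const sns = new AWS.SNS({ region: 'us-east-1' }); // region is an assumption

// Triggered by a CloudWatch Events rule matching
// { "source": ["aws.batch"], "detail-type": ["Batch Job State Change"], "detail": { "status": ["FAILED"] } }
exports.handler = async (event) => {
    const detail = event.detail || {};

    const params = {
        TopicArn: process.env.ADMIN_TOPIC_ARN, // placeholder: set on the Lambda configuration
        Subject: `AWS Batch job failed: ${detail.jobName}`,
        Message: JSON.stringify(
            {
                jobId: detail.jobId,
                jobName: detail.jobName,
                statusReason: detail.statusReason,
                logStreamName: detail.container && detail.container.logStreamName,
            },
            null,
            2
        ),
    };

    await sns.publish(params).promise();
    return `Admins notified about job ${detail.jobId}`;
};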

 

When the time came to put all this together, it took us a few days of work rather than weeks or months (of course, knowing about these services and how they work together is the beauty of microservices and DevOps).

Owing to this infrastructure, a sister web application running a different algorithm but with a similar structure was revamped in 20% of the time the original one took.

 

What did the client get besides a faster turnaround time?

First, savings! With three servers running 24/7, we assume the client was paying high bills for resources that were not being used most of the time. The cost was optimized by using AWS Batch, which only runs when a batch is submitted; no other servers run high computation loads besides the thin-client machine, which is in any case a micro instance.

The application is highly maintainable. This is what we love about microservices and decoupled architecture. Rather than having one application that does everything and becomes a rabbit hole when we try to debug something, we have distributed our risk across services that each do one task (really well). Hence, no more bottlenecks.

99.99999%+ uptime: though there were a couple of outages on AWS last year, we have complete confidence in their infrastructure.

Having issues with your legacy systems? Are they burning a hole in your pocket? Let us help you with moving your application to the cloud and give your systems a new lifeline.