Imagine that you run some jobs/tasks only during your office hours, which would usually be 8 AM to 8 PM, that is, 12 hours a day for 5 days a week. That makes 12*5=60 hours per week. But how long are your servers running (and billing you)? 24*7=168 hours.
Very often we neglect this because we feel the cost of building a system to optimize costs would outweigh the benefit. We will try to show how you can actually save $$$ by spending $.
For our example, we will use a t3 extra-large machine (t3.xlarge in AWS jargon; essentially a machine with 4 vCPUs and 16 GB of RAM, robust enough for gene processing or image processing kinds of work). In fact, we will use 4 of them to help with parallel processing.
Cost of 1 machine: $0.1670 per hour
1-day cost: 0.1670*24 = $4.008
30-day cost: 4.008*30 = $120.24
30-day cost of 4 machines: 120.24*4 = $480.96
Annual cost of 4 machines: 480.96*12 = $5771.52
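The arithmetic above can be reproduced with a few lines of JavaScript. This is only a sketch: the constant names are ours, and the hourly rate is the one quoted above (actual pricing varies by region and over time).

```javascript
// Figures taken from the text above; the names are illustrative.
const HOURLY_RATE_USD = 0.167; // one t3 extra-large machine, per hour
const MACHINES = 4;

const dailyCost = HOURLY_RATE_USD * 24;
const monthlyCost = dailyCost * 30; // 30-day month
const fleetMonthlyCost = monthlyCost * MACHINES;
const fleetAnnualCost = fleetMonthlyCost * 12;

// Share of the week the machines are actually needed (8 AM to 8 PM, 5 days).
const officeHoursPerWeek = 12 * 5;
const totalHoursPerWeek = 24 * 7;
const utilisation = officeHoursPerWeek / totalHoursPerWeek;

console.log("30-day cost, 4 machines: $" + fleetMonthlyCost.toFixed(2));
console.log("Annual cost, 4 machines: $" + fleetAnnualCost.toFixed(2));
console.log("Hours actually needed: " + (utilisation * 100).toFixed(1) + "%");
```

The last line is the interesting one: the machines are needed for only a little over a third of the hours they are billed for.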
We propose to reduce this cost by 75%+ (you will have to trust us on this one, and of course continue reading this article).
The initial case we asked you to imagine (running the servers only during office hours) would bring our costs down by roughly 65%, but it suffers from some drawbacks:
- What if some processing is needed during out-of-office hours?
- What if you want to productize this solution of yours and sell it as SaaS?
- Who is the gatekeeper who starts and stops the servers? If this is automated, how will the system know whether someone is in the middle of a task?
- We can switch machines on and off based on the load, but the initial load time for the server to be in a ready state to start accepting jobs could be a deal-breaker.
These are some very important questions that have a direct relation with the business and of course your bottom line.
Hibernation to the rescue:
AWS came out with the hibernate feature a while back, starting with Linux instances and later launching it for Windows. The objective is that you can hibernate your instance from your machine, from the AWS console, or via code (in our case we use Lambda functions built with AWS SAM), and when the instance starts again, it does not boot from scratch but resumes from where you left it. Think of it as closing your laptop and opening it to find everything right where it was. Using this approach, the time taken to get your system into the ready state is almost 0 (machine start -> pending -> system checks; this process does take a minute).
In our case, we created a listener function whose job was to find out the number of items in the queue (jobs to process). Sharing some code with you:
console.log("No of items in waiting: " + itemswaiting);
console.log("Seed value: " + seed);
console.log("Math Ceil: " + Math.ceil(itemswaiting / seed));
The above gives us an idea of how many servers are required to process the videos optimally.
In our case, the seed value is 2.
Let us assume that we have 5 videos waiting to be processed. The number of servers needed as per our calculation is:
Ceiling of (5/2) - 1 = 3 - 1 = 2
Hence, according to this calculation, we need 2 servers for this job.
We have tagged one server as primary, and multiple secondary servers (currently 2). In this case, we will run the primary server and 1 secondary server.
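The worked example above (5 waiting items, a seed of 2, 2 servers) can be sketched as a small function. This is not our production code: `serversNeeded` and `maxServers` are illustrative names, the subtraction of 1 simply follows the worked example, and the clamping (at least one server when work is waiting, never more than the fleet size) is an assumption added for the sketch.

```javascript
// Hypothetical helper that mirrors the worked example in the text.
function serversNeeded(itemsWaiting, seed, maxServers) {
  // Nothing in the queue: every server can stay hibernated.
  if (itemsWaiting <= 0) return 0;

  // Same calculation as in the text: ceiling of (items / seed), minus 1.
  const raw = Math.ceil(itemsWaiting / seed) - 1;

  // Assumption: run at least the primary server when there is work,
  // and never ask for more servers than exist (1 primary + 2 secondary = 3).
  return Math.min(Math.max(raw, 1), maxServers);
}

console.log("Servers to run: " + serversNeeded(5, 2, 3));
```

With 5 waiting items and a seed of 2, this gives the primary server plus 1 secondary server, matching the example.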
As soon as the listener function says that there are no more jobs to process, the servers go into hibernation.
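For readers who want to try this from code, here is a minimal sketch of the two calls involved. It assumes the AWS SDK for JavaScript v3 (`@aws-sdk/client-ec2`), which ships with the Node.js 18 and later Lambda runtimes. The function names are ours and hypothetical; error handling, retries, and IAM permissions are left out.

```javascript
// Sketch only, not our production Lambda code.

// Hibernating is a regular stop request with the Hibernate flag set.
function buildHibernateParams(instanceIds) {
  return { InstanceIds: instanceIds, Hibernate: true };
}

// Resuming a hibernated instance is a regular start request.
function buildStartParams(instanceIds) {
  return { InstanceIds: instanceIds };
}

async function hibernateInstances(instanceIds) {
  // Loaded lazily so the helpers above work even without the SDK installed.
  const { EC2Client, StopInstancesCommand } = require("@aws-sdk/client-ec2");
  const client = new EC2Client({});
  return client.send(new StopInstancesCommand(buildHibernateParams(instanceIds)));
}

async function resumeInstances(instanceIds) {
  const { EC2Client, StartInstancesCommand } = require("@aws-sdk/client-ec2");
  const client = new EC2Client({});
  return client.send(new StartInstancesCommand(buildStartParams(instanceIds)));
}
```

A listener could call `resumeInstances` with the IDs of the servers it needs when jobs arrive, and `hibernateInstances` once the queue is empty.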
To give you a better idea, the diagram below shows how the hibernation process works:
As per the AWS docs, when you hibernate a running instance, the following happens:
- When you initiate hibernation, the instance moves to the stopping state. Amazon EC2 signals the operating system to perform hibernation (suspend-to-disk). The hibernation freezes all of the processes, saves the contents of the RAM to the EBS root volume, and then performs a regular shutdown.
- After the shutdown is complete, the instance moves to the stopped state.
- Any EBS volumes remain attached to the instance, and their data persists, including the saved contents of the RAM.
- Any Amazon EC2 instance store volumes remain attached to the instance, but the data on the instance store volumes is lost.
- In most cases, the instance is migrated to a new underlying host computer when it’s started. This is also what happens when you stop and start an instance.
- When you start the instance, the instance boots up and the operating system reads in the contents of the RAM from the EBS root volume, before unfreezing processes to resume its state.
- The instance retains its private IPv4 addresses and any IPv6 addresses. When you start the instance, the instance continues to retain its private IPv4 addresses and any IPv6 addresses.
- Amazon EC2 releases the public IPv4 address. When you start the instance, Amazon EC2 assigns a new public IPv4 address to the instance.
- The instance retains its associated Elastic IP addresses. You’re charged for any Elastic IP addresses that are associated with a hibernated instance. With EC2-Classic, an Elastic IP address is disassociated from your instance when you hibernate it. For more information, see EC2-Classic.
- When you hibernate a ClassicLink instance, it’s unlinked from the VPC to which it was linked. You must link the instance to the VPC again after starting it. For more information, see ClassicLink.
To get started with hibernation, you need to choose a supported instance type, as not all instance types support this feature.
After that, while creating the instance, you need to check the box that enables hibernation.
Please keep in mind that an existing instance cannot be edited to enable hibernation; you need to create the instance from scratch with hibernation enabled.
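If you launch instances from code rather than from the console, the equivalent of that checkbox is the `HibernationOptions` field of the EC2 `RunInstances` request. The sketch below only builds the request parameters. The AMI ID, root device name, and volume size are placeholders you would supply yourself, and `buildLaunchParams` is a hypothetical helper name. AWS also requires the root EBS volume to be encrypted and large enough to hold the contents of the RAM.

```javascript
// Sketch only: builds parameters for an EC2 RunInstances request.
// imageId, rootDeviceName and rootVolumeGiB are placeholders supplied by the caller.
function buildLaunchParams(imageId, rootDeviceName, rootVolumeGiB) {
  return {
    ImageId: imageId,
    InstanceType: "t3.xlarge", // 4 vCPUs, 16 GB of RAM, as in our example
    MinCount: 1,
    MaxCount: 1,
    // Hibernation can only be switched on at launch time.
    HibernationOptions: { Configured: true },
    BlockDeviceMappings: [
      {
        DeviceName: rootDeviceName,
        // The root volume must be encrypted and big enough to store the RAM.
        Ebs: { VolumeSize: rootVolumeGiB, Encrypted: true },
      },
    ],
  };
}
```

With the AWS SDK for JavaScript v3, the returned object could be passed to `new RunInstancesCommand(params)` from `@aws-sdk/client-ec2`.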
Another point worth noting is the set of limitations that come with hibernation, as listed in the AWS documentation:
- When you hibernate an instance, the data on any instance store volumes is lost.
- You can’t hibernate an instance that has more than 150 GB of RAM.
- If you create a snapshot or AMI from an instance that is hibernated or has hibernation enabled, you might not be able to connect to a new instance that is launched from the AMI or from an AMI that was created from the snapshot.
- You can’t change the instance type or size of an instance when hibernation is enabled.
- You can’t hibernate an instance that is in an Auto Scaling group or used by Amazon ECS. If your instance is in an Auto Scaling group and you try to hibernate it, the Amazon EC2 Auto Scaling service marks the stopped instance as unhealthy and might terminate it and launch a replacement instance. For more information, see Health Checks for Auto Scaling Instances in the Amazon EC2 Auto Scaling User Guide.
- You can’t hibernate an instance that is configured to boot in UEFI mode.
- If you hibernate an instance that was launched into a Capacity Reservation, the Capacity Reservation does not ensure that the hibernated instance can resume after you try to start it.
- We do not support keeping an instance hibernated for more than 60 days. To keep the instance for longer than 60 days, you must start the hibernated instance, stop the instance, and start it.
- We constantly update our platform with upgrades and security patches, which can conflict with existing hibernated instances. We notify you about critical updates that require a start for hibernated instances so that we can perform a shutdown or a reboot to apply the necessary upgrades and security patches.
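Some of these limitations are easy to check before you design around hibernation. The sketch below encodes a few of them as quoted above. The shape of the `instance` object and the function names are hypothetical, and the limits themselves may have changed since this was written, so treat the AWS documentation as the source of truth.

```javascript
// Limits as quoted in the list above; verify against the current AWS docs.
const MAX_RAM_GB = 150;
const MAX_HIBERNATED_DAYS = 60;

// `instance` is a hypothetical plain object describing a server.
function hibernationBlockers(instance) {
  const blockers = [];
  if (instance.ramGb > MAX_RAM_GB) blockers.push("more than 150 GB of RAM");
  if (instance.inAutoScalingGroup) blockers.push("part of an Auto Scaling group");
  if (instance.usedByEcs) blockers.push("used by Amazon ECS");
  if (instance.bootMode === "uefi") blockers.push("boots in UEFI mode");
  return blockers;
}

// True once an instance has stayed hibernated for more than 60 days.
function exceedsHibernationWindow(hibernatedAt, now) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return (now.getTime() - hibernatedAt.getTime()) / msPerDay > MAX_HIBERNATED_DAYS;
}

const ourServer = { ramGb: 16, inAutoScalingGroup: false, usedByEcs: false, bootMode: "legacy-bios" };
console.log("Blockers: " + JSON.stringify(hibernationBlockers(ourServer)));
```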
Now that we know about hibernation in detail, here are the savings we were able to bring in for the customer:
Previous monthly cost – $480.96
March 2022 – $226
April 2022 – $111
There were optimizations made to the code in March which helped bring the costs further down in April, making our savings almost 77%.
At the April run rate, this reduces the annual bill from $5771.52 to roughly $1400.
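The percentages can be checked with a short calculation. The figures are the ones quoted above; the variable and function names are ours.

```javascript
// Monthly bills quoted above, in USD.
const previousMonthly = 480.96;
const monthlyBills = { "March 2022": 226, "April 2022": 111 };

// Percentage saved relative to the always-on bill.
function savingsPercent(before, after) {
  return ((before - after) / before) * 100;
}

for (const [month, cost] of Object.entries(monthlyBills)) {
  console.log(month + ": " + savingsPercent(previousMonthly, cost).toFixed(1) + "% saved");
}

// Twelve months at the April bill, for comparison with the annual figure above.
const annualAtAprilRate = monthlyBills["April 2022"] * 12;
console.log("Annual bill at the April rate: $" + annualAtAprilRate);
```

Twelve months at the April bill comes to $1332, which is in line with the rounded annual figure of about $1400.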
Some would argue that there is a cost to building such a process, but in our experience the benefits far outweigh that cost. Also, we are currently looking at only 3 instances; in theory, this process can now scale to n servers.
Are you also looking at reducing your AWS bill or exploring ways to optimize your infrastructure? In that case, please get in touch with us and help us help you save $$$.