Slash Down Your Hosting Costs By 95% On Google App Engine
Tax season is always a busy time for financial service websites. One of our clients, a tax return portal, anticipated heavy use of their app and was looking to decrease infrastructure costs. The app could not afford any downtime during this period, and it was also serving other clients, such as CrossLink, so taking it offline was not an option. The hosting and operating costs, however, were nearing $5,000 per month.
This article describes the changes we made to the app’s infrastructure and settings, and how they resulted in a 95% drop in hosting costs. Pretty amazing, right?
Here’s How You Do It:
Simple Changes Bring Remarkable Efficiency
Originally, this project ran CRON jobs that polled third-party APIs. The CRON was essentially being executed every 2 minutes, which kept the instances up. We employed deferred tasks instead, which reduced resource utilization. The instance classes were also poorly planned.
We introduced smaller instances, moving from F4 to F2. The F2 instances not only served 10 requests per second but also supported the more memory-intensive operations.
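To see why a two-minute schedule is costly, here is a rough back-of-the-envelope sketch. The function names and the figure of 50 useful jobs per day are hypothetical, chosen only for illustration; they are not measurements from the project.

```python
MINUTES_PER_DAY = 24 * 60


def cron_runs_per_day(interval_minutes):
    """Number of times a fixed-interval cron job fires in one day."""
    return MINUTES_PER_DAY // interval_minutes


def idle_runs_per_day(interval_minutes, runs_with_work):
    """Cron runs that found nothing to do (runs_with_work is a hypothetical figure)."""
    return max(cron_runs_per_day(interval_minutes) - runs_with_work, 0)


# A 2-minute schedule fires 720 times a day whether or not there is work.
print(cron_runs_per_day(2))
# If only 50 of those runs had real work, 670 of them woke an instance for nothing.
print(idle_runs_per_day(2, 50))
```

A deferred task, by contrast, is enqueued only when there is something to do, so the number of executions follows the actual workload rather than the clock.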
This is the configuration we used:
instance_class: F2
automatic_scaling:
  min_idle_instances: 1
  max_idle_instances: 10
Despite the fact that Google App Engine offers a number of great features, it is often criticized for its higher cost compared to other cloud-based platforms. But it’s unfair to blame App Engine alone, as developers often overlook the most basic of configurations starting out, only to end up with a massive bill.
Instance-related parameters are the main levers for controlling the performance and cost of an App Engine application.
But the major question remains: how should you configure the hosting parameters of the application?
Here’s how we did it:
This can be decided from the metrics App Engine provides, such as the request rate (number of requests per second), traffic (bytes per second), and latency of the application, all of which can be read from the App Engine Dashboard.
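As a rough illustration of how those Dashboard metrics can feed a sizing decision, the sketch below applies Little's law (average requests in flight = request rate × latency). This is a simplified heuristic and not how App Engine's scheduler actually works; the function names, the 0.5 second latency, and the per-instance concurrency of 8 are assumptions for the example.

```python
import math


def in_flight_requests(requests_per_second, avg_latency_seconds):
    """Little's law: average concurrent requests = arrival rate * time in system."""
    return requests_per_second * avg_latency_seconds


def instances_needed(requests_per_second, avg_latency_seconds, concurrent_per_instance):
    """Rough number of instances needed to hold the average in-flight load."""
    load = in_flight_requests(requests_per_second, avg_latency_seconds)
    return max(1, math.ceil(load / concurrent_per_instance))


# 10 requests/s at an assumed 0.5 s latency, 8 concurrent requests per instance.
print(instances_needed(10, 0.5, 8))   # 1
# A hypothetical spike to 200 requests/s with the same latency.
print(instances_needed(200, 0.5, 8))  # 13
```

The estimate covers average load only; spikes and instance start-up time are what the idle-instance and pending-latency settings are meant to absorb.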
For those who don’t know, Google App Engine is Google’s product for managed hosting. It supports various languages (Python, Java, PHP, etc.) and puts various tools at the developer’s disposal, including:
- Task Queues
- CRON jobs
- MapReduce
It also offers auto scaling, version management, and memcache, and its architecture makes it easy to deploy and manage microservices.
All these features and tools come with a quota limit for free usage. Once the quota is exceeded for more than a day, the user needs to enable billing for further usage.
The most expensive resource of an App Engine based application is the front-end instance. The cost increases with the number of such instances. The fee for this is calculated by the number of hours an instance is up and running.
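The instance-hour billing described above can be sketched as a small estimate. It assumes F2 and F4 hours are billed at two and four times the F1 rate; the hourly rate, the 730-hour month, and the instance counts are placeholders, so check the current pricing page before relying on any figure.

```python
# Billing multipliers relative to one F1 instance hour (assumed: F2 = 2x, F4 = 4x).
CLASS_MULTIPLIER = {"F1": 1, "F2": 2, "F4": 4}


def monthly_instance_cost(instance_class, avg_instances, rate_per_f1_hour, hours=730):
    """Estimate the monthly front-end instance cost.

    avg_instances: average number of instances running over the month.
    rate_per_f1_hour: price of one F1 instance hour (placeholder value).
    hours: hours in the billing month (730 is an approximation).
    """
    billed_hours = avg_instances * hours * CLASS_MULTIPLIER[instance_class]
    return round(billed_hours * rate_per_f1_hour, 2)


RATE = 0.05  # hypothetical price per F1 instance hour

# Ten F4 instances kept up all month versus two F2 instances.
print(monthly_instance_cost("F4", 10, RATE))  # 1460.0
print(monthly_instance_cost("F2", 2, RATE))   # 146.0
```

Both the instance class and the number of instances kept running multiply into the bill, which is why the parameters below matter so much.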
Several parameters control instance spawning and uptime:
Auto Scaling: App Engine assumes automatic scaling by default, with an instance class of F1, if nothing is specified in the app.yaml file.
Instance Class: Controls the class of instance used to serve the application on App Engine. F1, F2 and F4 each have their own memory and processing power.
Maximum Idle Instances: Maximum number of idle instances that are to be maintained by App Engine for a particular deployed version. This parameter plays an essential role in controlling the speed and performance metrics of the application.
A high maximum reduces the number of idle instances more gradually when load levels return to normal after a spike. This helps your application maintain steady performance through fluctuations in request load, but it also keeps more idle instances running, which increases the cost.
A low maximum keeps running costs lower, but can degrade performance in the face of volatile load levels.
Minimum Idle Instances: This defines the number of resident instances that App Engine should maintain for an application’s version.
A lower value degrades performance but keeps running costs low.
A higher value lets the application handle spikes more easily, as App Engine keeps these ‘resident’ instances running at all times, though at a higher running cost.
Minimum Pending Latency: The minimum amount of time that App Engine should allow for a request to wait in the pending queue before starting a new instance to handle it.
A low minimum means that requests spend less time in the pending queue when all existing instances are active. This improves performance, but running costs can be very high, as App Engine charges based on the running hours of instances.
A high minimum means that requests remain pending for longer when all existing instances are active, so a request has to wait longer before App Engine spawns a new instance.
Maximum Concurrent Requests: The number of concurrent requests an automatic scaling instance can accept before the scheduler spawns a new instance.
You might experience increased API latency if this setting is too high. Note that the scheduler might spawn a new instance before the actual maximum number of requests is reached.
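To show where these parameters live, the sketch below collects them in a Python dictionary and prints them as app.yaml text. The instance class and idle-instance values are the ones we used; the `min_pending_latency` and `max_concurrent_requests` values are illustrative only, and `render_yaml` is a small hypothetical helper, not part of any App Engine tooling.

```python
def render_yaml(settings, indent=0):
    """Render a nested dict of settings as a list of simple YAML lines."""
    lines = []
    pad = "  " * indent
    for key, value in settings.items():
        if isinstance(value, dict):
            # Nested section: emit the key, then its children one level deeper.
            lines.append(f"{pad}{key}:")
            lines.extend(render_yaml(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


app_settings = {
    "instance_class": "F2",
    "automatic_scaling": {
        "min_idle_instances": 1,
        "max_idle_instances": 10,
        "min_pending_latency": "30ms",   # illustrative value
        "max_concurrent_requests": 8,    # illustrative value
    },
}

print("\n".join(render_yaml(app_settings)))
```

Keeping the settings in one place like this makes it easier to compare variants while tuning against the Dashboard metrics.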
As you can see, there are many options, each with pros and cons that depend on the needs of the app, and they have to be balanced correctly. It’s a good idea to work together to determine which settings work for different types of apps and instances.
Share the settings that work for you and, who knows, together we can come up with the best matrix for cost-efficient hosting on Google App Engine!
Originally published at www.systango.com on March 30, 2017.