Machine-learning driven Optimization outperforms Auto Scaling in DynamoDB capacity management

DynamoDB is a fully managed, high-performance, highly scalable NoSQL database service offered by AWS. It offers virtually unlimited performance and storage capacity and supports dynamic scaling. One of the challenges of using DynamoDB is provisioning read/write capacity. Under-provisioning can lead to application failures, while over-provisioning can waste a lot of money. Customers often take a simplistic approach that keeps the application running without errors: they over-provision based on a guesstimate of the maximum capacity needed. More creative customers use scripts that adjust provisioning on a schedule, which works for many workloads with fixed utilization patterns. Either way, the burden is on the customer to set provisioning to match application utilization.
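As a rough illustration of the schedule-based scripts mentioned above, here is a minimal Python sketch. The window times, the capacity values, and the names SCHEDULE, DEFAULT_WCU, in_window and scheduled_write_capacity are all hypothetical. A real script would pass the result to DynamoDB's UpdateTable operation; that call is left out so the sketch runs without AWS credentials.

```python
from datetime import time

# Hypothetical schedule: (window start, window end, write capacity units).
# A window may wrap past midnight, e.g. 22:30 to 01:30.
SCHEDULE = [
    (time(22, 30), time(1, 30), 500),
]
DEFAULT_WCU = 10  # hypothetical capacity outside every window


def in_window(now, start, end):
    """Return True if `now` falls in [start, end), allowing wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def scheduled_write_capacity(now):
    """Return the write capacity the schedule asks for at time-of-day `now`."""
    for start, end, units in SCHEDULE:
        if in_window(now, start, end):
            return units
    return DEFAULT_WCU
```

A cron job could call scheduled_write_capacity(datetime.now().time()) every few minutes and update the table only when the value changes. The weakness is the one described above: the schedule is fixed, so any shift in the workload has to be noticed and corrected by hand.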

A few months ago, AWS announced an ‘auto scaling’ feature for DynamoDB to address this issue. With auto scaling, read and write provisioning increases or decreases based on application utilization. Customers set a desired target utilization and upper and lower limits for read/write capacity, and DynamoDB automatically adjusts the provisioning. This is a great feature that relieves customers of the burden of monitoring and adjusting provisioning to keep costs under control. However, our understanding is that auto scaling takes a ‘reactive’ approach to capacity provisioning. This would mean that while it works well for scaling up, it may not be very effective at scaling capacity down (and scaling down, one could argue, may not be in the interest of AWS).
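To make the three settings concrete (target utilization, lower limit, upper limit), the sketch below builds the request parameters that the Application Auto Scaling API expects for a table's write capacity. It only constructs dictionaries. The function name, the policy name format, and the example values are assumptions made for illustration.

```python
def build_write_autoscaling_requests(table_name, min_wcu, max_wcu, target_pct):
    """Build parameters for register_scalable_target and put_scaling_policy.

    Nothing is sent to AWS here; the caller decides whether to apply them.
    """
    resource_id = f"table/{table_name}"
    dimension = "dynamodb:table:WriteCapacityUnits"

    # Lower and upper limits for provisioned write capacity.
    scalable_target = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_wcu,
        "MaxCapacity": max_wcu,
    }

    # Target tracking on consumed/provisioned write utilization (percent).
    scaling_policy = {
        "PolicyName": f"{table_name}-write-target-tracking",  # hypothetical name
        "ServiceNamespace": "dynamodb",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_pct),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
            },
        },
    }
    return scalable_target, scaling_policy


# Applying them with boto3 would look roughly like this (not executed here):
#   client = boto3.client("application-autoscaling")
#   client.register_scalable_target(**scalable_target)
#   client.put_scaling_policy(**scaling_policy)
```

Read capacity is configured the same way with the dynamodb:table:ReadCapacityUnits dimension and the DynamoDBReadCapacityUtilization metric.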

FittedCloud has offered machine learning driven capacity provisioning for DynamoDB for a while now, since long before auto scaling was released. The FittedCloud solution learns from historical utilization patterns, then predicts and sets provisioning for optimal resource use. It runs in a completely lights-out manner: all a customer has to do is enable machine learning driven provisioning in our SaaS interface. Because the solution uses adaptive machine learning, it not only learns from historical utilization patterns but also reacts in real time to unexpected capacity spikes, which makes it very effective.

We often get questions from customers about how FittedCloud machine learning driven DynamoDB optimization compares with DynamoDB auto scaling, so we ran some tests in our environment to compare the two.

Test Environment

The test was set up to apply FittedCloud Machine Learning driven optimization first, then switch to the AWS auto scaling method, and then compare the cost savings.

Application Details

The application writes to the DynamoDB table daily around midnight, from 23:00 to 01:00. The write activity usually generates a few write spikes during that period.

Results

As shown in the graph above, FittedCloud Machine Learning driven optimization was able to provision capacity that more closely matched the capacity actually consumed. It takes a more holistic view of the table’s activity, combines it with knowledge of DynamoDB’s limit of 4 capacity decreases per day, and provisions capacity in a more optimal way. AWS auto scaling, on the other hand, either used up its 4 decreases early in the day or was unable to decrease capacity because the table was inactive.
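The effect of the 4-decreases-per-day limit can be illustrated with a toy simulation. This is not AWS's auto scaling algorithm and not FittedCloud's model. It is a simplified sketch that assumes hourly decisions, a 70% utilization target, and a made-up workload with write bursts only in the first and last hour of the day. The function simulate_day and all numbers are hypothetical.

```python
DAILY_DECREASE_LIMIT = 4  # the per-day limit on capacity decreases described above


def simulate_day(consumed, start, floor, target_pct=70, step_pct=80):
    """Return the provisioned capacity for each hour of one day.

    consumed:  consumed write units per hour (list of ints)
    step_pct:  on a decrease, capacity falls to step_pct% of its current
               value, but never below what is needed; 0 means "drop
               straight to what is needed in a single decrease"
    """
    provisioned, decreases, history = start, 0, []
    for used in consumed:
        # Capacity needed to keep utilization at target (integer ceiling).
        needed = max(floor, -(-used * 100 // target_pct))
        if needed > provisioned:
            provisioned = needed  # increases are not limited in this model
        elif needed < provisioned and decreases < DAILY_DECREASE_LIMIT:
            provisioned = max(needed, provisioned * step_pct // 100)
            decreases += 1
        history.append(provisioned)
    return history


# Hypothetical workload: bursts in hour 0 and hour 23, nearly idle otherwise.
workload = [700] + [7] * 22 + [700]
small_steps = simulate_day(workload, start=1000, floor=5, step_pct=80)
single_drop = simulate_day(workload, start=1000, floor=5, step_pct=0)
```

With this input, the small-step policy spends all four decreases in the first few quiet hours and then stays at 409 units until the evening burst, while the single-drop policy sits at 10 units for the whole quiet period. Comparing sum(small_steps) with sum(single_drop) gives the provisioned unit-hours for each policy.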

Cost savings

In this case, FittedCloud Machine Learning driven optimization cost 89% less than DynamoDB auto scaling.
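For readers who want to run the same comparison on their own tables, a percentage like this is the relative difference between the two provisioning costs over the same period. The helper below is a minimal sketch, and the function name is hypothetical.

```python
def cost_reduction_pct(baseline_cost, optimized_cost):
    """Percentage by which optimized_cost is lower than baseline_cost."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 100 * (baseline_cost - optimized_cost) / baseline_cost
```

For example, with made-up costs of 200 for the baseline and 22 for the optimized run (not the measured figures from this test), cost_reduction_pct(200, 22) returns 89.0.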

Conclusion

It is clear that FittedCloud machine learning driven DynamoDB optimization significantly outperformed AWS auto scaling in this test. While auto scaling performed well at scaling capacity up, it did poorly at scaling down to reduce cost. This is consistent with our understanding that auto scaling uses a ‘reactive’ approach to provisioning and hence has no intelligent way to know when to scale down, other than basic rules such as dropping capacity after a period of inactivity (how long a period is unclear to us). It also seems that auto scaling drops capacity in small steps and often exhausts the limit of 4 scale-downs per day. Given that limit, small-step logic doesn’t work well.

While auto scaling is a great feature, our belief is that an adaptive machine learning driven approach is far more accurate and effective than DynamoDB auto scaling.

We welcome you to give our solution a try here.