RDS is a managed relational database service offered by AWS. It gives any service access to an elastic relational database that is maintenance-free, at least from the user’s perspective. The up-front engineering effort for a project to use a relational database is minimal. Of course, all these conveniences come at a cost.
The cost of RDS comprises EC2 compute cost, EBS data storage cost, I/O cost, and outbound data transfer cost. These costs are measured by provisioned capacity and in some cases by consumed capacity. If you were to provision a db.r3.4xlarge RDS instance running MySQL with 1000 GB of gp2 storage, the on-demand monthly cost would be a minimum of $1379.70 (for the db.r3.4xlarge instance) + $115 (for storage) = $1494.70. We say ‘minimum’ because data transfer out of RDS costs money, and that cost depends on the amount of data transferred. If you use a Multi-AZ deployment, the cost doubles. There are additional costs for backups and so on. As with most AWS resources, AWS charges for the provisioned capacity, regardless of whether you fully utilize the RDS instance or the provisioned storage capacity and performance.
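The arithmetic above can be written as a small calculator. This is a minimal sketch: the function name and parameters are hypothetical, the prices are the figures quoted in this post rather than current AWS pricing, the per-GB rate is derived from the $115 for 1000 GB quoted above, and Multi-AZ is modeled as a simple doubling as described. Data transfer and backup charges are left out.

```python
from decimal import Decimal

CENT = Decimal("0.01")

def estimate_monthly_cost(instance_monthly, storage_gb, price_per_gb, multi_az=False):
    """Estimate the minimum monthly RDS cost from provisioned capacity only.

    All inputs are illustrative; data transfer and backup costs are not included.
    """
    instance = Decimal(str(instance_monthly))
    storage = Decimal(str(storage_gb)) * Decimal(str(price_per_gb))
    total = instance + storage
    if multi_az:
        # Assumption taken from the text: a Multi-AZ deployment doubles the cost.
        total *= 2
    return total.quantize(CENT)

# Figures from the example above: db.r3.4xlarge plus 1000 GB of gp2 storage.
single_az = estimate_monthly_cost("1379.70", 1000, "0.115")
multi_az = estimate_monthly_cost("1379.70", 1000, "0.115", multi_az=True)
```

Using `Decimal` instead of floats keeps the dollar amounts exact, which matters when the result is compared against a bill.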
To optimize RDS costs, it is important to select rightsized RDS instances and to make sure that the provisioned storage capacity and performance match the application needs. Customers can monitor RDS utilization in the AWS console and decide when to rightsize RDS instances and attached storage (reducing storage capacity manually is not an easy task, though). Alternatively, customers can use third-party tools such as FittedCloud to automate the entire process of monitoring and taking action to keep RDS instances optimized all the time (I will write another blog post on how that can be done manually).
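As a rough illustration of the manual rightsizing decision, the sketch below classifies an instance from a list of CPU utilization samples (in percent), such as values exported from the monitoring view in the AWS console. The function name and the 20% and 80% thresholds are assumptions chosen for illustration, not AWS or FittedCloud recommendations, and a real decision would also look at memory, I/O, and storage metrics.

```python
from statistics import mean

def rightsizing_hint(cpu_samples, low=20.0, high=80.0):
    """Return 'downsize', 'upsize', or 'keep' from CPU utilization samples (percent).

    The thresholds are hypothetical defaults; tune them for your workload.
    """
    if not cpu_samples:
        raise ValueError("at least one utilization sample is required")
    peak = max(cpu_samples)
    average = mean(cpu_samples)
    if peak < low:
        # Even the busiest sample is far below capacity: a smaller instance may do.
        return "downsize"
    if average > high:
        # The instance is busy most of the time: a larger instance may be needed.
        return "upsize"
    return "keep"

hint = rightsizing_hint([5.0, 10.0, 12.0])
```

The check uses the peak rather than the average for downsizing so that a short daily spike is not averaged away.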
There are also many other ways to control RDS costs. Here I will list some simple steps to manage RDS costs.
- Architectural Assessment: RDS offers the benefits of a relational data model and convenient SQL queries. However, it is a relatively expensive way to store data. Do an architecture assessment to make sure that a project really needs RDS. Take an e-commerce project as an example: RDS is appropriate for sales transaction data because of its ACID properties. On the other hand, product descriptions and pictures could be stored in cheaper storage systems such as S3.
- Proof of Concept: Do a small pilot project to try out different instance types and collect database usage patterns. When selecting an instance type, consider that a service workload may fluctuate during a day, on different days of the week, and throughout the year. There is no need to overprovision the instance based on unreliable workload predictions. There are always opportunities to switch instance types. Be aware that switching instance types requires downtime.
- Data Lifecycle Planning: Create a data lifecycle management policy. For data that has not been accessed for a while, have an automation tool move it off RDS to a less costly storage service such as S3. Sometimes the raw data is not needed after it has been processed. For example, after monthly business reports are generated and stored in S3, the original data on RDS may no longer be needed and can be removed from RDS. The company data retention policy also affects how long data needs to be kept in RDS.
- Avoid Data Dump: Because SQL is widely known and RDS is convenient to use, it is tempting to use RDS as a data dump. Resist that temptation. Set up a policy and train developers so that only data that needs to be in RDS is stored in it. Review the schema periodically and optimize it for RDS.
- SQL Optimization: Carefully design SQL queries to minimize the number of requests needed to generate a result. For example, instead of making multiple requests to produce a daily sales report, use a SQL join or a stored procedure to generate a sales table in one request.
- Cache Data: Use a cache to reduce RDS load and data transfer cost. You may use Amazon ElastiCache or your own cache mechanism. Adding a cache layer increases the complexity of your project. However, in the long run, the cost savings and the improved service performance could prove well worth the effort.
- Smooth Operations: For planned workloads, smooth out RDS utilization over the span of the workload. RDS typically has to be provisioned to meet peak workload requirements, so reducing the peaks and valleys of a workload allows a smaller, better-matched RDS instance type to be provisioned.
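The SQL Optimization step above can be sketched as follows. This is only an illustration: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for a MySQL database on RDS, and the `products` and `orders` tables, their columns, and the sample rows are all hypothetical. The point is that one JOIN with aggregation returns the whole daily report in a single request, instead of one request per product.

```python
import sqlite3

def daily_sales_report(conn, day):
    """Return (product name, units sold, revenue in cents) rows for one day in one query."""
    query = """
        SELECT p.name,
               SUM(o.quantity) AS units,
               SUM(o.quantity * p.unit_price_cents) AS revenue_cents
        FROM orders AS o
        JOIN products AS p ON p.id = o.product_id
        WHERE o.order_date = ?
        GROUP BY p.id, p.name
        ORDER BY p.name
    """
    return conn.execute(query, (day,)).fetchall()

# Hypothetical schema and data, created in memory purely for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, unit_price_cents INTEGER);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES products(id),
        quantity INTEGER,
        order_date TEXT
    );
    INSERT INTO products VALUES (1, 'mug', 500), (2, 'shirt', 1500);
    INSERT INTO orders (product_id, quantity, order_date) VALUES
        (1, 2, '2024-01-01'),
        (2, 1, '2024-01-01'),
        (1, 3, '2024-01-01'),
        (2, 4, '2024-01-02');
""")

report = daily_sales_report(conn, "2024-01-01")
```

The date is passed as a query parameter rather than formatted into the SQL string, which avoids SQL injection and lets the database reuse the statement.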
Although following these steps may seem like a lot of work, they are often already taken for on-premises projects. The benefit of these steps may be vague for on-premises projects, but it is quite obvious in AWS when the monthly bill comes in.
Sometimes, simple housekeeping can reduce the monthly AWS bill significantly. FittedCloud offers advanced machine-learning-driven RDS optimization. FittedCloud can automatically detect unused and under-utilized RDS instances and suggest corrective actions or utilization schedules. Please visit https://www.fittedcloud.com/aws-rds-optimization/ for details.