By Jonathan Katz, Principal Product Manager - Technical, AWS; Grant Scanlen, Sr. Database Specialist Solutions Architect, AWS; and Jagdish Mirani, Sr. Product Marketing Manager, AWS
Kubernetes (K8s) gives us an elegant approach to scaling stateless workloads. On its own, however, a scaled-out stateless workload will run into bottlenecks if the underlying database doesn’t have a corresponding approach to scaling. As an Amazon Web Services (AWS) managed service, Amazon Aurora Serverless v2 complements K8s by providing a similarly fluid ability to scale the database workload. This article describes how you can bridge these two worlds, and get the best of both, by managing Amazon Aurora as an external AWS managed resource directly from K8s.
Amazon Aurora: Enterprise-Ready Database, with Open Source Compatibility
Amazon Aurora is a modern relational database service offering unparalleled high performance and availability at global scale, fully open-source MySQL- and PostgreSQL-compatible editions, and a range of developer tools for building serverless and machine learning (ML)-driven applications.
Aurora features a distributed, fault-tolerant, and self-healing storage system that is decoupled from compute resources and auto-scales up to 128 TiB per database instance. It delivers high performance and availability with up to 15 low-latency read replicas, point-in-time recovery, continuous backup to Amazon Simple Storage Service (Amazon S3), and replication across three Availability Zones (AZs).
Amazon Aurora Serverless v2
Amazon Aurora Serverless v2’s automated capacity management works much like that of K8s: it continuously monitors capacity and adjusts by scaling up or down as needed. Aurora Serverless v2’s capacity auto-scaling is fast and granular, making it cost effective. With Aurora Serverless v2, you create a database, specify the desired database capacity range, and connect your applications. All the time-consuming administration tasks are fully managed and automated, including hardware provisioning, database setup, patching, and backups, while providing the security, availability, and reliability of commercial databases at 1/10th the cost. This frees your development and DBA teams to focus exclusively on application development and schema management.
Setting up an Aurora Serverless v2 database cluster with instances in multiple AZs allows it to operate in a highly available configuration that automatically fails over so that your database can remain up and running even in case of an AZ outage. The highlights, benefits, and key takeaways related to Aurora Serverless v2 are described in this IDC analyst report.
The automation provided by Aurora Serverless v2 for databases is a good complement for how K8s provides similar automation for containers that run your application code. Extending K8s to manage Aurora as an external resource achieves the best of both worlds—K8s to natively manage your container deployments, and Aurora Serverless v2 to manage your data from K8s as an external resource.
Bridging Kubernetes Workloads to AWS Databases
Managing Aurora in a K8s world relies on controllers that extend K8s. Controllers use the K8s API to control the lifecycle of custom resources that are not native to K8s, like databases. By using a controller for a database, the management of the database can be automated, much like a native K8s resource. You can choose to containerize and run your databases on K8s, or you can choose to run them as an externally managed resource that is connected to your K8s application. The latter approach provides some key benefits.
While controllers that manage databases running natively on K8s can provide a “set-and-forget” approach when everything works, what happens when there are problems? Troubleshooting can involve looking at many more places in the stack, including the application, the database, the controller, the storage layer, and the K8s environment itself. This greatly increases the error surface area and requires building a team with specialized expertise to ensure availability. Controllers that manage databases running as external resources avoid these problems to a much greater degree.
AWS has built controllers for many of our databases (AWS Controllers for Kubernetes, or ACK) that adopt the approach of managing them from K8s as external resources. With ACK, you can take advantage of AWS-managed services for your K8s applications without needing to define resources outside of the K8s cluster or run services that provide supporting capabilities like databases or message queues within the K8s cluster. Each ACK service controller manages resources for a particular AWS service, and is packaged into a separate container image that is published in a public repository.
How ACK Works
The idea behind AWS Controllers for Kubernetes (ACK) is to enable K8s users to describe the desired state of AWS resources using the K8s API and configuration language. ACK resources are defined in YAML-formatted manifest files, which are used both to create the initial resource configuration and to modify it later. Once the manifest file is written, you create the resource it defines by passing the file name to the “kubectl apply” command. To change a resource’s configuration, you simply edit the appropriate parameters in the existing manifest file, then run “kubectl apply” again in the same manner as the initial creation.
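As a minimal sketch of this pattern, an ACK manifest for an Amazon S3 bucket might look like the following (the resource name and bucket name are illustrative, and the apiVersion reflects the ACK convention of service-scoped API groups; consult the ACK documentation for your controller’s exact schema):

```yaml
# bucket.yaml -- illustrative ACK resource manifest.
# The metadata.name is the K8s object name; spec.name is the
# actual S3 bucket name the controller creates in your AWS account.
apiVersion: s3.services.k8s.aws/v1alpha1
kind: Bucket
metadata:
  name: my-app-bucket
spec:
  name: my-app-bucket-example-1234
```

Creating and updating the resource then follows the standard kubectl workflow: run `kubectl apply -f bucket.yaml` to create the bucket, then edit the file and re-apply it to change the configuration. The ACK controller reconciles the AWS resource to match the desired state in the manifest.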
How ACK works with Aurora Serverless v2
With the ACK service controller for Amazon Relational Database Service (Amazon RDS), you can provision an Aurora Serverless v2 database by creating manifest files that describe the required properties of the database cluster and instances. After cluster creation, these properties can be modified by updating the appropriate parameters in the manifest files. For example, to set or change the capacity of an Aurora Serverless v2 cluster, you would include the minCapacity and maxCapacity parameters in the manifest file with appropriate values.
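A sketch of such a manifest is shown below. The field names follow the ACK RDS controller’s DBCluster schema, but the apiVersion, engine version, and names here are assumptions for illustration; verify them against the current CRD reference before use. Note that Aurora Serverless v2 capacity is expressed in Aurora Capacity Units (ACUs).

```yaml
# aurora-cluster.yaml -- illustrative Aurora Serverless v2 cluster.
apiVersion: rds.services.k8s.aws/v1alpha1
kind: DBCluster
metadata:
  name: my-aurora-cluster
spec:
  dbClusterIdentifier: my-aurora-cluster
  engine: aurora-postgresql
  engineVersion: "15.4"        # example version
  masterUsername: adminuser
  masterUserPassword:          # reference to a K8s Secret holding the password
    namespace: default
    name: aurora-master-secret
    key: password
  serverlessV2ScalingConfiguration:
    minCapacity: 0.5           # minimum capacity in ACUs
    maxCapacity: 16            # maximum capacity in ACUs
```

To change the capacity range later, you would edit minCapacity and maxCapacity in this file and run `kubectl apply -f aurora-cluster.yaml` again; the controller updates the cluster to match. A companion DBInstance manifest (with the db.serverless instance class) would define the serverless instances that belong to the cluster.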