It’s All About the Multi-AZ Transaction Log
The distributed transaction log supports strongly consistent append operations and stores data encrypted in multiple AZs for both durability and availability. It also acts as a replication bus, giving all read-replicas one consistent, ordered stream of updates to consume. This enables read-replicas to have an eventually consistent view of the data that is present in the primary. It also allows clusters with fewer nodes to benefit from the same durability and consistency properties as larger clusters.
With a durable transaction log in place, we shifted focus to consistency and high availability. MemoryDB supports lossless failover. It does this by coordinating failover activities through the same transaction log that keeps track of update commands. A replica in steady state is eventually consistent, but it becomes strongly consistent during promotion to primary. Before accepting client commands as primary, a replica applies all unobserved changes from the transaction log and only then participates in the leader election process. Once the replica has been promoted through leader election, it is ready to append updates to the multi-AZ transaction log. This allows the system to provide linearizable consistency, the strongest form of consistency, for both reads and writes, even across failovers. It also ensures that there is always a single primary, preventing the “split brain” problems that other database systems typically exhibit under certain network partitions, where two nodes mistakenly accept writes at the same time and some of those writes are later thrown away.
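The promotion rule described above can be illustrated with a toy, in-process model. This is only a sketch under stated assumptions: `TransactionLog`, `Node`, `catch_up`, and `promote` are hypothetical names invented for illustration, the log is a plain Python list rather than a multi-AZ service, and leader election is reduced to a single flag. It is not MemoryDB's implementation.

```python
class TransactionLog:
    """Toy stand-in for a durable, ordered log (hypothetical, in-memory only)."""

    def __init__(self):
        self._entries = []

    def append(self, key, value):
        self._entries.append((key, value))

    def read_from(self, offset):
        # Return every entry at or after the given offset.
        return self._entries[offset:]


class Node:
    """Toy node that rebuilds its state purely from the shared log."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.data = {}
        self.applied = 0          # number of log entries applied so far
        self.is_primary = False

    def catch_up(self):
        # Apply all changes this node has not yet observed.
        for key, value in self.log.read_from(self.applied):
            self.data[key] = value
            self.applied += 1

    def promote(self):
        # Apply unobserved changes first, and only then take over as primary.
        self.catch_up()
        self.is_primary = True

    def write(self, key, value):
        if not self.is_primary:
            raise RuntimeError(f"{self.name} is not the primary")
        self.log.append(key, value)   # the write goes to the log first
        self.catch_up()               # then it is applied locally


log = TransactionLog()
primary = Node("node-a", log)
replica = Node("node-b", log)

primary.promote()
primary.write("price", 10)
replica.catch_up()             # replica has observed price=10
primary.write("price", 12)     # replica has not observed this yet

primary.is_primary = False     # simulate loss of the primary
replica.promote()              # catches up before serving as primary
print(replica.data)            # {'price': 12}
```

Because every write lands in the log before it is applied, the promoted node in this model cannot miss an update that the old primary accepted.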
Why Open Source Redis Node-Local Consistency May Not Be Enough
Open source Redis is well suited for high performance, non-durable use cases, but it is not designed for use cases that require durability guarantees. The Redis Append-Only File (AOF) provides a useful persistence mechanism, but it does not deliver database-grade durability. Redis was designed to be incredibly fast, with microsecond reads and writes, and it trades consistency for lower latency. Because data is stored in memory, any loss of the process (for example, from a power failure) means the node loses all of its data and must be rebuilt from scratch, which is computationally expensive and time-consuming. Even in a replicated or clustered setup, Redis replication is asynchronous, so writes can be lost if the primary node fails. Moreover, in rare cases you can lose your entire dataset even in a replicated environment. A single failure lowers the resilience of the entire system, because it raises the likelihood of cascading failure and permanent data loss. Asynchronous replication allows Redis to respond quickly, but it prevents the system from maintaining strong consistency during failures.
Durability isn’t the only requirement to improve consistency. Redis’s replication system is asynchronous: updates are sent to replicas only after the primary has committed them. If a primary fails, acknowledged updates can be lost. For example, in a catalog microservice, a price update to an item may be reverted after a node failure, causing the application to advertise an outdated price. This type of inconsistency is even harder to detect than losing an entire item.
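The catalog example can be reproduced with a small simulation. This is a hedged sketch, not Redis code: `AsyncPrimary`, `Replica`, and `replicate` are hypothetical names, and the replication stream is modeled as an in-memory queue that is drained only when `replicate` is called.

```python
from collections import deque


class AsyncPrimary:
    """Toy primary that acknowledges writes before they are replicated."""

    def __init__(self):
        self.data = {}
        self.pending = deque()    # updates not yet shipped to the replica

    def set(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "OK"               # acknowledged while the replica is still behind


class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


def replicate(primary, replica):
    # Ship every pending update; in this model it runs some time after the ack.
    while primary.pending:
        replica.apply(*primary.pending.popleft())


primary = AsyncPrimary()
replica = Replica()

primary.set("item:42:price", 10)
replicate(primary, replica)            # replica now holds price=10

ack = primary.set("item:42:price", 8)  # client is told the update succeeded
# The primary fails here, before replicate() runs again.
new_primary = replica                  # the stale replica takes over

print(ack, new_primary.data["item:42:price"])   # OK 10
```

The client received an acknowledgement for the new price, yet the promoted node still serves the old one, and nothing in the surviving data indicates that an update was dropped.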
Open source Redis has a number of mechanisms for tunable consistency, but none can guarantee strong consistency in a highly available distributed setup. For persistence to disk, Redis supports the AOF feature, where all update commands are written to disk in a file known as a transaction log. In the event of a process restart, the engine re-runs all of these logged commands and reconstructs the data structure state. Because this recovery process takes time, AOF is primarily useful for configurations that can afford to sacrifice availability. When AOF is used with replication, data loss can still occur: when the primary fails, a failover is initiated, and a stale replica can take over leadership, resulting in data loss. Also, in a cloud environment where the transaction log is stored in instance stores, there is a real risk of losing the data if a node stops or is terminated, or if an entire Availability Zone (AZ) fails. Even if the transaction log is stored on a network-attached disk, which increases overall latency, Redis would still need to resolve any inconsistencies among multiple node-local copies of the data.
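The log-and-replay idea behind AOF can be sketched in a few lines. The following is an illustration under stated assumptions, not Redis's implementation: `TinyStore` is a hypothetical class, only `SET` and `DEL` are modeled, and commands are stored as JSON lines for simplicity, whereas Redis writes its AOF in its own protocol format and lets you choose how often to fsync.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store that logs every update and replays the log on start."""

    def __init__(self, aof_path):
        self.aof_path = aof_path
        self.data = {}
        self._replay()                                  # rebuild state first
        self._aof = open(aof_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.aof_path):
            return
        with open(self.aof_path, encoding="utf-8") as f:
            for line in f:
                self._apply(json.loads(line))

    def _apply(self, cmd):
        op, key, *args = cmd
        if op == "SET":
            self.data[key] = args[0]
        elif op == "DEL":
            self.data.pop(key, None)

    def execute(self, *cmd):
        # Log the command and force it to disk before applying it in memory.
        self._aof.write(json.dumps(cmd) + "\n")
        self._aof.flush()
        os.fsync(self._aof.fileno())
        self._apply(list(cmd))

    def close(self):
        self._aof.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "appendonly.log")

    store = TinyStore(path)
    store.execute("SET", "item:1", "widget")
    store.execute("SET", "price:1", 10)
    store.execute("DEL", "item:1")
    before_restart = dict(store.data)
    store.close()                     # simulate a process restart

    restarted = TinyStore(path)       # replays every logged command
    recovered = dict(restarted.data)
    restarted.close()

print(recovered == before_restart)    # True
```

Replay time grows with the number of logged commands, which is why recovery from a log of this kind costs availability. The file also lives on a single node, so it does nothing for the replica-staleness problem described above.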
In short, open source Redis is built to be highly available, but it cannot provide any guarantee of consistency. For Redis to avoid losing an update, all replicas must process it. To ensure this, some customers use the WAIT command, which blocks the calling client until a requested number of replicas (here, all of them) have acknowledged the update or a timeout expires. This technique still does not turn Redis into a strongly consistent system. First, it allows reads of data not yet fully committed by the cluster (a “dirty read”). For example, an order in a retail shopping application may show as being successfully placed even though it could still be lost. Second, because every replica must acknowledge, writes will fail when any node fails, significantly reducing availability. These caveats are acceptable for a fast cache, where speed is the only attribute that matters, but they are nonstarters for an enterprise-grade database, which requires durability and strong consistency. Now with MemoryDB, such Redis workloads have access to a primary database that provides the reliability guarantees enterprises seek.
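The WAIT pattern and its dirty-read caveat can be sketched as follows. `set_then_wait` and `StubClient` are hypothetical helpers written for this illustration. The function only assumes a client object exposing `set(key, value)` and `wait(num_replicas, timeout)`, which matches the method names in the redis-py client; the stub stands in for a real server so the example runs on its own.

```python
def set_then_wait(client, key, value, num_replicas, timeout_ms=100):
    """Issue SET, then WAIT; return True only if enough replicas acknowledged."""
    client.set(key, value)                          # applied on the primary at once
    acked = client.wait(num_replicas, timeout_ms)   # number of replicas that acked
    return acked >= num_replicas


class StubClient:
    """Hypothetical stand-in for a Redis client; reports a fixed ack count."""

    def __init__(self, replicas_acking):
        self.replicas_acking = replicas_acking
        self.data = {}

    def set(self, key, value):
        self.data[key] = value
        return True

    def get(self, key):
        return self.data.get(key)

    def wait(self, num_replicas, timeout):
        return self.replicas_acking


healthy = StubClient(replicas_acking=2)
print(set_then_wait(healthy, "order:1", "placed", num_replicas=2))   # True

degraded = StubClient(replicas_acking=1)     # one of the two replicas is down
print(set_then_wait(degraded, "order:1", "placed", num_replicas=2))  # False
print(degraded.get("order:1"))               # placed
```

In the degraded case the caller learns that the write was not fully acknowledged, yet the value is already readable on the primary. That is the dirty read described above, and the write itself is not rolled back.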
Conclusion
Modern application workloads place increasingly high demands on databases in terms of latency, throughput, and concurrency, while also requiring guarantees of security, durability, and availability. Many customers address these needs by adding a cache to existing database systems, for example, by introducing Amazon ElastiCache to a system with an existing database of record. MemoryDB offers a simpler alternative. MemoryDB can be used as a system of record that synchronously persists every write request to disk across multiple AZs with strong consistency, durability, and high availability. In other words, using MemoryDB, businesses gain ultra-fast responsiveness from their IT infrastructure, seamlessly operationalizing their data at in-memory speeds without compromising on data accuracy. Since MemoryDB supports the same Redis data structures as open source Redis, it can be a drop-in replacement for workloads that benefit from durability. As enterprises build out the infrastructure for their applications, they no longer need to support a dual database environment with both a cache and a database to ensure blazing fast speed. They now have access to a single, ultra-fast database which also reduces operational overhead.
Customers looking for a strongly consistent, durable, in-memory database that offers microsecond-scale read performance should consider MemoryDB. To learn more, visit the Getting Started page for MemoryDB. If you have any questions, you can contact the team directly at memorydb-help@amazon.com.