For the past few years, database vendors have been busily enhancing their cloud offerings and consolidating the innovations that arose more than 10 years ago from the big data and NoSQL movements. While both NoSQL and big data were enormously influential for database technology, it remains true that the vast majority of databases are running on architectures that are positively ancient in computer science terms.
The most popular databases in use today are Oracle, SQL Server, MySQL, Postgres, and MongoDB. Of these five, four are based on technologies established in the 1980s—the triumvirate of SQL, ACID transactions, and the relational data model. In reality, very little has changed in database management over the past 30 years.
For about 20 years now, I’ve kept watch for impending changes in database technology. At this point in 2020, I see three possible paradigm shifts ahead. First, quantum computing stands to turn almost every aspect of computer science on its head—but doesn’t look likely to change anything in the near future. Second, if anyone ever develops a persistent storage medium as cheap as disk and as fast as RAM, then every database architecture we have today would need to be revisited—but again, there’s no sign of anything imminent. Finally, blockchain represents a significant shift in the way we store data—and it is here now.
Blockchain addresses a question in database reliability that we’ve had no answer for since the emergence of digital storage almost 70 years ago: How do we know that what has been written to a digital device has not been maliciously overwritten?
Pre-digital storage technologies—paper, for instance—had many limitations, but it was hard to alter a record without leaving some sort of forensic evidence. For many years, paper ledgers represented a good enough form of proof for financial records because attempts to overwrite individual entries would leave some trace. With digital technologies, an overwritten record is indistinguishable from the original, and despite the many layers of security surrounding enterprise databases, data falsification remains a big problem.
Blockchain solves this problem through a clever combination of cryptography and game theory. Participants around the globe are motivated by a reward system to take part in the blockchain network. Entries written to the blockchain are cryptographically chained together in a manner that makes them virtually impossible to alter once written. In the 10-year history of the Bitcoin blockchain, no one has managed to falsify a Bitcoin entry, despite the billions of dollars that a successful falsification could unlock. Alas, blockchains are not database replacements, so they can't serve as the building blocks for applications. Amazon's QLDB (Quantum Ledger Database) is an example of a new breed of "hybrid" systems that attempt to combine the best features of blockchain and database.
In QLDB, data is represented as tables. Each row in a table is cryptographically linked to the previous row, just as each block in a blockchain is linked to the block before it. Unlike a blockchain, though, the ultimate provenance of any record is guaranteed by Amazon rather than by a distributed network. You can think of Amazon as the witness, or notary, for the transactions.
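The row-chaining idea can be sketched in a few lines of Python. This is an illustrative toy, not QLDB's actual data structures: the `LedgerTable` class and its hashing scheme are inventions for the example. It shows why overwriting a chained row is always detectable—every subsequent hash depends on it.

```python
import hashlib
import json

def row_hash(previous_hash: str, row: dict) -> str:
    """Hash a row together with the previous row's hash,
    so every row depends on the entire history before it."""
    payload = json.dumps(row, sort_keys=True)
    return hashlib.sha256((previous_hash + payload).encode()).hexdigest()

class LedgerTable:
    """An append-only table in which each row is chained to its predecessor."""

    def __init__(self):
        self.rows = []             # list of (hash, row) pairs
        self.last_hash = "0" * 64  # genesis value for the first row

    def append(self, row: dict) -> str:
        h = row_hash(self.last_hash, row)
        self.rows.append((h, row))
        self.last_hash = h
        return h

    def verify(self) -> bool:
        """Recompute the chain; any overwritten row breaks every hash after it."""
        prev = "0" * 64
        for h, row in self.rows:
            if row_hash(prev, row) != h:
                return False
            prev = h
        return True

ledger = LedgerTable()
ledger.append({"account": "alice", "balance": 100})
ledger.append({"account": "bob", "balance": 250})
assert ledger.verify()

# Tamper with the first row: verification now fails for the whole chain.
ledger.rows[0][1]["balance"] = 1_000_000
assert not ledger.verify()
```

A real ledger database adds much more—digests you can export and compare, a history function for every row, and Amazon's attestation of the chain—but the tamper-evidence property rests on exactly this kind of hash linking.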
Despite the lack of multi-party consensus, QLDB is gaining traction on the strength of Amazon's market presence, validating the general concept of a merger between blockchain and databases. Oracle has introduced a similar blockchain table type in its latest release, and several innovative startups offer blockchain database services, among them Fluree, BigchainDB, and ProvenDB (my company).
As a true believer in the power of blockchain to revolutionize database technology, I'm hardly unbiased. But with major players such as Amazon and Oracle deploying blockchain databases, and with increasing innovation from startups, it would seem that this is one database technology trend that is not far off.