When it comes to cloud-based database management, there are really only two players: Amazon, the value leader, and Microsoft, the innovation leader. Amazon has carved out a niche as the value leader in cloud-based database management, supporting not only its own implementations of various database platforms such as MySQL and Hadoop, but also supporting premier commercial DBMSs such as Microsoft SQL Server and Oracle. Meanwhile, Microsoft has, in my mind, carved out a very strong niche as the innovation leader by offering powerful technologies to integrate on-premises databases with various Azure services.
Data Management Using Hybrid Cloud- and Earth-Based SQL Server Databases
I described some of the innovations Microsoft is making in the cloud in previous articles, such as one last January in which I described a raft of new cloud offerings in SQL Server 2014 and one last April in which I wrote about the Hadoop-powered offerings of SQL Server HDinsight. Microsoft continues to do cool things in this space, including a feature announced last November at the PASS Summit 2014 conference called StretchDB.
Using StretchDB, an enterprise can “stretch” an on-premises database into the cloud, such that “hot,” heavily used data is stored in the on-premises instance of SQL Server, while “cold” and infrequently used data is transparently stored in Azure. A stretched database automatically and transparently manages synchronization and movement of aging data from on-premises to the cloud.
A stretched database automatically and transparently manages synchronization and movement of aging data from on premises to the cloud.
“Transparent” is a key word for stretch databases. Developers don’t need to do anything special when writing applications against stretched databases. Similarly, DBAs do not have to do a lot of special operations for stretch databases beyond defining the criteria that delineates cold data from hot data, using ALTER DATABASE and ALTER TABLE statements. Once configured, StretchDB then automatically manages access control and the movement of data from on-premises SQL Server into Azure. In addition, where desired, DBAs still have a lot of granular control over operations that might require a bit of extra thought on a StretchDB, such as defragmentation, refreshing index statistics, and backups.
The Benefits of Stretching Databases
Let me re-emphasize two main points. First, DBAs maintain full control from their on-premises SQL Server up into Azure. There is no need to manage two different data stores. Second, most applications will continue to work as is. There will likely be some gaps in early iterations, but it will be very few.
On the flip side, keep in mind that the benefits are compelling for certain scenarios:
Cost Savings:
Stretched databases can shunt infrequently used data into the cloud, where storage and processing power is cheap and elastic. In addition, if the stretched data is only used a few times in a given year, users are not billed for times when the system is quiescent.
Performance Improvements:
Since your active data is stored on-premises, you can be sure that the most commonly executed batches are run on your best hardware. Say you had a 256GB flash drive on your SQL Server for a 10TB database. An all on premises instance of SQL Server would build execution plans for the entire, online database. With StretchDB, SQL Server can see that 9TB of the database is in Azure and that none of the batches need to access the stretched data. Consequently, execution plans are optimized for the operations that make up 99% of the database workload.
Scalability:
StretchDB shares some common traits with partitioning, which enables large databases because the DBA can perform detailed segregation of I/O. StretchDB enables the same sort of seamless, scalable database growth, except the “partition” is now stored in Azure. (There are some breakdowns in the specifics of the analogy. But in general, the idea works.) In a nutshell, you can build enormous databases without adding more hardware.
While the product was announced in November 2014, Microsoft is not making promises about a specific delivery date. Knowing Microsoft’s product delivery patterns as I do, I can say with some authority that you’ll have the opportunity to participate in at least a couple of CTPs (Community Technology Previews), both private and public, and provide feedback to Microsoft during that time.
Do you have scenarios that might benefit from StretchDB? If so, I want to hear about them!