A constant influx of new technologies and strategies for improving enterprise functions and growth can leave companies overwhelmed by choice: Which program is right for my needs? How do I get the most out of this application? Am I using the wrong platform for my workload?
DBTA hosted a webinar sponsored by InfluxData, titled “How to Choose the Right Database for Your Workloads,” that tackled the indecision enterprises struggle with when choosing the right platform for their individual needs. Charles Mahler, technical writer at InfluxData, walked listeners through the use cases, downsides, and benefits of particular database types.
Mahler made it clear that, at the end of the day, all databases attempt to accomplish the same task: storing data and making that data available for query later on. However, different types of databases suit different management requirements, base technologies, skill sets, and business stages; to get the most out of a database platform, enterprises must reflect deeply on their own needs and choose one that aligns with their current processes. To support that decision, Mahler explained the characteristics of each database type that can make or break an enterprise, depending on how well the choice fits its business functions.
Mahler broke down each database type during the webinar; his points are summarized below:
Relational Databases are considered the “old school” of database platforms; they follow the relational model and store data in tabular form. Data is stored on disk as rows, and SQL is used for queries. Examples include PostgreSQL and MySQL.
- Pros: Solid performance across a broad range of applications; a strong ecosystem of people who know how to use them; battle-tested over many years of use; maintain data consistency
- Cons: Challenging to scale horizontally; B-tree indexes can struggle with high-write-volume workloads; the defined schema means you have to plan everything ahead, and any mistakes can cause issues later
- Use cases: Suitable for most general applications; any application where ACID guarantees are required
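The schema and ACID points above can be illustrated with a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for the servers named in the article (an assumption made so the example runs without setup); the `accounts` table and the `transfer` helper are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema is declared up front, including a rule the database enforces.
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (owner, balance) VALUES (?, ?)",
    [("alice", 100), ("bob", 50)],
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was undone too.
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT owner, balance FROM accounts ORDER BY owner"))
```

The second transfer fails partway through, and the rollback leaves both rows as they were after the first transfer, which is the kind of consistency guarantee listed under the pros.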
Key-Value Databases are the simplest type of NoSQL database, effectively acting as a hash table. Keys point to unstructured blobs of data that carry no metadata; because of that, the stored data can be of any type. Examples include DynamoDB, Redis, and Riak.
- Pros: Great read and write performance with low latency; a flexible schema; horizontally scalable
- Cons: Weak consistency; minimal querying capabilities due to the lack of metadata
- Use cases: Personalized recommendations; session management; real-time features that depend on fast data lookup
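To make the hash-table idea concrete, here is a toy in-memory sketch, not the API of any product named above. The `KeyValueStore` class and the `session:` key naming are hypothetical, and JSON is just one possible encoding for the opaque values.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store: a thin wrapper over a hash table."""

    def __init__(self):
        self._data = {}  # Python dicts are hash tables

    def put(self, key, blob):
        self._data[key] = blob  # the store never inspects the value

    def get(self, key):
        return self._data.get(key)  # None if the key is absent

    def delete(self, key):
        self._data.pop(key, None)

    def values(self):
        return list(self._data.values())


store = KeyValueStore()
store.put("session:a1", json.dumps({"user": "alice", "cart": [3, 7]}).encode())
store.put("session:b2", json.dumps({"user": "bob", "cart": []}).encode())

# Fast path: direct lookup by key.
session = json.loads(store.get("session:a1"))

# Slow path: the store has no metadata, so finding sessions by user means
# decoding every blob in application code.
alice_sessions = [
    json.loads(blob)
    for blob in store.values()
    if json.loads(blob)["user"] == "alice"
]
```

The contrast between the two lookups mirrors the trade-off in the list: key access is cheap, while anything resembling a query has to be built by the application.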
Document Databases are a conceptual extension of key-value databases with added support for metadata and semi-structured data. This metadata allows for efficient querying and insight into the data. Additionally, documents hold all relevant data rather than relying on joins. Examples include MongoDB and CouchDB.
- Pros: Flexible schema; great read and write performance; faster data retrieval; documents can map better to application needs; horizontally scalable
- Cons: Weaker consistency (though this varies case by case depending on the specific database platform); difficulty keeping related data in sync across documents
- Use cases: Suitable for most general applications (like relational but with the advantage of scale); applications where rapid iteration is beneficial
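A small sketch can show what "documents hold all relevant data" and "flexible schema" look like in practice. This is plain Python over a list of dictionaries, not the query API of MongoDB or CouchDB; the `find` and `matches` helpers and the dotted-path query format are assumptions made for illustration.

```python
def matches(doc, query):
    """True if every dotted-path field in query equals the value in doc."""
    for path, expected in query.items():
        value = doc
        for part in path.split("."):
            if not isinstance(value, dict) or part not in value:
                return False
            value = value[part]
        if value != expected:
            return False
    return True


def find(collection, query):
    """Return all documents in the collection that match the query."""
    return [doc for doc in collection if matches(doc, query)]


# Each order embeds its customer and line items, so no join is needed.
orders = [
    {"_id": 1, "customer": {"name": "alice", "city": "Oslo"},
     "items": [{"sku": "A1", "qty": 2}], "status": "shipped"},
    {"_id": 2, "customer": {"name": "bob", "city": "Lima"},
     "items": [{"sku": "B4", "qty": 1}], "status": "pending",
     "gift_note": "Happy birthday"},  # extra field: no schema change needed
]

oslo_orders = find(orders, {"customer.city": "Oslo"})
pending = find(orders, {"status": "pending"})
```

Because field names travel with each document, the store can filter on them, which is the querying ability that plain key-value blobs lack. The flip side is visible too: if a customer's city were copied into many orders, each copy would need updating separately.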
Graph Databases are used for storing and analyzing relationships between connected data sets, using specialized query languages. Examples include Neo4j and Dgraph.
- Pros: Fast queries for graph analysis; developer productivity improved with query language and built-in algorithms; flexible schema; easy to establish new relationships between data points
- Cons: Not ideal if application data isn’t highly connected since this is a specialized database
- Use cases: Fraud detection; social networks; recommendation features
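The recommendation use case can be sketched as a graph traversal. The example below uses an in-memory adjacency map and breadth-first search in plain Python rather than a graph query language; the `follows` data and the `within_hops` and `recommend` helpers are hypothetical.

```python
from collections import deque

# Directed "follows" relationships stored as an adjacency map.
follows = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"dave", "erin"},
    "dave": set(),
    "erin": {"alice"},
}


def within_hops(graph, start, max_hops):
    """Breadth-first search returning {node: hop distance} up to max_hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def recommend(graph, user):
    """Suggest accounts exactly two hops away, i.e. followed by people the user follows."""
    dist = within_hops(graph, user, 2)
    return sorted(name for name, hops in dist.items() if hops == 2)


suggestions = recommend(follows, "alice")
```

A graph database runs this kind of traversal natively and usually expresses it in a line or two of its query language, which is where the productivity benefit listed above comes from.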
Time Series Databases, the specialty of InfluxData, are designed for time series data, as the name suggests. They are optimized for high write throughput and for queries based on time ranges. These databases also have built-in features for data lifecycle management. Examples include InfluxDB, TimescaleDB, and QuestDB.
- Pros: Very fast data ingest and query performance for both broad and narrow queries; developer productivity improved through built-in features
- Cons: Update and deletion performance is low; however, you typically wouldn't want to update the data, because the database is built around tracking history
- Use cases: Monitoring for software visibility; IoT applications; financial data; analytics and event tracking
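The three traits described above (append-heavy writes, time-range queries, and lifecycle management) can be sketched with a toy in-memory series. This is not how InfluxDB, TimescaleDB, or QuestDB are implemented; the `TimeSeries` class, its method names, and the integer timestamps are assumptions made for illustration.

```python
import bisect


class TimeSeries:
    """Toy append-mostly series kept sorted by timestamp."""

    def __init__(self):
        self._times = []
        self._values = []

    def write(self, ts, value):
        # In-order writes take the cheap path: append at the end.
        if not self._times or ts >= self._times[-1]:
            self._times.append(ts)
            self._values.append(value)
        else:
            # Late-arriving points are inserted at their sorted position.
            i = bisect.bisect_right(self._times, ts)
            self._times.insert(i, ts)
            self._values.insert(i, value)

    def query(self, start, end):
        """Return (ts, value) pairs with start <= ts < end."""
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_left(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))

    def enforce_retention(self, now, max_age):
        """Drop points with ts < now - max_age; return how many were dropped."""
        cutoff = bisect.bisect_left(self._times, now - max_age)
        del self._times[:cutoff]
        del self._values[:cutoff]
        return cutoff


cpu = TimeSeries()
for ts, value in [(0, 0.20), (10, 0.35), (20, 0.90), (30, 0.40), (15, 0.50)]:
    cpu.write(ts, value)

window = cpu.query(10, 30)
dropped = cpu.enforce_retention(now=30, max_age=15)
remaining = cpu.query(0, 100)
```

Keeping the data ordered by time makes a range query a pair of binary searches plus a slice, and retention becomes a cheap trim from the front. Updating an arbitrary historical point has no such shortcut, which matches the con listed above.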