Columnar databases store data by column rather than by row. This layout improves compression and speeds up scans that touch only a few columns, which is why it is used for analytics workloads. Examples include ClickHouse, Vertica, and Redshift.
- Pros: More efficient for analytical queries; better data compression, since the values in a column all share one data type; amenable to SIMD processing
- Cons: Writes can be less efficient (the penalty is modest, but writing in larger batches is recommended over writing one data point at a time); updates and deletes carry a performance cost; queries slow down when they need to read many columns at once
- Use cases: Analytics; data warehousing; observability
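To make the layout difference concrete, here is a minimal sketch in plain Python. It is not how any particular columnar engine is implemented. The sample records and the helper names (`average`, `run_length_encode`) are invented for illustration. It shows two things from the list above: an aggregate only has to touch one column, and a column of repeated values of the same type compresses well (run-length encoding is used here as one simple example of a compression scheme).

```python
from itertools import groupby

# Hypothetical sample data in a row-oriented layout: one dict per record.
rows = [
    {"host": "web-1", "status": 200, "latency_ms": 12},
    {"host": "web-1", "status": 200, "latency_ms": 15},
    {"host": "web-2", "status": 500, "latency_ms": 230},
]

# The same data in a column-oriented layout: one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def average(values):
    """Aggregate over a single column without reading the others."""
    return sum(values) / len(values)


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# An analytic query such as "average latency" reads one contiguous list.
avg_latency = average(columns["latency_ms"])

# Repeated values within a column compress into a shorter representation.
encoded_status = run_length_encode(columns["status"])
```

The same sketch also hints at the write-side cost: appending one record means touching every column list, which is part of why batching writes is recommended.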
In-memory databases store data entirely in RAM, so their data structures can be optimized without the tradeoffs that disk storage imposes. Examples include Redis and Memcached.
- Pros: High performance, because data is held exclusively in memory and no disk access is needed; low latency; can store versatile data types
- Cons: RAM is more expensive than disk; large data sets will need to be scaled horizontally; a secondary database is typically needed
- Use cases: Caching; real-time applications; pub/sub
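As an illustration of the caching use case, here is a minimal sketch of an in-process cache with expiry sitting in front of a slower secondary database. It uses a plain Python dict rather than a real Redis or Memcached client, and `TTLCache`, `get_user`, and `load_from_database` are hypothetical names chosen for this example.

```python
import time


class TTLCache:
    """A tiny in-memory key-value store whose entries expire after a TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


def get_user(user_id, cache, load_from_database):
    """Check the cache first and fall back to the secondary database."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    user = load_from_database(user_id)  # slow path, e.g. a disk-backed database
    cache.set(user_id, user)
    return user
```

The fallback to `load_from_database` reflects the point in the cons list: the in-memory store holds a fast copy, while another database usually remains the source of truth.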
Search databases are designed for efficiently storing and querying text-based documents. These databases can handle structured or unstructured data. Examples include Elasticsearch, Solr, and MeiliSearch.
- Pros: Improved developer productivity through built-in algorithms and query languages; increased performance; horizontally scalable
- Cons: Sustaining higher write volumes may require trading off indexing or replication
- Use cases: Full-text search; log analysis; autocompletion; analytics
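Full-text search is commonly built on an inverted index, which maps each term to the documents that contain it. The sketch below is a deliberately simplified version of that idea, assuming lowercase alphanumeric tokenization and AND semantics for multi-word queries. The function names and sample documents are invented for illustration, and real search engines add ranking, stemming, and much more.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical log lines keyed by document id.
documents = {
    1: "Disk latency spiked on web-1",
    2: "Latency back to normal",
    3: "Disk replaced on web-2",
}
index = build_index(documents)
matches = search(index, "disk latency")
```

Every write has to update the index for each token in the document, which gives some intuition for why heavy write volumes put pressure on indexing.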
Vector databases are built for storing and searching vector embeddings of unstructured data, allowing for the search of images, videos, text, audio, etc. Examples include Milvus and Pinecone.
- Pros: Efficient vector search allowing for search of different types of data; horizontally scalable; hybrid storage for working with both RAM and disk
- Cons: Highly specialized, so it adds the overhead of running and maintaining another database
- Use cases: Duplicate removal; anomaly detection; ranking and recommendation engines; similarity search for unstructured data; semantic search
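Similarity search over embeddings can be sketched as a brute-force scan that scores every stored vector against the query. The example below uses cosine similarity as the metric, which is one common choice. The three-dimensional vectors and file names are made up for illustration (real embeddings come from a model and have far more dimensions), and production systems typically rely on approximate nearest-neighbor indexes rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, embeddings, k):
    """Return the k keys whose vectors are most similar to the query."""
    scored = [
        (cosine_similarity(query, vector), key)
        for key, vector in embeddings.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Hypothetical embeddings for three pieces of unstructured data.
embeddings = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "kitten.jpg": [0.8, 0.2, 0.1],
    "invoice.pdf": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of the search query.
query = [1.0, 0.1, 0.0]
nearest = top_k(query, embeddings, k=2)
```

Because the comparison works on vectors alone, the same search applies whether the original item was an image, a video, text, or audio.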
NewSQL databases attempt to get the best of both worlds by combining features of relational and non-relational databases. They aim to provide strong consistency while maintaining high availability. Examples include CockroachDB, Spanner, and TiDB.
- Pros: SQL support; horizontal scalability; typically cloud-native architecture friendly
- Cons: High complexity, even if hidden; potential latency; some SQL limitations; relatively new technology
- Use cases: Suitable for most general applications; some of these databases can also support analytics workloads
You can view an archived version of this webinar here.