Experts Offer 10 Big Data Predictions for 2025


As AI technologies and services proliferate across the market, the data that trains these systems is more important than ever.

The global big data and analytics industry is expected to experience significant growth in the next few years. It is predicted to grow at a CAGR of 14.9% between 2024 and 2032 and reach $1.088 trillion by 2032.
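As a rough sanity check on those figures, the compound-growth arithmetic can be run backwards to see what starting market size they imply. The sketch below assumes eight full compounding years between 2024 and 2032, which the article does not state explicitly; the function name is illustrative.

```python
def implied_base(final_value: float, cagr: float, years: int) -> float:
    """Back out the starting value implied by a final value and a compound annual growth rate."""
    return final_value / (1.0 + cagr) ** years


# Figures quoted in the article: 14.9% CAGR and $1.088 trillion in 2032.
# Assumption: growth compounds over 8 annual periods (2024 through 2032).
base_2024 = implied_base(1.088e12, 0.149, 2032 - 2024)
print(f"Implied 2024 market size: ${base_2024 / 1e9:.0f} billion")
```

Under that assumption, the forecast implies a 2024 market of roughly $358 billion, so the projection amounts to the market about tripling over eight years.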

This growth is mainly driven by organizations recognizing the transformative power of utilizing vast amounts of data to improve operations, make informed decisions, and gain a competitive edge.

The global data market has three major facets: data analytics, storage, and management. By 2030, these three markets alone will contribute to a global data market of over $1.75 trillion.

Here, data professionals share their big data and database/infrastructure predictions for 2025:

The catalog wars will get even more heated: The competition to dominate the data catalog space will become a high-stakes showdown. As hybrid and multi-cloud ecosystems grow, organizations will demand seamless interoperability, driving fierce innovation in governance, lineage, and user-defined functions (UDFs). Apache Iceberg will emerge as a key player, redefining standards for open table formats with its hybrid catalog capabilities. This race won’t just reshape data architecture—it will decide who controls the future of data portability.—Alex Merced, senior tech evangelist, Dremio

Observability-driven development: We need to shift observability left, the way we have with security and many other areas of IT, so that it's actually being done as part of the design of an application. Right now, engineers aren't thinking about the metrics, data, and observability that they need as they're building things—it's almost always retrofitted afterwards. We’ve done test-driven development; why not observability-driven development?—Jacob Rosenberg, senior leader for infrastructure and platform engineering, Chronosphere
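One way to picture the idea is a function whose metrics are declared at the moment it is written rather than bolted on later. The following is a minimal sketch using only the Python standard library; the `observed` decorator, the in-memory `METRICS` store, and the `apply_discount` function are all hypothetical names for illustration, not any vendor's API.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory metrics store; a real service would export these
# values to whatever observability backend it uses.
METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def observed(name):
    """Record call count, error count, and cumulative latency under `name`."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                METRICS[name]["errors"] += 1
                raise
            finally:
                METRICS[name]["calls"] += 1
                METRICS[name]["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


# The metric name is chosen while the function is being designed.
@observed("checkout.apply_discount")
def apply_discount(price, percent):
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)


apply_discount(200.0, 25)
print(dict(METRICS["checkout.apply_discount"]))
```

The design point is that the metric name and what it measures are decided alongside the business logic, in the same way a test-first workflow decides on assertions before the implementation.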

Iteration is the key to success: With global digital transformation spending set to reach $3.9 trillion by 2027, organizations will increasingly realize that large, one-off transformation projects are a recipe for failure. The pitfalls of attempting a "big bang" transformation, resulting in cost overruns and delays, are well-documented. In 2025, businesses need to embrace an iterative approach to digital transformation. Rather than overhauling everything at once, companies will prioritize breaking projects into smaller, more manageable stages, allowing for continuous improvement and adjustment. As part of this strategy, CTOs will also focus on mitigating risks from emerging technologies and AI, recognizing these as critical areas of concern. This shift will enable businesses to deliver value faster, address potential challenges early, and build a more adaptable, sustainable transformation strategy.—Karl Bagci, head of information security, Exclaimer

Data literacy becomes a mass movement—empowered by composable apps: In 2025, a mass data literacy movement will take hold, driven by composable apps that seamlessly integrate real-time analytics into everyday experiences. Consumers will actively engage with data on energy usage, shopping habits, and sustainability through intuitive, user-friendly platforms. Companies that simplify data reporting and empower users will thrive, while those relying on opaque, complex reports will face a consumer backlash demanding transparency.—Ariel Katz, CEO, Sisense

The increased adoption of streaming data platforms: In 2025, streaming data platforms will become indispensable for managing the exponential growth of observability and security data. Organizations will increasingly adopt streaming data platforms to process vast volumes of logs, metrics, and events in real time, enabling faster threat detection, anomaly resolution, and system optimization to meet the demands of ever-evolving infrastructure and cyber threats.—Bipin Singh, senior director of product marketing at Redpanda
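To make the notion of processing events as they arrive concrete, here is a small sketch of per-event anomaly detection over a metric stream. It is a generic rolling z-score check written with the Python standard library, not how any particular streaming platform works; the window size, threshold, and sample latencies are assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


def detect_anomalies(stream, window=5, threshold=3.0):
    """Yield values that deviate strongly from the preceding `window` values."""
    recent = deque(maxlen=window)
    for value in stream:
        # Only judge a value once a full window of history is available.
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                yield value
        recent.append(value)


# Hypothetical request latencies in milliseconds, consumed one event at a time.
latencies_ms = [101, 99, 100, 102, 98, 100, 450, 101, 99]
anomalies = list(detect_anomalies(latencies_ms))
print(anomalies)
```

Because the generator keeps only a fixed-size window in memory, it can flag a spike the moment the event arrives instead of waiting for a batch job, which is the property the prediction emphasizes.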

Contextualizing data will be the next frontier for data platforms: The evolution of the data platform is essential to the evolution of AI. Next year, we’ll see breakthroughs that help LLMs better understand the data they’re working with through the semantic layer. Today’s data platforms are largely missing the semantic layer of data, which is the understanding of what the data means. For instance, when you have financial data in a table, it’s typically the developer or the analyst who is tasked with understanding where that data came from, how it was calculated, and what it means—but this understanding should be baked directly into the data platforms. Having to rely on these additional stakeholders and build that understanding into every application you develop on top of your data is extremely burdensome. As a result, the semantic layer must be pushed down close to the data so that AI can understand the nature of it and do a much better job of analyzing it. Users don’t want to, and shouldn’t have to, reinvent the semantic concepts for each application. Those concepts must be pushed down to the data layer; that’s the next evolution.—Benoit Dageville, co-founder and president of products, Snowflake
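A minimal way to imagine semantics living next to the data is a per-column record of meaning, unit, origin, and calculation that any application or model can read. The sketch below is purely illustrative: the `ColumnSemantics` class, the `describe` helper, and the sample table are hypothetical and do not represent Snowflake's or any other platform's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSemantics:
    """Hypothetical metadata describing what a column means and where it came from."""
    name: str
    meaning: str
    unit: str
    source: str
    calculation: str


def describe(table, columns):
    """Render column semantics as plain text that could be handed to an LLM as context."""
    lines = [f"Table {table}:"]
    for col in columns:
        lines.append(
            f"- {col.name} ({col.unit}): {col.meaning}. "
            f"Source: {col.source}. Calculation: {col.calculation}."
        )
    return "\n".join(lines)


# Illustrative financial table; every value here is made up for the example.
REVENUE_COLUMNS = [
    ColumnSemantics(
        name="net_revenue",
        meaning="Revenue after refunds and discounts",
        unit="USD",
        source="billing system export",
        calculation="gross_revenue - refunds - discounts",
    ),
]

context = describe("quarterly_revenue", REVENUE_COLUMNS)
print(context)
```

Defining the description once, beside the table, means each application reuses the same definition rather than re-deriving what `net_revenue` means.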

Increased multi- and hybrid-cloud architectures: In the coming year we expect to see ever more attention on multi-cloud and hybrid-cloud architectures. For resilience against both region-wide and provider-wide cloud failures, organizations need to architect their "always on" applications to support instantaneous failover from one cloud to another, or from the cloud to on-premises infrastructure. Fully distributed databases do a great job of underpinning these architectures, often with minimal application code changes.—Phillip Merrick, co-founder and CEO of pgEdge

The emergence of next-generation archive storage: As data volumes grow, more efficient and cost-effective archival storage solutions have become critical. Flash and disk-based storage options, while fast, come with high costs when scaling to large capacities. This has led to a resurgence in tape storage as a viable solution for modern needs, and the introduction of new, emerging technologies like storage on glass. Companies will look to aggregate smaller units into larger configurations that combine the scalability of tape with the flexibility of cloud standards. The renewed interest in tape and other archival storage solutions will continue to expand as the demands of modern data management evolve.—Jason Lohrey, CEO of Arcitecta

The transformation of big data to “small data”: The past few years have seen a rise in data volumes, but 2025 will bring the focus from ‘big data’ to ‘small data.’ We’re already seeing this mindset shift with large language models giving way to small language models. Organizations are realizing they don’t need to bring all their data to solve a problem or complete an initiative—they need to bring the right data. The overwhelming abundance of data, often referred to as the ‘data swamp,’ has made it harder to extract meaningful insights. By focusing on more targeted, higher-quality data—or the ‘data pond’—organizations can ensure data trust and precision. This shift towards smaller, more relevant data will help speed up analysis timelines, get more people using data, and drive greater ROI from data investments.—Francois Ajenstat, chief product officer at Amplitude

Focus on simplified, IT generalist-friendly solutions: Many enterprises will invest in failover clustering solutions that are easier to manage, targeting the growing need for solutions that can be operated by IT generalists, not just clustering experts. With automation, simplified interfaces, and streamlined deployment, these clustering solutions will allow organizations to maintain high availability without the complexity typically associated with failover clustering—appealing particularly to small and medium-sized businesses looking to achieve enterprise-grade resilience.—Cassius Rhue, vice president, customer experience, SIOS Technology

