MODELOPS
A number of “Ops”-style methodologies have reshaped collaboration and automation inside and outside of data centers. An emerging one, ModelOps, promises to play a role in building and testing AI and machine learning models. “We see many companies struggling when it comes to executing AI and machine learning projects at scale,” said Hillary Ashton, chief product officer at Teradata. “It’s a major undertaking: success here hinges on organizations building and running more powerful analytics than ever before to deploy and manage their AI and ML projects at scale. The way to make that happen, and truly live up to the transformative potential of AI and ML, is to ensure ModelOps are prioritized with every AI initiative.”
Progress: ModelOps may not be a new concept within enterprises, but “its importance has grown significantly in the last few years,” said Ashton. “As companies accelerate investments in AI and ML initiatives, AI models will increase drastically from 10s to 10,000s and beyond. Just think of the number of models for every healthcare patient, or every sensor in a warehouse or factory. ModelOps enables efficient management, deployment, and monitoring of AI models throughout their lifecycle, ensuring their optimal performance and scalability.”
Potential roadblocks: Growing and scaling quickly “places often unreasonable and complex demands on the data science teams charged with democratizing analytics,” said Ashton. “Poor data quality, inadequate or limited access to relevant data can greatly impede the effective deployment and operation of AI models.”
Business benefits: “Every AI and ML model put into production is guaranteed to degrade over time because dynamic business environments change constantly,” said Ashton. “ModelOps supplies the framework for managing, deploying, monitoring, and maintaining the analytic model performance required to effectively operationalize AI and ML investments, generate reliable insights, and ultimately achieve success with AI-powered initiatives. These tools allow data scientists to identify and remediate data quality risks, automatically detect when models degrade, and schedule model retraining.”
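The idea of automatically detecting degradation and triggering retraining can be illustrated with a minimal Python sketch. It is not drawn from Teradata or any specific ModelOps product: the class name `DegradationMonitor`, the rolling-window approach, and the `tolerance` and `window` parameters are all assumptions chosen for illustration. The sketch compares a model's recent accuracy against the accuracy measured at deployment and flags the model once the gap exceeds a tolerance.

```python
from collections import deque


class DegradationMonitor:
    """Hypothetical monitor: flags a model whose rolling accuracy
    drops too far below the accuracy it had at deployment."""

    def __init__(self, baseline_accuracy, tolerance=0.05, window=100):
        self.baseline_accuracy = baseline_accuracy  # accuracy at deployment
        self.tolerance = tolerance                  # allowed drop before flagging
        self.outcomes = deque(maxlen=window)        # most recent hit/miss results

    def record(self, prediction, actual):
        """Store whether the latest prediction matched the observed outcome."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        """Share of correct predictions in the current window, or None if empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True only once the window is full and accuracy has fallen
        below baseline minus tolerance."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.rolling_accuracy() < self.baseline_accuracy - self.tolerance
```

In practice a scheduler would call `needs_retraining()` periodically and queue a retraining job when it returns true. Production tools typically also watch input data drift, not only accuracy, since true outcomes often arrive late.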
CLOUD DATA LAKES
With the huge influx of data in its varied forms, and flexible cloud services to accommodate it, a new breed of data environment has emerged. Cloud data lakes enable “a tremendous amount of innovation around creating a cloud-based single source of truth for inexpensively managing and delivering data,” said Ian Clayton, chief product officer for CDP platform at Redpoint Global. “This central data view is empowering all data teams—ranging from the small startups to the large enterprises—to democratize both their data and the data orchestration architecture that makes it actionable.”
Anil Dangol, data engineering and architecture leader at Launch Consulting, expects to see a cultural shift toward migrating to vendor-agnostic cloud data storage formats such as Apache Iceberg. “Apache Iceberg is an open source data lake table format that provides ACID transactions and scalable analytics for large-scale data platforms,” he noted.
Progress: There is widespread recognition that data in the cloud “finally provides performance, security, reliability, and self-service in an open source framework without the need for DBAs or for writing perfect SQL queries,” said Clayton. “With the promise of best-in-breed connections, the ease of parallel big data computing, and consumption-based pricing, the democratization of data lakes has quickly set a new standard for big data cloud computing.”
As part of this trend, open source table formats such as Apache Iceberg have experienced substantial growth in popularity, Clayton added.
Potential roadblocks: A cloud data lake does not supply business processes; these must be established and managed separately. “The enterprise must manage processes above the components within a cloud data lake,” said Clayton. “For instance, delivering a good customer experience requires careful data transformation and governance (data quality, identity resolution) to provide the privacy and personalization customers expect.”
In addition, moving to a format such as Apache Iceberg “poses a significant challenge for organizations due to the cost associated with refactoring their existing data platforms and data pipeline workloads,” Dangol cautioned. “As Apache Iceberg has gained traction only in recent years, there are concerns regarding security, compliance, regulation, data quality, ease of use, and various other factors that need to be addressed.”
Business benefits: “Cloud data lakes open up data democratization, which drives innovation,” said Clayton. “Agility breeds ambition—the business is not locked into an inflexible data model or outdated applications. Users are empowered to experiment, to test new ideas, to fail fast. The adoption of a cloud database-centered model provides continuous best-of-breed for businesses where the constant is perfect data—everything else is anchored on that.”