IBM has announced enhancements to Cloud Pak for Data that leverage the DataOps methodology to help clients get their data "business-ready" for AI.
According to IBM, the time required to collect and organize vast datasets into usable data for AI can strain resources and even stall AI projects. Improved data preparation, management and automation are pillars of DataOps (data operations), a methodology for automating and streamlining data flows across an enterprise. DataOps is also at the core of the "Organize" rung of IBM's AI Ladder strategy, in which clients transform datasets into data that is tuned and prepared for AI.
The key technical updates unveiled include:
Watson Knowledge Catalog (WKC), the company's data and AI catalog built into Cloud Pak for Data, IBM's multi-cloud data and analytics platform, has been updated with new quality and governance capabilities for policy enforcement. Already equipped with certain governance capabilities, WKC offers access to rich third-party data, such as socio-economic and household data, that can be combined with enterprise data in a single enterprise catalog.
StoredIQ InstaScan is an unstructured data management and privacy solution designed to identify risk hot spots in data sources and prioritize potential fixes and remediations, to help reduce the time needed for compliance data collection obligations and AI projects. In addition, users can conduct periodic risk assessment tests, helping to build confidence and trust in the data. The software also enables users to define policies for assessing cloud data sources, to help ensure that collection and management are accountable and more accurate.
InfoSphere DataStage, an ETL tool available in Cloud Pak for Data, has been updated with a new feature called change data capture, designed to continuously capture data changes and automatically transform and deliver them wherever clients need them. Other new capabilities in the platform identify assets from a data catalog and then automatically generate jobs, easing the user experience for data engineers. In addition, new collaboration features are engineered to make it easier for business users and data engineers to share data and insights.
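For readers unfamiliar with change data capture, the general idea can be sketched in a few lines of Python. This is a generic illustration only and does not reflect how InfoSphere DataStage implements the feature: it assumes two in-memory snapshots of a table keyed by primary key, and the names `capture_changes` and `apply_changes` are hypothetical. Production systems typically read a database transaction log rather than comparing snapshots.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Change = Tuple[str, Any, Row]  # (operation, primary key, row data)


def capture_changes(before: Dict[Any, Row], after: Dict[Any, Row]) -> List[Change]:
    """Compare two keyed snapshots and return insert/update/delete events."""
    changes: List[Change] = []
    for key, row in after.items():
        if key not in before:
            changes.append(("insert", key, row))
        elif before[key] != row:
            changes.append(("update", key, row))
    for key, row in before.items():
        if key not in after:
            changes.append(("delete", key, row))
    return changes


def apply_changes(
    target: Dict[Any, Row],
    changes: List[Change],
    transform: Callable[[Row], Row] = lambda row: dict(row),
) -> None:
    """Transform each captured change and deliver it to a target store in place."""
    for op, key, row in changes:
        if op == "delete":
            target.pop(key, None)
        else:
            # Inserts and updates both overwrite the target row.
            target[key] = transform(row)
```

Only the rows that changed are moved to the target, which is what lets a pipeline keep downstream copies current without reloading the full dataset.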