Organizations have an opportunity “to quickly and dramatically improve their data quality through automation around data hygiene, standardization, matching, identity resolution, householding, and other profiling tasks,” said Nash. Automating data transformation so that it runs error-free frees data science teams to focus on value-added activities instead of manual cleansing, he pointed out.
In addition to automated data quality and governance, IT and data managers need to provide real-time access and transparent visibility of data quality to others within the organization who rely on the data to do their jobs, said Nash. “Going forward, it will be essential for IT to find a way to invest in data quality and make this data accessible and transparent to more roles throughout the business—including marketers.”
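To make the tasks Nash lists more concrete, here is a minimal Python sketch of standardization, matching, and householding on a few customer records. It is an illustration only, not a method described by Nash: the field names, the sample records, and the choice of email as the matching key and address as the household key are all assumptions.

```python
import re
from collections import defaultdict


def standardize(record):
    """Return a cleaned copy of a record (hypothetical fields: name, email, address)."""
    email = record["email"].strip().lower()
    # Collapse repeated whitespace, then capitalize each word of the name.
    name = " ".join(record["name"].split()).title()
    # Lowercase the address, drop punctuation, and collapse whitespace.
    address = re.sub(r"[^a-z0-9 ]", "", record["address"].lower())
    address = " ".join(address.split())
    return {"name": name, "email": email, "address": address}


def match_by_email(records):
    """Group standardized records that share an email (assumed matching key)."""
    groups = defaultdict(list)
    for rec in map(standardize, records):
        groups[rec["email"]].append(rec)
    return dict(groups)


def households(records):
    """Group distinct emails by standardized address (assumed household key)."""
    groups = defaultdict(set)
    for rec in map(standardize, records):
        groups[rec["address"]].add(rec["email"])
    return dict(groups)


# Made-up sample data for illustration.
records = [
    {"name": "  ann  lee ", "email": "Ann.Lee@Example.com ", "address": "12 Oak St."},
    {"name": "Ann Lee", "email": "ann.lee@example.com", "address": "12 oak st"},
    {"name": "Bo Lee", "email": "bo.lee@example.com", "address": "12 Oak St"},
]

matches = match_by_email(records)
homes = households(records)
```

In this sketch, the first two records collapse into one matched customer once they are standardized, and all three fall into a single household. Production identity resolution typically relies on fuzzier matching than an exact key comparison.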
ROBOTS TAKE THE REINS
The coming year will see a greater embrace of robotic process automation (RPA) to handle day-to-day data management functions. “The increasing adoption of new digital systems will need data parallelization and consistency between data transfer systems,” said Borya Shakhnovich, CEO of airSlate. “Digital systems will not be able to be used to their full effectiveness without RPA added into the mix,” even if only on a temporary basis, he added.
As demand for automation in digital transformation grows, so will the need for RPA, Shakhnovich predicted. “Digital transformation has been a mainstay in the marketplace for a while now, and the progress and escalation due to the pandemic and our remote world have many marketplaces providing automation. In the wake of successful digital transformation, we’ll see automation begin to touch all facets of businesses for automation in fulfillment, accounting, advertising, and marketing. This concept will percolate for the SMBs with customer relationship automation, which I anticipate is where we’ll see the most growth in the next decade.”
TAPPING INTO DARK DATA
“Dark data” has vexed data managers and professionals for some time, and it makes up a huge portion of organizations’ data assets. It presents a risk analogous to the unseen bulk of an iceberg. However, the greatest opportunity for enterprises in the year ahead may be finally surfacing and understanding the dark data that affects their businesses. “Dark data is a side effect of the ongoing transition to hybrid, distributed, and multi-cloud environments,” explained Rajesh Raheja, senior vice president and head of engineering with Boomi. “It’s the information that different business units within an organization collect, process, and store, which is unintentionally invisible to other areas of the business. Customer data captured by various business units, but not kept in sync with a central system, can lead to inconsistencies across the organization.” There could even be a new category of data that the rest of the organization is unaware of, created by something such as a new lead capture system, Raheja noted.
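The inconsistency Raheja describes, where customer data held by a business unit drifts out of sync with a central system, can be sketched as a simple reconciliation check. This is an assumption-laden illustration, not a description of any vendor's product: the function name, the record layout (customer ID mapped to a dictionary of fields), and the sample data are all hypothetical.

```python
def find_out_of_sync(central, unit):
    """Compare a business unit's customer records with the central system.

    Both arguments map a customer ID to a dict of fields (hypothetical layout).
    Returns the IDs the central system has never seen, plus the fields whose
    values disagree for customers present in both.
    """
    unknown_to_central = sorted(set(unit) - set(central))
    mismatched = {}
    for cid in sorted(set(unit) & set(central)):
        # Record (central value, unit value) for every field that differs.
        diffs = {
            field: (central[cid].get(field), value)
            for field, value in unit[cid].items()
            if central[cid].get(field) != value
        }
        if diffs:
            mismatched[cid] = diffs
    return unknown_to_central, mismatched


# Made-up sample data: a central system and a new lead capture system.
central = {
    "c1": {"email": "ann@example.com", "phone": "555-0100"},
    "c2": {"email": "bo@example.com", "phone": "555-0101"},
}
lead_capture = {
    "c2": {"email": "bo@new-example.com", "phone": "555-0101"},
    "c3": {"email": "cy@example.com", "phone": "555-0102"},
}

unknown, mismatched = find_out_of_sync(central, lead_capture)
```

Here customer `c3` exists only in the lead capture system, and `c2` carries a different email in each system. A check like this only works once the dark data source has been identified, which is the harder problem the article describes.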
The vast majority of companies “don’t know where all private customer data is stored within their systems and applications,” said Prashant Sharma, a data scientist and co-founder and CTO of Secuvy. “And it gets worse: Up to 85% of data collected today is unstructured with data sprawls spanning across SaaS apps, file shares, and traditional databases. This makes data governance and privacy compliance quite challenging, cumbersome, and expensive.”
The problem has been an enterprise’s or governance team’s inability to recognize dark data, Raheja said. “Dark data will impact the future of data environments no matter what road enterprises take. Some companies will overlook it and face compliance breaches.”