Teradata is extending its big data reach with three new capabilities. The company is introducing Teradata Loom 2.3, which helps a Hadoop data lake avoid becoming a data swamp by offering integrated metadata management, data lineage and data wrangling; Teradata Cloud for Hadoop, which allows customers to leverage Hadoop without the headaches of sizing, installing, and maintaining it; and a broad partnership with Cloudera that covers product integration and go-to-market strategy.
Teradata Loom 2.3
Loom is a platform for profiling and preparing data in Hadoop and for tracking its lineage. According to Chris Twogood, vice president of products and services at Teradata, while everyone has heard of the data lake architecture, the problem is that when data comes in from many different sources, it can be difficult to understand where it originated, how valid it is, and whether it is current. This was not a big issue in the early days of Hadoop because most Hadoop clusters had a single data source, such as web log data, but now that Hadoop vendors and customers are looking at the data lake approach, it is emerging as a problem, he said.
“We combine data wrangling, lineage and metadata all into a single product with a single UI with full search capabilities and an open REST API as the front end to the environment,” said Twogood. The product automatically discovers and introspects new data in a cluster, triggers external processing, and collects metadata about each job, such as lineage and statistics. It also polls YARN job history for lineage and offers advanced user interfaces for data exploration, profiling and preparation.
Loom will be available free and pre-installed with the Hortonworks Sandbox and with the Cloudera QuickStart VM for non-production use, so that users can test and learn it. In addition, Teradata Loom Community Edition will be available with support from the community, and a full-featured edition of Teradata Loom will be available on a subscription basis.
Teradata Cloud for Hadoop
Teradata is also announcing Teradata Cloud for Hadoop, which, Twogood says, offers a turnkey, full-service cloud environment with optional consulting help so that organizations that have not invested in data science expertise can get up and running quickly with a Hadoop deployment. Teradata Cloud for Hadoop offers flexible subscriptions, monitoring support from Teradata’s cloud operations team, and optional consulting through Teradata PS or the newly acquired Think Big organization.
Teradata Partnership with Cloudera
Finally, Teradata has announced a partnership with Cloudera. With this alliance, Teradata, which has had a partnership with Hadoop vendor Hortonworks, is now extending its Hadoop partnerships to include Cloudera, said Twogood. The Cloudera partnership covers broad technology integration, development roadmap alignment, and a unified go-to-market, sales and support offering. The two companies are also optimizing the integration between Teradata’s integrated data warehouse and Cloudera’s enterprise data hub offering to facilitate access to multiple data sources through the Teradata Unified Data Architecture.
“Increasingly, customers want a one-stop shop for their data analytics needs,” said Twogood. “This enables us to give customers the choice of Cloudera or Hortonworks. We integrate with both.”
For more information about Cloudera, go to www.cloudera.com; for information about Teradata, go to www.teradata.com.