Big Data

The well-known three Vs of Big Data (Volume, Variety, and Velocity) are placing increasing pressure on organizations that need to manage this data deluge and extract value from it for Predictive Analytics and Decision-Making. Big Data technologies, services, and tools such as Hadoop, MapReduce, Hive, and NoSQL/NewSQL databases, along with Data Integration techniques, In-Memory approaches, and Cloud technologies, have emerged to help meet the challenges posed by the flood of Web, Social Media, Internet of Things (IoT), and machine-to-machine (M2M) data flowing into organizations.



Big Data Articles

Databases are hampered by a reliance on disk-based storage, a technology that has been in place for more than two decades. Even with the addition of memory caches and solid state drives, the model of relying on repeated access to the permanent information storage devices is still a bottleneck in capitalizing on today's "big data," according to a new survey of 323 data managers and professionals who are part of the IOUG. Nearly 75% of respondents believe that in-memory technology is important to enabling their organization to remain competitive in the future. Yet, almost as many also indicate they lack the in-memory skills to deliver even current business requirements. The research results are detailed in a new report, titled "Accelerating Enterprise Insights: 2013 IOUG In-Memory Strategies Survey."

Posted January 24, 2013

EMC Greenplum has qualified Attunity RepliWeb for Enterprise File Replication (EFR) and Attunity Managed File Transfer (MFT) with EMC Greenplum Hadoop (HD). Attunity RepliWeb for EFR and Attunity MFT are high-performance, easy-to-use solutions for automating, managing and accelerating the process of making data available for big data analytics with Hadoop. According to Attunity, the products, launched earlier this year, are the first and only solutions currently qualified by EMC for Greenplum HD. "Greenplum has come into the marketplace by storm and has had a strong vision of being data-independent or data-agnostic. They want to make sure that their analytic platform can handle both structured and unstructured data and this aligns very well with Attunity's mission statement of any data, any time, anywhere," Matt Benati, vice president of Global Marketing at Attunity, tells DBTA.

Posted January 24, 2013

Enterprise NoSQL database provider MarkLogic Corporation has partnered with business intelligence vendor Tableau Software to offer analytics and visualization over unstructured big data. The partnership allows business users to leverage Tableau's business intelligence and reporting solutions to access disparate sets of structured and unstructured data housed in a MarkLogic NoSQL database. "Not only can you build rich, sophisticated applications, but you can also make use of that data where it is, and have business users connect to that data, visualize it, and do analytics over it, without involving the development center," Stephen Buxton, MarkLogic's director of product management, tells DBTA.

Posted January 24, 2013

SAP announced a new option for SAP Business Suite customers — SAP Business Suite powered by SAP HANA — providing an integrated family of business applications that captures and analyzes transactional data in real time on a single in-memory platform. With Business Suite on HANA, "SAP has reinvented the software that reinvented businesses," stated Rob Enslin, member of the Global Executive Board and SAP head of sales, as part of his presentation during the company's recent launch event.

Posted January 10, 2013

For many years, enterprise data center managers have struggled to implement disaster recovery strategies that meet their RTO/RPOs and business continuity objectives while staying within their budget. While the challenges of moving, managing, and storing massive data volumes for effective disaster protection have not changed, exponential data growth and the advent of big data technologies have made disaster recovery protection more difficult than ever before.

Posted December 19, 2012

Despite the rise of big data, data warehousing is far from dead. While traditional, static data warehouses may have indeed seen their day, an agile data warehouse — one that can map to the needs of the business and change as the business changes — is quickly on the rise. Many of the conversations today around big data revolve around volume and while that is certainly valid, the issue is also about understanding data in context to make valuable business decisions. Do you really understand why a consumer takes action to buy? How do their purchases relate? When will they do it again? Big data is limited when it comes to answering these questions. An agile approach — one that gives even big data a life beyond its initial purpose — is the value data warehousing can bring to bear and is critical to long-term business success.

Posted December 19, 2012

For years, data warehouses and extract, transform and load (ETL) have been the primary methods of accessing and archiving multiple data sources across enterprises. Now, an emerging approach - data virtualization - promises to advance the concept of the federated data warehouse to deliver more timely and easier-to-access enterprise data. These are some of the observations made at Composite Software's third Annual Data Virtualization Day, held in New York City. This year's gathering was the largest ever, with nearly 250 customers and practitioners in attendance, Composite reports.

Posted November 13, 2012

Attunity Ltd., a provider of information availability software solutions, has released Attunity Managed File Transfer (MFT) for Hadoop. The new enterprise data transfer solution is designed to accelerate big data collection processes and integrate them seamlessly into and out of Hadoop. MFT enables organizations to collect and transfer big data in both the cloud and enterprise data centers for strategic initiatives including log and machine-data analytics, business intelligence, and data archiving. "Attunity MFT for Hadoop, the first of several Hadoop solutions that Attunity will unveil, is designed to deliver on the great promise of Hadoop by helping organizations achieve faster time-to-value for big data analytics projects," says Matt Benati, VP Global Marketing at Attunity.

Posted November 09, 2012

The opportunities and challenges presented by big data are addressed in a new report summarizing the results of a survey of data managers and professionals who are part of the Independent Oracle Users Group. The survey was underwritten by Oracle Corporation and conducted by Unisphere Research, a division of Information Today, Inc. Key highlights from the survey include the finding that more than one out of 10 data managers now have in excess of a petabyte of data within their organizations, and a majority of respondents report their levels of unstructured data are growing.

Posted October 24, 2012

Business analytics vendor OpTier has released OpTier APM 5.0, a solution that provides real-time business transaction analytics and deep diagnostics, enabling improved visibility and gains in productivity. OpTier also released a new Big Data Analytics solution that takes advantage of OpTier's business transaction-based platform, providing real-time big data analytics already in context and reducing time and cost.

Posted October 01, 2012

Data management vendor Terracotta, Inc. has released BigMemory Go, the latest innovation in the BigMemory line that allows customers to put as much data in memory as desired to speed application performance at big data scale. The product is being offered via a free 32GB per instance production license that can be deployed on as many servers as desired.

Posted September 25, 2012

The first computer program I ever wrote (in 1979, if you must know) was in the statistical package SPSS (Statistical Package for the Social Sciences), and the second computer platform I used was SAS (Statistical Analysis System). Both of these systems are still around today—SPSS was acquired by IBM as part of its BI portfolio, and SAS is now the world's largest privately held software company. The longevity of these platforms—they have essentially outlived almost all contemporary software packages—speaks to the perennial importance of data analysis to computing.

Posted September 19, 2012

Oracle announced enhanced support for the R statistical programming language, including new platform ports of R for Oracle Solaris and AIX in addition to Linux and Windows, connectivity to Oracle TimesTen In-Memory Database in addition to Oracle Database, and integration of hardware-specific Math libraries for faster performance. "Big data analytics is a top priority for our customers, and the R statistical programming language is a key tool for performing these analytics," says Andrew Mendelsohn, senior vice president, Oracle Database Server Technologies.

Posted September 12, 2012

Business intelligence software vendor Actuate has partnered with VoltDB, provider of ultra-high-throughput relational database systems, to offer a solution that will allow ActuateOne and VoltDB customers to process their big data more quickly and effectively, resulting in improved insights. The VoltDB and ActuateOne alliance is expected to substantially reduce the time from big data access to operational insights, providing customers with a competitive advantage in the market and improving the bottom line.

Posted September 11, 2012

Big data and cloud analytics vendor Kognitio has partnered with Xtremeinsights, a provider of solutions for leveraging Hadoop in existing data management systems. Together, the partners aim to deliver software and integration technologies to businesses that want to leverage the Hadoop platform and gain actionable insights from their big data. Using its in-memory analytical platform, Kognitio speeds up the analysis of data from Hadoop clusters, enabling ad hoc, real-time analytics at a significantly lower cost. "Xtremeinsights can build the underlying infrastructure so that your business users can do ad hoc analysis on ridiculous amounts of data and get answers in real-time," Michael Hiskey, Kognitio's vice president of marketing and business development, tells DBTA.

Posted August 23, 2012

Pentaho's Business Analytics 4.5 is now certified on Cloudera's latest releases, Cloudera Enterprise 4.0 and CDH4. Pentaho also announced that its visual design studio capabilities have been extended to the Sqoop and Oozie components of Hadoop. "Hadoop is a very broad ecosystem. It is not a single project," Ian Fyfe, chief technology evangelist at Pentaho, tells DBTA. "Sqoop and Oozie are shipped as part of Cloudera's distribution so that is an important part of our support for Cloudera as well - providing that visual support which nobody else in the market does today."

Posted August 23, 2012

ParAccel, an enterprise analytics platform provider, has announced the general availability of ParAccel 4.0. "If you consider how analytics have traditionally been run - offline, static, standalone - there's a large gap that needs to be filled if organizations want to meet 21st century demands," says Chuck Berger, CEO at ParAccel. The new release builds on ParAccel's expertise to help organizations achieve high-performance, interactive big data analytics with improved speed and reliability.

Posted August 14, 2012

Cloud operating system provider Nimbula has unveiled its elastic Hadoop solution with MapR Technologies, allowing users to run their Hadoop clusters on private clouds. The elasticity and multi-tenancy of Nimbula Director, paired with the dependability and security of the MapR Hadoop Distribution, allow for a fully functional and highly available Hadoop cluster on a single pool of infrastructure. "What we're trying to achieve is have the power of Hadoop on top of a private cloud and bringing the best of each world to the customer," Reza Malekzadeh, Nimbula's vice president of marketing & sales, tells 5 Minute Briefing. Customers can run Hadoop and non-Hadoop workloads on the same shared infrastructure.

Posted August 14, 2012

Syncsort, a global leader in high-performance data integration solutions, has certified its DMExpress data integration software for high-performance loading of Greenplum Database. Syncsort has also joined the Greenplum Catalyst Developer Program. Syncsort DMExpress software delivers extensive connectivity that makes it easy to extract and transform data from nearly any source, and rapidly load it into the massively parallel processing (MPP) Greenplum Database without the need for manual tuning or custom coding. "IT organizations of all sizes are struggling to keep pace with the spiraling infrastructure demands created by the sheer volume, variety and velocity of big data," says Mitch Seigle, vice president, Marketing and Product Management, Syncsort.

Posted July 25, 2012

Hyve Solutions, a division of SYNNEX Corporation, has entered a software licensing agreement with IBM to offer IBM InfoSphere BigInsights software with its BigD family of systems hardware. The turnkey platform is intended to provide mid-market clients with an enterprise-class big data system to help them quickly deploy Hadoop-based analytics without the need for on-premise professional services or developers. To achieve enterprise-class standards in BigD systems, Hyve Solutions and IBM collaborated with Zettaset, Inc. to build in safeguards that provide service management, failover and restart, as well as alerting and monitoring features.

Posted June 19, 2012

The term "big data" refers to the massive amounts of data being generated on a daily basis by businesses and consumers alike - data which cannot be processed using conventional data analysis tools owing to its sheer size and, in many cases, its unstructured nature. Convinced that such data hold the key to improved productivity and profitability, enterprise planners are searching for tools capable of processing big data, and information technology providers are scrambling to develop solutions to accommodate new big data market opportunities.

Posted May 23, 2012

Organizations are struggling with big data, which they define as any large-size data store that becomes unmanageable by standard technologies or methods, according to a new survey of 264 data managers and professionals who are subscribers to Database Trends and Applications. The survey was conducted by Unisphere Research, a division of Information Today, Inc., in partnership with MarkLogic in January 2012. Among the key findings uncovered by the survey is the fact that unstructured data is on the rise, and ready to engulf current data management systems. Added to that concern, say respondents, is their belief that management does not understand the challenge that is looming, and is failing to recognize the significance of unstructured data assets to the business.

Posted May 09, 2012

Zettaset has announced SHadoop, a new security initiative designed to improve security for Hadoop. The new initiative will be incorporated as a security layer into Zettaset's Hadoop Orchestrator data management platform. The SHadoop layer is intended to mitigate architectural and input validation issues that exist within the core Hadoop code, and improve upon user role audit tracking and user level security.

Posted April 12, 2012

RainStor, a provider of big data management software, is joining with IBM in the big data market. RainStor will work with IBM to deliver a solution that combines IBM's enterprise-class, Hadoop-based product, InfoSphere BigInsights, with RainStor's Big Data Analytics on Hadoop product to enable faster, more flexible analytics on multi-structured data, without the need to move data out of the Hadoop environment. According to the vendors, the new combined solution can reduce the TCO for customers by significantly reducing physical storage, and also improving the performance of querying and analyzing big data sets across the enterprise.

Posted March 27, 2012

At the Strata Conference today, Calpont announced InfiniDB 3, the latest release of its high-performance analytic database. Designed from the ground up for large-scale, high-performance dimensional analytics, predictive analytics, and ad hoc business intelligence, the new release includes capabilities to capitalize on a variety of data structures and deployment variations to meet organizations' need for a flexible and scalable big data architecture.

Posted February 29, 2012

Composite Software has introduced version 6.1 of its Composite Data Virtualization Platform. The new release offers improved caching performance, expanded caching targets, data ship join for Teradata, and Hadoop MapReduce connectivity. Composite 6.1 also provides improvements to the data services development environment with an enhanced data services editor and new publishing options for Representational State Transfer (REST) and Open Data Protocol (OData) data services.

Posted February 17, 2012

RainStor, a provider of big data management software, has unveiled RainStor Big Data Analytics on Hadoop, which the company describes as the first enterprise database running natively on Hadoop. It is intended to enable faster analytics on multi-structured data without the need to move data out of the Hadoop Distributed File System (HDFS) environment. The way RainStor manages data is architecturally compatible with the way HDFS manages CSV files, says Deirdre Mahon, vice president of marketing at RainStor.

Posted January 25, 2012

The Oracle Big Data Appliance, an engineered system of hardware and software that was first unveiled at Oracle OpenWorld in October, is now generally available. The new system incorporates Cloudera's Distribution Including Apache Hadoop (CDH3) with Cloudera Manager 3.7, plus an open source distribution of R. The Oracle Big Data Appliance represents "two industry leaders coming together to wrap their arms around all things big data," says Cloudera COO Kirk Dunn.

Posted January 25, 2012

"Big data" and analytics have become the rage within the executive suite. The promise is immense - harness all the available information within the enterprise, regardless of data model or source, and mine it for insights that can't be seen any other way. In short, senior managers become more effective at business planning, spotting emerging trends and opportunities and anticipating crises because they have the means to see both the metaphorical trees and the forest at the same time. However, big data technologies don't come without a cost.

Posted January 11, 2012

The big data playing field grew larger with the formation of Hortonworks and HPCC Systems. Hortonworks is a new company consisting of key architects and core contributors to the Apache Hadoop technology pioneered by Yahoo. In addition, HPCC Systems, which has been launched by LexisNexis Risk Solutions, aims to offer a high performance computing cluster technology as an alternative to Hadoop.

Posted July 27, 2011

The rise of "big data" solutions - often involving the increasingly common Hadoop platform - together with the growing use of sophisticated analytics to drive business value - such as collective intelligence and predictive analytics - has led to a new category of IT professional: the data scientist.

Posted May 12, 2011

Google's first "secret sauce" for web search was the innovative PageRank link analysis algorithm which successfully identifies the most relevant pages matching a search term. Google's superior search results were a huge factor in their early success. However, Google could never have achieved their current market dominance without an ability to reliably and quickly return those results. From the beginning, Google needed to handle volumes of data that exceeded the capabilities of existing commercial technologies. Instead, Google leveraged clusters of inexpensive commodity hardware, and created their own software frameworks to sift and index the data. Over time, these techniques evolved into the MapReduce algorithm. MapReduce allows data stored on a distributed file system - such as the Google File System (GFS) - to be processed in parallel by hundreds of thousands of inexpensive computers. Using MapReduce, Google is able to process more than a petabyte (one million GB) of new web data every hour.

Posted January 11, 2010
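
The MapReduce model summarized above can be illustrated with a minimal, single-process sketch in Python. This is not Google's or Hadoop's implementation; it only mimics the three conceptual stages (map, group by key, reduce) on an in-memory list of strings. The function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and chosen for illustration. In a real cluster, the map and reduce calls would run in parallel on many machines over data held in a distributed file system.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data big clusters", "data on commodity clusters"]
    print(word_count(docs))
```

Because each `map_phase` call depends only on its own document, and each `reduce_phase` call only on the values for its own key, both stages can be spread across independent workers, which is the property that lets the approach scale on commodity hardware.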

Google introduced the MapReduce algorithm to perform massively parallel processing of very large data sets using clusters of commodity hardware. MapReduce is a core Google technology and key to maintaining Google's website indexes.

Posted September 14, 2009
