MathWorks showcased R2016b, the latest release of MATLAB, at Strata + Hadoop World 2016 in New York. MATLAB is used to develop analytics and algorithms that help solve engineering and scientific problems.
In the new release, tall arrays now provide a way to work naturally with out-of-memory data using familiar MATLAB functions and syntax, removing the need to learn big data programming. Engineers and scientists can use tall arrays with hundreds of math, statistics, and machine learning algorithms. Code can run on Hadoop clusters or be integrated directly into Spark applications.
“We created this array type that allows you to point to a set of data – it could be on a local hard drive, a database somewhere, a distributed file system, for example, and we now have algorithms that will just operate against that data,” said Dave Oswill, MATLAB product manager at MathWorks.
“It manages bringing the data in as segments, doing the calculations, manages all the results, and then gives you the final results that you need.”
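The segment-by-segment pattern Oswill describes can be illustrated with a small conceptual sketch. It is written in Python rather than MATLAB and is not how MathWorks implements tall arrays. The helper names (`read_in_segments`, `chunked_mean`), the sample data, and the segment size are all hypothetical, chosen only to show the idea of reading a bounded segment, updating running results, and returning one final answer.

```python
import csv
import io
from itertools import islice


def read_in_segments(rows, segment_size):
    """Yield successive lists of at most `segment_size` rows (hypothetical helper)."""
    iterator = iter(rows)
    while True:
        segment = list(islice(iterator, segment_size))
        if not segment:
            return
        yield segment


def chunked_mean(rows, column, segment_size=2):
    """Compute the mean of one column while holding only one segment in memory."""
    total = 0.0
    count = 0
    for segment in read_in_segments(rows, segment_size):
        values = [float(row[column]) for row in segment]
        total += sum(values)   # intermediate result carried between segments
        count += len(values)
    if count == 0:
        raise ValueError("no data to average")
    return total / count


# Assumed sample data standing in for a file too large to load at once.
SAMPLE_CSV = "sensor,reading\na,1.0\nb,2.0\na,3.0\nb,4.0\na,5.0\n"

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
mean_reading = chunked_mean(reader, "reading", segment_size=2)
print(mean_reading)
```

Only the running total and count persist between segments, so memory use depends on the segment size rather than on the size of the dataset.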
Among other additions intended to help engineers and scientists work with large datasets, R2016b includes a timetable data container for indexing and synchronizing time-stamped tabular data; string arrays to help manipulate, compare, and store text data efficiently; and new functions for preprocessing data.
“We have greatly expanded the ability for MATLAB to work with different types of datasets,” said Oswill.
The goal with R2016b, he noted, is to allow domain experts to work with more data, more easily, improving system design, performance, and reliability.
More information about R2016b highlights is available from MathWorks.