While data technology has come a long way in recent years, data scientists still struggle to find meaningful information quickly in large sets of data. Apache Spark is creating new advanced analytics capabilities through extremely high-speed computing.
A recent DBTA webcast spotlighted an intriguing project in which IBM has partnered with scientists from SETI (Search for Extraterrestrial Intelligence) to analyze signal patterns for signs of intelligent life. The project uses IBM Analytics for Apache Spark, a service built on Apache Spark, the cluster-computing framework known for in-memory processing and analytics.
This project allows scientists, who were once limited to analyzing data captured over very short periods of time, to analyze signal patterns captured over multiple years. Chetna Warade, developer advocate with IBM, discussed how IBM’s and SETI’s work has been progressing using Apache Spark to search for intelligent life amongst the stars.
“Our objective is to make sure that we have a data science environment that will allow people to come together and collaborate. The next objective is to incorporate and implement Apache Spark,” stated Warade.
SETI is a non-profit organization dedicated to scientific research and education regarding the origin and nature of life in the universe. SETI is collaborating with IBM to analyze data from the Allen Telescope Array (ATA). The study began a little over 11 weeks ago and is still in the proof-of-concept stage. For now, this means reviewing historical ATA data, with the eventual goal of processing the ATA data stream in real time.
One of the analytical objectives is to detect transient signal patterns. “In layman’s terms, transient signal pattern is a signal pattern or a change in the signal that is interesting enough to us to divide the data in a smaller chunk and take a deep dive into what is going on,” explained Warade.
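Warade's description of dividing the data into smaller chunks and looking closely at the interesting ones can be illustrated with a minimal sketch. It is not the project's code: the function names, the chunk size, and the threshold rule (median absolute deviation of per-chunk power) are all assumptions chosen for illustration, and the input is synthetic noise with an injected burst rather than ATA data.

```python
import math
import random
import statistics


def chunk_power(samples, chunk_size):
    """Mean squared amplitude of each consecutive, non-overlapping chunk."""
    return [
        sum(x * x for x in samples[start:start + chunk_size]) / chunk_size
        for start in range(0, len(samples) - chunk_size + 1, chunk_size)
    ]


def find_transient_chunks(samples, chunk_size=64, threshold=5.0):
    """Indices of chunks whose power deviates strongly from the typical chunk.

    Hypothetical rule: flag a chunk when its power differs from the median
    power by more than `threshold` median absolute deviations (MAD).
    """
    powers = chunk_power(samples, chunk_size)
    median = statistics.median(powers)
    mad = statistics.median([abs(p - median) for p in powers])
    return [i for i, p in enumerate(powers) if abs(p - median) > threshold * mad]


# Synthetic demo: unit-variance noise with a short burst added in chunk 8.
rng = random.Random(42)
samples = [rng.gauss(0.0, 1.0) for _ in range(1024)]
for i in range(512, 576):
    samples[i] += 5.0 * math.sin(2.0 * math.pi * 0.1 * i)

print("chunks worth a closer look:", find_transient_chunks(samples))
```

Because each chunk is scored independently, a function like `chunk_power` is the kind of step that could be distributed across a cluster, though how the project actually partitions its data was not described in the webcast.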
Another analytical objective is to apply advanced transform algorithms such as the KLT (Karhunen-Loève Transform) to isolate signals that would otherwise go undetected by the FFT (Fast Fourier Transform). Analytical goals under consideration for the future include machine learning and pattern matching.
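The difference between the two transforms is that the FFT projects data onto a fixed set of sinusoids, while the KLT derives its basis from the data itself, as the eigenvectors of the data's covariance matrix. The sketch below illustrates only that idea and is not the project's implementation: it estimates just the dominant KLT component using power iteration, and the helper names, segment sizes, and synthetic chirp-like pattern are all assumptions.

```python
import math
import random


def covariance_matrix(segments):
    """Sample covariance of equal-length segments (one segment per row)."""
    m, n = len(segments), len(segments[0])
    means = [sum(seg[j] for seg in segments) / m for j in range(n)]
    centered = [[seg[j] - means[j] for j in range(n)] for seg in segments]
    return [
        [sum(row[i] * row[j] for row in centered) / (m - 1) for j in range(n)]
        for i in range(n)
    ]


def dominant_component(cov, iterations=200, seed=0):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    n = len(cov)
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # all-zero covariance: nothing to extract
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit vector gives the eigenvalue estimate.
    eigenvalue = sum(
        vec[i] * sum(cov[i][j] * vec[j] for j in range(n)) for i in range(n)
    )
    return eigenvalue, vec


# Synthetic demo: a non-sinusoidal (chirp-like) pattern with random amplitude
# per segment, buried in a little Gaussian noise.
rng = random.Random(1)
n = 16
pattern = [math.sin(2.0 * math.pi * 4.0 * (j / n) ** 2) for j in range(n)]
segments = []
for _ in range(40):
    amplitude = rng.uniform(-1.0, 1.0)
    segments.append([amplitude * p + rng.gauss(0.0, 0.05) for p in pattern])

cov = covariance_matrix(segments)
eigenvalue, basis = dominant_component(cov)
trace = sum(cov[i][i] for i in range(n))
print(f"share of variance in the first KLT component: {eigenvalue / trace:.3f}")
```

In this toy case the first component lines up with the hidden pattern even though the pattern is not a single sinusoid. A full KLT would compute every eigenvector, which is far more expensive than an FFT on large inputs.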
A replay of this webcast, titled “A Journey Through Space with IBM Analytics for Apache Spark,” is available from DBTA.