Quarks: Embedding Open Cognitive Analytics at the Edge of the Internet of Things

The Internet of Things (IoT) is streaming more deeply into every aspect of our lives. At the same time, cognitive computing is penetrating more aspects of the IoT as algorithms enable edge devices and applications to take more intelligent actions on a wide range of local sensor readings without the need to round-trip back to a central server.

Driving the IoT into ubiquity is the smartphone. Device-embedded analytics of various sorts are an increasingly pervasive feature of IoT applications for consumer, business, science, government, and other use cases. When IoT edge devices capture unstructured data, such as video, audio, and environmental data feeds, cognitive algorithms can continuously distill it all down to actionable insights.

For example, the intelligent digital advisors embedded in IoT-enabled products leverage more sophisticated machine-learning algorithms, analyze myriad sources of device-acquired sensor data, and adjust their responses rapidly within dynamically shifting environmental contexts. And such up-and-coming product segments as autonomous vehicles would be impossible without embedded cognitive analytics that act on streaming sensor data.

A new generation of data scientists is emerging to build cognitive IoT applications and deploy them into every conceivable type of device and application. By 2020, new products in every sector of the economy will have been rearchitected as “cognitive IoT” endpoints. As this trend intensifies, embedded cognitive IoT will become a key focus of the next generation of cloud application developers. What will these skilled professionals need to be successful?

One of the key ingredients will be an open-source platform for building, tuning, and deploying cognitive applications to the edge of the IoT cloud, which some have referred to as the “fog.” An open environment is essential because IoT developer ecosystems need to work with the broadest range of data sources, analytic libraries, cloud computing environments, and IoT edge devices, and a closed development platform would not serve that need.

The ideal development platform for open, embedded, cognitive IoT applications would be one that:

Facilitates embedding of cognitive analytics capabilities so that IoT devices and apps can adapt continuously and react locally to their environments and, as needed, to metrics and commands from neighboring devices;

Enables developers to access, configure, and tweak any cognitive computing algorithm that is suited to the IoT analytics challenges they face;

Executes cognitive algorithms on any size device at the IoT edge or at gateways in containers on widely supported processing frameworks, including Apache Spark and Hadoop;

Presents flexible and familiar programming models for IoT development, including Python, Java, and C;

Analyzes data as it streams at the device level and thereby eliminates the need to store it persistently;

Supports execution of all or most cognitive IoT processing locally, reducing or eliminating the need to round-trip many capabilities back to cloud-based computing clusters;

Accelerates memory-speed local drilldown into growing streams of locally acquired and cached sensor data;

Accesses data from any streaming data platform, as well as from any RDBMS, hub, repository, file system, or other store in any cloud, on-premise, or other at-rest data platform;

Taps into the massively parallel computing power of the cloud computing clusters at the heart of distributed IoT environments;

Scales to support any size, volume, and speed of data from any IoT device anywhere on the planet;

Sources data from any device, sensor, database, stream, and middleware fabric in the IoT;

Correlates events at the application level across the vast span of the IoT;

Distributes execution of analytic algorithms dynamically out to disparate edge devices in order to maximize end-to-end application speed, throughput, and agility;

Enables robust, secure, and efficient performance, job, state, and health management under centralized administration across heterogeneous IoTs, clouds, devices, and apps;

Exposes a functional API that simplifies development, testing, and deployment of algorithms and other complex application logic artifacts that are being deployed to the edges of a complex IoT application; and

Incorporates a library of prebuilt IoT analytics to boost developer productivity in building and maintaining edge applications.

In this light, I’m happy to call your attention to an important new initiative from IBM: it has contributed its Quarks project as open source, now available on GitHub, and intends to submit it to Apache for incubator status. Spawned in IBM Research, Quarks provides an open codebase for developers to build embedded cognitive analytics for IoT endpoints and applications. Quarks can manage and analyze continuous streaming data on any IoT device. It provides a single runtime for analyzing IoT data at the edge using sophisticated techniques. It runs on IoT edge devices and gateways, and enables continuous correlation of data across the IoT. And it works with Spark, Hadoop, IBM Streams, IBM IoT Foundation, and many other environments.
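To make the idea of analyzing data at the edge more concrete, here is a minimal sketch in plain Python. It does not use the Quarks API; the function name edge_filter and its parameters (window, threshold, min_spread) are hypothetical, chosen only for illustration. It assumes a device that sees one numeric sensor reading at a time, keeps only a small in-memory window of recent values, and forwards a reading only when it deviates sharply from that recent history.

```python
from collections import deque
from statistics import mean, pstdev


def edge_filter(readings, window=5, threshold=3.0, min_spread=0.1):
    """Yield (value, baseline) for readings that deviate sharply from recent history.

    Hypothetical helper for illustration only; not part of Quarks.
    Only the last `window` readings are held in memory, so nothing
    needs to be stored persistently on the device.
    """
    recent = deque(maxlen=window)  # bounded buffer of the latest readings
    for value in readings:
        if len(recent) == window:
            baseline = mean(recent)
            # Floor the spread so a perfectly flat signal does not
            # turn every tiny fluctuation into an alert.
            spread = max(pstdev(recent), min_spread)
            if abs(value - baseline) > threshold * spread:
                yield value, baseline  # only this would be sent upstream
        # Every reading, including an anomalous one, joins the window.
        recent.append(value)


if __name__ == "__main__":
    # Simulated temperature feed with one spike.
    feed = [20.0, 20.1, 19.9, 20.0, 20.1, 35.0, 20.0, 20.1]
    for value, baseline in edge_filter(feed):
        print(f"forward to cloud: reading={value} baseline={baseline:.2f}")
```

In this sketch, the routine readings never leave the device; only the out-of-range reading would make the round trip to a central server, which is the kind of local, streaming analysis the list above describes.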

Other recent IBM announcements relevant to developers of cognitive IoT applications are two new Apache Spark-related open source projects: Toree (which enables interactive workloads between applications and a Spark cluster) and EclairJS (a JavaScript client for Apache Spark).

IBM urges developers to go to Quarks on GitHub to get started. If you’re an IBM customer, see how Quarks works seamlessly with your deployments of Bluemix, Streams, and Watson IoT.

For more information on Quarks, visit this IBM resource page. Also please come to an upcoming O’Reilly conference on stream computing that will feature Quarks.

For more information on IBM’s investments in cognitive IoT applications, tools, and partner ecosystems, check out this recent press release.

