Big data in real time: enter the new era

Written by Michel Remise, on 29 March 2019


Real-time analysis of big data was announced as the major big data trend of 2017.

Real-time big data has become a major challenge. Big data affects every industry, and many CEOs want to benefit from real-time insights in order to make decisions more quickly.

This trend did not come about by chance. The rapid growth of the IoT and its user base, together with the ease of exploiting data from sensors, has contributed to the buzz. As a result, several offerings have been developed that tap into the "hype" around real time to meet this need. Spark and Storm are usually the frameworks preferred by large organizations. Overall, vendors sell their real-time offerings in a more or less packaged way, but the core message is the same: collection, storage, processing and visualization of data, all in real time!

Here are some examples of popular frameworks that are well suited to real-time processing:

You are already part of the movement

In fact, you encounter real-time analysis every day. An e-commerce website needs to display recommendations or ads as fast as possible: your actions are constantly translated into data that is processed in a big data architecture for real-time analysis. When we perform calculations on a stream of data, we usually need the result quickly, in some cases within a fraction of a second; indeed, targeted ads are displayed almost immediately! Likewise, the volume of collected data can be large: considering all the users logging in to social media, exploiting all of that data is a real challenge.
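To make the latency point concrete, here is a minimal, hypothetical sketch (names and data invented for illustration) of why serving a recommendation on a stream can be fast: the profile is updated incrementally as each event arrives, so answering a request never requires recomputing over the full history.

```python
from collections import Counter, defaultdict

# Hypothetical sketch: per-user interest profiles, updated event by event.
# Serving a recommendation is then a constant-size lookup, not a batch job.
profiles = defaultdict(Counter)

def ingest(event):
    """Update the user's profile as each clickstream event arrives."""
    profiles[event["user"]][event["category"]] += 1

def recommend(user, k=3):
    """Return the user's top-k categories so far; no recomputation needed."""
    return [cat for cat, _ in profiles[user].most_common(k)]

stream = [
    {"user": "u1", "category": "shoes"},
    {"user": "u1", "category": "shoes"},
    {"user": "u1", "category": "books"},
]
for ev in stream:
    ingest(ev)

print(recommend("u1"))  # most frequent category first
```

Real systems replace the in-memory dictionary with a distributed store and a streaming framework, but the principle is the same: the work is done at ingestion time so that reads are immediate.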

The data generated by your actions is massive, and so is its value. In the same way, autonomous cars must be able to process a multitude of signals detected by their sensors; hardware failures in a compute cluster must trigger alarms; high-frequency trading requires results to be delivered extremely quickly. Production-line monitoring, supply-chain optimization, fraud detection, patient care, smart cities and smart grids all have to generate actions in real time... In short, as you will have understood, real-time big data is well suited to many fields.
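The alarm-triggering use case above can be sketched with a simple per-event rule. This is a hypothetical example (window size, threshold and data are invented): raise an alarm whenever the moving average of the last few sensor readings crosses a limit.

```python
from collections import deque

WINDOW = 3        # number of recent readings to average (assumed value)
THRESHOLD = 80.0  # alarm limit (assumed value)

def alarms(readings):
    """Return the indices of readings whose moving average exceeds THRESHOLD."""
    window = deque(maxlen=WINDOW)  # keeps only the last WINDOW readings
    fired = []
    for i, value in enumerate(readings):
        window.append(value)
        if len(window) == WINDOW and sum(window) / WINDOW > THRESHOLD:
            fired.append(i)  # index of the reading that triggered the alarm
    return fired

temps = [70, 75, 78, 85, 90, 95, 72]
print(alarms(temps))  # → [4, 5, 6]
```

Each event is evaluated as it arrives, which is exactly the property real-time monitoring pipelines need: the decision is made per reading, not after a nightly batch.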

Speed and the management of data flows: these are the main issues of real-time processing, and they require the design of dedicated distributed architectures.

Build your own real-time big data architecture

At first, this diagram may seem complicated, but it simply shows that stream processing takes a prominent place in the design of an architecture. Real-time architectures are typically used for preprocessing, data correlation, model training, making predictions, pattern sequencing, triggering alerts, etc.
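Several of the stages just listed can be sketched as a chain of Python generators, each consuming the previous stage's stream. This is a hypothetical illustration (stage names, fields and thresholds are invented), not a production pipeline:

```python
# Hypothetical sketch: preprocessing, correlation with reference data, and
# alert triggering, composed as lazy generator stages over an event stream.

def preprocess(events):
    for e in events:
        if e.get("value") is not None:   # drop malformed events
            yield {**e, "value": float(e["value"])}

def enrich(events, sensor_location):
    for e in events:                     # correlate with reference data
        yield {**e, "location": sensor_location.get(e["sensor"], "unknown")}

def alerts(events, threshold):
    for e in events:                     # trigger alerts on the fly
        if e["value"] > threshold:
            yield f"ALERT {e['sensor']}@{e['location']}: {e['value']}"

raw = [
    {"sensor": "s1", "value": "21.5"},
    {"sensor": "s2", "value": None},
    {"sensor": "s2", "value": "99.0"},
]
locations = {"s1": "line-A", "s2": "line-B"}
pipeline = alerts(enrich(preprocess(raw), locations), 50.0)
print(list(pipeline))  # only the out-of-range reading survives the pipeline
```

Frameworks like Spark or Storm offer the same composition of stages, but distributed across a cluster and with fault tolerance built in.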

In practice, these architectures are not easy to operate. The diversity of storage, processing and visualization tools is considerable, and it is not easy to find your way through this cloud of tools, especially as connectors between the different technical bricks are sometimes nonexistent. It is then up to you to implement the missing functionality, not to mention the upgrades, the nightmarish configuration (Hadoop), the skeletal documentation, etc. After this introduction, I offer you a series of tutorials to build, step by step, a simple architecture for real-time processing. We will rely on an easy use case: the storage of streams with Couchbase and their processing with
