Introduction to real-time data processing and how it’s different from batch data processing

Michael.Cortez · March 11, 2022, 1:19pm

Introduction

This document will introduce real-time data processing and how it’s different from batch data processing.

Imagine that you’re sitting on a bridge overlooking a river. From this perspective, you are fixed and water is flowing by you, carrying leaves and branches. You can see that objects upstream are coming towards you, but you can only interact with things within your reach from the bridge. Likewise, once a branch is past the bridge, you will lose the opportunity to catch or manipulate it.

Thinking about real-time data is much the same. In fact, that’s why we call real-time data “streams”. This can be very different from how most people are used to working with data in desktop applications, which is called “batch processing”.

Batch processing

Batch processing is a way of processing large amounts of data collected over a period of time. Data is collected, grouped, and then processed together, and the output is returned as a single batch or collective response. This type of processing is not driven by the arrival time of individual records; in early batch systems, the jobs were run by a batch monitor resident in the low end of main memory.

[Figure: Batch processing]

Advantages of batch processing

Ideal for processing large amounts of data or transactions
More efficient than processing each record individually
Allows a good audit trail
Processing can be scheduled, for example during off-peak usage times
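
To make the collect, group, then process flow concrete, here is a minimal Python sketch. It is not Corva code: the record fields ("well", "depth") and the function name run_batch are made up for illustration. The point is only that nothing is computed until the whole data set is in hand, and the output comes back as one collective result.

```python
from collections import defaultdict
from statistics import mean


def run_batch(records):
    """Process a whole collected data set in one pass and return one combined result."""
    # Group step: bucket the collected records by a key (a hypothetical "well" field).
    groups = defaultdict(list)
    for record in records:
        groups[record["well"]].append(record["depth"])
    # Process step: each group is summarized only after all of the data is available.
    return {well: mean(depths) for well, depths in groups.items()}


# Records accumulated over some period of time before the job runs.
collected = [
    {"well": "A", "depth": 100.0},
    {"well": "A", "depth": 110.0},
    {"well": "B", "depth": 50.0},
]

result = run_batch(collected)
print(result)
```

Because the job sees every record at once, it can group and summarize freely, but no answer exists until the job has run.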

Real-time processing

Real-time processing systems are high-speed, quick-response systems. They are best used in situations where a large number of events need to be processed in a short time. Each event is processed as it arrives, so the system returns an immediate response, which suits applications where real-time data is required.

[Figure: Real-time processing]

Advantages of real time processing

No significant delay in responses
Information is always up to date
Allows the user to make decisions on “live” or “real-time” data
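
For contrast, here is a rough Python sketch of handling events one at a time. Again, this is not Corva code: sensor_feed, process_stream, the "pressure" field and the limit value are all hypothetical stand-ins. A generator plays the role of the live feed, and a response is produced for each event as soon as it arrives.

```python
def process_stream(events, limit):
    """Handle events one at a time, yielding a response as soon as each arrives."""
    for event in events:
        # Only the current event is in hand; earlier ones have already flowed past.
        status = "alert" if event["pressure"] > limit else "ok"
        yield {"timestamp": event["timestamp"], "status": status}


def sensor_feed():
    """Stand-in for a live source; a real feed would not have a known end."""
    yield {"timestamp": 1, "pressure": 90.0}
    yield {"timestamp": 2, "pressure": 130.0}
    yield {"timestamp": 3, "pressure": 95.0}


# Collected into a list here only so the example terminates and can be inspected.
responses = list(process_stream(sensor_feed(), limit=100.0))
for response in responses:
    print(response)
```

Like the person on the bridge, the processing function never looks upstream or downstream. It only acts on the event currently within reach.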

Corva’s system is designed around real-time processing. This is important to note when building apps in Dev Center, because many existing algorithms are written around batch processing. In a batch processing structure, we have a massive data set available to work with as we please, but in a real-time processing structure we only have the latest record to process. Does this mean these algorithms can’t be handled by Corva’s platform?

No, they can be handled, but that is not really what Corva was designed for.

As developers, we need to understand the differences between these two types of data processing in order to create apps that run efficiently. Yes, we could make an API call for 10,000 records and then process those records, but is that sustainable and reliable? No.
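
One common way to adapt a batch-style algorithm is to keep a small running state and fold each new record into it, rather than re-fetching thousands of records on every run. The Python sketch below uses Welford's method for a running mean and standard deviation. It is only an illustration under assumptions: the state is a plain dict here, and where that state would be stored between invocations in a real app is not covered.

```python
import math


def update_stats(state, value):
    """Fold one new value into running statistics (Welford's method)."""
    count = state["count"] + 1
    delta = value - state["mean"]
    mean = state["mean"] + delta / count
    # m2 accumulates the sum of squared deviations from the current mean.
    m2 = state["m2"] + delta * (value - mean)
    return {"count": count, "mean": mean, "m2": m2}


def std_dev(state):
    """Population standard deviation computed from the running state."""
    return math.sqrt(state["m2"] / state["count"]) if state["count"] else 0.0


# Hypothetical starting state; each incoming record updates it in constant time.
state = {"count": 0, "mean": 0.0, "m2": 0.0}
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    state = update_stats(state, value)

print(state["mean"], std_dev(state))
```

The state stays three numbers no matter how many records have gone by, so the cost of handling the latest record does not grow with the history.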

In conclusion, both methods have their advantages and disadvantages. The main thing we need to remember is that Corva is set up for stream data processing, and apps built in Dev Center should be designed around this.
