Sunday, 21 February 2021

Velocity – Data Processing

What is Big Data?

Big Data is a modern subfield of data science that uses a range of tools, methodologies and techniques to explore, analyse and process complex data sets in order to extract insights and information systematically.


The Five V’s of Big Data 

IBM data scientists broke Big Data into four dimensions: Volume, Velocity, Variety and Veracity. Together with Value, which was added later, these are known as the five V’s, the characteristics or key elements that define Big Data. These characteristics are crucial and must be considered by companies that want to run their operations successfully. Volume refers to the size of the data, while Velocity is the pace at which the data accumulates. Variety stands for the complexity of the data, while Veracity concerns the accuracy of the data. Lastly, Value represents the usefulness of the accumulated data.
 
 
Velocity – Analysis of Streaming Data 
 
Velocity is the speed at which data from IoT devices, mobile devices and social media is generated, accumulated, distributed and collected. It goes hand in hand with the speed at which the data is acquired and processed. Velocity-oriented databases give users real-time analytics, information and insights, which enable companies to make valuable business decisions at the right time. In addition, acquiring, analysing and processing data in real time allows companies to solve emerging problems before they grow into complex issues, to make decisions in real time and to catch fraud. Processing data quickly into information also gives users flexibility in their queries and reports.
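To make the idea of catching fraud as data arrives more concrete, here is a minimal sketch in Python. It assumes a stream of transaction amounts and flags any value that lies far from the running average. The function name flag_outliers, the threshold of three standard deviations, the warm-up length and the sample amounts are all hypothetical choices for illustration, not a method described in the sources cited here.

```python
import math
from typing import Iterable, Iterator, Tuple


def flag_outliers(
    amounts: Iterable[float], threshold: float = 3.0, warmup: int = 5
) -> Iterator[Tuple[float, bool]]:
    """Yield (amount, is_suspicious) for each value as it arrives.

    Keeps a running mean and variance (Welford's algorithm), so the
    stream is processed one item at a time without storing its history.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for amount in amounts:
        suspicious = False
        # Only judge a value once enough history has been seen.
        if count >= warmup:
            std = math.sqrt(m2 / (count - 1))
            if std > 0 and abs(amount - mean) > threshold * std:
                suspicious = True
        # Fold the new value into the running statistics.
        count += 1
        delta = amount - mean
        mean += delta / count
        m2 += delta * (amount - mean)
        yield amount, suspicious


# Hypothetical card payments; the 950.0 stands out from the rest.
transactions = [20.0, 22.0, 19.0, 21.0, 20.0, 23.0, 950.0, 21.0]
flags = [is_suspicious for _, is_suspicious in flag_outliers(transactions)]
print(flags)
```

In this sketch the flagged value is still folded into the running statistics, which is a simplification; a production system would likely treat confirmed outliers differently.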
 
On the other hand, the systems that analyse the data must be suited to the task: high-velocity data may require distributed processing techniques because of the speed at which it is generated. Several solutions have emerged for streaming data, such as Apache Kafka and Apache Spark, Amazon Kinesis, Google Cloud Functions and other streaming applications that process information in near real time to handle the high velocity of data (Reca, 2020).
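Streaming tools of this kind commonly group incoming events into short time windows and summarise each window as it closes. The following is a minimal single-machine sketch of that idea in plain Python; it does not use the API of any of the products named above, and the function name tumbling_window_counts, the event format (timestamp in seconds, event type) and the sample events are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_counts(
    events: Iterable[Tuple[float, str]], window_seconds: float
) -> Dict[int, Counter]:
    """Count event types per fixed, non-overlapping time window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    windows: Dict[int, Counter] = defaultdict(Counter)
    for timestamp, event_type in events:
        # Window 0 covers [0, window_seconds), window 1 the next span, etc.
        window_index = int(timestamp // window_seconds)
        windows[window_index][event_type] += 1
    return dict(windows)


# Hypothetical events: (seconds since start, event type).
sample_events = [
    (0.5, "card_swipe"),
    (1.2, "gps_signal"),
    (4.9, "card_swipe"),
    (5.0, "card_swipe"),
    (9.7, "gps_signal"),
    (10.1, "status_update"),
]
counts = tumbling_window_counts(sample_events, window_seconds=5.0)
for index in sorted(counts):
    print(index, dict(counts[index]))
```

A distributed system would spread this work across many machines and handle late or out-of-order events, which this sketch leaves out.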
 

Every day, 2.5 quintillion bytes of data are created that must be managed and secured (Marr, 2018). Processing that streaming data instantaneously requires robust data handling processes. Handling data at high velocity allows users to react to a high volume of information within a short time. Examples of high-velocity data include Twitter and Facebook messages, social media posts, status updates, online transactions, fraud checks, credit card swipes, compliance mechanisms, GPS signals, purchase transaction records and live transportation data. For instance, Facebook needs to ingest, process and file all of the photos uploaded by its users within a short time (Gewirtz, 2018). Likewise, the New York Stock Exchange can capture one terabyte of trade information during each trading session. Velocity is also essential to packet analysis in cybersecurity, since the massive flow of data must be analysed and investigated quickly in order to detect anomalies and prevent risks. Thus, data should be gathered, processed and presented in near real time to achieve successful business results (Tudor, 2020).
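To put the daily figure in terms of velocity, the short calculation below converts 2.5 quintillion bytes per day into a rate per second. It is a rough sketch that assumes the data arrives evenly across the day and uses the decimal terabyte (10^12 bytes); the constant names are chosen for illustration.

```python
BYTES_PER_DAY = 2_500_000_000_000_000_000  # 2.5 quintillion bytes (Marr, 2018)
SECONDS_PER_DAY = 24 * 60 * 60             # 86,400 seconds
BYTES_PER_TERABYTE = 10**12                # decimal terabyte (assumption)

# Average rate, assuming data is created evenly throughout the day.
bytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY
terabytes_per_second = bytes_per_second / BYTES_PER_TERABYTE

print(f"{terabytes_per_second:.1f} TB per second")  # prints "28.9 TB per second"
```

On these assumptions, the world creates on average roughly 29 terabytes of data every second.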
 
Buket Bostanci

Keywords: big data, data velocity, velocity, data flow, data volume


References & Sources

Gewirtz, D., 2018. Volume, Velocity, and Variety: Understanding the three V’s of Big Data. [online] ZDNet. Available at: https://www.zdnet.com/article/volume-velocity-and-variety-understanding-the-three-vs-of-big-data/ [Accessed 21 February 2021].
 
Reca, M., 2020. The 5 V’s of Big Data. [online] FlyData. Available at: https://www.flydata.com/blog/5-vs-of-big-data [Accessed 21 February 2021].
 
Marr, B., 2018. How Much Data do we Create Every Day? The Mind-Blowing Stats Everyone Should Read. [online] Forbes. Available at: https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/?sh=6da11b3560ba [Accessed 21 February 2021].
 
Tudor, N., 2020. Understanding the 5Vs of Big Data: Volume, Velocity, Variety, Veracity & Value. [online] BornFight. Available at: https://bornfight.com/blog/understanding-the-5-vs-of-big-data-volume-velocity-variety-veracity-value/ [Accessed 21 February 2021].