Saturday, 27 February 2021

How Trustful Is Your Data? | Veracity & Data Cleansing and Transformation


Veracity is one of the 5Vs of big data. It signifies the trustworthiness, quality and credibility of the data that an organisation collects to gain meaningful insights for decision-making (Understanding The 5Vs Of Big Data, Blog - Acuvate, 2021). In general, veracity concerns the nature of the data source: how accurate and reliable the information is, and how relevant it is to the organisation’s business objectives. Moreover, records with high veracity can open up opportunities for businesses when processing big data, for example by revealing a direct relationship between a purchased product and a training course related to that product (Smallcombe, 2021).


Why Is Veracity Vital for Businesses?

In today’s accelerated digital world, businesses are more likely to rely on real-time decision-making, given the high volume and velocity of the data they gather, and to analyse that data with the help of machine learning rather than the human brain. This trend raises questions about the accuracy and trustworthiness of the data and brings the term veracity to the fore. For example, a lack of veracity can cause a snowball effect and lead to unplanned results, such as financial losses after an organisation is exposed to manipulated data. Imagine a brand that spreads wrong information related to human health: it could cause serious harm to its consumers and could easily lose its reputation (How Data Veracity Will Determine Our Future, Accenture Insi, 2021). In addition, accurate and trustworthy data enables goal-oriented customer engagement, which may lead to a better and more personalised customer experience. It also gives digital marketers confidence that the data they work with is trustworthy, consistent and of high quality (Why Data Veracity is the Foundation for a Personalized Customer Experience, 2021).


Data Cleansing and the Difference Between Cleansing and Transformation

Data cleansing is a process in which errors are detected and eliminated, incompatibilities are fixed, and data is brought into a consistent format to improve its quality. Data cleansing and data quality are therefore closely connected: cleansing is the only way to create quality data, since the enormous amounts of data collected from different sources may have accuracy and inconsistency issues that affect quality (Ridzuan and Wan Zainon, 2019).

Quality data has five components: validity, accuracy, completeness, consistency and uniformity. Their order of priority can be set according to the business objectives of an organisation.


Finally, while data cleansing focuses on removing unrelated data from the dataset, data transformation is the process of converting the data from one structure or format into another so that it can be analysed. Data transformation is also called data wrangling or data munging (Data cleaning: The benefits and steps to creating and using clean data, 2021).
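
The distinction can be illustrated with a minimal Python sketch. The records, the field names and the helper functions `cleanse` and `transform` are hypothetical and are not taken from any of the cited sources. They only show that cleansing repairs or removes bad records, while transformation changes the shape of the data.

```python
from collections import defaultdict

# Hypothetical raw records; the field names are assumptions for illustration.
raw_records = [
    {"customer": " Alice ", "country": "ie", "amount": "20.50"},
    {"customer": "Bob", "country": "IE", "amount": "abc"},        # invalid amount
    {"customer": " Alice ", "country": "ie", "amount": "20.50"},  # duplicate
    {"customer": "Carol", "country": "uk", "amount": "5"},
]

def cleanse(records):
    """Cleansing: trim whitespace, unify formats, drop invalid and duplicate rows."""
    seen = set()
    cleaned = []
    for row in records:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount is not a number
        key = (row["customer"].strip(), row["country"].upper(), amount)
        if key in seen:
            continue  # drop exact duplicates of an earlier cleaned row
        seen.add(key)
        cleaned.append({"customer": key[0], "country": key[1], "amount": amount})
    return cleaned

def transform(records):
    """Transformation: reshape row-level data into totals per country."""
    totals = defaultdict(float)
    for row in records:
        totals[row["country"]] += row["amount"]
    return dict(totals)

clean = cleanse(raw_records)
totals_by_country = transform(clean)
print(clean)
print(totals_by_country)
```

In this sketch the cleansing step decides which rows are trustworthy, and the transformation step only runs on rows that survived it, which is why cleansing usually comes first.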


Recommended Tasks Before Data Cleansing and Transformation

Here are some tasks recommended by the software company Import.io that organisations should consider before data cleansing and transformation (What is Data Cleansing and Transformation/ Wrangling?, Import.io, 2021):

  • Clarify the business objectives, including general strategy, current customer issues and estimated return on investment.
  • Examine the sources of data to develop a data model related to the business objectives, and decide for what purpose the data will be used.
  • Profile the data before transforming it, to identify its structure and detect quality problems, and so to understand whether the data is valuable enough to transform.
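
The data profiling step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sample rows, the column names and the `profile` function are hypothetical, and a real profiling tool would report far more than these three measures.

```python
# Hypothetical sample rows; column names are assumptions for illustration.
rows = [
    {"email": "a@example.com", "age": "34", "city": "Dublin"},
    {"email": "", "age": "n/a", "city": "Dublin"},
    {"email": "c@example.com", "age": "41", "city": None},
]

def profile(rows):
    """Return per-column counts of missing, distinct and numeric-looking values."""
    report = {}
    for col in rows[0].keys():
        values = [r.get(col) for r in rows]
        # Treat None and the empty string as missing.
        present = [v for v in values if v not in (None, "")]
        report[col] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "numeric": sum(1 for v in present if str(v).isdigit()),
        }
    return report

report = profile(rows)
for column, stats in report.items():
    print(column, stats)
```

A report like this makes quality problems visible before any transformation is attempted: here the `age` column has no missing values, yet only two of its three values look numeric.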

Keywords: Veracity, Quality Data, Trustfulness, Decision-making, Accuracy, Data Cleansing, Data Transformation, Business Objectives


Ilgin Damla Omay 



References and Sources


Acuvate. 2021. Understanding The 5Vs Of Big Data, Blog - Acuvate. [online] Available at: https://acuvate.com/blog/understanding-the-5vs-of-big-data/ [Accessed 23 February 2021].


Import.io. 2021. What is Data Cleansing and Transformation/ Wrangling?, Import.io. [online] Available at: https://www.import.io/post/what-is-data-cleansing-and-transformation-wrangling/ [Accessed 23 February 2021].


Redpoint Global. 2021. Why Data Veracity is the Foundation for a Personalized Customer Experience. [online] Available at: https://www.redpointglobal.com/blog/why-data-veracity-is-the-foundation-for-a-personalized-customer-experience/ [Accessed 23 February 2021].


Ridzuan, F. and Wan Zainon, W., 2019. A Review on Data Cleansing Methods for Big Data. Procedia Computer Science, 161, pp.731-738.


Smallcombe, M., 2021. The 7 Vs of Big Data. [online] Xplenty. Available at: https://www.xplenty.com/blog/7-vs-big-data/#veracity [Accessed 23 February 2021].


Tableau. 2021. Data cleaning: The benefits and steps to creating and using clean data. [online] Available at: https://www.tableau.com/learn/articles/what-is-data-cleaning [Accessed 23 February 2021].


WordPressBlog. 2021. How Data Veracity Will Determine Our Future, Accenture Insi. [online] Available at: https://www.accenture.com/nl-en/blogs/insights/data-veracity-and-the-future-of-the-digital-economy [Accessed 23 February 2021].



Monday, 22 February 2021

What Are the Benefits of Volume Data Storage and How Is It Beneficial to Digital Marketers?



Data storage is a medium for keeping a huge amount of information. Any electronic document can be stored in less space than a paper document: the information is converted into bits and held in electronic circuits. A storage volume can be a removable hard disk, but it need not be physically removable from the computer or storage system. Volume data storage is flexible and can contract or expand. Files and documents are recorded digitally and saved in a storage system for future use.

From a business point of view, data is the most important asset. In today’s digital world, websites, blogs and social media sites are among the most useful places to hold such data. Building a high-quality website is a good option, because it can include information that customers find relevant and interesting, and it makes it easy to obtain customers’ contact information. Digital marketers deal with large amounts of data from different sources such as social media and websites, and the collected data helps them to market digitally. There are also many other data management and data storage platforms where digital marketing experts can store their data and use it whenever they need it.

Data storage is especially important in content marketing. Without accurate data, marketers cannot produce proper and relevant content, so data plays a vital role in a content marketing strategy, especially when they have little knowledge of data analytics. By capturing and recording data, digital marketers can better understand which content, marketing strategies and methods are required to get better results.

The options that come to mind for storing a huge amount of data for a long time, for example 20 years, are optical media such as DVDs, external USB SSD drives, and Network Attached Storage (NAS) with RAID. A NAS uses high-performance magnetic hard disks and is the best option, because RAID provides protection against disk failure. I am not sure that the original disks can last 20 years, but they can be replaced (a NAS allows hot-swapping of disks).

Volume data storage helps to keep a record of all past activities and digital transactions. It also supports data analysis, so that marketers can forecast future market trends. It provides flexibility and mobility, because stored data is easy to move from one place to another, and it makes it easier to formulate better strategies. With the help of volume data storage, decision-making becomes easier for management, which benefits the company’s business.



In conclusion, it is not easy for the human brain to manage and make sense of a huge amount of data. Data storage makes it easy to keep a record of day-to-day activities, which increases employee efficiency in many ways and makes it easier to analyse the market, helping a business to win out against today’s competition.



Damanvir Kaushal



Keywords

#Volume Data storage

#Digital marketing

#Data storage

#Digital Transformation

#Social media



References and Sources

  • https://searchstorage.techtarget.com/definition/volume
  • https://techterms.com/definition/volume
  • https://docs.oracle.com/cloud-machine/latest/stcomputecs/ELUSE/GUID-6ACE7B13-442B-459C-8868-566F1CB65E3F.htm#ELUSE-GUID-6ACE7B13-442B-459C-8868-566F1CB65E3F
  • https://docs.microsoft.com/en-us/windows-hardware/drivers/ifs/storage-device-stacks--storage-volumes--and-file-system-stacks
  • https://kt.cern/competences/high-volume-data-management-storage


Sunday, 21 February 2021

Velocity – Data Processing

What is Big Data?

Big Data is a modern subfield of data science that uses several tools, methodologies and techniques to explore, analyse and process complex data sets in order to provide insights and information systematically.


The Five V’s of Big Data 

IBM data scientists broke Big Data into four dimensions: Volume, Velocity, Variety and Veracity. Together with Value, which was added later, these are known as the five V’s, the characteristics or key elements that define Big Data. These characteristics are crucial and must be considered by companies that want to run their operations successfully. Volume refers to the size of big data, while Velocity is the pace at which the data accumulates. Variety stands for the complexity of the data, while Veracity concerns the accuracy of the data. Lastly, Value represents the usefulness of the accumulated data.
 
 
Velocity – Analysis of Streaming Data 
 
Velocity can be defined as the speed at which data from IoT devices, mobile devices and social media is generated, accumulated, distributed and collected. The rate of velocity parallels the speed at which the data is acquired and processed. Velocity-oriented databases provide users with real-time analytics, information and insights, which enable companies to make valuable business decisions at the right time. In addition, acquiring, analysing and processing data in real time allows companies to solve emerging problems before they become complex issues, to make real-time decisions and to catch fraud. Besides, processing data quickly gives users flexibility in their queries and reports.
 
On the other hand, the systems that analyse the data must be suited to the task: high-velocity data may require distributed processing techniques because of the speed at which it is generated. Several solutions for streaming data have emerged, such as Apache’s Kafka and Spark, Amazon’s Kinesis, Google’s Cloud Functions and other streaming applications that process information in almost real time to handle the high velocity of data (Reca, 2020).
 

Every day, 2.5 quintillion bytes of data are created that need to be managed and secured (Marr, 2018). Processing that streaming data instantaneously requires strong data handling processes. High velocity allows users to react to a high volume of information within a short time. Examples of high-velocity data include Twitter and Facebook messages, social media posts, online transactions, fraud checks, credit card swipes, compliance mechanisms, GPS signals, purchase transaction records and live transportation data. For instance, Facebook needs to ingest, process and file all of the photos uploaded by its users in a short time (Gewirtz, 2018). Also, the New York Stock Exchange can capture one TB of trade information during each trading session. In addition, velocity is essential for packet analysis in cybersecurity, since the massive flow of data must be analysed and investigated quickly in order to detect anomalies and prevent risks. Thus, data should be gathered, processed and presented in near real time to achieve successful business results (Tudor, 2020).
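
To make the idea of near-real-time processing concrete, here is a minimal Python sketch of a sliding-window counter, a common building block for checks such as flagging a burst of card transactions. It is an illustration only: the class name `SlidingWindowCounter`, the window length, the threshold and the sample timestamps are all assumptions, and production systems would typically build this on a streaming platform like those mentioned above rather than on a single in-memory queue.

```python
from collections import deque

class SlidingWindowCounter:
    """Counts events in the most recent time window and flags bursts."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.timestamps = deque()

    def add(self, timestamp):
        """Record one event; return True if the window now exceeds the threshold."""
        self.timestamps.append(timestamp)
        # Evict events that have fallen out of the window.
        while self.timestamps and self.timestamps[0] <= timestamp - self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) > self.threshold

# Hypothetical event times in seconds for one card (assumed to arrive in order).
counter = SlidingWindowCounter(window_seconds=10, threshold=3)
events = [0, 1, 2, 3, 20, 21]
flags = [counter.add(t) for t in events]
print(flags)
```

Each event is handled as it arrives, in constant amortised time, so the decision is available immediately instead of after a batch job has run.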
 
Buket Bostanci

Keywords: big data, data velocity, velocity, data flow, data volume


References & Sources

Gewirtz, D., 2018. Volume, Velocity, and Variety: Understanding the three V’s of Big Data. [online] ZDNet. Available at: https://www.zdnet.com/article/volume-velocity-and-variety-understanding-the-three-vs-of-big-data/ [Accessed 21 February 2021].
 
Reca, M., 2020. The 5 V’s of Big Data. [online] FlyData. Available at: https://www.flydata.com/blog/5-vs-of-big-data [Accessed 21 February 2021].
 
Marr, B., 2018. How Much Data do we Create Every Day? The Mind-Blowing Stats Everyone Should Read. [online] Forbes. Available at: https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/?sh=6da11b3560ba [Accessed 21 February 2021].
 
Tudor, N., 2020. Understanding the 5Vs of Big Data: Volume, Velocity, Variety, Veracity & Value. [online] BornFight. Available at: https://bornfight.com/blog/understanding-the-5-vs-of-big-data-volume-velocity-variety-veracity-value/ [Accessed 21 February 2021].




Tuesday, 16 February 2021

COVID19 and Big data | Big data helping businesses in 2021

Since COVID-19, big data has become a focus area within data science. A simple example: people used to visit an office to renew a license, but now it is all done online. The amount of data collected since 2020 is enormous. At the same time, big data analytics has helped in tracking and preventing the spread of coronavirus. Johns Hopkins University (JHU) released a dashboard for tracking and reporting real-time data on COVID-19 for most countries (JHU, 2021).

COVID-19 tracking dashboard by Johns Hopkins

Aarogya Setu, India's COVID-19 tracking app, has been downloaded 127.6 million times so far on the App Store and Play Store, which makes it the world's most downloaded COVID-19 tracking app. The data collected from users, including their location and test details, has helped in tracking and isolating COVID-positive patients. Similar applications greatly support the segmentation of populations and locations by level of risk, and help governments and organizations to take timely action (YourStory, 2020).

Since the virus is novel, researchers need to analyze a huge volume of information, which is beyond what the human brain can comprehend on its own. Predictive analytics helps governments to manage the patients who are most vulnerable to and most at risk from the virus.
Machine learning, along with big data analytics, aids treatment research, for example in finding the antibodies required for vaccine preparation.
Big Data in 2021 comes into action in scenarios like those discussed above, from tracking the spread of the virus to planning activities around it. At the same time, big data has come into wider discussion because these applications collect a large amount of user data to fuel their functionality (TCD, 2021).

Just as Big Data has supported COVID-19 tracking and prevention since 2019, it can improve the future of business in 2021. Businesses can make smarter decisions using Big Data analytics, with strategies built on facts derived from data rather than on assumptions or feelings.

Adoption of strategies: A wide range of tools is available to data analysts, so that centralized data is available to the different departments of the company, not just the IT department, with controlled access as required. This enables departments to form strategies that aid their varied operations. It also avoids miscalculation by the IT department, since the data is available across all departments to cross-check analyses and reports.

Serving the right customer: Just as Facebook and Google know about us, it is crucial for companies to know their customers well, using the data those customers have provided, in order to serve them better and to increase conversions by offering customized solutions.

Products and services: As discussed above, knowing customers' likes and behavior also helps in customizing and offering unique products and services. Big Data analytics also assists in promoting and improving the products and services that need attention, leading to improved innovation.

Boosting income: Using data, it is possible to conduct competitor analysis to understand what works better for competitors in comparison. This helps in identifying trends in the market, capturing them at an early stage and adopting them.

Big Data opens the door for AI to automate decision-making, leading to increased revenue and happy customers.

- Spoorthi Joshi S



References

YourStory 2020 | Arogya Setu remains among top 10 downloaded apps globally in May: NITI Aayog CEO Kant | https://yourstory.com/2020/06/arogya-setu-top-10-downloaded-apps-globally-niti-aayog  (accessed on 14th February 2021) 

JHU 2021 | The Johns Hopkins Coronavirus Resource Center (CRC) | https://coronavirus.jhu.edu/about (accessed on 14th February 2021)

TCD 2021 | COVID-19 and the Perils and Promise of Big Data | https://www.tcd.ie/business/news-events/covid-19-big-data.php (accessed on 15th February 2021) 








What Is Big Data and How It Affects the Way of Decision Making?


In the era of digital transformation and emerging technologies, one cannot deny the importance of Big Data analytics, which directly affects and enhances the decision-making process of an organisation. This raises two questions: what is Big Data, and why is effective use of Big Data crucial for companies that want to achieve their digital marketing goals in the long term?

Components of Big Data

Big Data can be defined as a large amount of fast-moving, complex data that cannot be analysed with conventional methods. According to Laney, the components of Big Data are the three Vs, which define the term as high volume, velocity and variety of data to be processed. Volume refers to the large amount of data that organisations collect from many different sources, such as connected IoT devices and social media platforms. Velocity describes the high-speed streaming of data, including near-real-time actions. Variety expresses the different formats of data, which can be structured or unstructured, such as email, video or financial transactions (Big Data: What it is and why it matters, 2021). Some sources add two more Vs, value and veracity, to emphasise the importance of the value and truthfulness of the data (What Is Big Data? | Oracle Ireland, 2021).


Why Do Organisations Analyse Big Data?

A survey conducted in 2017 indicated that the number of companies that have adopted Big Data analytics has been increasing dramatically in recent years. Moreover, the survey reveals that the most common purpose of using Big Data is data warehouse optimisation, which requires integrating Big Data platforms such as Hadoop and Spark into an organisation's existing data infrastructure. This is followed by analysing social media and streaming data from IoT devices, which are considered data lake sources of unprocessed data, and by fraud detection (Watson, 2019). Once organisations know the sources of their Big Data, they can manage and store it with the help of Big Data platforms or cloud storage, and then analyse it for relevance to their business purposes in order to make better, evidence-based and data-driven decisions in the marketplace in which they compete (Big Data: What it is and why it matters, 2021). It could be said that organisations that are aware of the significance of Big Data and are able to process and analyse it properly are more likely to gain meaningful insights about their customers and make successful decisions when applying their digital marketing strategy.


Big Data Applications

Many applications of Big Data have emerged, including Natural Language Processing applications such as the digital assistants Amazon's Alexa and Apple's Siri, chatbots used for customer support, and visualisation systems including drones and the Facial Action Coding System (FACS). For example, P&G used coding software to analyse people's facial expressions when testing the scents of a new product, and the software's predictions turned out to be more precise and successful than what participants said (Watson, 2019). Another example is Amazon, which has been collecting detailed purchase data from its customers and lets other companies access its ad portal to reach specific target audiences (17 Big Data Examples & Applications, 2021).

Finally, the use of Big Data by many companies also raises governance concerns and security issues. To protect the privacy of personal data, service providers' controls need to be reviewed regularly and compliance with laws and regulations should be ensured.


Keywords: Analytics, Big Data, Decision Making, Digital Marketing, IoT, Social Media


Ilgin Damla Omay


References and Sources

Built In. 2021. 17 Big Data Examples & Applications. [online] Available at: https://builtin.com/big-data/big-data-examples-applications [Accessed 16 February 2021].


Watson, H.J., 2019. Update Tutorial: Big Data Analytics: Concepts, Technology, and Applications. Communications of the Association for Information Systems, pp.364-379.


Oracle.com. 2021. What Is Big Data? | Oracle Ireland. [online] Available at: https://www.oracle.com/ie/big-data/what-is-big-data/ [Accessed 16 February 2021].


Sas.com. 2021. Big Data: What it is and why it matters. [online] Available at: https://www.sas.com/en_us/insights/big-data/what-is-big-data.html [Accessed 16 February 2021].