Saturday, 27 February 2021

How Trustful Is Your Data? | Veracity & Data Cleansing and Transformation


Veracity is one of the 5Vs of big data. It signifies the trustworthiness, quality and credibility of the data that an organisation has collected to gain meaningful insights for its decision-making processes (Understanding The 5Vs Of Big Data, Blog - Acuvate, 2021). In general, veracity concerns the nature of the data source: how accurate and reliable the information is, and how relevant it is to the organisation’s business objectives. Moreover, records with high veracity can help businesses achieve their goals when processing big data, for example by revealing a direct relationship between a purchased product and a training course related to that product (Smallcombe, 2021).


Why Is Veracity Vital for Businesses?

In today’s accelerated digital world, businesses are more likely to rely on real-time decision-making, given the high volume and velocity of the data they gather, and to analyse that data with the help of machine learning rather than the human brain. This trend raises questions about the accuracy and trustworthiness of the data and brings veracity to the fore. For example, a lack of veracity can cause a snowball effect and lead to unplanned results for organisations, such as financial losses after being exposed to manipulated data. Imagine a brand that spreads wrong information related to human health: it could cause serious harm to its consumers and could easily lose its reputation (How Data Veracity Will Determine Our Future, Accenture Insi, 2021). In addition, by ensuring accurate and trustworthy data, veracity supports goal-oriented customer engagement, which may lead to a better and more personalised customer experience. It also gives digital marketers confidence by providing them with trustworthy, high-quality and consistent data (Why Data Veracity is the Foundation for a Personalized Customer Experience, 2021).


Data Cleansing and the Difference Between Cleansing and Transformation

Data cleansing is a process in which errors are detected and eliminated, inconsistencies are fixed, and data is converted into a uniform format to improve its quality. Data cleansing and data quality are therefore connected: cleansing is the only way to obtain quality data, since enormous amounts of data collected from different sources may have accuracy and consistency issues that affect data quality (Ridzuan and Wan Zainon, 2019).
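
As a rough illustration of the cleansing steps described above, here is a minimal Python sketch. The records, field names and accepted date formats are all hypothetical and only stand in for data collected from different sources; a real pipeline would need rules that fit its own data.

```python
from datetime import datetime

# Hypothetical raw records gathered from two sources with different conventions.
RAW_RECORDS = [
    {"customer": " Alice ", "country": "uk", "signup": "2021-02-23"},
    {"customer": "Bob", "country": "UK", "signup": "23/02/2021"},
    {"customer": "alice", "country": "UK", "signup": "2021-02-23"},  # duplicate of the first row
    {"customer": "Carol", "country": "UK", "signup": "not a date"},  # unparseable date
]

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # assumed input formats


def parse_date(text):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def cleanse(records):
    """Standardise formats, set aside rows with errors, and drop duplicates."""
    cleaned, rejected, seen = [], [], set()
    for record in records:
        row = {
            "customer": record["customer"].strip().title(),
            "country": record["country"].strip().upper(),
            "signup": parse_date(record["signup"]),
        }
        if row["signup"] is None:
            rejected.append(record)  # error detected: keep it aside for review
            continue
        key = (row["customer"], row["country"], row["signup"])
        if key in seen:
            continue  # duplicate once the formats are standardised
        seen.add(key)
        cleaned.append(row)
    return cleaned, rejected


cleaned, rejected = cleanse(RAW_RECORDS)
```

In this sketch the two Alice rows collapse into one only after the formats are made uniform, which is why standardising comes before de-duplicating. The row with the bad date is kept in a separate list rather than silently dropped, so it can be reviewed.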

Quality data has five components: validity, accuracy, completeness, consistency and uniformity. These can be prioritised according to the business objectives of an organisation.


Finally, while data cleansing focuses on removing unrelated data from the dataset, data transformation is the process of converting data from one structure or format into another so that it can be analysed. Data transformation is also called data wrangling or data munging (Data cleaning: The benefits and steps to creating and using clean data, 2021).
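
To make the contrast with cleansing concrete, the following Python sketch shows one possible transformation: reshaping purchase rows into a per-customer summary. Nothing is removed or corrected here, only the structure changes. The purchase data and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical, already-cleansed purchase rows (one row per purchase).
PURCHASES = [
    {"customer": "Alice", "product": "Camera", "amount": 300.0},
    {"customer": "Bob", "product": "Tripod", "amount": 50.0},
    {"customer": "Alice", "product": "Photography course", "amount": 120.0},
]


def to_customer_summary(rows):
    """Reshape purchase rows into one summary entry per customer."""
    summary = defaultdict(lambda: {"products": [], "total": 0.0})
    for row in rows:
        entry = summary[row["customer"]]
        entry["products"].append(row["product"])
        entry["total"] += row["amount"]
    return dict(summary)


summary = to_customer_summary(PURCHASES)
```

Grouped this way, it becomes easy to see which customers bought both a product and a related course, which is the kind of relationship an analysis might look for.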


Recommended Tasks Before Data Cleansing and Transformation

Here are some tasks, recommended by the software company Import.io, that organisations should consider carrying out before data cleansing and transformation (What is Data Cleansing and Transformation/ Wrangling?, Import.io, 2021):

  • Clarify the business objectives, including general strategy, current customer issues, estimated return on investment, etc.
  • Examine the sources of data to develop a data model related to the business objectives, and decide for what purpose the data will be used.
  • Profile the data before transforming it, to identify its structure and detect quality problems, and so to understand whether the data is valuable enough to transform.
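
The data profiling task in the list above could look something like the Python sketch below. It is only an assumption about what a simple profile might contain (missing values, distinct values and value types per column), and the sample rows are hypothetical.

```python
# Hypothetical sample; None and empty strings are treated as missing values.
SAMPLE = [
    {"customer": "Alice", "age": 34, "email": "alice@example.com"},
    {"customer": "Bob", "age": None, "email": ""},
    {"customer": "Carol", "age": "41", "email": "carol@example.com"},
]


def profile(rows):
    """Summarise each column: missing count, distinct values, and value types."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for column in columns:
        values = [row.get(column) for row in rows]
        present = [v for v in values if v not in (None, "")]
        report[column] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "types": sorted({type(v).__name__ for v in present}),
        }
    return report


report = profile(SAMPLE)
```

Here the profile for the age column shows one missing value and a mix of integer and text values, which is exactly the kind of quality problem worth knowing about before any transformation starts.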

Keywords: Veracity, Quality Data, Trustfulness, Decision-making, Accuracy, Data Cleansing, Data Transformation, Business Objectives


Ilgin Damla Omay 



References and Sources


Acuvate. 2021. Understanding The 5Vs Of Big Data, Blog - Acuvate. [online] Available at: https://acuvate.com/blog/understanding-the-5vs-of-big-data/ [Accessed 23 February 2021].


Import.io. 2021. What is Data Cleansing and Transformation/ Wrangling?, Import.io. [online] Available at: https://www.import.io/post/what-is-data-cleansing-and-transformation-wrangling/ [Accessed 23 February 2021].


Redpoint Global. 2021. Why Data Veracity is the Foundation for a Personalized Customer Experience. [online] Available at: https://www.redpointglobal.com/blog/why-data-veracity-is-the-foundation-for-a-personalized-customer-experience/ [Accessed 23 February 2021].


Ridzuan, F. and Wan Zainon, W., 2019. A Review on Data Cleansing Methods for Big Data. Procedia Computer Science, 161, pp.731-738.


Smallcombe, M., 2021. The 7 Vs of Big Data. [online] Xplenty. Available at: https://www.xplenty.com/blog/7-vs-big-data/#veracity [Accessed 23 February 2021].


Tableau. 2021. Data cleaning: The benefits and steps to creating and using clean data. [online] Available at: https://www.tableau.com/learn/articles/what-is-data-cleaning [Accessed 23 February 2021].


WordPressBlog. 2021. How Data Veracity Will Determine Our Future, Accenture Insi. [online] Available at: https://www.accenture.com/nl-en/blogs/insights/data-veracity-and-the-future-of-the-digital-economy [Accessed 23 February 2021].



2 comments:

  1. This post provided great insights with regard to data cleansing from the business perspective. The source of the data contributes a large portion of the truthfulness of the data; the integrity of the data can be improved by the team managing it, whether manually or with software. The difference between cleansing and transformation can be understood very clearly from this blog. I feel data cleansing before transformation is very important, and the cleansed data should be verified for its integrity before proceeding to the phase of data transformation.

  2. In today’s world, data truthfulness is the biggest issue, and this blog provides great insight into data protection, true data, data cleansing and transformation, and veracity.
    The blog clearly explains the term veracity, its objectives and its importance for organisations and businesses. In various aspects of business development, veracity is highly recommended, as it brings good opportunities, truthfulness and accuracy in data, which lead to a satisfying customer experience and engagement.
    With a large amount of data come errors, and this blog gives a good explanation of data cleansing and transformation.

    Damanvir Kaushal
