Veracity is one of the 5Vs of big data. It denotes the trustworthiness, quality and credibility of the data that an organisation collects to gain meaningful insights for decision-making (Understanding The 5Vs Of Big Data, Blog - Acuvate, 2021). In general, veracity concerns the nature of the data source: how accurate and reliable the information is, and how relevant it is to the business objectives of the organisation. Moreover, records with high veracity can open up opportunities for businesses to achieve their goals when processing big data, for example by helping them discover a direct relationship between a purchased product and a training course related to that product (Smallcombe, 2021).
Why Is Veracity Vital for Businesses?
In today’s accelerated digital world, businesses are more likely to rely on real-time decision-making, given the high volume and velocity of the data they gather and analyse, and to do so with the help of machine learning rather than human judgement. This trend raises questions about the accuracy and trustworthiness of the data, which is where veracity comes in. A lack of veracity can cause a snowball effect and lead to unplanned outcomes for organisations, such as financial losses after being exposed to manipulated data. Imagine a brand that spreads incorrect health-related information: it could seriously harm its consumers and could easily lose its reputation (How Data Veracity Will Determine Our Future, Accenture Insi, 2021). In addition, accurate and trustworthy data enables goal-oriented customer engagement, which may lead to a better, more personalised customer experience. It also gives digital marketers confidence that the data they work with is trustworthy, high-quality and consistent (Why Data Veracity is the Foundation for a Personalized Customer Experience, 2021).
Data Cleansing and the Difference Between Cleansing and Transformation
Data cleansing is the process in which errors are detected and eliminated, inconsistencies are fixed, and data is converted into a consistent format to improve its quality. Data cleansing and data quality are therefore closely connected: the enormous amounts of data collected from different sources may have accuracy and consistency issues that lower their quality, and cleansing is how those issues are resolved (Ridzuan and Wan Zainon, 2019).
Quality data has five components: validity, accuracy, completeness, consistency and uniformity. An organisation can prioritise them according to its business objectives.
Finally, while data cleansing focuses on removing unrelated data from the dataset, data transformation is the process of converting data from one structure or format into another so that it can be analysed. Data transformation is also called data wrangling or data munging (Data cleaning: The benefits and steps to creating and using clean data, 2021).
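The distinction between the two steps can be made concrete with a minimal Python sketch. It is an illustration only, not a method taken from the cited sources: the records, the field names and the two accepted date formats are all invented for the example. The cleanse function removes incomplete, invalid and duplicate rows and brings the rest into one format, while the transform function reshapes the cleaned rows into a different structure for analysis.

```python
from datetime import datetime

# Hypothetical raw records; field names and values are made up for illustration.
RAW_RECORDS = [
    {"customer": " Alice ", "country": "uk", "signup": "2021-02-23", "spend": "120.50"},
    {"customer": "Bob", "country": "UK", "signup": "23/02/2021", "spend": "80"},
    {"customer": "Bob", "country": "UK", "signup": "23/02/2021", "spend": "80"},    # exact duplicate
    {"customer": "", "country": "NL", "signup": "2021-01-10", "spend": "15"},       # missing name
    {"customer": "Carol", "country": "nl", "signup": "not a date", "spend": "42"},  # invalid date
]

# Assumed date formats for this example only.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")
FIELDS = ("customer", "country", "signup", "spend")


def parse_date(text):
    """Return a date for any known format, or None if the text is not a valid date."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def cleanse(records):
    """Cleansing: drop incomplete, invalid and duplicate rows; unify formats."""
    seen = set()
    cleaned = []
    for row in records:
        name = row["customer"].strip()
        signup = parse_date(row["signup"])
        if not name or signup is None:
            continue  # incomplete or invalid record
        fixed = (name, row["country"].upper(), signup.isoformat(), float(row["spend"]))
        if fixed in seen:
            continue  # duplicate of a row already kept
        seen.add(fixed)
        cleaned.append(dict(zip(FIELDS, fixed)))
    return cleaned


def transform(records):
    """Transformation: reshape cleaned rows into total spend per country."""
    totals = {}
    for row in records:
        totals[row["country"]] = totals.get(row["country"], 0.0) + row["spend"]
    return totals


cleaned = cleanse(RAW_RECORDS)
spend_by_country = transform(cleaned)
print(cleaned)
print(spend_by_country)
```

In this sketch, cleansing changes which rows survive and how their values are written, whereas transformation changes the shape of the data, from one row per customer to one total per country.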
Recommended Tasks Before Data Cleansing and Transformation
The software company Import.io recommends that organisations carry out the following tasks before data cleansing and transformation (What is Data Cleansing and Transformation/ Wrangling?, Import.io, 2021):
- Clarify the business objectives, including the general strategy, current customer issues and estimated return on investment.
- Examine the data sources to develop a data model related to the business objectives, and decide what the data will be used for.
- Profile the data before transforming it, to identify its structure, detect quality problems and judge whether it is valuable enough to transform.
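As a rough illustration of the profiling task, the following Python sketch summarises each column of a small dataset. It is a simplified example under stated assumptions, not the procedure described by Import.io: the sample rows, the column names and the three chosen statistics (missing values, distinct values, values readable as numbers) are hypothetical.

```python
# Hypothetical sample rows; column names and values are invented for illustration.
ROWS = [
    {"id": "1", "email": "a@example.com", "age": "34"},
    {"id": "2", "email": "", "age": "29"},
    {"id": "3", "email": "c@example.com", "age": "unknown"},
    {"id": "3", "email": "c@example.com", "age": "41"},
]


def is_number(text):
    """True when the text can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def profile(rows):
    """Summarise each column: missing values, distinct values, numeric values."""
    report = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        values = [row.get(col, "") for row in rows]
        present = [v for v in values if v.strip()]  # ignore empty cells
        report[col] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "numeric": sum(is_number(v) for v in present),
        }
    return report


report = profile(ROWS)
for column, stats in report.items():
    print(column, stats)
```

A distinct count that is lower than the number of rows in a column expected to be unique, such as id here, hints at duplicates, and a numeric count below the number of present values in age points to entries that need cleansing before transformation.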
Keywords: Veracity, Quality Data, Trustworthiness, Decision-making, Accuracy, Data Cleansing, Data Transformation, Business Objectives
Ilgin Damla Omay
References and Sources
Acuvate. 2021. Understanding The 5Vs Of Big Data, Blog - Acuvate. [online] Available at: https://acuvate.com/blog/understanding-the-5vs-of-big-data/ [Accessed 23 February 2021].
Import.io. 2021. What is Data Cleansing and Transformation/ Wrangling?, Import.io. [online] Available at: https://www.import.io/post/what-is-data-cleansing-and-transformation-wrangling/ [Accessed 23 February 2021].
Redpoint Global. 2021. Why Data Veracity is the Foundation for a Personalized Customer Experience. [online] Available at: https://www.redpointglobal.com/blog/why-data-veracity-is-the-foundation-for-a-personalized-customer-experience/ [Accessed 23 February 2021].
Ridzuan, F. and Wan Zainon, W., 2019. A Review on Data Cleansing Methods for Big Data. Procedia Computer Science, 161, pp.731-738.
Smallcombe, M., 2021. The 7 Vs of Big Data. [online] Xplenty. Available at: https://www.xplenty.com/blog/7-vs-big-data/#veracity [Accessed 23 February 2021].
Tableau. 2021. Data cleaning: The benefits and steps to creating and using clean data. [online] Available at: https://www.tableau.com/learn/articles/what-is-data-cleaning [Accessed 23 February 2021].
WordPressBlog. 2021. How Data Veracity Will Determine Our Future, Accenture Insi. [online] Available at: https://www.accenture.com/nl-en/blogs/insights/data-veracity-and-the-future-of-the-digital-economy [Accessed 23 February 2021].