Data Quality Assurance and Governance in the Age of Big Data: Strategies, Challenges, and Industry Applications
Published 2021-07-16
Keywords
- big data
- data quality assurance
- data governance
- automation
- machine learning
Abstract
The advent of big data has brought both opportunities and challenges for data quality assurance across organisations. Conventional data quality assurance strategies struggle with the volume, variety, and velocity of big data sources. This paper examines emerging strategies for data quality assurance and governance in big data environments. Specifically, we explore the use of automation, machine learning, and adaptive frameworks tailored to real-time quality assurance for streaming data. Despite these technological advances, challenges persist in areas such as metadata management, the correction of erroneous data, and the integration of human-in-the-loop verification. Industry use cases across sectors including finance, healthcare, and the Internet of Things (IoT) underscore the growing demand for robust data quality assurance practices to realise the potential of big data. Organisations responsible for safeguarding customer data must develop and implement rigorous data quality practices. These practices are indispensable for upholding data accuracy, completeness, compliance, and ethical standards. As data stewards, organisations must meet these demands to ensure the trustworthiness and reliability of the data on which critical decisions are based.