Future Computing and Informatics Journal

Abstract

A high level of data quality is considered one of the most important assets for organizations of any size, whether small, medium, or large. Data quality is a major focus for both practitioners and researchers who deal with traditional or big data. The level of data quality is measured through several quality dimensions. A high percentage of current studies focus on assessing and applying data quality to traditional data. In the era of big data, attention should be paid to the tremendous volume of generated and processed data, 80% of which is unstructured. However, initiatives for creating big data quality evaluation models are still under development. This paper investigates the data quality dimensions most commonly used for both traditional and big data in order to identify the metrics and techniques used to measure and handle each dimension. It presents a complete definition of each traditional and big data quality dimension, together with its metrics and handling techniques. Many data quality dimensions apply to both traditional and big data, while a few apply to only one of the two. Current works present few data quality metrics and hardly any handling techniques.

Included in

Data Science Commons
