Most existing databases suffer from data inconsistencies, and efforts to enhance data quality are necessary to resolve this issue. In this paper, two techniques are proposed for mining accurate conditional functional dependency (CFD) rules from such databases, to be employed for data cleaning. The idea of the proposed techniques is to first mine maximal closed frequent patterns, and then mine dependable CFD rules with the help of the lift measure. Moreover, a data repair algorithm is proposed that exploits the generated rules to fix inconsistent tuples found in the database. An extensive experimental study is conducted to confirm the effectiveness of the proposed techniques compared with an existing technique on both real-life and synthetic medical data sets.
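To illustrate the role of the lift measure in rule selection, the following is a minimal sketch and not the authors' implementation. It assumes the standard association-rule definition, lift(X → Y) = supp(X ∪ Y) / (supp(X) · supp(Y)), applied to a candidate constant CFD whose left-hand and right-hand sides are attribute-value patterns. The function name `lift`, the attribute names `cc` and `city`, and the toy rows are all hypothetical.

```python
def lift(rows, lhs, rhs):
    """Lift of the candidate rule lhs -> rhs over a list of tuples.

    rows: list of dicts mapping attribute name -> value
    lhs, rhs: dicts giving the constant pattern on each side of the rule
    Assumes the standard definition supp(lhs and rhs) / (supp(lhs) * supp(rhs)).
    """
    def matches(row, pattern):
        return all(row.get(attr) == val for attr, val in pattern.items())

    n = len(rows)
    n_lhs = sum(1 for r in rows if matches(r, lhs))
    n_rhs = sum(1 for r in rows if matches(r, rhs))
    n_both = sum(1 for r in rows if matches(r, lhs) and matches(r, rhs))
    if n_lhs == 0 or n_rhs == 0:
        return 0.0  # rule is not supported by the data at all
    # Algebraically equal to (n_both/n) / ((n_lhs/n) * (n_rhs/n))
    return n_both * n / (n_lhs * n_rhs)


# Hypothetical toy relation: country code and city
rows = [
    {"cc": "44", "city": "EDI"},
    {"cc": "44", "city": "EDI"},
    {"cc": "44", "city": "LDN"},
    {"cc": "01", "city": "NYC"},
]

candidate = lift(rows, {"cc": "44"}, {"city": "EDI"})
unrelated = lift(rows, {"cc": "44"}, {"city": "NYC"})
```

Under this definition, a lift above 1 indicates that the two patterns co-occur more often than they would if independent, so a threshold on lift is one plausible way to keep only dependable rules; the threshold actually used by the proposed techniques is not stated here.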
"Fixing rules for data cleaning based on conditional functional dependency,"
Future Computing and Informatics Journal: Vol. 1, Article 2.
Available at: https://digitalcommons.aaru.edu.jo/fcij/vol1/iss1/2