Nowadays, there is an increasing demand in mining interesting patterns from the big data. The process of analyzing such a huge amount of data is really computationally complex task when using traditional methods. The overall purpose of this paper is in twofold. First, this paper presents a novel approach to identify consistent and inconsistent association rules from sales data located in distributed environment. Secondly, the paper also overcomes the main memory bottleneck and computing time overhead of single computing system by applying computations to multi node cluster. The proposed method initially extracts frequent itemsets for each zone using existing distributed frequent pattern mining algorithms. The paper also compares the time efficiency of Mapreduce based frequent pattern mining algorithm with Count Distribution Algorithm (CDA) and Fast Distributed Mining (FDM) algorithms. The association generated from frequent itemsets are too large that it becomes complex to analyze it. Thus, Mapreduce based consistent and inconsistent rule detection (MR-CIRD) algorithm is proposed to detect the consistent and inconsistent rules from big data and provide useful and actionable knowledge to the domain experts. These pruned interesting rules also give useful knowledge for better marketing strategy as well. The extracted consistent and inconsistent rules are evaluated and compared based on different interestingness measures presented together with experimental results that lead to the final conclusions.
J. Prajapati, Dinesh; Garg, Sanjay; and Chauhan, N.C.
"Interesting association rule mining with consistent and inconsistent rule detection from big sales data in distributed environment,"
Future Computing and Informatics Journal: Vol. 2
, Article 3.
Available at: https://digitalcommons.aaru.edu.jo/fcij/vol2/iss1/3