A Scalable Distributed Approach for Exploration Global Frequent Patterns

Authors

  • Houda Essalmi, Laboratory of Engineering Sciences, Polydisciplinary Faculty of Taza, University of Sidi Mohamed Ben Abdellah, Fez, Morocco
  • Anass El Affar, Laboratory of Engineering Sciences, Polydisciplinary Faculty of Taza, University of Sidi Mohamed Ben Abdellah, Fez, Morocco

DOI:

https://doi.org/10.32985/ijeces.16.7.1

Keywords:

Data mining, Parallel Processing, Frequent Patterns tree, Communication costs

Abstract

Frequent pattern mining in transactional databases is an essential part of data mining, since it makes it easier to identify significant associations and recurring patterns in datasets. As data volumes continue to grow, processing large databases efficiently requires scalable, high-performance solutions that use parallel computing systems to optimize resource usage and data analysis. To address these needs, this paper presents Exploration Global Frequent Patterns (EGFP), a new parallel algorithm designed to generate global frequent patterns across distributed datasets. By partitioning the data and distributing the workload, the approach reduces communication costs and ensures efficient parallel execution. The approach uses two prefix-tree structures to produce a compact, structured representation of frequent patterns. The first structure, the local-tree, stores local support values so that transaction data can be collected and organized effectively. Global prefix counts are then aggregated and ranked to improve frequency-based analysis and provide a more organized representation of frequent patterns. To find the globally frequent patterns, a Master site builds a second structure, the global-tree, for each prefix based on this organized data. Experimental results on large-scale benchmark datasets show that EGFP outperforms existing methods, including CD and PFP-tree, in execution time and scalability, while incurring considerably less communication cost.
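
The abstract gives only an outline of the local-tree and global-tree steps, so the following is a minimal Python sketch of the general idea and not the EGFP algorithm itself. Each site counts its items, a master sums and ranks the counts, each site builds a prefix tree in the global order, and the master merges the trees and reads global supports from the result. All names here (`local_item_counts`, `global_ranking`, `build_local_tree`, `merge_trees`, `support`) are hypothetical, the "sites" are plain in-memory lists, and merging whole trees is an assumed simplification of the per-prefix global-tree construction described in the paper.

```python
from collections import Counter


class Node:
    """Prefix-tree node: a support count plus children keyed by item."""
    def __init__(self):
        self.count = 0
        self.children = {}


def local_item_counts(transactions):
    """Site step: count how many local transactions contain each item."""
    counts = Counter()
    for t in transactions:
        counts.update(set(t))
    return counts


def global_ranking(site_counts, min_support):
    """Master step: sum local counts and rank the globally frequent items."""
    total = Counter()
    for c in site_counts:
        total.update(c)
    frequent = {i: n for i, n in total.items() if n >= min_support}
    # Descending support; ties broken by item name for a deterministic order.
    return sorted(frequent, key=lambda i: (-frequent[i], i))


def build_local_tree(transactions, order):
    """Site step: insert each transaction, filtered and sorted by global rank."""
    rank = {item: r for r, item in enumerate(order)}
    root = Node()
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank), key=rank.__getitem__):
            node = node.children.setdefault(item, Node())
            node.count += 1
    return root


def merge_trees(trees):
    """Master step (assumed simplification): sum counts on identical paths."""
    def merge_into(dst, src):
        dst.count += src.count
        for item, child in src.children.items():
            merge_into(dst.children.setdefault(item, Node()), child)

    merged = Node()
    for tree in trees:
        merge_into(merged, tree)
    return merged


def support(root, itemset, order):
    """Global support of an itemset, read from the merged prefix tree."""
    rank = {item: r for r, item in enumerate(order)}
    items = sorted(itemset, key=rank.__getitem__)
    last, needed = items[-1], set(items[:-1])
    total = 0
    stack = [(root, frozenset())]
    while stack:
        node, seen = stack.pop()
        for item, child in node.children.items():
            if item == last:
                if needed <= seen:
                    total += child.count
            elif rank[item] < rank[last]:
                # Paths are rank-sorted, so `last` can only occur below
                # items that precede it in the global order.
                stack.append((child, seen | {item}))
    return total


# Two hypothetical sites, each holding its own partition of the transactions.
site_a = [["a", "b", "c"], ["a", "b"], ["a", "d"]]
site_b = [["a", "b", "c"], ["b", "c"], ["c", "d"]]

order = global_ranking([local_item_counts(site_a), local_item_counts(site_b)], 3)
global_tree = merge_trees([build_local_tree(s, order) for s in (site_a, site_b)])
print(order, support(global_tree, {"a", "b"}, order))
```

In this sketch only item counts and compact trees cross the site boundary, never raw transactions, which is one plausible reading of how a tree-based scheme keeps communication cost low. A real deployment would replace the in-memory lists with message passing between sites.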

Published

2025-07-02

How to Cite

[1]
H. Essalmi and A. El Affar, “A Scalable Distributed Approach for Exploration Global Frequent Patterns”, IJECES, vol. 16, no. 7, p. x-x, Jul. 2025.

Section

Original Scientific Papers