On the big data processing algorithms for finding frequent sequences
Date
2023
Publisher
Wiley
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
Sequential pattern mining algorithms extract frequently occurring sequences from ordered transactional datasets such as market basket datasets. There is a lack of research employing big data processing techniques to locate frequent sequences in large-scale datasets. Furthermore, there is a need for optimized sequential pattern mining algorithms that run on ordered one-dimensional sequences. We also observe a lack of sequential pattern search studies in the literature, where the focus is centered around multi-dimensional data sequences. Existing approaches that deal with ordered one-dimensional datasets suffer from scalability issues because the amount of data to be analyzed is enormous. This research investigates the big data processing techniques used to find frequent sequences in large-scale datasets. It also proposes a scalable sequence pattern mining algorithm called Sequential Pattern Acquisition by Reducing Search Space (SPARSS), designed for distributed data processing systems that efficiently handle large datasets containing sequential one-element data. It introduces a prototype implementation of SPARSS and reports SPARSS's memory and time requirements, which were measured in experimental studies on a real-world dataset. The results confirm our expectations and demonstrate SPARSS's superior scalability and run-time efficiency compared to other distributed algorithms.
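To make the task in the abstract concrete, the following is a minimal single-machine sketch of frequent-sequence mining over ordered one-element sequences. It is not SPARSS and is not taken from the paper: it is a naive level-wise search in the spirit of GSP, and the names `is_subsequence`, `frequent_sequences`, `min_support`, and `max_len` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def is_subsequence(pattern, sequence):
    """Return True if the items of pattern occur in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so each match must follow the previous one.
    return all(item in it for item in pattern)


def frequent_sequences(dataset, min_support, max_len=3):
    """Naive level-wise search (illustrative only, not SPARSS).

    Support of a pattern = number of sequences that contain it as a subsequence.
    """
    # Level 1: count each item at most once per sequence.
    item_counts = Counter(item for seq in dataset for item in set(seq))
    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_support)
    result = {(i,): item_counts[i] for i in frequent_items}

    current = list(result)
    for _ in range(max_len - 1):
        next_level = {}
        # Extend every frequent pattern by one frequent item and recount support.
        for pattern in current:
            for item in frequent_items:
                candidate = pattern + (item,)
                support = sum(is_subsequence(candidate, seq) for seq in dataset)
                if support >= min_support:
                    next_level[candidate] = support
        if not next_level:
            break
        result.update(next_level)
        current = list(next_level)
    return result
```

Every candidate is checked against every sequence, so the cost grows quickly with dataset size and pattern length. That is the kind of scalability limit the abstract attributes to existing approaches on large ordered datasets.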
Keywords
Apache Spark, Big Data, Distributed Systems, DLA, GSP, PrefixSpan, Sequential Pattern Mining
Source
Concurrency and Computation: Practice and Experience
WoS Q Value
Q3
Scopus Q Value
Q2
Volume
35
Issue
24
Citation
Can, A. B., Zaval, M., Uzun-Per, M., & Aktaş, M. S. (2023). On the big data processing algorithms for finding frequent sequences. Concurrency and Computation: Practice and Experience, 35(24), pp. 1-17. https://doi.org/10.1002/cpe.7660