On the big data processing algorithms for finding frequent sequences

Sequential pattern mining algorithms extract trendy sequence appearances inside ordered transactional datasets such as market basket datasets. There is a lack of research employing big data processing techniques to locate frequent sequences on large-scale datasets. Furthermore, there is a need for optimized sequential pattern mining algorithms that run on ordered one-dimensional sequences. We also observe a lack of sequential pattern search studies in the literature, where the focus is centered around multi-dimensional data sequences. Existing approaches that deal with ordered one-dimensional datasets suffer from scalability issues as the amount of data to be analyzed is enormous. This research investigates the big data processing techniques used to find frequent sequences in large-scale datasets. It also proposes a scalable sequence pattern mining algorithm called Sequential Pattern Acquisition by Reducing Search Space (SPARSS) designed for distributed data processing systems that efficiently handle large datasets containing sequential one-element data. It introduces a prototype implementation of SPARSS and provides information on SPARSS's memory and time requirements, which were calculated as part of experimental studies on a real-world dataset. The results confirm our expectations and demonstrate SPARSS's superior scalability and run-time efficiency compared to other distributed algorithms.

Keywords

Apache Spark, Big Data, Distributed Systems, DLA, GSP, PrefixSpan, Sequential Pattern Mining

Source

Concurrency and Computation: Practice and Experience

WoS Q Value

Q3

Scopus Q Value

Q2

Volume

35

Issue

24

Citation

Can, A. B., Zaval, M., Uzun-Per, M., & Aktaş, M. S. (2023). On the big data processing algorithms for finding frequent sequences. Concurrency and Computation: Practice and Experience, 35(24), pp.1-17. https://doi.org/10.1002/cpe.7660

Link

https://doi.org/10.1002/cpe.7660
https://hdl.handle.net/20.500.13055/401

Collection

Department of Computer Engineering Collection
Scopus Indexed Publications Collection
WoS Indexed Publications Collection
