On the big data processing algorithms for finding frequent sequences

dc.authorid0000-0002-4958-4575en_US
dc.authorscopusid55355863500en_US
dc.authorwosidGHQ-7349-2022en_US
dc.contributor.authorCan, Ali Burak
dc.contributor.authorZaval, Mounes
dc.contributor.authorUzun-Per, Meryem
dc.contributor.authorAktaş, Mehmet Sıddık
dc.date.accessioned2023-03-01T07:20:26Z
dc.date.available2023-03-01T07:20:26Z
dc.date.issued2023en_US
dc.departmentFaculties, Faculty of Engineering and Natural Sciences, Department of Computer Engineeringen_US
dc.description.abstractSequential pattern mining algorithms extract trendy sequence appearances inside ordered transactional datasets such as market basket datasets. There is a lack of research employing big data processing techniques to locate frequent sequences on large-scale datasets. Furthermore, there is a need for optimized sequential pattern mining algorithms that run on ordered one-dimensional sequences. We also observe a lack of sequential pattern search studies in the literature, where the focus is centered around multi-dimensional data sequences. Existing approaches that deal with ordered one-dimensional datasets suffer from scalability issues as the amount of data to be analyzed is enormous. This research investigates the big data processing techniques used to find frequent sequences in large-scale datasets. It also proposes a scalable sequence pattern mining algorithm called Sequential Pattern Acquisition by Reducing Search Space (SPARSS) designed for distributed data processing systems that efficiently handle large datasets containing sequential one-element data. It introduces a prototype implementation of SPARSS and provides information on the SPARSS’s memory and time requirements, which were calculated as part of experimental studies on a real-world dataset. The results confirm our expectations and demonstrate SPARSS’s superior scalability and run-time efficiency compared to other distributed algorithms.en_US
dc.identifier.citationCan, A. B., Zaval, M., Uzun-Per, M., & Aktaş, M. S. (2023). On the big data processing algorithms for finding frequent sequences. Concurrency and Computation: Practice and Experience, 35(24), pp.1-17. https://doi.org/10.1002/cpe.7660en_US
dc.identifier.doi10.1002/cpe.7660en_US
dc.identifier.endpage17en_US
dc.identifier.issn1532-0626
dc.identifier.issn1532-0634
dc.identifier.issue24en_US
dc.identifier.scopus2-s2.0-85148655130en_US
dc.identifier.scopusqualityQ2en_US
dc.identifier.startpage1en_US
dc.identifier.urihttps://doi.org/10.1002/cpe.7660
dc.identifier.urihttps://hdl.handle.net/20.500.13055/401
dc.identifier.volume35en_US
dc.identifier.wosWOS:000934843100001en_US
dc.identifier.wosqualityQ3en_US
dc.indekslendigikaynakWeb of Scienceen_US
dc.indekslendigikaynakScopusen_US
dc.indekslendigikaynak.otherSCI-E - Science Citation Index Expandeden_US
dc.institutionauthorUzun-Per, Meryem
dc.language.isoenen_US
dc.publisherWileyen_US
dc.relation.ispartofConcurrency and Computation: Practice and Experienceen_US
dc.relation.publicationcategoryArticle - International Peer-Reviewed Journal - Institutional Faculty Memberen_US
dc.rightsinfo:eu-repo/semantics/closedAccessen_US
dc.subjectApache Sparken_US
dc.subjectBig Dataen_US
dc.subjectDistributed Systemsen_US
dc.subjectDLAen_US
dc.subjectGSPen_US
dc.subjectPrefixspanen_US
dc.subjectSequential Pattern Miningen_US
dc.titleOn the big data processing algorithms for finding frequent sequencesen_US
dc.typeArticleen_US
dspace.entity.typePublication

Files

Original bundle
Now showing 1 - 1 of 1
Closed Access
Name:
On the big data processing algorithms for finding frequent sequences.pdf
Size:
2.39 MB
Format:
Adobe Portable Document Format
Description:
Full Text
License bundle
Now showing 1 - 1 of 1
Closed Access
Name:
license.txt
Size:
1.44 KB
Format:
Item-specific license agreed upon to submission
Description: