Can, Ali Burak; Zaval, Mounes; Uzun-Per, Meryem; Aktaş, Mehmet Sıddık
2023-03-01; 2023-03-01; 2023
Can, A. B., Zaval, M., Uzun-Per, M., & Aktaş, M. S. (2023). On the big data processing algorithms for finding frequent sequences. Concurrency and Computation: Practice and Experience, 35(24), pp. 1-17. https://doi.org/10.1002/cpe.7660
1532-0626; 1532-0634
https://doi.org/10.1002/cpe.7660
https://hdl.handle.net/20.500.13055/401
Sequential pattern mining algorithms extract trendy sequence appearances inside ordered transactional datasets such as market basket datasets. There is a lack of research employing big data processing techniques to locate frequent sequences on large-scale datasets. Furthermore, there is a need for optimized sequential pattern mining algorithms that run on ordered one-dimensional sequences. We also observe a lack of sequential pattern search studies in the literature, where the focus is centered around multi-dimensional data sequences. Existing approaches that deal with ordered one-dimensional datasets suffer from scalability issues as the amount of data to be analyzed is enormous. This research investigates the big data processing techniques used to find frequent sequences in large-scale datasets. It also proposes a scalable sequence pattern mining algorithm called Sequential Pattern Acquisition by Reducing Search Space (SPARSS) designed for distributed data processing systems that efficiently handle large datasets containing sequential one-element data. It introduces a prototype implementation of SPARSS and provides information on the SPARSS’s memory and time requirements, which were calculated as part of experimental studies on a real-world dataset.
The results confirm our expectations and demonstrate SPARSS’s superior scalability and run-time efficiency compared to other distributed algorithms.
en
info:eu-repo/semantics/closedAccess
Apache Spark; Big Data; Distributed Systems; DLA; GSP; Prefixspan; Sequential Pattern Mining
On the big data processing algorithms for finding frequent sequences
Article
10.1002/cpe.7660
35
24
1
17
Q3
WOS:000934843100001
2-s2.0-85148655130
Q2