WSUI: An Efficient Weighted Sequential Pattern Mining with Index-Based Structures and Frequency-Based Weighting
Keywords:
Sequential pattern mining, weighted patterns, pseudo-IDList, upper bounds, frequency-based weighting
Abstract
Sequential pattern mining discovers frequently occurring ordered patterns but traditionally treats all items as equally important. Weighted sequential pattern mining (WSPM) addresses this limitation by assigning each item a weight that reflects its importance. However, existing approaches suffer from high memory and computational costs. Motivated by this, we propose WSUI, which combines memory-efficient pseudo-IDList structures with tight upper bounds and automatic frequency-based weight assignment. Our method guarantees pattern completeness by tracking all possible end positions for i-extensions and exploiting the pseudo-IDList in s-extensions. Experiments on three real-world datasets show that WSUI consistently outperforms the state-of-the-art EWSPM, providing substantial speed-ups while preserving exhaustive pattern discovery and competitive memory consumption.